Graduation project essentials: using Python to collect global epidemic data and visualize it

Knowledge points:

  1. The basic workflow of a web crawler
  2. Sending requests with requests
  3. Regular expressions with re
  4. Parsing structured JSON data

Development environment:

  • Python 3.8: interpreter
  • PyCharm: code editor
  • requests: sends HTTP requests
  • pyecharts: draws charts
  • pandas: reads data


A crawler simulates the way a browser or other client sends a request to a server.


Finding the data source

Static data: data you can find by right-clicking the page and viewing its source code
Dynamic data: data you cannot find by right-clicking the page and viewing its source code
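
A quick way to tell the two apart is to search the raw page source for a value you can see in the browser. The sketch below is only an illustration: `classify_data`, `page_source`, and the sample HTML are hypothetical and do not come from the site used in this article.

```python
def classify_data(page_source: str, visible_value: str) -> str:
    """Return 'static' if the value appears in the page source, else 'dynamic'."""
    return "static" if visible_value in page_source else "dynamic"


# Hypothetical page source: the table is filled in later by a script,
# so the case count itself never appears in the HTML.
page_source = "<html><body><h1>Epidemic data</h1><div id='table'></div></body></html>"

print(classify_data(page_source, "Epidemic data"))  # the heading is in the source
print(classify_data(page_source, "12345"))          # the number is loaded separately
```

Dynamic data usually has to be located in the browser's developer tools (the Network panel) instead.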

The steps for implementing the crawler code:

  1. Send a request (access the data source found above, that is, visit the website from code)
  2. Get the data
  3. Parse the data
  4. Save the data
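
The four steps above can be sketched as three small functions. This is a minimal outline under stated assumptions, not the article's final code: the function names, the `records` key, and the sample payload are made up for illustration, and the demo at the bottom runs offline instead of calling a real URL.

```python
import csv
import io
import json


def fetch(url):
    """Steps 1 and 2: send the request and return the decoded JSON."""
    import requests  # third-party; imported here so the offline demo runs without it
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()


def parse(json_data):
    """Step 3: pull the fields we care about out of each record."""
    # Assumed layout: {"data": {"records": [{"name": ..., "confirm": ...}, ...]}}
    return [[item["name"], item["confirm"]] for item in json_data["data"]["records"]]


def save(rows, fileobj):
    """Step 4: write a header and the rows as CSV."""
    writer = csv.writer(fileobj)
    writer.writerow(["name", "confirm"])
    writer.writerows(rows)


# Offline demo with a made-up payload instead of a live request.
sample = json.loads(
    '{"data": {"records": [{"name": "A", "confirm": 10}, {"name": "B", "confirm": 3}]}}'
)
buffer = io.StringIO()
save(parse(sample), buffer)
print(buffer.getvalue())
```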

Data collection code

import requests     # sends the request (third-party, must be installed)
import csv          # built-in module, no installation needed

# mode='a': append to the file instead of overwriting it
# encoding='utf-8': file encoding (gbk is another common choice)
# newline='': prevents blank lines between rows
f = open('Epidemic data.csv', mode='a', encoding='utf-8', newline='')
csv_writer = csv.writer(f)
csv_writer.writerow(['name', 'confirm', 'confirmAdd', 'dead', 'heal', 'nowConfirm'])
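
Because mode='a' appends, running the script a second time writes the header row again. One possible way around that, shown as a sketch: `open_csv_for_append` is a hypothetical helper, and the temporary file path exists only for this demonstration.

```python
import csv
import os
import tempfile

HEADER = ['name', 'confirm', 'confirmAdd', 'dead', 'heal', 'nowConfirm']


def open_csv_for_append(path):
    """Open path in append mode; write the header only if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    f = open(path, mode='a', encoding='utf-8', newline='')
    writer = csv.writer(f)
    if is_new:
        writer.writerow(HEADER)
    return f, writer


# Demo in a temporary directory: two separate "runs", one header.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
for row in (['A', 1, 0, 0, 1, 0], ['B', 2, 1, 0, 1, 1]):
    f, writer = open_csv_for_append(demo_path)
    writer.writerow(row)
    f.close()

with open(demo_path, encoding='utf-8', newline='') as f:
    lines = list(csv.reader(f))
print(lines)
```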

headers: used to disguise the crawler as a browser (this is public data)

url = ',WomWorld,WomAboard'

Send the request

response = requests.get(url)

<Response [200]>: status code 200 means the request succeeded

Get the data

.text: returns the response body as text

.json(): parses a JSON response body into a dictionary, so values can be read by key

.content: returns the raw binary content, used for video, audio, and images
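
The relationship between the three can be approximated with the standard library alone. This is a rough sketch, not how requests is implemented: the response body is made up, and the explicit utf-8 decoding is an assumption (requests chooses the encoding itself).

```python
import json

# Made-up response body, as the raw bytes a server might send.
content = '{"data": {"name": "Peru", "confirm": 10}}'.encode('utf-8')

text = content.decode('utf-8')   # roughly what .text gives you: a str
data = json.loads(text)          # roughly what .json() gives you: a dict

print(type(content).__name__, type(text).__name__, type(data).__name__)
print(data['data']['confirm'])
```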

json_data = response.json()

Parse the data

The response has a very regular structure.

Structured data (JSON): read values directly by dictionary key, for example ['data']['WomAboard']

Unstructured data (web page source code): parse with css selectors, xpath, or re
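
To make the contrast concrete, here is a small sketch that reads the same made-up record both ways. The JSON string, the HTML fragment, and the regular expression are all invented for illustration; real pages need a pattern written for their actual markup.

```python
import json
import re

# Structured: a made-up JSON string, read by key.
json_text = '{"data": {"WomAboard": [{"name": "Peru", "confirm": 10}]}}'
first = json.loads(json_text)['data']['WomAboard'][0]
print(first['name'], first['confirm'])

# Unstructured: a made-up HTML fragment, read with a regular expression.
html = '<tr><td class="name">Peru</td><td class="confirm">10</td></tr>'
pattern = r'<td class="name">(.*?)</td><td class="confirm">(\d+)</td>'
matches = re.findall(pattern, html)
print(matches)
```

Note that the regular expression returns strings, so numbers extracted this way still need converting with int().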

WomAboard = json_data['data']['WomAboard']
# One entry per country or region (225 entries, indexes 0 to 224, at the time of writing)
for item in WomAboard:
    name = item['name']
    confirm = item['confirm']
    confirmAdd = item['confirmAdd']
    dead = item['dead']
    heal = item['heal']
    nowConfirm = item['nowConfirm']
    print(name, confirm, confirmAdd, dead, heal, nowConfirm)

Save the data

    csv_writer.writerow([name, confirm, confirmAdd, dead, heal, nowConfirm])
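
If a record is missing one of these keys, indexing with square brackets raises KeyError and the loop stops. A small sketch of a more forgiving approach using dict.get: `to_row` and the sample records are hypothetical, and whether 0 is a sensible default depends on how you plan to use the numbers.

```python
FIELDS = ['name', 'confirm', 'confirmAdd', 'dead', 'heal', 'nowConfirm']


def to_row(item, default=0):
    """Build one CSV row, substituting a default for any missing field."""
    return [item.get(field, default) for field in FIELDS]


# Made-up records; the second one is missing most fields.
records = [
    {'name': 'A', 'confirm': 5, 'confirmAdd': 1, 'dead': 0, 'heal': 4, 'nowConfirm': 1},
    {'name': 'B', 'confirm': 2},
]
rows = [to_row(item) for item in records]
print(rows)
```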

Visualization code

name_map = {
    'Singapore Rep.': 'Singapore',
    'Dominican Rep.': 'Dominican Republic',
    'Palestine': 'Palestine',
    'Bahamas': 'Bahamas',
    'Timor-Leste': 'East Timor',
    'Afghanistan': 'Afghanistan',
    'Guinea-Bissau': 'Guinea-Bissau',
    "Côte d'Ivoire": "Côte d'Ivoire",
    'Siachen Glacier': 'Siachen Glacier',
    "Br. Indian Ocean Ter.": 'British Indian Ocean Territory',
    'Angola': 'Angola',
    'Albania': 'Albania',
    'United Arab Emirates': 'United Arab Emirates',
    'Argentina': 'Argentina',
    'Armenia': 'Armenia',
    'French Southern and Antarctic Lands': 'French Southern and Antarctic Lands',
    'Australia': 'Australia',
    'Austria': 'Austria',
    'Azerbaijan': 'Azerbaijan',
    'Burundi': 'Burundi',
    'Belgium': 'Belgium',
    'Benin': 'Benin',
    'Burkina Faso': 'Burkina Faso',
    'Bangladesh': 'Bangladesh',
    'Bulgaria': 'Bulgaria',
    'The Bahamas': 'Bahamas',
    'Bosnia and Herz.': 'Bosnia and Herzegovina',
    'Belarus': 'Belarus',
    'Belize': 'Belize',
    'Bermuda': 'Bermuda',
    'Bolivia': 'Bolivia',
    'Brazil': 'Brazil',
    'Brunei': 'Brunei',
    'Bhutan': 'Bhutan',
    'Botswana': 'Botswana',
    'Central African Rep.': 'Central African Republic',
    'Canada': 'Canada',
    'Switzerland': 'Switzerland',
    'Chile': 'Chile',
    'China': 'China',
    'Ivory Coast': 'Ivory Coast',
    'Cameroon': 'Cameroon',
    'Dem. Rep. Congo': 'Democratic Republic of the Congo',
    'Congo': 'Republic of the Congo',
    'Colombia': 'Colombia',
    'Costa Rica': 'Costa Rica',
    'Cuba': 'Cuba',
    'N. Cyprus': 'Northern Cyprus',
    'Cyprus': 'Cyprus',
    'Czech Rep.': 'Czech Republic',
    'Germany': 'Germany',
    'Djibouti': 'Djibouti',
    'Denmark': 'Denmark',
    'Algeria': 'Algeria',
    'Ecuador': 'Ecuador',
    'Egypt': 'Egypt',
    'Eritrea': 'Eritrea',
    'Spain': 'Spain',
    'Estonia': 'Estonia',
    'Ethiopia': 'Ethiopia',
    'Finland': 'Finland',
    'Fiji': 'Fiji',
    'Falkland Islands': 'Falkland Islands',
    'France': 'France',
    'Gabon': 'Gabon',
    'United Kingdom': 'U.K.',
    'Georgia': 'Georgia',
    'Ghana': 'Ghana',
    'Guinea': 'Guinea',
    'Gambia': 'Gambia',
    'Guinea Bissau': 'Guinea-Bissau',
    'Eq. Guinea': 'Equatorial Guinea',
    'Greece': 'Greece',
    'Greenland': 'Greenland',
    'Guatemala': 'Guatemala',
    'French Guiana': 'French Guiana',
    'Guyana': 'Guyana',
    'Honduras': 'Honduras',
    'Croatia': 'Croatia',
    'Haiti': 'Haiti',
    'Hungary': 'Hungary',
    'Indonesia': 'Indonesia',
    'India': 'India',
    'Ireland': 'Ireland',
    'Iran': 'Iran',
    'Iraq': 'Iraq',
    'Iceland': 'Iceland',
    'Israel': 'Israel',
    'Italy': 'Italy',
    'Jamaica': 'Jamaica',
    'Jordan': 'Jordan',
    'Japan': 'Japan',
    'Kazakhstan': 'Kazakhstan',
    'Kenya': 'Kenya',
    'Kyrgyzstan': 'Kyrgyzstan',
    'Cambodia': 'Cambodia',
    'Korea': 'South Korea',
    'Kosovo': 'Kosovo',
    'Kuwait': 'Kuwait',
    'Lao PDR': 'Laos',
    'Lebanon': 'Lebanon',
    'Liberia': 'Liberia',
    'Libya': 'Libya',
    'Sri Lanka': 'Sri Lanka',
    'Lesotho': 'Lesotho',
    'Lithuania': 'Lithuania',
    'Luxembourg': 'Luxembourg',
    'Latvia': 'Latvia',
    'Morocco': 'Morocco',
    'Moldova': 'Moldova',
    'Madagascar': 'Madagascar',
    'Mexico': 'Mexico',
    'Macedonia': 'Macedonia',
    'Mali': 'Mali',
    'Myanmar': 'Myanmar',
    'Montenegro': 'Montenegro',
    'Mongolia': 'Mongolia',
    'Mozambique': 'Mozambique',
    'Mauritania': 'Mauritania',
    'Malawi': 'Malawi',
    'Malaysia': 'Malaysia',
    'Namibia': 'Namibia',
    'New Caledonia': 'New Caledonia',
    'Niger': 'Niger',
    'Nigeria': 'Nigeria',
    'Nicaragua': 'Nicaragua',
    'Netherlands': 'Netherlands',
    'Norway': 'Norway',
    'Nepal': 'Nepal',
    'New Zealand': 'New Zealand',
    'Oman': 'Oman',
    'Pakistan': 'Pakistan',
    'Panama': 'Panama',
    'Peru': 'Peru',
    'Philippines': 'Philippines',
    'Papua New Guinea': 'Papua New Guinea',
    'Poland': 'Poland',
    'Puerto Rico': 'Puerto Rico',
    'Dem. Rep. Korea': 'North Korea',
    'Portugal': 'Portugal',
    'Paraguay': 'Paraguay',
    'Qatar': 'Qatar',
    'Romania': 'Romania',
    'Russia': 'Russia',
    'Rwanda': 'Rwanda',
    'W. Sahara': 'Western Sahara',
    'Saudi Arabia': 'Saudi Arabia',
    'Sudan': 'Sudan',
    'S. Sudan': 'South Sudan',
    'Senegal': 'Senegal',
    'Solomon Is.': 'Solomon Islands',
    'Sierra Leone': 'Sierra Leone',
    'El Salvador': 'El Salvador',
    'Somaliland': 'Somaliland',
    'Somalia': 'Somalia',
    'Serbia': 'Serbia',
    'Suriname': 'Suriname',
    'Slovakia': 'Slovakia',
    'Slovenia': 'Slovenia',
    'Sweden': 'Sweden',
    'Swaziland': 'Swaziland',
    'Syria': 'Syria',
    'Chad': 'Chad',
    'Togo': 'Togo',
    'Thailand': 'Thailand',
    'Tajikistan': 'Tajikistan',
    'Turkmenistan': 'Turkmenistan',
    'East Timor': 'East Timor',
    'Trinidad and Tobago': 'Trinidad and Tobago',
    'Tunisia': 'Tunisia',
    'Turkey': 'Turkey',
    'Tanzania': 'Tanzania',
    'Uganda': 'Uganda',
    'Ukraine': 'Ukraine',
    'Uruguay': 'Uruguay',
    'United States': 'U.S.',
    'Uzbekistan': 'Uzbekistan',
    'Venezuela': 'Venezuela',
    'Vietnam': 'Vietnam',
    'Vanuatu': 'Vanuatu',
    'West Bank': 'West Bank',
    'Yemen': 'Yemen',
    'South Africa': 'South Africa',
    'Zambia': 'Zambia',
    'Zimbabwe': 'Zimbabwe',
    'Comoros': 'Comoros',
}
# Value ranges for the piecewise visual map legend
pieces = [
    {"min": 1000000},
    {"min": 100000, "max": 999999},
    {"min": 10000, "max": 99999},
    {"min": 1000, "max": 9999},
    {"min": 100, "max": 999},
    {"min": 0, "max": 99},
]

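Each dictionary in pieces describes one legend band, and a country is colored according to the band its count falls into. The sketch below only illustrates how such ranges partition the counts; `find_piece` is a hypothetical helper and is not how pyecharts does the lookup internally.

```python
pieces = [
    {"min": 1000000},
    {"min": 100000, "max": 999999},
    {"min": 10000, "max": 99999},
    {"min": 1000, "max": 9999},
    {"min": 100, "max": 999},
    {"min": 0, "max": 99},
]


def find_piece(value, pieces):
    """Return the index of the first piece whose range contains value, or None."""
    for index, piece in enumerate(pieces):
        low = piece.get("min", float("-inf"))
        high = piece.get("max", float("inf"))
        if low <= value <= high:
            return index
    return None


for count in (50, 12345, 1000000):
    print(count, "falls in piece", find_piece(count, pieces))
```
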
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Map

df = pd.read_csv('Epidemic data.csv')
# Convert the columns to lists
name = df['name'].tolist()
confirm = df['confirm'].tolist()
dead = df['dead'].tolist()
world_map = (
    Map()
    .add('Cumulative confirmed cases', [list(i) for i in zip(name, confirm)], 'world', name_map=name_map, is_map_symbol_show=False)
    .add('Deaths', [list(i) for i in zip(name, dead)], 'world', name_map=name_map, is_map_symbol_show=False)
    .set_global_opts(
        title_opts=opts.TitleOpts(title='World Epidemic Situation'),
        visualmap_opts=opts.VisualMapOpts(max_=1000000, is_piecewise=True, pieces=pieces),
    )
)
world_map.render('world_map.html')  # assumed file name; writes the chart to an HTML page
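
The list comprehension with zip is what turns two columns into the name/value pairs the map expects. A tiny sketch with made-up values, independent of pandas and pyecharts:

```python
# Made-up columns standing in for the CSV data.
name = ['A', 'B']
confirm = [100, 10]

# Pair each name with its value: one [name, value] list per region.
pairs = [list(i) for i in zip(name, confirm)]
print(pairs)  # [['A', 100], ['B', 10]]
```

zip stops at the shorter input, so columns of unequal length would silently drop rows.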

Show results

Well, that's all for today~


