Plotting COVID-19 Growth Curves for Different Countries with Python

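In this article we use Python to analyze and visualize how COVID-19 spread in different countries. We preprocess and clean the data with pandas, then draw an interactive growth curve for a chosen country with Plotly. The prediction step appears in the example program only as a placeholder.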

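We will chart the growth of cumulative confirmed cases for a country supplied by the user, and print the list of available countries/regions. The example program selects only the total_cases column, so cumulative deaths are not plotted. The dataset used in this article can be downloaded from https://ourworldindata.org/.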

These are the steps we will follow to plot COVID-19 growth curves for different countries with Python:

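Import the required libraries -

We start by importing the necessary libraries: pandas and plotly.express.
pandas is used for data manipulation and preprocessing.
plotly.express is used to create interactive visualizations.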

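Load the data -

The program loads the COVID-19 data from the 'owid-covid-data.csv' file with the pd.read_csv() function from pandas.
Among other columns, the data contains the date, the location, and the cumulative number of confirmed cases.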
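The loading step can be sketched with a tiny in-memory CSV, so that it runs without the real file. The rows and numbers below are invented for illustration; only the column names 'date', 'location' and 'total_cases' come from the text above. The parse_dates argument is an optional addition that the article's program does not use.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'owid-covid-data.csv' (all values are invented).
sample_csv = io.StringIO(
    "date,location,total_cases,new_cases\n"
    "2020-03-01,Country A,10,10\n"
    "2020-03-02,Country A,25,15\n"
    "2020-03-01,Country B,40,40\n"
)

# pd.read_csv accepts a file path or any file-like object.
# parse_dates converts the 'date' column to datetime values on load.
data = pd.read_csv(sample_csv, parse_dates=["date"])

print(data.shape)
print(data.dtypes)
```

With the real dataset, the only change would be passing the file path instead of the in-memory buffer.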

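Data preprocessing and cleaning -

We preprocess and clean the data to prepare it for analysis.
We select the columns relevant to the analysis: 'date', 'location', and 'total_cases'.
We then remove every row that contains a missing value with the dropna() function.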
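The order of these two operations matters, which a small sketch can show. The frame below is hypothetical (invented values, plus an extra 'new_tests' column assumed for illustration): selecting the three columns first means a row is dropped only when one of those columns is missing.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one missing total_cases value, one missing new_tests value.
raw = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-02", "2020-03-03"],
    "location": ["Country A", "Country A", "Country B"],
    "total_cases": [10.0, np.nan, 5.0],
    "new_tests": [100.0, 120.0, np.nan],
})

# Keep only the columns used later, then drop rows where any of them is missing.
data = raw[["date", "location", "total_cases"]].dropna()

# Dropping before selecting would also discard the row whose only gap
# is in the unrelated 'new_tests' column.
dropped_first = raw.dropna()

print(len(raw), len(data), len(dropped_first))
```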

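Get the list of available countries/regions -

We extract the unique country/region names from the 'location' column with the unique() function.
This gives an array of available countries/regions that we use later to validate the user's input.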
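As a small illustration of what unique() gives back, the sketch below uses a hypothetical 'location' column with invented names. The result is an array rather than a plain Python list, keeps names in order of first appearance, and supports membership tests with `in`.

```python
import pandas as pd

# Hypothetical 'location' column with repeated names.
data = pd.DataFrame(
    {"location": ["Country B", "Country A", "Country B", "Country C"]}
)

# unique() removes duplicates and preserves the order of first appearance.
countries = data["location"].unique()

print(countries.tolist())
print("Country A" in countries)
print("Atlantis" in countries)
```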

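Analyze the data -

We group the data by location with the groupby() function and take the maximum of total_cases for each location with max(). Because total_cases is cumulative, this maximum is each location's highest reported total.
The grouped result is then sorted in descending order of total cases.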
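The grouping and sorting can be tried on a few rows. This is a minimal sketch with invented numbers, using the same calls as the article's program.

```python
import pandas as pd

# Hypothetical cumulative totals for two locations (invented values).
data = pd.DataFrame({
    "location": ["Country A", "Country A", "Country B", "Country B"],
    "total_cases": [10, 25, 40, 70],
})

# One row per location, holding the largest cumulative total seen for it.
grouped_data = data.groupby("location")["total_cases"].max()

# Highest totals first.
sorted_data = grouped_data.sort_values(ascending=False)
print(sorted_data)
```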

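Plot the growth curve -

We prompt the user for a country/region name with the input() function.
If the name is valid (that is, it appears in the list of available countries/regions), we go on to plot the growth curve for that country/region.
We filter the data with boolean indexing (data['location'] == country_name) to keep only the rows for the specified country/region.
The filtered data is passed to the px.line() function from plotly.express to create a line chart.
The x parameter is set to 'date' and the y parameter to 'total_cases'.
The chart title is set to include the selected country/region name.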
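The validation and filtering logic can be isolated in a small function. The helper name select_country and the sample rows are hypothetical and not part of the article's program; sorting by date is an extra precaution so that the line is drawn in chronological order.

```python
import pandas as pd


def select_country(data: pd.DataFrame, country_name: str) -> pd.DataFrame:
    """Return the rows for one country, sorted by date (hypothetical helper)."""
    # Boolean Series with one True/False flag per row.
    mask = data["location"] == country_name
    if not mask.any():
        raise ValueError(f"Unknown country: {country_name!r}")
    return data[mask].sort_values("date")


# Hypothetical sample rows (invented values), deliberately out of date order.
data = pd.DataFrame({
    "date": ["2020-03-02", "2020-03-01", "2020-03-01"],
    "location": ["Country A", "Country A", "Country B"],
    "total_cases": [25, 10, 40],
})

country_data = select_country(data, "Country A")
print(country_data)
```

Raising an exception for an unknown name is one possible design; the article's program prints a message instead.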

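Display and save the chart -

We display the interactive growth curve with the fig.show() function.
To save the chart as an HTML file, we call fig.write_html() with the desired file name ('growth_curve.html').
A confirmation message is printed to indicate that the chart was saved successfully.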

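Display the list of available countries/regions -

Finally, we display the list of available countries/regions for the user's reference.
Each name is printed by a loop that iterates over 'countries'.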

Example

Below is an example program that follows the steps above -

import pandas as pd
import plotly.express as px

# Step 1: Load the data
data = pd.read_csv('owid-covid-data.csv')

# Step 2: Data preprocessing and cleaning
# Select the relevant columns for analysis
data = data[['date', 'location', 'total_cases']]

# Remove rows with missing values
data = data.dropna()

# Get the list of available countries
countries = data['location'].unique()

# Step 3: Analyzing the data
# Group the data by location and take the maximum cumulative total for each location
grouped_data = data.groupby('location')['total_cases'].max()

# Sort the data in descending order
sorted_data = grouped_data.sort_values(ascending=False)

# Step 4: Data prediction (placeholder; nothing is fitted in this program)
# A curve could be fitted here using polynomial regression or any other suitable method

# Step 5: Plotting the growth curve
# Prompt the user to enter a country name
country_name = input("Enter a country name: ")

if country_name in countries:
    # Keep only the rows for the specified country
    country_data = data[data['location'] == country_name]

    # Create the plot using Plotly
    fig = px.line(country_data, x='date', y='total_cases', title=f'COVID-19 Growth Curve in {country_name}')
    fig.show()

    # Save the plot as an HTML file
    fig.write_html('growth_curve.html')

    # Plain string: no placeholders, so no f-string prefix is needed
    print("Graph saved as 'growth_curve.html'")
else:
    print("Invalid country name. Please try again.")

# Display the list of available countries
print("Available countries:")
for country in countries:
    print(country)
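
Step 4 in the program above is only a comment. As one possible way to fill it in, the sketch below fits a degree-2 polynomial with NumPy and extrapolates a few days ahead. The data is synthetic (generated from a known quadratic, not taken from the dataset), and the choice of degree is an assumption. Polynomial extrapolation of epidemic curves can be unreliable, so treat this as an illustration of the technique the comment names, not as a forecasting method.

```python
import numpy as np

# Synthetic cumulative counts for ten consecutive days (not real data).
days = np.arange(10)
total_cases = 3.0 * days**2 + 2.0 * days + 5.0

# Fit a degree-2 polynomial; coefficients are returned highest power first.
coeffs = np.polyfit(days, total_cases, deg=2)
model = np.poly1d(coeffs)

# Extrapolate three days beyond the observed range.
future_days = np.arange(10, 13)
forecast = model(future_days)
print(np.round(forecast, 1))
```

With real data, days could be derived from the 'date' column (for example, the number of days since the first row for the country) and total_cases taken from the filtered frame.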