I'm having trouble creating a dictionary due to duplicated years - Python/Hurricane Project
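I am working on Codecademy's hurricane project.
Please see the variables and values below from the exercise. It is a sample of 34 hurricanes. Note that some years have more than one hurricane; for example, in 1932 both hurricane 'Bahamas' and hurricane 'Cuba II' occurred.
Names of hurricanes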
names = ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas', 'Cuba II', 'CubaBrownsville', 'Tampico', 'Labor Day', 'New England', 'Carol', 'Janet', 'Carla', 'Hattie', 'Beulah', 'Camille', 'Edith', 'Anita', 'David', 'Allen', 'Gilbert', 'Hugo', 'Andrew', 'Mitch', 'Isabel', 'Ivan', 'Emily', 'Katrina', 'Rita', 'Wilma', 'Dean', 'Felix', 'Matthew', 'Irma', 'Maria', 'Michael']
Months of hurricanes
`months = ['October', 'September', 'September', 'November', 'August', 'September', 'September', 'September', 'September', 'September', 'September', 'October', 'September', 'August', 'September', 'September', 'August', 'August', 'September', 'September', 'August', 'October', 'September', 'September', 'July', 'August', 'September', 'October', 'August', 'September', 'October', 'September', 'September', 'October']`
Years of hurricanes
`years = [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018]`
Maximum sustained winds (mph) of hurricanes
max_sustained_winds = [165, 160, 160, 175, 160, 160, 185, 160, 160, 175, 175, 160, 160, 175, 160, 175, 175, 190, 185, 160, 175, 180, 165, 165, 160, 175, 180, 185, 175, 175, 165, 180, 175, 160]
Areas affected by each hurricane
areas_affected = [['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'], ['The Bahamas', 'Northeastern United States'], ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], ['The Bahamas', 'Cuba', 'Florida', 'Texas', 'Tamaulipas'], ['Jamaica', 'Yucatn Peninsula'], ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], ['Southeastern United States', 'Northeastern United States', 'Southwestern Quebec'], ['Bermuda', 'New England', 'Atlantic Canada'], ['Lesser Antilles', 'Central America'], ['Texas', 'Louisiana', 'Midwestern United States'], ['Central America'], ['The Caribbean', 'Mexico', 'Texas'], ['Cuba', 'United States Gulf Coast'], ['The Caribbean', 'Central America', 'Mexico', 'United States Gulf Coast'], ['Mexico'], ['The Caribbean', 'United States East coast'], ['The Caribbean', 'Yucatn Peninsula', 'Mexico', 'South Texas'], ['Jamaica', 'Venezuela', 'Central America', 'Hispaniola', 'Mexico'], ['The Caribbean', 'United States East Coast'], ['The Bahamas', 'Florida', 'United States Gulf Coast'], ['Central America', 'Yucatn Peninsula', 'South Florida'], ['Greater Antilles', 'Bahamas', 'Eastern United States', 'Ontario'], ['The Caribbean', 'Venezuela', 'United States Gulf Coast'], ['Windward Islands', 'Jamaica', 'Mexico', 'Texas'], ['Bahamas', 'United States Gulf Coast'], ['Cuba', 'United States Gulf Coast'], ['Greater Antilles', 'Central America', 'Florida'], ['The Caribbean', 'Central America'], ['Nicaragua', 'Honduras'], ['Antilles', 'Venezuela', 'Colombia', 'United States East Coast', 'Atlantic Canada'], ['Cape Verde', 'The Caribbean', 'British Virgin Islands', 'U.S. Virgin Islands', 'Cuba', 'Florida'], ['Lesser Antilles', 'Virgin Islands', 'Puerto Rico', 'Dominican Republic', 'Turks and Caicos Islands'], ['Central America', 'United States Gulf Coast (especially Florida Panhandle)']]
Damages (USD($)) of hurricanes
damages = ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B', '10B', '26.5B', '6.2B', '5.37B', '23.3B', '1.01B', '125B', '12B', '29.4B', '1.76B', '720M', '15.1B', '64.8B', '91.6B', '25.1B']
Deaths for each hurricane
deaths = [90,4000,16,3103,179,184,408,682,5,1023,43,319,688,259,37,11,2068,269,318,107,65,19325,51,124,17,1836,125,87,45,133,603,138,3057,74]
The first task is to write a function that builds a dictionary of hurricanes with the name as the key:
I created the following function, and it works well.
def hurricane_dict(names, month, year, sustained_winds, areas_affected, damage, death):
    hurricane = {}
    for i in range(len(names)):
        hurricane[names[i]] = {"Name": names[i],
                               "Month": month[i],
                               "Year": year[i],
                               "Max Sustained Wind": sustained_winds[i],
                               "Areas Affected": areas_affected[i],
                               "Damage": damage[i],
                               "Deaths": death[i]}
    return hurricane
hurricane = hurricane_dict(names, months, years, max_sustained_winds, areas_affected, update_damages, deaths)
hurricane['Cuba I']
Output: {'Name': 'Cuba I',
 'Month': 'October',
 'Year': 1924,
 'Max Sustained Wind': 165,
 'Areas Affected': ['Central America',
  'Mexico',
  'Cuba',
  'Florida',
  'The Bahamas'],
 'Damage': 'Damages not recorded',
 'Deaths': 90}
The second task is to write another hurricane dictionary function, this time with the year as the key:
I could have built the dictionary with the same logic as before, but I am trying to use the existing dictionary (hurricane) as the argument for building the new one. See the code below:
def hurricane_by_year(dictionary):
    for name in names:
        for year in years:
            if year == hurricane[name]['Year']:
                hurricanes_by_year_v2[year] = hurricane[name]
    return hurricanes_by_year_v2
hurricanes_by_year_v2[1924]
Output: {'Name': 'Cuba I',
 'Month': 'October',
 'Year': 1924,
 'Max Sustained Wind': 165,
 'Areas Affected': ['Central America',
  'Mexico',
  'Cuba',
  'Florida',
  'The Bahamas'],
 'Damage': 'Damages not recorded',
 'Deaths': 90}
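At first glance the function and the dictionary look fine, but the dictionary does not contain the whole data sample. It keeps only one hurricane per year; if another hurricane occurred in the same year, it does not appear. The full sample has 34 hurricanes, but the dictionary I created has only 26 values.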
print(range(len(hurricanes_by_year_v2)))
range(0, 26)
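I would appreciate it if someone could help me write a correct function that takes the previous dictionary as its argument and builds a complete dictionary with the years as keys.
Thanks in advance,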
米贾尔
A year may have many values, so you should use a list to keep all the values for that year.
def hurricane_by_year(hurricanes):
    results = {}
    for name, data in hurricanes.items():
        year = data['Year']
        if year not in results:
            results[year] = []  # create list for all values
        results[year].append(data)  # add to list
    return results
hurricanes_by_year_v2 = hurricane_by_year(hurricane)
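A variation on the function above, as a minimal sketch: `collections.defaultdict(list)` creates the empty list the first time a year is seen, so the explicit `if year not in results` check is not needed. The three-record `sample` dictionary is a cut-down stand-in for the full `hurricane` dictionary, and the function name `group_by_year` is my own, not part of the exercise.

```python
from collections import defaultdict


def group_by_year(hurricanes):
    """Group hurricane records by year; each year maps to a list of records."""
    results = defaultdict(list)
    for data in hurricanes.values():
        # A missing year starts as an empty list automatically
        results[data['Year']].append(data)
    # Convert to a plain dict so that looking up a missing year raises KeyError
    return dict(results)


# Cut-down stand-in for the full `hurricane` dictionary (illustration only)
sample = {
    'Cuba I': {'Name': 'Cuba I', 'Year': 1924},
    'Bahamas': {'Name': 'Bahamas', 'Year': 1932},
    'Cuba II': {'Name': 'Cuba II', 'Year': 1932},
}

by_year = group_by_year(sample)
print([item['Name'] for item in by_year[1932]])
```

Converting back to a plain `dict` at the end is a design choice: a `defaultdict` would silently create an empty list for any year you look up, which can hide typos in the key.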
Full working code:
names = ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas', 'Cuba II', 'CubaBrownsville', 'Tampico', 'Labor Day', 'New England', 'Carol', 'Janet', 'Carla', 'Hattie', 'Beulah', 'Camille', 'Edith', 'Anita', 'David', 'Allen', 'Gilbert', 'Hugo', 'Andrew', 'Mitch', 'Isabel', 'Ivan', 'Emily', 'Katrina', 'Rita', 'Wilma', 'Dean', 'Felix', 'Matthew', 'Irma', 'Maria', 'Michael']
months = ['October', 'September', 'September', 'November', 'August', 'September', 'September', 'September', 'September', 'September', 'September', 'October', 'September', 'August', 'September', 'September', 'August', 'August', 'September', 'September', 'August', 'October', 'September', 'September', 'July', 'August', 'September', 'October', 'August', 'September', 'October', 'September', 'September', 'October']
years = [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018]
max_sustained_winds = [165, 160, 160, 175, 160, 160, 185, 160, 160, 175, 175, 160, 160, 175, 160, 175, 175, 190, 185, 160, 175, 180, 165, 165, 160, 175, 180, 185, 175, 175, 165, 180, 175, 160]
areas_affected = [['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'], ['The Bahamas', 'Northeastern United States'], ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], ['The Bahamas', 'Cuba', 'Florida', 'Texas', 'Tamaulipas'], ['Jamaica', 'Yucatn Peninsula'], ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], ['Southeastern United States', 'Northeastern United States', 'Southwestern Quebec'], ['Bermuda', 'New England', 'Atlantic Canada'], ['Lesser Antilles', 'Central America'], ['Texas', 'Louisiana', 'Midwestern United States'], ['Central America'], ['The Caribbean', 'Mexico', 'Texas'], ['Cuba', 'United States Gulf Coast'], ['The Caribbean', 'Central America', 'Mexico', 'United States Gulf Coast'], ['Mexico'], ['The Caribbean', 'United States East coast'], ['The Caribbean', 'Yucatn Peninsula', 'Mexico', 'South Texas'], ['Jamaica', 'Venezuela', 'Central America', 'Hispaniola', 'Mexico'], ['The Caribbean', 'United States East Coast'], ['The Bahamas', 'Florida', 'United States Gulf Coast'], ['Central America', 'Yucatn Peninsula', 'South Florida'], ['Greater Antilles', 'Bahamas', 'Eastern United States', 'Ontario'], ['The Caribbean', 'Venezuela', 'United States Gulf Coast'], ['Windward Islands', 'Jamaica', 'Mexico', 'Texas'], ['Bahamas', 'United States Gulf Coast'], ['Cuba', 'United States Gulf Coast'], ['Greater Antilles', 'Central America', 'Florida'], ['The Caribbean', 'Central America'], ['Nicaragua', 'Honduras'], ['Antilles', 'Venezuela', 'Colombia', 'United States East Coast', 'Atlantic Canada'], ['Cape Verde', 'The Caribbean', 'British Virgin Islands', 'U.S. Virgin Islands', 'Cuba', 'Florida'], ['Lesser Antilles', 'Virgin Islands', 'Puerto Rico', 'Dominican Republic', 'Turks and Caicos Islands'], ['Central America', 'United States Gulf Coast (especially Florida Panhandle)']]
damages = ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B', '10B', '26.5B', '6.2B', '5.37B', '23.3B', '1.01B', '125B', '12B', '29.4B', '1.76B', '720M', '15.1B', '64.8B', '91.6B', '25.1B']
deaths = [90,4000,16,3103,179,184,408,682,5,1023,43,319,688,259,37,11,2068,269,318,107,65,19325,51,124,17,1836,125,87,45,133,603,138,3057,74]
# ----------------------------------------
def hurricane_dict(names, month, year, sustained_winds, areas_affected, damage, death):
    results = {}
    for data in zip(names, month, year, sustained_winds, areas_affected, damage, death):
        results[data[0]] = {
            "Name": data[0],
            "Month": data[1],
            "Year": data[2],
            "Max Sustained Wind": data[3],
            "Areas Affected": data[4],
            "Damage": data[5],
            "Deaths": data[6]
        }
    return results
hurricane = hurricane_dict(names, months, years, max_sustained_winds, areas_affected, damages, deaths)
#print(hurricane['Cuba II'])
def hurricane_by_year(hurricanes):
    results = {}
    # iterate over the argument `hurricanes`, not the global `hurricane`
    for name, data in hurricanes.items():
        year = data['Year']
        if year not in results:
            results[year] = []
        results[year].append(data)
    return results
hurricanes_by_year_v2 = hurricane_by_year(hurricane)
print('\n--- year 1932 ---\n')
for item in hurricanes_by_year_v2[1932]:
    print(item)
    print('---')
Result:
--- year 1932 ---
{'Name': 'Bahamas', 'Month': 'September', 'Year': 1932, 'Max Sustained Wind': 160, 'Areas Affected': ['The Bahamas', 'Northeastern United States'], 'Damage': 'Damages not recorded', 'Deaths': 16}
---
{'Name': 'Cuba II', 'Month': 'November', 'Year': 1932, 'Max Sustained Wind': 175, 'Areas Affected': ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], 'Damage': '40M', 'Deaths': 3103}
---
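The count of 26 reported in the question can be reproduced with a short sketch. A dictionary holds one value per key, so assigning by year overwrites the earlier entry for that year, while appending to a list keeps every record. The `years` list is copied from the question; the names `overwritten` and `grouped` are only for illustration, and list positions stand in for the full hurricane records.

```python
years = [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955,
         1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989,
         1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007,
         2016, 2017, 2017, 2018]

overwritten = {}  # one value per year: a later record replaces an earlier one
grouped = {}      # one list per year: every record is kept

for index, year in enumerate(years):
    overwritten[year] = index
    grouped.setdefault(year, []).append(index)

print(len(years))                                      # 34 records in total
print(len(overwritten))                                # 26 distinct years
print(sum(len(items) for items in grouped.values()))   # 34 records kept
print(overwritten[1932], grouped[1932])                # 3 [2, 3]
```

So `len()` of the grouped dictionary is still 26 (one key per year); to check that nothing was lost, count the records inside the lists instead.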
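EDIT:
I think it would be simpler to use a DataFrame.
- It can easily select by year, month, or name.
- It can filter with <, >, or a range.
- It can do calculations such as sum and average/mean.
- It can plot the data.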
import pandas as pd
df = pd.DataFrame({
'Name': ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas', 'Cuba II', 'CubaBrownsville', 'Tampico', 'Labor Day', 'New England', 'Carol', 'Janet', 'Carla', 'Hattie', 'Beulah', 'Camille', 'Edith', 'Anita', 'David', 'Allen', 'Gilbert', 'Hugo', 'Andrew', 'Mitch', 'Isabel', 'Ivan', 'Emily', 'Katrina', 'Rita', 'Wilma', 'Dean', 'Felix', 'Matthew', 'Irma', 'Maria', 'Michael'],
'Month': ['October', 'September', 'September', 'November', 'August', 'September', 'September', 'September', 'September', 'September', 'September', 'October', 'September', 'August', 'September', 'September', 'August', 'August', 'September', 'September', 'August', 'October', 'September', 'September', 'July', 'August', 'September', 'October', 'August', 'September', 'October', 'September', 'September', 'October'],
'Year': [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018],
'Max sustained winds': [165, 160, 160, 175, 160, 160, 185, 160, 160, 175, 175, 160, 160, 175, 160, 175, 175, 190, 185, 160, 175, 180, 165, 165, 160, 175, 180, 185, 175, 175, 165, 180, 175, 160],
'Areas affected': [['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'], ['The Bahamas', 'Northeastern United States'], ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], ['The Bahamas', 'Cuba', 'Florida', 'Texas', 'Tamaulipas'], ['Jamaica', 'Yucatn Peninsula'], ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], ['Southeastern United States', 'Northeastern United States', 'Southwestern Quebec'], ['Bermuda', 'New England', 'Atlantic Canada'], ['Lesser Antilles', 'Central America'], ['Texas', 'Louisiana', 'Midwestern United States'], ['Central America'], ['The Caribbean', 'Mexico', 'Texas'], ['Cuba', 'United States Gulf Coast'], ['The Caribbean', 'Central America', 'Mexico', 'United States Gulf Coast'], ['Mexico'], ['The Caribbean', 'United States East coast'], ['The Caribbean', 'Yucatn Peninsula', 'Mexico', 'South Texas'], ['Jamaica', 'Venezuela', 'Central America', 'Hispaniola', 'Mexico'], ['The Caribbean', 'United States East Coast'], ['The Bahamas', 'Florida', 'United States Gulf Coast'], ['Central America', 'Yucatn Peninsula', 'South Florida'], ['Greater Antilles', 'Bahamas', 'Eastern United States', 'Ontario'], ['The Caribbean', 'Venezuela', 'United States Gulf Coast'], ['Windward Islands', 'Jamaica', 'Mexico', 'Texas'], ['Bahamas', 'United States Gulf Coast'], ['Cuba', 'United States Gulf Coast'], ['Greater Antilles', 'Central America', 'Florida'], ['The Caribbean', 'Central America'], ['Nicaragua', 'Honduras'], ['Antilles', 'Venezuela', 'Colombia', 'United States East Coast', 'Atlantic Canada'], ['Cape Verde', 'The Caribbean', 'British Virgin Islands', 'U.S. Virgin Islands', 'Cuba', 'Florida'], ['Lesser Antilles', 'Virgin Islands', 'Puerto Rico', 'Dominican Republic', 'Turks and Caicos Islands'], ['Central America', 'United States Gulf Coast (especially Florida Panhandle)']],
'Damages': ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B', '10B', '26.5B', '6.2B', '5.37B', '23.3B', '1.01B', '125B', '12B', '29.4B', '1.76B', '720M', '15.1B', '64.8B', '91.6B', '25.1B'],
'Deaths': [90,4000,16,3103,179,184,408,682,5,1023,43,319,688,259,37,11,2068,269,318,107,65,19325,51,124,17,1836,125,87,45,133,603,138,3057,74],
})
#groups = df.groupby('Year')
print('\n--- Year 1932 ---\n')
selected = df[ df['Year'] == 1932 ]
print( selected )
print('\n--- Name contains Cuba ---\n')
selected = df[ df['Name'].str.contains('Cuba') ]
print( selected )
print('\n--- Month August ---\n')
selected = df[ df['Month'] == 'August' ]
print( selected )
print('\n--- Deaths < 20 ---\n')
selected = df[ df['Deaths'] < 20 ].sort_values('Deaths')
print( selected[ ['Deaths', 'Year'] ] )
print('\n--- sum Deaths ---\n')
result = df['Deaths'].sum()
print( result )
print( f'{result:_}' ) # display with `_` to make it more readable
print('\n--- Area Mexico ---\n')
selected = df[ df['Areas affected'].apply(lambda item: 'Mexico' in item) ]
print( selected[ ['Year', 'Areas affected'] ].to_string() ) # `to_string()` to display without `...`
# ---
import matplotlib.pyplot as plt
df.plot(x='Year', y='Deaths')
plt.show()
Result:
--- Year 1932 ---
Name Month ... Damages Deaths
2 Bahamas September ... Damages not recorded 16
3 Cuba II November ... 40M 3103
[2 rows x 7 columns]
--- Name contains Cuba ---
Name Month ... Damages Deaths
0 Cuba I October ... Damages not recorded 90
3 Cuba II November ... 40M 3103
4 CubaBrownsville August ... 27.9M 179
[3 rows x 7 columns]
--- Month August ---
Name Month ... Damages Deaths
4 CubaBrownsville August ... 27.9M 179
13 Camille August ... 1.42B 259
16 David August ... 1.54B 2068
17 Allen August ... 1.24B 269
20 Andrew August ... 26.5B 65
25 Katrina August ... 125B 1836
28 Dean August ... 1.76B 45
[7 rows x 7 columns]
--- Deaths < 20 ---
Deaths Year
8 5 1953
15 11 1977
2 16 1932
24 17 2005
--- sum Deaths ---
39489
39_489
--- Area Mexico ---
Year Areas affected
0 1924 [Central America, Mexico, Cuba, Florida, The Bahamas]
12 1967 [The Caribbean, Mexico, Texas]
14 1971 [The Caribbean, Central America, Mexico, United States Gulf Coast]
15 1977 [Mexico]
17 1980 [The Caribbean, Yucatn Peninsula, Mexico, South Texas]
18 1988 [Jamaica, Venezuela, Central America, Hispaniola, Mexico]
24 2005 [Windward Islands, Jamaica, Mexico, Texas]
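The original question (everything for one year under one key) can also be answered from the DataFrame with `groupby`. This is a minimal sketch on a four-row subset of the data above, so that it stays short; the variable names `names_by_year` and `deaths_by_year` are my own.

```python
import pandas as pd

# Four rows taken from the full DataFrame above (illustration only)
df = pd.DataFrame({
    'Name': ['Cuba I', 'Bahamas', 'Cuba II', 'Tampico'],
    'Year': [1924, 1932, 1932, 1933],
    'Deaths': [90, 16, 3103, 184],
})

# Names of all hurricanes in each year, as a dict of lists keyed by year
names_by_year = df.groupby('Year')['Name'].apply(list).to_dict()
print(names_by_year)

# Total deaths in each year, as a dict keyed by year
deaths_by_year = df.groupby('Year')['Deaths'].sum().to_dict()
print(deaths_by_year)
```

With the full DataFrame the same two lines should work unchanged, since they only rely on the 'Year', 'Name' and 'Deaths' columns.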