Here are the first few rows of the dataset.
As you can see, there are 10 columns in the dataset.
Notice the column 'Incident Date'. Using this column, find the month in which the most kills + injuries happened.
Print a single value denoting the month in which the most kills + injuries happened.
import pandas as pd
# Read the gun violence dataset
gun = pd.read_csv('https://query.data.world/s/3iebgxp57luarsikz5wtcwahpjwed7')
# Extract the month from the Incident Date column
gun['Incident Month'] = gun['Incident Date'].apply(lambda x: x.split('-')[1])
# Group the data by Incident Month and sum the kills and injuries for each month
gun_g = gun.pivot_table(values = ['# Killed', '# Injured'], index = ['Incident Month'], aggfunc = 'sum')
# Add a column 'Total' which will contain the sum of # Injured and # Killed
gun_g['Total'] = gun_g['# Injured'] + gun_g['# Killed']
# Now, since the incident month will be in the index, you need to remove it from
# the index of the dataframe so that you can access it as a column easily
gun_g.reset_index(inplace=True)
# Extract the month in which most kills + injuries happened and store it in a
# variable month
month = gun_g[gun_g['Total'] == gun_g['Total'].max()]
# Print the final value of month using indexing since month will have all the
# columns for the single row that it has
print(month['Incident Month'].iloc[0])
# Alternatively, instead of the last 2 lines of code, you could have also used the
# following:
# Sort the values in the dataset in descending order
#gun_g.sort_values(by='Total', inplace=True, ascending=False)
# Print the 'Incident Month' value of the first row. Use iloc (positional) here:
# after sorting, the row labelled 0 is not necessarily the first row
#print(gun_g.iloc[0]['Incident Month'])
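Another way to get the month with the highest combined total is `groupby` followed by `idxmax`, which returns the index label of the largest value directly. The sketch below is only an illustration: the `toy` DataFrame and its values are made up, and its date strings assume the month is the second '-'-separated field, as the solution above does.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset; the values are made up
# for illustration, and the month is assumed to be the second '-'-separated field.
toy = pd.DataFrame({
    'Incident Date': ['01-01-2014', '15-01-2014', '03-02-2014', '20-03-2014'],
    '# Killed': [1, 2, 0, 4],
    '# Injured': [3, 1, 2, 0],
})

# Extract the month the same way as in the solution above
toy['Incident Month'] = toy['Incident Date'].apply(lambda x: x.split('-')[1])

# Add kills and injuries per incident, then total them per month
toy['Total'] = toy['# Killed'] + toy['# Injured']
totals = toy.groupby('Incident Month')['Total'].sum()

# idxmax returns the index label (here, the month) with the largest total
top_month = totals.idxmax()
print(top_month)
```

If two months tie for the highest total, `idxmax` returns only the first one it encounters, whereas the boolean filter used in the solution keeps every tied row.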