# Hi. Need help on below query: You are provided with 2 fictional dataframes based on Indian Premier League (IPL)

, 11/04/2021 - 09:32

### Practice Exercise 1

This is the first exercise for Pandas. You are provided with 2 fictional dataframes based on Indian Premier League (IPL). The first dataframe contains the data for all the teams who participated in the year 2018. In the second dataframe, the data is for the year 2017.

In [1]:

``````# import the required libraries - numpy and pandas
import numpy as np
import pandas as pd
``````
##### Data Dictionary
• Team: Team name
• Matches: Total number of matches played
• Won: Number of matches won
• Lost: Number of matches lost
• Tied': Number of matches tied
• N/R: Number of matches with no result
• NRR: Net run rate (Rate of scoring the runs)
• For: Sum of runs scored by the team against other teams
• Against: Sum of runs scored by the opposite playing teams

In [2]:

``````# ipl18 contains the data for 2018
ipl18 = pd.DataFrame({'Team': ['SRH', 'CSK', 'KKR', 'RR', 'MI', 'RCB', 'KXIP', 'DD'],
'Matches': [14, 14, 14, 14, 14, 14, 14, 14],
'Won': [9, 9, 8, 7, 6, 6, 6, 5],
'Lost': [5, 5, 6, 7, 8, 8, 8, 9],
'Tied': [0, 0, 0, 0, 0, 0, 0, 0],
'N/R': [0, 0, 0, 0, 0, 0, 0, 0],
'NRR': [0.284, 0.253, -0.070, -0.250, 0.317, 0.129, -0.502, -0.222],
'For': [2230, 2488, 2363, 2130, 2380, 2322, 2210, 2297],
'Against': [2193, 2433, 2425, 2141, 2282, 2383, 2259, 2304]},
index = range(1,9)
)
``````

In [3]:

``````# print the entire dataframe to check the entries
ipl18
``````

Out[3]:

In [11]:

``ipl18[ipl18["NRR"]>0]``

Out[11]:

In [12]:

``ipl18[ipl18["For"]>ipl18["Against"]]``

Out[12]:

In [24]:

``ipl18[(ipl18["NRR"]>0)&(ipl18["For"]>ipl18["Against"])]``

Out[24]:

In [5]:

``````# ipl17 contains the data for 2017
ipl17 = pd.DataFrame({'Team': ['MI', 'RPS', 'SRH', 'KKR', 'KXIP', 'DD', 'GL', 'RCB'],
'Matches': [14, 14, 14, 14, 14, 14, 14, 14],
'Won': [10, 9, 8, 8, 7, 6, 4, 3],
'Lost': [4, 5, 5, 6, 7, 8, 10, 10],
'Tied': [0, 0, 0, 0, 0, 0, 0, 0],
'N/R': [0, 0, 1, 0, 0, 0, 0, 1],
'NRR': [0.784, 0.176, 0.469, 0.641, 0.123, -0.512, -0.412, -1.299],
'For': [2407, 2180, 2221, 2329, 2207, 2219, 2406, 1845],
'Against': [2242, 2165, 2118, 2300, 2229, 2255, 2472, 2033]},
index = range(1,9)
)
``````

In [6]:

``````# print the entire dataframe to check the entries
ipl17
``````

Out[6]:

In [35]:

``````a=ipl17.append(ipl18)
a
``````

Out[35]:

In [56]:

``a.info()``

As a part of this exercise, solve the questions that are provided below. There are few tasks that you will have to perform to be able to answer the questions.

You have to create a new column 'Points' in both the dataframes that stores the total points scored by each team. The following scoring system is used to calculate the points of a team:

• Win: 2 points
• Loss: 0 points
• Tie: 1 point
• N/R (no result): 1 point

In [ ]:

``````# Type your code here

``````
##### Q1: Extract Top Four Teams

Which of the following commands can you use to extract the top 4 teams in the dataset ‘ipl18’ with just the ‘Team’ and ‘Points’ column?

• ipl18.loc [0:3, [‘Team’, ‘Points’]]
• ipl18.loc [1:4, [‘Team’, ‘Points’]]
• ipl18.loc [0:4, [‘Team’, ‘Points’]]
• ipl18.loc [1:3, [‘Team’, ‘Points’]]

In [ ]:

``# Type your code here``
##### Q2: Filtering based on conditions

Suppose in ‘ipl18’, you want to filter out the teams that have an NRR greater than zero, and for which the ‘For’ score exceeds the ‘Against’ score, i.e. both the conditions should be satisfied. Which teams will be left after you perform the above filtration? (Run the commands on the Python Notebook provided, rather than performing a manual calculation)

• CSK, MI
• SRH, CSK, MI
• SRH, CSK, RCB
• SRH, CSK, MI, RCB

In [ ]:

``# Type your code here``
##### Q3: Operations on multiple dataframes

If all the stats are taken for both ‘ipl17’ and ‘ipl18’, which team with its total points greater than 25 will have the highest win percentage?

• KKR
• CSK
• RPS
• SRH

``````ipl18['Points']=ipl18['Won']*2 +ipl18['Lost']*0+ ipl18['N/R']*1+ ipl18['Tied']*1
ipl18``````

Output:

Team Matches Won Lost Tied N/R NRR For Against Points
1 SRH 14 9 5 0 0 0.284 2230 2193 18
2 CSK 14 9 5 0 0 0.253 2488 2433 18
3 KKR 14 8 6 0 0 -0.070 2363 2425 16
4 RR 14 7 7 0 0 -0.250 2130 2141 14
5 MI 14 6 8 0 0 0.317 2380 2282 12
6 RCB 14 6 8 0 0 0.129 2322 2383 12
7 KXIP 14 6 8 0 0 -0.502 2210 2259 12
8 DD 14 5 9 0 0 -0.222 2297 2304 10
``````ipl17['Points']=ipl17['Won']*2 +ipl17['Lost']*0+ ipl17['N/R']*1+ ipl17['Tied']*1
ipl17``````

Output:

Team Matches Won Lost Tied N/R NRR For Against Points
1 MI 14 10 4 0 0 0.784 2407 2242 20
2 RPS 14 9 5 0 0 0.176 2180 2165 18
3 SRH 14 8 5 0 1 0.469 2221 2118 17
4 KKR 14 8 6 0 0 0.641 2329 2300 16
5 KXIP 14 7 7 0 0 0.123 2207 2229 14
6 DD 14 6 8 0 0 -0.512 2219 2255 12
7 GL 14 4 10 0 0 -0.412 2406 2472 8
8 RCB 14 3 10 0 1 -1.299 1845 2033 7

### Extract Top Four Teams:

``````q1 = ipl18.loc [0:4, ['Team', 'Points']]
print(q1)``````

Output:

```  Team  Points
1  SRH      18
2  CSK      18
3  KKR      16
4   RR      14```

### Filtering based on conditions:

``q2 = ipl18[(ipl18["NRR"]>0) & (ipl18["For"] > ipl18["Against"])]``

Output:

Team Matches Won Lost Tied N/R NRR For Against Points
1 SRH 14 9 5 0 0 0.284 2230 2193 18
2 CSK 14 9 5 0 0 0.253 2488 2433 18
5 MI 14 6 8 0 0 0.317 2380 2282 12

### Operations on multiple dataframes:

``````# append data of 2 years
a = ipl17.append(ipl18)
# sum won,lost and points by team
b = a[["Team","Won","Lost","Points"]].groupby(["Team"]).sum()
# sort by number of won
b.sort_values(by=["Won"], inplace=True, ascending=False)
# Team with greater than 25 points
c = b[b["Points"]>25]
# calculate % won
q3 = (c["Won"] / (c["Won"] + c["Lost"]))*100
print(q3)``````

Output:

```Team
SRH     62.962963
KKR     57.142857
MI      57.142857
KXIP    46.428571
dtype: float64```