Pandas DataFrame Sum : sum() function

The Pandas DataFrame.sum() function returns the sum of the values along the given axis. With the index axis (axis=0), it adds up all the values in a column and repeats this for every column, returning a Series that contains the sum of each column.

It can also skip missing values in the DataFrame while calculating the sum.

Syntax

DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, min_count=0,
 **kwargs)

Parameters:

  • axis: Axis along which the sum is calculated, {index (0), columns (1)}. 0 sums down the rows of each column, giving one result per column; 1 sums across the columns of each row, giving one result per row.
  • skipna: By default True. Excludes null values when computing the result.
  • level: By default None. If the index is a MultiIndex, sums the items within the given level only. This argument was deprecated in pandas 1.3 and removed in pandas 2.0; use groupby(level=...).sum() instead.
  • numeric_only: By default None. If True, includes only float, int and boolean columns. If None, it attempts to use everything. (In pandas 2.0 and later the default is False.)
  • min_count: By default 0. The required number of valid (non-null) values to perform the operation; if fewer are present, the result is NaN.
  • **kwargs: Additional keyword arguments to be passed to the function.

Return:

Returns the sum as a Series, or as a DataFrame if a level is specified.
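As a quick check of the return type described above, here is a minimal sketch on a toy DataFrame. The frame and the names df, col_totals and row_totals are illustrative assumptions, not part of the original examples.

```python
import pandas as pd

# Hypothetical toy data, used only to inspect what sum() returns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

col_totals = df.sum(axis=0)  # one total per column, indexed by column label
row_totals = df.sum(axis=1)  # one total per row, indexed by row label

print(type(col_totals).__name__)  # Series
print(list(col_totals.index))     # ['a', 'b']
print(list(row_totals.index))     # [0, 1, 2]
```

In both cases the result is a Series; only the labels in its index change with the axis.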

Examples

import pandas as pd
import numpy as np

Students_name=['John','Alice','Jack','Tom','Monica']
# np.nan marks a missing mark (the np.NaN alias was removed in NumPy 2.0)
Physics_marks=[44,np.nan,47,28,39]
Chemistry_marks=[45,46,np.nan,40,30]
Maths_marks=[35,38,29,30,np.nan]

Students_marks=pd.DataFrame({'Name':Students_name, 'Physics':Physics_marks,
                             'Chemistry':Chemistry_marks,
                             'Maths':Maths_marks})
Students_marks

Output:

     Name  Physics  Chemistry  Maths
0    John     44.0       45.0   35.0
1   Alice      NaN       46.0   38.0
2    Jack     47.0        NaN   29.0
3     Tom     28.0       40.0   30.0
4  Monica     39.0       30.0    NaN

1. When axis = 0, i.e. summing down the rows of each column (excluding null values)

Students_marks.sum(axis=0)

Output

Name         JohnAliceJackTomMonica
Physics                       158.0
Chemistry                     161.0
Maths                         132.0
dtype: object
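In the output above, the 'Name' column is "summed" by concatenating its strings, which is rarely what is wanted. A minimal sketch of restricting the sum to the numeric columns with numeric_only=True follows; the data is the same as above, and the lowercase variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same marks as in the example above.
students_marks = pd.DataFrame({
    "Name": ["John", "Alice", "Jack", "Tom", "Monica"],
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
})

# numeric_only=True drops the string column before summing.
numeric_totals = students_marks.sum(axis=0, numeric_only=True)
print(numeric_totals)
# Physics      158.0
# Chemistry    161.0
# Maths        132.0
# dtype: float64
```

The result now has dtype float64 instead of object, because no string values are mixed in.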

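2. When axis = 1, i.e. summing across the columns of each row (excluding null values)

# Sum of the values across the columns of each row;
# numeric_only=True skips the non-numeric 'Name' column.
Students_marks.sum(axis=1, numeric_only=True)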

Output

0    124.0
1     84.0
2     76.0
3     98.0
4     69.0
dtype: float64

3. Including null values

# Without skipping null values: any row containing a NaN sums to NaN.
Students_marks.sum(axis=1, skipna=False, numeric_only=True)

Output

0    124.0
1      NaN
2      NaN
3     98.0
4      NaN
dtype: float64

4. Using min_count

# Every row has at least 2 valid values, so no result becomes NaN.
Students_marks.sum(axis=1, min_count=2, numeric_only=True)

Output

0    124.0
1     84.0
2     76.0
3     98.0
4     69.0
dtype: float64
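The min_count=2 call above leaves every row unchanged because each student has at least two valid marks. As a minimal sketch of the other case, the block below rebuilds only the three marks columns (the name marks is illustrative) and raises the threshold to 3, so any row with a missing mark yields NaN.

```python
import numpy as np
import pandas as pd

# Marks columns only, same values as in the examples above.
marks = pd.DataFrame({
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
})

# Require 3 valid values per row; rows 1, 2 and 4 have only 2.
strict_totals = marks.sum(axis=1, min_count=3)
print(strict_totals)
# 0    124.0
# 1      NaN
# 2      NaN
# 3     98.0
# 4      NaN
# dtype: float64
```

Unlike skipna=False, min_count lets you choose how many valid values are enough before a sum is reported.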

5. Using a specific level in a MultiIndex DataFrame

Students_name=['John','Alice','Jack','Tom','Monica']
Roll_No=[10,11,15,17,25]

Physics_marks=[44,np.nan,47,28,39]
Chemistry_marks=[45,46,np.nan,40,30]
Maths_marks=[35,38,29,30,np.nan]

Students_marks=pd.DataFrame({'Name':Students_name, 'Roll No':Roll_No ,'Physics':Physics_marks,
                             'Chemistry':Chemistry_marks,
                             'Maths':Maths_marks})

Students_marks.set_index(['Name','Roll No'], inplace=True)
print(Students_marks)

Output

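                Physics  Chemistry  Maths
Name   Roll No
John   10          44.0       45.0   35.0
Alice  11           NaN       46.0   38.0
Jack   15          47.0        NaN   29.0
Tom    17          28.0       40.0   30.0
Monica 25          39.0       30.0    NaN

# Sum of values for the level 'Roll No' only.
# Note: the level argument was deprecated in pandas 1.3 and removed in 2.0.
Students_marks.sum(level='Roll No')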

Output

         Physics  Chemistry  Maths
Roll No
10          44.0       45.0   35.0
11           0.0       46.0   38.0
15          47.0        0.0   29.0
17          28.0       40.0   30.0
25          39.0       30.0    0.0
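On pandas 2.0 and later, sum() no longer accepts level, so the call above fails there. A minimal sketch of the usual replacement, grouping on the index level and then summing, is shown below; it rebuilds the same MultiIndex frame, and the lowercase variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as above, indexed by ('Name', 'Roll No').
students_marks = pd.DataFrame({
    "Name": ["John", "Alice", "Jack", "Tom", "Monica"],
    "Roll No": [10, 11, 15, 17, 25],
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
}).set_index(["Name", "Roll No"])

# Group the rows by the 'Roll No' index level, then sum each group.
by_roll = students_marks.groupby(level="Roll No").sum()
print(by_roll)
```

As in the output above, a group whose values are all NaN sums to 0.0, because the default min_count is 0.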