Pandas DataFrame Sum: sum() function

The Pandas DataFrame.sum() function returns the sum of the values along the given axis. With the index axis (axis=0), it adds up the values in each column, one column at a time, and returns a Series containing one sum per column.

It can also skip the missing values in the DataFrame while calculating the sum.
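
As a minimal sketch of this behaviour, the snippet below sums a small made-up DataFrame twice: once with the default handling of missing values and once with skipna=False. The column names a and b are illustrative only and are not part of the example used later in this article.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame: column "a" has one missing value
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# Default: missing values are skipped, so "a" sums to 1.0 + 3.0
col_sums = df.sum()

# skipna=False: any missing value makes that column's sum NaN
strict_sums = df.sum(skipna=False)

print(col_sums)
print(strict_sums)
```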

Syntax

DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, min_count=0,
 **kwargs)

Parameters:

  • axis: Axis along which the sum is calculated, {index (0), columns (1)}. With 0, values are summed down the rows, giving one sum per column; with 1, values are summed across the columns, giving one sum per row.
  • skipna: By default True. Excludes null (NaN) values when computing the result.
  • level: By default None. If the index is a MultiIndex, the sum is computed for each group in the given level. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0.
  • numeric_only: By default None. If True, only float, int and boolean columns are included. If None, it attempts to use everything.
  • min_count: By default 0. The minimum number of valid (non-null) values required to perform the operation; if fewer are present, the result is NaN.
  • **kwargs: Additional keyword arguments to be passed to the function.
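
To make two of these parameters concrete, here is a small sketch showing numeric_only and min_count together. The DataFrame and its column names (label, value, empty) are invented for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative frame: one text column, one partly missing column,
# and one column that contains no valid values at all
df = pd.DataFrame({
    "label": ["x", "y", "z"],
    "value": [1.5, np.nan, 2.5],
    "empty": [np.nan, np.nan, np.nan],
})

# numeric_only=True leaves the text column out of the result.
# With the default min_count=0, an all-NaN column sums to 0.0.
numeric_sums = df.sum(numeric_only=True)

# min_count=1 requires at least one valid value per column,
# so the all-NaN column now yields NaN instead of 0.0.
guarded_sums = df.sum(numeric_only=True, min_count=1)

print(numeric_sums)
print(guarded_sums)
```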

Return:

Returns the sum in the form of a Series, or a DataFrame if a level is specified.
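
The shape of the result can be checked with a quick sketch. The frame below (columns x and y, row labels r1 to r3) is made up for this purpose; it shows which labels end up in the index of the returned Series for each axis.

```python
import pandas as pd

# Illustrative frame with explicit row labels
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]},
                  index=["r1", "r2", "r3"])

# axis=0: one sum per column, indexed by the column labels
per_column = df.sum(axis=0)

# axis=1: one sum per row, indexed by the row labels
per_row = df.sum(axis=1)

# Summing the resulting Series gives a single scalar total
grand_total = df.sum().sum()

print(per_column)
print(per_row)
print(grand_total)
```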

Examples

import pandas as pd
import numpy as np

# Marks for five students; np.nan marks a missing score
Students_name = ['John', 'Alice', 'Jack', 'Tom', 'Monica']
Physics_marks = [44, np.nan, 47, 28, 39]
Chemistry_marks = [45, 46, np.nan, 40, 30]
Maths_marks = [35, 38, 29, 30, np.nan]

Students_marks = pd.DataFrame({'Name': Students_name, 'Physics': Physics_marks,
                               'Chemistry': Chemistry_marks,
                               'Maths': Maths_marks})
Students_marks

Output:

     Name  Physics  Chemistry  Maths
0    John     44.0       45.0   35.0
1   Alice      NaN       46.0   38.0
2    Jack     47.0        NaN   29.0
3     Tom     28.0       40.0   30.0
4  Monica     39.0       30.0    NaN

1. When axis = 0, i.e. down the rows of each column (excluding null values)

Students_marks.sum(axis=0)

Output

Name         JohnAliceJackTomMonica
Physics                       158.0
Chemistry                     161.0
Maths                         132.0
dtype: object
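
In this output the Name column holds strings, so its "sum" is the concatenation of all the names and the result has dtype object. If only the numeric totals are wanted, one option is to pass numeric_only=True. The sketch below rebuilds the same data under the variable name students, which is chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Same marks as in the example above
students = pd.DataFrame({
    "Name": ["John", "Alice", "Jack", "Tom", "Monica"],
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
})

# numeric_only=True drops the string column before summing,
# so the result is a float Series with one total per subject
subject_totals = students.sum(axis=0, numeric_only=True)

print(subject_totals)
```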

2. When axis = 1, i.e. across the columns of each row (excluding null values)

# Sum of the values in each row, across the columns.
# numeric_only=True leaves out the string column 'Name'.
Students_marks.sum(axis=1, numeric_only=True)

Output

0    124.0
1     84.0
2     76.0
3     98.0
4     69.0
dtype: float64
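
The row sums above are labelled only by position. A possible variation, sketched below, is to use the student names as the index and store the row sums in a new column. The names students, marks, subjects and the column Total are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Same marks as in the example above
students = pd.DataFrame({
    "Name": ["John", "Alice", "Jack", "Tom", "Monica"],
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
})

subjects = ["Physics", "Chemistry", "Maths"]

# Index by name so that each row sum is labelled with the student
marks = students.set_index("Name")

# Sum only the subject columns across each row; NaN is skipped by default
marks["Total"] = marks[subjects].sum(axis=1)

print(marks)
```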

3. Including null values

# Without skipping null values: any row with a NaN gets a NaN sum
Students_marks.sum(axis=1, skipna=False, numeric_only=True)

Output

0    124.0
1      NaN
2      NaN
3     98.0
4      NaN
dtype: float64

4. Using min_count

# Every row has at least 2 valid values, so no row sum becomes NaN
Students_marks.sum(axis=1, min_count=2, numeric_only=True)

Output

0    124.0
1     84.0
2     76.0
3     98.0
4     69.0
dtype: float64

5. Using Specific level in Multi-Index DataFrame

Students_name = ['John', 'Alice', 'Jack', 'Tom', 'Monica']
Roll_No = [10, 11, 15, 17, 25]

Physics_marks = [44, np.nan, 47, 28, 39]
Chemistry_marks = [45, 46, np.nan, 40, 30]
Maths_marks = [35, 38, 29, 30, np.nan]

Students_marks = pd.DataFrame({'Name': Students_name, 'Roll No': Roll_No,
                               'Physics': Physics_marks,
                               'Chemistry': Chemistry_marks,
                               'Maths': Maths_marks})

# Use 'Name' and 'Roll No' together as a two-level (Multi) index
Students_marks.set_index(['Name', 'Roll No'], inplace=True)
print(Students_marks)

Output

                Physics  Chemistry  Maths
Name   Roll No
John   10          44.0       45.0   35.0
Alice  11           NaN       46.0   38.0
Jack   15          47.0        NaN   29.0
Tom    17          28.0       40.0   30.0
Monica 25          39.0       30.0    NaN

# Sum of values for the level 'Roll No' only.
# The level argument was removed in pandas 2.0, so this call needs an older pandas.
Students_marks.sum(level='Roll No')

Output

         Physics  Chemistry  Maths
Roll No
10          44.0       45.0   35.0
11           0.0       46.0   38.0
15          47.0        0.0   29.0
17          28.0       40.0   30.0
25          39.0       30.0    0.0
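
In pandas 2.0 and later, sum() no longer accepts the level argument. Grouping by the index level and then summing is the usual replacement and should give the same table. The sketch below rebuilds the Multi-Index data under the illustrative name students.

```python
import numpy as np
import pandas as pd

# Same data as above, indexed by 'Name' and 'Roll No'
students = pd.DataFrame({
    "Name": ["John", "Alice", "Jack", "Tom", "Monica"],
    "Roll No": [10, 11, 15, 17, 25],
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
}).set_index(["Name", "Roll No"])

# Group the rows by the 'Roll No' index level and sum each group.
# Missing values are skipped, so a group with only NaN sums to 0.0.
by_roll = students.groupby(level="Roll No").sum()

print(by_roll)
```

Each roll number occurs once here, so every group contains a single row, and the NaN entries show up as 0.0 just as in the output above.
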
Tue, 02/16/2021 - 17:54

Authored by

Devanshi is working as a Data Scientist with iVagus. She has expertise in Python, NumPy, Pandas and other data science technologies.