Pandas DataFrame Sum : sum() function

The Pandas DataFrame.sum() function returns the sum of the values along the given axis. If the axis is the index (axis=0), it adds up all the values in a column and repeats this for every column, returning a Series that contains the sum of each column.

It can also skip the missing values in the DataFrame while calculating the sum.

Syntax

DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, min_count=0,
 **kwargs)

Parameters:

axis: Axis along which the sum of values is to be calculated, {index (0), columns (1)}. 0 adds the values down the rows (along the index), giving one sum per column; 1 adds the values across the columns, giving one sum per row.
skipna: By Default True. It is used to exclude null values while computing the result.
level: By Default None. It is used when the index is a MultiIndex; the sum is then computed for each value of the given level only. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0.
numeric_only: By Default None. If True, it will include only float, int, boolean columns. If None, it will attempt to use everything.
min_count: By Default 0. The required number of valid values to perform the operation.
**kwargs: Additional keyword arguments to be passed to the function.
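
As a minimal sketch of how axis and numeric_only interact, the snippet below sums a small frame that mixes text and numbers. The frame and its column names (item, qty, price) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical frame mixing a text column with two numeric columns.
df = pd.DataFrame({
    "item": ["pen", "ink"],
    "qty": [2, 3],
    "price": [1.5, 4.0],
})

# numeric_only=True drops the text column before summing.
column_totals = df.sum(axis=0, numeric_only=True)  # one value per column
row_totals = df.sum(axis=1, numeric_only=True)     # one value per row

print(column_totals)
print(row_totals)
```

Passing numeric_only=True explicitly keeps the result the same regardless of how a given pandas version treats non-numeric columns by default.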

Return:

It returns the sum in the form of a Series, or a DataFrame if a level is specified.

Examples

import pandas as pd
import numpy as np

Students_name=['John','Alice','Jack','Tom','Monica']
Physics_marks=[44,np.nan,47,28,39]      # np.nan marks a missing value (the np.NaN alias was removed in NumPy 2.0)
Chemistry_marks=[45,46,np.nan,40,30]
Maths_marks=[35,38,29,30,np.nan]

Students_marks=pd.DataFrame({'Name':Students_name, 'Physics':Physics_marks,
                             'Chemistry':Chemistry_marks,
                             'Maths':Maths_marks})

Students_marks

Output:

	Name	Physics	Chemistry	Maths
0	John	44.0	45.0	35.0
1	Alice	NaN	46.0	38.0
2	Jack	47.0	NaN	29.0
3	Tom	28.0	40.0	30.0
4	Monica	39.0	30.0	NaN

1. When axis = 0, i.e. down the rows, one sum per column (excluding null values)

Students_marks.sum(axis=0)

Output

Name         JohnAliceJackTomMonica
Physics                       158.0
Chemistry                     161.0
Maths                         132.0
dtype: object

2. When axis = 1, i.e. across the columns, one sum per row (excluding null values)

Students_marks.sum(axis=1, numeric_only=True)  # numeric_only=True leaves out the text column 'Name'

Output

0    124.0
1     84.0
2     76.0
3     98.0
4     69.0
dtype: float64

3. Including null values

Students_marks.sum(axis=1, skipna=False, numeric_only=True)  # without skipping null values

Output

0    124.0
1      NaN
2      NaN
3     98.0
4      NaN
dtype: float64

4. Using min_count

Students_marks.sum(axis=1, min_count=2, numeric_only=True)  # each row needs at least 2 valid values

Output

0    124.0
1     84.0
2     76.0
3     98.0
4     69.0
dtype: float64
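
Every row above has at least two valid marks, so min_count=2 leaves the result unchanged. The sketch below raises the threshold to 3 to make the effect visible. It rebuilds only the numeric columns of the same data under the assumed name marks.

```python
import numpy as np
import pandas as pd

# Same marks as in the example above, numeric columns only.
marks = pd.DataFrame({
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
})

# Require all three marks to be present; otherwise the row total is NaN.
strict_totals = marks.sum(axis=1, min_count=3)
print(strict_totals)
```

Rows 1, 2 and 4 each have only two valid marks, so their totals become NaN, while rows 0 and 3 keep their sums of 124.0 and 98.0.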

5. Using Specific level in Multi-Index DataFrame

Students_name=['John','Alice','Jack','Tom','Monica']
Roll_No=[10,11,15,17,25]

Physics_marks=[44,np.nan,47,28,39]
Chemistry_marks=[45,46,np.nan,40,30]
Maths_marks=[35,38,29,30,np.nan]

Students_marks=pd.DataFrame({'Name':Students_name, 'Roll No':Roll_No ,'Physics':Physics_marks,
                             'Chemistry':Chemistry_marks,
                             'Maths':Maths_marks})

Students_marks.set_index(['Name','Roll No'], inplace=True)
print(Students_marks)

Output

		Physics	Chemistry	Maths
Name	Roll No
John	10	44.0	45.0	35.0
Alice	11	NaN	46.0	38.0
Jack	15	47.0	NaN	29.0
Tom	17	28.0	40.0	30.0
Monica	25	39.0	30.0	NaN

#sum of values for the level 'Roll No' only
#(the level argument works only in pandas versions before 2.0, where it was removed)
Students_marks.sum(level='Roll No')

Output

	Physics	Chemistry	Maths
Roll No
10	44.0	45.0	35.0
11	0.0	46.0	38.0
15	47.0	0.0	29.0
17	28.0	40.0	30.0
25	39.0	30.0	0.0
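
Since the level argument is no longer available from pandas 2.0 onward, the same per-level sum can be sketched with groupby(level=...). The snippet below rebuilds the frame under the assumed name marks. The second call uses min_count=1, an optional variation that keeps NaN instead of 0.0 for groups with no valid value.

```python
import numpy as np
import pandas as pd

# Same data as above, indexed by 'Name' and 'Roll No'.
marks = pd.DataFrame({
    "Name": ["John", "Alice", "Jack", "Tom", "Monica"],
    "Roll No": [10, 11, 15, 17, 25],
    "Physics": [44, np.nan, 47, 28, 39],
    "Chemistry": [45, 46, np.nan, 40, 30],
    "Maths": [35, 38, 29, 30, np.nan],
}).set_index(["Name", "Roll No"])

# Group by the 'Roll No' index level, then sum within each group.
by_roll = marks.groupby(level="Roll No").sum()

# min_count=1 keeps NaN for groups that have no valid value at all.
by_roll_nan = marks.groupby(level="Roll No").sum(min_count=1)

print(by_roll)
print(by_roll_nan)
```

The first result matches the output shown above, including the 0.0 entries, which appear because the sum of an all-NaN group defaults to zero.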