
Once you have your values in the DataFrame, you can perform a wide variety of operations. For example, you can calculate statistics using Pandas.

For instance, let’s say that you want to find the maximum price among all the Cars within the DataFrame.

Obviously, you can derive this value just by looking at the dataset, but the method presented below would work for much larger datasets.

To get the maximum price for our Cars example, you’ll need to add the following portion to the Python code (and then print the results):

max1 = df['Price'].max()

Here is the complete Python code:


import pandas as pd

cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
        'Price': [22000,25000,27000,35000]
        }

df = pd.DataFrame(cars, columns = ['Brand', 'Price'])

max1 = df['Price'].max()
print(max1)
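If you also need to know which car carries that price, one possible approach is to combine `idxmax()` with `.loc`. The sketch below rebuilds the same Cars DataFrame; the names `max_row_label` and `priciest` are illustrative and not part of the original example.

```python
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# Row label (here the default integer index) of the highest price
max_row_label = df['Price'].idxmax()

# Look up the full row for that label
priciest = df.loc[max_row_label]
print(priciest['Brand'], priciest['Price'])
```

Because the DataFrame uses the default integer index, the label returned by `idxmax()` is also the row position.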

In the real world, a Pandas DataFrame is usually created by loading a dataset from persistent storage, such as an Excel file, a CSV file, or a MySQL database.

However, to keep the examples easy to follow, I’ll be using Python data structures (a dictionary and lists) here.
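For comparison, loading from storage might look like the sketch below. It assumes a CSV source and uses an in-memory string as a stand-in for a real file, so the contents of `csv_text` and the name `cars_df` are hypothetical.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk (hypothetical contents for illustration)
csv_text = """Brand,Price
Honda Civic,22000
Toyota Corolla,25000
Ford Focus,27000
Audi A4,35000
"""

# pd.read_csv accepts a file path or any file-like object
cars_df = pd.read_csv(io.StringIO(csv_text))
print(cars_df['Price'].max())
```

With a real file, you would pass its path to `pd.read_csv` instead of the `StringIO` object.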

As depicted in the Excel sheet above, if we treat the column names as “keys” and the list of items under each column as “values”, we can easily use a Python dictionary to represent the same data:

my_dict = { 
     'name' : ["a", "b", "c", "d", "e","f", "g"],
     'age' : [20,27, 35, 55, 18, 21, 35],
     'designation': ["VP", "CEO", "CFO", "VP", "VP", "CEO", "MD"]
}

We can create a Pandas DataFrame out of this dictionary as follows:

import pandas as pd

df = pd.DataFrame(my_dict)

The resulting DataFrame should look similar to what we’ve seen in the Excel sheet above:

[Image: result of df = pd.DataFrame(my_dict)]

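In older versions of Python (before 3.7), dictionaries did not guarantee insertion order, so the columns might not appear in the sequence defined in the dictionary. From Python 3.7 onward, dictionaries preserve insertion order, and recent Pandas versions keep that order for the columns.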
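If the column order matters, one option is to state it explicitly with the `columns` argument of `pd.DataFrame`. The sketch below uses a shortened version of the dictionary above; the chosen order and the names `default_order` and `explicit_df` are only for illustration.

```python
import pandas as pd

my_dict = {
    'name': ["a", "b", "c"],
    'age': [20, 27, 35],
    'designation': ["VP", "CEO", "CFO"],
}

# Without `columns`, the order follows the dictionary (on Python 3.7+)
default_order = list(pd.DataFrame(my_dict).columns)

# Passing `columns` fixes the column order explicitly
explicit_df = pd.DataFrame(my_dict, columns=['designation', 'name', 'age'])
print(default_order)
print(list(explicit_df.columns))
```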

Find maximum values & position in columns and rows of a DataFrame in Pandas

In this article, we are going to discuss how to find the maximum value and its index position in the columns and rows of a DataFrame.

DataFrame.max()

The Pandas DataFrame.max() method finds the maximum of the values in the object and returns it. If the input is a Series, the method returns a scalar, the maximum of the values in the Series. If the input is a DataFrame, the method returns a Series holding the maximum over the specified axis. The default is axis=0 (the index axis), which gives the maximum of each column.

import numpy as np
import pandas as pd

# List of tuples; np.nan marks the missing values
matrix = [(10, 56, 17),
          (np.nan, 23, 11),
          (49, 36, 55),
          (75, np.nan, 34),
          (89, 21, 44)
          ]

# Create a DataFrame with row labels a-e and column labels x-z
abc = pd.DataFrame(matrix, index = list('abcde'), columns = list('xyz'))

# Display the DataFrame
print(abc)


How to find Maximum values of every column?

To find the maximum value of each column, call the max() method on the DataFrame object without any arguments.

# find the maximum of each column

maxValues = abc.max()
print(maxValues)

Output: [image: maximum values of each column in the DataFrame]

We can see that it returned a series of maximum values where the index is column name and values are the maxima from each column.
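As a quick check of the behaviour described above, the following sketch rebuilds the example DataFrame and inspects the result of `max()`. It is self-contained, and the name `max_values` is illustrative.

```python
import numpy as np
import pandas as pd

matrix = [(10, 56, 17),
          (np.nan, 23, 11),
          (49, 36, 55),
          (75, np.nan, 34),
          (89, 21, 44)]
abc = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# Default axis=0: one maximum per column, returned as a Series
max_values = abc.max()

print(type(max_values).__name__)   # the result is a Series
print(list(max_values.index))      # its index holds the column names
print(max_values['x'], max_values['y'], max_values['z'])
```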

How to find maximum values of every row?

To find the maximum value of each row, call the max() method on the DataFrame object with the argument axis = 1.

# find the maximum values of each row

maxValues = abc.max(axis = 1)

print(maxValues)

Output: [image: maximum values of each row in the DataFrame]

We can see that it returned a series of maximum values where the index is row name and values are the maxima from each row. We can see that in the above examples NaN values are skipped while finding the maximum values in any axis. We can include NaN values as well if we want.
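To see the NaN skipping in action, here is a small self-contained sketch using the same data. Rows 'b' and 'd' each contain one NaN, yet both still yield a numeric maximum. The name `row_max` is illustrative.

```python
import numpy as np
import pandas as pd

matrix = [(10, 56, 17),
          (np.nan, 23, 11),
          (49, 36, 55),
          (75, np.nan, 34),
          (89, 21, 44)]
abc = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# axis=1: one maximum per row; NaN entries are ignored by default
row_max = abc.max(axis=1)

print(row_max['b'])  # row 'b' is (NaN, 23, 11)
print(row_max['d'])  # row 'd' is (75, NaN, 34)
```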

How to find maximum values of every column without skipping NaN?

# find the maximum value of each column without skipping NaN

maxValues = abc.max(skipna = False)

print(maxValues)


Setting skipna=False includes NaN values in the calculation. If a column contains any NaN value, the result for that column is NaN.
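The effect of `skipna` can be compared side by side. In this self-contained sketch, columns 'x' and 'y' each contain a NaN while 'z' does not; the names `with_skip` and `without_skip` are illustrative.

```python
import numpy as np
import pandas as pd

matrix = [(10, 56, 17),
          (np.nan, 23, 11),
          (49, 36, 55),
          (75, np.nan, 34),
          (89, 21, 44)]
abc = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

with_skip = abc.max()                  # default skipna=True
without_skip = abc.max(skipna=False)   # NaN propagates into the result

# Columns that contain a NaN report NaN; 'z' is unaffected
print(pd.isna(without_skip['x']), pd.isna(without_skip['y']))
print(without_skip['z'])
```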

How to find maximum values of a single column or selected columns?

To get the maximum value of a single column, see the following example:

# find maximum value of a
# single column 'x'

maxClm = abc['x'].max()
print("Maximum value in column 'x': ")
print(maxClm)

Output: [image: maximum value in column]

There is another way to find the maximum value of a column:

# find maximum value of a
# single column 'x'

maxClm = abc.max()['x']

The result will be the same as above.

Output: [image: maximum value in column]

A list of columns can also be passed instead of a single column to find the maximum values of the specified columns:

# find maximum values of a list of columns

maxValues = abc[['x', 'z']].max()
print("Maximum value in column 'x' & 'z': ")
print(maxValues)
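The three selection styles can be compared in one self-contained sketch. The names `single`, `via_frame`, and `selected` are illustrative. One design note, offered as an observation rather than a rule: `abc.max()['x']` computes the maximum of every column before picking one, so selecting the column first does less work on wide DataFrames.

```python
import numpy as np
import pandas as pd

matrix = [(10, 56, 17),
          (np.nan, 23, 11),
          (49, 36, 55),
          (75, np.nan, 34),
          (89, 21, 44)]
abc = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

single = abc['x'].max()            # select the column, then reduce: a scalar
via_frame = abc.max()['x']         # reduce every column, then pick 'x'
selected = abc[['x', 'z']].max()   # Series indexed by the chosen columns

print(single == via_frame)
print(list(selected.index))
```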


How to get position of maximum values of every column?

DataFrame.idxmax(): The Pandas DataFrame.idxmax() method returns the index label of the first occurrence of the maximum over the requested axis. By default, all NA/null values are excluded when searching for the maximum.

Syntax: DataFrame.idxmax(axis=0, skipna=True)

Parameters :
axis : 0 or ‘index’ returns, for each column, the row label of its maximum; 1 or ‘columns’ returns, for each row, the column label of its maximum
skipna : Exclude NA/null values. If an entire row/column is NA, the result will be NA

Returns : idxmax : Series

Let’s take some examples to understand how to use it :

How to get row index label of Maximum value in every column

# find the row index label of the maximum
# value in every column

maxValueIndex = abc.idxmax()

print("Maximum values of columns are at row index position :")

print(maxValueIndex)


It returns a Series with the column names as its index; each value is the row index label where the maximum value occurs in that column.
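Since the goal is both the maximum and its position, one possible way to view them together is to place the results of `max()` and `idxmax()` side by side. This is a sketch under that assumption; the `summary` table and its column names `max` and `row` are made up for illustration.

```python
import numpy as np
import pandas as pd

matrix = [(10, 56, 17),
          (np.nan, 23, 11),
          (49, 36, 55),
          (75, np.nan, 34),
          (89, 21, 44)]
abc = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# Row label of the maximum in each column
max_rows = abc.idxmax()

# Both Series share the column names as index, so they align row by row
summary = pd.DataFrame({'max': abc.max(), 'row': max_rows})
print(summary)
```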

How to find Column names of Maximum value in every row?

# find the column name of the maximum
# value in every row

maxValueIndex = abc.idxmax(axis = 1)

print("Max values of row are at following columns :")

print(maxValueIndex)


It returns a Series with the row index labels as its index; each value is the name of the column where the maximum value occurs in that row.
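The row-wise case can be checked with the same example data. In this self-contained sketch, rows 'b' and 'd' contain a NaN, which is skipped by default; the name `max_cols` is illustrative.

```python
import numpy as np
import pandas as pd

matrix = [(10, 56, 17),
          (np.nan, 23, 11),
          (49, 36, 55),
          (75, np.nan, 34),
          (89, 21, 44)]
abc = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# axis=1: for each row, the name of the column holding its maximum
max_cols = abc.idxmax(axis=1)

for row_label, col_name in max_cols.items():
    # Look up the value at that position to confirm it is the row maximum
    print(row_label, col_name, abc.loc[row_label, col_name])
```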

Submitted by shiksha.dahiya on February 11, 2021

Shiksha is working as a Data Scientist at iVagus. She has expertise in Data Science and Machine Learning.
