
There is a wide variety of methods for converting lists into Pandas DataFrames. In this tutorial we will cover a couple of popular methods; which one fits best depends on how your data is organized. Most methods involve first converting the list data into a dictionary; however, a direct conversion from a list is possible and can be a more straightforward solution.

Method 1: List Directly to DataFrame

The simplest method is to store the data as a list of lists, which the pandas.DataFrame object constructor can directly convert into a DataFrame.

import pandas as pd # Don't forget to import!
listOne = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
df = pd.DataFrame(listOne)  # The constructor accepts a list of lists directly
print(df)
>    0  1  2
> 0  1  2  3
> 1  4  5  6
> 2  7  8  9

The primary issue with this call is that the DataFrame only gets default integer labels for its columns and rows. If we want meaningful labels, we can assign them to the columns and index attributes separately as lists:

listOne = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
colNames = ["Column A", "Column B", "Column C"]
rowNames = ["Row 1", "Row 2", "Row 3"]
df = pd.DataFrame(listOne)
df.columns = colNames
df.index = rowNames
print(df)
>        Column A  Column B  Column C
> Row 1         1         2         3
> Row 2         4         5         6
> Row 3         7         8         9

You can see how adding the column and row labels helps us organize our DataFrames for our data science projects.
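
The labels can also be supplied at the moment the DataFrame is created. The following is a minimal sketch, assuming the same listOne, colNames and rowNames as above; columns and index are keyword arguments of the pandas.DataFrame constructor:

```python
import pandas as pd

listOne = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
colNames = ["Column A", "Column B", "Column C"]
rowNames = ["Row 1", "Row 2", "Row 3"]

# Pass the labels to the constructor instead of assigning them afterwards
df = pd.DataFrame(listOne, columns=colNames, index=rowNames)
print(df)
```

Both approaches should give the same labelled DataFrame; this one just saves the two assignment lines.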

One drawback of this method shows up when the column names and row indices are already included in the list of lists. In this case, the pandas.DataFrame object constructor will naively store them as entries within the DataFrame:

listTwo = [["Rows", "Column A", "Column B", "Column C"],
           ["Row 1", 1, 2, 3],
           ["Row 2", 4, 5, 6],
           ["Row 3", 7, 8, 9]]
df = pd.DataFrame(listTwo)
print(df)
>        0         1         2         3
> 0   Rows  Column A  Column B  Column C
> 1  Row 1         1         2         3
> 2  Row 2         4         5         6
> 3  Row 3         7         8         9

As you can see, the row indices and column names were included in the data. Because every column now mixes strings and integers, pandas stores each one with the generic object dtype rather than as an integer column. This isn’t what you want. A simple way around this is to clean up the list by moving the column names and row indices into separate lists before the DataFrame conversion, then assigning them afterwards. This can be done using a script like the following:

listTwo = [["Rows", "Column A", "Column B", "Column C"],
           ["Row 1", 1, 2, 3],
           ["Row 2", 4, 5, 6],
           ["Row 3", 7, 8, 9]]
colNames = listTwo[0][1:]  # Skip the first entry, as the index doesn't need a name
del listTwo[0]  # Remove the column names from the data
rowNames = []  # Initialize the row names list for looping
for row in listTwo:
    rowNames.append(row[0])  # Collect the row indices
    del row[0]  # Delete row indices from data list
df = pd.DataFrame(listTwo)
df.columns = colNames  # Add back in the column names
df.index = rowNames  # Add back in the index names
print(df)
>        Column A  Column B  Column C
> Row 1         1         2         3
> Row 2         4         5         6
> Row 3         7         8         9

You can take this Python script and adapt it to your own project. It’s a powerful way of cleaning up your lists so you can use them in Pandas DataFrames.
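
As an alternative sketch that leaves the original list untouched, the header row can be unpacked and the label column promoted to the index with set_index. The names header and rows are introduced here only for illustration, and the code assumes the first sublist is the header and the first entry of every other sublist is the row label:

```python
import pandas as pd

listTwo = [["Rows", "Column A", "Column B", "Column C"],
           ["Row 1", 1, 2, 3],
           ["Row 2", 4, 5, 6],
           ["Row 3", 7, 8, 9]]

# Split the header from the data rows without modifying listTwo
header, *rows = listTwo

# Build the DataFrame with the header as column names,
# then move the "Rows" column into the index
df = pd.DataFrame(rows, columns=header).set_index("Rows")
df.index.name = None  # Drop the "Rows" label from the index

print(df)
print(df.dtypes)
```

Because the labels never end up in the same column as the numbers, the three data columns should come out with an integer dtype.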

Method 2: List to Dictionary to DataFrame Conversion

We can start by creating a list of column names and a list of row names. The code presented above, which splits a list of lists into separate lists of row indices, column names, and data, can be used for this. We’ll assume that these three lists have already been created in the following examples.

One way to join lists into dictionaries is to use the Python zip function to pair each name with its sublist of data. The resulting pairs (tuples) are passed to dict to produce a dictionary, which can in turn be converted into a DataFrame.

listOne = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
colNames = ["Column A", "Column B", "Column C"]
rowNames = ["Row 1", "Row 2", "Row 3"]
dictOne = dict(zip(colNames, listOne))
print(dictOne)
> {'Column A': [1, 2, 3], 'Column B': [4, 5, 6], 'Column C': [7, 8, 9]}

Notice that the above method assumes that the sublists are columns rather than rows. If each sublist is a row, then you’ll need to use the row indices as keys instead of the column names, like this:

listOne = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
colNames = ["Column A", "Column B", "Column C"]
rowNames = ["Row 1", "Row 2", "Row 3"]
dictTwo = dict(zip(rowNames, listOne))
print(dictTwo)
> {'Row 1': [1, 2, 3], 'Row 2': [4, 5, 6], 'Row 3': [7, 8, 9]}

Now we can create the DataFrame using the conversion from dictionary to DataFrame:

# Assuming we have a dictionary of rows in dictTwo
df = pd.DataFrame.from_dict(dictTwo, orient='index')
df.columns = colNames
print(df)
>        Column A  Column B  Column C
> Row 1         1         2         3
> Row 2         4         5         6
> Row 3         7         8         9
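
For the other case, where the sublists are columns, the dictionary keyed by column names can be passed straight to the constructor. This is a minimal sketch, assuming the same listOne, colNames and rowNames as in the examples above, with each sublist treated as one column:

```python
import pandas as pd

listOne = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
colNames = ["Column A", "Column B", "Column C"]
rowNames = ["Row 1", "Row 2", "Row 3"]

# Each key becomes a column name and each sublist becomes that column's values
dictOne = dict(zip(colNames, listOne))

# index supplies the row labels; its length must match the column length
df = pd.DataFrame(dictOne, index=rowNames)
print(df)
```

Note that the values now run down the columns, so "Column A" holds 1, 2, 3 instead of 1, 4, 7.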

At times, you may need to convert a simple list to a DataFrame in Python.

You can use this template to convert your list to a pandas DataFrame:

from pandas import DataFrame
your_list = ['item1', 'item2', 'item3',...]
df = DataFrame(your_list, columns=['Column_Name'])

Examples of Converting a List to DataFrame in Python

Example 1: Convert a List

Let’s say that you have the following list that contains the names of 5 people:

People_List = ['Jon','Mark','Maria','Jill','Jack']

You can then apply the following syntax in order to convert the list of names to a pandas DataFrame:

from pandas import DataFrame

People_List = ['Jon','Mark','Maria','Jill','Jack']

df = DataFrame(People_List, columns=['First_Name'])
print(df)

This is the DataFrame that you’ll get:

[Image: Convert a List to Dataframe in Python]

Example 2: Convert List of Lists

How would you then convert a list of lists to a DataFrame?

For instance, let’s say that you have the following list of lists:

People_List = [['Jon','Smith',21],['Mark','Brown',38],['Maria','Lee',42],['Jill','Jones',28],['Jack','Ford',55]]

You can then run the code below to perform the conversion to the DataFrame:

from pandas import DataFrame

People_List = [['Jon','Smith',21],['Mark','Brown',38],['Maria','Lee',42],['Jill','Jones',28],['Jack','Ford',55]]

df = DataFrame(People_List, columns=['First_Name','Last_Name','Age'])
print(df)

And this is the result that you’ll get:

[Image: Convert List of lists to Dataframe]

Alternatively, you could have your list of lists as follows:

People_List = [['Jon','Mark','Maria','Jill','Jack'],
               ['Smith','Brown','Lee','Jones','Ford'],[21,38,42,28,55]]

So the Python code to perform the conversion to the DataFrame would look like this:

from pandas import DataFrame

People_List = [['Jon','Mark','Maria','Jill','Jack'],
               ['Smith','Brown','Lee','Jones','Ford'],[21,38,42,28,55]]

df = DataFrame(People_List).transpose()
df.columns = ['First_Name','Last_Name','Age']
print(df)

Run the code, and you’ll get the same DataFrame:

[Image: Convert List of lists to Dataframe]
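
One thing to be aware of with transpose() is that every column of the result ends up with the generic object dtype, because each original row mixed names and numbers. As a sketch of one alternative (not the only one), the sublists can be regrouped into rows with Python's zip before the DataFrame is built. The variable name rows is introduced here only for illustration:

```python
from pandas import DataFrame
from pandas.api.types import is_integer_dtype

People_List = [['Jon', 'Mark', 'Maria', 'Jill', 'Jack'],
               ['Smith', 'Brown', 'Lee', 'Jones', 'Ford'],
               [21, 38, 42, 28, 55]]

# zip(*...) regroups the three column lists into five (first, last, age) tuples
rows = list(zip(*People_List))

df = DataFrame(rows, columns=['First_Name', 'Last_Name', 'Age'])
print(df)

# Age is inferred as an integer column because it only ever held integers
print(is_integer_dtype(df['Age']))
```

The printed DataFrame should look the same as before; the difference is only in the column dtypes.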

Check the Object Type

If needed, you can also check the type of the objects (e.g., List vs. DataFrame) by applying this code:

from pandas import DataFrame

People_List = [['Jon','Mark','Maria','Jill','Jack'],
               ['Smith','Brown','Lee','Jones','Ford'],[21,38,42,28,55]]

df = DataFrame(People_List).transpose()
df.columns = ['First_Name','Last_Name','Age']

print('People_List: ' + str(type(People_List)))
print('df: ' + str(type(df)))

And here is the result:

[Image: Type of DataFrame]
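
When the type needs to be tested inside a program instead of just printed, Python's built-in isinstance is a common choice. A minimal sketch, assuming the same People_List and df as above:

```python
from pandas import DataFrame

People_List = [['Jon', 'Mark', 'Maria', 'Jill', 'Jack'],
               ['Smith', 'Brown', 'Lee', 'Jones', 'Ford'],
               [21, 38, 42, 28, 55]]

df = DataFrame(People_List).transpose()
df.columns = ['First_Name', 'Last_Name', 'Age']

# isinstance returns a boolean, which is convenient in if-statements
list_is_list = isinstance(People_List, list)
df_is_frame = isinstance(df, DataFrame)

print(list_is_list)
print(df_is_frame)
```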

Applying Stats Using Pandas

Once you have converted your list into a DataFrame, you’ll be able to perform an assortment of operations and calculations using pandas.

For instance, you can use pandas to derive some statistics about your data.

In the context of our example, you can apply the code below in order to get the mean, max and min age using pandas:

from pandas import DataFrame

People_List = [['Jon','Mark','Maria','Jill','Jack'],
               ['Smith','Brown','Lee','Jones','Ford'],[21,38,42,28,55]]

df = DataFrame(People_List).transpose()
df.columns = ['First_Name','Last_Name','Age']

mean1 = df['Age'].mean()
max1 = df['Age'].max()
min1 = df['Age'].min()

print('The mean age is: ' + str(mean1))
print('The max age is: ' + str(max1))
print('The min age is: ' + str(min1))

Run the Python code, and you’ll get these stats:

[Image: Stats pandas]
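
The three statistics can also be computed in one call. The sketch below first converts the Age column to a numeric dtype with pd.to_numeric, since the transposed DataFrame stores it as generic objects, and then uses Series.agg with a list of function names. The variable name stats is introduced here only for illustration:

```python
import pandas as pd

People_List = [['Jon', 'Mark', 'Maria', 'Jill', 'Jack'],
               ['Smith', 'Brown', 'Lee', 'Jones', 'Ford'],
               [21, 38, 42, 28, 55]]

df = pd.DataFrame(People_List).transpose()
df.columns = ['First_Name', 'Last_Name', 'Age']

# After transpose() the column holds Python objects; make it numeric first
df['Age'] = pd.to_numeric(df['Age'])

# agg with a list of names returns a Series indexed by those names
stats = df['Age'].agg(['mean', 'max', 'min'])
print(stats)
```

Individual values can then be read by label, for example stats['mean'].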

Different Representation of the DataFrame

Finally, you may apply the following code to lay out the same data horizontally, with one row per field and one column per person:

from pandas import DataFrame

People_List = [['Jon', 'Mark', 'Maria', 'Jill', 'Jack'],
               ['Smith', 'Brown', 'Lee', 'Jones', 'Ford'],[21, 38, 42, 28, 55]]

df = DataFrame(People_List, index=['First_Name','Last_Name','Age'],
               columns=['a','b','c','d','e'])
print(df)

This is the representation that you’ll get:

[Image: Horizontal view of data]
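
With this layout, a single value is addressed by its row label and column label, and the vertical layout can be recovered by transposing. A minimal sketch, assuming the same df as above; the names age_c and vertical are introduced only for illustration:

```python
from pandas import DataFrame

People_List = [['Jon', 'Mark', 'Maria', 'Jill', 'Jack'],
               ['Smith', 'Brown', 'Lee', 'Jones', 'Ford'],
               [21, 38, 42, 28, 55]]

df = DataFrame(People_List, index=['First_Name', 'Last_Name', 'Age'],
               columns=['a', 'b', 'c', 'd', 'e'])

# Look up one person's age by row label and column label
age_c = df.loc['Age', 'c']
print(age_c)

# .T swaps rows and columns, giving one row per person again
vertical = df.T
print(vertical)
```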

Submitted by shiksha.dahiya on February 16, 2021

Shiksha is working as a Data Scientist at iVagus. She has expertise in Data Science and Machine Learning.
