**pandas** is a fast, powerful, flexible and easy-to-use data analysis library built on top of NumPy, and it provides features that NumPy lacks. The name pandas stands for *panel data*, a reference to the tabular format. pandas adopts significant parts of NumPy’s idiomatic style of array-based computing; the biggest difference is that pandas is designed for working with tabular, heterogeneous data. NumPy, by contrast, is best suited for working with homogeneous numerical array data.

The key to learning pandas is to understand its data structures. A **data structure** is a collection of data values that also defines the relationships between those values and the operations that can be performed on them.

The most widely used pandas data structures are the **Series** and the **DataFrame**. Simply put, a *Series* is similar to a single column of data, while a *DataFrame* is similar to a sheet with rows and columns. Likewise, a *Panel* can hold many DataFrames.

There are three main data structures in pandas:

- **Series**: 1D
- **DataFrame**: 2D
- **Panel**: 3D

## Pandas Series

Think of a Series as a single column in an Excel sheet. You can also think of it as a 1D NumPy array; the only thing that differentiates it from a 1D NumPy array is that it can have index names. A Series is composed of two arrays associated with each other. The main array (the array of values) holds **one-dimensional data**, and each element is associated with a label contained in the other array (the array of labels), called the **index**. If you want to see the two arrays that make up the series individually, you can access the **index** and **values** attributes of the series. Because a series is one-dimensional, it has a single axis (dimension), the index, and the values of the index (0, 1, 2, 3) are called axis labels.

The basic syntax to create a pandas Series is as follows:

`newSeries = pd.Series(data, index)`

A series consists of two components.

- **One-dimensional data (Values)**
- **Index**
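As a small sketch of these two components, the snippet below builds a Series and inspects its `values` and `index` attributes. The data and the labels are illustrative assumptions.

```python
import pandas as pd

# Illustrative data and labels (assumptions chosen for this sketch)
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

print(s.values)  # the array of values
print(s.index)   # the array of labels (the index)
```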

### Introduction

To create a series, call the **Series()** class constructor and pass it an argument containing the data to be included. Here, **data** can be one of the following:

- **A one-dimensional ndarray**
- **A Python list**
- **A Python dictionary**
- **A scalar value**

If an index is not specified, the default index **[0,… n-1]** will be created, where **n** is the length of the data. A series can be created from a variety of sources as shown in the following subsections.

### Using a one-dimensional ndarray

The following example creates a Series of the first five odd numbers.
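A minimal sketch of such an example, assuming the array is called `an_array` and is built with `np.arange`:

```python
import numpy as np
import pandas as pd

# First five odd numbers as a one-dimensional ndarray
an_array = np.arange(1, 10, 2)  # array([1, 3, 5, 7, 9])

# No index is given, so pandas assigns the default labels 0..4
odd_series = pd.Series(an_array)
print(odd_series)
```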

If you do not specify an index when defining the series, pandas assigns integer labels increasing from 0 by default. In this case, the labels correspond to the positions of the elements in the series object. If you want to create the series with meaningful labels, specify the **index** parameter when creating it. The labels are passed as a list of the same length as **an_array**.
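As a hedged illustration of the **index** parameter, the sketch below attaches labels to the same kind of array. The label names are hypothetical.

```python
import numpy as np
import pandas as pd

an_array = np.arange(1, 10, 2)

# Hypothetical labels; the list must have the same length as an_array
labels = ["first", "second", "third", "fourth", "fifth"]
labelled = pd.Series(an_array, index=labels)

print(labelled["third"])  # label-based access to a single element
```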

### Using a Python list

To create a series using a Python list, you can just pass a list to the **data** parameter of the **Series()** class constructor.
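For instance, a minimal sketch with an illustrative list of values:

```python
import pandas as pd

# Illustrative list of values (an assumption for this sketch)
temperatures = [21.5, 23.0, 19.8]

# The list is passed as the data argument; the index defaults to 0..2
series_from_list = pd.Series(temperatures)
print(series_from_list)
```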

### Using a scalar value
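The scalar case can be sketched as follows, once without and once with an index. The label names are illustrative assumptions.

```python
import pandas as pd

# Without an index: a single element labelled 0
single = pd.Series(5)

# With an index: the scalar is repeated once per label
repeated = pd.Series(5, index=["a", "b", "c"])

print(single)
print(repeated)
```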

We can also create a Series from a scalar value. If you do not specify the **index** argument, the default index is 0. If you specify the *index*, the value will be repeated for each of the specified index values.

### Using a Python dictionary

To create a series using a Python dictionary, you can just pass a dictionary to the **data** parameter of the **Series()** class constructor. This time, the arrays of the index and values are filled with the corresponding keys and values of the dictionary.
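A short sketch of the dictionary case, using made-up keys and values:

```python
import pandas as pd

# Illustrative dictionary: keys become the index, values become the data
prices = {"apple": 3, "banana": 1, "cherry": 7}
series_from_dict = pd.Series(prices)

print(series_from_dict.index)   # labels taken from the dictionary keys
print(series_from_dict.values)  # data taken from the dictionary values
```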

## Pandas DataFrame

The DataFrame is the most commonly used and most important data structure in pandas. Think of a DataFrame as an Excel sheet.

A **DataFrame** is a two-dimensional data structure composed of rows and columns, exactly like a simple spreadsheet or a SQL table.

*Each column of a DataFrame is a pandas Series*. These columns should be of the same length, but they can be of different data types: float, int, bool, and so on. DataFrames are both *value-mutable* and *size-mutable*. A *Series*, by contrast, is only value-mutable, not size-mutable: its length cannot be changed, although its values can. This lets us perform operations that alter the values held within the DataFrame or add and delete columns.

A DataFrame consists of three components.

- **Two-dimensional data (Values)**
- **Row index**
- **Column index**
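The value-mutable and size-mutable behaviour described above can be sketched as follows. This is a minimal illustration using a few of the names and values from this article's sample data.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ahmad", "Ali"], "Age": [20, 21]})

# Value mutation: change a single cell in place
df.loc[0, "Age"] = 22

# Size mutation: add a column, then delete another one
df["Height"] = [5.1, 5.6]
del df["Name"]

print(df.columns.tolist())  # ['Age', 'Height']
```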

The main ways to create a DataFrame are:

- Reading a CSV/Excel File
- Python Dictionary
- ndarray

```
import pandas as pd

# Create a dictionary of lists; each key becomes a column name
data = {"Name": ["Ahmad", "Ali", "Ismail", "John"],
        "Age": [20, 21, 19, 17],
        "Height": [5.1, 5.6, 6.1, 5.7]}

# Convert the dictionary into a DataFrame
df1 = pd.DataFrame(data)
df1
```

## DataFrame creation: Introduction

A DataFrame is the most commonly used data structure in pandas. The **DataFrame()** class constructor accepts many different types of arguments:

- **A two-dimensional ndarray**
- **A dictionary of dictionaries**
- **A dictionary of lists**
- **A dictionary of series**

Row index labels and column labels can be specified along with the data. If they’re not specified, pandas generates them from the input data: dictionary keys and Series indexes are reused as labels, and a default integer range is used otherwise. A DataFrame can be created from a variety of sources as discussed in the following subsections.

### DataFrame creation: Using a two-dimensional ndarray
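A minimal sketch of this case, assuming a small 3x2 array and hypothetical row and column labels:

```python
import numpy as np
import pandas as pd

# Illustrative 3x2 array; the row and column labels are assumptions
array_2d = np.array([[1, 2], [3, 4], [5, 6]])
df = pd.DataFrame(array_2d, index=["r1", "r2", "r3"], columns=["x", "y"])

print(df.values)   # the underlying two-dimensional data
print(df.index)    # the row labels
print(df.columns)  # the column labels
```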

If you want to see the individual components that make up the DataFrame, you can access the **values**, *index* and *columns* attributes of the DataFrame.

### DataFrame creation: Using a dictionary of dictionaries

Column names are created from the keys of the main dictionary, and the row index is created from the keys of the sub dictionaries.
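The mapping from keys to labels can be sketched like this. The subject names and scores are invented for illustration.

```python
import pandas as pd

# Outer keys become column names, inner keys become the row index
scores = {
    "math": {"Ahmad": 90, "Ali": 75},
    "physics": {"Ahmad": 82, "Ali": 88},
}
df = pd.DataFrame(scores)

print(df.columns.tolist())  # column names from the outer keys
print(df.index.tolist())    # row labels from the inner keys
```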

### DataFrame creation: Using a dictionary of lists
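A short sketch of this case: each key supplies a column name and each list supplies that column's values. The row labels passed through `index` are optional and hypothetical.

```python
import pandas as pd

# Each key is a column name; each list holds that column's values
data = {"Name": ["Ahmad", "Ali", "Ismail"], "Age": [20, 21, 19]}

# The row labels are optional and purely illustrative here
df = pd.DataFrame(data, index=["s1", "s2", "s3"])
print(df)
```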

If you want to see the individual components that make up the DataFrame, you can access the **values**, *index* and *columns* attributes of the DataFrame.

### DataFrame creation: Using a dictionary of series
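As a hedged sketch of this case, the snippet below combines two Series whose labels only partly overlap. The values are illustrative. The row index of the result contains every label found in either Series, and positions with no data are filled with NaN.

```python
import pandas as pd

# Two Series with partly different labels (illustrative values)
age = pd.Series([20, 21], index=["Ahmad", "Ali"])
height = pd.Series([5.6, 6.1], index=["Ali", "Ismail"])

# Dictionary keys become column names; rows are aligned on the labels
df = pd.DataFrame({"Age": age, "Height": height})
print(df)
```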

## The Pandas Panel

A Panel is a 3D array. It is not as widely used as Series or DataFrames, and because of its 3D nature it is not as easily displayed on screen or visualized as the other two. It is generally used for 3D time-series data. The names of the three axes are as follows:

- **items:** This is axis 0. Each item corresponds to a DataFrame structure.
- **major_axis:** This is axis 1. Each item corresponds to the rows of the DataFrame structure.
- **minor_axis:** This is axis 2. Each item corresponds to the columns of each DataFrame structure.

As with Series and DataFrames, there are different ways to create Panel objects.

### Panel creation: Using a 3D NumPy array

**Panel** was deprecated in pandas 0.20 and removed in pandas 0.25, so it is not available in current versions. The recommended way to represent this kind of 3-dimensional data is to use multi-indexing in DataFrames instead of Panels. In the older versions that still included Panel, a multi-indexed DataFrame could be converted directly to a Panel via the **DataFrame.to_panel()** method.
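A minimal sketch of the multi-indexing approach, assuming a small 3D NumPy array and hypothetical axis labels: the first two axes are folded into a two-level row index, so each item can still be selected as an ordinary 2D DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical 3D data: 2 items x 3 rows (major axis) x 4 columns (minor axis)
data = np.arange(24).reshape(2, 3, 4)
items = ["item1", "item2"]
major = ["r1", "r2", "r3"]
minor = ["a", "b", "c", "d"]

# Fold the item and major axes into a two-level row index
index = pd.MultiIndex.from_product([items, major], names=["item", "major"])
df = pd.DataFrame(data.reshape(-1, 4), index=index, columns=minor)

# Selecting one item returns a regular 2D DataFrame
item1 = df.loc["item1"]
print(item1)
```

Selecting with a tuple, for example `df.loc[("item2", "r1"), "a"]`, reaches a single cell of the original 3D array.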