
Python Pandas DataFrame

A DataFrame is a two-dimensional, tabular data structure containing an ordered collection of columns. Each column can have a different data type, such as numeric, boolean or string. It has both a row index and a column index. It is mainly used for data analysis.


Create DataFrame

A DataFrame is created by calling the pandas DataFrame constructor.

import pandas as pd

df = pd.DataFrame()
print(df)

The above code prints an empty dataframe -

Empty DataFrame
Columns: []
Index: []


DataFrame Syntax

pandas.DataFrame( data, index, columns, dtype, copy)

Here, data can be an ndarray, Series, dict, list, constant or another DataFrame. index gives the row labels of the resulting frame, columns gives the column labels, dtype forces a single data type on the data, and copy controls whether the input data is copied. The following example creates a dataframe from a single list -

import pandas as pd

data = ["Rose","Tulip","Sunflower","Lilly"]
getData = pd.DataFrame(data,columns=['FLOWERS'])
print(getData)
Output of the above code-
     FLOWERS
0       Rose
1      Tulip
2  Sunflower
3      Lilly

Similarly, we can create a dataframe from a list of lists, where each inner list becomes one row -

import pandas as pd

data = [['Biscuit',6],['Chips',3],['Candy',20]]
getData = pd.DataFrame(data,columns=['Item','Quantity'],
                       index=[101,102,103])
print(getData)
Output of the above code-
        Item  Quantity
101  Biscuit         6
102    Chips         3
103    Candy        20
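
A dict is another common input. The sketch below assumes the same sample items as above; the variable names items and quantities are only illustrative. Each dict key becomes a column label, and dtype forces one type on the data.

```python
import pandas as pd

# Each key becomes a column label, each list holds that column's values.
data = {"Item": ["Biscuit", "Chips", "Candy"], "Quantity": [6, 3, 20]}
items = pd.DataFrame(data, index=[101, 102, 103])
print(items)

# dtype forces a single data type on the data; here the integers become floats.
quantities = pd.DataFrame({"Quantity": [6, 3, 20]}, dtype="float64")
print(quantities.dtypes)
```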




DataFrame Information

Pandas provides various attributes and methods to get information about a dataframe, such as the index dtype, column dtypes and memory usage.

shape

The shape attribute returns a tuple (rows, columns) giving the dimensions of the dataframe.

import pandas as pd

data = [['Biscuit',6],['Chips',3],['Candy',20]]
getData = pd.DataFrame(data,columns=['Item','Quantity'],
                       index=[101,102,103])
print(getData.shape)
Output of the above code-
(3, 2)


index

The index attribute returns the row labels of the dataframe.

import pandas as pd

data = [['Biscuit',6],['Chips',3],['Candy',20]]
getData = pd.DataFrame(data,columns=['Item','Quantity'],
                       index=[101,102,103])
print(getData.index)
Output of the above code-
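Int64Index([101, 102, 103], dtype='int64')

Note: pandas 2.0 and later print this as Index([101, 102, 103], dtype='int64'), because the Int64Index class was removed.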


columns

The columns attribute returns the column labels of the dataframe.

import pandas as pd

data = [['Biscuit',6],['Chips',3],['Candy',20]]
getData = pd.DataFrame(data,columns=['Item','Quantity'],
                       index=[101,102,103])
print(getData.columns)
Output of the above code-
Index(['Item', 'Quantity'], dtype='object')
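
Since shape is a plain tuple and index and columns are Index objects, they can be unpacked or converted to ordinary Python lists. A minimal sketch using the same sample data; the variable names are illustrative.

```python
import pandas as pd

data = [['Biscuit', 6], ['Chips', 3], ['Candy', 20]]
getData = pd.DataFrame(data, columns=['Item', 'Quantity'],
                       index=[101, 102, 103])

# shape is a (rows, columns) tuple, so it can be unpacked directly.
rows, cols = getData.shape
print(rows, cols)

# tolist() converts the row labels and column labels to Python lists.
row_labels = getData.index.tolist()
column_labels = getData.columns.tolist()
print(row_labels, column_labels)
```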




info()

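The info() method prints a summary of a dataframe, including the index dtype, column dtypes, non-null counts and memory usage. It prints the summary itself and returns None, which is why wrapping it in print() adds a final None line to the output below.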

import pandas as pd

data = [['Biscuit',6],['Chips',3],['Candy',20]]
getData = pd.DataFrame(data,columns=['Item','Quantity'],
                       index=[101,102,103])
print(getData.info())
Output of the above code-
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 101 to 103
Data columns (total 2 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   Item      3 non-null      object
 1   Quantity  3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 72.0+ bytes
None
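
When the same details are needed as values rather than printed text, the dtypes attribute and the memory_usage() method return them as Series. A short sketch on the same sample data; the exact byte counts depend on the platform and pandas version, so none are shown.

```python
import pandas as pd

data = [['Biscuit', 6], ['Chips', 3], ['Candy', 20]]
getData = pd.DataFrame(data, columns=['Item', 'Quantity'],
                       index=[101, 102, 103])

# dtypes is a Series mapping each column label to its data type.
column_types = getData.dtypes
print(column_types)

# memory_usage() returns bytes used per column; the index is included
# under the label 'Index' by default.
memory = getData.memory_usage()
print(memory)
```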


count()

The count() method counts non-NA cells for each column or row.

import pandas as pd

data = [['Biscuit',6],['Chips',3],['Candy',20]]
getData = pd.DataFrame(data,columns=['Item','Quantity'],
                       index=[101,102,103])
print(getData.count())
Output of the above code-
Item        3
Quantity    3
dtype: int64
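
The example above has no missing values, so every count equals the number of rows. The sketch below uses made-up data with two missing cells to show how count() skips them, and how axis=1 counts per row instead of per column.

```python
import pandas as pd

# Hypothetical data with two missing cells (None).
data = [['Biscuit', 6], ['Chips', None], [None, 20]]
stock = pd.DataFrame(data, columns=['Item', 'Quantity'],
                     index=[101, 102, 103])

# Default axis=0: non-NA cells in each column.
per_column = stock.count()
print(per_column)

# axis=1: non-NA cells in each row.
per_row = stock.count(axis=1)
print(per_row)
```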


DataFrame Functions

Pandas provides functions to perform mathematical operations on data -

sum()

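It returns the sum of the values over the requested axis. In the given syntax, axis is {index (0), columns (1)}, skipna excludes NA/null values, level names the index level to sum along when the axis is a MultiIndex, numeric_only restricts the operation to float, int and boolean columns, and min_count is the required number of valid values to perform the operation (with fewer valid values, the result is NA). The signature shown is from an older pandas release; the level parameter was removed in pandas 2.0. The syntax is -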

DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, min_count=0)
Example
import pandas as pd

data = [20,12,3,30,43,23]
getData = pd.DataFrame(data)
print(getData.sum())
Output of the above code-
0    131
dtype: int64
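
The parameters matter most when the data has missing values or more than one column. A minimal sketch with made-up numbers, showing axis, skipna and min_count:

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with missing values.
scores = pd.DataFrame({"a": [1.0, 2.0, np.nan],
                       "b": [10.0, np.nan, np.nan]})

# Default: sum down each column, skipping NaN.
col_totals = scores.sum()

# axis=1: sum across each row; an all-NaN row sums to 0.0 by default.
row_totals = scores.sum(axis=1)

# min_count=1: require at least one valid value, else the result is NaN.
strict = scores.sum(axis=1, min_count=1)

# skipna=False: any NaN in a column makes that column's sum NaN.
with_na = scores.sum(skipna=False)

print(col_totals, row_totals, strict, with_na, sep="\n")
```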


cumsum

The cumsum() function returns the cumulative sum of values over the requested axis of a dataframe. The result has the same shape as the input. In the given syntax, axis is {index (0), columns (1)} and skipna excludes NA/null values. The syntax is -

DataFrame.cumsum(axis=None, skipna=True)
Example
import pandas as pd

data = [20,12,3,30,43,23]
getData = pd.DataFrame(data)
print(getData.cumsum())
Output of the above code-
     0
0   20
1   32
2   35
3   65
4  108
5  131
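
With more than one column, the axis argument decides the direction of accumulation. A small sketch with made-up quarterly figures; the names sales, down and across are illustrative.

```python
import pandas as pd

sales = pd.DataFrame({"Q1": [20, 12], "Q2": [3, 30], "Q3": [43, 23]},
                     index=["north", "south"])

# Default axis=0: running total down each column.
down = sales.cumsum()
print(down)

# axis=1: running total across each row, left to right.
across = sales.cumsum(axis=1)
print(across)
```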


min() and max()

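The min() and max() functions return the minimum and the maximum of the values over the requested axis. The signatures shown are from an older pandas release; the level parameter was removed in pandas 2.0. The syntax is -

DataFrame.min(axis=None, skipna=None, level=None, numeric_only=None)
DataFrame.max(axis=None, skipna=None, level=None, numeric_only=None)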
Example
import pandas as pd

data = [20,12,3,30,43,23]
getData = pd.DataFrame(data)
print(getData.min())
print(getData.max())
Output of the above code-
0    3
dtype: int64
0    43
dtype: int64
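
By default the result has one value per column. Passing axis=1 gives one value per row instead. A minimal sketch with made-up quarterly figures:

```python
import pandas as pd

sales = pd.DataFrame({"Q1": [20, 12], "Q2": [3, 30], "Q3": [43, 23]},
                     index=["north", "south"])

# One value per column (default axis=0).
col_max = sales.max()

# One value per row (axis=1).
row_min = sales.min(axis=1)
row_max = sales.max(axis=1)

print(col_max, row_min, row_max, sep="\n")
```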


idxmin() and idxmax()

The idxmin() and idxmax() functions return the index label of the first occurrence of the minimum and the maximum value over the requested axis. The syntax is -

DataFrame.idxmin(axis=0, skipna=True)
DataFrame.idxmax(axis=0, skipna=True)
Example
import pandas as pd

data = [20,12,3,30,43,23]
getData = pd.DataFrame(data)
print(getData.idxmin())
print(getData.idxmax())
Output of the above code-
0    2
dtype: int64
0    4
dtype: int64
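
The result is an index label, not a position. With the default integer index the two coincide, so the sketch below assumes a string index to make the difference visible; the labels and the column name value are made up.

```python
import pandas as pd

data = [20, 12, 3, 30, 43, 23]
labels = ["a", "b", "c", "d", "e", "f"]
getData = pd.DataFrame(data, index=labels, columns=["value"])

# Labels of the rows holding the smallest and largest value.
min_label = getData.idxmin()["value"]
max_label = getData.idxmax()["value"]
print(min_label, max_label)

# The label can be passed to loc to fetch the whole row.
largest_row = getData.loc[max_label]
print(largest_row)
```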




describe()

It returns summary statistics of a dataframe, such as count, mean, standard deviation, min, max and percentiles. By default only numeric columns are summarized. The syntax is -

DataFrame.describe(percentiles=None, include=None, exclude=None)
Example
import pandas as pd

data = [20,12,3,30,43,23]
getData = pd.DataFrame(data)
print(getData.describe())
Output of the above code-
               0
count   6.000000
mean   21.833333
std    13.934370
min     3.000000
25%    14.000000
50%    21.500000
75%    28.250000
max    43.000000
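
The percentiles and include parameters change what is reported. A brief sketch on the Item/Quantity sample data used earlier; output is not shown because its exact layout varies between pandas versions.

```python
import pandas as pd

data = [['Biscuit', 6], ['Chips', 3], ['Candy', 20]]
getData = pd.DataFrame(data, columns=['Item', 'Quantity'],
                       index=[101, 102, 103])

# Request the 10th and 90th percentiles for the numeric column.
numeric = getData.describe(percentiles=[0.1, 0.9])
print(numeric)

# include="all" also summarizes non-numeric columns,
# adding rows such as unique, top and freq.
everything = getData.describe(include="all")
print(everything)
```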


mean()

The mean() function returns the mean of the values over the requested axis. The syntax is -

DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None)
Example
import pandas as pd

data = [20,12,3,30,43,23]
getData = pd.DataFrame(data)
print(getData.mean())
Output of the above code-
0    21.833333
dtype: float64


median()

The median() function returns the median of the values over the requested axis. The syntax is -

DataFrame.median(axis=None, skipna=None, level=None, numeric_only=None)
Example
import pandas as pd

data = [20,12,3,30,43,23]
getData = pd.DataFrame(data)
print(getData.median())
Output of the above code-
0    21.5
dtype: float64
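
The mean and the median react differently to extreme values. The sketch below adds one made-up outlier (500) to the same data to illustrate this; the column name value is illustrative.

```python
import pandas as pd

data = [20, 12, 3, 30, 43, 23]
getData = pd.DataFrame(data, columns=["value"])

# Same data plus one hypothetical outlier.
with_outlier = pd.DataFrame(data + [500], columns=["value"])

# The mean is pulled strongly toward the outlier...
mean_before = getData["value"].mean()
mean_after = with_outlier["value"].mean()

# ...while the median moves only slightly (21.5 -> 23.0).
median_before = getData["value"].median()
median_after = with_outlier["value"].median()

print(mean_before, mean_after)
print(median_before, median_after)
```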




Dataframe Iteration Methods

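These are the dataframe iteration methods - items() and iterrows().

items()

The items() method yields (column label, Series) pairs, one for each column. It was formerly also available as iteritems(), which was deprecated and then removed in pandas 2.0.

import pandas as pd

data = [['Priska',16],['Smith',13],['Alex',20]]
getData = pd.DataFrame(data,columns=['Name','Age'],
                       index=[101,102,103])

# key is the column label, value is that column as a Series
for key,value in getData.items():
    print(key, value)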
Output of the above code-
Name 101    Priska
102     Smith
103      Alex
Name: Name, dtype: object
Age 101    16
102    13
103    20
Name: Age, dtype: int64
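
Column-wise iteration is handy for building a per-column result. The sketch below uses items() (the current name of the column iterator, since iteritems() no longer exists in pandas 2.0 and later) to collect each column into a plain Python list; the dict name columns_as_lists is illustrative.

```python
import pandas as pd

data = [['Priska', 16], ['Smith', 13], ['Alex', 20]]
getData = pd.DataFrame(data, columns=['Name', 'Age'],
                       index=[101, 102, 103])

# Map each column label to that column's values as a list.
columns_as_lists = {}
for label, column in getData.items():
    columns_as_lists[label] = column.tolist()

print(columns_as_lists)
```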


iterrows()

The iterrows() method yields (index label, Series) pairs, one for each row. Each row Series is indexed by the column labels.

import pandas as pd

data = [['Priska',16],['Smith',13],['Alex',20]]
getData = pd.DataFrame(data,columns=['Name','Age'],
                       index=[101,102,103])

for key,value in getData.iterrows():
    print(key, value)
Output of the above code-
101 Name    Priska
Age         16
Name: 101, dtype: object
102 Name    Smith
Age        13
Name: 102, dtype: object
103 Name    Alex
Age       20
Name: 103, dtype: object
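
Because each row arrives as a Series indexed by column label, individual fields can be read by name inside the loop. A minimal sketch on the same data; the list name lines is illustrative.

```python
import pandas as pd

data = [['Priska', 16], ['Smith', 13], ['Alex', 20]]
getData = pd.DataFrame(data, columns=['Name', 'Age'],
                       index=[101, 102, 103])

lines = []
for label, row in getData.iterrows():
    # row['Name'] and row['Age'] read single fields of the current row.
    lines.append(f"{label}: {row['Name']} is {row['Age']} years old")

for line in lines:
    print(line)
```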



