3 Easy Ways to Create a Subset of Python Dataframe

Published on August 3, 2022
Safa Mulani

Hello, readers! In this article, we will look at three ways to create a subset of a pandas DataFrame in Python.

So, let us get started!


First, what is a Python Dataframe?

The pandas module for Python provides two main data structures for storing values: Series and DataFrame.

A DataFrame holds data in a two-dimensional, table-like form, i.e. as labelled rows and columns. We can therefore create and access subsets of it in the following ways:

  • Access data according to the rows as subset
  • Fetch data according to the columns as subset
  • Access specific data from some rows as well as columns as subset

Now that we know what a DataFrame and a subset are, let us look at the different techniques for creating a subset of a DataFrame.


Creating a Dataframe to work with!

To create subsets of a dataframe, we need to create a dataframe. Let’s get that out of our way first:

import pandas as pd

data = {
    "Roll-num": [10, 20, 30, 40, 50, 60, 70],
    "Age": [12, 14, 13, 12, 14, 13, 15],
    "NAME": ["John", "Camili", "Rheana", "Joseph", "Amanti", "Alexa", "Siri"],
}
block = pd.DataFrame(data)
print("Original Data frame:\n")
print(block)

Output:

Original Data frame:

   Roll-num  Age    NAME
0        10   12    John
1        20   14  Camili
2        30   13  Rheana
3        40   12  Joseph
4        50   14  Amanti
5        60   13   Alexa
6        70   15    Siri

Here, we have created a DataFrame using the pandas.DataFrame() constructor. We will use this dataset throughout the article.

Let us begin!


1. Create a subset of a Python dataframe using loc[]

The pandas loc[] indexer lets us form a subset of a DataFrame from specific rows, specific columns, or a combination of both. Although it is often called a function, loc is an indexer and is used with square brackets rather than parentheses.

loc[] works on the basis of labels, i.e. we provide the labels of the rows and columns we want in the subset. In our example the row labels are the default integers 0 to 6, so they happen to look like positions.

Syntax:

pandas.DataFrame.loc[]

Example 1: Extract data of specific rows of a dataframe

block.loc[[0,1,3]]

Output:

As seen below, we have created a subset which includes all the data of rows 0, 1, and 3.

   Roll-num  Age    NAME
0        10   12    John
1        20   14  Camili
3        40   12  Joseph

Example 2: Create a subset of rows using slicing

block.loc[0:3]

Here, we have extracted all the rows from label 0 to label 3 using the slicing operator with loc[]. Note that, unlike ordinary Python slicing, a loc[] slice includes both endpoints, so row 3 is part of the result.

Output:

   Roll-num  Age    NAME
0        10   12    John
1        20   14  Camili
2        30   13  Rheana
3        40   12  Joseph
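The inclusive behaviour of a loc[] slice is easy to miss, so here is a minimal sketch comparing it with a positional slice. It rebuilds the same example data; the variable names by_label and by_position are chosen only for illustration, and iloc[] is the position-based indexer covered in section 2.

```python
import pandas as pd

# Same data as the example DataFrame used in this article.
data = {
    "Roll-num": [10, 20, 30, 40, 50, 60, 70],
    "Age": [12, 14, 13, 12, 14, 13, 15],
    "NAME": ["John", "Camili", "Rheana", "Joseph", "Amanti", "Alexa", "Siri"],
}
block = pd.DataFrame(data)

# Label-based slice: both endpoints are included, so rows 0, 1, 2 and 3.
by_label = block.loc[0:3]

# Position-based slice: the stop position is excluded, so rows 0, 1 and 2.
by_position = block.iloc[0:3]

print(len(by_label), len(by_position))
```

With the default integer index the two calls look almost identical, which is why it helps to keep the endpoint rule in mind when switching between them.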

Example 3: Create a subset of particular columns using labels

block.loc[0:2,['Age','NAME']]

Output:

   Age    NAME
0   12    John
1   14  Camili
2   13  Rheana

Here, we have created a subset that includes rows 0 to 2 but only the columns ‘Age’ and ‘NAME’.
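Because the example DataFrame uses the default integer index, it may not be obvious that loc[] selects by label rather than by position. The following sketch, which assumes we re-index the same data by the ‘NAME’ column (a step that is not part of the original example), shows loc[] working with string labels. The names named, picked and sliced are illustrative only.

```python
import pandas as pd

# Same data as the example DataFrame used in this article.
data = {
    "Roll-num": [10, 20, 30, 40, 50, 60, 70],
    "Age": [12, 14, 13, 12, 14, 13, 15],
    "NAME": ["John", "Camili", "Rheana", "Joseph", "Amanti", "Alexa", "Siri"],
}
block = pd.DataFrame(data)

# Assumption for illustration: use the NAME column as the row labels.
named = block.set_index("NAME")

# Select rows by their string labels and a single column by its label.
picked = named.loc[["John", "Joseph"], ["Age"]]

# A label slice includes both endpoints: Camili, Rheana and Joseph.
sliced = named.loc["Camili":"Joseph"]

print(picked)
print(sliced)
```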


2. Using iloc[] to create a subset of a dataframe

The pandas iloc[] indexer lets us create a subset by choosing specific rows and columns based on their integer positions.

That is, unlike loc[], which works on labels, iloc[] works on zero-based positions. We can create a subset of a DataFrame by providing the position numbers of the rows and columns we want.

Syntax:

pandas.DataFrame.iloc[]

Example:

block.iloc[[0,1,3,6],[0,2]]

Here, we have created a subset which includes rows 0, 1, 3 and 6 and the columns at positions 0 and 2, i.e. ‘Roll-num’ and ‘NAME’.

Output:

   Roll-num    NAME
0        10    John
1        20  Camili
3        40  Joseph
6        70    Siri
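To make the difference between positions and labels more concrete, here is a small sketch under the assumption that we reorder the rows first (the article itself does not do this). After sorting, position 0 and label 0 point at different rows. The names last_two and reordered are illustrative only.

```python
import pandas as pd

# Same data as the example DataFrame used in this article.
data = {
    "Roll-num": [10, 20, 30, 40, 50, 60, 70],
    "Age": [12, 14, 13, 12, 14, 13, 15],
    "NAME": ["John", "Camili", "Rheana", "Joseph", "Amanti", "Alexa", "Siri"],
}
block = pd.DataFrame(data)

# iloc[] also accepts slices and negative positions:
# the last two rows and the first two columns.
last_two = block.iloc[-2:, :2]

# Reorder the rows so that labels and positions no longer coincide.
reordered = block.sort_values("Roll-num", ascending=False)

# Position 0 is now the row with the highest Roll-num (Siri) ...
first_by_position = reordered.iloc[0]
# ... while label 0 still refers to the original first row (John).
first_by_label = reordered.loc[0]

print(last_two)
print(first_by_position["NAME"], first_by_label["NAME"])
```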

3. Indexing operator to create a subset of a dataframe

More simply, we can use the indexing operator, i.e. square brackets, to select a subset of columns from the data.

Syntax:

dataframe[['col1','col2','colN']]

Example:

block[['Age','NAME']]

Here, we have selected all the values of the columns ‘Age’ and ‘NAME’.

Output:

   Age    NAME
0   12    John
1   14  Camili
2   13  Rheana
3   12  Joseph
4   14  Amanti
5   13   Alexa
6   15    Siri
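Two related uses of the indexing operator are worth a quick sketch, although they go slightly beyond the column-list syntax shown above: a single column name returns a Series instead of a DataFrame, and a boolean condition selects rows. The threshold of 14 and the variable names are chosen only for illustration.

```python
import pandas as pd

# Same data as the example DataFrame used in this article.
data = {
    "Roll-num": [10, 20, 30, 40, 50, 60, 70],
    "Age": [12, 14, 13, 12, 14, 13, 15],
    "NAME": ["John", "Camili", "Rheana", "Joseph", "Amanti", "Alexa", "Siri"],
}
block = pd.DataFrame(data)

# Single brackets with one column name return a Series.
ages = block["Age"]

# Double brackets (a list of names) return a DataFrame, even for one column.
ages_frame = block[["Age"]]

# A boolean condition inside the brackets keeps only the matching rows.
older = block[block["Age"] >= 14]

print(type(ages).__name__, type(ages_frame).__name__)
print(older)
```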

Conclusion

With this, we have come to the end of this topic. Feel free to comment below if you have any questions. For more such posts related to Python, stay tuned, and till then, Happy Learning!! :)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.