How to Parse CSV Files in Python

Published on August 3, 2022
By Pankaj

CSV files are widely used to store tabular data. Data from database tables or Excel spreadsheets can easily be exported to CSV, and the format is easy for both humans and programs to read. In this tutorial, we will learn how to parse CSV files in Python.

What is Parsing?

Parsing a file means reading its contents and converting them into a structure that a program can work with. The file may be a plain text file or a spreadsheet.

What is a CSV file?

CSV stands for Comma-Separated Values: each line of the file is a record, and the fields within a record are separated by commas. CSV files are often produced by programs that handle large amounts of data. Data can easily be exported to CSV from spreadsheets and databases, and imported from CSV by other programs.

Let’s see how to parse a CSV file. Parsing CSV files in Python is quite easy. Python has a built-in csv module that provides functionality for both reading and writing CSV files. The module supports several CSV formats (called dialects), which makes it easier to process files produced by different programs.
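To make the format concrete, here is a minimal sketch that uses made-up sample records (the names and values are illustrative only). It shows why a dedicated parser is preferable to splitting each line on commas, since a quoted field may itself contain a comma, and it shows the `delimiter` argument for a semicolon-separated variant.

```python
import csv
import io

# Hypothetical sample data: the first field of the data row contains a comma,
# so it is wrapped in double quotes.
sample = 'name,department,semester\n"Miller, David",ECE,3\n'

# Naive splitting breaks the quoted field into two pieces.
naive = sample.splitlines()[1].split(',')
print(naive)        # ['"Miller', ' David"', 'ECE', '3']

# csv.reader understands the quoting rules.
rows = list(csv.reader(io.StringIO(sample)))
print(rows[1])      # ['Miller, David', 'ECE', '3']

# The same kind of records separated by semicolons, read with delimiter=';'.
semicolon_sample = 'name;department;semester\nLisa;PIE;3\n'
semi_rows = list(csv.reader(io.StringIO(semicolon_sample), delimiter=';'))
print(semi_rows[1])  # ['Lisa', 'PIE', '3']
```

Here `io.StringIO` stands in for an opened file so that the sketch runs without any file on disk.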

Parsing a CSV file in Python

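Reading CSV files using the built-in Python csv module:

import csv

# newline='' lets the csv module handle line endings itself,
# as recommended in the csv module documentation.
with open('university_records.csv', 'r', newline='') as csv_file:
    reader = csv.reader(csv_file)

    # Each row is returned as a list of strings.
    for row in reader:
        print(row)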

Output:

Python Parse CSV File
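When the first line of a file holds column names, `csv.DictReader` returns each row as a dictionary keyed by those names. The sketch below assumes a header of `name,branch,year,cgpa`; these column names are hypothetical, since the actual layout of university_records.csv is not shown here.

```python
import csv
import io

# Assumed layout of a records file; the header names are hypothetical.
sample = (
    'name,branch,year,cgpa\n'
    'David,MCE,3,7.8\n'
    'Lisa,PIE,3,9.1\n'
)

# io.StringIO stands in for an opened file so the sketch runs on its own.
reader = csv.DictReader(io.StringIO(sample))

records = []
for row in reader:
    # Every value arrives as a string; convert the numeric columns explicitly.
    row['year'] = int(row['year'])
    row['cgpa'] = float(row['cgpa'])
    records.append(row)

print(reader.fieldnames)    # ['name', 'branch', 'year', 'cgpa']
print(records[1]['name'])   # Lisa
print(records[1]['cgpa'])   # 9.1
```

Accessing fields by name rather than by position keeps the code readable and less fragile if columns are later reordered.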

Writing a CSV file in Python

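To write to a file, we have to open it in write mode ('w') or append mode ('a'). Write mode replaces any existing contents, while append mode adds to the end of the file. Here, we will append the data to the existing CSV file.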

import csv

row = ['David', 'MCE', '3', '7.8']
row1 = ['Lisa', 'PIE', '3', '9.1']
row2 = ['Raymond', 'ECE', '2', '8.5']

# 'a' appends to the existing file; newline='' prevents extra blank
# lines between rows on some platforms.
with open('university_records.csv', 'a', newline='') as csv_file:
    writer = csv.writer(csv_file)

    writer.writerow(row)
    writer.writerow(row1)
    writer.writerow(row2)
Python Append To CSV File
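For comparison, here is a sketch of write mode, which creates the file from scratch. It writes a header row followed by several rows at once with `writerows`. The header names and the temporary directory are assumptions made so that the example does not touch any real file.

```python
import csv
import os
import tempfile

header = ['name', 'branch', 'year', 'cgpa']   # hypothetical column names
rows = [
    ['David', 'MCE', '3', '7.8'],
    ['Lisa', 'PIE', '3', '9.1'],
    ['Raymond', 'ECE', '2', '8.5'],
]

# Write into a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'university_records.csv')

# 'w' creates or truncates the file; newline='' lets the csv module
# control line endings.
with open(path, 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(header)   # a single row
    writer.writerows(rows)    # many rows at once

# Read the file back to confirm what was written.
with open(path, newline='') as csv_file:
    written = list(csv.reader(csv_file))

print(len(written))   # 4
print(written[0])     # ['name', 'branch', 'year', 'cgpa']
```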

Parse CSV Files using Pandas library

Another popular way to work with CSV files is the pandas library. Pandas is a Python data analysis library. It offers data structures, tools, and operations for working with and manipulating data, mostly in the form of one-dimensional or two-dimensional tables.

Uses and Features of pandas Library

  • Data sets pivoting and reshaping.
  • Data manipulation with indexing using DataFrame objects.
  • Data filtration.
  • Merge and join operation on data sets.
  • Slicing, indexing, and subsetting of large data sets.
  • Missing data handling and data alignment.
  • Row/Column insertion and deletion.
  • One-dimensional (Series) and two-dimensional (DataFrame) data structures.
  • Reading and writing tools for data in various file formats.

To work with CSV files this way, you need to install pandas. Installing pandas is quite simple. Run the following command to install it using pip.

$ pip install pandas

Once the installation is complete, you are good to go.

Reading a CSV file using Pandas Module

Before you can import CSV data with pandas, you need to know where the data file is located in your filesystem and what your current working directory is. I suggest keeping your code and the data file in the same directory so that you can refer to the file by name alone, without specifying a full path.

import pandas

result = pandas.read_csv('ign.csv')

print(result)

Output

Read CSV File using pandas module
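If the data file is not in the current working directory, a full path can be passed instead. The following sketch builds one with `pathlib` and creates a small throwaway file first so that it runs anywhere. The file name, column names, and values are hypothetical and are not the contents of ign.csv.

```python
import tempfile
from pathlib import Path

import pandas

# Create a small throwaway CSV so the sketch is runnable anywhere.
# The column names and values are hypothetical.
data_dir = Path(tempfile.mkdtemp())
csv_path = data_dir / 'sample.csv'
csv_path.write_text('title,platform,score\nGame A,PC,9.0\nGame B,Console,7.5\n')

# Passing a full path avoids depending on the current working directory.
result = pandas.read_csv(csv_path)

print(Path.cwd())              # the directory that relative paths are resolved against
print(result.shape)            # (2, 3)
print(list(result.columns))    # ['title', 'platform', 'score']
print(result['score'].mean())  # 8.25
```

Note that pandas infers a numeric type for the score column, so numeric operations such as `mean()` work without manual conversion.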

Writing a CSV file using Pandas Module

Writing CSV files using pandas is as simple as reading them. The only new term used is DataFrame. A pandas DataFrame is a two-dimensional, heterogeneous tabular data structure: data is arranged in rows and columns, and both axes are labeled. A DataFrame consists of three main components: data, rows, and columns.

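from pandas import DataFrame

# Each key is a column name; each list holds that column's values.
C = {'Programming language': ['Python', 'Java', 'C++'],
     'Designed by': ['Guido van Rossum', 'James Gosling', 'Bjarne Stroustrup'],
     'Appeared': ['1991', '1995', '1985'],
     'Extension': ['.py', '.java', '.cpp'],
     }

df = DataFrame(C, columns=['Programming language', 'Designed by', 'Appeared', 'Extension'])

# index=False leaves the row index out of the file; header=True writes the column names.
# to_csv returns None when given a file path, so its result is not assigned.
df.to_csv('program_lang.csv', index=False, header=True)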

Output

Python Pandas Write CSV File
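To see what the `index` argument changes, the sketch below calls `to_csv` without a path, in which case pandas returns the CSV text as a string instead of writing a file. It uses a reduced, two-column version of the data above for brevity.

```python
import io

from pandas import DataFrame, read_csv

df = DataFrame({
    'Programming language': ['Python', 'Java', 'C++'],
    'Appeared': [1991, 1995, 1985],
})

# With no path argument, to_csv returns the CSV text instead of writing a file.
without_index = df.to_csv(index=False)
with_index = df.to_csv()

print(without_index.splitlines()[0])  # Programming language,Appeared
print(with_index.splitlines()[0])     # ,Programming language,Appeared

# Reading the text back recovers the original values.
round_trip = read_csv(io.StringIO(without_index))
print(round_trip['Appeared'].tolist())  # [1991, 1995, 1985]
```

With the index included, the file gains an unnamed first column, which is why `index=False` is a common choice when the row numbers carry no meaning.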

Conclusion

We learned to parse a CSV file using the built-in csv module and the pandas library, and you now know how to use both for reading and writing data in CSV format. There are many other ways to parse files, but they are not as widely used for CSV data. PlyPlus, PLY, and ANTLR are some of the libraries used for parsing text data in general.

The code shown above is basic and straightforward, and it should be understandable to anyone familiar with Python. However, manipulating complex data with empty and ambiguous entries is not easy. It requires practice and knowledge of the various tools in pandas. CSV is a simple and widely supported format for saving and sharing tabular data, and pandas is an excellent alternative to the csv module. You may find it difficult in the beginning, but it isn’t so hard to learn. With a little bit of practice, you will master it.
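As a small illustration of the point about empty entries, here is a sketch of how pandas can mark and handle missing values. The records and the placeholder text 'missing' are made up for this example.

```python
import io

import pandas

# Hypothetical records: one cgpa is empty and one uses the placeholder 'missing'.
sample = (
    'name,branch,cgpa\n'
    'David,MCE,7.8\n'
    'Lisa,PIE,\n'
    'Raymond,ECE,missing\n'
)

# na_values tells read_csv which extra strings to treat as missing data.
df = pandas.read_csv(io.StringIO(sample), na_values=['missing'])

missing_count = int(df['cgpa'].isna().sum())
print(missing_count)   # 2

# Option 1: drop the rows that have no cgpa.
cleaned = df.dropna(subset=['cgpa'])
print(len(cleaned))    # 1

# Option 2: keep the rows and substitute a default value.
filled = df['cgpa'].fillna(0.0)
print(filled.tolist())  # [7.8, 0.0, 0.0]
```

Which option is appropriate depends on the data: dropping rows loses records, while filling in a default can distort averages.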

JournalDev
August 26, 2019

Nice tutorial. In the first example of reading CSV, we try to close the file while also using a with statement. The with statement will close your resource, so you don’t need to.

- Ankit Rana
