How to Export Pandas DataFrame to a CSV File

Introduction

Exporting data into different formats is a common task for data scientists and analysts working with Python. The Pandas library, an open-source data analysis and manipulation tool, provides powerful functions for these data export tasks. In this article, we will explore in detail how to export a Pandas DataFrame to a CSV file.

Creating a Pandas DataFrame

Let’s start by creating a Pandas DataFrame. Here’s an example:

# import pandas
import pandas as pd

# create a simple dataset of people
data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Country': ['USA', 'Canada', 'Germany', 'Australia'],
        'Age': [24, 36, 29, 50]}

df = pd.DataFrame(data)

# print the dataframe
print(df)

This script will output:

    Name    Country  Age
0   John        USA   24
1   Anna     Canada   36
2  Peter    Germany   29
3  Linda  Australia   50

We have created a Pandas DataFrame from a dictionary. It has three columns (Name, Country, and Age) and four rows of data.

Exporting DataFrame to a CSV file

Every Pandas DataFrame has a method called to_csv() that saves it to a local CSV file. Here is the basic syntax:

DataFrame.to_csv('file_name.csv')

Continuing from our previous example, here’s how you would export our DataFrame df to a CSV file:

df.to_csv('people.csv')

This line of code writes the DataFrame df to a CSV file named people.csv. Because the file name is a relative path, the file is saved in the current working directory, which is often, but not always, the folder containing your Python script or Jupyter notebook. To save it in a different directory, include the path:

df.to_csv('path_to_file/people.csv')
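One practical detail when writing to another directory is that to_csv() does not create missing folders. The following is a minimal sketch of one way to handle that; the exports folder name and the use of a temporary directory are assumptions made only to keep the example self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [24, 36]})

# Hypothetical output location; a temporary directory keeps the sketch self-contained.
base_dir = Path(tempfile.mkdtemp())
out_path = base_dir / 'exports' / 'people.csv'

# to_csv() will not create the 'exports' folder for us, so create it first.
out_path.parent.mkdir(parents=True, exist_ok=True)

# to_csv() accepts a pathlib.Path as well as a plain string.
df.to_csv(out_path)

print(out_path.exists())
```

In your own code you would replace base_dir with the real destination folder.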

The to_csv() function comes with a number of options for customization.

Customizing the CSV Output

1. Selecting the delimiter

The default delimiter of a CSV file is a comma. However, you can change this by using the sep parameter:

df.to_csv('people.csv', sep='\t')

This will save the DataFrame as a tab-separated CSV file.
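A file written with a custom delimiter has to be read back with the same delimiter. Here is a small sketch of that round trip. It relies on the fact that to_csv() returns the CSV text as a string when no file name is given; the variable names are only illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'],
                   'Country': ['USA', 'Canada']})

# With no file name, to_csv() returns the text instead of writing a file.
tsv_text = df.to_csv(sep='\t', index=False)
print(tsv_text)

# Reading the data back requires the same separator.
df_back = pd.read_csv(io.StringIO(tsv_text), sep='\t')
print(list(df_back.columns))
```

If sep='\t' were left out of read_csv(), each line would be parsed as a single column, because the text contains no commas.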

2. Selecting the encoding

The to_csv() function defaults to using ‘utf-8’ encoding when saving the file. However, you can specify a different encoding with the encoding parameter:

df.to_csv('people.csv', encoding='latin1')
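The encoding only matters when the data contains non-ASCII characters, which the sample DataFrame does not. The sketch below uses a few made-up names with accented letters (an assumption for illustration) and shows that the file must be read back with the same encoding it was written with.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Illustrative data with accented characters that exist in Latin-1.
df = pd.DataFrame({'Name': ['José', 'Zoë'],
                   'Country': ['España', 'België']})

out_path = Path(tempfile.mkdtemp()) / 'people_latin1.csv'
df.to_csv(out_path, index=False, encoding='latin1')

# In Latin-1, 'é' is stored as the single byte 0xE9.
raw = out_path.read_bytes()
print(b'Jos\xe9' in raw)

# Read the file back with the encoding it was written in.
df_back = pd.read_csv(out_path, encoding='latin1')
print(df_back['Name'].tolist())
```

Characters that the chosen encoding cannot represent cause an error at write time, so utf-8 is usually the safer choice unless the consumer of the file requires something else.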

3. Excluding the index

By default, to_csv() includes the DataFrame’s index as the first column in the CSV file. If you don’t want this, use the index parameter:

df.to_csv('people.csv', index=False)
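To see what the index parameter changes, it can help to compare the two outputs side by side. This sketch prints the CSV text instead of writing files, using the string that to_csv() returns when no file name is given.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [24, 36]})

with_index = df.to_csv()                # default: index=True
without_index = df.to_csv(index=False)

# The unnamed index shows up as an empty first field in the header.
print(with_index.splitlines())
# → [',Name,Age', '0,John,24', '1,Anna,36']

print(without_index.splitlines())
# → ['Name,Age', 'John,24', 'Anna,36']
```

Dropping the index is usually what you want when it is just the default 0, 1, 2, ... row numbering.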

4. Excluding the header

Likewise, the column names (header) of the DataFrame are included by default. To export the DataFrame to CSV without the header, use the header parameter:

df.to_csv('people.csv', header=False)
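One common reason to omit the header is appending rows to a file that already has one. The sketch below combines header=False with the mode parameter of to_csv(); the two small DataFrames and the temporary file location are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

out_path = Path(tempfile.mkdtemp()) / 'people.csv'

first = pd.DataFrame({'Name': ['John'], 'Age': [24]})
second = pd.DataFrame({'Name': ['Anna'], 'Age': [36]})

# The first write creates the file and includes the header row.
first.to_csv(out_path, index=False)

# The second write appends (mode='a') and skips the header,
# so the column names are not repeated in the middle of the file.
second.to_csv(out_path, index=False, header=False, mode='a')

lines = out_path.read_text().splitlines()
print(lines)
# → ['Name,Age', 'John,24', 'Anna,36']
```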

5. Specifying columns to export

The to_csv() function allows you to specify which columns to export using the columns parameter:

df.to_csv('people.csv', columns=['Name', 'Country'])

This will only export the ‘Name’ and ‘Country’ columns.
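As a quick check, the sketch below prints the exported text and confirms that the columns parameter only affects what is written; the DataFrame itself keeps all of its columns.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'],
                   'Country': ['USA', 'Canada'],
                   'Age': [24, 36]})

csv_text = df.to_csv(index=False, columns=['Name', 'Country'])
print(csv_text.splitlines())
# → ['Name,Country', 'John,USA', 'Anna,Canada']

# The DataFrame is unchanged; 'Age' is still there.
print(list(df.columns))
# → ['Name', 'Country', 'Age']
```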

6. Compression options

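Pandas can also compress the CSV file as it writes it. The compression argument accepts ‘infer’, ‘gzip’, ‘bz2’, ‘zip’, ‘xz’, or None. With ‘infer’, which is the default, Pandas picks the compression method from the file extension (.gz, .bz2, .zip, .xz) and writes an uncompressed file if the extension is not one it recognizes. You can also name the method explicitly: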

df.to_csv('people.csv.gz', compression='gzip')
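The sketch below relies on the default compression='infer' instead of naming the method, and then verifies the result in two ways: by checking the gzip signature bytes at the start of the file and by reading the file back with read_csv(), which infers the compression from the extension in the same way. The temporary file location is an assumption for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [24, 36]})

out_path = Path(tempfile.mkdtemp()) / 'people.csv.gz'

# No compression argument: the default 'infer' picks gzip from the .gz suffix.
df.to_csv(out_path, index=False)

# Every gzip file starts with the two bytes 0x1f 0x8b.
raw = out_path.read_bytes()
print(raw[:2] == b'\x1f\x8b')

# read_csv() decompresses transparently based on the extension.
df_back = pd.read_csv(out_path)
print(df_back['Name'].tolist())
```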

7. Specifying float formatting

You can also specify the float format for data columns. This is particularly useful when you have numerical data with many decimal places.

df.to_csv('people.csv', float_format='%.2f')

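This writes every floating-point value with two decimal places. Only the text in the CSV file is affected; the values stored in the DataFrame are unchanged. Note that the example DataFrame df has no floating-point columns (Age holds integers), so the option has no visible effect on it.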
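To see float_format in action, the data needs at least one float column. The sketch below adds a made-up Height column (an assumption for illustration) and prints the resulting CSV text.

```python
import pandas as pd

# 'Height' is a hypothetical float column added for this example.
df = pd.DataFrame({'Name': ['John', 'Anna'],
                   'Height': [1.8234, 1.6571],
                   'Age': [24, 36]})

csv_text = df.to_csv(index=False, float_format='%.2f')
print(csv_text.splitlines())
# → ['Name,Height,Age', 'John,1.82,24', 'Anna,1.66,36']

# The DataFrame still holds the full-precision values.
print(df['Height'].tolist())
# → [1.8234, 1.6571]
```

The integer Age column is written as is; the format string applies only to floating-point values.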

Conclusion

This guide shows you how to use the to_csv() function to export your Pandas DataFrames to CSV files, with many customizable options to suit a wide variety of needs. This functionality will enable you to seamlessly integrate your Python data analysis tasks with other parts of your data pipeline.
