How to Export a Pandas DataFrame to a JSON File


Introduction

JSON (JavaScript Object Notation) is a lightweight data interchange format that is easy for humans to read and write and easy for machines to parse and generate. It’s widely used in web applications, especially for sending data from a server to a client. The Pandas library in Python provides the to_json() method for exporting DataFrame objects to JSON. This article walks through exporting a Pandas DataFrame to a JSON file and customizing the output.

Creating a Pandas DataFrame

Let’s begin by creating a Pandas DataFrame.

# import pandas
import pandas as pd

# create a simple dataset of people
data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Country': ['USA', 'Canada', 'Germany', 'Australia'],
        'Age': [24, 36, 29, 50]}

df = pd.DataFrame(data)

# print the dataframe
print(df)

In the script above, we have created a simple DataFrame that contains three columns (Name, Country, and Age) and four rows of data.

Exporting DataFrame to a JSON File

To export a DataFrame to a JSON file, call the DataFrame’s to_json() method and pass it a file path. This is how it’s done:

df.to_json('people.json')

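This will export the DataFrame to a JSON file named people.json. Because the path is relative, the file is created in the current working directory of the Python process. That is often, but not always, the directory that contains your script or Jupyter notebook. Pass an absolute path if you need the file in a specific location.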
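To check what was actually written, you can read the file back with Python’s standard json module. The following is a minimal sketch, not the only way to do it: the temporary directory and the variable names out_path and loaded are choices made for this example so that it leaves no files behind.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# same sample data as above
df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Country': ['USA', 'Canada', 'Germany', 'Australia'],
    'Age': [24, 36, 29, 50],
})

# write into a temporary directory (an assumption for this sketch)
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / 'people.json'
    df.to_json(out_path)

    # load the file again to inspect its structure
    with open(out_path, encoding='utf-8') as f:
        loaded = json.load(f)

# with the default settings, the top-level keys are the column names
print(list(loaded.keys()))
# each column maps row labels (as strings) to values
print(loaded['Name'])
```

With the default settings, the row labels 0 to 3 become the string keys "0" to "3" inside each column, since JSON object keys are always strings.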

Customizing the JSON Output

The to_json() function comes with several parameters that you can use to customize the JSON output.

1. Orient

The orient parameter controls the layout of the JSON output. For a DataFrame, the available options are:

  • 'split' : Dictionary with separate index, columns, and data entries.
  • 'records' : List where each row is a JSON object.
  • 'index' : Dictionary with row index labels as keys.
  • 'columns' : Dictionary with column names as keys.
  • 'values' : Just the array of values, without index or column labels.
  • 'table' : Dictionary with a schema entry describing the columns and a data entry holding the rows.

The default option for a DataFrame is 'columns'. Here is an example using the 'records' orientation:

df.to_json('people.json', orient='records')
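To compare the layouts side by side, it helps to print a few of them for a small DataFrame. The sketch below relies on the fact that to_json() returns the JSON as a string when no path is given. The two-row DataFrame and the layouts dictionary are assumptions made to keep the output short.

```python
import json

import pandas as pd

# a shortened version of the sample data
df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [24, 36]})

# without a path, to_json() returns a string, which we parse for display
layouts = {
    orient: json.loads(df.to_json(orient=orient))
    for orient in ['split', 'records', 'index', 'columns', 'values']
}

for orient, parsed in layouts.items():
    print(orient, '->', parsed)
```

The 'records' layout is usually the most convenient one for web APIs, because each row becomes a self-contained object. It drops the row index, though, so choose 'split', 'index', or 'table' if the index carries information you need to keep.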

2. Date Format

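The date_format parameter controls how datetime values are written. The options are:

  • 'epoch' : Time elapsed since 1970-01-01 as a number (milliseconds by default).
  • 'iso' : ISO 8601 formatted strings.

The default is 'epoch' for every orientation except 'table', where it is 'iso'. The sample DataFrame has no datetime column, so this parameter only changes the output once such a column is present. Here is an example using the 'iso' date format: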

df.to_json('people.json', date_format='iso')
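Since the sample data contains no dates, here is a small sketch with a datetime column to make the effect visible. The Joined column and its values are made up for this example.

```python
import json

import pandas as pd

# hypothetical data with a datetime column
df_dates = pd.DataFrame({
    'Name': ['John', 'Anna'],
    'Joined': pd.to_datetime(['2021-01-15', '2022-06-30']),
})

# with date_format='iso', datetimes are written as ISO 8601 strings
iso_records = json.loads(df_dates.to_json(orient='records', date_format='iso'))

for record in iso_records:
    print(record['Name'], record['Joined'])
```

ISO strings are readable and unambiguous for most consumers. The exact string layout (for example, the number of fractional-second digits) can vary with the Pandas version and the date_unit parameter.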

3. Double Precision

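The double_precision parameter controls the number of decimal places used when encoding floating-point values. The default is 10, and the maximum is 15. It has no effect on integer columns such as Age in the sample DataFrame. Here is an example with double precision set to 2: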

df.to_json('people.json', double_precision=2)
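Because the sample DataFrame holds only integers and strings, the following sketch uses a float column to show the rounding. The Item and Price columns are invented for illustration.

```python
import json

import pandas as pd

# hypothetical data with floating-point values
df_prices = pd.DataFrame({
    'Item': ['pen', 'book'],
    'Price': [1.23456, 19.98765],
})

# keep only two decimal places in the JSON output
rounded = json.loads(df_prices.to_json(orient='records', double_precision=2))

print(rounded)
```

The rounding applies only to the exported JSON. The values stored in the DataFrame itself are unchanged.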

4. Force ASCII

The force_ascii parameter controls whether non-ASCII characters are escaped in the output. The default is True, which writes such characters as \uXXXX escape sequences so that the file contains only ASCII. When it is set to False, the characters are written as-is. Both forms are valid JSON and decode to the same text. Here is an example where force_ascii is set to False:

df.to_json('people.json', force_ascii=False)
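The names in the sample DataFrame are plain ASCII, so the setting makes no difference there. The sketch below uses two made-up names with accented characters to show both behaviors.

```python
import json

import pandas as pd

# hypothetical names containing non-ASCII characters
df_names = pd.DataFrame({'Name': ['Zoë', 'Jürgen']})

# default: non-ASCII characters are escaped
escaped = df_names.to_json(orient='records')
# force_ascii=False: characters are kept as they are
raw = df_names.to_json(orient='records', force_ascii=False)

print(escaped)
print(raw)

# both strings decode to the same data
print(json.loads(escaped) == json.loads(raw))
```

Keeping the characters unescaped makes the file easier to read in a text editor, while the escaped form is safer when a downstream tool cannot be trusted to handle UTF-8 correctly.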

Conclusion

The ability to export a DataFrame to a JSON file is a fundamental skill for anyone working with data in Python. The Pandas library provides the to_json() function, which makes this task straightforward and efficient. With the ability to customize the output using various parameters, you can tailor the JSON output to meet your specific requirements. Whether you’re building a web application, working with RESTful APIs, or performing data analysis, knowing how to export DataFrame objects into JSON format will be beneficial.
