How to Export a Pandas Dataframe to a JSON File ?

Spread the love

In this post you will learn how to export a pandas dataframe to json file.

DataFrame.to_json() –

To convert a pandas dataframe to a json file we use the DataFrame.json() method.

To illustrate this let’s create a pandas dataframe.

import pandas as pd

data = {'Fruits': ['Apple','Banana','Orange','Mango','Grapes'],
       'Prices': [100, 20, 80, 60, 50]}
df = pd.DataFrame(data)
df

Now let’s export this pandas dataframe to a json file. We will save the file in the current directory.

df.to_json('./fruits_prices.json')

Now, let’s check the file. open the file using any text editor like vscode

{"Fruits":{"0":"Apple","1":"Banana","2":"Orange","3":"Mango","4":"Grapes"},"Prices":{"0":100,"1":20,"2":80,"3":60,"4":50}}

You can also format the json string using json formatter.

{
   "Fruits": {
      "0": "Apple",
      "1": "Banana",
      "2": "Orange",
      "3": "Mango",
      "4": "Grapes"
   },
   "Prices": {
      "0": 100,
      "1": 20,
      "2": 80,
      "3": 60,
      "4": 50
   }
}

Using different json formats –

There are various way you can format the json string when exporting a pandas dataframe to a json file. Some of the options are –

  1. split
  2. records
  3. index
  4. values
  5. table
  6. columns (default format)

Let’s try these one by one.

1. orient = ‘split’ –

df.to_json('./fruits_prices_split.json', orient='split')
{
   "columns": [
      "Fruits",
      "Prices"
   ],
   "index": [
      0,
      1,
      2,
      3,
      4
   ],
   "data": [
      [
         "Apple",
         100
      ],
      [
         "Banana",
         20
      ],
      [
         "Orange",
         80
      ],
      [
         "Mango",
         60
      ],
      [
         "Grapes",
         50
      ]
   ]
}

2. orient = ‘records’ –

df.to_json('./fruits_prices_records.json', orient='records')
[
   {
      "Fruits": "Apple",
      "Prices": 100
   },
   {
      "Fruits": "Banana",
      "Prices": 20
   },
   {
      "Fruits": "Orange",
      "Prices": 80
   },
   {
      "Fruits": "Mango",
      "Prices": 60
   },
   {
      "Fruits": "Grapes",
      "Prices": 50
   }
]

3. orient = ‘index’ –

df.to_json('./fruits_prices_index.json', orient='index')
{
   "0": {
      "Fruits": "Apple",
      "Prices": 100
   },
   "1": {
      "Fruits": "Banana",
      "Prices": 20
   },
   "2": {
      "Fruits": "Orange",
      "Prices": 80
   },
   "3": {
      "Fruits": "Mango",
      "Prices": 60
   },
   "4": {
      "Fruits": "Grapes",
      "Prices": 50
   }
}

4. orient= ‘values’ –

df.to_json('./fruits_prices_values.json', orient='values')
[
   [
      "Apple",
      100
   ],
   [
      "Banana",
      20
   ],
   [
      "Orange",
      80
   ],
   [
      "Mango",
      60
   ],
   [
      "Grapes",
      50
   ]
]

5. orient = ‘table’

df.to_json('./fruits_prices_table.json', orient='table')
{
   "schema": {
      "fields": [
         {
            "name": "index",
            "type": "integer"
         },
         {
            "name": "Fruits",
            "type": "string"
         },
         {
            "name": "Prices",
            "type": "integer"
         }
      ],
      "primaryKey": [
         "index"
      ],
      "pandas_version": "0.20.0"
   },
   "data": [
      {
         "index": 0,
         "Fruits": "Apple",
         "Prices": 100
      },
      {
         "index": 1,
         "Fruits": "Banana",
         "Prices": 20
      },
      {
         "index": 2,
         "Fruits": "Orange",
         "Prices": 80
      },
      {
         "index": 3,
         "Fruits": "Mango",
         "Prices": 60
      },
      {
         "index": 4,
         "Fruits": "Grapes",
         "Prices": 50
      }
   ]
}

6. orient = ‘columns’

df.to_json('./fruits_prices_columns.json', orient='columns')
{
   "Fruits": {
      "0": "Apple",
      "1": "Banana",
      "2": "Orange",
      "3": "Mango",
      "4": "Grapes"
   },
   "Prices": {
      "0": 100,
      "1": 20,
      "2": 80,
      "3": 60,
      "4": 50
   }
}

indent –

DataFrame.to_json() also have a indent parameter that let’s you set the length of whitespace used to indent each record.

df.to_json('./fruits_prices_indent2.json', indent=2)
{
  "Fruits":{
    "0":"Apple",
    "1":"Banana",
    "2":"Orange",
    "3":"Mango",
    "4":"Grapes"
  },
  "Prices":{
    "0":100,
    "1":20,
    "2":80,
    "3":60,
    "4":50
  }
}

Rating: 1 out of 5.

Leave a Reply