
Parsing JSON dataset using Pandas



Photo by Gabriel Heinzer on Unsplash


During data gathering, a data analyst has to handle many forms of data from different sources. Data fetched from web APIs, for example, is very often returned in JSON format. JSON serves as the container for that data and acts as a universal format that virtually every programming language can read.


What is JSON:


JSON (JavaScript Object Notation) is a standard text-based format for representing structured data, based on JavaScript object syntax. It is commonly used for transmitting data in web applications (e.g., sending data from the server to the client so it can be displayed on a web page, or vice versa).


JSON is a lightweight format for exchanging data between programs written in different languages. It is easy for humans to read and easy for machines to parse. Websites commonly return JSON from their APIs precisely because it is simple to parse in any programming language.


Structure of a JSON file:


{
    "col1": {
        "row1": 1,
        "row2": 2,
        "row3": 3
    },
    "col2": {
        "row1": "x",
        "row2": "y",
        "row3": "z"
    }
}

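A JSON file is structured like a Python dictionary: it is made of keys and values, and a value can itself be another nested object. Written on a single line, the same data looks like this:

{"col1":{"row1":1,"row2":2,"row3":3},"col2":{"row1":"x","row2":"y","row3":"z"}}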
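As a rough illustration of that dictionary-like structure, the sketch below parses the example data twice: once with Python's built-in json module and once with pandas. The variable names are chosen only for this example.

```python
import json
from io import StringIO

import pandas as pd

# The example structure from above, written as a single JSON string.
raw = (
    '{"col1": {"row1": 1, "row2": 2, "row3": 3},'
    ' "col2": {"row1": "x", "row2": "y", "row3": "z"}}'
)

# json.loads turns the JSON text into nested Python dictionaries.
data = json.loads(raw)
print(data["col2"]["row3"])

# pandas reads the same text as a table: with the default orientation,
# the outer keys become columns and the inner keys become the row index.
df = pd.read_json(StringIO(raw))
print(df)
```

Wrapping the string in StringIO makes it behave like a file; recent pandas versions warn when a literal JSON string is passed to read_json directly.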


Importing JSON Files:


We can work with JSON files using the Python Data Analysis Library (pandas). To have something to experiment with, first create a small DataFrame:

import pandas as pd

# Build a 2x2 DataFrame with labelled rows and columns
df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row1', 'row2'],
                  columns=['col1', 'col2'])
print(df)

The output of this code is:

     col1 col2
row1    a    b
row2    c    d

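We can convert this DataFrame to JSON using to_json. When no file path is given, to_json returns the JSON as a string. With orient='index', each row label becomes a key: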


df.to_json(orient='index')

The output is:

'{"row1":{"col1":"a","col2":"b"},"row2":{"col1":"c","col2":"d"}}'

If you want the column labels, the row labels, and the values stored as three separate entries, use orient='split':


df.to_json(orient='split')

The output is:


'{"columns":["col1","col2"],"index":["row1","row2"],"data":[["a","b"],["c","d"]]}'
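The same orient value is needed to read the text back in. Here is a minimal sketch of the round trip for the two orientations shown above; the restored dictionary is just a helper for this example.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row1', 'row2'],
                  columns=['col1', 'col2'])

restored = {}
for orient in ("index", "split"):
    # Serialize the DataFrame to a JSON string in the chosen layout.
    text = df.to_json(orient=orient)
    # Pass the same orient to read_json so the layout is interpreted correctly.
    restored[orient] = pd.read_json(StringIO(text), orient=orient)
    print(orient)
    print(restored[orient])
```

If the orient passed to read_json does not match the one used when writing, the rows and columns may come back transposed or the call may fail, so it helps to keep the two in sync.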


Read a JSON file:

pd.read_json('train.json')

This returns a DataFrame built from the contents of train.json.
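Since train.json is not included here, the following sketch writes a small stand-in file and reads it back. The file name sample.json and the records in it are made up purely for illustration.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical records standing in for the contents of a real JSON file.
records = [
    {"id": 1, "label": "x"},
    {"id": 2, "label": "y"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.json"
    path.write_text(json.dumps(records), encoding="utf-8")

    # A JSON array of objects is read as one row per object.
    df = pd.read_json(path)

print(df)
```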



Read a JSON file from a URL:


pd.read_json('https://api.exchangerate-api.com/v4/latest/INR')
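API responses are often nested, mixing a few top-level fields with an inner object, and such a response does not always map cleanly onto a table. The sketch below assumes a response shaped like the hard-coded payload; the field names ("base", "date", "rates") and the numbers are assumptions for illustration, not the actual response of the API above.

```python
import pandas as pd

# Hypothetical payload: field names and values are invented for this example
# and are not taken from a real API call.
payload = {
    "base": "INR",
    "date": "2024-01-01",
    "rates": {"USD": 0.012, "EUR": 0.011, "GBP": 0.0095},
}

# Turn the nested "rates" object into a two-column table,
# one row per currency code.
rates = (
    pd.Series(payload["rates"], name="rate")
    .rename_axis("currency")
    .reset_index()
)
print(rates)
```

Pulling out just the nested object keeps the table tidy; the remaining top-level fields can be stored separately if they are needed.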


Conclusion:


In this article, we have seen what JSON is, how a JSON file is structured, how to convert a DataFrame to JSON, and how to read JSON files into pandas. JSON is a data format that can be used from almost every programming language.
