Below is an organized workflow for a sepsis data analysis project using the Kaggle API, Python, PostgreSQL, and Tableau:
1. Create a Kaggle Account & API Key
Go to Kaggle’s website and log in or sign up.
Visit your Account page (click on your profile picture > Account).
Scroll to the API section and click on Create New API Token.
This will download a file named kaggle.json containing your Kaggle username and API key.
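The Kaggle client looks for this token at ~/.kaggle/kaggle.json by default, and on Unix-like systems it warns if other users can read the file. A minimal sketch of putting the token in place follows; the helper name install_kaggle_token is hypothetical, and the location of the downloaded file is an assumption you should adjust.

```python
import shutil
import stat
from pathlib import Path


def install_kaggle_token(downloaded, config_dir=None):
    """Copy a downloaded kaggle.json into the Kaggle config directory.

    `install_kaggle_token` is a hypothetical helper, not part of the
    kaggle package. By default it targets ~/.kaggle, the client's
    default lookup location.
    """
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)

    target = config_dir / "kaggle.json"
    shutil.copyfile(downloaded, target)

    # Owner read/write only (the equivalent of chmod 600).
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return target
```

For example, if the browser saved the token to your Downloads folder (an assumption), you could call install_kaggle_token(Path.home() / "Downloads" / "kaggle.json").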
a. Install the Kaggle Python Library
Install the Kaggle API client via pip: pip install kaggle
b. Data Acquisition: Extracting Sepsis Data Using APIs
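As a rough sketch of this step, the function below downloads and unzips a dataset with the Kaggle Python client. The function name download_dataset, the data folder, and the dataset slug in the usage note are placeholders, since the post does not name the sepsis dataset it uses; substitute the owner/dataset-name part of the dataset's Kaggle URL.

```python
from pathlib import Path


def download_dataset(slug, dest="data"):
    """Download and unzip a Kaggle dataset into `dest`.

    `slug` has the form "owner/dataset-name". This helper and its
    default folder are illustrative assumptions.
    """
    # Imported here because the kaggle package may read the API token
    # as soon as it is imported, which fails if kaggle.json is missing.
    from kaggle.api.kaggle_api_extended import KaggleApi

    out_dir = Path(dest)
    out_dir.mkdir(parents=True, exist_ok=True)

    api = KaggleApi()
    api.authenticate()  # reads ~/.kaggle/kaggle.json
    api.dataset_download_files(slug, path=str(out_dir), unzip=True)
    return out_dir
```

A call such as download_dataset("owner/sepsis-dataset") (placeholder slug) would leave the extracted CSV files in the data folder, ready for the preprocessing steps.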
2. Data Preprocessing with Python
Key Steps:
a. Import Necessary Libraries:
b. Load Data:
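Steps a and b might look like the sketch below, assuming pandas is the main library. The column names and the two inline rows are made-up placeholders so the snippet runs on its own; in practice you would pass the path of the CSV downloaded from Kaggle.

```python
import io

import pandas as pd


def load_sepsis_csv(source):
    """Read a CSV (file path or file-like object) into a DataFrame."""
    return pd.read_csv(source)


# Placeholder data standing in for the real file; the columns are
# hypothetical and not taken from any actual sepsis dataset.
sample_csv = io.StringIO(
    "patient_id,age,heart_rate,sepsis_label\n"
    "1,63,88,0\n"
    "2,71,,1\n"
)

df = load_sepsis_csv(sample_csv)
print(df.shape)
print(df.head())
```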
c. Explore Data:
Check for missing values, duplicates, and column data types:
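One possible way to run these three checks with pandas is sketched here. The summarize helper and the toy DataFrame are illustrative assumptions, not the original post's code.

```python
import pandas as pd


def summarize(df):
    """Collect basic data-quality facts about a DataFrame."""
    return {
        # Number of missing values in each column
        "missing_per_column": df.isna().sum().to_dict(),
        # Rows that exactly repeat an earlier row
        "duplicate_rows": int(df.duplicated().sum()),
        # Column name -> dtype name
        "dtypes": df.dtypes.astype(str).to_dict(),
    }


# Toy data with hypothetical column names
df = pd.DataFrame(
    {
        "patient_id": [1, 2, 2, 3],
        "heart_rate": [88.0, 95.0, 95.0, None],
        "gender": ["F", "M", "M", "F"],
    }
)

report = summarize(df)
for name, value in report.items():
    print(name, value)
```

For a quick interactive look, df.info() and df.describe() give a similar overview in one call each.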
d. Clean Data:
Handle missing or inconsistent values.
Convert data types if necessary.
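The cleaning rules depend on the dataset, so the following is only a sketch under stated assumptions: hypothetical column names, text values normalized by trimming and upper-casing, unparseable numbers treated as missing, and missing heart rates filled with the column median. Median imputation is an illustrative choice here, not necessarily the method used in this project.

```python
import pandas as pd


def clean(df):
    """Return a cleaned copy of `df` (column names are hypothetical)."""
    out = df.copy()

    # Make inconsistent text values comparable: " m" and "m " -> "M"
    out["gender"] = out["gender"].str.strip().str.upper()

    # Convert text to numbers; values such as "n/a" become NaN
    out["heart_rate"] = pd.to_numeric(out["heart_rate"], errors="coerce")

    # Drop exact duplicates after normalization so near-identical rows match
    out = out.drop_duplicates().reset_index(drop=True)

    # Fill remaining gaps with the median (an illustrative choice)
    out["heart_rate"] = out["heart_rate"].fillna(out["heart_rate"].median())

    # Store the label as an integer rather than a float
    out["sepsis_label"] = out["sepsis_label"].astype(int)
    return out


raw = pd.DataFrame(
    {
        "patient_id": [1, 2, 2, 3],
        "gender": ["F", " m", "m ", "F"],
        "heart_rate": ["88", "95", "95", "n/a"],
        "sepsis_label": [0.0, 1.0, 1.0, 0.0],
    }
)

cleaned = clean(raw)
print(cleaned)
```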
e. Save Preprocessed Data:
Save the cleaned data for PostgreSQL ingestion:
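A minimal sketch of the save step, assuming a CSV file is the hand-off format for PostgreSQL. The helper name and the output path are placeholders.

```python
from pathlib import Path

import pandas as pd


def save_for_postgres(df, path):
    """Write `df` to CSV without the pandas index column."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # index=False keeps the row index out of the file, so the CSV
    # columns line up one-to-one with the target table's columns.
    df.to_csv(path, index=False)
    return path


# Toy data with hypothetical column names
cleaned = pd.DataFrame(
    {"patient_id": [1, 2], "heart_rate": [88.0, 95.0], "sepsis_label": [0, 1]}
)

saved = save_for_postgres(cleaned, "output/sepsis_clean.csv")
print(saved)
```

Missing values are written as empty fields, which PostgreSQL's COPY in CSV format reads as NULL by default.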
This post covers the Kaggle API and Python steps. Part 2 (https://www.numpyninja.com/post/complete-da-project-involving-api-python-postgresql-and-tableau-part-2) covers the PostgreSQL and Tableau steps.