Shweta Rathkanthiwar

Four Ways to Import a CSV File into PostgreSQL: A Step-by-Step Guide

Recently we needed to import a CSV file into PostgreSQL. After browsing through many sites, we came up with four different ways to do it: the first two use Python, the third uses the SQL COPY statement, and the fourth uses pgAdmin. Let's discuss them in detail.


Understanding CSV Files

CSV (Comma Separated Values) files are great for moving table-like data between different programs because they store data in a simple, straightforward format. Each line in a CSV file represents a row of data, with each value separated by a comma. This makes them easy to read and use. Now, let’s look at how we can use this simplicity to work with PostgreSQL.
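For example, a small hypothetical persons.csv (the same shape of file we import in Methods 3 and 4 below) might look like this:

first_name,last_name,dob,email
John,Doe,1990-01-15,john.doe@example.com
Jane,Smith,1985-07-30,jane.smith@example.com

The first line is a header row naming the columns, and each following line is one record.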


Preparation Steps Before Importing CSV into PostgreSQL

Before we get into how to import a CSV file into PostgreSQL, it's important to get everything ready. Preparing your data up front can prevent a lot of headaches later. Here are some steps to follow:

1. Make sure your CSV file is well-structured and clean. This means there shouldn't be any missing values in your rows or columns. If there are, consider using a tool like Excel or Python's pandas library to fill them in with appropriate default values (see the sketch after this list).

 

2. Ensure that all data types match those in the target PostgreSQL table. For instance, if a column in PostgreSQL is of type INT but your CSV file has strings in that column for some rows, the import will fail.

 

3. Remember to check that the target PostgreSQL database and table exist and have the correct permissions set up. You'll need write (INSERT) privileges on the table into which you want to import data.

 

4. Another common pitfall involves dealing with large datasets. If your CSV file is too large for your machine’s memory capacity, you’ll need to split it into smaller chunks before proceeding with the import process.

 

5. Don’t forget about encoding issues! Ensure that both your CSV file and PostgreSQL database use the same character encoding (like UTF-8) to avoid unwelcome surprises during import.
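As a minimal sketch of steps 1, 2 and 4 using pandas (the file name data.csv, the column names, the default values, and the chunk size are placeholder assumptions, not part of the original post):

import pandas as pd

# Step 1: load the CSV and fill missing values with sensible defaults
df = pd.read_csv('data.csv')
df = df.fillna({'first_name': 'unknown', 'email': ''})

# Step 2: check that the column types line up with the target table
print(df.dtypes)

# Step 4: for files too large for memory, read them in smaller chunks
total_rows = 0
for chunk in pd.read_csv('data.csv', chunksize=100_000):
    total_rows += len(chunk)  # each chunk is a DataFrame you can clean and import separately
print(total_rows)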

 

By following these preparatory steps before diving into the import, you'll save yourself a lot of trouble and everything will go off without a hitch. Now let's move on to the actual importing process!


Method 1: Using the Python psycopg2 package

Python is a popular programming language that is widely used for data analysis and manipulation. You can use Python to import CSV files to PostgreSQL tables. Here are the steps to import a CSV file to a PostgreSQL table using Python:

1. Create a Jupyter notebook and install the psycopg2 package, which is a PostgreSQL adapter for Python.

pip install psycopg2


2. Use the following Python code to import the CSV file to the PostgreSQL table:

import psycopg2
import csv

# Connect to the PostgreSQL database (adjust the connection details)
conn = psycopg2.connect(
    host="localhost",
    database="mydatabase",
    user="myusername",
    password="mypassword"
)
cur = conn.cursor()

# Read the CSV and insert each row into the target table
with open('/path/to/csv/file.csv', 'r') as f:
    reader = csv.reader(f)
    next(reader)  # Skip the header row
    for row in reader:
        cur.execute(
            "INSERT INTO table_name (column1, column2, column3, ...) VALUES (%s, %s, %s, ...)",
            row
        )

conn.commit()
cur.close()
conn.close()
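Inserting row by row can be slow for large files. psycopg2 also exposes PostgreSQL's COPY protocol through cursor.copy_expert; here is a minimal sketch, reusing the placeholder connection details, file path, and table/column names from the example above:

import psycopg2

conn = psycopg2.connect(
    host="localhost",
    database="mydatabase",
    user="myusername",
    password="mypassword"
)
cur = conn.cursor()

# Stream the whole file to the server in a single COPY command;
# CSV HEADER tells PostgreSQL to skip the first line.
with open('/path/to/csv/file.csv', 'r') as f:
    cur.copy_expert(
        "COPY table_name (column1, column2, column3) FROM STDIN WITH CSV HEADER",
        f
    )

conn.commit()
cur.close()
conn.close()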

 

Method 2: Using the Python SQLAlchemy library

SQLAlchemy is a Python SQL toolkit that lets developers access and manage SQL databases using a Pythonic domain language. You can write a query as a plain string or chain Python objects to build the same query. Here are the steps to import a CSV file using SQLAlchemy (together with pandas).


1. Create a Jupyter notebook and install the SQLAlchemy toolkit (pandas is also required for this example):

!pip install sqlalchemy

import pandas as pd
from sqlalchemy import create_engine
from urllib.parse import quote_plus


2. Python code to import the CSV file into the PostgreSQL table:

# Read data from a CSV file (the path below is a placeholder)
df_observation = pd.read_csv(r'\path\ Project\File.csv')

# Database connection details
username = ''
password = ''
host = ''
port = ''
dbname = ''
table_name = 'tablename'  # target table name (placeholder)

# URL-encode the password so special characters are handled safely
password_encoded = quote_plus(password)

# Build the database connection string
connection_string = f'postgresql://{username}:{password_encoded}@{host}:{port}/{dbname}'

# Create a SQLAlchemy engine
engine = create_engine(connection_string)

# Write the DataFrame to SQL (replace the table if it already exists)
df_observation.to_sql(table_name, engine, if_exists='replace', index=False)

# Query the table back to verify
result = pd.read_sql_query(f"SELECT * FROM {table_name}", engine)
print(result)
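For very large DataFrames, pandas' to_sql also accepts chunksize and method arguments so rows are written in batches rather than in a single statement. A minimal sketch, where the batch size of 10,000 is just an illustrative assumption:

df_observation.to_sql(table_name, engine, if_exists='replace', index=False,
                      chunksize=10_000, method='multi')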

 

Method 3: Import a CSV file into a table using the COPY statement

To import a CSV file with the COPY statement, here are the steps:


1. Create the table structure with data types:

CREATE TABLE persons (
  id SERIAL,
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  dob DATE,
  email VARCHAR(255),
  PRIMARY KEY (id)
);


2. Now use the COPY command to load the CSV file:

COPY persons(first_name, last_name, dob, email)
FROM 'C:\sampledb\persons.csv'
DELIMITER ','
CSV HEADER;
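Note that COPY ... FROM reads the file on the database server itself, so the file must be accessible there and the connecting role needs the appropriate server-side file-read privileges. If the CSV file lives on your client machine instead, psql's \copy meta-command accepts the same options but reads the file locally and streams it to the server.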


Method 4: Import a CSV file into a table using pgAdmin

If you need to import a CSV file from your computer into a table on the PostgreSQL database server, you can use pgAdmin. Here are the steps to do it:


1. Create the table structure with data types:

CREATE TABLE persons (
  id SERIAL,
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  dob DATE,
  email VARCHAR(255),
  PRIMARY KEY (id)
);


2. Once the table is created, right-click the persons table and select the Import/Export… menu item.

3. In the "Import/Export" dialog box, select "Import".

4. Select the CSV file you want to import and specify the table name and delimiter.

5. Click on the "Import" button to import the CSV file to the table.
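Once the import finishes, you can verify the data by opening pgAdmin's Query Tool and running a quick SELECT * FROM persons; against the table.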

 

In conclusion, importing CSV files into PostgreSQL is a common task for developers and data analysts. There are several ways to do it, including Python scripts with psycopg2 or SQLAlchemy, the SQL COPY statement, and pgAdmin's graphical Import/Export tool.
