Introduction:
SQL (Structured Query Language) is a programming language used to manage relational databases. It allows users to perform operations such as adding, modifying, and retrieving data from databases. SQL is used to communicate with relational database management systems (RDBMS) such as MySQL, PostgreSQL, Oracle, Microsoft SQL Server, and SQLite. These RDBMS use SQL to create, modify, and query databases.
Common SQL Commands
Here are some commonly used SQL commands:
1. SELECT: Retrieves data from a database
SELECT column1, column2, ... FROM table_name
This command retrieves data from one or more columns in a table.
2. INSERT INTO: Adds new data to a table in a database
INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...)
This command adds a new row of data to a table.
3. UPDATE: Modifies existing data in a table in a database
UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE some_column = some_value
This command updates existing data in a table.
4. DELETE: Removes data from a table in a database
DELETE FROM table_name WHERE some_column = some_value
This command removes one or more rows of data from a table.
5. CREATE TABLE: Creates a new table in a database
CREATE TABLE table_name ( column1 datatype, column2 datatype, ... )
This command creates a new table with specified columns and data types.
6. ALTER TABLE: Modifies the structure of an existing table in a database
ALTER TABLE table_name ADD column_name datatype
This command adds a new column to an existing table.
7. DROP TABLE: Deletes an existing table in a database
DROP TABLE table_name
This command deletes an existing table and all its data.
Additional SQL commands for manipulating and querying data in databases:
1. WHERE: Filters data based on a specified condition
SELECT column1, column2, ... FROM table_name WHERE some_column = some_value
This command retrieves data that meets a specific condition.
2. ORDER BY: Sorts data in ascending or descending order based on a specified column
SELECT column1, column2, ... FROM table_name ORDER BY column_name ASC/DESC
This command sorts retrieved data in ascending or descending order based on a specified column.
3. GROUP BY: Groups data based on a specified column
SELECT column1, COUNT(column2) FROM table_name GROUPBY column1
This command groups retrieved data based on a specified column and aggregates data using functions such as COUNT, SUM, AVG, MAX, or MIN.
4. JOIN: Combines data from two or more tables based on a common column
SELECT column1, column2, ... FROM table1 JOIN table2 ON table1.column = table2.column
This command combines data from two or more tables based on a common column.
5. DISTINCT: Removes duplicate values from retrieved data
SELECT DISTINCT column1, column2, ... FROM table_name
This command retrieves distinct values from a specified column or columns.
6. LIMIT: Limits the number of retrieved rows
SELECT column1, column2, ... FROM table_name LIMIT number_of_rows
This command limits the number of rows retrieved from a table.
7. HAVING: Filters data based on an aggregated value
SELECT column1, COUNT(column2) FROM table_name GROUPBY column1 HAVINGCOUNT(column2) > some_value
This command filters data that meets a specific condition after aggregation using GROUP BY.
SQL commands and functions for more complex data manipulation and analysis:
1. CASE: Performs conditional logic in SQL
SELECT column1, column2, CASE WHEN some_column > some_value THEN 'A' ELSE 'B' END AS new_column_name FROM table_name
This command performs conditional logic in SQL and creates a new column based on specified conditions.
2. COALESCE: Returns the first non-null value in a list of values
SELECT COALESCE(column1, column2, column3, ...) AS new_column_name FROM table_name
This command returns the first non-null value in a list of values.
3. SUM: Calculates the sum of a column of numeric values
SELECT SUM(column1) FROM table_name
This command calculates the sum of a column of numeric values.
4. AVG: Calculates the average of a column of numeric values
SELECT AVG(column1) FROM table_name
This command calculates the average of a column of numeric values.
5. MAX: Returns the maximum value in a column
SELECT MAX(column1) FROM table_name
This command returns the maximum value in a column.
6. MIN: Returns the minimum value in a column
SELECT MIN(column1) FROM table_name
This command returns the minimum value in a column.
7. UNION: Combines the results of two or more SELECT statements
SELECT column1, column2, ... FROM table1 UNION SELECT column1, column2, ... FROM table2
This command combines the results of two or more SELECT statements into a single result set.
8. EXTRACT: Extracts a specific part of a date or time value
SELECT EXTRACT(yearFROM column1) FROM table_name
This command extracts a specific part of a date or time value, such as year, month, or day.
9. WINDOW FUNCTIONS: Perform calculations over a window of rows
SELECT column1, column2, SUM(column3) OVER (PARTITIONBY column1 ORDERBY column2) AS running_total FROM table_name
This command performs calculations over a window of rows, allowing for advanced analysis such as calculating running totals, moving averages, and rankings.
10. RANKING FUNCTIONS: Assign a rank to each row based on specified criteria
SELECT column1, column2, RANK() OVER (ORDER BY column3 DESC) AS rank FROM table_name
This command assigns a rank to each row based on specified criteria such as sorting by a particular column.
11. PIVOT: Converts rows into columns based on a specified column
SELECT * FROM (SELECT column1, column2, column3 FROM table_name) PIVOT (SUM(column3) FOR column2 IN ('value1', 'value2', 'value3'))
This command converts rows into columns based on a specified column, allowing for easier analysis of data across different categories.
12. CTEs (Common Table Expressions): Create a temporary table for use in a query
WITH temp_table AS (SELECT column1, column2 FROM table_name) SELECT * FROM temp_table WHERE column1 > some_value
This command creates a temporary table for use in a query, allowing for complex analysis and manipulation of data.
13. GROUPING SETS: Group data by multiple columns or sets of columns
SELECT column1, column2, SUM(column3) FROM table_name GROUP BY GROUPING SETS ((column1), (column2), (column1, column2))
This command groups data by multiple columns or sets of columns, allowing for flexible and customized analysis of data.
14. STRING FUNCTIONS: Manipulate string values
SELECT CONCAT(column1, ' ', column2) AS full_name FROM table_name
This command manipulates string values, allowing for analysis and manipulation of text data.
Subqueries in SQL
Subqueries in SQL are queries that are nested inside another query. They can be used to retrieve data that will be used as a condition in the outer query, or they can be used to perform calculations or data manipulations that will be used in the outer query. Here are some examples of subqueries in SQL:
Subquery as a condition in the WHERE clause
SELECT column1, column2 FROM table_name WHERE column3 IN (SELECT column3 FROM table2 WHERE column4 = 'some_value')
This command retrieves data from table_name where the value of column3 is found in the result set of a subquery that retrieves data from table2.
2. Subquery as a condition in the HAVING clause
SELECT column1, COUNT(*) FROM table_name GROUPBY column1 HAVINGCOUNT(*) > (SELECTAVG(count_column) FROM (SELECT COUNT(*) AS count_column FROM table_name GROUP BY column1) AS subquery)
This command retrieves data from table_name where the count of each value in column1 is greater than the average count of all values in column1, which is calculated using a subquery.
3. Subquery as a derived table
SELECT derived_table.column1, derived_table.column2 FROM (SELECT column1, COUNT(*) AS count_column FROM table_name GROUP BY column1) AS derived_table WHERE derived_table.count_column > 1
This command creates a derived table using a subquery, which is then used in the outer query to retrieve data from table_name where the count of each value in column1 is greater than 1.
4. Subquery as a source of data for INSERT statement
INSERT INTO table_name (column1, column2) SELECT column1, column2 FROM table2 WHERE column3 = 'some_value'
This command inserts data into table_name using a subquery that retrieves data from table2 based on a condition. These are just a few examples of how subqueries can be used in SQL.
Subqueries are a powerful tool for retrieving and manipulating data in SQL and can be used in a variety of ways depending on the specific needs of a query.
Types of Subqueries in SQL
There are two types of subqueries in SQL:
1. Correlated subquery: A correlated subquery is a subquery that depends on the outer query. It uses values from the outer query in order to generate its results. A correlated subquery is executed for each row returned by the outer query.
Here is an example of a correlated subquery:
SELECT column1, column2 FROM table1 WHERE column1 IN (SELECT column1 FROM table2 WHERE table1.column2 = table2.column2)
In this example, the subquery depends on the value of column2 from the outer query (table1.column2 = table2.column2). The subquery is executed for each row in table1.
2. Non-correlated subquery: A non-correlated subquery is a subquery that can be executed independently of the outer query. It does not depend on values from the outer query in order to generate its results.
Here is an example of a non-correlated subquery:
SELECT column1, column2 FROM table1 WHERE column1 IN (SELECT column1 FROM table2).
In this example, the subquery does not depend on any values from the outer query. It can be executed independently of the outer query.
Both correlated and non-correlated subqueries can be used in a variety of ways to retrieve data from multiple tables, perform calculations, or filter data based on specific criteria.
Hope you enjoyed reading this blog and learnt about the commands commonly used in data analysis in SQL.