One who wants to master SQL (structured Query Language) must understand the key pillars which form the core.
1. Key elements of a database
2. Relational Keys
3. Select Queries
4. Joins
5. Data Modelling
6. Fact and Dimension Table Types
7. Star Schema
8. Data Definition Language (DDL) basics
9. Data Manipulation Language (DML) basic
1. Key elements of a database:
A database usually takes the key areas of a business and breaks them into structured tables. Each table will store information on a key area. A discrete example of retail database can be
Staff table, Customer table, Product table, Orders table, etc
2. Relational keys:
With each part of the business having its own table, we need to know how a table is structured and how they relate to each other. Primary and Foreign Keys help relate tables.
What is a primary Key?
Tables within a database are structured for a purpose so that each table removes the need for duplicate values. For example, a product table should not have duplicates of the same product.
A primary key is assigned to each unique row within a table. This creates an efficient structure when you need to find related information.
What is a foreign key?
A foreign key is a common field shared by two or more tables holding data from different business areas. A foreign key is usually the primary key in one table and a reference key in another table.
Order ID(Primary Key) | Order Date | Product ID(Foreign Key) | Customer ID(Foreign Key) |
1 | 1/2/1996 | 2 | 142 |
2 | 6/3/2001 | 1 | 362 |
3 | 7/9/2007 | 14 | 283 |
3. Select queries
Understanding the structure of a table allows us to write queries that access the information we need form each table. For a select statement we must assign which fields we want to retrieve, and then the table the fields come from. We can then add filters for the data allowing us to get answers to specific queries quickly.
SQL Query example:
SELECT Order_Date, Order_ID, Product_ID FROM Orders Where Order_Date = “1 Dec 2017”
4. Joins:
Now we know how each table is structured (Primary Keys), how tables are linked (Foreign Keys) and how to write select queries. We can now look at the ways in which we can retrieve data from multiple tables simultaneously.
Why use a join?
A join is how we connect two or more tables together. This allows us to summarize data that would normally be stored in different tables into one result set. This is useful when analyzing data for report creation.
What are the different types of joins?
5. Data modelling:
Now we know how to join tables in a query, we can look at the relationship between tables.
What are the different types of table relationships?
There are three types of relational models between tables: one-to-one, one-to-many, and many-to-many.
· A one-to-one relationship means a single ID in one table would reference a single ID in another table.
· One-to-many relationship requires one table with unique values to be related to a table that has multiple values for each of the original values.
· Many-to-many relationship means that are no unique IDs used in either table.
6. Table types:
When creating a database, we use two types of tables to split up our data: Fact tables and dimension tables.
What is a Fact table?
A Fact table (also known as a measure table) as the name suggests stores a record of all the facts related to a subject. In our example the Fact table is the Orders table, here we have all the facts needed in a transaction.
We have the who (Customer ID), the what (Product ID) and the when (Order Date). This allows each line of the order table to summarize everything about a transaction.
What is a Dimension table?
Dimension tables give structure to our Fact tables. If we look at our orders table, the Product ID and Customer ID do not mean much on their own.
Using Dimension tables to store the details of each area of the transaction allows the database to save space and speed up processing time.
Why do we use fact and dimension tables?
We use Fact and Dimension tables to improve the performance of our database and allow for easy maintenance.
If we were to store all the text information from our dimension table in the fact table, it would look nice to the user. However, due to the duplication of text fields, the server would really struggle to handle large datasets or process queries.
7. What is the star schema?
With an understanding of the key structural aspects of a database, we can look are the various types of database structures and their uses. The star schema is easiest to understand.
As we can see the at the centre of our star schema is our Fact table Orders. Each of our Dimension tables are linked to the Fact table, created a star looking schema.
This is the most common schema for databases that are used for reports purposes, as this schema is easier for a user to navigate.
8. What is data definition language (DDL)?
Data Definition Language (DDL) is a vocabulary used in Transact-SQL (T-SQL) to create, edit and delete tables. Most frequently used DDL’s: CREATE, DROP, TRUNCATE and ALTER
CREATE: We use this each time we create a table, for example:
CREATE TABLE Product (
Product ID int,
Product Name Varchar(255),
Product Type Varchar (255),
Product Colour Varchar (255),
);
DROP: We use this to delete a table, including its structure and data. For example:
DROP TABLE Product;
TRUNCATE: We use truncate to remove the data stored in a table but retain the table structure, for example:
TRUNCATE TABLE Product;
ALTER: We can use this to change the structure of a table. For example, we can remove a field:
ALTER TABLE Product
DROP COLUMN Product Colour
This would remove the column from the table and all data stored within it. We can also add additional columns to an existing table
ALTER TABLE Product
ADD Product Description Varchar (255)
9. What is data manipulation language (DML)?
Data Manipulation Language (DML) is a vocabulary used to retrieve and work with data stored within the database. Most frequently used DDL’s: SELECT, DELETE, UPDATE and INSERT
SELECT: We have already used this DML to specify which records we want to retrieve for reviewing
SELECT Order Date, Order ID, Product ID FROM Orders Where Order Date = “1 Dec 2017”
This would retrieve the data in each row stored in the fields Order Date, Order ID and Product ID. The FROM and WHERE clauses are essential to DML queries to specify which table we wish the query to run on, and which range of data we wish to retrieve.
DELETE: This is the opposite to the select statement, where the data in each row we specify is removed from the table e.g.
DELETE FROM Product WHERE Order Date = “1 Dec 2017”
This statement would remove each row from the database that matched the details of the WHERE clause.
UPDATE: We can use this DML to correct or update a value within a table e.g.
UPDATE Product SET Product Name = “Normal Kettle”, Product Colour = “Green” WHERE Product ID = 1
This statement would select rows where the Product ID is 1, and then update the existing values within fields Product Name and Product Colour.
INSERT: We can use the insert DML to add missing values to our tables e.g
INSERT INTO Product (Product ID, Product Name, Product Type, Product Colour) VALUES (13, “Stylish Cheese Grater”, 3, “Blue”)
This statement specifies the fields we wish to insert and the values to be added. You do not have to specify all fields, however any field not selected will be defaulted to a null value.
Комментарии