## INTRODUCTION TO WINDOWS FUNCTIONS

**What are Window Functions?**

Window functions are powerful feature in SQL that allows you to perform calculations across a number of rows similar to aggregate functions. But other than aggregate functions, which you might use with a GROUP BY, they don’t return a single value for a group of rows, but a value for each row in the set.

**Syntax for Window Function:**

*<Window_function> (<expression>) Over ( [PARTITION BY partition_expression][ORDER BY order_expression [frame_specification])*

- <Window_function> is the specific function you want to apply. (Example: avg(), sum(), Rank(), Row_number()).

- <expression> is the column you want to operate on.

- PARTITION BY divides the each set into partitions, allowing window function to apply separately for each partition.

- ORDER BY specifies the order of rows within the partition.

- Frame_specification defines the range of rows used in the calculation.

**Window Frame:**

The window frame determines which rows are considered in the window function’s calculation.

It can be

- ROWS BETWEEN <start> And <end> OR

- RANGE BETWEEN <start> And <end>.

**Common options for <start> and <end> are:**

- UNBOUNDED PRECEDING : All rows before the current row.

- CURRENT ROW : Only the current row.

- n PRECEDING or n FOLLOWING : ‘n’ rows before or after the current row.

**Advantages of Window Function:**

One advantage of window functions is that they allow you to work with aggregate and non-aggregate values combined because the rows are not collapsed together. This opens up a way to calculate many things in one step that would otherwise require multiple.

Window functions are also relatively simple to use and read compared to multiple subqueries and self-joins. In this way, they can also help with performance issues. You can use a window function instead of self-join or subqueries.

Before we proceed we are going to use the below data for all the examples.

Download the csv file *Employee_detail* and create a table Employee_detail

and import the given csv file into the table. And let’s try all the below window functions.

**Some of the Aggregate functions(avg(),sum(),count(),min(),max() can be a Window Function by using Over() clause with it.**

**Example:** *Find the average salary,Total Salary,Number of Employees,Minimum Salary and Maximum Salary in each department.*

**With Aggregate Function: **We need GROUP BY clause to get the department wise average. Also we cannot display the non-aggregate columns (employee name, employee_id)

**Result:**

In the above result, We get only 5 rows which is aggregated based on department. But in window function we will get all the rows including aggregate values. Let see the same function with over() clause.

**With Window Function: **Along with aggregate column department n average(salary),Count(employee_id),.. we can display non-aggregate columns (employee name, employee_id) in window function.

Will do the same aggregation as like above example but with over() clause.

In the above code, used Over(PARTITION BY department). It groups the department and calculate the average,sum,max,min based on department.

**Result: ***Highlight with different colors for each department.*

With the above example, you may come to know the difference between aggregate function and window function. Let see more about some important Window functions in PostgreSql.

**List of Window Functions:**

Here are some of the window functions in PostgreSql:

**Ranking Functions:**

1. ROW_NUMBER(): It assigns a sequential integer to each row in a result set.

2. RANK(): It returns the rank of each row within a partition. Same value within the partition have same Rank but next value assign Rank with gaps.

3. DENSE_RANK(): It returns the rank of each row within a partition. Same value within the partition have same Rank but next value assign Rank without gaps.

**Distribution Functions:**

PERCENT_RANK(): It returns the relative ranking of a value within a given set of vaues.

CUME_DIST(): It returns the cumulative distribution of value within a group of values.

**Analytic Functions:**

FIRST_VALUE(): It returns the first row in a sorted partition of a result set.

LAST_VALUE():It returns the last row in a sorted partition of a result set.

NTH_VALUE: It returns the nth row in a sorted partition of a result set.

LAG(): It can access the data of a previous row from current row.

LEAD():It can access the data of a next row from current row.

NTILE(): It used to divide rows within a partition as equally as possible into

*n*groups, and assign each row its group number.

**Ranking Functions (ROW_NUMBER(), RANK(), DENSE_RANK) with example**

*To get the Row_Number, Rank, Dense_Rank of all Employees based on their salary in each department.*

The Window functions will apply to each partition and returns the result based on the result set. Each color rectangle below indicates each department.

**Result:**

#### Distribution Function CUME_DIST() and PERCENT_RANK() with Example

*To get the Percent_Rank and *the cumulative distribution *of all Employees based on their salary in each department.*

**Result:**

**Analytic Functions (FIRST_VALUE(), LAST_VALUE(), NTH_VALUE() ) with example**

*To get the Lowest salary, Highest Salary and 2nd Highest Salary in each department.* For that we have to use the RANGE BETWEEN Window Frame to get the department wise result.

**Result:**

**Analytic Functions LEAD() with example.**

Lead() function used to display the employee who is having higher salary/Equal salary than the current employee in each department.

**Result:**

**Analytic Function LAG() with example**

Lag() function used to display the employee who is having lower salary/Equal salary than the current employee in each department.

**Result:**

**Analytic Function NTILE() with example**

To split the employees into 2 equalent buckets based on their salary by NTILE().

**Result:**

Here you could see the cumulative distribution of employee based on their salary and also the distribution bucket based on their salary.

This was an overview of all window functions used in PostgreSQL.

**Conclusion:**

Window functions are a powerful tool in SQL that allows you to perform intricate analyses with ease and efficiency. From basic concepts to advanced techniques, this guide has equipped you with the knowledge to utilize window functions effectively.

Thank you for reading. Happy learning!!! If this has been of immense value to you, do give me a clap.

## Commentaires