top of page
Jayanthi Paramasivam

Understanding Window Functions(Analytic Functions) In SQL: "A Beginners Guide"


Introduction:

Window functions performs an aggregate operations on groups of rows. However, an aggregate operation groups query rows into a single result row, but window function produce a result for each query row .

  • OVER() clause

     The OVER() clause constructs a window. When it's empty, the window will include all records.

    Example

We will demonstrate the ‘employees' table from the sample database.


Employees Table

Syntax:

function_name(expression)OVER()


The following SQL command is to find the average of salary across all the rows.

SELECT emp_no, department, salary, AVG(salary) OVER() FROM employees

This will produce the following result.


  • PARTITION BY

            Inside of the OVER(), Use PARTITION BY to form rows into group of row.

        Syntax:

             function_name(expression) OVER( PARTITION BY column name)


The following SQL command is to find the average of salary across all the rows partitioned by department.

SELECT emp_no, department, salary, AVG(salary) OVER(PARTITION BY department) AS dept_avg FROM employees;

This will produce the following result.



Average Salary Partitioned By Department
  • ORDER BY

           Use ORDER BY inside of the OVER() CLAUSE to re-order rows within each window.

        Syntax:

            function_name(expression) OVER( PARTION BY column name ORDER BY column name)


The following SQL command will sum up the salary on rolling basis partitioned by department.

SELECT emp_no, department, salary,       
SUM(salary) OVER(PARTITION BY department ORDER BY salary DESC) AS   rolling_dept_salary,
SUM(salary) OVER(PARTITION BY department) AS total_dept_salary       
FROM employees;

This will produce the following result.


  • RANK()

    Returns the rank of the current row within its partition, with gaps in the ranking for ties. If two values are the same they will receive the same rank, and the next rank will skip a number.

 Syntax:

            function_name(expression) RANK() OVER( ORDER BY column name)


The following SQL command , All the rows order them by salary and partitioned by department and rank by salary.

SELECT emp_no,department,salary,
RANK() OVER(PARTITION BY department ORDER BY salary DESC) as Overall_Salary_Rank
FROM employees;

This will produce the following result.




  • ROW_NUMBER()

    Returns the number of the current row within its partition. Rows numbers range from 1 to the number of partition rows.

     Syntax:

                function_name(expression) ROW_NUMBER()  OVER(PARTITION BY column name ORDER BY column name)


The following SQL command will give row_number for each department order by salary with Rank.

SELECT emp_no,department,salary,
ROW_NUMBER() OVER(PARTITION BY department ORDER BY salary DESC) AS  dept_row_number,
RANK() OVER(PARTITION BY department ORDER BY salary DESC) AS dept_salary_rank FROM employees
ORDER BY department;

This will produce the following result.



  • DENSE_RANK()

    Returns the rank of the current row within its partition, without gaps. Peers are considered ties and receive the same rank. This function assigns consecutive ranks to peer groups; the result is that groups of size greater one do not produce noncontigous rank numbers.

    SYNTAX

     function_name(expression) DENSE_RANK()  OVER( ORDER BY column name)

SELECT emp_no,department,salary,
ROW_NUMBER() OVER(PARTITION BY department ORDER BY salary DESC) AS dept_row_number,
RANK() OVER(PARTITION BY department ORDER BY salary DESC) as dept_salary_rank,
RANK() OVER(ORDER BY salary DESC) AS overall_rank,
DENSE_RANK() OVER(ORDER BY salary DESC) AS  overall_dense_rank  
FROM employees ORDER BY  overall_rank;

This will produce the following result.


  • NTILE()

    Divides a partition into N groups(buckets),assigns each row in the portion its bucket number, and returns the bucket number of the current row within its partition. For example, if N is 4 , NTITLE() divides rows into four buckets. If N is 100, NTITLE() divides rows into 100 buckets.


    The following SQL command divides employees into 4 salary buckets within each department.

SELECT emp_no, department,salary,
NTILE(4) OVER(PARTITION BY department ORDER BY salary DESC) AS dept_salary_quartile FROM employees;

This will produce the following result.


  • FIRST_VALUE()

    Returns the values of expression from the first row of the window frame.


    The following SQL command returns first value of employee number of highest paid salary partitioned by department.

SELECT emp_no, department, salary,
  FIRST_VALUE(emp_no) OVER(PARTITION BY department ORDER BY salary DESC)     as highest_paid_dept,
FIRST_VALUE(emp_no) OVER(ORDER BY salary DESC) as highest_paid_overall FROM employees;

This will produce the following result.


  • LAG()

    It provides access to a row at a specified physical offset before the current row in the result set.


    The following SQL command will return the salary from the previous row for each employee.


SELECT emp_no, department, salary,
LAG(salary,1) OVER(PARTITION BY department ORDER BY salary DESC) as previous_salary 
FROM employees;

This will produce the following result.


  • LEAD()

    Similar to LAG(),but it accesses the next row's data.

    The following SQL command will return the salary from the previous row for each employee.


SELECT emp_no, department, salary, 
LEAD(salary,1) OVER(PARTITION BY department ORDER BY salary DESC) as previous_salary 
FROM employees;

This will produce the following result.


CONCLUSION

The Windows function in SQL is very powerful in the database. In this blog, we have seen several examples of Window function and aggregate functions with window functions in SQL. Hope you will now be clear about what an window function is in SQL.


Thanks for reading!

45 views

Recent Posts

See All
bottom of page