SQL Window Functions and Analytic Functions: Your FAQ Guide

1.What are window functions in SQL and how do they differ from regular aggregate functions?
Window functions compute across a set of table rows that are related to the current row. In contrast to aggregate functions which return a single value for a group of rows, window functions allow you to preserve the individual row details while performing calculations like running totals or ranking. A window function operates within a defined "window" of rows.
Example Query:
SELECT employee_id, salary,
SUM(salary) OVER (ORDER BY salary) AS running_total
FROM employees;
This statement calculates a running total of salaries in ascending order not collapsing the rows like an aggregate function such as SUM() would do and group all the rows.

2.What does the OVER clause of a window function do?
The OVER clause decides which window of rows on which a window function actually operates. The OVER clause can contain an ORDER BY that defines how rows will be sorted and a PARTITION BY that partitions rows into groups. Without an OVER clause, applying any window function to a particular set of rows is meaningless because all the row-specific data is lost.
Query to Show this Concept:
SELECT employee_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank FROM employees;
Here, RANK() ranks all employees within departments according to salary. Again, with the use of the PARTITION BY clause, we are able to separate each ranking per department.

3.What is ROW_NUMBER()? How do you apply this function in SQL, and what are its applications?
ROW_NUMBER() returns a unique number for each row in the result set, starting with 1. It can be applied to row number results to find duplicate records or provide ranks for rows in a sorted result set.
Example Query
SELECT employee_id, salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num FROM employees;
This assigns a sequential number to every employee based on their salary in descending order. It can be helpful when you require a unique identifier for every row or if you limit the results to a particular row number.

4.What is the RANK() function and how is it different from DENSE_RANK() in SQL?
Both RANK() and DENSE_RANK() provide rankings to rows based on an ordering specified. However, in case of ties, RANK() leaves gaps in the ranking, while DENSE_RANK() does not leave gaps and provides the same rank to tied rows.
Sample Query:
SELECT employee_id, salary,
RANK() OVER (ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
If two employees have the same salary, RANK() will assign them the same rank but leave a gap in the next rank, while DENSE_RANK() will assign the next rank without gaps.

5.What are the LEAD() and LAG() functions in SQL and how do they work?
The LEAD() and LAG() functions enable you to access data from a subsequent or preceding row, respectively without having to use a self-join. These functions are helpful for comparing values between rows such as calculating differences between consecutive rows.
Example Query:
SELECT employee_id, salary,
LEAD(salary) OVER (ORDER BY salary) AS next_salary
FROM employees;
This query returns the salary of the employee and the salary of the next employee in the ordered list. LAG() would return the previous row's salary.

Detailed guide to SQL window and analytic functions with examples of ROW_NUMBER, PARTITION BY and OVER clause

6.What is the difference between NTILE() and PERCENT_RANK()?
NTILE() is to divide the result set into any number of equal-sized buckets and assigns a bucket number to each row. PERCENT_RANK() calculates the relative rank of a row within a result set expressed as a percentage ranging from 0 to 1.
Example Query (NTILE())
SELECT employee_id, salary,
NTILE(4) OVER (ORDER BY salary) AS quartile
FROM employees;
This query divides the employees into 4 equal quartiles based on their salary.
Example Query (PERCENT_RANK()):
SELECT employee_id, salary,
PERCENT_RANK() OVER (ORDER BY salary) AS percent_rank
FROM employees;
It returns the percentile rank for each employee's salary.

7.How do you use SUM() and AVG() as window functions to calculate running totals and moving averages?
You can use SUM() and AVG() with the OVER clause to calculate running totals or moving averages over a window of rows. The operation is performed on an order of rows and set by PARTITION BY and ORDER BY.
Query:
SELECT employee_id, salary,
SUM(salary) OVER (ORDER BY hire_date) AS running_total,
AVG(salary) OVER (ORDER BY hire_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg FROM employees;
In this query, SUM() gives a running sum of salaries and AVG() gives a moving average of the last 3 salaries (including the current row).

8.What is a window frame in SQL and how might you define it?
A window frame identifies the set of rows over which the window function operates and in terms of the language, may be defined either by a ROWS or RANGE clause within the OVER clause. Thus, the window frame determines the extent of the computation for the window function-by including either just the current row or a given range of rows.
Example Query:
SELECT employee_id, salary,
SUM(salary) OVER (ORDER BY hire_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total FROM employees;
In this query, the window frame includes all rows from the beginning of the result set up to the current row.

9.what are real-life use cases for window functions in SQL?
Window functions are most used in ranking, running sums, moving averages, and similar comparisons of values between the rows. They find practical applications in financial computations and sales analysis and also can be used in reporting systems.
Example Query (Ranking: Real-life usage for ranking)
SELECT employee_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS department_rank FROM employees;.

10.How do you optimize queries with window functions in SQL?
Optimization of queries with window functions goes about indexing, reducing rows processed and ensuring that window functions are applied where necessary. You may also use pre-aggregated data or subqueries to avoid using unnecessary window functions.
Query Example (Indexing for Optimization):
CREATE INDEX idx_employee_salary ON employees (salary);
Indexing on frequently used columns such as salary can improve query performance of queries that use window functions, sorting or partitioning on this column.

Previous Topic==> Indexing SQL FAQ. || Next Topic==> Transaction and ACID properties FAQ

Top SQL Interview Questions Employee Salary Management FAQ!. Top 25 PL/SQL Interview Questions
Topics for Account Management Case Study
CASE Study SQL (Account Management)
Joins With Group by Having Equi Join Joins with Subqueries Self Join Outer Join