Analytic functions have been used from the early versions of Oracle. It is used in many tasks, especially reporting. In this article, I am going to use EMP table for my examples. If you don’t have EMP table, you can find the script from the following link.

https://livesql.oracle.com/apex/livesql/file/content_O5AEB2HE08PYEPTGCFLZU9YCV.html

You can also find more about analytic function syntax in the docs.

https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions004.htm#SQLRF06174

Analytic functions are similar to aggregate functions. However, the output of the query does not reduce the rows. In other words, unlike aggregate functions you can see the whole result set when using the analytic functions.

```
SELECT
DEPTNO,
AVG(SAL)
FROM EMP GROUP BY DEPTNO;
```

```
SELECT
EMPNO,
ENAME,
JOB,
SAL,
AVG(SAL) OVER(PARTITION BY DEPTNO) AVG_DEPT_SAL
FROM EMP;
```

The most used sort functions are the followings. So, you can look at the result set below to see the difference between each other. **ROW_NUMBER** function is the conventional way to sort the data. **RANK** function is like much more gradation and as the name suggests, **DENSE_RANK** sorts the data without leave a blank.

```
SELECT
EMPNO,
ENAME,
JOB,
DEPTNO,
SAL,
ROW_NUMBER() OVER(PARTITION BY DEPTNO ORDER BY SAL) RN,
RANK() OVER(PARTITION BY DEPTNO ORDER BY SAL) RNK,
DENSE_RANK() OVER(PARTITION BY DEPTNO ORDER BY SAL) DRNK
FROM EMP ORDER BY DEPTNO;
```

Which function would you use if you want to get the EMPNO of the person with the highest salary? Actually, there are several ways to get this data. Assuming that only one person gets the highest salary, the following queries get the same result. However, I would prefer to use the **KEEP** one.

##### Option 1

```
SELECT
EMPNO
FROM (
SELECT T.*, ROW_NUMBER() OVER(ORDER BY SAL DESC) RN FROM EMP T
) WHERE RN = 1;
```

##### Option 2

```
SELECT
EMPNO
FROM EMP
WHERE SAL = (SELECT MAX(SAL) FROM EMP);
```

##### Option 3

```
SELECT
MAX(EMPNO) KEEP(DENSE_RANK FIRST ORDER BY SAL DESC)
FROM EMP;
```

No matter which function you use **MAX**, **MIN** or **AVG**. In **KEEP** clause, you get the same result as long as you get only one row for the highest salary record.

```
SELECT
MAX(EMPNO) KEEP(DENSE_RANK FIRST ORDER BY SAL DESC) mx,
MIN(EMPNO) KEEP(DENSE_RANK FIRST ORDER BY SAL DESC) mn,
AVG(EMPNO) KEEP(DENSE_RANK FIRST ORDER BY SAL DESC) ag
FROM EMP;
```

*Windowing Clause*

It can only be used when ORDER BY clause is present and after the ORDER BY clause. It has two basic formats with using several combinations like the following.

```
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
RANGE BETWEEN UNBOUNDED PRECEDING AND offset FOLLOWING
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
RANGE BETWEEN CURRENT ROW AND CURRENT ROW
RANGE BETWEEN CURRENT ROW AND offset FOLLOWING
RANGE BETWEEN offset PRECEDING AND UNBOUNDED FOLLOWING
RANGE BETWEEN offset PRECEDING AND CURRENT ROW
RANGE BETWEEN offset PRECEDING AND offset FOLLOWING
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
ROWS BETWEEN UNBOUNDED PRECEDING AND offset FOLLOWING
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
ROWS BETWEEN CURRENT ROW AND CURRENT ROW
ROWS BETWEEN CURRENT ROW AND offset FOLLOWING
ROWS BETWEEN offset PRECEDING AND UNBOUNDED FOLLOWING
ROWS BETWEEN offset PRECEDING AND CURRENT ROW
ROWS BETWEEN offset PRECEDING AND offset FOLLOWING
```

The default value is **RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW**, don’t get confused with ROWS clause. RANGE and ROWS have the different meanings.

```
SELECT
EMPNO,
ENAME,
JOB,
DEPTNO,
SAL,
AVG(SAL) OVER(ORDER BY DEPTNO RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) range_sal, --equivalent to AVG(SAL) OVER(ORDER BY DEPTNO)
AVG(SAL) OVER(ORDER BY DEPTNO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) rows_sal
FROM EMP;
```

As can be seen in the following example, using the WINDOWING clause, the FIRST_VALUE and LAST_VALUE functions can be used instead of the LAG and LEAD functions.

```
SELECT
EMPNO,
ENAME,
JOB,
DEPTNO,
SAL,
LAG(SAL, 1, SAL) OVER(ORDER BY SAL) lag_val,
FIRST_VALUE(SAL) OVER(ORDER BY SAL ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) first_val,
LEAD(SAL, 1, SAL) OVER(ORDER BY SAL) lead_val,
LAST_VALUE(SAL) OVER(ORDER BY SAL ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) last_val
FROM EMP;
```

Lastly, I also would like to mention one of the most useful analytic function that introduced in Oracle 11g Release 2. LISTAGG function. Basically, it is used like a pivot process. It aggregates strings easily. It also order the elements in the concatenated list.

```
SELECT
JOB,
LISTAGG(ENAME, ', ') WITHIN GROUP(ORDER BY ENAME) ENAMES
FROM EMP GROUP BY JOB;
```

I usually use this function when I have to specify all column names instead of using * sign in SELECT clause for a table that too many columns.

```
SELECT
LISTAGG(COLUMN_NAME, ', ') WITHIN GROUP(ORDER BY COLUMN_ID)
FROM USER_TAB_COLS
WHERE TABLE_NAME = 'T_TAB1'
AND COLUMN_NAME != 'ID'; --Column names that not intended to be used in the SELECT section
```

*NTILE Function*

Basically, NTILE function divides an ordered result set into specified number of equal groups or buckets. In the following example, 10 rows result set split 4 groups. The reason why the first buckets have 3 lines is due to the incomplete ratio of the number of lines to the section. In this case, some buckets may have more lines according to the specified sort priority in the function.

```
SELECT
EMPNO,
ENAME,
NTILE(4) OVER(ORDER BY ENAME) BUCKET
FROM EMP;
```

If the number of buckets specified is less than the number of rows in the result set. Then, the number of buckets will be reduced. So there is one row per bucket.

```
SELECT
EMPNO,
ENAME,
NTILE(20) OVER(ORDER BY ENAME) BUCKET
FROM EMP;
```

The query below shows the employees who are in the middle of the salary scale.

```
SELECT * FROM (
SELECT
EMPNO,
DEPTNO,
SAL,
NTILE(3) OVER(ORDER BY SAL) BUCKET
FROM EMP
) WHERE BUCKET = 2;
```

```
SELECT
EMPNO,
ENAME,
SAL,
DEPTNO,
NTILE(3) OVER(PARTITION BY DEPTNO ORDER BY SAL) BUCKET
FROM EMP;
```

*RATIO_TO_REPORT*

As the name suggests, **RATIO_TO_REPORT** function computes the ratio of the column data according to the sum of the grouping data set. If query partition clause is omitted then the function is computed over the whole data set of the column. It doesn’t support the order or windowing clauses. So, the below query shows that the percentage rate of each employee’s salary to the sum of the all salaries.

```
SELECT
EMPNO,
ENAME,
JOB,
SAL,
ROUND(RATIO_TO_REPORT(SAL) OVER(), 2) * 100 SAL_RATIO,
ROUND((SAL / SUM(SAL) OVER()), 2) * 100 SAL_RATIO2 --Equivalent of RATIO_TO_REPORT column
FROM EMP;
```

The following example gives the percentage of each employee’s salary by each job title.

```
SELECT
EMPNO,
ENAME,
JOB,
SAL,
ROUND(RATIO_TO_REPORT(SAL) OVER(PARTITION BY JOB), 2) * 100 SAL_RATIO
FROM EMP;
```

So, there are many analytic functions that used for different purposes. I hope this article will help you.