Want to show your appreciation?
Please a cup of tea.

Sunday, April 24, 2016

Analytical Function in MySQL - ROW_NUMBER, RANK, DENSE_RANK

It was unfortunate that MySQL doesn't support analytical function. The problem got my attention when we were considering MySQL as staging and/or operational database for a Data Warehouse solution. A bit research reveals that it was indeed not very difficult to emulate some of them. Here is the workaround to get the the result of ROW_NUMBER, RANK and DENSE_RANK in MySQL.

Let's take an example to emulate the standard SQL99 analytical query below:

SELECT 
  ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS row_number,
  RANK() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS rank,
  DENSE_RANK() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS dense_rank,
  t.* 
FROM table1 t

The MySQL workaround can be written with a mix of variables and sub queries:

SELECT
  @row_num:=IF(@prev_col1=t.col1 AND @prev_col2=t.col2, @row_num+1, 1) AS row_number,
  @dense:=IF(@prev_col1=t.col1 AND @prev_col2=t.col2, IF(@prev_col3=col3, @dense, @dense+1), 1) AS dense_rank,
  @rank:=IF(@prev_col1=t.col1 AND @prev_col2=t.col2 AND @prev_col3=col3, @rank, @row_num) AS rank,
  t.*
FROM (SELECT * FROM table1 ORDER BY col1, col2, col3 DESC) t,
     (SELECT @row_num:=1, @dense:=1, @rank:=1, @prev_col1:=NULL, @prev_col2:=NULL, @prev_col3:=NULL) var

This solution requires no self join, no temp table, still single pass, and generic enough to adapt it to any use case. But I admit that it is quite convoluted. Very sad that MySQL doesn't include such basic functions that are well achievable.