Want to show your appreciation? Please to my charity.

Sunday, April 24, 2016

Analytical Function in MySQL - ROW_NUMBER, RANK, DENSE_RANK

It was unfortunate that MySQL doesn't support analytical function. The problem got my attention when we were considering MySQL as staging and/or operational database for a Data Warehouse solution. A bit research reveals that it was indeed not very difficult to emulate some of them. Here is the workaround to get the the result of ROW_NUMBER, RANK and DENSE_RANK in MySQL.

Let's take an example to emulate the standard SQL99 analytical query below:

  ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS row_number,
  RANK() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS rank,
  DENSE_RANK() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS dense_rank,
FROM table1 t

The MySQL workaround can be written with a mix of variables and sub queries:

  @row_num:=IF(@prev_col1=t.col1 AND @prev_col2=t.col2, @row_num+1, 1) AS row_number,
  @dense:=IF(@prev_col1=t.col1 AND @prev_col2=t.col2, IF(@prev_col3=col3, @dense, @dense+1), 1) AS dense_rank,
  @rank:=IF(@prev_col1=t.col1 AND @prev_col2=t.col2 AND @prev_col3=col3, @rank, @row_num) AS rank,
FROM (SELECT * FROM table1 ORDER BY col1, col2, col3 DESC) t,
     (SELECT @row_num:=1, @dense:=1, @rank:=1, @prev_col1:=NULL, @prev_col2:=NULL, @prev_col3:=NULL) var

This solution requires no self join, no temp table, still single pass, and generic enough to adapt it to any use case. But I admit that it is quite convoluted. Very sad that MySQL doesn't include such basic functions that are well achievable.


Mahendra V.M said...


Rick James said...

MariaDB just announced "Window Functions" in MariaDB 10.2 -- https://mariadb.com/kb/en/mariadb/window-functions/ . These may eventually obviate your nice, concise, SQL.

-- Rick James

Dan said...

Thank you very much! The ranking solutions I found on Stackoverflow didnt quite work the way I wanted to, but your workaround was the trick. Thanks for sharing with the community!

ToolmakerSteve said...

@prev_col1, 2, 3 are never set to a new value, so all row_nums are "1". After "t.*", need ", @prev_col1:=t.col1, @prev_col2:=t.col2, @prev_col3:=t.col3".

Unknown said...
This comment has been removed by the author.
jberryman said...

I think this is relying on undefined behavior, e.g. this variation gives me incorrect results (row_number is not ordered with respect to the `id` column):

SELECT @row_num:=@row_num+1 AS row_number, t.id FROM (SELECT * FROM table1 WHERE col = 264 ORDER BY id) t, (SELECT @row_num:=0) var;

Post a Comment