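Window Functions - first_value/last_value
1. What is a window function?
A window function defines a window for each row, where the window is the set of rows the computation operates on. It works on that set of values without needing a GROUP BY clause to group the data, so a single result row can carry both the columns of the base row and an aggregated column.
2. What are window functions for?
At heart a window function performs an aggregation, but it returns more information than a plain aggregate: every input row is kept, and the aggregated value is attached to each of them instead of replacing the group with one row.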
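To make the contrast with GROUP BY concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module (backed by SQLite 3.25 or later, which supports window functions) as a stand-in for whatever SQL engine you actually use. The table name t is hypothetical, and the rows are the sample data used in the queries of this article.

```python
import sqlite3

# Sample rows copied from the article's example data.
ROWS = [
    ("张三", "2021-04-11"),
    ("李四", "2021-04-09"),
    ("赵四", "2021-04-16"),
    ("张三", "2021-03-10"),
    ("李四", "2020-01-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (name text, date text)")  # hypothetical table
conn.executemany("insert into t values (?, ?)", ROWS)

# GROUP BY collapses each group into a single row.
grouped = conn.execute(
    "select name, count(*) as cnt from t group by name"
).fetchall()

# The window version keeps every base row and puts the aggregate beside it.
windowed = conn.execute(
    "select name, date, count(*) over (partition by name) as cnt from t"
).fetchall()

print(len(grouped), "rows from GROUP BY")
print(len(windowed), "rows from the window function")
for row in windowed:
    print(row)
```

GROUP BY yields one row per name, while the window query yields one row per input row, each carrying the count for its own partition.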
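3. The first_value/last_value functions
first_value(col) over (partition by col1, col2 order by col1, col2) returns the first value in the window of the current row.
last_value(col) over (partition by col1, col2 order by col1, col2) returns the last value in the window of the current row.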
first_value usage:
select distinct
    a.date,
    a.name,
    -- earliest date for each person
    first_value(date) over (partition by name order by date asc) as `每个人对应最早的date`,
    -- latest date for each person
    first_value(date) over (partition by name order by date desc) as `每个人对应最晚的date`
from (
    select '张三' as name, '2021-04-11' as date
    union all select '李四' as name, '2021-04-09' as date
    union all select '赵四' as name, '2021-04-16' as date
    union all select '张三' as name, '2021-03-10' as date
    union all select '李四' as name, '2020-01-01' as date
) a
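The first_value query can be tried locally. The sketch below assumes Python's built-in sqlite3 module (SQLite 3.25 or later) rather than the article's original engine, and a hypothetical table t holding the same five rows. It leaves a.date out of the select list so that distinct collapses the result to one row per person; the aliases earliest and latest are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (name text, date text)")  # hypothetical table
conn.executemany(
    "insert into t values (?, ?)",
    [
        ("张三", "2021-04-11"),
        ("李四", "2021-04-09"),
        ("赵四", "2021-04-16"),
        ("张三", "2021-03-10"),
        ("李四", "2020-01-01"),
    ],
)

rows = conn.execute(
    """
    select distinct
        name,
        -- ascending order: the first value is the earliest date
        first_value(date) over (partition by name order by date asc)  as earliest,
        -- descending order: the first value is the latest date
        first_value(date) over (partition by name order by date desc) as latest
    from t
    """
).fetchall()

# Map each person to (earliest, latest) for easy inspection.
result = {name: (earliest, latest) for name, earliest, latest in rows}
for name, pair in result.items():
    print(name, pair)
```

Because the first row of a partition is always inside the frame, first_value gives the same answer on every row of that partition, which is why distinct leaves exactly one row per person here.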
last_value usage:
select distinct
    a.date,
    a.name,
    -- intended: latest date for each person
    last_value(date) over (partition by name order by date asc) as `每个人对应最晚的date`
from (
    select '张三' as name, '2021-04-11' as date
    union all select '李四' as name, '2021-04-09' as date
    union all select '赵四' as name, '2021-04-16' as date
    union all select '张三' as name, '2021-03-10' as date
    union all select '李四' as name, '2020-01-01' as date
) a
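To see what this last_value query actually returns, here is a small sketch. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) in place of the article's engine, and a hypothetical table t with the same five rows; the alias last_default is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (name text, date text)")  # hypothetical table
conn.executemany(
    "insert into t values (?, ?)",
    [
        ("张三", "2021-04-11"),
        ("李四", "2021-04-09"),
        ("赵四", "2021-04-16"),
        ("张三", "2021-03-10"),
        ("李四", "2020-01-01"),
    ],
)

rows = conn.execute(
    """
    select
        name,
        date,
        -- no explicit frame: the engine's default frame applies
        last_value(date) over (partition by name order by date asc) as last_default
    from t
    """
).fetchall()

for name, date, last_default in rows:
    # Flag rows where last_value merely echoes the row's own date.
    print(name, date, last_default, "same as own date:", date == last_default)
```

On this data every row gets its own date back, so 张三's row for 2021-03-10 reports 2021-03-10 rather than 2021-04-11.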
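As the result shows, using last_value to get each person's last date does not give what we want: each row simply gets its own date back. So what can be done? Looking up the function's exact behavior reveals the cause:
When the over clause has an order by but no explicit frame, last_value() is evaluated over a default frame that runs from the first row of the partition to the current row, i.e. unbounded preceding to current row. (Most engines define this default as range between unbounded preceding and current row, which behaves like rows between unbounded preceding and current row when the ordering values are distinct.) The last row of that frame is the current row itself. The frame ranges can be read as follows: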
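rows between unbounded preceding and current row: x ∈ (-∞, X], from the start of the partition through the current row X
rows between unbounded preceding and unbounded following: x ∈ (-∞, +∞), the whole partition
rows between current row and unbounded following: x ∈ [X, +∞), from the current row X through the end of the partition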
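order by sorts in ascending order by default. If the order is set to descending and the frame spans the whole partition, last_value() returns the same value as first_value() over ascending order. With the default frame this does not hold, because last_value() still returns the current row's value.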
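The relationship between last_value over descending order and first_value over ascending order can be checked with a short sketch. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in engine and a hypothetical table t with the article's sample rows; the column aliases are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (name text, date text)")  # hypothetical table
conn.executemany(
    "insert into t values (?, ?)",
    [
        ("张三", "2021-04-11"),
        ("李四", "2021-04-09"),
        ("赵四", "2021-04-16"),
        ("张三", "2021-03-10"),
        ("李四", "2020-01-01"),
    ],
)

rows = conn.execute(
    """
    select
        name,
        date,
        first_value(date) over (partition by name order by date asc) as fv_asc,
        -- descending order with a frame covering the whole partition
        last_value(date) over (
            partition by name order by date desc
            rows between unbounded preceding and unbounded following
        ) as lv_desc_full,
        -- descending order with the default frame
        last_value(date) over (partition by name order by date desc) as lv_desc_default
    from t
    """
).fetchall()

for name, date, fv_asc, lv_desc_full, lv_desc_default in rows:
    print(name, date, fv_asc, lv_desc_full, lv_desc_default)
```

On this data fv_asc and lv_desc_full agree on every row (both are the earliest date per person), while lv_desc_default just repeats the row's own date.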
select distinct
    a.date,
    a.name,
    last_value(date) over (partition by name order by date
        rows between unbounded preceding and current row) as `(-∞,X]`,
    last_value(date) over (partition by name order by date
        rows between unbounded preceding and unbounded following) as `(-∞,+∞)`,
    last_value(date) over (partition by name order by date
        rows between current row and unbounded following) as `[X,+∞)`
from (
    select '张三' as name, '2021-04-11' as date
    union all select '李四' as name, '2021-04-09' as date
    union all select '赵四' as name, '2021-04-16' as date
    union all select '张三' as name, '2021-03-10' as date
    union all select '李四' as name, '2020-01-01' as date
) a
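A sketch of what the three frames return, again assuming Python's built-in sqlite3 module (SQLite 3.25 or later) instead of the article's engine and a hypothetical table t with the same rows. The aliases up_to_current, whole_partition and from_current are illustrative names for the three frames.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (name text, date text)")  # hypothetical table
conn.executemany(
    "insert into t values (?, ?)",
    [
        ("张三", "2021-04-11"),
        ("李四", "2021-04-09"),
        ("赵四", "2021-04-16"),
        ("张三", "2021-03-10"),
        ("李四", "2020-01-01"),
    ],
)

rows = conn.execute(
    """
    select
        name,
        date,
        -- frame ends at the current row, so the last value is the row's own date
        last_value(date) over (
            partition by name order by date
            rows between unbounded preceding and current row
        ) as up_to_current,
        -- frame covers the whole partition
        last_value(date) over (
            partition by name order by date
            rows between unbounded preceding and unbounded following
        ) as whole_partition,
        -- frame starts at the current row and runs to the partition end
        last_value(date) over (
            partition by name order by date
            rows between current row and unbounded following
        ) as from_current
    from t
    """
).fetchall()

for row in rows:
    print(row)
```

Both frames that extend to unbounded following reach the end of the partition, so either one yields each person's latest date; only the frame ending at the current row echoes the row's own date.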