
SQL Interview Practice: Merging User Browsing Behavior

Contents

  • 1 Problem
  • 2 Table Creation Statements
  • 3 Solution

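1 Problem


A user access log table records the user ID and the access time (a Unix timestamp in seconds). If the interval between two consecutive accesses by the same user is less than 60 seconds, they are considered part of the same visit. Merge each user's accesses into visits.

Sample data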

+----------+--------------+
| user_id  | access_time  |
+----------+--------------+
| 1        | 1736337600   |
| 1        | 1736337660   |
| 2        | 1736337670   |
| 1        | 1736337710   |
| 3        | 1736337715   |
| 2        | 1736337750   |
| 1        | 1736337760   |
| 3        | 1736337820   |
| 2        | 1736337850   |
| 1        | 1736337910   |
+----------+--------------+

2 Table Creation Statements


-- Create the table
CREATE TABLE user_access_log (
    user_id INT,
    access_time BIGINT
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
-- Insert the sample data
insert into user_access_log (user_id, access_time)
values
(1,1736337600),
(1,1736337660),
(2,1736337670),
(1,1736337710),
(3,1736337715),
(2,1736337750),
(1,1736337760),
(3,1736337820),
(2,1736337850),
(1,1736337910);

3 Solution


(1) For each user, compute the time difference between consecutive accesses

select user_id,
       access_time,
       last_access_time,
       access_time - last_access_time as time_diff
from (
    select user_id,
           access_time,
           lag(access_time) over (partition by user_id order by access_time) as last_access_time
    from user_access_log
) t

Result

+----------+--------------+-------------------+------------+
| user_id  | access_time  | last_access_time  | time_diff  |
+----------+--------------+-------------------+------------+
| 1        | 1736337600   | NULL              | NULL       |
| 1        | 1736337660   | 1736337600        | 60         |
| 1        | 1736337710   | 1736337660        | 50         |
| 1        | 1736337760   | 1736337710        | 50         |
| 1        | 1736337910   | 1736337760        | 150        |
| 2        | 1736337670   | NULL              | NULL       |
| 2        | 1736337750   | 1736337670        | 80         |
| 2        | 1736337850   | 1736337750        | 100        |
| 3        | 1736337715   | NULL              | NULL       |
| 3        | 1736337820   | 1736337715        | 105        |
+----------+--------------+-------------------+------------+

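(2) Determine whether each access starts a new visit

An access starts a new visit when the gap to the previous access is 60 seconds or more. The first record of each user has no previous access time, so the comparison evaluates to NULL and if() returns 0.

select user_id,
       access_time,
       last_access_time,
       if(access_time - last_access_time >= 60, 1, 0) as is_new_group
from (
    select user_id,
           access_time,
           lag(access_time) over (partition by user_id order by access_time) as last_access_time
    from user_access_log
) t

Result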

+----------+--------------+-------------------+---------------+
| user_id  | access_time  | last_access_time  | is_new_group  |
+----------+--------------+-------------------+---------------+
| 1        | 1736337600   | NULL              | 0             |
| 1        | 1736337660   | 1736337600        | 1             |
| 1        | 1736337710   | 1736337660        | 0             |
| 1        | 1736337760   | 1736337710        | 0             |
| 1        | 1736337910   | 1736337760        | 1             |
| 2        | 1736337670   | NULL              | 0             |
| 2        | 1736337750   | 1736337670        | 1             |
| 2        | 1736337850   | 1736337750        | 1             |
| 3        | 1736337715   | NULL              | 0             |
| 3        | 1736337820   | 1736337715        | 1             |
+----------+--------------+-------------------+---------------+

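(3) Derive the result

Use sum() over (partition by ... order by ...) to compute a running total, which serves as the group ID. When an aggregate function is used as a window function with an order by clause, the value for each row is computed from the start of the partition up to the current row.

The trick: set the flag to 1 whenever a new group must start and to 0 otherwise. The running total of the flag then changes exactly at the rows where a new group starts, so grouping by that total yields the merged result.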
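The flag-and-accumulate trick can be illustrated outside SQL. The following is a minimal Python sketch, not part of the original solution; the function name assign_group_ids and its gap parameter are hypothetical, and the input is assumed to be one user's timestamps already sorted in ascending order.

```python
from itertools import accumulate


def assign_group_ids(times, gap=60):
    """Return one group id per timestamp (assumes ascending order, one user)."""
    if not times:
        return []
    # Flag is 1 when the gap to the previous timestamp is >= gap, else 0.
    # The first timestamp has no predecessor, so its flag is 0.
    flags = [0] + [
        1 if cur - prev >= gap else 0
        for prev, cur in zip(times, times[1:])
    ]
    # The running total of the flags changes exactly where a new group starts.
    return list(accumulate(flags))


# User 1's access times from the sample data.
user1 = [1736337600, 1736337660, 1736337710, 1736337760, 1736337910]
print(assign_group_ids(user1))  # [0, 1, 1, 1, 2]
```

The output matches the group_id column for user 1 in the result table of this step.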

with t_group as (
    select user_id,
           access_time,
           last_access_time,
           if(access_time - last_access_time >= 60, 1, 0) as is_new_group
    from (
        select user_id,
               access_time,
               lag(access_time) over (partition by user_id order by access_time) as last_access_time
        from user_access_log
    ) t
)
select user_id,
       access_time,
       last_access_time,
       is_new_group,
       sum(is_new_group) over (partition by user_id order by access_time asc) as group_id
from t_group

Result

+----------+--------------+-------------------+---------------+-----------+
| user_id  | access_time  | last_access_time  | is_new_group  | group_id  |
+----------+--------------+-------------------+---------------+-----------+
| 1        | 1736337600   | NULL              | 0             | 0         |
| 1        | 1736337660   | 1736337600        | 1             | 1         |
| 1        | 1736337710   | 1736337660        | 0             | 1         |
| 1        | 1736337760   | 1736337710        | 0             | 1         |
| 1        | 1736337910   | 1736337760        | 1             | 2         |
| 2        | 1736337670   | NULL              | 0             | 0         |
| 2        | 1736337750   | 1736337670        | 1             | 1         |
| 2        | 1736337850   | 1736337750        | 1             | 2         |
| 3        | 1736337715   | NULL              | 0             | 0         |
| 3        | 1736337820   | 1736337715        | 1             | 1         |
+----------+--------------+-------------------+---------------+-----------+
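
The query above stops at assigning group_id. One possible way to collapse each group into a single row is to aggregate by user_id and group_id. The sketch below is an assumption about what the merged output might look like, not the original author's final query: it runs under Python's built-in sqlite3 module rather than Hive, replaces Hive's if() with CASE WHEN, requires SQLite 3.25 or later for window functions, and the names merge_visits, start_time, end_time, and access_cnt are hypothetical.

```python
import sqlite3

# Sample rows from the article: (user_id, access_time).
ROWS = [
    (1, 1736337600), (1, 1736337660), (2, 1736337670), (1, 1736337710),
    (3, 1736337715), (2, 1736337750), (1, 1736337760), (3, 1736337820),
    (2, 1736337850), (1, 1736337910),
]

MERGE_SQL = """
WITH t_lag AS (
    SELECT user_id,
           access_time,
           LAG(access_time) OVER (PARTITION BY user_id ORDER BY access_time)
               AS last_access_time
    FROM user_access_log
),
t_group AS (
    -- CASE WHEN stands in for Hive's if(); a NULL comparison falls to ELSE 0.
    SELECT user_id,
           access_time,
           SUM(CASE WHEN access_time - last_access_time >= 60 THEN 1 ELSE 0 END)
               OVER (PARTITION BY user_id ORDER BY access_time) AS group_id
    FROM t_lag
)
SELECT user_id,
       group_id,
       MIN(access_time) AS start_time,
       MAX(access_time) AS end_time,
       COUNT(*)         AS access_cnt
FROM t_group
GROUP BY user_id, group_id
ORDER BY user_id, group_id
"""


def merge_visits(rows):
    """Load rows into an in-memory table and return one row per merged visit."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE user_access_log (user_id INT, access_time BIGINT)"
        )
        conn.executemany("INSERT INTO user_access_log VALUES (?, ?)", rows)
        return conn.execute(MERGE_SQL).fetchall()
    finally:
        conn.close()


for visit in merge_visits(ROWS):
    print(visit)
```

With the sample data, user 1's three accesses from 1736337660 to 1736337760 collapse into one visit, while every other access remains a visit of its own. In Hive, the same final SELECT with GROUP BY could be applied on top of the t_group query shown above.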
