作者:zhiting999977 | 来源:互联网 | 2023-09-07 19:17
一、每门课程问题
用一条 SQL 语句查询 学生表每门课都大于 80 分的学生姓名。
解决办法一: having
思路:如果最小的课程都大于80分,那他所有的课程分数,肯定都大于80分!
代码实现
SELECT name
FROM xuesheng
GROUP BY name
HAVING MIN(score)> 80
解决办法二:not in
可以用反向思维,先查询出表里面有小于 80 分的 name,然后用 not in 去除掉
代码实现
SELECT DISTINCT name
FROM xuesheng
WHERE name NOT IN(SELECT DISTINCT nameFROM xueshengWHERE score <&#61;80);
二、topN 问题
案例&#xff1a;查询各科成绩前两名的记录
代码实现
row_number() over( partition by 课程 order by 成绩 desc) as rank
...
where rank <&#61; 2 -- 前两名
三、连续问题&#xff08;7 天连续登陆&#xff09;
实现思路:
因为每天用户登录次数可能不止一次&#xff0c;所以需要先将用户每天的登录日期去重。
再用row_number() over(partition by _ order by _)函数将用户id分组&#xff0c;按照登陆时间进行排序。
计算登录日期减去第二步骤得到的结果值&#xff0c;用户连续登陆情况下&#xff0c;每次相减的结果都相同。
按照id和日期分组并求和&#xff0c;筛选大于等于7的即为连续7天登陆的用户。
代码实现
SELECT user_id, MAX(count_val) AS max_count -- 查出了最大连续登陆&#xff0c;where>&#61;7,即7天连续
FROM (-- group by相同日期SELECT user_id, symbol_date, COUNT(*) AS count_valFROM (-- 日期减rank&#xff0c;连续登陆的话&#xff0c;会得到相同日期SELECT user_id, log_date, date_sub(log_date, CAST(rn AS INT)) AS symbol_dateFROM (-- 打上rank标识SELECT user_id, log_date, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY log_date) AS rnFROM user_logging_format) c) dGROUP BY user_id, symbol_date
) e
GROUP BY user_id;
四、行转列问题
行转列可谓是经典中的经典了&#xff0c;必须掌握了&#xff01;
6行变2行&#xff0c;2列变3列
代码实现
SELECT SID,
MAX(case CID when &#39;01&#39; then score else 0 end) &#39;01&#39;,
MAX(case CID when &#39;02&#39; then score else 0 end)&#39;02&#39;,
MAX(case CID when &#39;03&#39; then score else 0 end)&#39;03&#39;
FROM SC
GROUP BY SID
五、留存问题
留存率是衡量用户质量的最重要指标之一&#xff0c;因此计算用户留存率是用户数据分析中必须掌握的技能之一。同样也成为了面试经典sql之一。
留存率指标中&#xff0c;通常需要关注次日留存、3日留存、7日留存和月留存。对新增用户而言&#xff0c;需要关注更细颗粒度的数据&#xff0c;也就是7日内每天的留存率。
代码实现
select
dd
, count( if(id&#61;lead_id and datediff(dd,lead_dd)&#61;1 ,id, null ) ) as &#39;1 日留存&#39;
, count( if(id&#61;lead_id7 and datediff(dd,lead_dd7)&#61;7 ,id, null ) ) as &#39;7 日留存&#39;
from
(
select
id, dd
, lead(dd,1) over(partition by id order by dd asc ) as lead_dd
, lead(id,1) over(partition by id order by dd asc ) as lead_id
, lead(dd,7) over(partition by id order by dd asc ) as lead_dd7
, lead(id,7) over(partition by id order by dd asc ) as lead_id7
from (
select &#39;slm&#39; as id, &#39;2018-12-26&#39; as dd
union all select &#39;slm&#39; as id, &#39;2018-12-27&#39; as dd
union all select &#39;slm&#39; as id, &#39;2018-12-28&#39; as dd
union all select &#39;hh &#39; as id, &#39;2018-12-26&#39; as dd
union all select &#39;hh &#39; as id, &#39;2018-12-28&#39; as dd ) aa
硬核资料&#xff1a;关注即私信或&#xff08;点击获取&#xff09;可领取行业经典书籍PDF。
技术互助&#xff1a;技术群大佬指点迷津&#xff0c;你的问题可能不是问题&#xff0c;求资源在&#xff08;技术群&#xff09;里喊一声。
面试题库&#xff1a;由P8大佬们共同投稿&#xff0c;热乎的大厂面试真题&#xff0c;持续更新中。&#xff08;点击获取&#xff09;
知识体系&#xff1a;含编程语言、算法、大数据生态圈组件&#xff08;Mysql、Hive、Spark、Flink&#xff09;、数据仓库