热门标签 | HotTags
当前位置:  开发笔记 > 编程语言 > 正文

4、数据仓库电商项目(尚硅谷第二版)DWS层

4、数据仓库电商项目DWS层上一章节:3、数据仓库电商项目(尚硅谷第二版)DWD下下一章节:5、数据仓库电商项目DWT层本章需求        1.DWS层用户行为数据统计 
4、数据仓库电商项目 DWS层

上一章节:3、数据仓库电商项目(尚硅谷第二版)DWD下
下一章节:5、数据仓库电商项目 DWT层

本章需求
 

        1. DWS层用户行为数据统计
       2. DWS层业务行为数据统计
 
注:按照需求自己要敲一遍,可以百度,但切记不要 重度 粘贴复制。
 
上一章说完了DWD层的分解,是比较有难度的,由于时间限制,我写的比较仓促,以后有时间会慢慢的补上。

DWS层的概括

       dws层的数据从dwd层来选取,在建模上进行主题分类,分类后每个主题都制作一张表,按照dwd层的数据,进行每日的轻度聚合,一般是建宽表。

       要求:知道主题的来龙去脉,知道一些的电商术语。

       说明:本次DWS层的数据清洗是与DWT(主题宽表层)相对应的,区别在于DWS层是针对于每日进行统计的,而DWT层则是针对全部进行统计。

DWT层每日设备行为每日会员行为每日商品行为每日优惠券统计 (预留)每日活动统计(预留)每日购买行为
DWT层设备主题宽表会员主题宽表商品主题宽表优惠券主题宽表活动主题宽表会员主题(ADS层)

业务术语
1)用户
用户以设备为判断标准,在移动统计中,每个独立设备认为是一个独立用户。Android系统根据IMEI号,IOS系统根据OpenUDID来标识一个独立用户,每部手机一个用户。
2)新增用户
首次联网使用应用的用户。如果一个用户首次打开某APP,那这个用户定义为新增用户;卸载再安装的设备,不会被算作一次新增。新增用户包括日新增用户、周新增用户、月新增用户。
3)活跃用户
打开应用的用户即为活跃用户,不考虑用户的使用情况。每天一台设备打开多次会被计为一个活跃用户。
4)周(月)活跃用户
某个自然周(月)内启动过应用的用户,该周(月)内的多次启动只记一个活跃用户。
5)月活跃率
月活跃用户与截止到该月累计的用户总和之间的比例。
6)沉默用户
用户仅在安装当天(次日)启动一次,后续时间无再启动行为。该指标可以反映新增用户质量和用户与APP的匹配程度。
7)版本分布
不同版本的周内各天新增用户数,活跃用户数和启动次数。利于判断APP各个版本之间的优劣和用户行为习惯。
8)本周回流用户
上周未启动过应用,本周启动了应用的用户。
9)连续n周活跃用户
连续n周,每周至少启动一次。
10)忠诚用户
连续活跃5周以上的用户
11)连续活跃用户
连续2周及以上活跃的用户
12)近期流失用户
连续n(2<= n <= 4)周没有启动应用的用户。(第n+1周没有启动过)
13)留存用户
某段时间内的新增用户,经过一段时间后,仍然使用应用的被认作是留存用户;这部分用户占当时新增用户的比例即是留存率。
例如,5月份新增用户200,这200人在6月份启动过应用的有100人,7月份启动过应用的有80人,8月份启动过应用的有50人;则5月份新增用户一个月后的留存率是50%,二个月后的留存率是40%,三个月后的留存率是25%。
14)用户新鲜度
每天启动应用的新老用户比例,即新增用户数占活跃用户数的比例。
15)单次使用时长
每次启动使用的时间长度。
16)日使用时长
累计一天内的使用时间长度。
17)启动次数计算标准
IOS平台应用退到后台就算一次独立的启动;Android平台我们规定,两次启动之间的间隔小于30秒,被计算一次启动。用户在使用过程中,若因收发短信或接电话等退出应用30秒又再次返回应用中,那这两次行为应该是延续而非独立的,所以可以被算作一次使用行为,即一次启动。业内大多使用30秒这个标准,但用户还是可以自定义此时间间隔。

在这里官方给了一些需要用到的函数,大家了解一二:

系统函数
collect_set函数
1)创建原数据表
hive (gmall)>
drop table if exists stud;
create table stud (name string, area string, course string, score int);
2)向原数据表中插入数据
hive (gmall)>
insert into table stud values('zhang3','bj','math',88);
insert into table stud values('li4','bj','math',99);
insert into table stud values('wang5','sh','chinese',92);
insert into table stud values('zhao6','sh','chinese',54);
insert into table stud values('tian7','bj','chinese',91);
3)查询表中数据
hive (gmall)> select * from stud;
stud.name stud.area stud.course stud.score
zhang3 bj math 88
li4 bj math 99
wang5 sh chinese 92
zhao6 sh chinese 54
tian7 bj chinese 91
4)把同一分组的不同行的数据聚合成一个集合
hive (gmall)> select course, collect_set(area), avg(score) from stud group by course;
chinese ["sh","bj"] 79.0
math ["bj"] 93.5
5) 用下标可以取某一个
hive (gmall)> select course, collect_set(area)[0], avg(score) from stud group by course;
chinese sh 79.0
math bj 93.5
nvl函数
1)基本语法
NVL(表达式1,表达式2)
如果表达式1为空值,NVL返回值为表达式2的值,否则返回表达式1的值。 该函数的目的是把一个空值(null)转换成一个实际的值。其表达式的值可以是数字型、字符型和日期型。但是表达式1和表达式2的数据类型必须为同一个类型。
5.2.3 日期处理函数
1)date_format函数(根据格式整理日期)
hive (gmall)> select date_format('2020-03-29','yyyy-MM');
2020-03
2)date_add函数(加减日期)
hive (gmall)> select date_add('2020-03-29',-1);
2020-03-09
hive (gmall)> select date_add('2020-03-29',1);
2020-03-11
3)next_day函数
(1)取当前天的下一个周一
hive (gmall)> select next_day('2020-03-12','MO');
2020-03-16
说明:星期一到星期日的英文(Monday,Tuesday、Wednesday、Thursday、Friday、Saturday、Sunday)
(2)取当前周的周一
hive (gmall)> select date_add(next_day('2020-03-12','MO'),-7);
2020-03-11
4)last_day函数(求当月最后一天日期)
hive (gmall)> select last_day('2020-03-29');
2020-03-31

DWS层每日设备行为(用户行为)

建表语句

drop table if exists dws_uv_detail_daycount;
create external table dws_uv_detail_daycount
(
`mid_id` string COMMENT '设备唯一标识',
`user_id` string COMMENT '用户标识',
`version_code` string COMMENT '程序版本号',
`version_name` string COMMENT '程序版本名',
`lang` string COMMENT '系统语言',
`source` string COMMENT '渠道号',
`os` string COMMENT '安卓系统版本',
`area` string COMMENT '区域',
`model` string COMMENT '手机型号',
`brand` string COMMENT '手机品牌',
`sdk_version` string COMMENT 'sdkVersion',
`gmail` string COMMENT 'gmail',
`height_width` string COMMENT '屏幕宽高',
`app_time` string COMMENT '客户端日志产生时的时间',
`network` string COMMENT '网络模式',
`lng` string COMMENT '经度',
`lat` string COMMENT '纬度',
`login_count` bigint COMMENT '活跃次数'
)
partitioned by(dt string)
stored as parquet
location '/warehouse/gmall/dws/dws_uv_detail_daycount'
TBLPROPERTIES('parquet.compression'='lzo');

数据装载

hive (gmall)>
insert overwrite table dws_uv_detail_daycount partition(dt='2020-03-29')
select
mid_id,
concat_ws('|', collect_set(user_id)) user_id,
concat_ws('|', collect_set(version_code)) version_code,
concat_ws('|', collect_set(version_name)) version_name,
concat_ws('|', collect_set(lang))lang,
concat_ws('|', collect_set(source)) source,
concat_ws('|', collect_set(os)) os,
concat_ws('|', collect_set(area)) area,
concat_ws('|', collect_set(model)) model,
concat_ws('|', collect_set(brand)) brand,
concat_ws('|', collect_set(sdk_version)) sdk_version,
concat_ws('|', collect_set(gmail)) gmail,
concat_ws('|', collect_set(height_width)) height_width,
concat_ws('|', collect_set(app_time)) app_time,
concat_ws('|', collect_set(network)) network,
concat_ws('|', collect_set(lng)) lng,
concat_ws('|', collect_set(lat)) lat,
count(*) login_count
from dwd_start_log
where dt='2020-03-29'
group by mid_id;

DWS层(业务)

每日会员行为

建表语句

hive (gmall)>
drop table if exists dws_user_action_daycount;
create external table dws_user_action_daycount
(
user_id string comment '用户 id',
login_count bigint comment '登录次数',
cart_count bigint comment '加入购物车次数',
cart_amount double comment '加入购物车金额',
order_count bigint comment '下单次数',
order_amount decimal(16,2) comment '下单金额',
payment_count bigint comment '支付次数',
payment_amount decimal(16,2) comment '支付金额'
) COMMENT '每日用户行为'
PARTITIONED BY (`dt` string)
stored as parquet
location '/warehouse/gmall/dws/dws_user_action_daycount/'
tblproperties ("parquet.compression"="lzo");

**数据装载**

hive (gmall)>
with
tmp_login as
(
select
user_id,
count(*) login_count
from dwd_start_log
where dt='2020-03-29'
and user_id is not null
group by user_id
),
tmp_cart as
(
select
user_id,
count(*) cart_count,
sum(cart_price*sku_num) cart_amount
from dwd_fact_cart_info
where dt='2020-03-29'
and user_id is not null
and date_format(create_time,'yyyy-MM-dd')='2020-03-29'
group by user_id
),
tmp_order as
(
select
user_id,
count(*) order_count,
sum(final_total_amount) order_amount
from dwd_fact_order_info
where dt='2020-03-29'
group by user_id
) ,
tmp_payment as
(
select
user_id,
count(*) payment_count,
sum(payment_amount) payment_amount
from dwd_fact_payment_info
where dt='2020-03-29'
group by user_id
)
insert overwrite table dws_user_action_daycount partition(dt='2020-03-29')
select
user_actions.user_id,
sum(user_actions.login_count),
sum(user_actions.cart_count),
sum(user_actions.cart_amount),
sum(user_actions.order_count),
sum(user_actions.order_amount),
sum(user_actions.payment_count),
sum(user_actions.payment_amount)
from
(
select
user_id,
login_count,
0 cart_count,
0 cart_amount,
0 order_count,
0 order_amount,
0 payment_count,
0 payment_amount
from
tmp_login
union all
select
user_id,
0 login_count,
cart_count,
cart_amount,
0 order_count,
0 order_amount,
0 payment_count,
0 payment_amount
from
tmp_cart
union all
select
user_id,
0 login_count,
0 cart_count,
0 cart_amount,
order_count,
order_amount,
0 payment_count,
0 payment_amount
from tmp_order
union all
select
user_id,
0 login_count,
0 cart_count,
0 cart_amount,
0 order_count,
0 order_amount,
payment_count,
payment_amount
from tmp_payment
) user_actions
group by user_id;

每日商品行为

建表语句

hive (gmall)>
drop table if exists dws_sku_action_daycount;
create external table dws_sku_action_daycount
(
sku_id string comment 'sku_id',
order_count bigint comment '被下单次数',
order_num bigint comment '被下单件数',
order_amount decimal(16,2) comment '被下单金额',
payment_count bigint comment '被支付次数',
payment_num bigint comment '被支付件数',
payment_amount decimal(16,2) comment '被支付金额',
refund_count bigint comment '被退款次数',
refund_num bigint comment '被退款件数',
refund_amount decimal(16,2) comment '被退款金额',
cart_count bigint comment '被加入购物车次数',
cart_num bigint comment '被加入购物车件数',
favor_count bigint comment '被收藏次数',
appraise_good_count bigint comment '好评数',
appraise_mid_count bigint comment '中评数',
appraise_bad_count bigint comment '差评数',
appraise_default_count bigint comment '默认评价数'
) COMMENT '每日商品行为'
PARTITIONED BY (`dt` string)
stored as parquet
location '/warehouse/gmall/dws/dws_sku_action_daycount/'
tblproperties ("parquet.compression"="lzo");

数据装载

hive (gmall)>
with
tmp_order as
(
select
sku_id,
count(*) order_count,
sum(sku_num) order_num,
sum(total_amount) order_amount
from dwd_fact_order_detail
where dt='2020-03-29'
group by sku_id
),
tmp_payment as
(
select
sku_id,
count(*) payment_count,
sum(sku_num) payment_num,
sum(total_amount) payment_amount
from dwd_fact_order_detail
//每日新增
// 只能取出3月29日新增的订单
[3, 4,5,6,7]
where dt='2020-03-29' or dt=date_add('2020-03-29',-1)
and order_id in
[1,2,3]
[3, 4,5,6,7] in [1,2,3] = [3]
(
所有29日28日下单的,在29日支付的订单id
复合条件的有28日下单的1,2
29日下单的3号订单
select
id
from dwd_fact_order_info
where (dt='2020-03-29' or dt=date_add('2020-03-29',-1))
and date_format(payment_time,'yyyy-MM-dd')='2020-03-29'
)
group by sku_id
),
tmp_refund as
(
select
sku_id,
count(*) refund_count,
sum(refund_num) refund_num,
sum(refund_amount) refund_amount
from dwd_fact_order_refund_info
where dt='2020-03-29'
group by sku_id
),
tmp_cart as
(
select
sku_id,
count(*) cart_count,
sum(sku_num) cart_num
from dwd_fact_cart_info
where dt='2020-03-29'
and date_format(create_time,'yyyy-MM-dd')='2020-03-29'
group by sku_id
),
tmp_favor as
(
select
sku_id,
count(*) favor_count
from dwd_fact_favor_info
where dt='2020-03-29'
and date_format(create_time,'yyyy-MM-dd')='2020-03-29'
group by sku_id
),
tmp_appraise as
(
select
sku_id,
sum(if(appraise='1201',1,0)) appraise_good_count,
sum(if(appraise='1202',1,0)) appraise_mid_count,
sum(if(appraise='1203',1,0)) appraise_bad_count,
sum(if(appraise='1204',1,0)) appraise_default_count
from dwd_fact_comment_info
where dt='2020-03-29'
group by sku_id
)
insert overwrite table dws_sku_action_daycount partition(dt='2020-03-29')
select
sku_id,
sum(order_count),
sum(order_num),
sum(order_amount),
sum(payment_count),
sum(payment_num),
sum(payment_amount),
sum(refund_count),
sum(refund_num),
sum(refund_amount),
sum(cart_count),
sum(cart_num),
sum(favor_count),
sum(appraise_good_count),
sum(appraise_mid_count),
sum(appraise_bad_count),
sum(appraise_default_count)
from
(
select
sku_id,
order_count,
order_num,
order_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_count,
0 refund_num,
0 refund_amount,
0 cart_count,
0 cart_num,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_order
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_amount,
payment_count,
payment_num,
payment_amount,
0 refund_count,
0 refund_num,
0 refund_amount,
0 cart_count,
0 cart_num,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_payment
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
refund_count,
refund_num,
refund_amount,
0 cart_count,
0 cart_num,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_refund
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_count,
0 refund_num,
0 refund_amount,
cart_count,
cart_num,
0 favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_cart
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_count,
0 refund_num,
0 refund_amount,
0 cart_count,
0 cart_num,
favor_count,
0 appraise_good_count,
0 appraise_mid_count,
0 appraise_bad_count,
0 appraise_default_count
from tmp_favor
union all
select
sku_id,
0 order_count,
0 order_num,
0 order_amount,
0 payment_count,
0 payment_num,
0 payment_amount,
0 refund_count,
0 refund_num,
0 refund_amount,
0 cart_count,
0 cart_num,
0 favor_count,
appraise_good_count,
appraise_mid_count,
appraise_bad_count,
appraise_default_count
from tmp_appraise
)tmp
group by sku_id;

每日优惠券统计(预留)

hive (gmall)>
drop table if exists dws_coupon_use_daycount;
create external table dws_coupon_use_daycount
(
`coupon_id` string COMMENT '优惠券ID',
`coupon_name` string COMMENT '购物券名称',
`coupon_type` string COMMENT '购物券类型 1 现金券 2 折扣券 3 满减券 4 满件打折券',
`condition_amount` string COMMENT '满额数',
`condition_num` string COMMENT '满件数',
`activity_id` string COMMENT '活动编号',
`benefit_amount` string COMMENT '减金额',
`benefit_discount` string COMMENT '折扣',
`create_time` string COMMENT '创建时间',
`range_type` string COMMENT '范围类型 1、商品 2、品类 3、品牌',
`spu_id` string COMMENT '商品id',
`tm_id` string COMMENT '品牌id',
`category3_id` string COMMENT '品类id',
`limit_num` string COMMENT '最多领用次数',
`get_count` bigint COMMENT '领用次数',
`using_count` bigint COMMENT '使用(下单)次数',
`used_count` bigint COMMENT '使用(支付)次数'
) COMMENT '每日优惠券统计'
PARTITIONED BY (`dt` string)
stored as parquet
location '/warehouse/gmall/dws/dws_coupon_use_daycount/'
tblproperties ("parquet.compression"="lzo");

数据装载

hive (gmall)>
insert overwrite table dws_coupon_use_daycount partition(dt='2020-03-29')
select
cu.coupon_id,
ci.coupon_name,
ci.coupon_type,
ci.condition_amount,
ci.condition_num,
ci.activity_id,
ci.benefit_amount,
ci.benefit_discount,
ci.create_time,
ci.range_type,
ci.spu_id,
ci.tm_id,
ci.category3_id,
ci.limit_num,
cu.get_count,
cu.using_count,
cu.used_count
from
(
select
coupon_id,
sum(if(date_format(get_time,'yyyy-MM-dd')='2020-03-29',1,0)) get_count,
sum(if(date_format(using_time,'yyyy-MM-dd')='2020-03-29',1,0)) using_count,
sum(if(date_format(used_time,'yyyy-MM-dd')='2020-03-29',1,0)) used_count
from dwd_fact_coupon_use
where dt='2020-03-29'
group by coupon_id
)cu
left join
(
select
*
from dwd_dim_coupon_info
where dt='2020-03-29'
)ci on cu.coupon_id=ci.id;

每日活动统计(预留)

建表语句

hive (gmall)>
drop table if exists dws_activity_info_daycount;
create external table dws_activity_info_daycount(
`id` string COMMENT '编号',
`activity_name` string COMMENT '活动名称',
`activity_type` string COMMENT '活动类型',
`start_time` string COMMENT '开始时间',
`end_time` string COMMENT '结束时间',
`create_time` string COMMENT '创建时间',
`order_count` bigint COMMENT '下单次数',
`payment_count` bigint COMMENT '支付次数'
) COMMENT '活动信息表'
PARTITIONED BY (`dt` string)
row format delimited fields terminated by '\t'
location '/warehouse/gmall/dws/dws_activity_info_daycount/'
tblproperties ("parquet.compression"="lzo");

数据装载

hive (gmall)>
insert overwrite table dws_activity_info_daycount partition(dt='2020-03-29')
select
oi.activity_id,
ai.activity_name,
ai.activity_type,
ai.start_time,
ai.end_time,
ai.create_time,
oi.order_count,
oi.payment_count
from
(
select
activity_id,
sum(if(date_format(create_time,'yyyy-MM-dd')='2020-03-29',1,0)) order_count,
sum(if(date_format(payment_time,'yyyy-MM-dd')='2020-03-29',1,0)) payment_count
from dwd_fact_order_info
where (dt='2020-03-29' or dt=date_add('2020-03-29',-1))
and activity_id is not null
group by activity_id
)oi
join
(
select
*
from dwd_dim_activity_info
where dt='2020-03-29'
)ai
on oi.activity_id=ai.id;

每日购买行为

建表语句

hive (gmall)>
drop table if exists dws_sale_detail_daycount;
create external table dws_sale_detail_daycount
(
user_id string comment '用户 id',
sku_id string comment '商品 id',
user_gender string comment '用户性别',
user_age string comment '用户年龄',
user_level string comment '用户等级',
order_price decimal(10,2) comment '商品价格',
sku_name string comment '商品名称',
sku_tm_id string comment '品牌id',
sku_category3_id string comment '商品三级品类id',
sku_category2_id string comment '商品二级品类id',
sku_category1_id string comment '商品一级品类id',
sku_category3_name string comment '商品三级品类名称',
sku_category2_name string comment '商品二级品类名称',
sku_category1_name string comment '商品一级品类名称',
spu_id string comment '商品 spu',
sku_num int comment '购买个数',
order_count bigint comment '当日下单单数',
order_amount decimal(16,2) comment '当日下单金额'
) COMMENT '每日购买行为'
PARTITIONED BY (`dt` string)
stored as parquet
location '/warehouse/gmall/dws/dws_sale_detail_daycount/'
tblproperties ("parquet.compression"="lzo");

数据装载

hive (gmall)>
insert overwrite table dws_sale_detail_daycount partition(dt='2020-03-29')
select
op.user_id,
op.sku_id,
ui.gender,
months_between('2020-03-29', ui.birthday)/12 age,
ui.user_level,
si.price,
si.sku_name,
si.tm_id,
si.category3_id,
si.category2_id,
si.category1_id,
si.category3_name,
si.category2_name,
si.category1_name,
si.spu_id,
op.sku_num,
op.order_count,
op.order_amount
from
(
select
user_id,
sku_id,
sum(sku_num) sku_num,
count(*) order_count,
sum(total_amount) order_amount
from dwd_fact_order_detail
where dt='2020-03-29'
group by user_id, sku_id
)op
join
(
select
*
from dwd_dim_user_info_his
where end_date='9999-99-99'
)ui on op.user_id = ui.id
join
(
select
*
from dwd_dim_sku_info
where dt='2020-03-29'
)si on op.sku_id = si.id;

DWS层数据导入脚本

vim dwd_to_dws.sh

#!/bin/bash
APP=gmall
hive=/opt/module/hive/bin/hive
# 如果是输入的日期按照取输入日期;如果没输入日期取当前时间的前一天
if [ -n "$1" ] ;then
do_date=$1
else
do_date=`date -d "-1 day" +%F`
fi
sql=" insert overwrite table ${APP}.dws_uv_detail_daycount partition(dt='$do_date') select mid_id, concat_ws('|', collect_set(user_id)) user_id, concat_ws('|', collect_set(version_code)) version_code, concat_ws('|', collect_set(version_name)) version_name, concat_ws('|', collect_set(lang))lang, concat_ws('|', collect_set(source)) source, concat_ws('|', collect_set(os)) os, concat_ws('|', collect_set(area)) area, concat_ws('|', collect_set(model)) model, concat_ws('|', collect_set(brand)) brand, concat_ws('|', collect_set(sdk_version)) sdk_version, concat_ws('|', collect_set(gmail)) gmail, concat_ws('|', collect_set(height_width)) height_width, concat_ws('|', collect_set(app_time)) app_time, concat_ws('|', collect_set(network)) network, concat_ws('|', collect_set(lng)) lng, concat_ws('|', collect_set(lat)) lat, count(*) login_count from ${APP}.dwd_start_log where dt='$do_date' group by mid_id; with tmp_login as ( select user_id, count(*) login_count from ${APP}.dwd_start_log where dt='$do_date' and user_id is not null group by user_id ), tmp_cart as ( select user_id, count(*) cart_count, sum(cart_price*sku_num) cart_amount from ${APP}.dwd_fact_cart_info where dt='$do_date' and user_id is not null and date_format(create_time,'yyyy-MM-dd')='$do_date' group by user_id ), tmp_order as ( select user_id, count(*) order_count, sum(final_total_amount) order_amount from ${APP}.dwd_fact_order_info where dt='$do_date' group by user_id ) , tmp_payment as ( select user_id, count(*) payment_count, sum(payment_amount) payment_amount from ${APP}.dwd_fact_payment_info where dt='$do_date' group by user_id ) insert overwrite table ${APP}.dws_user_action_daycount partition(dt='$do_date') select user_actions.user_id, sum(user_actions.login_count), sum(user_actions.cart_count), sum(user_actions.cart_amount), sum(user_actions.order_count), sum(user_actions.order_amount), sum(user_actions.payment_count), sum(user_actions.payment_amount) from ( select user_id, login_count, 0 cart_count, 0 cart_amount, 0 order_count, 0 order_amount, 0 payment_count, 0 payment_amount from tmp_login union all select user_id, 0 login_count, cart_count, cart_amount, 0 order_count, 0 order_amount, 0 payment_count, 0 payment_amount from tmp_cart union all select user_id, 0 login_count, 0 cart_count, 0 cart_amount, order_count, order_amount, 0 payment_count, 0 payment_amount from tmp_order union all select user_id, 0 login_count, 0 cart_count, 0 cart_amount, 0 order_count, 0 order_amount, payment_count, payment_amount from tmp_payment ) user_actions group by user_id; with tmp_order as ( select sku_id, count(*) order_count, sum(sku_num) order_num, sum(total_amount) order_amount from ${APP}.dwd_fact_order_detail where dt='$do_date' group by sku_id ), tmp_payment as ( select sku_id, count(*) payment_count, sum(sku_num) payment_num, sum(total_amount) payment_amount from ${APP}.dwd_fact_order_detail where dt='$do_date' and order_id in ( select id from ${APP}.dwd_fact_order_info where (dt='$do_date' or dt=date_add('$do_date',-1)) and date_format(payment_time,'yyyy-MM-dd')='$do_date' ) group by sku_id ), tmp_refund as ( select sku_id, count(*) refund_count, sum(refund_num) refund_num, sum(refund_amount) refund_amount from ${APP}.dwd_fact_order_refund_info where dt='$do_date' group by sku_id ), tmp_cart as ( select sku_id, count(*) cart_count, sum(sku_num) cart_num from ${APP}.dwd_fact_cart_info where dt='$do_date' and date_format(create_time,'yyyy-MM-dd')='$do_date' group by sku_id ), tmp_favor as ( select sku_id, count(*) favor_count from ${APP}.dwd_fact_favor_info where dt='$do_date' and date_format(create_time,'yyyy-MM-dd')='$do_date' group by sku_id ), tmp_appraise as ( select sku_id, sum(if(appraise='1201',1,0)) appraise_good_count, sum(if(appraise='1202',1,0)) appraise_mid_count, sum(if(appraise='1203',1,0)) appraise_bad_count, sum(if(appraise='1204',1,0)) appraise_default_count from ${APP}.dwd_fact_comment_info where dt='$do_date' group by sku_id ) insert overwrite table ${APP}.dws_sku_action_daycount partition(dt='$do_date') select sku_id, sum(order_count), sum(order_num), sum(order_amount), sum(payment_count), sum(payment_num), sum(payment_amount), sum(refund_count), sum(refund_num), sum(refund_amount), sum(cart_count), sum(cart_num), sum(favor_count), sum(appraise_good_count), sum(appraise_mid_count), sum(appraise_bad_count), sum(appraise_default_count) from ( select sku_id, order_count, order_num, order_amount, 0 payment_count, 0 payment_num, 0 payment_amount, 0 refund_count, 0 refund_num, 0 refund_amount, 0 cart_count, 0 cart_num, 0 favor_count, 0 appraise_good_count, 0 appraise_mid_count, 0 appraise_bad_count, 0 appraise_default_count from tmp_order union all select sku_id, 0 order_count, 0 order_num, 0 order_amount, payment_count, payment_num, payment_amount, 0 refund_count, 0 refund_num, 0 refund_amount, 0 cart_count, 0 cart_num, 0 favor_count, 0 appraise_good_count, 0 appraise_mid_count, 0 appraise_bad_count, 0 appraise_default_count from tmp_payment union all select sku_id, 0 order_count, 0 order_num, 0 order_amount, 0 payment_count, 0 payment_num, 0 payment_amount, refund_count, refund_num, refund_amount, 0 cart_count, 0 cart_num, 0 favor_count, 0 appraise_good_count, 0 appraise_mid_count, 0 appraise_bad_count, 0 appraise_default_count from tmp_refund union all select sku_id, 0 order_count, 0 order_num, 0 order_amount, 0 payment_count, 0 payment_num, 0 payment_amount, 0 refund_count, 0 refund_num, 0 refund_amount, cart_count, cart_num, 0 favor_count, 0 appraise_good_count, 0 appraise_mid_count, 0 appraise_bad_count, 0 appraise_default_count from tmp_cart union all select sku_id, 0 order_count, 0 order_num, 0 order_amount, 0 payment_count, 0 payment_num, 0 payment_amount, 0 refund_count, 0 refund_num, 0 refund_amount, 0 cart_count, 0 cart_num, favor_count, 0 appraise_good_count, 0 appraise_mid_count, 0 appraise_bad_count, 0 appraise_default_count from tmp_favor union all select sku_id, 0 order_count, 0 order_num, 0 order_amount, 0 payment_count, 0 payment_num, 0 payment_amount, 0 refund_count, 0 refund_num, 0 refund_amount, 0 cart_count, 0 cart_num, 0 favor_count, appraise_good_count, appraise_mid_count, appraise_bad_count, appraise_default_count from tmp_appraise )tmp group by sku_id; insert overwrite table ${APP}.dws_coupon_use_daycount partition(dt='$do_date') select cu.coupon_id, ci.coupon_name, ci.coupon_type, ci.condition_amount, ci.condition_num, ci.activity_id, ci.benefit_amount, ci.benefit_discount, ci.create_time, ci.range_type, ci.spu_id, ci.tm_id, ci.category3_id, ci.limit_num, cu.get_count, cu.using_count, cu.used_count from ( select coupon_id, sum(if(date_format(get_time,'yyyy-MM-dd')='$do_date',1,0)) get_count, sum(if(date_format(using_time,'yyyy-MM-dd')='$do_date',1,0)) using_count, sum(if(date_format(used_time,'yyyy-MM-dd')='$do_date',1,0)) used_count from ${APP}.dwd_fact_coupon_use where dt='$do_date' group by coupon_id )cu left join ( select * from ${APP}.dwd_dim_coupon_info where dt='$do_date' )ci on cu.coupon_id=ci.id; insert overwrite table ${APP}.dws_activity_info_daycount partition(dt='$do_date') select oi.activity_id, ai.activity_name, ai.activity_type, ai.start_time, ai.end_time, ai.create_time, oi.order_count, oi.payment_count from ( select activity_id, sum(if(date_format(create_time,'yyyy-MM-dd')='$do_date',1,0)) order_count, sum(if(date_format(payment_time,'yyyy-MM-dd')='$do_date',1,0)) payment_count from ${APP}.dwd_fact_order_info where (dt='$do_date' or dt=date_add('$do_date',-1)) and activity_id is not null group by activity_id )oi join ( select * from ${APP}.dwd_dim_activity_info where dt='$do_date' )ai on oi.activity_id=ai.id; insert overwrite table ${APP}.dws_sale_detail_daycount partition(dt='$do_date') select op.user_id, op.sku_id, ui.gender, months_between('$do_date', ui.birthday)/12 age, ui.user_level, si.price, si.sku_name, si.tm_id, si.category3_id, si.category2_id, si.category1_id, si.category3_name, si.category2_name, si.category1_name, si.spu_id, op.sku_num, op.order_count, op.order_amount from ( select user_id, sku_id, sum(sku_num) sku_num, count(*) order_count, sum(total_amount) order_amount from ${APP}.dwd_fact_order_detail where dt='$do_date' group by user_id, sku_id )op join ( select * from ${APP}.dwd_dim_user_info_his where end_date='9999-99-99' )ui on op.user_id = ui.id join ( select * from ${APP}.dwd_dim_sku_info where dt='$do_date' )si on op.sku_id = si.id; "
$hive -e "$sql"

推荐阅读
  • Java学习笔记之面向对象编程(OOP)
    本文介绍了Java学习笔记中的面向对象编程(OOP)内容,包括OOP的三大特性(封装、继承、多态)和五大原则(单一职责原则、开放封闭原则、里式替换原则、依赖倒置原则)。通过学习OOP,可以提高代码复用性、拓展性和安全性。 ... [详细]
  • 本文详细介绍了Java中vector的使用方法和相关知识,包括vector类的功能、构造方法和使用注意事项。通过使用vector类,可以方便地实现动态数组的功能,并且可以随意插入不同类型的对象,进行查找、插入和删除操作。这篇文章对于需要频繁进行查找、插入和删除操作的情况下,使用vector类是一个很好的选择。 ... [详细]
  • Java容器中的compareto方法排序原理解析
    本文从源码解析Java容器中的compareto方法的排序原理,讲解了在使用数组存储数据时的限制以及存储效率的问题。同时提到了Redis的五大数据结构和list、set等知识点,回忆了作者大学时代的Java学习经历。文章以作者做的思维导图作为目录,展示了整个讲解过程。 ... [详细]
  • Android Studio Bumblebee | 2021.1.1(大黄蜂版本使用介绍)
    本文介绍了Android Studio Bumblebee | 2021.1.1(大黄蜂版本)的使用方法和相关知识,包括Gradle的介绍、设备管理器的配置、无线调试、新版本问题等内容。同时还提供了更新版本的下载地址和启动页面截图。 ... [详细]
  • 自动轮播,反转播放的ViewPagerAdapter的使用方法和效果展示
    本文介绍了如何使用自动轮播、反转播放的ViewPagerAdapter,并展示了其效果。该ViewPagerAdapter支持无限循环、触摸暂停、切换缩放等功能。同时提供了使用GIF.gif的示例和github地址。通过LoopFragmentPagerAdapter类的getActualCount、getActualItem和getActualPagerTitle方法可以实现自定义的循环效果和标题展示。 ... [详细]
  • C++字符字符串处理及字符集编码方案
    本文介绍了C++中字符字符串处理的问题,并详细解释了字符集编码方案,包括UNICODE、Windows apps采用的UTF-16编码、ASCII、SBCS和DBCS编码方案。同时说明了ANSI C标准和Windows中的字符/字符串数据类型实现。文章还提到了在编译时需要定义UNICODE宏以支持unicode编码,否则将使用windows code page编译。最后,给出了相关的头文件和数据类型定义。 ... [详细]
  • MyBatis多表查询与动态SQL使用
    本文介绍了MyBatis多表查询与动态SQL的使用方法,包括一对一查询和一对多查询。同时还介绍了动态SQL的使用,包括if标签、trim标签、where标签、set标签和foreach标签的用法。文章还提供了相关的配置信息和示例代码。 ... [详细]
  • 利用Visual Basic开发SAP接口程序初探的方法与原理
    本文介绍了利用Visual Basic开发SAP接口程序的方法与原理,以及SAP R/3系统的特点和二次开发平台ABAP的使用。通过程序接口自动读取SAP R/3的数据表或视图,在外部进行处理和利用水晶报表等工具生成符合中国人习惯的报表样式。具体介绍了RFC调用的原理和模型,并强调本文主要不讨论SAP R/3函数的开发,而是针对使用SAP的公司的非ABAP开发人员提供了初步的接口程序开发指导。 ... [详细]
  • 本文介绍了Android 7的学习笔记总结,包括最新的移动架构视频、大厂安卓面试真题和项目实战源码讲义。同时还分享了开源的完整内容,并提醒读者在使用FileProvider适配时要注意不同模块的AndroidManfiest.xml中配置的xml文件名必须不同,否则会出现问题。 ... [详细]
  • 在Xamarin XAML语言中如何在页面级别构建ControlTemplate控件模板
    本文介绍了在Xamarin XAML语言中如何在页面级别构建ControlTemplate控件模板的方法和步骤,包括将ResourceDictionary添加到页面中以及在ResourceDictionary中实现模板的构建。通过本文的阅读,读者可以了解到在Xamarin XAML语言中构建控件模板的具体操作步骤和语法形式。 ... [详细]
  • 本文讨论了在openwrt-17.01版本中,mt7628设备上初始化启动时eth0的mac地址总是随机生成的问题。每次随机生成的eth0的mac地址都会写到/sys/class/net/eth0/address目录下,而openwrt-17.01原版的SDK会根据随机生成的eth0的mac地址再生成eth0.1、eth0.2等,生成后的mac地址会保存在/etc/config/network下。 ... [详细]
  • 如何搭建Java开发环境并开发WinCE项目
    本文介绍了如何搭建Java开发环境并开发WinCE项目,包括搭建开发环境的步骤和获取SDK的几种方式。同时还解答了一些关于WinCE开发的常见问题。通过阅读本文,您将了解如何使用Java进行嵌入式开发,并能够顺利开发WinCE应用程序。 ... [详细]
  • r2dbc配置多数据源
    R2dbc配置多数据源问题根据官网配置r2dbc连接mysql多数据源所遇到的问题pom配置可以参考官网,不过我这样配置会报错我并没有这样配置将以下内容添加到pom.xml文件d ... [详细]
  • Imtryingtofigureoutawaytogeneratetorrentfilesfromabucket,usingtheAWSSDKforGo.我正 ... [详细]
  • IjustinheritedsomewebpageswhichusesMooTools.IneverusedMooTools.NowIneedtoaddsomef ... [详细]
author-avatar
手机用户2502930273
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有