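2.4 Relational Modeling and Dimensional Modeling
Relational model
The relational model is used mainly in OLTP systems. To keep data consistent and avoid redundancy, most business-system tables follow third normal form (3NF).
Dimensional model
The dimensional model is used mainly in OLAP systems. Although the relational model has little redundancy, analytical and statistical queries over large data volumes span many tables, and the resulting multi-table joins greatly reduce execution efficiency.
The related tables are therefore reorganized into two kinds: fact tables and dimension tables. The dimension tables are arranged around the fact table and describe it.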
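To make the fact/dimension split concrete, the sketch below resolves a tiny fact table against one dimension table and aggregates by a dimension attribute. The table contents, the column layout, and the function name sales_by_category are invented for illustration; only awk and sort are assumed to be available.

```shell
#!/bin/sh
# Hypothetical mini star schema, stored as comma-separated text.
# Dimension table: sku_id,category_name
dim_sku='1,phone
2,phone
3,laptop'

# Fact table: order_id,sku_id,amount (each row references a dimension key)
fact_order='1001,1,100
1002,3,250
1003,2,50
1004,3,150'

# Look up every fact row in the dimension table, then sum the amount
# per category. Dimension rows are read first and kept in an awk array
# keyed by sku_id.
sales_by_category() {
    { printf '%s\n' "$dim_sku"; printf '%s\n' '---'; printf '%s\n' "$fact_order"; } |
    awk -F',' '
        $0 == "---" { in_fact = 1; next }        # separator between the two tables
        !in_fact    { category[$1] = $2; next }  # load dimension rows
                    { total[category[$2]] += $3 } # aggregate fact rows
        END         { for (c in total) printf "%s,%d\n", c, total[c] }
    ' | sort
}

sales_by_category
```

The fact rows carry only keys and measures; the descriptive attribute (the category name) lives in the dimension table and is attached at query time.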
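OLAP vs. OLTP
Snowflake, star, and constellation schemas
Dimensional modeling is further divided into three schemas: star, snowflake, and constellation.
Chapter 3 Building the Data Warehouse
3.0 Configuring Hadoop to Support Snappy Compression
1) Extract the Hadoop jar package compiled with Snappy support, upload all files in its lib/native directory to /opt/module/hadoop-2.7.2/lib/native on hadoop102, and distribute them to hadoop103 and hadoop104.
2) Restart Hadoop.
3) Check the supported compression codecs: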
[kgg@hadoop102 native]$ hadoop checknative
hadoop: true /opt/module/hadoop-2.7.2/lib/native/libhadoop.so
zlib: true /lib64/libz.so.1
snappy: true /opt/module/hadoop-2.7.2/lib/native/libsnappy.so.1
lz4: true revision:99
bzip2: false
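The check above can also be scripted, for example before starting a job that depends on Snappy. The following is a minimal sketch that reads the checknative output from standard input; the function name native_lib_enabled is hypothetical, and the parsing assumes the "name: true/false" line layout shown above.

```shell
#!/bin/sh
# Reads `hadoop checknative` output on stdin and succeeds only when the
# named library is reported as "true".
native_lib_enabled() {
    awk -v lib="$1" '
        $1 == (lib ":") { found = 1; ok = ($2 == "true") }
        END { exit !(found && ok) }
    '
}

# Usage on a cluster node (needs the hadoop command on PATH):
#   hadoop checknative 2>/dev/null | native_lib_enabled snappy && echo "snappy OK"
```

With the output listed above, the check succeeds for snappy and fails for bzip2.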
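3.1 Generating Business Data
3.1.1 Table Creation Statements
1) Create the database gmall with SQLyog.
2) Set the database encoding.
3) Import the table creation statements (script 1, the table-creation script).
4) Repeat the import procedure from step 3 to import, in order: script 2 (product category data inserts), script 3 (functions), and script 4 (stored procedures).
3.1.2 Generating Business Data
1) Description of the data-generation procedure
init_data ( do_date_string VARCHAR(20) , order_incr_num INT, user_incr_num INT , sku_num INT , if_truncate BOOLEAN ):
Parameter 1: do_date_string, the date of the generated data
Parameter 2: order_incr_num, the number of order ids
Parameter 3: user_incr_num, the number of user ids
Parameter 4: sku_num, the number of product SKUs
Parameter 5: if_truncate, whether to delete existing data
2) Test case:
(1) Requirement: generate data dated February 10, 2019 with 1000 orders, 200 users, and 300 product SKUs, deleting the original data.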
CALL init_data('2019-02-10',1000,200,300,TRUE);
(2) Query the generated data
SELECT * FROM base_category1;
SELECT * FROM base_category2;
SELECT * FROM base_category3;
SELECT * FROM order_info;
SELECT * FROM order_detail;
SELECT * FROM sku_info;
SELECT * FROM user_info;
SELECT * FROM payment_info;
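When data has to be generated for several dates, typing the CALL statement by hand becomes error-prone. The sketch below builds the statement from shell arguments and rejects a malformed date or truncate flag; the helper name build_init_data_call is hypothetical, and the argument order follows the init_data signature described above.

```shell
#!/bin/sh
# Builds the CALL statement for the init_data procedure.
# Arguments: date (YYYY-MM-DD), order count, user count, sku count,
# and TRUE/FALSE for deleting existing data first.
build_init_data_call() (
    do_date=$1; orders=$2; users=$3; skus=$4; truncate=$5
    # Reject anything that is not shaped like YYYY-MM-DD.
    case $do_date in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) echo "bad date: $do_date" >&2; exit 1 ;;
    esac
    case $truncate in
        TRUE|FALSE) ;;
        *) echo "if_truncate must be TRUE or FALSE" >&2; exit 1 ;;
    esac
    printf "CALL init_data('%s',%d,%d,%d,%s);\n" \
        "$do_date" "$orders" "$users" "$skus" "$truncate"
)

build_init_data_call 2019-02-10 1000 200 300 TRUE
```

The printed statement can be pasted into SQLyog. Piping it to the mysql command-line client against the gmall database would also work, assuming that client is installed, which the steps above do not require.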
3.2 Importing Business Data into the Data Warehouse
3.2.1 Installing Sqoop
See the companion document 尚硅谷大数据技术之Sqoop.
3.2.2 Sqoop Import Command
/opt/module/sqoop/bin/sqoop import
--connect
--username
--password
--target-dir
--delete-target-dir
--num-mappers
--fields-terminated-by
--query "$2"' and $CONDITIONS;'
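The option list above is easier to follow with values filled in. The following dry-run sketch assembles the full argument list for one table and prints it, one argument per line, instead of executing it. The connection values are borrowed from the script in section 3.2.4; the function name sqoop_import_args and the tab field delimiter are assumptions made for this example.

```shell
#!/bin/sh
# Prints the sqoop import command for one table, one argument per line,
# instead of running it (a dry run). Replace the final printf with
# "$@" to execute the command for real.
sqoop_import_args() (
    table=$1; query=$2; day=$3
    # $CONDITIONS stays in single quotes so the shell passes it to
    # Sqoop literally instead of expanding it.
    set -- /opt/module/sqoop/bin/sqoop import \
        --connect "jdbc:mysql://hadoop102:3306/gmall" \
        --username root \
        --password 000000 \
        --target-dir "/origin_data/gmall/db/$table/$day" \
        --delete-target-dir \
        --num-mappers 1 \
        --fields-terminated-by '\t' \
        --query "$query"' and $CONDITIONS;'
    printf '%s\n' "$@"
)

sqoop_import_args sku_info "select id, sku_name from sku_info where 1=1" 2019-02-10
```

Because a free-form --query is used, the query must contain the $CONDITIONS placeholder and an explicit --target-dir; the "where 1=1" clause exists only so that " and $CONDITIONS" can always be appended.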
3.2.3 Table Analysis
3.2.4 Scheduled Sqoop Import Script
1) Create the script sqoop_import.sh in the /home/kgg/bin directory
[kgg@hadoop102 bin]$ vim sqoop_import.sh
Add the following content to the script:
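#!/bin/bash
# Usage: sqoop_import.sh <table name | all> <date YYYY-MM-DD>
db_date=$2
echo "$db_date"
db_name=gmall
# import_data <table> <query>: import one query result into HDFS.
# Fields are tab-separated, since values such as timestamps contain spaces.
import_data() {
/opt/module/sqoop/bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/$db_name \
--username root \
--password 000000 \
--target-dir /origin_data/$db_name/db/$1/$db_date \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query "$2"' and $CONDITIONS;'
}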
import_sku_info(){
import_data "sku_info" "select
id, spu_id, price, sku_name, sku_desc, weight, tm_id,
category3_id, create_time
from sku_info where 1=1"
}
import_user_info(){
import_data "user_info" "select
id, name, birthday, gender, email, user_level,
create_time
from user_info where 1=1"
}
import_base_category1(){
import_data "base_category1" "select
id, name from base_category1 where 1=1"
}
import_base_category2(){
import_data "base_category2" "select
id, name, category1_id from base_category2 where 1=1"
}
import_base_category3(){
import_data "base_category3" "select id, name, category2_id from base_category3 where 1=1"
}
import_order_detail(){
import_data "order_detail" "select
od.id,
order_id,
user_id,
sku_id,
sku_name,
order_price,
sku_num,
o.create_time
from order_info o, order_detail od
where o.id=od.order_id
and DATE_FORMAT(o.create_time,'%Y-%m-%d')='$db_date'"
}
import_payment_info(){
import_data "payment_info" "select
id,
out_trade_no,
order_id,
user_id,
alipay_trade_no,
total_amount,
subject,
payment_type,
payment_time
from payment_info
where DATE_FORMAT(payment_time,'%Y-%m-%d')='$db_date'"
}
import_order_info(){
import_data "order_info" "select
id,
total_amount,
order_status,
user_id,
payment_way,
out_trade_no,
create_time,
operate_time
from order_info
where (DATE_FORMAT(create_time,'%Y-%m-%d')='$db_date' or DATE_FORMAT(operate_time,'%Y-%m-%d')='$db_date')"
}
case $1 in
"base_category1")
import_base_category1
;;
"base_category2")
import_base_category2
;;
"base_category3")
import_base_category3
;;
"order_info")
import_order_info
;;
"order_detail")
import_order_detail
;;
"sku_info")
import_sku_info
;;
"user_info")
import_user_info
;;
"payment_info")
import_payment_info
;;
"all")
import_base_category1
import_base_category2
import_base_category3
import_order_info
import_order_detail
import_sku_info
import_user_info
import_payment_info
;;
esac
2) Make the script executable
[kgg@hadoop102 bin]$ chmod 777 sqoop_import.sh
3) Run the script to import the data
[kgg@hadoop102 bin]$ sqoop_import.sh all 2019-02-10
4) Generate the data for February 11, 2019 in SQLyog
CALL init_data('2019-02-11',1000,200,300,TRUE);
5) Run the script to import the data
[kgg@hadoop102 bin]$ sqoop_import.sh all 2019-02-11
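The runs above always pass the date explicitly. For a scheduled run it can help to fall back to the previous day when no date is given and to reject a malformed one. The following sketch shows one way to do that; the function name resolve_import_date is hypothetical, and the fallback relies on the GNU date option -d, which is assumed to be available on the Linux hosts used here.

```shell
#!/bin/sh
# Picks the import date: the explicit argument when given, otherwise
# yesterday. `date -d` is a GNU coreutils extension.
resolve_import_date() (
    if [ -n "$1" ]; then
        day=$1
    else
        day=$(date -d '-1 day' +%F) || exit 1
    fi
    # Accept only values shaped like YYYY-MM-DD.
    case $day in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) printf '%s\n' "$day" ;;
        *) echo "expected YYYY-MM-DD, got: $day" >&2; exit 1 ;;
    esac
)

resolve_import_date 2019-02-11
```

In sqoop_import.sh this could replace the plain assignment, for example db_date=$(resolve_import_date "$2") || exit 1, so that a mistyped date stops the script before any target directory is deleted.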
3.2.5 Handling Sqoop Import Exceptions
1) Problem: the following exception occurs when running the Sqoop import script
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@65d6b83b is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:930)
at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2646)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1861)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2101)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2548)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2477)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1422)
at com.mysql.jdbc.ConnectionImpl.getMaxBytesPerChar(ConnectionImpl.java:2945)
at com.mysql.jdbc.Field.getMaxBytesPerCharacter(Field.java:582)
2) Solution: add an import parameter to the sqoop import command. A commonly used fix for this exception is to name the MySQL JDBC driver class explicitly:
--driver com.mysql.jdbc.Driver
Page updated: 2024-04-24