Pig vs. Hive
-----------------------------------
| Apache Pig | Hive |
|---|---|
| Apache Pig uses a language called Pig Latin. It was originally created at Yahoo. | Hive uses a language called HiveQL. It was originally created at Facebook. |
| Pig Latin is a data flow language. | HiveQL is a query processing language. |
| Pig Latin is a procedural language and fits the pipeline paradigm. | HiveQL is a declarative language. |
| Apache Pig can handle structured, unstructured, and semi-structured data. | Hive is mostly for structured data. |
Pig execution modes
-----------------------------------
1. local
All files are on the local file system; used for testing.
2. mapreduce (the default)
The data is stored in HDFS.
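A minimal sketch of choosing between the two modes with Pig's `-x` flag. `PIG_MODE` and `PIG_ARGS` are hypothetical variables made up for this note; Pig itself does not read them. The command is only printed, not run, because starting `pig` without a script opens the Grunt shell.

```shell
# PIG_MODE is a hypothetical variable for this sketch; default to local for testing.
PIG_MODE="${PIG_MODE:-local}"

case "$PIG_MODE" in
  local)     PIG_ARGS="-x local" ;;      # files on the local file system
  mapreduce) PIG_ARGS="-x mapreduce" ;;  # data in HDFS, needs a Hadoop cluster
  *)         echo "unsupported mode: $PIG_MODE" >&2; PIG_ARGS="" ;;
esac

# Print the command instead of running it.
echo "pig $PIG_ARGS"
```

When `-x` is omitted, Pig runs in mapreduce mode.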
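Pig run modes
-----------------------------------
1. Interactive mode (Grunt shell)
Enter a statement, Pig executes it, and the output is shown.
2. Batch mode
Write the Pig Latin statements in a script file with the .pig extension.
3. Embedded mode
Write UDFs and use them in scripts.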
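Batch mode could look roughly like the sketch below. The file names `users.txt` and `users.pig`, the sample rows, and the `WORKDIR` variable are all made up for illustration. The script is only executed when `pig` is found on PATH, and it uses local mode so no cluster is needed.

```shell
# Scratch directory; every file name below is made up for this sketch.
WORKDIR="$(mktemp -d)"

# A tiny comma-separated input file.
printf '1,alice\n2,bob\n' > "$WORKDIR/users.txt"

# Batch mode: the Pig Latin statements live in a file with the .pig extension.
# The heredoc is unquoted so the shell expands $WORKDIR into an absolute path.
cat > "$WORKDIR/users.pig" <<EOF
users = LOAD '$WORKDIR/users.txt' USING PigStorage(',') AS (id:int, name:chararray);
names = FOREACH users GENERATE name;
DUMP names;
EOF

# Run the script in local mode only if pig is installed.
if command -v pig >/dev/null 2>&1; then
  pig -x local "$WORKDIR/users.pig"
else
  echo "pig not found; script written to $WORKDIR/users.pig"
fi
```

The same three statements can be typed one at a time at the `grunt>` prompt, which is the interactive mode from item 1.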
Installing Pig
-----------------------------------
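1. Download Pig
Run these in /usr/local, since PIG_HOME below points to /usr/local/pig.
wget https://mirrors.tuna.tsinghua.edu.cn/apache/pig/latest/pig-0.16.0.tar.gz
tar -zxvf pig-0.16.0.tar.gz
ln -s pig-0.16.0 pig
2. Configure ~/.bashrc
vi ~/.bashrc
Add these two lines (append to PATH rather than overwriting it):
export PIG_HOME=/usr/local/pig
export PATH=$PATH:$PIG_HOME/bin
Then reload the file:
source ~/.bashrc
3. Verify
pig -version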
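If `pig -version` reports "command not found", a quick check along these lines may help narrow it down. It is only a sketch: `PIG_ON_PATH` is a hypothetical helper variable, and `PIG_HOME` falls back to the /usr/local/pig path from step 2 when it is not set.

```shell
# PIG_HOME falls back to the path used in step 2 when it is not set.
PIG_HOME="${PIG_HOME:-/usr/local/pig}"

# Check whether $PIG_HOME/bin is one of the PATH entries.
case ":$PATH:" in
  *":$PIG_HOME/bin:"*) PIG_ON_PATH=yes ;;
  *)                   PIG_ON_PATH=no ;;
esac
echo "PIG_HOME/bin on PATH: $PIG_ON_PATH"

# Only ask for the version when the pig launcher can actually be found.
if command -v pig >/dev/null 2>&1; then
  pig -version
else
  echo "pig not found; re-check PIG_HOME and PATH in ~/.bashrc"
fi
```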
References:
http://pig.apache.org/docs/r0.16.0/
https://www.tutorialspoint.com/apache_pig/index.htm