热门标签 | HotTags
当前位置:  开发笔记 > 编程语言 > 正文

mysqldatatransfer_DataTransferBySqoop2

本文主要描述了HDFS和RDBMS数据库之间的数据迁移。一、创建Link这里主要描述下Oracle、PostgreSQL、MySQL以及HDFS的link创建。1.1JDBC驱动安

本文主要描述了HDFS和RDBMS数据库之间的数据迁移。

一、创建Link

这里主要描述下Oracle、PostgreSQL、MySQL以及HDFS的link创建。

1.1 JDBC驱动安装

从官网下载Oracle和PostgreSQL的jdbc驱动,拷贝到$SQOOP_HOME/server/lib,然后重启sqoop2服务器。我这里用到的Oracle jdbc驱动为ojdbc6.jar,PostgreSQL为postgresql-42.1.4.jar,MySQL为mysql-connector-java-5.1.44-bin.jar。

1.2 创建HDFS Link

sqoop:000> create link -c hdfs-connector

Creating link for connector with name hdfs-connector

Please fill following values to create new link object

Name: HDFS_Link

HDFS cluster

URI: hdfs://192.168.120.96:9000

Conf directory: /u01/hadoop/etc/hadoop

Additional configs::

There are currently 0 values in the map:

entry#

New link was successfully created with validation status OK and name HDFS_Link

1.3 创建MySQL Link

sqoop:000> create link -c generic-jdbc-connector

Creating link for connector with name generic-jdbc-connector

Please fill following values to create new link object

Name: MySQL_Link

Database connection

Driver class: com.mysql.jdbc.Driver

Connection String: jdbc:mysql://192.168.120.92:3306/smsqw?useSSL=false

Username: root

Password: *********

Fetch Size:

Connection Properties:

There are currently 0 values in the map:

entry#

SQL Dialect

Identifier enclose:

New link was successfully created with validation status OK and name MySQL_Link

1.4 创建PostgreSQL link

sqoop:000> create link -c generic-jdbc-connector

Creating link for connector with name generic-jdbc-connector

Please fill following values to create new link object

Name: PostgreSQL_Link

Database connection

Driver class: org.postgresql.Driver

Connection String: jdbc:postgresql://192.168.120.93:5432/rhndb

Username: rhnuser

Password: ******

Fetch Size:

Connection Properties:

There are currently 0 values in the map:

entry#

SQL Dialect

Identifier enclose:

New link was successfully created with validation status OK and name PostgreSQL_Link

1.5 创建Oracle Link

sqoop:000> create link -c generic-jdbc-connector

Creating link for connector with name generic-jdbc-connector

Please fill following values to create new link object

Name: Oracle_link

Database connection

Driver class: oracle.jdbc.driver.OracleDriver

Connection String: jdbc:oracle:thin:@192.168.120.121:1521:rhndb

Username: spwuser

Password: ******

Fetch Size:

Connection Properties:

There are currently 0 values in the map:

entry#

SQL Dialect

Identifier enclose:

New link was successfully created with validation status OK and name Oracle_link

> 注意:关系型数据库在Identifier enclose这里,必须先空格再回车。

b56118559fc0d81302b294c99d1fd31c.png

二、创建关系型数据库到HDFS作业

2.1 MySQL To HDFS作业

sqoop:000> create job -f MySQL_Link -t HDFS_Link

Creating job for links with from name MySQL_Link and to name HDFS_Link

Please fill following values to create new job object

Name: MySQL_To_HDFS

Database source

Schema name: smsqw

Table name: tbMessage

SQL statement:

Column names:

There are currently 0 values in the list:

element#

Partition column:

Partition column nullable:

Boundary query:

Incremental read

Check column:

Last value:

Target configuration

Override null value:

Null value:

File format:

0 : TEXT_FILE

1 : SEQUENCE_FILE

2 : PARQUET_FILE

Choose: 0

Compression codec:

0 : NONE

1 : DEFAULT

2 : DEFLATE

3 : GZIP

4 : BZIP2

5 : LZO

6 : LZ4

7 : SNAPPY

8 : CUSTOM

Choose: 1

Custom codec:

Output directory: /user/DataSource

Append mode:

Throttling resources

Extractors:

Loaders:

Classpath configuration

Extra mapper jars:

There are currently 0 values in the list:

element#

New job was successfully created with validation status OK and name MySQL_To_HDFS

2.2 PostgreSQL To HDFS作业

sqoop:000> create job -f PostgreSQL_Link -t HDFS_Link

Creating job for links with from name PostgreSQL_Link and to name HDFS_Link

Please fill following values to create new job object

Name: PostgreSQL_to_HDFS

Database source

Schema name: rhndb

Table name: rhnpackagechangelogdata

SQL statement:

Column names:

There are currently 0 values in the list:

element#

Partition column: id --如果表没有主键,就指定一个字段

Partition column nullable:

Boundary query:

Incremental read

Check column:

Last value:

Target configuration

Override null value:

Null value:

File format:

0 : TEXT_FILE

1 : SEQUENCE_FILE

2 : PARQUET_FILE

Choose: 0

Compression codec:

0 : NONE

1 : DEFAULT

2 : DEFLATE

3 : GZIP

4 : BZIP2

5 : LZO

6 : LZ4

7 : SNAPPY

8 : CUSTOM

Choose: 1

Custom codec:

Output directory: /user/DataSource/PostgreSQL/RHNPACKAGECHANGELOGDATA

Append mode:

Throttling resources

Extractors:

Loaders:

Classpath configuration

Extra mapper jars:

There are currently 0 values in the list:

element#

New job was successfully created with validation status OK and name PostgreSQL_to_HDFS

2.3 创建Oracle To HDFS作业

sqoop:000> create job -f Oracle_link -t HDFS_Link

Creating job for links with from name Oracle_link and to name HDFS_Link

Please fill following values to create new job object

Name: Oracle_To_HDFS

Database source

Schema name: spwuser

Table name: RHNCHANNELPACKAGE

SQL statement:

Column names:

There are currently 0 values in the list:

element#

Partition column:

Partition column nullable:

Boundary query:

Incremental read

Check column:

Last value:

Target configuration

Override null value:

Null value:

File format:

0 : TEXT_FILE

1 : SEQUENCE_FILE

2 : PARQUET_FILE

Choose: 0

Compression codec:

0 : NONE

1 : DEFAULT

2 : DEFLATE

3 : GZIP

4 : BZIP2

5 : LZO

6 : LZ4

7 : SNAPPY

8 : CUSTOM

Choose: 1

Custom codec:

Output directory: /user/DataSource/Oracle/RHNCHANNELPACKAGE

Append mode:

Throttling resources

Extractors:

Loaders:

Classpath configuration

Extra mapper jars:

There are currently 0 values in the list:

element#

New job was successfully created with validation status OK and name Oracle_To_HDFS

> 注意:如果表没有主键,则需在"Partition column: "处指定一个字段。

三、创建HDFS To 关系型数据库作业

3.1 HDFS To MySQL作业

sqoop:000> create job -f HDFS_Link -t MySQL_Link

Creating job for links with from name HDFS_Link and to name MySQL_Link

Please fill following values to create new job object

Name: HDFS_To_MySQL

Input configuration

Input directory: /user/DataSource/MySQL/tbDeliverMsg

Override null value:

Null value:

Incremental import

Incremental type:

0 : NONE

1 : NEW_FILES

Choose: 0

Last imported date:

Database target

Schema name: smsqw

Table name: tbDeliverMsg2

Column names:

There are currently 0 values in the list:

element#

Staging table:

Clear stage table:

Throttling resources

Extractors:

Loaders:

Classpath configuration

Extra mapper jars:

There are currently 0 values in the list:

element#

New job was successfully created with validation status OK and name HDFS_To_MySQL

3.2 HDFS To PostgreSQL作业

sqoop:000> create job -f HDFS_Link -t PostgreSQL_Link

Creating job for links with from name HDFS_Link and to name PostgreSQL_Link

Please fill following values to create new job object

Name: HDFS_To_PostgreSQL

Input configuration

Input directory: /user/DataSource/PostgreSQL/RHNPACKAGECHANGELOGDATA

Override null value:

Null value:

Incremental import

Incremental type:

0 : NONE

1 : NEW_FILES

Choose: 0

Last imported date:

Database target

Schema name: public

Table name: rhnpackagechangelogdata

Column names:

There are currently 0 values in the list:

element#

Staging table:

Clear stage table:

Throttling resources

Extractors:

Loaders:

Classpath configuration

Extra mapper jars:

There are currently 0 values in the list:

element#

New job was successfully created with validation status OK and name HDFS_To_PostgreSQL

3.3 HDFS To Oracle作业

sqoop:000> create job -f HDFS_Link -t Oracle_link

Creating job for links with from name HDFS_Link and to name Oracle_link

Please fill following values to create new job object

Name: HDFS_To_Oracle

Input configuration

Input directory: /user/DataSource/Oracle/QRTZ_CRON_TRIGGERS

Override null value:

Null value:

Incremental import

Incremental type:

0 : NONE

1 : NEW_FILES

Choose: 0

Last imported date:

Database target

Schema name: spwuser

Table name: QRTZ_CRON_TRIGGERS

Column names:

There are currently 0 values in the list:

element#

Staging table:

Clear stage table:

Throttling resources

Extractors:

Loaders:

Classpath configuration

Extra mapper jars:

There are currently 0 values in the list:

element#

New job was successfully created with validation status OK and name create job -f HDFS_Link -t Oracle_link

9d585218ddbed067a74cceaf306df766.png

另外,可以查看指定作业的详细信息,如下:

aa261495f690b43bd537c92266505366.png

四、运行作业

4.1 MySQL_To_HDFS

cb7a8f1759bcc1e9bed2305db21a0908.png

4.2 PostgreSQL_To_HDFS

a8e30197f843040cfcad1fd037bd54a8.png

4.3 Oracle_To_HDFS

faad6b9c8f44ca291a75a0d6f764b333.png

4.4 HDFS_To_MySQL

004e88da129969c40ba18cda4a705805.png

4.5 HDFS_To_PostgreSQL

c11f9cbdb5aa16bede5e8589fd1b1a6c.png

4.6 HDFS_To_Oracle

b0627989565e950b682c7a1861ec0d12.png

参考文献:

1、Sqoop2 User Guide



推荐阅读
author-avatar
85中互动_715
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有