Apache Sqoop 1.4.5 Released

Author: vivianchen1988 | Source: Internet | 2014-08-14 13:51

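Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database (for example MySQL, Oracle, or Postgres) into Hadoop's HDFS, and it can export data from HDFS back into a relational database.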
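As a rough illustration of the two directions described above, the following Python sketch assembles the argument lists for a `sqoop import` and a `sqoop export` invocation. The helper names `build_import_cmd` and `build_export_cmd`, the JDBC URL, the table name, the HDFS paths, and the user name are all hypothetical and chosen only for this example. Authentication is deliberately left out; a real job would also need a password option such as `--password-file`.

```python
import shlex
from typing import List


def build_import_cmd(jdbc_url: str, username: str, table: str, target_dir: str) -> List[str]:
    """Argument list for copying one relational table into an HDFS directory."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,
    ]


def build_export_cmd(jdbc_url: str, username: str, table: str, export_dir: str) -> List[str]:
    """Argument list for writing files from an HDFS directory back into a table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--export-dir", export_dir,
    ]


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    url = "jdbc:mysql://db.example.com/shop"
    for cmd in (
        build_import_cmd(url, "etl", "orders", "/data/orders"),
        build_export_cmd(url, "etl", "orders_summary", "/data/orders_summary"),
    ):
        # Print a shell-safe command line instead of running it,
        # so the sketch works on a machine without Sqoop installed.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so the same list can be passed to `subprocess.run` on a host where Sqoop is installed without any shell quoting issues.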

Apache Sqoop 1.4.5 has been released. It is the fourth release since Sqoop became an Apache top-level project (TLP).

Sub-tasks

[SQOOP-1194] - Make changes to Sqoop build file to enable Netezza third party tests
[SQOOP-1323] - Update HCatalog version to 0.13 in Sqoop builds
[SQOOP-1324] - Support new hive datatypes in Sqoop hcatalog integration
[SQOOP-1325] - Make hcatalog object names escaped during creation so that reserved words are properly processed
[SQOOP-1326] - Support multiple static partition keys for better integration support
[SQOOP-1357] - QA testing of Data Connector for Oracle and Hadoop
[SQOOP-1363] - Document Hcatalog integration enhancements introduced in SQOOP-1322

Bug fixes

[SQOOP-585] - Bug when sqoop a join of two tables with the same column name with mysql backend
[SQOOP-832] - Document --columns argument usage in export tool
[SQOOP-1032] - Add the --bulk-load-dir option to support the HBase doBulkLoad function
[SQOOP-1107] - Further improve error reporting when exporting malformed data
[SQOOP-1117] - when failed to import a non-existing table, the failure information includes NullPointerException
[SQOOP-1138] - incremental lastmodified should re-use output directory
[SQOOP-1167] - Enhance HCatalog support to allow direct mode connection manager implementations
[SQOOP-1170] - Can't import columns with name "public"
[SQOOP-1179] - Incorrect warning saying --hive-import was not specified when it was specified
[SQOOP-1185] - LobAvroImportTestCase is sensitive to test method order execution
[SQOOP-1190] - Class HCatHadoopShims will be removed in HCatalog 0.12
[SQOOP-1192] - Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib
[SQOOP-1209] - DirectNetezzaManager fails to find tables from older Netezza system catalogs
[SQOOP-1216] - Improve error message on corrupted input while doing export
[SQOOP-1224] - Enable use of Oracle Wallets with Oracle Manager
[SQOOP-1226] - --password-file option triggers FileSystemClosed exception at end of Oozie action
[SQOOP-1227] - Sqoop fails to compile against commons-io higher then 1.4
[SQOOP-1228] - Method Configuration#unset is not available on Hadoop <1.2.0
[SQOOP-1239] - Sqoop import code too large error
[SQOOP-1246] - HBaseImportJob should add job authtoken only if HBase is secured
[SQOOP-1249] - Sqoop HCatalog Import fails with -queries because of validation issues
[SQOOP-1250] - Oracle connector is not disabling autoCommit on created connections
[SQOOP-1259] - Sqoop on Windows can't run HCatalog/HBase multinode jobs
[SQOOP-1260] - HADOOP_MAPRED_HOME should be defaulted correctly
[SQOOP-1261] - CompilationManager should add Hadoop 2.x libraries to the classpath under Hadoop 2.x
[SQOOP-1268] - Sqoop tarballs do not contain .gitignore and .gitattribute files
[SQOOP-1271] - Sqoop hcatalog location should support older bigtop default location also
[SQOOP-1273] - Multiple append jobs can easily end up sharing directories
[SQOOP-1278] - Allow use of uncommitted isolation for databases that support it as an import option
[SQOOP-1279] - Sqoop connection resiliency option breaks older Mysql versions that don't have JDBC 4 methods
[SQOOP-1302] - Doesn't run the mapper for remaining splits, when split-by ROWNUM
[SQOOP-1303] - Can only write to default file system on incremental import
[SQOOP-1316] - Example for use of password file in docs is incorrect
[SQOOP-1322] - Enhance Sqoop HCatalog Integration to cover features introduced in newer Hive versions
[SQOOP-1329] - JDBC connection to Oracle timeout after data import but before hive metadata import
[SQOOP-1339] - Synchronize .gitignore files
[SQOOP-1353] - Sqoop 1.4.5 release preparation
[SQOOP-1358] - Add wallet support for Oracle High performance connector
[SQOOP-1359] - Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2
[SQOOP-1362] - TestImportJob getContent method doesn't work
[SQOOP-1365] - Do not print stack trace when we can't move generated .java file to CWD
[SQOOP-1370] - AccumuloUtils can throw NPE when zookeeper or accumulo home is null
[SQOOP-1372] - configure-sqoop does not export ZOOKEEPER_HOME
[SQOOP-1398] - Upgrade ivy version used to the latest release version
[SQOOP-1399] - Fix TestOraOopJdbcUrl test case
[SQOOP-1406] - Add license headers
[SQOOP-1410] - Update change log for 1.4.5

Improvements

[SQOOP-435] - Avro import should write the Schema to a file
[SQOOP-1056] - Implement connection resiliency in Sqoop using pluggable failure handlers
[SQOOP-1132] - Print out Sqoop version into log during execution
[SQOOP-1137] - Put a stress in the user guide that eval tool is meant for evaluation purpose only
[SQOOP-1161] - Generated Delimiter Set Field Should be Static
[SQOOP-1172] - Make Sqoop compatible with HBase 0.95+
[SQOOP-1203] - Add another default case for finding *_HOME when not explicitly defined
[SQOOP-1212] - Do not print usage on wrong command line
[SQOOP-1213] - Support reading password files from Amazon S3
[SQOOP-1223] - Enhance the password file capability to enable plugging-in custom loaders
[SQOOP-1282] - Consider avro files even if they carry no extension
[SQOOP-1321] - Add ability to serialize SqoopOption into JobConf
[SQOOP-1337] - Doc refactoring - Consolidate documentation of --direct
[SQOOP-1341] - Sqoop Export Upsert for MySQL lacks batch support
[SQOOP-1373] - Sqoop import schema is locked shows NullPointerException

New features

[SQOOP-767] - Add support for Accumulo
[SQOOP-1051] - Support direct mode connection managers in a generalized fashion
[SQOOP-1197] - Enable Sqoop to build against Hadoop-2.1.0-beta jar files
[SQOOP-1287] - Add high performance Oracle connector into Sqoop

Tasks

[SQOOP-1207] - Allow user to override java source version
[SQOOP-1344] - Add documentation for Oracle connector
[SQOOP-1408] - Document SQL Server's --non-resilient arg

Tests

[SQOOP-1057] - Introduce fault injection framework to test connection resiliency
