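The Oozie architecture diagram shows that, for every task, Oozie creates the corresponding job client and submits the task through that client.
Custom development on Oozie is concentrated in the Oozie server. The official site does provide a custom action example: http://oozie.apache.org/docs/4.2.0/DG_CustomActionExecutor.html , but understanding the code structure of the Oozie project helps with both custom development and debugging. Custom development takes roughly three steps:
a. Design the workflow and its XSD schema file
b. Write the action code
c. Deploy the action JAR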
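As a rough illustration of step a, a workflow that calls a custom action might look like the fragment below. The action element `demo`, its namespace `uri:custom:demo-action:0.1`, the child element `message`, and the node names are all hypothetical names chosen for this sketch; the real names are whatever the action's own XSD schema defines.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="custom-action-demo-wf">
    <start to="demo-node"/>
    <action name="demo-node">
        <!-- hypothetical custom action: the element name and namespace
             must match the XSD schema shipped with the action JAR -->
        <demo xmlns="uri:custom:demo-action:0.1">
            <message>hello from a custom action</message>
        </demo>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Custom action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Only the content of the `<action>` node is specific to the custom action; the surrounding `start`, `ok`, `error`, `kill` and `end` elements are the standard workflow elements.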
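Oozie's service architecture is shown in the figure above. The whole server program is started by Tomcat; the interfaces for all services and events live in org.apache.oozie.servlet, and all service singletons are started by org.apache.oozie.servlet.ServicesLoader, as the figure below shows:
The org.apache.oozie.service package mainly contains the implementations of the singleton services that manage each piece of functionality.
(First, how to deploy a custom action.)
To work out how a custom action is deployed, I started by analyzing the log of Oozie's startup script. I start Oozie through Ambari, and the startup log contains the section below; I triggered this error on purpose:
// Unpack the oozie.war file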
u"Execute['cd /var/tmp/oozie && /usr/hdp/current/oozie-server/bin/oozie-setup.sh prepare-war']" {'not_if': 'ls /datas/hadoop/hdp/oozie/run/oozie.pid >/dev/null 2>&1 && ps -p `cat /datas/hadoop/hdp/oozie/run/oozie.pid` >/dev/null 2>&1', 'user': 'oozie'}
2015-08-28 09:47:54,538 - Error while executing command 'restart':
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
method(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 371, in restart
self.start(env)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_server.py", line 60, in start
self.configure(env)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_server.py", line 53, in configure
oozie(is_server=True)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie.py", line 101, in oozie
oozie_server_specific()
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie.py", line 193, in oozie_server_specific
not_if = no_op_test
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 274, in action_run
raise ex
Fail: Execution of 'cd /var/tmp/oozie && /usr/hdp/current/oozie-server/bin/oozie-setup.sh prepare-war' returned 255. setting OOZIE_CONFIG=${OOZIE_CONFIG:-/etc/oozie/conf}
setting CATALINA_BASE=${CATALINA_BASE:-/usr/hdp/current/oozie-server/oozie-server}
setting CATALINA_TMPDIR=${CATALINA_TMPDIR:-/var/tmp/oozie}
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting JAVA_HOME=/usr/local/jdk
setting JRE_HOME=${JAVA_HOME}
setting OOZIE_LOG=/datas/hadoop/hdp/oozie/log
setting CATALINA_PID=/datas/hadoop/hdp/oozie/run/oozie.pid
setting OOZIE_DATA=/datas/hadoop/hdp/oozie/data
setting OOZIE_HTTP_PORT=11000
setting OOZIE_ADMIN_PORT=11001
setting JAVA_LIBRARY_PATH=/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
setting OOZIE_CLIENT_OPTS="${OOZIE_CLIENT_OPTS} -Doozie.connection.retry.count=5 "
setting CATALINA_OPTS="${CATALINA_OPTS} -Xmx2048m -XX:MaxPermSize=256m "
setting OOZIE_CONFIG=${OOZIE_CONFIG:-/etc/oozie/conf}
setting CATALINA_BASE=${CATALINA_BASE:-/usr/hdp/current/oozie-server/oozie-server}
setting CATALINA_TMPDIR=${CATALINA_TMPDIR:-/var/tmp/oozie}
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting JAVA_HOME=/usr/local/jdk
setting JRE_HOME=${JAVA_HOME}
setting OOZIE_LOG=/datas/hadoop/hdp/oozie/log
setting CATALINA_PID=/datas/hadoop/hdp/oozie/run/oozie.pid
setting OOZIE_DATA=/datas/hadoop/hdp/oozie/data
setting OOZIE_HTTP_PORT=11000
setting OOZIE_ADMIN_PORT=11001
setting JAVA_LIBRARY_PATH=/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
setting OOZIE_CLIENT_OPTS="${OOZIE_CLIENT_OPTS} -Doozie.connection.retry.count=5 "
setting CATALINA_OPTS="${CATALINA_OPTS} -Xmx2048m -XX:MaxPermSize=256m "
INFO: Adding extension: /usr/hdp/current/oozie-server/libext/mysql-connector-java.jar
File/Dir does no exist: /usr/hdp/current/oozie-server/oozie.war
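As the log shows, every time Oozie is started, oozie.war is unpacked and repacked, and the oozie.war file is expected in the Oozie installation directory. Next, look at the oozie-setup.sh script: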
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
function printUsage() {
echo
echo " Usage : oozie-setup.sh "
echo " prepare-war [-d directory] [-secure] (-d identifies an alternative directory for processing jars"
echo " -secure will configure the war file to use HTTPS (SSL))"
echo " sharelib create -fs FS_URI [-locallib SHARED_LIBRARY] (create sharelib for oozie,"
echo " FS_URI is the fs.default.name"
echo " for hdfs uri; SHARED_LIBRARY, path to the"
echo " Oozie sharelib to install, it can be a tarball"
echo " or an expanded version of it. If ommited,"
echo " the Oozie sharelib tarball from the Oozie"
echo " installation directory will be used)"
echo " (action failes if sharelib is already installed"
echo " in HDFS)"
echo " sharelib upgrade -fs FS_URI [-locallib SHARED_LIBRARY] (upgrade existing sharelib, fails if there"
echo " is no existing sharelib installed in HDFS)"
echo " db create|upgrade|postupgrade -run [-sqlfile ] (create, upgrade or postupgrade oozie db with an"
echo " optional sql File)"
echo " (without options prints this usage information)"
echo
echo " EXTJS can be downloaded from http://www.extjs.com/learn/Ext_Version_Archives"
echo
}
#Creating temporary directory
function prepare() {
tmpDir=/tmp/oozie-war-packing-$$
rm -rf ${tmpDir}
mkdir ${tmpDir}
tmpWarDir=${tmpDir}/oozie-war
mkdir ${tmpWarDir}
checkExec "creating staging directory ${tmpDir}"
}
#cleans up temporary directory
function cleanUp() {
if [ ! "${tmpDir}" = "" ]; then
rm -rf ${tmpDir}
checkExec "deleting staging directory ${tmpDir}"
fi
}
#check execution of command
function checkExec() {
if [ $? -ne 0 ]
then
echo
echo "Failed: $1"
echo
cleanUp
exit -1;
fi
}
#check that a file/path exists
function checkFileExists() {
if [ ! -e ${1} ]; then
echo
echo "File/Dir does no exist: ${1}"
echo
cleanUp
exit -1
fi
}
#check that a file/path does not exist
function checkFileDoesNotExist() {
if [ -e ${1} ]; then
echo
echo "File/Dir already exists: ${1}"
echo
cleanUp
exit -1
fi
}
# resolve links - $0 may be a softlink
PRG="${0}"
while [ -h "${PRG}" ]; do
ls=`ls -ld "${PRG}"`
link=`expr "$ls" : '.*-> \(.*\)$'`
if expr "$link" : '/.*' > /dev/null; then
PRG="$link"
else
PRG=`dirname "${PRG}"`/"$link"
fi
done
BASEDIR=`dirname ${PRG}`
BASEDIR=`cd ${BASEDIR}/..;pwd`
source ${BASEDIR}/bin/oozie-sys.sh -silent
addExtjs=""
addHadoopJars=""
additionalDir=""
extjsHome=""
jarsPath=""
prepareWar=""
inputWar="${OOZIE_HOME}/oozie.war"
outputWar="${CATALINA_BASE}/webapps/oozie.war"
outputWarExpanded="${CATALINA_BASE}/webapps/oozie"
secure=""
secureConfigsDir="${CATALINA_BASE}/conf/ssl"
while [ $# -gt 0 ]
do
if [ "$1" = "sharelib" ] || [ "$1" = "db" ]; then
OOZIE_OPTS="-Doozie.home.dir=${OOZIE_HOME}";
OOZIE_OPTS="${OOZIE_OPTS} -Doozie.config.dir=${OOZIE_CONFIG}";
OOZIE_OPTS="${OOZIE_OPTS} -Doozie.log.dir=${OOZIE_LOG}";
OOZIE_OPTS="${OOZIE_OPTS} -Doozie.data.dir=${OOZIE_DATA}";
OOZIE_OPTS="${OOZIE_OPTS} -Dderby.stream.error.file=${OOZIE_LOG}/derby.log"
#Create lib directory from war if lib doesn't exist
if [ ! -d "${BASEDIR}/lib" ]; then
mkdir ${BASEDIR}/lib
# unpack the JARs from the oozie.war file
unzip ${BASEDIR}/oozie.war WEB-INF/lib/*.jar -d ${BASEDIR}/lib > /dev/null
mv ${BASEDIR}/lib/WEB-INF/lib/*.jar ${BASEDIR}/lib/
rmdir ${BASEDIR}/lib/WEB-INF/lib
rmdir ${BASEDIR}/lib/WEB-INF
fi
OOZIECPPATH=""
OOZIECPPATH=${BASEDIR}/lib/'*':${BASEDIR}/libtools/'*':${BASEDIR}/libext/'*'
if test -z ${JAVA_HOME}; then
JAVA_BIN=java
else
JAVA_BIN=${JAVA_HOME}/bin/java
fi
if [ "$1" = "sharelib" ]; then
shift
${JAVA_BIN} ${OOZIE_OPTS} -cp ${OOZIECPPATH} org.apache.oozie.tools.OozieSharelibCLI "${@}"
else
shift
${JAVA_BIN} ${OOZIE_OPTS} -cp ${OOZIECPPATH} org.apache.oozie.tools.OozieDBCLI "${@}"
fi
exit $?
elif [ "$1" = "-secure" ]; then
shift
secure=true
elif [ "$1" = "-d" ]; then
shift
additionalDir=$1
elif [ "$1" = "prepare-war" ]; then
prepareWar=true
else
printUsage
exit -1
fi
shift
done
if [ -e "${CATALINA_PID}" ]; then
echo
echo "ERROR: Stop Oozie first"
echo
exit -1
fi
echo
if [ "${prepareWar}" == "" ]; then
echo "no arguments given"
printUsage
exit -1
else
if [ -e "${outputWar}" ]; then
chmod -f u+w ${outputWar}
rm -rf ${outputWar}
fi
rm -rf ${outputWarExpanded}
# Adding extension JARs
libext=${OOZIE_HOME}/libext
if [ "${additionalDir}" != "" ]; then
libext=${additionalDir}
fi
# collect the paths of the custom JAR files from libext
if [ -d "${libext}" ]; then
if [ `ls ${libext} | grep \.jar\$ | wc -c` != 0 ]; then
for i in "${libext}/"*.jar; do
echo "INFO: Adding extension: $i"
jarsPath="${jarsPath}:$i"
addJars="true"
done
fi
if [ -f "${libext}/ext-2.2.zip" ]; then
extjsHome=${libext}/ext-2.2.zip
addExtjs=true
fi
# find war files (e.g., workflowgenerator) under /libext and deploy
if [ `ls ${libext} | grep \.war\$ | wc -c` != 0 ]; then
for i in "${libext}/"*.war; do
echo "INFO: Deploying extention: $i"
cp $i ${CATALINA_BASE}/webapps/
done
fi
fi
prepare
checkFileExists ${inputWar}
checkFileDoesNotExist ${outputWar}
if [ "${addExtjs}" = "true" ]; then
checkFileExists ${extjsHome}
else
echo "INFO: Oozie webconsole disabled, ExtJS library not specified"
fi
if [ "${addJars}" = "true" ]; then
for jarPath in ${jarsPath//:/$'\n'}
do
checkFileExists ${jarPath}
done
fi
#Unpacking original war
unzip ${inputWar} -d ${tmpWarDir} > /dev/null
checkExec "unzipping Oozie input WAR"
components=""
if [ "${secure}" != "" ]; then
#Use the SSL version of server.xml in oozie-server
checkFileExists ${secureConfigsDir}/ssl-server.xml
cp ${secureConfigsDir}/ssl-server.xml ${CATALINA_BASE}/conf/server.xml
#Inject the SSL version of web.xml in oozie war
checkFileExists ${secureConfigsDir}/ssl-web.xml
cp ${secureConfigsDir}/ssl-web.xml ${tmpWarDir}/WEB-INF/web.xml
echo "INFO: Using secure server.xml and secure web.xml"
else
#Use the regular version of server.xml in oozie-server
checkFileExists ${secureConfigsDir}/server.xml
cp ${secureConfigsDir}/server.xml ${CATALINA_BASE}/conf/server.xml
#No need to restore web.xml because its already in the original WAR file
fi
if [ "${addExtjs}" = "true" ]; then
if [ ! "${components}" = "" ];then
components="${components}, "
fi
components="${components}ExtJS library"
if [ -e ${tmpWarDir}/ext-2.2 ]; then
echo
echo "Specified Oozie WAR '${inputWar}' already contains ExtJS library files"
echo
cleanUp
exit -1
fi
#If the extjs path given is a ZIP, expand it and use it from there
if [ -f ${extjsHome} ]; then
unzip ${extjsHome} -d ${tmpDir} > /dev/null
extjsHome=${tmpDir}/ext-2.2
fi
#Inject the library in oozie war
cp -r ${extjsHome} ${tmpWarDir}/ext-2.2
checkExec "copying ExtJS files into staging"
fi
if [ "${addJars}" = "true" ]; then
if [ ! "${components}" = "" ];then
components="${components}, "
fi
components="${components}JARs"
# copy the custom JARs one by one into the directory where oozie.war was unpacked
for jarPath in ${jarsPath//:/$'\n'}
do
found=`ls ${tmpWarDir}/WEB-INF/lib/${jarPath} 2> /dev/null | wc -l`
checkExec "looking for JAR ${jarPath} in input WAR"
if [ ! $found = 0 ]; then
echo
echo "Specified Oozie WAR '${inputWar}' already contains JAR ${jarPath}"
echo
cleanUp
exit -1
fi
# the actual copy
cp ${jarPath} ${tmpWarDir}/WEB-INF/lib/
checkExec "copying jar ${jarPath} to staging"
done
fi
#Creating new Oozie WAR
currentDir=`pwd`
cd ${tmpWarDir}
zip -r oozie.war * > /dev/null
checkExec "creating new Oozie WAR"
cd ${currentDir}
#copying new Oozie WAR to asked location
cp ${tmpWarDir}/oozie.war ${outputWar}
checkExec "copying new Oozie WAR"
echo
echo "New Oozie WAR file with added '${components}' at ${outputWar}"
echo
cleanUp
if [ "$?" != "0" ]; then
exit -1
fi
echo
echo "INFO: Oozie is ready to be started"
echo
fi
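As the script shows, Oozie picks up custom actions from the libext directory under its home directory. The custom action's JAR must of course include the action's XSD schema file, which is used to validate workflows; if you are not familiar with XSD schemas, please look them up online.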
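For orientation, here is a minimal sketch of what such a schema file could contain, modeled on the layout of the example in the official custom action documentation linked earlier. The namespace `uri:custom:demo-action:0.1`, the element `demo`, the type `DEMO` and the child element `message` are hypothetical and exist only for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:demo="uri:custom:demo-action:0.1"
           elementFormDefault="qualified"
           targetNamespace="uri:custom:demo-action:0.1">

    <!-- content model of the hypothetical <demo> action -->
    <xs:complexType name="DEMO">
        <xs:sequence>
            <xs:element name="message" type="xs:string" minOccurs="1" maxOccurs="1"/>
        </xs:sequence>
    </xs:complexType>

    <!-- top-level element that a workflow <action> node would contain -->
    <xs:element name="demo" type="demo:DEMO"/>
</xs:schema>
```

The file is placed inside the action JAR so that it ends up on the server's classpath once the JAR has been copied from libext into the repacked oozie.war.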
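After that, Oozie needs to be configured with the custom action's information: in oozie-site.xml, add the custom action class to oozie.service.ActionService.executor.ext.classes and the custom XSD schema to oozie.service.SchemaService.wf.ext.schemas.
Below is the project structure diagram of the custom action used for testing.
The Maven project dependencies are: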
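A sketch of the two oozie-site.xml entries follows. The property names are the ones given above; the class name `com.example.oozie.DemoActionExecutor` and the schema file name `demo-action-0.1.xsd` are hypothetical placeholders for your own executor class and schema file.

```xml
<property>
    <name>oozie.service.ActionService.executor.ext.classes</name>
    <!-- hypothetical class: fully qualified name of the custom ActionExecutor -->
    <value>com.example.oozie.DemoActionExecutor</value>
</property>
<property>
    <name>oozie.service.SchemaService.wf.ext.schemas</name>
    <!-- hypothetical file: the XSD schema packaged in the custom action JAR -->
    <value>demo-action-0.1.xsd</value>
</property>
```

Both values are comma-separated lists, so if your distribution already sets them, the new entry should be appended to the existing value rather than replacing it.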
<dependency>
<groupId>org.apache.oozie</groupId>
<artifactId>oozie-hadoop</artifactId>
<version>2.3.0.oozie-4.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.oozie</groupId>
<artifactId>oozie-core</artifactId>
<version>4.1.0</version>
</dependency>