I've set up a Hadoop cluster on multiple physical nodes: one server for the NameNode, ResourceManager and JobHistory Server, and two servers for DataNodes. I followed this tutorial while configuring.
I tried to test the MapReduce example programs (WordCount, TeraSort, TeraGen, etc.), all of which I launch from hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar.
TeraGen and randomwriter launch and finish with a success status (because they have no Reduce tasks, only Map tasks), but when I try to launch WordCount or WordMean, the Map task completes (1 task) while Reduce stays at 0% the whole time; the job just stops making progress. In yarn-root-resourcemanager-yamaster.log, after the successful Map task I see only one row:
INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
I tried to find a solution and found a similar question on Stack Overflow, but there is no correct answer there; in fact, I don't know how to see free reducers in the ResourceManager. What I have:
UPDATE: I tried launching the wordcount example program without reduce tasks, using the option -D mapred.reduce.tasks=0:
hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount -D mapred.reduce.tasks=0 /bigtext.txt /bigtext_wc_1.txt
And it works: I get a word count result. The output is wrong, of course, since there is no reduce phase, but the program completes.
15/02/03 12:40:37 INFO mapreduce.Job: Running job: job_1422950901990_0004
15/02/03 12:40:52 INFO mapreduce.Job: Job job_1422950901990_0004 running in uber mode : false
15/02/03 12:40:52 INFO mapreduce.Job: map 0% reduce 0%
15/02/03 12:41:03 INFO mapreduce.Job: map 100% reduce 0%
15/02/03 12:41:04 INFO mapreduce.Job: Job job_1422950901990_0004 completed successfully
15/02/03 12:41:05 INFO mapreduce.Job: Counters: 30
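For what it's worth, mapred.reduce.tasks is a deprecated key in Hadoop 2.x; the current name for the same setting is mapreduce.job.reduces, so an equivalent run (with a placeholder output path) would be:

hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount -D mapreduce.job.reduces=0 /bigtext.txt /bigtext_wc_2.txt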
UPDATE #2:
More information from the application log:
2015-02-03 15:02:12,008 INFO [IPC Server handler 0 on 55452] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1422959549820_0005_m_000000_0 is : 1.0
2015-02-03 15:02:12,025 INFO [IPC Server handler 1 on 55452] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,028 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1422959549820_0005_m_000000_0 TaskAttempt Transitioned from RUNNING to SUCCESS_CONTAINER_CLEANUP
2015-02-03 15:02:12,029 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1422959549820_0005_01_000002 taskAttempt attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,030 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,030 INFO [ContainerLauncher #1] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : slave102.hadoop.ot.ru:51573
2015-02-03 15:02:12,063 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1422959549820_0005_m_000000_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2015-02-03 15:02:12,084 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,087 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1422959549820_0005_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2015-02-03 15:02:12,094 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2015-02-03 15:02:12,792 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:1 RackLocal:0
2015-02-03 15:02:12,794 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=
2015-02-03 15:02:12,794 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold reached. Scheduling reduces.
2015-02-03 15:02:12,795 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps assigned. Ramping up all remaining reduces:1
2015-02-03 15:02:12,795 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:1 RackLocal:0
2015-02-03 15:02:13,805 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1422959549820_0005: ask=1 release= 0 newCOntainers=0 finishedCOntainers=1 resourcelimit= knownNMs=4
2015-02-03 15:02:13,806 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1422959549820_0005_01_000002
2015-02-03 15:02:13,808 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:0 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:1 RackLocal:0
2015-02-03 15:02:13,808 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1422959549820_0005_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Configuration files of the cluster:
hdfs-site.xml
dfs.namenode.name.dir = /grid/hadoop1/nn
    Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
dfs.namenode.hosts = /opt/current/hadoop/etc/hadoop/slaves
    List of permitted DataNodes. If necessary, use these files to control the list of allowable datanodes.
dfs.namenode.hosts.exclude = /opt/current/hadoop/etc/hadoop/excludes
    List of excluded DataNodes. If necessary, use these files to control the list of allowable datanodes.
dfs.blocksize = 268435456
    HDFS blocksize of 256MB for large file-systems.
dfs.namenode.handler.count = 100
    More NameNode server threads to handle RPCs from large number of DataNodes.
dfs.datanode.data.dir = /grid/hadoop1/dn
    Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices.
core-site.xml
fs.defaultFS = hdfs://master:8020
    Default HDFS filesystem on the NameNode host, like hdfs://host:port/
io.file.buffer.size = 131072
    Size of read/write buffer used in SequenceFiles.
mapred-site.xml
mapreduce.framework.name = yarn
    Execution framework set to Hadoop YARN.
mapreduce.map.memory.mb = 1536
    Larger resource limit for maps.
mapreduce.map.java.opts = -Xmx1024M
    Larger heap-size for child JVMs of maps.
mapreduce.reduce.memory.mb = 3072
    Larger resource limit for reduces.
mapreduce.reduce.java.opts = -Xmx2560M
    Larger heap-size for child JVMs of reduces.
mapreduce.task.io.sort.mb = 512
    Higher memory limit while sorting data for efficiency.
mapreduce.task.io.sort.factor = 100
    More streams merged at once while sorting files.
mapreduce.reduce.shuffle.parallelcopies = 50
    Higher number of parallel copies run by reduces to fetch outputs from a very large number of maps.
mapreduce.jobhistory.address = master:10020
    MapReduce JobHistory Server host:port. Default port is 10020.
mapreduce.jobhistory.webapp.address = master:19888
    MapReduce JobHistory Server Web UI host:port. Default port is 19888.
mapreduce.jobhistory.intermediate-done-dir = /mr-history/tmp
    Directory where history files are written by MapReduce jobs.
mapreduce.jobhistory.done-dir = /mr-history/done
    Directory where history files are managed by the MR JobHistory Server.
yarn-site.xml
yarn.acl.enable = yes
    Enable ACLs? Defaults to false.
yarn.admin.acl = false
    ACL to set admins on the cluster. ACLs are of the form comma-separated-users space comma-separated-groups. Defaults to the special value of *, which means anyone. The special value of just a space means no one has access.
yarn.log-aggregation-enable = false
    Configuration to enable or disable log aggregation.
yarn.resourcemanager.address = master:8050
    Value: host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.scheduler.address = master:8030
    ResourceManager host:port for ApplicationMasters to talk to the Scheduler to obtain resources. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.resource-tracker.address = master:8025
    ResourceManager host:port for NodeManagers. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.admin.address = master:8141
    ResourceManager host:port for administrative commands. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.webapp.address = master:8088
    ResourceManager web UI host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.hostname = master
    ResourceManager host.
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
    ResourceManager Scheduler class.
yarn.scheduler.maximum-allocation-mb = 6144
    Maximum limit of memory to allocate to each container request at the ResourceManager, in MB.
yarn.scheduler.minimum-allocation-mb = 2048
    Minimum limit of memory to allocate to each container request at the ResourceManager, in MB.
yarn.resourcemanager.nodes.include-path = /opt/current/hadoop/etc/hadoop/slaves
    List of permitted NodeManagers. If necessary, use these files to control the list of allowable NodeManagers.
yarn.resourcemanager.nodes.exclude-path = /opt/current/hadoop/etc/hadoop/excludes
    List of excluded NodeManagers. If necessary, use these files to control the list of allowable NodeManagers.
yarn.nodemanager.resource.memory-mb = 2048
    Resource, i.e. available physical memory, in MB, for a given NodeManager. Defines the total resources on the NodeManager to be made available to running containers.
yarn.nodemanager.vmem-pmem-ratio = 2.1
    Maximum ratio by which virtual memory usage of tasks may exceed physical memory. The virtual memory usage of each task may exceed its physical memory limit by this ratio. The total amount of virtual memory used by tasks on the NodeManager may exceed its physical memory usage by this ratio.
yarn.nodemanager.local-dirs = /grid/hadoop1/yarn/local
    Comma-separated list of paths on the local filesystem where intermediate data is written. Multiple paths help spread disk I/O.
yarn.nodemanager.log-dirs = /var/log/hadoop-yarn/containers
    Where to store container logs.
yarn.nodemanager.log.retain-second = 10800
    Default time (in seconds) to retain log files on the NodeManager. Only applicable if log aggregation is disabled.
yarn.nodemanager.remote-app-log-dir = /logs
    HDFS directory where the application logs are moved on application completion. Needs appropriate permissions set. Only applicable if log aggregation is enabled.
yarn.nodemanager.remote-app-log-dir-suffix = logs
    Suffix appended to the remote log dir. Logs will be aggregated to ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}. Only applicable if log aggregation is enabled.
yarn.nodemanager.aux-services = mapreduce_shuffle
    Shuffle service that needs to be set for MapReduce applications.
And finally, /etc/hosts:
127.0.0.1 localhost
## BigData Hadoop Lab ##
#Name Node
172.25.28.100 master.hadoop.ot.ru master
172.25.28.101 secondary.hadoop.ot.ru secondary
#DataNodes on DL Servers
172.25.28.102 slave102.hadoop.ot.ru slave102
172.25.28.103 slave103.hadoop.ot.ru slave103
172.25.28.104 slave104.hadoop.ot.ru slave104
172.25.28.105 slave105.hadoop.ot.ru slave105
172.25.28.106 slave106.hadoop.ot.ru slave106
172.25.28.107 slave107.hadoop.ot.ru slave107
#DataNodes on ARM Servers
172.25.40.25 slave25.hadoop.ot.ru slave25
172.25.40.26 slave26.hadoop.ot.ru slave26
172.25.40.27 slave27.hadoop.ot.ru slave27
172.25.40.28 slave28.hadoop.ot.ru slave28
The answer is: not enough memory. Every task container (map or reduce) was too big for my machines: with yarn.nodemanager.resource.memory-mb at 2048, no NodeManager could ever fit the 3072 MB reduce container requested by mapreduce.reduce.memory.mb.
This error:
2015-02-03 15:02:13,808 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1422959549820_0005_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
told me about it.
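In hindsight, this could have been checked up front by comparing what each NodeManager advertises with what a reduce container asks for. A rough way to do that, as a sketch assuming the yarn CLI from this installation (the node ID below is a placeholder; take a real one from the -list output):

# list registered NodeManagers and their node IDs
yarn node -list
# show memory used / memory capacity for one node
yarn node -status slave102.hadoop.ot.ru:45454
# the same numbers are visible in the ResourceManager web UI at http://master:8088/cluster/nodes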
These turned out to be the optimal settings for most of my servers:
yarn.scheduler.minimum-allocation-mb=768
yarn.scheduler.maximum-allocation-mb=3072
yarn.nodemanager.resource.memory-mb=3072
mapreduce.map.memory.mb=768
mapreduce.map.java.opts=-Xmx512m
mapreduce.reduce.memory.mb=1536
mapreduce.reduce.java.opts=-Xmx1024m
yarn.app.mapreduce.am.resource.mb=768
yarn.app.mapreduce.am.command-opts=-Xmx512m
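The yarn.* keys above go into yarn-site.xml, and the mapreduce.* and yarn.app.mapreduce.am.* keys into mapred-site.xml. A rough sketch of rolling the change out, assuming Hadoop lives under /opt/current/hadoop as the config paths above suggest (the slave hostname and the output path are placeholders):

# push the edited configs to each slave (repeat per node)
scp /opt/current/hadoop/etc/hadoop/yarn-site.xml /opt/current/hadoop/etc/hadoop/mapred-site.xml slave102:/opt/current/hadoop/etc/hadoop/
# restart YARN so the NodeManagers pick up the new memory limits
/opt/current/hadoop/sbin/stop-yarn.sh
/opt/current/hadoop/sbin/start-yarn.sh
# re-run the job that used to hang at reduce 0%
hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount /bigtext.txt /bigtext_wc_fixed.txt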