Hadoop 2.6.0 does not run reduce tasks in WordCount example

I've installed a Hadoop cluster on multiple physical nodes. I have one server for the NameNode, ResourceManager, and JobHistory Server, and two servers for DataNodes. I followed this tutorial while configuring.

I tried to test MapReduce example programs such as WordCount, TeraSort, TeraGen, etc., all of which I can launch from hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar.

So TeraGen and RandomWriter launch and complete with success status (because they have no Reduce tasks, only Map tasks), but when I try to launch WordCount or WordMean, the Map task completes (1 task) while Reduce stays at 0% the whole time; the job simply never finishes. After the successful Map task, yarn-root-resourcemanager-yamaster.log shows only one row:

INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...

I tried to find a solution and found a similar question on SO, but there is no correct answer there; actually, I don't know how to see free reducers in the ResourceManager. What I have:

  • Hadoop Web Interface: master:50070
  • ResourceManager: master:8088
  • JobHistory Server: master:19888/jobhistory
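
YARN does not actually expose "free reducers"; capacity is tracked as memory and vcores per node. As a rough sketch (assuming the default ports above), free capacity can be checked like this:

# Nodes known to the ResourceManager, with state and running-container count
yarn node -list -all

# Cluster-wide totals (availableMB, allocatedMB, ...) from the RM REST API
curl http://master:8088/ws/v1/cluster/metrics

# Per-node available memory
curl http://master:8088/ws/v1/cluster/nodes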

UPDATE: I tried to launch the wordcount example program without reduce tasks, using the option -D mapd.reduce.tasks=0:

hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount -D mapd.reduce.tasks=0 /bigtext.txt /bigtext_wc_1.txt

And it works: I got a wordcount result. The result is wrong, since there is no reduce phase, but the program completed.

15/02/03 12:40:37 INFO mapreduce.Job: Running job: job_1422950901990_0004
15/02/03 12:40:52 INFO mapreduce.Job: Job job_1422950901990_0004 running in uber mode : false
15/02/03 12:40:52 INFO mapreduce.Job:  map 0% reduce 0%
15/02/03 12:41:03 INFO mapreduce.Job:  map 100% reduce 0%
15/02/03 12:41:04 INFO mapreduce.Job: Job job_1422950901990_0004 completed successfully
15/02/03 12:41:05 INFO mapreduce.Job: Counters: 30
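
As an aside, mapd.reduce.tasks is not a recognized Hadoop property name; the Hadoop 2.x property is mapreduce.job.reduces (the deprecated 1.x spelling was mapred.reduce.tasks). A zero-reducer run would normally be written like this (output path illustrative):

hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount -D mapreduce.job.reduces=0 /bigtext.txt /bigtext_wc_out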

UPDATE #2:

More information from the application log:

2015-02-03 15:02:12,008 INFO [IPC Server handler 0 on 55452] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1422959549820_0005_m_000000_0 is : 1.0
2015-02-03 15:02:12,025 INFO [IPC Server handler 1 on 55452] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,028 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1422959549820_0005_m_000000_0 TaskAttempt Transitioned from RUNNING to SUCCESS_CONTAINER_CLEANUP
2015-02-03 15:02:12,029 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1422959549820_0005_01_000002 taskAttempt attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,030 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,030 INFO [ContainerLauncher #1] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : slave102.hadoop.ot.ru:51573
2015-02-03 15:02:12,063 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1422959549820_0005_m_000000_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2015-02-03 15:02:12,084 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1422959549820_0005_m_000000_0
2015-02-03 15:02:12,087 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1422959549820_0005_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2015-02-03 15:02:12,094 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2015-02-03 15:02:12,792 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:1 RackLocal:0
2015-02-03 15:02:12,794 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=
2015-02-03 15:02:12,794 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold reached. Scheduling reduces.
2015-02-03 15:02:12,795 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps assigned. Ramping up all remaining reduces:1
2015-02-03 15:02:12,795 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:1 RackLocal:0
2015-02-03 15:02:13,805 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1422959549820_0005: ask=1 release= 0 newCOntainers=0 finishedCOntainers=1 resourcelimit= knownNMs=4
2015-02-03 15:02:13,806 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1422959549820_0005_01_000002
2015-02-03 15:02:13,808 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:0 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:1 RackLocal:0
2015-02-03 15:02:13,808 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1422959549820_0005_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Configuration files of the cluster:

hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/grid/hadoop1/nn</value>
        <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
    </property>
    <property>
        <name>dfs.namenode.hosts</name>
        <value>/opt/current/hadoop/etc/hadoop/slaves</value>
        <description>List of permitted DataNodes. If necessary, use these files to control the list of allowable datanodes.</description>
    </property>
    <property>
        <name>dfs.namenode.hosts.exclude</name>
        <value>/opt/current/hadoop/etc/hadoop/excludes</value>
        <description>List of excluded DataNodes. If necessary, use these files to control the list of allowable datanodes.</description>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
        <description>HDFS blocksize of 256MB for large file-systems.</description>
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <description>More NameNode server threads to handle RPCs from a large number of DataNodes.</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/grid/hadoop1/dn</value>
        <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices.</description>
    </property>
</configuration>
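
As a side note, a quick way to double-check which value a client actually resolves from these files is the stock getconf tool, e.g.:

hdfs getconf -confKey dfs.blocksize
hdfs getconf -confKey dfs.namenode.name.dir
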
core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
        <description>Default hdfs filesystem on namenode host, like hdfs://host:port/</description>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
        <description>Size of read/write buffer used in SequenceFiles.</description>
    </property>
</configuration>

mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>Execution framework set to Hadoop YARN.</description>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>1536</value>
        <description>Larger resource limit for maps.</description>
    </property>
    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx1024M</value>
        <description>Larger heap-size for child jvms of maps.</description>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>3072</value>
        <description>Larger resource limit for reduces.</description>
    </property>
    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx2560M</value>
        <description>Larger heap-size for child jvms of reduces.</description>
    </property>
    <property>
        <name>mapreduce.task.io.sort.mb</name>
        <value>512</value>
        <description>Higher memory-limit while sorting data for efficiency.</description>
    </property>
    <property>
        <name>mapreduce.task.io.sort.factor</name>
        <value>100</value>
        <description>More streams merged at once while sorting files.</description>
    </property>
    <property>
        <name>mapreduce.reduce.shuffle.parallelcopies</name>
        <value>50</value>
        <description>Higher number of parallel copies run by reduces to fetch outputs from very large number of maps.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
        <description>MapReduce JobHistory Server host:port. Default port is 10020.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
        <description>MapReduce JobHistory Server Web UI host:port. Default port is 19888.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/mr-history/tmp</value>
        <description>Directory where history files are written by MapReduce jobs.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/mr-history/done</value>
        <description>Directory where history files are managed by the MR JobHistory Server.</description>
    </property>
</configuration>

yarn-site.xml

<configuration>
    <property>
        <name>yarn.acl.enable</name>
        <value>yes</value>
        <description>Enable ACLs? Defaults to false.</description>
    </property>
    <property>
        <name>yarn.admin.acl</name>
        <value>false</value>
        <description>ACL to set admins on the cluster. ACLs are of the form comma-separated-users space comma-separated-groups. Defaults to special value of * which means anyone. Special value of just space means no one has access.</description>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>false</value>
        <description>Configuration to enable or disable log aggregation.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8050</value>
        <description>Value: host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
        <description>ResourceManager host:port for ApplicationMasters to talk to Scheduler to obtain resources. If set, overrides the hostname set in yarn.resourcemanager.hostname.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8025</value>
        <description>ResourceManager host:port for NodeManagers. If set, overrides the hostname set in yarn.resourcemanager.hostname.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8141</value>
        <description>ResourceManager host:port for administrative commands. If set, overrides the hostname set in yarn.resourcemanager.hostname.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
        <description>Web UI host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
        <description>ResourceManager host.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
        <description>ResourceManager Scheduler class.</description>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>6144</value>
        <description>Maximum limit of memory to allocate to each container request at the ResourceManager, in MBs.</description>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>2048</value>
        <description>Minimum limit of memory to allocate to each container request at the ResourceManager, in MBs.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.nodes.include-path</name>
        <value>/opt/current/hadoop/etc/hadoop/slaves</value>
        <description>List of permitted NodeManagers. If necessary, use these files to control the list of allowable NodeManagers.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.nodes.exclude-path</name>
        <value>/opt/current/hadoop/etc/hadoop/excludes</value>
        <description>List of excluded NodeManagers. If necessary, use these files to control the list of allowable NodeManagers.</description>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
        <description>Resource i.e. available physical memory, in MB, for given NodeManager. Defines total available resources on the NodeManager to be made available to running containers.</description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>2.1</value>
        <description>Maximum ratio by which virtual memory usage of tasks may exceed physical memory. The virtual memory usage of each task may exceed its physical memory limit by this ratio. The total amount of virtual memory used by tasks on the NodeManager may exceed its physical memory usage by this ratio.</description>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/grid/hadoop1/yarn/local</value>
        <description>Comma-separated list of paths on the local filesystem where intermediate data is written. Multiple paths help spread disk i/o.</description>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/var/log/hadoop-yarn/containers</value>
        <description>Where to store container logs.</description>
    </property>
    <property>
        <name>yarn.nodemanager.log.retain-second</name>
        <value>10800</value>
        <description>Default time (in seconds) to retain log files on the NodeManager. Only applicable if log-aggregation is disabled.</description>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/logs</value>
        <description>HDFS directory where the application logs are moved on application completion. Need to set appropriate permissions. Only applicable if log-aggregation is enabled.</description>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
        <value>logs</value>
        <description>Suffix appended to the remote log dir. Logs will be aggregated to ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}. Only applicable if log-aggregation is enabled.</description>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Shuffle service that needs to be set for Map Reduce applications.</description>
    </property>
</configuration>

And finally /etc/hosts:

127.0.0.1 localhost

## BigData Hadoop Lab ##
#Name Node
172.25.28.100 master.hadoop.ot.ru master
172.25.28.101 secondary.hadoop.ot.ru secondary
#DataNodes on DL Servers
172.25.28.102 slave102.hadoop.ot.ru slave102
172.25.28.103 slave103.hadoop.ot.ru slave103
172.25.28.104 slave104.hadoop.ot.ru slave104
172.25.28.105 slave105.hadoop.ot.ru slave105
172.25.28.106 slave106.hadoop.ot.ru slave106
172.25.28.107 slave107.hadoop.ot.ru slave107
#DataNodes on ARM Servers
172.25.40.25 slave25.hadoop.ot.ru slave25
172.25.40.26 slave26.hadoop.ot.ru slave26
172.25.40.27 slave27.hadoop.ot.ru slave27
172.25.40.28 slave28.hadoop.ot.ru slave28

1 Solution

#1


The answer is not enough memory: each task container (map or reduce) was too big for my machines. Given the yarn-site.xml above, the reduce container request (mapreduce.reduce.memory.mb=3072) exceeded the per-node capacity (yarn.nodemanager.resource.memory-mb=2048), so a reduce container could never be allocated, while the map request (1536 MB, rounded up to the 2048 MB minimum allocation) just fit on a node.

This error:

2015-02-03 15:02:13,808 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1422959549820_0005_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

tells me as much.

These are the optimal settings for most of my servers:

yarn.scheduler.minimum-allocation-mb=768
yarn.scheduler.maximum-allocation-mb=3072
yarn.nodemanager.resource.memory-mb=3072
mapreduce.map.memory.mb=768
mapreduce.map.java.opts=-Xmx512m
mapreduce.reduce.memory.mb=1536
mapreduce.reduce.java.opts=-Xmx1024m
yarn.app.mapreduce.am.resource.mb=768
yarn.app.mapreduce.am.command-opts=-Xmx512m
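
As a sketch of where these keys live: the yarn.* entries go in yarn-site.xml on every node, while the mapreduce.* and yarn.app.mapreduce.am.* entries go in mapred-site.xml. For example, the reduce-side pair:

<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>1536</value>
</property>
<property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx1024m</value>
</property>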
