Simple Usage of FileSplit
Author: mobiledu2502853623 | Source: Internet | 2023-06-28 11:35
A quick look at Hadoop's FileSplit.
FileSplit class hierarchy: (figure from the original post, not preserved)
FileSplit fields and methods: (figure from the original post, not preserved)
Job input:
hadoop@hadoop:/home/hadoop/blb$ hdfs dfs -text /user/hadoop/libin/input/inputpath1.txt
hadoop a
spark a
hive a
hbase a
tachyon a
storm a
redis a
hadoop@hadoop:/home/hadoop/blb$ hdfs dfs -text /user/hadoop/libin/input/inputpath2.txt
hadoop b
spark b
kafka b
tachyon b
oozie b
flume b
sqoop b
solr b
hadoop@hadoop:/home/hadoop/blb$
Code:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.SplitLocationInfo;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class GetSplitMapReduce {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: GetSplitMapReduce <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, GetSplitMapReduce.class.getSimpleName() + "1");
        job.setJarByClass(GetSplitMapReduce.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        job.setMapperClass(MyMapper1.class);
        job.setNumReduceTasks(0);  // map-only job: mapper output is written directly
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        job.waitForCompletion(true);
    }

    public static class MyMapper1 extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            FileSplit fileSplit = (FileSplit) context.getInputSplit();
            String pathname = fileSplit.getPath().getName();                 // file name of the split's path
            int depth = fileSplit.getPath().depth();                         // depth of the path
            Class<? extends FileSplit> class1 = fileSplit.getClass();        // runtime class of the split
            long length = fileSplit.getLength();                             // split length in bytes
            SplitLocationInfo[] locationInfo = fileSplit.getLocationInfo();  // split location info (may be null)
            String[] locations = fileSplit.getLocations();                   // hosts holding the split's blocks
            long start = fileSplit.getStart();                               // position of the first byte in the file to process
            String string = fileSplit.toString();                            // "path:start+length"

            context.write(new Text("===================================================================================="), NullWritable.get());
            context.write(new Text("pathname--" + pathname), NullWritable.get());
            context.write(new Text("depth--" + depth), NullWritable.get());
            context.write(new Text("class1--" + class1), NullWritable.get());
            context.write(new Text("length--" + length), NullWritable.get());
            context.write(new Text("locationInfo--" + locationInfo), NullWritable.get());
            context.write(new Text("locations--" + locations), NullWritable.get());
            context.write(new Text("start--" + start), NullWritable.get());
            context.write(new Text("string--" + string), NullWritable.get());
        }
    }
}
Output for inputpath2.txt:
hadoop@hadoop:/home/hadoop/blb$ hdfs dfs -text /user/hadoop/libin/out2/part-m-00000
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@4ff41ba0
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@2341ce62
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@35549603
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@4444ba4f
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@7c23bb8c
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@dee2400
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@d7d8325
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
====================================================================================
pathname--inputpath2.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--66
locationInfo--null
locations--[Ljava.lang.String;@2b2cf90e
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath2.txt:0+66
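Note that the locations--[Ljava.lang.String;@4ff41ba0 lines are not hostnames: concatenating a String[] with + invokes the default Object.toString(), which prints the array's type descriptor and hash code. To see the actual hosts, the mapper would need to wrap the array in java.util.Arrays.toString(...). A minimal standalone demonstration (plain Java, no Hadoop needed; the sample value is a stand-in for what getLocations() returns):

```java
import java.util.Arrays;

public class ArrayToStringDemo {
    public static void main(String[] args) {
        String[] locations = {"hadoop"};  // stand-in for FileSplit.getLocations()
        // Default toString(): type descriptor + hash, e.g. [Ljava.lang.String;@4ff41ba0
        System.out.println(locations.toString().startsWith("[Ljava.lang.String;"));  // true
        // Readable form:
        System.out.println(Arrays.toString(locations));  // [hadoop]
    }
}
```

This also explains why each record above shows a different @hash: every map() call obtained a fresh array object, and the hash is per-object, not per-content.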
Output for inputpath1.txt:
hadoop@hadoop:/home/hadoop/blb$ hdfs dfs -text /user/hadoop/libin/out2/part-m-00001
====================================================================================
pathname--inputpath1.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--58
locationInfo--null
locations--[Ljava.lang.String;@4ff41ba0
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath1.txt:0+58
====================================================================================
pathname--inputpath1.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--58
locationInfo--null
locations--[Ljava.lang.String;@2341ce62
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath1.txt:0+58
====================================================================================
pathname--inputpath1.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--58
locationInfo--null
locations--[Ljava.lang.String;@35549603
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath1.txt:0+58
====================================================================================
pathname--inputpath1.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--58
locationInfo--null
locations--[Ljava.lang.String;@4444ba4f
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath1.txt:0+58
====================================================================================
pathname--inputpath1.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--58
locationInfo--null
locations--[Ljava.lang.String;@7c23bb8c
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath1.txt:0+58
====================================================================================
pathname--inputpath1.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--58
locationInfo--null
locations--[Ljava.lang.String;@dee2400
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath1.txt:0+58
====================================================================================
pathname--inputpath1.txt
depth--5
class1--class org.apache.hadoop.mapreduce.lib.input.FileSplit
length--58
locationInfo--null
locations--[Ljava.lang.String;@d7d8325
start--0
string--hdfs://hadoop:9000/user/hadoop/libin/input/inputpath1.txt:0+58
hadoop@hadoop:/home/hadoop/blb$
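Both input files are tiny compared with an HDFS block (commonly 128 MB), so FileInputFormat produces exactly one split per file — hence every record shows start--0 with the full file length (66 or 58 bytes). The split arithmetic keeps carving off full-size splits while the remaining bytes exceed 1.1× the split size (Hadoop's SPLIT_SLOP slack factor), and whatever is left becomes the final split. A simplified standalone sketch of that logic (not Hadoop's actual implementation; class and method names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSizeDemo {
    static final double SPLIT_SLOP = 1.1;  // same slack factor FileInputFormat uses

    // Returns {start, length} pairs for one file, mimicking the getSplits() arithmetic.
    static List<long[]> computeSplits(long fileLength, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new long[]{fileLength - remaining, splitSize});
            remaining -= splitSize;
        }
        if (remaining > 0) {
            splits.add(new long[]{fileLength - remaining, remaining});  // tail split
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;
        // inputpath2.txt is 66 bytes -> one split starting at byte 0
        System.out.println(computeSplits(66, blockSize).size());                    // 1
        // a 300 MB file would yield three splits: 128 MB + 128 MB + 44 MB
        System.out.println(computeSplits(300L * 1024 * 1024, blockSize).size());    // 3
    }
}
```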