Problem_6_合并多个小文件_v0 (merging multiple small files) #1

@jiuxian77

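Version: Hadoop-2.6.5
In your MergeJob, I changed the class passed to setJarByClass to MergeJob, and found that I also had to add job.setOutputKeyClass(Text.class); and
job.setOutputValueClass(BytesWritable.class); otherwise the job fails with an error when it is submitted to the cluster.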
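
For later readers, here is a minimal sketch of just the driver settings mentioned above. It is not the repository's actual MergeJob: the class name `MergeJobConfigSketch`, the job name, and the use of `SequenceFileOutputFormat` are assumptions for illustration, and the mapper, reducer, and input format are left out because they are not shown in this issue. A plausible reason the two extra calls are needed is that Hadoop otherwise falls back to its default output types (`LongWritable` keys and `Text` values), which do not match the `Text`/`BytesWritable` pairs the job emits.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical class name; stands in for the repository's MergeJob driver.
public class MergeJobConfigSketch {

    // Builds a Job with only the settings discussed in this issue.
    static Job configure(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "merge-small-files"); // job name is illustrative

        // Tell Hadoop which jar to ship by pointing at the driver class itself.
        job.setJarByClass(MergeJobConfigSketch.class);

        // Assumption: the merged result is written as a SequenceFile.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        // Declare the final output types explicitly: file name -> file bytes.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(BytesWritable.class);

        return job;
    }
}
```

If the map output types differ from the final output types, `setMapOutputKeyClass` and `setMapOutputValueClass` would also need to be set; whether that applies here depends on the mapper, which is not shown.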

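I am a beginner, so I will describe what I ran into in the hope that it helps later readers get going faster.
Because the default number of reduce tasks is 1, all output ends up in a single part-r-00000, whose records have the form key (name of text file 1): value (content of text file 1), key (name of text file 2): value (content of text file 2), and so on.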

For how to read the SequenceFile, see this article: https://www.cnblogs.com/skyl/p/4769542.html
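
As a rough companion, the following sketch prints each record of the merged output as `name: content`. It assumes the keys are `Text` and the values are `BytesWritable`, as set in the job above, and that the original small files are UTF-8 text; the class name `ReadMergedFile` is hypothetical and the path to part-r-00000 is taken from the first command-line argument.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical helper for inspecting the merged SequenceFile.
public class ReadMergedFile {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]); // e.g. the job's part-r-00000

        try (SequenceFile.Reader reader =
                 new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
            Text key = new Text();                       // original file name
            BytesWritable value = new BytesWritable();   // original file bytes

            // next() fills key and value, and returns false at end of file.
            while (reader.next(key, value)) {
                // getBytes() may return a padded buffer, so honor getLength().
                String content = new String(
                        value.getBytes(), 0, value.getLength(), StandardCharsets.UTF_8);
                System.out.println(key + ": " + content);
            }
        }
    }
}
```

With more than one reduce task there would be several part-r-* files, and each would need to be read the same way.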
