mapreduce - Hadoop - get results from output files after reduce?

- March 15, 2012

Given a job with map and reduce phases, I can see that the output folder contains files named "part-r-00000".

If I need to post-process these files at the application level, do I need to iterate over the files in the output folder in natural naming order (part-r-00000, part-r-00001, part-r-00002, ...) in order to get the job results?

Or can I use some Hadoop helper file reader that will give me an "iterator" and handle the file switching for me (when part-r-00000 has been read completely, continue with part-r-00001)?

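In MapReduce you specify an output folder; the only things it will contain are the part-r files (which are the output of the reduce tasks) and a _SUCCESS file (which is empty). So I think that if you want to do the post-processing as another job, you only need to set the output dir of job 1 as the input dir of job 2.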

Now, there might be some requirements of your post-processor that need to be addressed; for example, is it important to process the output files in order?

Or, if you just want to process the files locally, then it all depends on the OutputFormat of your MapReduce job, since that tells you how the part-r files are structured. Then you can simply use standard I/O, I guess.
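
As a rough sketch of the local route, assuming the job used the default TextOutputFormat (one record per line, key and value separated by a tab) and that the output folder has already been copied to the local filesystem, the part files can be chained into a single iterator that switches files for you. The helper names part_files and iter_records are made up for this illustration and are not part of Hadoop.

```python
import re
from pathlib import Path

# Matches uncompressed reducer output names such as part-r-00000.
PART_FILE = re.compile(r"^part-r-(\d+)$")


def part_files(output_dir):
    """Return the part-r-* files in output_dir, ordered by partition number.

    Other entries, such as the empty _SUCCESS marker, are skipped.
    """
    numbered = []
    for path in Path(output_dir).iterdir():
        match = PART_FILE.match(path.name)
        if match and path.is_file():
            numbered.append((int(match.group(1)), path))
    numbered.sort(key=lambda item: item[0])
    return [path for _, path in numbered]


def iter_records(output_dir, separator="\t"):
    """Yield (key, value) pairs from all part files, one file after another.

    Assumption: each line is "key<separator>value", as written by
    TextOutputFormat with its default tab separator.
    """
    for path in part_files(output_dir):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                key, _, value = line.rstrip("\n").partition(separator)
                yield key, value


if __name__ == "__main__":
    # "job-output" is a placeholder for a local copy of the job's output folder.
    for key, value in iter_records("job-output"):
        print(key, value)
```

Note that reading the part files in numeric order only gives a globally sorted stream if the job partitioned keys by range; with the default hash partitioning, each part file is sorted by key on its own, but the concatenation is not.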
