Difference between revisions of "TDSM 12.3"
From The Data Science Design Manual Wikia
Venkatkedar (talk | contribs) (map) |
Revision as of 22:35, 12 December 2017
1. Largest Number
map(file_id, iterator numbers){
    max = INTEGER.MIN_VALUE
    while(numbers.hasNext()):
        num = numbers.next()
        if(num > max):
            max = num
    end while
    emit('max', max)
}
reduce(key, iterator max_values){
    max = INTEGER.MIN_VALUE
    while(max_values.hasNext()):
        num = max_values.next()
        if(num > max):
            max = num
    end while
    emit('overall_max', max)
}
We are given a list of files, each containing a list of numbers. In MapReduce, the nodes work in parallel: each node picks a file and runs the map function, passing file_id as the key and an iterator over the integers in that file as the value. The map function finds the maximum in that file, pairs it with the key 'max', and emits the pair to the MapReduce framework, which distributes the key-value pairs across the network of nodes.
The reduce function receives the single key 'max' and a list of maximum values, max_values, whose elements are the maxima of the individual files. It then finds the maximum among max_values and emits it under the key 'overall_max'.
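The two functions above can be tried out with a small single-process Python sketch. It is only an illustration, not a real MapReduce job: the names map_largest, reduce_largest and run_job are hypothetical, the shuffle step is simulated with a dictionary, and None is used as the starting value instead of INTEGER.MIN_VALUE so that an empty file simply emits nothing.

```python
from collections import defaultdict


def map_largest(file_id, numbers):
    """Map step (hypothetical name): emit ('max', largest number in one file)."""
    largest = None  # assumption: None replaces INTEGER.MIN_VALUE from the pseudocode
    for num in numbers:
        if largest is None or num > largest:
            largest = num
    if largest is not None:  # an empty file emits nothing
        yield ('max', largest)


def reduce_largest(key, max_values):
    """Reduce step (hypothetical name): emit the largest of the per-file maxima."""
    yield ('overall_max', max(max_values))


def run_job(files):
    """Simulate the framework in one process: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for file_id, numbers in files.items():
        for key, value in map_largest(file_id, numbers):
            grouped[key].append(value)  # stand-in for the shuffle phase

    results = {}
    for key, values in grouped.items():
        for out_key, out_value in reduce_largest(key, values):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    # Made-up sample input: three "files" of integers.
    sample_files = {"f1": [3, 9, 2], "f2": [-5, -1], "f3": [7]}
    print(run_job(sample_files))
```

Because every map call emits the same key 'max', all per-file maxima end up in a single reduce call, which mirrors the description above.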
2.