Changes between Version 1 and Version 2 of PDAD_Conclusions


Timestamp:
Jan 15, 2010, 6:37:12 PM (14 years ago)
Author:
cristina.basescu
Comment:

--

  • PDAD_Conclusions

    v1 v2  
    55Productivity is also worth discussing. For an engineer who understands what happens underneath, MapReduce is probably the best choice. Developing in Pig is somewhat similar to SQL, which makes it productive for those less familiar with the underlying mechanics, but Pig is still not very stable, and its cryptic error messages may lead to longer development time than MapReduce. Its differences from SQL can also trip up people used to SQL: for instance, grouping is not done in the same statement as applying aggregate functions. Working in MPI at a low level may increase performance, but we think it is not worth the development time. Developers must handle synchronization and the overlapping of I/O and computation themselves, which is not trivial.
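
To illustrate the grouping difference mentioned above: in SQL, grouping and aggregation sit in one statement (e.g. `SELECT user, SUM(bytes) ... GROUP BY user`), whereas in Pig a `GROUP` statement first produces a bag of rows per key and a separate `FOREACH ... GENERATE` then aggregates each bag. The Python sketch below only mimics that two-step shape; the records and the function names `group_by_user` and `sum_per_group` are hypothetical and not taken from the project.

```python
from collections import defaultdict

# Hypothetical (user, bytes) records, made up for illustration only.
records = [("alice", 10), ("bob", 5), ("alice", 7), ("bob", 1), ("carol", 3)]


def group_by_user(rows):
    """Step 1, analogous to Pig's GROUP ... BY: collect a bag of rows per key."""
    bags = defaultdict(list)
    for user, size in rows:
        bags[user].append((user, size))
    return dict(bags)


def sum_per_group(bags):
    """Step 2, analogous to Pig's FOREACH ... GENERATE: aggregate each bag."""
    return {user: sum(size for _, size in bag) for user, bag in bags.items()}


grouped = group_by_user(records)   # no aggregation has happened yet
totals = sum_per_group(grouped)    # aggregation is a separate statement
print(totals)
```

Someone used to SQL may expect the first step to yield totals directly; here, as in Pig, it yields only the grouped rows.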
    66
    7 Our first choice would be MapReduce, or at leastHadoop. The whole framework managing files, fault tolerance, migrating jobs and also offering logs is a productivity friendly environment.
     7Our first choice would be MapReduce, or at least Hadoop. The whole framework managing files, fault tolerance, migrating jobs and also offering logs is a productivity friendly environment.
     8
     9Finally, we present you some images of slaves@work in ED202:
     10
     11map task
     12
     13reduce task