Changes between Version 2 and Version 3 of PDAD_Conclusions


Timestamp:
Jan 15, 2010, 6:41:53 PM
Author:
cristina.basescu
Comment:

--

  • PDAD_Conclusions

    v2 v3  
    11= Conclusions =
    22
    3 From a portability point of view, Hadoop passes the test. We cannot say the same about MPI, which is highly dependend on the runtime system underneath. For example, the proper asynchronous message sending level deppends on the buffers used by the RTS. Also some MPI implementations may offer things others lack.
     3From a portability point of view, !Hadoop passes the test. We cannot say the same about MPI, which is highly dependent on the underlying runtime system (RTS). For example, the appropriate level of asynchronous message sending depends on the buffers used by the RTS. Also, some MPI implementations may offer features that others lack.
    44
    5 It's interesting to discuss productivity. From an engineer's point of view, who knows what happens underneath, MapReduce is probably the best choice. Developing in Pig is in a way similar to sql, so leads to a great productivity for those not so familiar with how things are done, but is still not very stable and the weird error messages may lead to a greater development time than MapReduce. Also, its differences from sql, for instance grouping is not done in the same statement with applying group functions, may get people used to sql quite often into trouble. Working in MPI at a low level may increase performance, but we think the development time it's not worth. Developers must be aware of synchronization and overlapping IO and computation, which is not trivial.
     5It is interesting to discuss productivity. For an engineer who knows what happens underneath, !MapReduce is probably the best choice. Developing in Pig is in some ways similar to SQL, which makes for high productivity among those less familiar with how things are done underneath, but Pig is still not very stable, and its cryptic error messages may lead to longer development time than with !MapReduce. Also, its differences from SQL (for instance, grouping is not done in the same statement as applying the group functions) may often get people used to SQL into trouble. Working in MPI at a low level may increase performance, but we think it is not worth the development time. Developers must be aware of synchronization and of overlapping I/O with computation, which is not trivial.
    66
    7 Our first choice would be MapReduce, or at least Hadoop. The whole framework managing files, fault tolerance, migrating jobs and also offering logs is a productivity friendly environment.
     7Our first choice would be !MapReduce, or at least !Hadoop. The framework as a whole, which manages files, provides fault tolerance, migrates jobs, and offers logs, is a productivity-friendly environment.
    88
    99Finally, here are some images of slaves@work in ED202: