Changes between Version 18 and Version 19 of PDAD_Applications


Timestamp:
Jan 17, 2010, 9:25:35 PM (14 years ago)
Author:
claudiu.gheorghe
Comment:

--

  • PDAD_Applications

    v18 v19  
    63 63 For the d. application, one node is the master, which decides what should be done and who should do it, and the others are slaves. The master divides the event_trace file into equal chunks and asynchronously hands the slaves, in round-robin order, the offsets from which each should read. The master thus avoids the overhead of reading the whole file; however, because an offset can fall in the middle of a line, each slave must first advance from its offset to the end of the current line and only then start reading. The slaves process each chunk and send back a vector that holds, for every combination of job duration class and failure cause, the number of failures registered. The master sums all the vectors it receives, also asynchronously, and stops the execution if it receives nothing for 10 s.
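The chunking and line-alignment steps described above can be sketched as follows. This is a minimal single-process illustration in Python, not the MPI code itself: the function names (`chunk_offsets`, `assign_round_robin`, `read_chunk_lines`) are hypothetical, and the rule that a line belongs to the chunk containing its first byte is an assumption, since the text does not say how chunk ends are handled.

```python
def chunk_offsets(file_size, num_chunks):
    """Split [0, file_size) into num_chunks near-equal (start, end) byte ranges."""
    step = file_size // num_chunks
    bounds = [i * step for i in range(num_chunks)] + [file_size]
    return list(zip(bounds[:-1], bounds[1:]))


def assign_round_robin(chunks, num_slaves):
    """Map each slave rank (0-based, hypothetical) to the chunks it receives."""
    plan = {rank: [] for rank in range(num_slaves)}
    for i, chunk in enumerate(chunks):
        plan[i % num_slaves].append(chunk)
    return plan


def read_chunk_lines(path, start, end):
    """Yield the lines whose first byte lies in [start, end).

    Assumption: a line is owned by the chunk in which it starts, so a
    reader may run past `end` to finish the last line it owns.
    """
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte: if it is a newline, `start` is already a
            # line start; otherwise this skips the rest of a partial line.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            yield line.rstrip(b"\n")
```

With this ownership rule every line is read by exactly one chunk, so the per-chunk failure counts can be summed by the master without double counting.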
    64 64
    65 TODO Claudiu for his apps
       65 The second application implemented in MPI is nodesLocation, which involves a JOIN operation between the ''node'' and ''event_trace'' datasets. The challenge for this application is to make such a JOIN feasible on the large ''seti'' test. In this test, the ''event_trace.tab'' file is 17 GB and ''node.tab'' is about 80 MB.
       66 Our solution builds on the previous one and splits the larger input file into chunks. The smaller file is pre-loaded on the master node, which builds an in-memory !HashMap from it. For each entry read from its file chunk, a slave sends a query to the master, which looks up the given key and returns the corresponding value. Based on that value (the location of the event), each slave counts the occurrences of each location and, in the final step, sends its local results to the master. Because consecutive entries tend to query the same keys, we also added a local !HashMap on each slave that keeps track of the locations already resolved and acts as a cache. Caching is safe in this case because the input data never changes during execution, so no consistency problem can occur.
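The lookup-with-cache scheme can be illustrated with a small single-process sketch in Python. It is only an approximation under stated assumptions: in the real application the lookup is a message exchange between MPI processes, whereas here it is a plain method call, and the names `Master`, `count_locations` and `merge_results`, as well as the shape of the input rows, are hypothetical.

```python
from collections import Counter


class Master:
    """Keeps the small table (node id -> location) in memory and answers lookups."""

    def __init__(self, node_rows):
        self.locations = dict(node_rows)  # the in-memory hash map
        self.queries_served = 0           # only used to show the cache effect

    def lookup(self, node_id):
        self.queries_served += 1
        return self.locations.get(node_id)  # None if the node is unknown


def count_locations(chunk_node_ids, master):
    """Slave side: count events per location for one chunk.

    The local dict acts as a cache; it is safe because the node table
    never changes while the job runs.
    """
    cache = {}
    counts = Counter()
    for node_id in chunk_node_ids:
        if node_id not in cache:
            cache[node_id] = master.lookup(node_id)  # one query per distinct key
        location = cache[node_id]
        if location is not None:
            counts[location] += 1
    return counts


def merge_results(partials):
    """Master side: sum the local results sent by the slaves."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

In this sketch a slave queries the master once per distinct key in its chunk, so the saving grows with the number of repeated keys in the event trace.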
     67