| 1 | = Performance Analysis = |
| 2 | |
| 3 | |
| 4 | This section describes the testing infrastructure, the parameters we considered relevant when running the tests, and the results we obtained. |
| 5 | |
| 6 | == Testing Infrastructure == |
| 7 | |
| 8 | |
| 9 | We used at most four computers from ED202, running Ubuntu 8.04, Hadoop 0.20.1, Pig 0.5.0 and MPICH2. Since a central idea of such distributed systems is to run on commodity hardware, we did not consider the nodes' hardware configuration relevant. |
| 10 | |
| 11 | |
| 12 | The first test scenario uses two nodes. For the Hadoop framework, one of them is the master, which at the same time holds the master roles (namenode - keeps the structure of the file system, jobTracker - keeps track of the jobs' execution), taskTracker - keeps track of the tasks |
| 13 | |