****************** FailMon Quick Start Guide ***********************

This document is a guide to quickly setting up and running FailMon.
For more information and details, please see the FailMon User Manual.

***** Building FailMon *****

Normally, FailMon lies under <hadoop-dir>/src/contrib/failmon, where
<hadoop-dir> is the Hadoop project root folder. To compile it, one
can either run ant for the whole Hadoop project, i.e.:

$ cd <hadoop-dir>
$ ant

or run ant only for FailMon:

$ cd <hadoop-dir>/src/contrib/failmon
$ ant

Either of the above will compile FailMon and place all class files
under <hadoop-dir>/build/contrib/failmon/classes.

By invoking:

$ cd <hadoop-dir>/src/contrib/failmon
$ ant tar

FailMon is packaged as a standalone jar application in
<hadoop-dir>/src/contrib/failmon/failmon.tar.gz.


***** Deploying FailMon *****

There are two ways FailMon can be deployed in a cluster:

a) Within Hadoop, in which case the whole Hadoop package is uploaded
to the cluster nodes. In that case, nothing else needs to be done on
individual nodes.

b) Independently of the Hadoop deployment, i.e., by uploading
failmon.tar.gz to all nodes and uncompressing it. In that case, the
bin/failmon.sh script needs to be edited: the environment variable
HADOOPDIR should point to the root directory of the Hadoop
distribution. Also, the location of the Hadoop configuration files
should be indicated by the property 'hadoop.conf.path' in the file
conf/failmon.properties. Note that these files refer to the HDFS in
which we want to store the FailMon data (which can potentially be
different from the one on the cluster we are monitoring).
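For example, the two edits in option (b) might look as follows; the
paths shown are hypothetical examples, so substitute the actual
location of your Hadoop distribution:

```
# In bin/failmon.sh: point HADOOPDIR at the Hadoop root (example path).
HADOOPDIR=/opt/hadoop

# In conf/failmon.properties: point to the Hadoop configuration files
# of the HDFS that will store the FailMon data.
hadoop.conf.path = /opt/hadoop/conf
```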

We assume that, either way, FailMon is placed in the same directory on
all nodes, which is typical for most clusters. If this is not
feasible, one should create on all nodes of the cluster an identical
symbolic link that points to the FailMon directory of each node.
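As a minimal sketch of the symbolic-link approach, run on each node
(all paths here are hypothetical, using /tmp as a stand-in for the
real per-node directories):

```shell
# Stand-in for the node-local directory where FailMon actually resides
# on this particular node.
mkdir -p /tmp/data/local/failmon

# Create the uniform path that scripts on every node will use;
# -sfn replaces any stale link from a previous run.
ln -sfn /tmp/data/local/failmon /tmp/failmon

# All nodes can now refer to the same path.
readlink /tmp/failmon
```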

One should also edit the conf/failmon.properties file on each node to
set its own property values. However, the default values are expected
to serve most practical cases. Refer to the FailMon User Manual for
details about the various properties and configuration parameters.


***** Running FailMon *****

In order to run FailMon using a node to do the ad-hoc scheduling of
monitoring jobs, one needs to edit the hosts.list file to specify the
list of machine hostnames on which FailMon is to be run. Also, in the
file conf/global.config, the username used to connect to the machines
has to be specified in the property 'ssh.username' (passwordless SSH
is assumed). The path to the FailMon folder has to be specified as
well, in the property 'failmon.dir' (it is assumed to be the same on
all machines in the cluster). Then one only needs to invoke:

$ cd <hadoop-dir>
$ bin/scheduler.py

to start the system.
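Put together, the two files might look like this; the hostnames and
values below are illustrative examples only:

```
# hosts.list: one hostname per line (example names).
node01.example.com
node02.example.com

# conf/global.config (property names from this guide; values are examples):
ssh.username = hadoop
failmon.dir = /opt/failmon
```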


***** Merging HDFS files *****

To merge the files created on HDFS by FailMon, the following command
can be used:

$ cd <hadoop-dir>
$ bin/failmon.sh --mergeFiles

This will concatenate all files in the HDFS folder (pointed to by the
'hdfs.upload.dir' property in the conf/failmon.properties file) into a
single file, which will be placed in the same folder. The location of
the Hadoop configuration files should also be indicated by the
property 'hadoop.conf.path' in the file conf/failmon.properties. Note
that these files refer to the HDFS in which we have stored the FailMon
data (which can potentially be different from the one on the cluster
we are monitoring). Also, the scheduler.py script can be set up to
merge the HDFS files when their number surpasses a configurable limit
(see the conf/global.config file).
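For instance, the upload directory could be configured as follows;
the HDFS path is a hypothetical example:

```
# conf/failmon.properties: HDFS directory where FailMon uploads its
# data files (example path; the merged file is placed here as well).
hdfs.upload.dir = /failmon
```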

Please refer to the FailMon User Manual for more details.