.EN
.TH C4.5 1
.SH NAME
.PP
c4.5rules \- form production rules from unpruned decision trees
.SH SYNOPSIS
.PP
.B c4.5rules
[ \fB-f\fR filestem ]
[ \fB-u\fR ]
[ \fB-v\fR verb ]
[ \fB-F\fR siglevel ]
[ \fB-c\fR cf ]
[ \fB-r\fR redundancy ]
.SH DESCRIPTION
.PP
.I C4.5rules
reads the decision tree or trees produced by C4.5 and generates
a set of production rules from each tree and
from all trees together.
All files read and written by C4.5 are of the form
.I filestem.ext
where
.I filestem
is a file name stem that identifies the induction task and
.I ext
is an extension that defines the type of file.
The Rules program
expects to find a
.B names file
defining class, attribute and attribute value names, a
.B data file
containing a set of objects, each with its class and its value
for each attribute specified, an
.B unpruned file
generated by C4.5 from the
.BR "data file" ,
and (optionally) a
.B test file
containing unseen objects.
.PP
For each tree that it finds, the program generates a set of
pruned rules, and then sifts this set in an attempt to find
the most useful subset of them.  If more than one tree was
found, all subsets are then merged and the resulting composite
set of rules is sifted.  The final set of rules is saved
in a machine-readable format in a
.B rules
file.
Each of the rulesets produced is then evaluated on the
original training data and (optionally) on the test data.
.PP
.SH OPTIONS
.PP
.TP 12
.BI \-f " filestem\^"
Specify the filename stem (default
.BR DF ).
.TP
.B \-u
Evaluate rulesets on unseen cases in file
.IR filestem.test .
.TP
.BI \-v " verb\^"
Set the verbosity level [0-3] (default 0).
.TP
.BI \-F " siglevel\^"
Invoke Fisher's significance test when pruning rules.
If a rule contains a condition whose probability of being irrelevant
is greater than the stated level, the rule is pruned further
(default: no significance testing).
.TP
.BI \-c " cf\^"
Set the confidence level used in forming the pessimistic
estimate of a rule's error rate (default 25%).
.TP
.BI \-r " redundancy\^"
Set the estimated ratio of all attributes to ``sensible'' attributes,
for use when many irrelevant or redundant attributes are included
(default 1).
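.SH EXAMPLES
.PP
A minimal sketch of a typical session, using only the options
described above.
The filestem
.I golf
is hypothetical; the sketch assumes that
.I golf.names
and
.I golf.data
exist in the current directory, and that
.I golf.unpruned
has already been produced by running C4.5 on the same filestem
(see c4.5(1)).
.PP
.nf
.in 8
c4.5rules \-f golf
c4.5rules \-f golf \-u \-v 1
.in 0
.fi
.PP
The first command writes
.I golf.rules
and evaluates the rulesets on the training data.
The second also evaluates them on the unseen cases in
.I golf.test
and raises the verbosity level to 1.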
.PP
.SH FILES
.PP
.in 8
c4.5
.br
c4.5rules
.br
filestem.data
.br
filestem.names
.br
filestem.unpruned  (unpruned trees)
.br
filestem.rules  (production rules)
.br
filestem.test   (unseen data)
.in 0
.PP
.SH SEE ALSO
.PP
c4.5(1), consultr(1)
.SH BUGS