.EN
.TH C4.5 1
.SH NAME
.PP
c4.5rules \- form production rules from unpruned decision trees
.SH SYNOPSIS
.PP
.B c4.5rules
[ \fB-f\fR filestem ]
[ \fB-u\fR ]
[ \fB-v\fR verb ]
[ \fB-F\fR siglevel ]
[ \fB-c\fR cf ]
[ \fB-r\fR redundancy ]
.SH DESCRIPTION
.PP
.I C4.5rules
reads the decision tree or trees produced by C4.5 and generates
a set of production rules from each tree and
from all trees together.
All files read and written by C4.5 are of the form
.I filestem.ext
where
.I filestem
is a file name stem that identifies the induction task and
.I ext
is an extension that defines the type of file.
The Rules program
expects to find a
.B names file
defining class, attribute and attribute value names, a
.B data file
containing a set of objects, each with its class and the value
of every attribute specified, an
.B unpruned file
generated by C4.5 from the
.BR "data file" ,
and (optionally) a
.B test file
containing unseen objects.
.PP
For each tree that it finds, the program generates a set of
pruned rules, and then sifts this set in an attempt to find
the most useful subset of them. If more than one tree was
found, all subsets are then merged and the resulting composite
set of rules is sifted in turn. The final set of rules is saved
in a machine-readable format in a
.B rules
file.
Each of the rulesets produced is then evaluated on the
original training data and (optionally) on the test data.
.PP
.SH OPTIONS
.PP
.TP 12
.BI \-f " filestem"
Specify the filename stem (default
.BR DF ).
.TP
.B \-u
Evaluate rulesets on unseen cases in file
.IR filestem.test .
.TP
.BI \-v " verb"
Set the verbosity level [0-3] (default 0).
.TP
.BI \-F " siglevel"
Invoke Fisher's significance test when pruning rules.
If a rule contains a condition whose probability of being irrelevant
is greater than the stated level, the rule is pruned further
(default: no significance testing).
.TP
.BI \-c " cf"
Set the confidence level used in forming the pessimistic
estimate of a rule's error rate (default 25%).
.TP
.BI \-r " redundancy"
Specify the estimated ratio of all attributes to ``sensible''
attributes, for use when many irrelevant or redundant attributes
are included (default 1).
.PP
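.SH EXAMPLES
.PP
The following is an illustrative sketch only, not taken from the
C4.5 distribution. It assumes a hypothetical filestem
.I golf
for which
.IR golf.names ,
.I golf.data
and
.I golf.test
already exist. The first command runs
.BR c4.5 (1)
to produce
.IR golf.unpruned ;
the second forms rules from it, evaluates them on the unseen cases,
and uses a 10% confidence level with verbosity level 1.
.PP
.nf
.in 8
c4.5 \-f golf
c4.5rules \-f golf \-u \-c 10 \-v 1
.in 0
.fi
.PP
Under these assumptions the final ruleset would be saved in
.IR golf.rules .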
.SH FILES
.PP
.in 8
c4.5
.br
c4.5rules
.br
filestem.data
.br
filestem.names
.br
filestem.unpruned (unpruned trees)
.br
filestem.rules (production rules)
.br
filestem.test (unseen data)
.in 0
.PP
.SH SEE ALSO
.PP
c4.5(1), consultr(1)
.SH BUGS