.EN
.TH C4.5 1
.SH NAME
.PP
c4.5rules \- form production rules from unpruned decision trees
.SH SYNOPSIS
.PP
.B c4.5rules
[ \fB-f\fR filestem ]
[ \fB-u\fR ]
[ \fB-v\fR verb ]
[ \fB-F\fR siglevel ]
[ \fB-c\fR cf ]
[ \fB-r\fR redundancy ]
.SH DESCRIPTION
.PP
.I C4.5rules
reads the decision tree or trees produced by C4.5 and generates
a set of production rules from each tree and
from all trees together.
All files read and written by C4.5 are of the form
.I filestem.ext
where
.I filestem
is a file name stem that identifies the induction task and
.I ext
is an extension that defines the type of file.
The Rules program
expects to find a
.B names file
defining class, attribute and attribute value names, a
.B data file
containing a set of objects, each with its class and the value of
each attribute specified, an
.B unpruned file
generated by C4.5 from the
.BR "data file" ,
and (optionally) a
.B test file
containing unseen objects.
.PP
For each tree that it finds, the program generates a set of
pruned rules, and then sifts this set in an attempt to find
the most useful subset of them. If more than one tree was
found, all subsets are merged and the resulting composite
set of rules is sifted in the same way. The final set of rules is saved
in a machine-readable format in a
.B rules
file.
Each of the rulesets produced is then evaluated on the
original training data and (optionally) on the test data.
.PP
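The file conventions described above can be checked before a run.
The following Python sketch is only an illustration and is not part of C4.5:
the helper name plan_c45rules_run and the stem "golf" in the usage note are hypothetical,
and it assumes the command will be run from the directory that holds the files.
It looks for the three required inputs and adds \-u only when a test file is present.

```python
from pathlib import Path

# File roles taken from the description above; extensions follow the
# filestem.ext convention. Only the test file is optional.
REQUIRED_EXTS = ("names", "data", "unpruned")
OPTIONAL_TEST_EXT = "test"


def plan_c45rules_run(filestem, directory="."):
    """Return (argv, missing) for a c4.5rules run on `filestem`.

    `argv` is the command line to execute from `directory`; `missing`
    lists required input files that were not found there.
    Hypothetical helper for illustration, not part of C4.5.
    """
    base = Path(directory)
    missing = [
        f"{filestem}.{ext}"
        for ext in REQUIRED_EXTS
        if not (base / f"{filestem}.{ext}").is_file()
    ]
    argv = ["c4.5rules", "-f", filestem]
    # Ask for evaluation on unseen cases only when a test file is present.
    if (base / f"{filestem}.{OPTIONAL_TEST_EXT}").is_file():
        argv.append("-u")
    return argv, missing
```

For a stem such as "golf", a non-empty second return value names the inputs
that still have to be produced, for example the unpruned file written by C4.5.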
.SH OPTIONS
.PP
.TP 12
.BI \-f filestem\^
Specify the filename stem (default
.BR DF ).
.TP
.B \-u
Evaluate rulesets on unseen cases in file
.IR filestem.test .
.TP
.BI \-v verb\^
Set the verbosity level [0-3] (default 0).
.TP
.BI \-F siglevel\^
Invoke Fisher's significance test when pruning rules.
If a rule contains a condition whose probability of being irrelevant
is greater than the stated level, the rule is pruned further
(default: no significance testing).
.TP
.BI \-c cf\^
Set the confidence level used in forming the pessimistic
estimate of a rule's error rate (default 25%).
.TP
.BI \-r redundancy\^
If many irrelevant or redundant attributes are included, specify an
estimate of the ratio of all attributes to ``sensible'' attributes (default 1).
.PP
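To give a feel for what the confidence level of \-c controls, the Python sketch
below computes one common form of pessimistic error estimate: the upper limit
of a binomial confidence interval for a rule that misclassifies some of the
cases it covers. This is an assumption about the general technique, not a
transcription of the C4.5 sources, which may use an approximation and so give
slightly different values. The function names are hypothetical, and cf is
passed as a fraction (0.25) rather than the percentage given on the command line.

```python
from math import comb


def binomial_cdf(errors, n, p):
    """P(X <= errors) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(errors + 1))


def pessimistic_error(errors, n, cf=0.25, tol=1e-9):
    """Upper limit p such that P(X <= errors | n, p) = cf, found by bisection.

    Illustrative sketch only; assumes the estimate is an exact binomial
    upper confidence limit.
    """
    if not 0 <= errors <= n or n <= 0:
        raise ValueError("need 0 <= errors <= n and n > 0")
    if errors == n:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The CDF decreases as p grows, so a CDF above cf means p is too small.
        if binomial_cdf(errors, n, mid) > cf:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions a rule covering 6 cases with no errors gets an
estimate of about 0.206 at cf = 0.25, and a smaller cf gives a larger, more
pessimistic estimate.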
.SH FILES
.PP
.in 8
c4.5
.br
c4.5rules
.br
filestem.data
.br
filestem.names
.br
filestem.unpruned (unpruned trees)
.br
filestem.rules (production rules)
.br
filestem.test (unseen data)
.in 0
.PP
.SH SEE ALSO
.PP
c4.5(1), consultr(1)
.SH BUGS