c4.5rules.1 @ 22

.EN
.TH C4.5 1
.SH NAME
.PP
c4.5rules \- form production rules from unpruned decision trees
.SH SYNOPSIS
.PP
.B c4.5rules
[ \fB\-f\fR filestem ]
[ \fB\-u\fR ]
[ \fB\-v\fR verb ]
[ \fB\-F\fR siglevel ]
[ \fB\-c\fR cf ]
[ \fB\-r\fR redundancy ]
.SH DESCRIPTION
.PP
.I C4.5rules
reads the decision tree or trees produced by C4.5 and generates
a set of production rules from each tree and
from all trees together.
All files read and written by C4.5 are of the form
.I filestem.ext
where
.I filestem
is a file name stem that identifies the induction task and
.I ext
is an extension that defines the type of file.
The Rules program
expects to find a
.B names file
defining class, attribute and attribute value names, a
.B data file
containing a set of objects whose class and attribute values
are specified, an
.B unpruned file
generated by C4.5 from the
.BR "data file" ,
and (optionally) a
.B test file
containing unseen objects.
.PP
For each tree that it finds, the program generates a set of
pruned rules, and then sifts this set in an attempt to find
the most useful subset of them. If more than one tree was
found, all subsets are merged and the resulting composite
set of rules is then sifted. The final set of rules is saved
in a machine-readable format in a
.B rules
file.
Each of the rulesets produced is then evaluated on the
original training data and (optionally) on the test data.
.PP
.SH OPTIONS
.PP
.TP 12
.BI \-f " filestem"
Specify the filename stem (default
.BR DF ).
.TP
.B \-u
Evaluate rulesets on unseen cases in file
.IR filestem.test .
.TP
.BI \-v " verb"
Set the verbosity level [0-3] (default 0).
.TP
.BI \-F " siglevel"
Invoke Fisher's significance test when pruning rules.
If a rule contains a condition whose probability of being irrelevant
is greater than the stated level, the rule is pruned further
(default: no significance testing).
.TP
.BI \-c " cf"
Set the confidence level used in forming the pessimistic
estimate of a rule's error rate (default 25%).
.TP
.BI \-r " redundancy"
Specify the estimated ratio of all attributes to ``sensible''
attributes, for use when many irrelevant or redundant attributes
are included (default 1).
.PP
.SH FILES
.PP
.in 8
c4.5
.br
c4.5rules
.br
filestem.data
.br
filestem.names
.br
filestem.unpruned (unpruned trees)
.br
filestem.rules (production rules)
.br
filestem.test (unseen data)
.in 0
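.SH EXAMPLES
.PP
The invocations below are an illustrative sketch only, using the
options documented above. The filestem
.I golf
is hypothetical and stands for whatever stem names your own files;
the sketch assumes that
.I golf.names, golf.data
and
.I golf.unpruned
already exist, and that
.I golf.test
exists for the second command.
.PP
.\" Hypothetical filestem "golf"; only the -f, -u and -v options described in OPTIONS are used.
.nf
.in 8
c4.5rules \-f golf
c4.5rules \-f golf \-u \-v 1
.in 0
.fi
.PP
The first command forms rules from the unpruned trees and evaluates
them on the training data; the second also evaluates the rulesets on
the unseen cases and raises the verbosity level to 1.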
.PP
.SH SEE ALSO
.PP
c4.5(1), consultr(1)
.SH BUGS
