Modifications since book was published:

--------------------------------------------------------------------------------
(1) 17 August 1992: fixed bug in prunerule.c

    In routine Satisfies about line 434:
        moved statement
            t->Outcome = -1;
        to before the for loop
--------------------------------------------------------------------------------
(2) 2nd Feb 1993: fixed errors reported by Dick Jackson

    c4.5rules.c line 34: changed ';' to ','
    getnames.c: moved CopyString() declaration to head
--------------------------------------------------------------------------------
(3) 19th June 1993: fixed error reported by Guillermo Irisarri

    ANSI C doesn't like "exit()" with no args in average.c, xval-prep.c
--------------------------------------------------------------------------------
(4) 5th July 1993: fixed bug in c4.5rules reported by Ray Mooney

    SaveRules() was invoked before EvaluateRulesets(), but the latter
    can delete globally unhelpful rules. SaveRules() was moved to
    after evaluation of rules on training data
    (Note: this change affects only the use of consultr with the
    saved rules; experimental results are unaltered.)
--------------------------------------------------------------------------------
(5) 13th July 1993: changed rules.c to improve printing with -s option

    When tests on discrete attributes use value groups, the standard
    form of test is
        "<attribute> in {<value>, <value>, ...}".
    If there is only one value, this should appear as
        "<attribute> = <value>".
    This has already been changed in trees; a similar change has now
    been made to function PrintCondition() in rules.c
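
    For illustration only, the printing rule in (5) can be sketched as
    below. This is not the code in rules.c: the function FormatTest and
    its parameters are hypothetical, and the real PrintCondition() works
    on C4.5's own data structures.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of C4.5: writes a test on a discrete
   attribute into buf.  A single value prints as "<attribute> = <value>";
   a group of values prints as "<attribute> in {<value>, <value>, ...}". */
void FormatTest(char *buf, size_t size, const char *att,
                const char *values[], int n)
{
    size_t used;
    int i;

    if (size == 0) return;

    if (n == 1) {
        snprintf(buf, size, "%s = %s", att, values[0]);
        return;
    }

    snprintf(buf, size, "%s in {", att);
    for (i = 0; i < n; i++) {
        used = strlen(buf);
        /* Separate values with ", " after the first one. */
        snprintf(buf + used, size - used, "%s%s", i ? ", " : "", values[i]);
    }
    used = strlen(buf);
    snprintf(buf + used, size - used, "}");
}
```

    Called with one value the sketch produces the "=" form; with two or
    more values it produces the "in {...}" form.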
--------------------------------------------------------------------------------
(6) 28th July 1993: killed very large confusion matrices

    confmat.c line 19: added copout if number of classes > 20
--------------------------------------------------------------------------------
(7) 9th September 1993: fixed problems notified by Mike Jankulak.

    * Added checks for reasonable parameter values in c4.5, c4.5rules.
      Check in GetNames() for discrete N: N must be at least 2.

    * consult, consultr don't work with attributes of type discrete N !
      Added routines in trees.c to save and restore values of attributes
      of this type when saving / reading trees.
      Modified rules.c to invoke these routines when saving / reading
      rulesets.

      NOTE: old .tree, .unpruned and .rules files must be regenerated
      if they are to be used by the modified programs.
--------------------------------------------------------------------------------
(8) 3rd November 1993; problem notified by Jason Catlett

    c4.5rules prints an incorrect confusion matrix for the training
    set when rules are dropped. Altered testrules.c.
--------------------------------------------------------------------------------
(9) 21st December 1993; tidying up only

    Changed definition of Log() in defns.i so that argument of log()
    is guaranteed float.
--------------------------------------------------------------------------------
(10) 5th February 1994; problem notified by George John

    Calculation of Gain in build.c can be negative rather than zero
    due to FP rounding. Changed tests "Gain[Att] >= 0" to
    "Gain[Att] > -Epsilon".
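
    A minimal sketch of the tolerance test in (10) follows. It is not
    taken from build.c: the function GainAcceptable is hypothetical, and
    the value given to Epsilon here is an assumption made for the example.

```c
/* Assumed tolerance for this sketch; the value defined in C4.5's own
   headers may differ. */
#define Epsilon 1E-3

/* Hypothetical helper: returns 1 if a computed gain should be treated
   as non-negative.  A gain that is mathematically zero can come out
   slightly negative in floating point, so a test of the form
   "gain >= 0" would wrongly reject it. */
int GainAcceptable(double gain)
{
    return gain > -Epsilon;
}
```

    With this test a gain of -1e-9 (rounding noise) is accepted, while a
    clearly negative value such as -0.5 is still rejected.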
--------------------------------------------------------------------------------
(11) 25th May 1994; problem notified by Ronny Kohavi

    Similar problem in info.c with -g option. Changed test
    "ThisGain > 0" to "ThisGain > -Epsilon".
--------------------------------------------------------------------------------
(12) 30th May 1994; tidying up

    Removed explicit Outcomes field from rules. This simplifies
    the code somewhat with little decrease in efficiency.
--------------------------------------------------------------------------------
(13) 18th July 1994; problem notified by Ronny Kohavi

    Average gain evaluated incorrectly when all attributes have
    many discrete values. In build.c, introduced MultiVal to check
    for this contingency.
--------------------------------------------------------------------------------
(14) 18th-20th July 1994; modifications to siftrules.c

    (a) Changed coding of exceptions:
        * added cost of encoding total number of errors to cost of
          identifying false positives and false negatives.
        * applied penalty to non-representative theories as described
          in my ML'94 paper.
    (b) Introduced a new form of local greedy search for finding
        good subsets when there are more than 10 rules. This is
        faster than simulated annealing and replaces it as the default:
        simulated annealing is still available via a new option -a.
--------------------------------------------------------------------------------

*********************** Release 6 July 1994 *********************************

--------------------------------------------------------------------------------
(15) 11th August 1994; bug reported by KaiMing Ting and Zijian Zheng

    In subset.c, DiscrKnownBaseInfo() can be called when KnownItems = 0.
    Trapped such calls.
--------------------------------------------------------------------------------
(16) 21st January 1995; bug reported by Tom Fawcett

    Very large trees can cause the short int in TreeSize to overflow.
    Changed to int.
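
    To illustrate the limit behind (16): standard C only guarantees that
    SHRT_MAX is at least 32767, and on common platforms it is exactly
    that, so a node count beyond it cannot be held in a short int. The
    helper FitsInShort below is hypothetical and is not part of trees.c.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if a node count can be stored in a
   short int without overflow, 0 otherwise. */
int FitsInShort(long nodes)
{
    return nodes >= SHRT_MIN && nodes <= SHRT_MAX;
}
```

    A count of SHRT_MAX fits; one more than SHRT_MAX does not, which is
    the case that switching the counter to int avoids for trees of
    realistic size.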
--------------------------------------------------------------------------------
(17) 6th April 1995; bug reported by Ronny Kohavi

    Exit status not being set properly. Modified the following:
    c4.5.c, c4.5rules.c, consult.c, consultr.c.
--------------------------------------------------------------------------------
(18) 19th April 1995; bug reported by Kim Horn

    For very small values of CF less than 0.1%, confidence levels
    are computed erratically. Modified stats.c.
--------------------------------------------------------------------------------
(19) June 1995: modifications to siftrules.c (again!)

    Scheme described above in 14(a) amended in line with my ML'95
    paper, available by anonymous ftp from ftp.cs.su.oz.au, directory
    pub/ml, file q.ml95.ps.Z.
--------------------------------------------------------------------------------

*********************** Release 7 June 1995 *********************************

--------------------------------------------------------------------------------
(20) 6th July 1995; bug reported by Andrew Taylor

    Tree printing can have problems when attribute names are very long.
    Modified trees.c.
--------------------------------------------------------------------------------
(21) 18th October 1995: modifications to contin.c

    Altered the calculation of gain for continuous attributes (described
    in "Improved Use of Continuous Attributes in C4.5").
--------------------------------------------------------------------------------

*********************** Release 8 October 1995 ******************************

--------------------------------------------------------------------------------
(22) 26th Feb 1996; minor glitches reported by Ron Kohavi of SGI

    Fn declared extern in rules.c
    -lm removed from consult, consultr, xval-prep
--------------------------------------------------------------------------------