Changes between Version 38 and Version 39 of Parallel-DT


Timestamp:
Jan 18, 2010, 9:33:28 PM (14 years ago)
Author:
andrei.minca
Comment:

--

  • Parallel-DT

    v38 v39  
    1212 * Project Implementation Details: '''[wiki:Details]'''
    1313
    14 ''' Serial DT process '''
     14== Serial DT process ==
    1515 Most of the existing induction-based algorithms, including C4.5, which is analysed here, use Hunt's method as the basic algorithm. Here is a recursive description of Hunt's method for constructing a decision tree from a set T of training cases with classes denoted {C1, C2, C3, ..., Ck}:
    1616 * ''' Case 1 ''' T contains cases all belonging to a single class Cj. The decision tree for T is a leaf identifying class Cj.
     
    2424[[Image(Outlook&Humidity.jpg)]]
    2525
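The recursion in Hunt's method can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function name `hunt`, the dictionary-based tree, the toy cases and the choice of always testing the first remaining attribute are all assumptions made for the example (C4.5 itself picks the test with an information-based criterion).

```python
from collections import Counter, defaultdict

def hunt(cases, attributes, default=None):
    """Build a tree from cases given as (features_dict, class_label) pairs.

    Hypothetical sketch: the test attribute is simply the first one left,
    which is NOT how C4.5 selects its tests.
    """
    # No cases at all: fall back to the class supplied by the caller.
    if not cases:
        return {"leaf": default}
    labels = [label for _, label in cases]
    majority = Counter(labels).most_common(1)[0][0]
    # Case 1: every case belongs to a single class, so T becomes a leaf.
    # The same leaf is produced when no attribute is left to test.
    if len(set(labels)) == 1 or not attributes:
        return {"leaf": majority}
    # Mixed classes: partition T on one attribute and recurse on each part.
    attr, rest = attributes[0], attributes[1:]
    partitions = defaultdict(list)
    for features, label in cases:
        partitions[features[attr]].append((features, label))
    children = {value: hunt(subset, rest, majority)
                for value, subset in partitions.items()}
    return {"test": attr, "children": children}

# Toy data, invented for this example.
cases = [
    ({"outlook": "sunny", "humidity": "high"}, "dont_play"),
    ({"outlook": "sunny", "humidity": "normal"}, "play"),
    ({"outlook": "overcast", "humidity": "high"}, "play"),
]
tree = hunt(cases, ["outlook", "humidity"])
```

With this toy data the root tests `outlook`, the `overcast` branch is a leaf, and the `sunny` branch needs a second test on `humidity`.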
    26 ''' Parallel approaches '''
     26== Parallel approaches ==
    2727 * '''Synchronous Tree Construction  - Depth First Expansion Strategy''' - the one that we implemented
    2828
    2929In this approach, all processors construct a decision tree synchronously by sending and receiving class distribution information of local data. Major steps for the approach:
    30  *
     30 {
    3131   * select a node to expand according to a decision tree expansion strategy (e.g. Depth-First or Breadth-First), and call that node the current node. At the beginning, the root node is selected as the current node
    3232   * for each data attribute, collect class distribution information of the local data at the current node
     
    3434   * simultaneously compute the entropy gains of each attribute at each processor and select the best attribute for child node expansion
    3535   * depending on the desired branching factor of the tree, create child nodes for the same number of partitions of attribute values, and split the training cases accordingly
    36 
     36 }
    3737[[Image(SyncronusTreeConstruction-DepthFirstExpansionStrategy.jpg)]]
    3838
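One expansion step of the synchronous approach can be sketched as follows. This is a single-process simulation written only for illustration: each "processor" is a plain Python list of cases, and the exchange of class distribution information is imitated by summing the local tables, where a real run would use a message-passing reduction. The names `local_counts`, `entropy` and `best_attribute`, and the toy data, are assumptions and do not come from the project's code.

```python
import math
from collections import Counter

def local_counts(cases, attributes):
    """Class distribution of one processor's local data at the current node,
    keyed by (attribute, attribute value, class label)."""
    counts = Counter()
    for features, label in cases:
        for attr in attributes:
            counts[(attr, features[attr], label)] += 1
    return counts

def entropy(class_counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    total = sum(class_counts)
    return -sum(c / total * math.log2(c / total) for c in class_counts if c)

def best_attribute(partitions, attributes):
    """Combine the local tables, then compute the entropy gain per attribute."""
    # Stand-in for the exchange between processors: sum the local tables.
    global_counts = sum((local_counts(p, attributes) for p in partitions),
                        Counter())
    gains = {}
    for attr in attributes:
        by_value = {}             # attribute value -> Counter of class labels
        class_totals = Counter()  # class label -> count at the current node
        for (a, value, label), n in global_counts.items():
            if a == attr:
                by_value.setdefault(value, Counter())[label] += n
                class_totals[label] += n
        total = sum(class_totals.values())
        remainder = sum(sum(c.values()) / total * entropy(c.values())
                        for c in by_value.values())
        gains[attr] = entropy(class_totals.values()) - remainder
    return max(gains, key=gains.get), gains

# Toy data split over two simulated processors, invented for this example.
p0 = [({"outlook": "sunny", "windy": "no"}, "play"),
      ({"outlook": "rain", "windy": "no"}, "dont_play")]
p1 = [({"outlook": "sunny", "windy": "yes"}, "play"),
      ({"outlook": "rain", "windy": "yes"}, "dont_play")]
best, gains = best_attribute([p0, p1], ["outlook", "windy"])
```

Because every processor would hold the same combined table after the exchange, each one can compute the same gains and reach the same choice of attribute without further communication, which is what lets the processors expand the tree in lockstep.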