wiki:Details

Version 5 (modified by alexandru.sorici, 14 years ago) (diff)

--

Project implementation details

The parallel version of the decision tree inference process is based on the original C4.5 (revision 8) implementation by Quinlan.

The algorithm is written in C and the parallel version was created using OpenMP. The functions the algorithm uses are divided amongst several source files each having a representative name for the type of functions it contains.

Thus we have the following structure:

  • c4.5.c - contains main function
  • besttree.c - functions that implement different tree forming modes:
    • single tree in batch mode (all examples at once)
    • window mode - get samples of increasing window size from original data; compute several trees; select the best one
  • build.c - contains function that implement the basic structure of the tree forming algorithm
  • discr.c - contains functions that build a tree with discrete (nominal) attributes
  • contin.c - contains functions that build that also contains continuous (numerical) attributes
  • info.c - contains functions that compute auxiliary information such as information gain of a certain attribute
  • getnames.c, getdata.c - files that implement functions for reading input data from text files with a specified format.

The Src folder also contains other files, but those are irrelevant with regard to the tree inference process.

Main modifications to the serial version

Our test dataset (poker-hand) is composed of discrete attributes only so the files that were modified to obtain the parallel functionality are the following:

  • c4.5.c
  • besttree.c
  • build.c
  • discr.c

Mainly, all modifications include the addition of a modified version of the existing functions. The new functions, thus provide an alternative instruction flow which allows for parallelism. The modifications are as follows:

  • in c4.5.c - the if statement that determines whether batch mode is being used now calls OneTree_discr() instead of OneTree()
  • in besttree.c - OneTree_discr() calls FormTree_discr()
  • in build.c - functionality has been added to the InitialiseTreeData() function
                                nrThreads = omp_get_max_threads();
				Freq_discr = (ItemCount***) calloc (nrThreads, sizeof(ItemCount**));
				ValFreq_discr = (ItemCount**) calloc (nrThreads, sizeof(ItemCount*));
				UnknownRate_discr = (float**) calloc (nrThreads, sizeof(float*));
				for(i = 0; i < nrThreads; i++)
                                {
					Freq_discr[i] = (ItemCount **) calloc(MaxDiscrVal + 1, sizeof(ItemCount *));
					ForEach(v, 0, MaxDiscrVal)
                                        {
						Freq_discr[i][v] = (ItemCount *) calloc(MaxClass + 1, sizeof(ItemCount));
					}

					ValFreq_discr[i] = (ItemCount *) calloc(MaxDiscrVal + 1, sizeof(ItemCount));
					UnknownRate_discr[i] = (float *) calloc(MaxAtt + 1, sizeof(float));
				}
  • auxiliary work structures are allocated for the maximal number of threads possible.
  • thus every thread will have it's own heap allocated space to perform necessary computations whithout entering any race conditions (these structures need to be private to every thread).
  • in build.c - the FormTree_discr() function contains the #pragma omp parallel for construct that computes attribute gains in parllel.

The directive also has a reduction clause on two variables: AvGain, Possible. The variables are necessary to compute the average gain of the attribute. The FormTree_discr() function calls EvalAttributeDisc_discr() for every attribute in the for loop. This method is similar to the Synchronous Tree Construction approach described in theory, where the team of threads cooperates to expand (compute the best attribute) the tree at each step.

  • in discr.c - the EvalAttributeDisc_discr() function evaluates each attribute. In discr.c each function ending in _discr is our modified version of the original function. The modifications are minor and mostly only concern the number and type of parameters passed to the function.

Project Results

As our tests show, the dataset used to test the algorithm is not that well suited to parallelization because the amount of work done for attribute gain computation is not very big. Tests show that in the serial version the inclusive time spent in FormTree() equal 7.202 seconds.

In the parallel version (with 2 threads) the time drops to about 4.2 seconds. Yet because the algorithm uses recursive calls to the FormTree() function, the thread team is started at each recursive call, and the overhead caused by this + some necessary thread synchronizations compensate for the gain in execution speed, such that overall, there is no clearly noticeable speedup.

The attached files show the profiling results of the SunStudio? 12.1 Analyzer on the serial and our 2-threaded parallel versions of the C4.5 algorithm.

  • [ serial version]
  • [ our parallel version]

Attachments (2)

Download all attachments as: .zip