Page 52 - FULL REPORT 30012024
P. 52

choosing the most accurate algorithm and figuring out the support and

                                       confidence for each rule the DT generates might be difficult.


                                       Overall, DT are a useful tool in data mining because they offer models

                                       that are easy to understand and trustworthy outcomes for a range of
                                       applications.  Optimising  algorithm  selection,  figuring  out  rule

                                       support  and  confidence,  and  improving  DT  accuracy  in  complex

                                       circumstances are the challenges.


                                iv.    Random Forest (RF)


                                       Random forest (RF) is indeed an algorithm based on machine learning
                                       that utilizes the concept of decision trees. It excels at making highly

                                       accurate  predictions  of  outcomes,  especially  with  big  datasets
                                       (Sulaiman et al., 2022). RF offers a solution to challenging issues by

                                       merging  various  decision  trees.  By  lowering  dataset  lifting  and

                                       boosting precision, RF solves the shortcomings of the decision tree
                                       technique. Each tree in the RF functions as a weak learner, but when

                                       they are merged, they create a strong learner with increased predictive
                                       potential. One of RF's advantages is how quickly and effectively it

                                       can handle big, unbalanced datasets. It performs better at forecasting

                                       performance than other methods, such as SVM and decision trees. A
                                       robust algorithm, RF also guards against overfitting and can recognise

                                       interactions between variables.


                                       Because only a portion of predictors are taken into account for each
                                       split, RF may solve significantly more complex issues before slowing

                                       down. The size of the candidate feature set, however, affects the RF's

                                       accuracy. There is no consistent pattern in the ideal size of the feature
                                       collection, which changes from dataset to dataset.











                                                               35
   47   48   49   50   51   52   53   54   55   56   57