Page 52 - FULL REPORT 30012024

choosing the most accurate algorithm and figuring out the support and
confidence for each rule the DT generates might be difficult.

Overall, DTs are a useful tool in data mining because they offer models
that are easy to understand and reliable results across a range of
applications. The remaining challenges are optimising algorithm
selection, determining rule support and confidence, and improving DT
accuracy in complex scenarios.
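
To make rule support and confidence concrete, the following minimal sketch computes both for a single IF-THEN rule read off a tree. It assumes the usual association-rule definitions (support is the share of all records that satisfy both the condition and the predicted class; confidence is the share of records satisfying the condition that also have the predicted class). The toy records and the function name rule_support_confidence are hypothetical and serve only as illustration.

```python
def rule_support_confidence(records, antecedent, consequent):
    """Return (support, confidence) for the rule IF antecedent THEN consequent.

    Assumed definitions (association-rule style):
      support    = records matching antecedent AND consequent / all records
      confidence = records matching antecedent AND consequent / records matching antecedent
    """
    covered = [r for r in records if antecedent(r)]
    correct = [r for r in covered if consequent(r)]
    support = len(correct) / len(records)
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical toy data, not taken from any real dataset.
records = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
]

# Rule: IF outlook == "sunny" THEN play == "no"
support, confidence = rule_support_confidence(
    records,
    antecedent=lambda r: r["outlook"] == "sunny",
    consequent=lambda r: r["play"] == "no",
)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Here the rule covers three of the five records and is correct for two of them. Other definitions of support (for example, coverage of the condition alone) are also in use, so the choice should be stated when rules are reported.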

iv. Random Forest (RF)

Random forest (RF) is a machine learning algorithm built on decision
trees. It excels at making highly accurate predictions, especially on
large datasets (Sulaiman et al., 2022). RF tackles difficult problems
by combining many decision trees. By reducing overfitting to the
dataset and increasing precision, RF addresses the shortcomings of the
single decision tree technique. Each tree in the RF functions as a weak
learner, but when the trees are combined they form a strong learner
with greater predictive power. One of RF's advantages is how quickly
and effectively it can handle large, unbalanced datasets. It often
outperforms other methods, such as SVM and decision trees, in
predictive accuracy. RF is also a robust algorithm that guards against
overfitting and can recognise interactions between variables.
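
The weak-learner argument can be illustrated with a small calculation. The sketch below is not an RF implementation: it assumes an idealised setting in which every tree is correct with the same probability and the trees err independently, which real forests only approximate. Under that assumption, the accuracy of a majority vote follows directly from the binomial distribution. The function name majority_vote_accuracy and the value 0.6 are illustrative choices.

```python
from math import comb


def majority_vote_accuracy(n_trees, p_correct):
    """Probability that more than half of n_trees voters are correct.

    Assumes each tree is correct with probability p_correct and that the
    trees' errors are independent (an idealisation of a real forest).
    """
    needed = n_trees // 2 + 1  # strict majority
    return sum(
        comb(n_trees, k) * p_correct**k * (1 - p_correct) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )


# Each tree is only slightly better than chance (0.6 on a binary task).
for n_trees in (1, 11, 101):
    print(n_trees, round(majority_vote_accuracy(n_trees, 0.6), 3))
```

With an odd number of trees, the vote's accuracy rises as trees are added whenever each tree is better than chance. In practice the trees in a forest are correlated, so the gain is smaller than this idealised figure suggests.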

Because only a subset of predictors is considered at each split, RF
can handle considerably larger problems before slowing down. The size
of the candidate feature set, however, affects RF's accuracy. The
ideal size follows no consistent pattern and varies from dataset to
dataset.
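
The per-split feature sampling described above can be sketched as follows. The starting sizes used here (the square root of the number of predictors for classification, one third for regression) are widely used heuristics and are an assumption of this sketch, not a recommendation from the cited work. Since the best size varies by dataset, it is normally tuned, for example by cross-validation. The names default_mtry and candidate_features are hypothetical.

```python
import math
import random


def default_mtry(n_features, task="classification"):
    """Common heuristic starting point for the candidate feature set size."""
    if task == "classification":
        return max(1, math.floor(math.sqrt(n_features)))
    return max(1, n_features // 3)  # regression heuristic


def candidate_features(n_features, mtry, rng):
    """Draw the feature indices considered at one split, without replacement."""
    return sorted(rng.sample(range(n_features), mtry))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_features = 16
mtry = default_mtry(n_features)

# Every split draws a fresh subset, so only mtry of the 16 predictors
# need to be evaluated at each node.
for split in range(3):
    print(f"split {split}: features {candidate_features(n_features, mtry, rng)}")
```

A smaller candidate set makes each split cheaper and the trees less correlated, but each tree becomes weaker on its own. A larger set has the opposite effect.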
