Page 52 - FULL REPORT 30012024
choosing the most accurate algorithm and figuring out the support and
confidence for each rule the DT generates might be difficult.
Overall, DTs are a useful tool in data mining because they offer models
that are easy to understand and reliable outcomes for a range of
applications. The remaining challenges are optimising algorithm
selection, determining rule support and confidence, and improving DT
accuracy in complex circumstances.
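To make the notions of rule support and confidence concrete, the sketch below computes both for a single IF-THEN rule of the kind a DT produces. The records, the attribute names (outlook, windy) and the helper rule_support_confidence are hypothetical, and the definitions used are one common convention: support is the share of all records that satisfy the rule's conditions and carry the predicted class, and confidence is the share of records satisfying the conditions that carry the predicted class.

```python
# Hypothetical toy records: each is (features, class_label).
records = [
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "sunny", "windy": True}, "stay"),
    ({"outlook": "rainy", "windy": True}, "stay"),
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "rainy", "windy": False}, "play"),
]


def rule_support_confidence(records, conditions, predicted_label):
    """Return (support, confidence) for IF conditions THEN predicted_label.

    support    = share of all records matching the conditions and the label
    confidence = share of records matching the conditions that have the label
    (Assumed definitions; other conventions exist.)
    """
    # Labels of every record covered by the rule's conditions.
    covered = [
        label
        for features, label in records
        if all(features.get(key) == value for key, value in conditions.items())
    ]
    correct = sum(1 for label in covered if label == predicted_label)
    support = correct / len(records) if records else 0.0
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


support, confidence = rule_support_confidence(
    records, {"outlook": "sunny", "windy": False}, "play"
)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Applying the same helper to every root-to-leaf path of a tree would give a per-rule table of support and confidence, which is one way to compare the rules a DT generates.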
iv. Random Forest (RF)
Random forest (RF) is a machine learning algorithm built on decision
trees. It excels at making highly accurate predictions, especially
with large datasets (Sulaiman et al., 2022). RF tackles challenging
problems by combining many decision trees. By reducing overfitting
and improving precision, RF addresses the shortcomings of a single
decision tree. Each tree in the RF functions as a weak learner, but
when the trees are combined they form a strong learner with greater
predictive power. One of RF's advantages is how quickly and
effectively it can handle large, unbalanced datasets. It also tends to
predict more accurately than other methods, such as SVM and
individual decision trees. As a robust algorithm, RF guards against
overfitting and can recognise interactions between variables.
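The idea that many weak learners combine into a strong one can be illustrated with a small calculation. The sketch below is not a random forest implementation; it assumes a binary classification task in which every tree is correct with the same probability and the trees err independently of one another, which real forests only approximate. Under those assumptions the accuracy of a strict majority vote follows from the binomial distribution. The function name majority_vote_accuracy and the value 0.6 are illustrative choices.

```python
from math import comb


def majority_vote_accuracy(n_trees, p_correct):
    """Probability that a strict majority of n_trees voters is correct.

    Assumes each tree is right with probability p_correct and that the
    trees make their errors independently (an idealisation).
    """
    needed = n_trees // 2 + 1  # smallest vote count that forms a strict majority
    return sum(
        comb(n_trees, k) * p_correct**k * (1 - p_correct) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )


# Odd ensemble sizes avoid tied votes.
for n in (1, 11, 101):
    print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

With individually weak voters (60% accuracy here), the majority vote becomes more reliable as the ensemble grows. In practice the trees are correlated, so the gain is smaller than this idealised calculation suggests.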
Because only a subset of the predictors is considered at each split,
RF can handle considerably more complex problems before slowing
down. The size of the candidate feature set, however, affects RF's
accuracy. The ideal size follows no consistent pattern and varies
from dataset to dataset.
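The per-split sampling of predictors can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular library: the helper names gini and best_split, the parameter name max_features, the use of Gini impurity as the split criterion, and the toy data are all hypothetical choices made for the example.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(X, y, max_features, rng):
    """Pick the best (feature, threshold) among a random subset of features.

    Only max_features randomly chosen columns of X are examined.
    Returns (feature_index, threshold, weighted_gini, candidate_features).
    """
    n_features = len(X[0])
    candidates = rng.sample(range(n_features), max_features)
    best = (None, None, float("inf"))
    for f in candidates:
        for threshold in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= threshold]
            right = [label for row, label in zip(X, y) if row[f] > threshold]
            if not left or not right:
                continue  # skip splits that leave one side empty
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[2]:
                best = (f, threshold, score)
    return best + (candidates,)


# Hypothetical toy data: column 0 separates the classes, columns 1-2 are noise.
X = [[1, 5, 3], [2, 1, 9], [3, 7, 2], [8, 2, 8], [9, 6, 1], [10, 3, 7]]
y = ["a", "a", "a", "b", "b", "b"]

rng = random.Random(0)  # fixed seed so the sampled subsets are repeatable
for m in (1, 2, 3):
    feature, threshold, score, candidates = best_split(X, y, m, rng)
    print(f"max_features={m}: candidates={candidates}, "
          f"chosen={feature}, gini={score:.3f}")
```

A small candidate set makes each split cheaper to evaluate but can miss the informative predictor, while a large one always finds it at higher cost. This trade-off is one reason the best subset size has to be tuned per dataset.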