Page 108 - FULL REPORT 30012024
Figure 4.26 Code snippet for the data cleaning.
Column names were standardised and extraneous columns were removed.
The datasets were then merged on their common columns to give a
single, integrated view of the data. Missing values in the 'Total Death'
measure were filled by interpolation within each country. Rows that
still lacked key measurements were then dropped to preserve the
integrity of the data. Finally, the merged dataset was joined with
statistics on daily smoking prevalence obtained from
OurWorldInData.org, which were used without separate cleaning. These
steps produced a single cleaned dataset, saved as "merged_data.csv",
ready for the next step: loading it into Power BI for visualization.
Figure 4.27 shows the merged dataset opened in Microsoft Excel.
Figure 4.27 The merged dataset.
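
The cleaning steps described above can be sketched in pandas roughly as follows. This is an illustrative sketch only, not the project's actual script: apart from 'Total Death' and "merged_data.csv", every column name ("Entity", "Code", "Country", "Year", "Total Cases", "Daily Smoking Prevalence"), the function name clean_and_merge, and the sample values are assumptions made for the example. It also assumes linear interpolation that fills only gaps between known values, which the text does not specify.

```python
import pandas as pd


def clean_and_merge(deaths, cases, smoking):
    """Hypothetical sketch of the cleaning pipeline; column names are assumed."""
    # 1. Standardise column names and drop extraneous columns.
    deaths = deaths.rename(columns={"Entity": "Country", "Deaths": "Total Death"})
    cases = cases.rename(columns={"Entity": "Country"})
    smoking = smoking.rename(columns={"Entity": "Country"})
    deaths = deaths.drop(columns=["Code"], errors="ignore")
    cases = cases.drop(columns=["Code"], errors="ignore")

    # 2. Merge the datasets on their common columns.
    merged = deaths.merge(cases, on=["Country", "Year"], how="outer")
    merged = merged.sort_values(["Country", "Year"]).reset_index(drop=True)

    # 3. Interpolate 'Total Death' separately within each country.
    #    Assumption: linear interpolation, interior gaps only.
    merged["Total Death"] = merged.groupby("Country")["Total Death"].transform(
        lambda s: s.interpolate(limit_area="inside")
    )

    # 4. Drop rows that still lack a key measurement.
    merged = merged.dropna(subset=["Total Death", "Total Cases"])

    # 5. Join the smoking prevalence data as-is (no separate cleaning).
    return merged.merge(smoking, on=["Country", "Year"], how="left")


# Made-up sample data, used only to exercise the sketch.
deaths = pd.DataFrame({
    "Entity": ["A", "A", "A", "B", "B"],
    "Code": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "Year": [2000, 2001, 2002, 2000, 2001],
    "Deaths": [10.0, None, 30.0, 5.0, None],
})
cases = pd.DataFrame({
    "Entity": ["A", "A", "A", "B", "B"],
    "Year": [2000, 2001, 2002, 2000, 2001],
    "Total Cases": [100, 110, 120, 50, 55],
})
smoking = pd.DataFrame({
    "Entity": ["A", "A", "A", "B"],
    "Year": [2000, 2001, 2002, 2000],
    "Daily Smoking Prevalence": [20, 19, 18, 25],
})

result = clean_and_merge(deaths, cases, smoking)
# To store the output as in the report:
# result.to_csv("merged_data.csv", index=False)
```

In this sample, the missing 2001 value for country A lies between two known values and is interpolated, while the trailing missing value for country B cannot be interpolated under the stated assumption and that row is dropped in step 4.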