Page 104 - FULL REPORT 30012024
streamline the information for easier analysis and interpretation.
The only change made to the dataset was the renaming of one
column; no columns were removed, so the dataset remains
complete. Figure 4.18 shows the dataset before the column was
renamed, and Figure 4.19 shows it after the change. The cleaned
dataset was then saved under a new file name, cleanedstroke.csv.
Figure 4.18 The stroke death rate dataset before data cleaning.
Figure 4.19 The stroke death rate dataset after data cleaning.
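The rename-and-save step described above can be sketched in a few lines of Python. This is only an illustration using the standard csv module: the report does not state which tool was used, and the column names and sample values below are hypothetical placeholders, not the actual contents of the stroke dataset.

```python
import csv
import io


def rename_column(rows, old_name, new_name):
    """Return a copy of rows (a list of dicts) with one column renamed.

    All other columns are kept, and the original column order is preserved.
    """
    renamed = []
    for row in rows:
        # Rebuild each row in its original order, swapping only the target key.
        renamed.append({(new_name if key == old_name else key): value
                        for key, value in row.items()})
    return renamed


# Hypothetical sample standing in for the raw file; the values are made up.
raw_csv = (
    "Entity,Code,Year,Long original column name\n"
    "Exampleland,EXL,2000,12.5\n"
    "Exampleland,EXL,2001,11.9\n"
)

rows = list(csv.DictReader(io.StringIO(raw_csv)))
cleaned = rename_column(rows, "Long original column name", "Short name")

# Write to an in-memory buffer here; a real run would instead open the
# output file (for example cleanedstroke.csv) with newline="".
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(cleaned[0].keys()),
                        lineterminator="\n")
writer.writeheader()
writer.writerows(cleaned)
cleaned_csv = out.getvalue()
```

Because only the header label changes, every row and every other column passes through untouched, which matches the stated goal of keeping the dataset complete.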
The second dataset, the death rate dataset, was originally named
"number-of-deaths-per-year.csv" and was sourced from OurWorldInData.org.
Before cleaning, it contained the columns 'Entity', 'Code', 'Year',
'Deaths: Sex: All, Age: All, Variant: Estimates', and 'Deaths:
Sex: All, Age: All, Variant: Medium'. After cleaning, the dataset
was saved as "cleaneddeath.csv". The only modification was
renaming the column 'Deaths: Sex: All, Age: All, Variant:
Estimates' to 'Total Death' for simplicity and ease of
interpretation; no columns were removed, so the dataset's full
scope is preserved. Figure 4.20 illustrates the dataset before the