Page 121 - FULL REPORT 30012024
The values in the age column were converted to whole numbers. Entries with an
age of '0' were considered incorrect and were therefore excluded. In addition,
to preserve the accuracy of the health records, entries without BMI data were
also removed.
The 'avg_glucose_level' column was removed from the dataset. This decision was
made because typical users may have difficulty providing precise blood glucose
levels, which raises doubts about the column's usefulness for the study.
Following these changes, the cleaned dataset was saved as
"cleaned_dataset.csv", ready for upload into the database.
Figure 4.42 shows the first five rows of the dataset before the cleaning
procedure, whereas Figure 4.43 shows the same rows after cleaning.
Figure 4.42 The first five rows before cleaning.
Figure 4.43 The first five rows after cleaning.
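The cleaning steps described above can be sketched in pandas as follows. This is
a minimal illustration rather than the script actually used in the project. The
column names 'age' and 'bmi', the helper name clean_dataset, the sample values,
and the use of truncation to obtain whole-number ages are all assumptions; only
'avg_glucose_level' and "cleaned_dataset.csv" come from the text.

```python
import pandas as pd


def clean_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps described in the text to a copy of df.

    Hypothetical helper: column names 'age' and 'bmi' are assumed.
    """
    cleaned = df.copy()

    # Convert ages to whole numbers. Truncation is an assumption; the text
    # does not say whether ages were truncated or rounded. This also assumes
    # the age column has no missing values.
    cleaned["age"] = cleaned["age"].astype(int)

    # Exclude entries whose age is 0, which the text treats as incorrect.
    cleaned = cleaned[cleaned["age"] != 0]

    # Remove entries that have no BMI value.
    cleaned = cleaned.dropna(subset=["bmi"])

    # Remove the blood glucose column.
    cleaned = cleaned.drop(columns=["avg_glucose_level"])

    return cleaned.reset_index(drop=True)


# Made-up sample rows, for illustration only.
sample = pd.DataFrame(
    {
        "age": [67.0, 0.64, 80.0, 49.0],
        "bmi": [36.6, 18.0, None, 34.4],
        "avg_glucose_level": [228.69, 95.12, 105.92, 171.23],
    }
)

cleaned = clean_dataset(sample)

# Saving under the file name given in the text:
# cleaned.to_csv("cleaned_dataset.csv", index=False)
```

In this sample, the second row is dropped because its age truncates to 0, and
the third row is dropped because its BMI is missing.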