Page 121 - FULL REPORT 30012024

The values in the age column were converted to whole numbers. Entries with an
age of '0' were considered incorrect and were therefore excluded. In addition,
to preserve the accuracy of the health records, entries without BMI data were
also removed.


The 'avg_glucose_level' column was removed from the dataset. This decision was
made because ordinary users may have difficulty providing precise blood glucose
levels, which raised doubts about the column's usefulness for the study.
Following these changes, the cleaned dataset was saved as "cleaned_dataset.csv",
ready for upload into the database. Figure 4.42 shows the first five rows of
the dataset before the cleaning procedure, whereas Figure 4.43 shows the same
rows after cleaning.
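
The cleaning steps described above could be sketched with pandas roughly as follows. This is an illustrative sketch, not the code used in the project: the column names 'age' and 'bmi', the input file name "dataset.csv", and the helper name clean_dataset are assumptions, and the report does not say whether ages were rounded or truncated, so truncation is assumed here. The sketch also assumes that the age column has no missing values.

```python
import pandas as pd


def clean_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of df, following the steps described in the text."""
    cleaned = df.copy()

    # Remove entries that have no BMI value (column name 'bmi' is assumed).
    cleaned = cleaned.dropna(subset=["bmi"])

    # Convert ages to whole numbers. Truncation is an assumption; the report
    # only states that whole-number representations were used.
    cleaned["age"] = cleaned["age"].astype(int)

    # Exclude entries whose age is 0, which the report treats as incorrect.
    cleaned = cleaned[cleaned["age"] != 0]

    # Drop the blood glucose column, judged impractical for users to supply.
    cleaned = cleaned.drop(columns=["avg_glucose_level"])

    return cleaned.reset_index(drop=True)


if __name__ == "__main__":
    # "dataset.csv" is a hypothetical input path; the output name is from the report.
    raw = pd.read_csv("dataset.csv")
    clean_dataset(raw).to_csv("cleaned_dataset.csv", index=False)
```

Because the conversion to whole numbers happens before the zero-age filter, a fractional age below 1 would be truncated to 0 and then excluded under this assumed ordering.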
























                                                     Figure 4.42 The first five rows before cleaning.















                                                     Figure 4.43 The first five rows after cleaning.

