
RECLAIM YOUR DIGITAL GOLD



          Figure 1 shows the color, the percentage of alcohol, and
          whether the beverage is beer or wine. These will be the
          basis of our ML training data.


          DATA PREPARATION


          With our training data gathered, it is time to move on to
          the next stage of machine learning, known as “Data
          Preparation.” During this stage, we will load our data
          into a suitable place and prepare it for use in our
          machine learning training.

          We’ll start by combining all of our data and then
          shuffling it into a random order. We don’t want the order
          in which our data is presented to influence what the
          model learns, because order plays no part in determining
          whether a beverage is beer or wine. To put it another
          way, when classifying a beverage, we take neither the
          sample before it nor the sample after it into account.
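
          The combining and shuffling described above can be
          sketched in a few lines of Python. This is only an
          illustration: the sample values, the use of a single
          number for color, and the helper name combine_and_shuffle
          are assumptions made for the example, not data or code
          from this book.

```python
import random

# Hypothetical measurements: (color value, alcohol %, label).
# These numbers are invented for illustration only.
beer_samples = [(610, 4.5, "beer"), (599, 5.0, "beer"), (620, 5.5, "beer")]
wine_samples = [(693, 12.0, "wine"), (680, 13.5, "wine"), (700, 14.0, "wine")]


def combine_and_shuffle(*groups, seed=None):
    """Merge several lists of samples and return them in random order."""
    combined = [row for group in groups for row in group]
    # A dedicated Random instance keeps the shuffle reproducible
    # when a seed is given, without touching global random state.
    rng = random.Random(seed)
    rng.shuffle(combined)
    return combined


dataset = combine_and_shuffle(beer_samples, wine_samples, seed=42)
for row in dataset:
    print(row)
```

          Passing a fixed seed makes the order repeatable between
          runs, which helps when comparing results; leaving it out
          gives a different order each time.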

          This is also a good time to run any relevant
          visualizations of our data to see whether there are
          important relationships between different factors that
          we can use to our advantage, and whether there are any
          imbalances in the data. For example, if we collected far
          more data points about beer than wine, the model we train
          will be biased toward guessing that almost everything it
          sees is beer, because on the training data that guess is
          correct most of the time. In the real world, however, the
          model could encounter beer and wine in equal amounts, in
          which case guessing “beer” every time would be incorrect
          50% of the time.
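
          To make the imbalance problem concrete, here is a small
          sketch that measures the share of each label and scores a
          "model" that always guesses the most common training
          label. The 90/10 and 50/50 splits and both function names
          are assumptions chosen for the example, not figures from
          this book.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of the data that each label accounts for."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def majority_baseline_accuracy(train_labels, test_labels):
    """Score a guesser that always answers with the most common training label."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    hits = sum(1 for label in test_labels if label == majority_label)
    return majority_label, hits / len(test_labels)


# Hypothetical imbalanced training set: far more beer than wine.
train_labels = ["beer"] * 90 + ["wine"] * 10
# Hypothetical real-world mix: equal amounts of beer and wine.
real_world_labels = ["beer"] * 50 + ["wine"] * 50

print(class_balance(train_labels))                                   # {'beer': 0.9, 'wine': 0.1}
print(majority_baseline_accuracy(train_labels, train_labels))        # ('beer', 0.9)
print(majority_baseline_accuracy(train_labels, real_world_labels))   # ('beer', 0.5)
```

          Always guessing "beer" looks 90% accurate on the lopsided
          training data but drops to 50% on a balanced mix, which
          is why it is worth checking the label counts before
          training.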





