Page 59 - Reclaim YOUR DIGITAL GOLD (with DesignLayout Dec3) (Clickable) (Dexxi-FLIP-Audio)_Neat
DATA COLLECTION HARVESTING
In addition, we’ll need to split the data into two parts.
The first section, which will contain the majority of the
data, will be used to train our model. The second section
will be used to assess the performance of our trained
model. We do not want to evaluate the model on the same
data that trained it, because that would only show how
well it memorized those examples, just as you would not
reuse your homework questions on the exam.
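The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 80/20 ratio, the fixed seed, the function name train_test_split, and the sample rows are all hypothetical and do not come from the text.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle rows, then split them into a training part and an evaluation part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical rows: (color, alcohol content in percent, label).
samples = [(600 + i, 4.0 + i, i % 2) for i in range(10)]

train_rows, test_rows = train_test_split(samples)
print(len(train_rows), len(test_rows))  # 8 2
```

Shuffling before cutting matters when the rows were collected in some order, for example all examples of one class first. Without it, the evaluation part might contain only one kind of example.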
Sometimes the data we collect needs additional
adjustment and processing, such as de-duplication,
normalization, and error correction. All of these steps
take place during data preparation.
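To make two of these techniques concrete, here is a minimal sketch of de-duplication and min-max normalization. The helper names and the sample measurements are hypothetical, and min-max scaling is just one common way to normalize, chosen here for brevity.

```python
def deduplicate(rows):
    """Drop exact duplicate rows while keeping the first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def min_max_normalize(values):
    """Rescale values linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical measurements: (color, alcohol content in percent).
raw = [(610, 5.0), (599, 13.0), (610, 5.0), (693, 14.0)]

clean = deduplicate(raw)  # the repeated (610, 5.0) row is dropped
alcohol_scaled = min_max_normalize([alcohol for _, alcohol in clean])
```

Normalizing puts features measured on very different scales into a comparable range, which often helps simple models.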
We don’t need any additional data preparation in our
situation, so let’s move on.
THE MODEL SELECTION PROCEDURE
The next step in our workflow is to choose a model.
Researchers and data scientists have created a wide
range of models over the years. Some are best suited
for image data, others for sequences (such as text or
music), and still others for numerical or text-based
data. Because we have only two features, color and
alcohol content, a small linear model, which is fairly
simple, should be sufficient.
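As a rough illustration of what a small linear model over two features looks like, the sketch below computes a weighted sum of color and alcohol content plus a bias and turns it into one of two classes. The class name, the weights, the zero threshold, and the 0/1 labels are assumptions made for this example. In practice the weights would be learned during training, not set by hand.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """score = w_color * color + w_alcohol * alcohol + bias"""

    w_color: float = 0.0
    w_alcohol: float = 0.0
    bias: float = 0.0

    def score(self, color, alcohol):
        """Return the weighted sum of the two features plus the bias."""
        return self.w_color * color + self.w_alcohol * alcohol + self.bias

    def predict(self, color, alcohol):
        """Return class 1 when the score is positive, otherwise class 0."""
        return 1 if self.score(color, alcohol) > 0 else 0


# Hand-picked, hypothetical weights: alcohol content above 9 percent maps to class 1.
model = LinearModel(w_color=0.0, w_alcohol=1.0, bias=-9.0)

strong = model.predict(color=599, alcohol=13.0)  # score is 4.0, so class 1
light = model.predict(color=610, alcohol=5.0)    # score is -4.0, so class 0
```

With only two features the model has just three numbers to adjust, which is what makes it a reasonable first choice for a problem this small.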
TRAINING
We will now go over the training phase, which is widely
regarded as the most time-consuming aspect of machine