RECLAIM YOUR DIGITAL GOLD
Crowdsourcing is the practice of delegating tasks to
human workers in order to collect the pieces of data
that, when combined, form a dataset. It can be used to
complete a wide range of tasks, from simple activities
like image labelling to more involved endeavors like
collaborative writing, which can involve several stages.
Amazon Mechanical Turk is one of the most widely used
crowdsourcing platforms. Tasks are delegated to human
workers on this platform, who are then compensated for
successfully completing the tasks.
As you may have guessed, there are numerous
disadvantages to this manual data generation. Extracting
and formatting data is a very complex process that
requires a substantial investment of time and money,
as well as extensive technical expertise. Also, when
it comes to personally identifiable information about
customers, the use of data collected internally raises a
number of privacy concerns, especially for businesses.
I hope you now have a better understanding of the
various methods for collecting data for machine learning
models.
UNDERSTANDING THE DATA HARVESTING
PROCESS
Regarding the collection and harvesting of AI data, there
is one fundamental concept that must be understood.
The information gathered and the analysis performed are
only as accurate as the data provided. In the field of data
mining and collection, the acronym GIGO is frequently
used. This is a reference to the phrase “Garbage In,