
RECLAIM YOUR DIGITAL GOLD


          Crowdsourcing is the practice of delegating tasks to
          human workers in order to collect the pieces of data
          that, when combined, form the generated dataset.
          Crowdsourcing can be used to
          complete a wide range of tasks, from simple activities
          like image labelling to more involved endeavors like
          collaborative writing, which can involve several stages.

          Amazon Mechanical Turk is one of the most widely used
          crowdsourcing platforms. Tasks are delegated to human
          workers on this platform, who are then compensated for
          successfully completing the tasks.

          As you may have guessed, there are numerous
          disadvantages to this manual data generation. Extracting
          and formatting data is a very complex process that
          requires a substantial investment of time and money,
          as well as extensive technical expertise. Also, when
          it comes to personally identifiable information about
          customers, the use of data collected internally raises a
          number of privacy concerns, especially for businesses.
          I hope you now have a better understanding of the
          various methods for collecting data for machine learning
          models.



          UNDERSTANDING THE DATA HARVESTING
          PROCESS


          Regarding the collection and harvesting of AI data, there
          is one fundamental concept that must be understood.
          The information gathered and the analysis performed are
          only as accurate as the data provided. In the field of data
          mining and collection, the acronym GIGO is frequently
          used. This is a reference to the phrase “Garbage In,

