Page 12 - Data Transparency White Paper_FINAL_Neat

Addressing Biases in Multicultural & Inclusive Identity Data





              over 400 million. Scale and big numbers are what marketers look for in target segments. It’s possible this is
              the reason the two key quality issues, adequate coverage of demographic groups and accurate demographic
              assignments, have been overlooked by marketers.


              In the data provided by Truthset, only about half of the records (hashed emails) were assigned any
              race/ethnicity. As in Nielsen’s findings, a large number of data assets are missing any race and ethnicity
              identifiers, which cuts down the number of records available to multicultural marketers. Once again,
              marketers don’t have the ability to #SeeALL.

              Current Coverage/Underrepresentation Rates
              While the Truthset study is not directly comparable to Nielsen’s studies, the underlying issues uncovered
              with multicultural marketing data are similar. Are multicultural consumers represented proportionally in the
              third-party datasets? The table below shows clearly that they are not, and that the rates vary widely across
              the providers. For each segment, the table compares its share of the U.S. population age 16 and older with
              the range, across providers, of the average incidence of hashed emails assigned that ethnicity/race.

              Underrepresentation of Ethnicity in Data Records



                Segment                        Percent of Total U.S.   Incidence Rate: Range of Records with
                                               Population Age 16+      Ethnicity Assigned, Population Age 16+

                African-American                      13%                        2% - 7%
                Asian-American                         6%                        0.3% - 3.7%
                Hispanic                              16%                        6% - 17%
                White (including Hispanic)            75%                        39% - 75%

              Source: U.S. Census, AIMM-Truthset Q3 2020 Results.
              (Please note the U.S. population segments do not add to 100 percent because “White” is not strictly
              “White non-Hispanic.” Providers identify records with race and then Hispanic origin/ethnicity separately.)
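
              One way to read the table above is to divide each incidence rate by the segment's population share.
              The sketch below does that arithmetic with the figures from the table. The "representation index"
              (100 = proportional representation) and the function names are illustrative assumptions, not measures
              reported by AIMM, Truthset, or Nielsen.

```python
# Figures copied from the table above (U.S. Census, AIMM-Truthset Q3 2020).
# Each entry: (population share %, lowest provider incidence %, highest provider incidence %).
SEGMENTS = {
    "African-American": (13.0, 2.0, 7.0),
    "Asian-American": (6.0, 0.3, 3.7),
    "Hispanic": (16.0, 6.0, 17.0),
    "White (including Hispanic)": (75.0, 39.0, 75.0),
}


def representation_index(incidence_pct, population_pct):
    """Hypothetical helper: 100 means the segment appears in the data
    in proportion to its share of the population; lower means underrepresented."""
    return round(100 * incidence_pct / population_pct)


def index_ranges(segments):
    """Return the (lowest, highest) representation index for each segment."""
    return {
        name: (representation_index(low, pop), representation_index(high, pop))
        for name, (pop, low, high) in segments.items()
    }


if __name__ == "__main__":
    for name, (low_idx, high_idx) in index_ranges(SEGMENTS).items():
        print(f"{name}: index {low_idx} to {high_idx}")
```

              On the table's figures, African-American records come out at roughly 15 to 54 on this index, while
              White records range from 52 to 100.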



              CURRENT PRACTICE: ACCURACY RATES


              Now that we know the degree to which multicultural data is underrepresented in the data assets, the question
              is whether the represented data is accurate. As the Nielsen benchmark highlighted, that is often not the case.
              Accuracy rates across providers can swing dramatically, depending on the particular race or ethnicity segment.


              Truthset provided AIMM with average accuracy rates for the six providers relative to an external validation
              set. The providers were more accurate in identifying White records, and accuracy rates fell considerably for
              the multicultural segments. The flip side of these accuracy rates is that a quarter to one-third of the
              multicultural assignments are incorrect, depending on the target segment.
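
              To make the error shares above concrete, the sketch below converts "a quarter to one-third incorrect"
              into the implied accuracy and into an expected count of mislabeled records. The segment size used in
              the example is a hypothetical figure chosen for illustration, not a number from the Truthset study,
              and the function names are assumptions.

```python
from fractions import Fraction

# Error shares stated in the text: a quarter to one-third of assignments are incorrect.
ERROR_LOW = Fraction(1, 4)
ERROR_HIGH = Fraction(1, 3)


def implied_accuracy(error_share):
    """Accuracy is the complement of the error share."""
    return 1 - error_share


def expected_misassigned(records, error_share):
    """Expected number of records carrying the wrong race/ethnicity label."""
    return int(records * error_share)


if __name__ == "__main__":
    segment_size = 1_200_000  # hypothetical segment size, not from the study
    print("Implied accuracy:",
          float(implied_accuracy(ERROR_HIGH)), "to", float(implied_accuracy(ERROR_LOW)))
    print("Expected mislabeled records:",
          expected_misassigned(segment_size, ERROR_LOW), "to",
          expected_misassigned(segment_size, ERROR_HIGH))
```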