Page 19 - Data Transparency White Paper_FINAL_Neat

Addressing Biases in Multicultural & Inclusive Identity Data

               QUESTION                ISSUE                                   BEST PRACTICES

QUESTION: How Should Multicultural Data be Validated?

ISSUE: There are different ways providers define race/ethnicity, such as first name, surname, country of origin, English proficiency, U.S. Census definitions, neighborhood, as well as expert AI systems and algorithms. Benchmarking has shown substantial differences in data coverage and accuracy generated by the different methods. All methods should be validated routinely.

BEST PRACTICES: Four approaches are considered best practices for validating multicultural identity assignments:

1.  Cross syndicated source verification (e.g., MRI-Simmons with self-identified individuals)
2.  “Truth” dataset comparison (e.g., client first-party data with known, self-identified individuals and attributes from a representative source)
3.  For modeled segments, comparison to holdout samples of self-identified individuals
4.  Audit from independent third-party sources (e.g., Neutronian, Truthset, or providers that can validate with self-report intercept studies, such as Jolt or Lucid)

In all cases, the standard of accuracy is self-report.

QUESTION: How Accurate Should Multicultural Data Be?

ISSUE: The validation study will reveal how good the data is, but how good is good enough? The need for accuracy and coverage varies with the use case. Benchmarking has shown that it is reasonable, for broadly defined cultural identities, to expect accuracy of at least 67 percent. With this in mind, AIMM recommends a minimum accuracy rate of 67 percent. Higher is better. We expect this low bar to be raised over time as industry practices improve.

BEST PRACTICES: To be considered a Hispanic, African-American, Asian-American, or other multicultural segment, at least 67 percent of records in the segment must be accurate and verified as that target. 67 percent is the minimum concentration of multicultural consumers/records within a segment necessary to be called that particular segment.
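The 67 percent rule can be illustrated with a minimal Python sketch. Only the 0.67 threshold comes from the text; the record layout, the `self_reported` field, and the function names are hypothetical and chosen purely for illustration. The sketch assumes every record passed in has already been verified against self-report.

```python
from typing import Iterable, Mapping

MIN_ACCURACY = 0.67  # minimum accuracy rate recommended in the text


def segment_accuracy(records: Iterable[Mapping[str, str]], target: str) -> float:
    """Share of verified records whose self-reported identity equals the target.

    Assumption: each record carries a hypothetical 'self_reported' field.
    """
    records = list(records)
    if not records:
        raise ValueError("no verified records to evaluate")
    matches = sum(1 for r in records if r["self_reported"] == target)
    return matches / len(records)


def qualifies_as_segment(records, target: str, threshold: float = MIN_ACCURACY) -> bool:
    """True when the segment meets or exceeds the minimum accuracy rate."""
    return segment_accuracy(records, target) >= threshold


# Hypothetical sample: 7 of 10 verified records self-report as Hispanic.
sample = [{"self_reported": "Hispanic"}] * 7 + [{"self_reported": "Other"}] * 3
print(segment_accuracy(sample, "Hispanic"))      # 0.7
print(qualifies_as_segment(sample, "Hispanic"))  # True
```

Because the need for accuracy varies with the use case, the threshold is left as a parameter rather than fixed, so a stricter bar can be passed in where required.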

QUESTION: Can the Accuracy and Coverage of Modeled Audiences Be Validated?

ISSUE: A marketer’s need for reach often requires that a data-based target audience segment be extended through modeling. Validation studies will reveal accuracy and coverage trade-offs between probabilistic and deterministic approaches.

BEST PRACTICES:

•  Providers should disclose details about the underlying base data, how the match process works, and match rates.
•  Validation studies should reveal the coverage and accuracy of modeled segments.
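To make the accuracy and coverage trade-off concrete, here is a small Python sketch that scores a segment against a holdout sample of self-identified individuals. All data and names are hypothetical. The text does not define coverage, so the sketch assumes it means the share of self-identified target individuals in the holdout that the segment captures. Segment members who do not appear in the holdout cannot be verified and are ignored.

```python
from typing import Dict, Set, Tuple


def accuracy_and_coverage(
    segment_ids: Set[str], holdout: Dict[str, str], target: str
) -> Tuple[float, float]:
    """Score a segment against a holdout of self-identified individuals.

    accuracy: share of verifiable segment members who self-report as target.
    coverage: share of self-reported target individuals found in the segment
              (an assumed definition; the source does not spell one out).
    """
    verifiable = segment_ids & set(holdout)
    true_target = {pid for pid, identity in holdout.items() if identity == target}
    if not verifiable or not true_target:
        raise ValueError("holdout does not overlap the segment or lacks the target")
    correct = verifiable & true_target
    return len(correct) / len(verifiable), len(correct) / len(true_target)


# Hypothetical holdout: p1..p6 self-report as Hispanic, p7..p10 as Other.
holdout = {f"p{i}": "Hispanic" for i in range(1, 7)}
holdout.update({f"p{i}": "Other" for i in range(7, 11)})

deterministic = {"p1", "p2", "p3"}                       # small, exact-match segment
modeled = {"p1", "p2", "p3", "p4", "p5", "p7", "p8", "x1"}  # extended via modeling

for name, seg in [("deterministic", deterministic), ("modeled", modeled)]:
    acc, cov = accuracy_and_coverage(seg, holdout, "Hispanic")
    print(f"{name}: accuracy={acc:.2f} coverage={cov:.2f}")
# deterministic: accuracy=1.00 coverage=0.50
# modeled: accuracy=0.71 coverage=0.83
```

In this made-up example the modeled extension raises coverage while lowering accuracy, which is the kind of trade-off a validation study is meant to expose.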