Page 37 - The prevalence of the Val66Met polymorphism in musicians: Possible evidence for compensatory neuroplasticity from a pilot study
bioRxiv preprint doi: https://doi.org/10.1101/2022.10.10.511614 ; this version posted October 13, 2022. The copyright holder for this preprint
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
available under a CC-BY-ND 4.0 International license .
as the number of SNPs yielding a genotype divided by the total number of SNPs interrogated.
Total intensity was derived using the sum of the signal intensities for the red and green
channels in the iScan. Genotype profiles from the NIST Genome-in-a-Bottle (GIAB) samples
from the National Center for Biotechnology Information (NCBI) were used to generate GSA
truth data (see supplemental methods for the data handling description). This data set was
used in the precision and accuracy study to assess concordance. All other statistical analysis
and visualization steps were performed using R.
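
The two per-sample metrics defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline (the statistical analysis in the study was performed in R). The function names and the use of `None` to mark a no-call are assumptions made for the example, and the level at which the red and green intensities are summed (per SNP or per sample) is not specified here, so `total_intensity` simply adds two channel values.

```python
from typing import Optional, Sequence


def call_rate(genotypes: Sequence[Optional[str]]) -> float:
    """Fraction of interrogated SNPs that yielded a genotype.

    Assumption: a no-call is represented as None.
    """
    if not genotypes:
        raise ValueError("no SNPs were interrogated")
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def total_intensity(red: float, green: float) -> float:
    """Sum of the red- and green-channel signal intensities."""
    return red + green
```

A sample with one no-call among four interrogated SNPs, for example, has a call rate of 3/4.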
Results
Sample information and GSA metrics for all samples included in this study are listed in Table
S2.
Precision and Accuracy
Two 200 ng replicates from NA12878, NA24385, and NA24631 were genotyped and
compared against the truth GIAB genotypes. Call rates for all samples were >99%, meeting
the manufacturer’s recommended metrics (Table 1A). Concordance at called sites was
>99.9% for all samples, establishing high accuracy of the genotypes obtained on the GSA.
Genotypes were also precise, as concordance rates were >99.9% between GSA replicates of
the same sample for all three samples tested (Table 1B).
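
Concordance at called sites can be illustrated with the sketch below. It assumes that genotypes are stored as dictionaries keyed by SNP identifier, that no-calls are `None`, and that both call sets use the same allele ordering and strand. The function name `concordance` and these conventions are hypothetical and are not taken from the study's methods. The same function applies to a GSA-versus-truth comparison (accuracy) and to a replicate-versus-replicate comparison (precision).

```python
from typing import Dict, Optional

Genotypes = Dict[str, Optional[str]]


def concordance(test: Genotypes, reference: Genotypes) -> float:
    """Fraction of matching genotypes among SNPs called in both sets.

    Assumption: SNPs that are missing or no-called (None) in either
    set are excluded from the denominator.
    """
    shared = [
        snp for snp, genotype in test.items()
        if genotype is not None and reference.get(snp) is not None
    ]
    if not shared:
        raise ValueError("no SNPs were called in both genotype sets")
    matches = sum(1 for snp in shared if test[snp] == reference[snp])
    return matches / len(shared)
```

Restricting the denominator to sites called in both sets keeps concordance independent of call rate, which is why the two metrics are reported separately.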
Sensitivity
Using genomic DNA from sample NA12878, three replicates were diluted for inputs at 200,
40, 20, 8.0, 2.0, 1.0, and 0.2 ng. Call rate and concordance metrics were calculated for each
individual replicate, then averaged across all replicates within an input. Call rates were >99%
for DNA inputs as low as 1.0 ng and >97% for inputs of 0.2 ng (Table 2). When compared to
the 200 ng sample, genotypes were highly concordant, with >99% concordance at
all DNA input amounts. Although the 0.2 ng samples did not meet the manufacturer’s
recommended call rate of >99%, the high concordance indicates the genotypes are accurate.
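
The averaging step described above (metrics calculated per replicate, then averaged within each input amount) could look like the following sketch. The record layout, a list of `(input_ng, metric_value)` pairs, and the function name `mean_by_input` are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def mean_by_input(records: Iterable[Tuple[float, float]]) -> Dict[float, float]:
    """Average a per-replicate metric within each DNA input amount.

    Each record is (input_ng, metric_value), where the metric is, for
    example, a call rate or a concordance computed for one replicate.
    """
    grouped: Dict[float, List[float]] = defaultdict(list)
    for input_ng, value in records:
        grouped[input_ng].append(value)
    return {input_ng: mean(values) for input_ng, values in grouped.items()}
```

Computing the metric per replicate before averaging gives every replicate equal weight, regardless of how many SNPs each one called.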
Contamination
Twelve NCs and three RBs were assessed. The call rates and total fluorescence intensity from
the iScan were compared to positive controls (NA12878, NA24385, NA24631) to determine
differences between the sample types. As expected, the average call rate for the positive
controls (200 ng) was 99.6%, but the NCs and RBs yielded much lower call rates ranging from
58-64% (Table 3). The NC and RB call rates were higher than expected but can be explained
by the way data are generated on the iScan. In short, SNPs are called based on relative
fluorescence, and so samples with no true signal have genotypes called based on background
noise, leading to inflated call rates. To further assess whether call rates are a result of true
genotypes or background noise, the total fluorescence intensity was assessed on an individual
sample basis. A clear difference in the total intensity between the positive and negative
controls was demonstrated, with positive control samples generating intensity scores
averaging >40,000 while the average for RBs and NCs was 958 (Table 3). From this, an
intensity threshold (IT) was established using the intensity values from the NCs and RBs to
distinguish true signal from noise when assessing unknown samples. The baseline noise
value was set at a signal intensity of 1,400, calculated as 3 standard deviations above
Developmental Validation of the Illumina Infinium Assay using the Global Screening Array (GSA) on the iScan System for use in Forensic Laboratories