Page 71 - Annual report 2021-22
a number of possible orientations for EGF4-13 were generated. Next, all 8 refined EGF patches
were joined with one of the conformations of EGF4-13 into a single EGF-like repeat domain using the
same standard peptide bond criteria. Further, the crystal structure of NRR (PDB# 3ETO) was processed
and the Notch EGF-like repeat domain was anchored to the TM (PDB# 5KZO) region via NRR in order
to achieve the complete chimeric assembly of the NECD.
Since the significance of EGF-like domains is underscored by their presence in functionally
diverse proteins, the conservation of the N1ECD, especially the EGF-like domain, was also analysed
across 150 homologous sequences (UniRef90) with 35% and 95% as the minimum and maximum identity.
Amongst all the EGF-like repeats, the previously annotated ligand-binding region, EGF8-12, showed
a high average conservation score of ~7.11, indicating its functional conservation.
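As a rough illustration of the identity window and the region averaging described above, the following Python sketch filters pre-aligned homologs to a 35%-95% identity range and averages per-residue conservation scores over a residue range. The function names, the toy sequences, and the score list are hypothetical; the report does not state which tool or scoring scale produced the ~7.11 value, so this is only a sketch of the general idea, not the pipeline that was used.

```python
from typing import Dict, List


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(pairs)


def filter_homologs(
    query: str,
    homologs: Dict[str, str],
    min_id: float = 35.0,  # minimum identity quoted in the text
    max_id: float = 95.0,  # maximum identity quoted in the text
) -> Dict[str, str]:
    """Keep only homologs whose identity to the query lies in [min_id, max_id]."""
    return {
        name: seq
        for name, seq in homologs.items()
        if min_id <= percent_identity(query, seq) <= max_id
    }


def mean_region_score(scores: List[float], start: int, end: int) -> float:
    """Average per-residue scores over a 1-based, inclusive residue range."""
    region = scores[start - 1:end]
    if not region:
        raise ValueError("empty residue range")
    return sum(region) / len(region)


# Toy data (hypothetical, for illustration only)
query = "ACDEFGHIKL"
homologs = {
    "identical": "ACDEFGHIKL",  # 100% identity, above the window
    "moderate": "ACDEFAAAAA",   # 50% identity, inside the window
    "distant": "AAAAAAAAAA",    # 10% identity, below the window
}
kept = filter_homologs(query, homologs)
region_mean = mean_region_score([1, 9, 7, 8, 6, 2], start=2, end=5)
```

Excluding near-identical sequences avoids redundancy, while excluding very distant ones limits the influence of unreliable alignments; the exact thresholds are those quoted in the text.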
In addition, during the COVID-19 pandemic, the mutational ensemble of the SARS-CoV-2 spike protein was
studied, and specific combinations of mutations that occur in a specific manner were proposed.
The Omicron variant protein was analyzed in December 2021, and structural variables that could
potentially impact its clinical outcome were computed and compared with those of Delta.
In partnership with Intel Corp USA, Lipi Thukral was also involved in evaluating advanced
computational infrastructure for human digital data collection, storage, and analysis. A thorough
analysis and design of the entire end-to-end system architecture for workloads that will use
multi-modal data on the Intel Xeon Scalable Server System was conducted. The multi-modal data is a mix of
clinical questionnaires, lifestyle and dietary habits, biochemical data, and molecular data including
genomics, plasma proteomics, and metabolomics to be collected in the CSIR-Cohort study.
In this industry-sponsored project, three research white papers were consolidated. The
focus was to present existing computational workloads in biology, characterize the data types that exist at
the multi-modal level, and present working case studies that have put together multiple data structures
towards disease biology. The first research theme focused on SARS-CoV-2, the causative agent of the
ongoing pandemic, for which a unified approach to tackle the complex nature of the pathogen and its
emerging variants during COVID-19 was developed. The second research theme focused on genetic
disorders, in particular a genomic analysis workflow for neurogenetic disorders.
“I got myself a start by giving myself a start.” — C J Walker