1. Pan India 1000 SARS-CoV-2 RNA Genome Sequencing Consortium
An Initiative of DBT-AIs
Coordinated by
National Institute of Biomedical Genomics (NIBMG), Kalyani
Participating Institutes:
Teams: NIBMG: Arindam Maitra, Nidhan Biswas, Saumitra Das
ILS: Sunil Raghav, Arup Ghosh, Ajay Parida
CDFD: Ashwin Dalal, Murali Bashyam, Debashis Mitra
NCCS: Yogesh Shouche, Manoj Kumar Bhat
InStem-NCBS: Aswin Sai Narain Seshasayee, Dasaradhi Palakodeti, Satyajit Mayor, Apurva Sarin
2. Objectives:
Short term:
i) Sequence 1000 viral genomes from samples collected from different zones within India.
ii) Build a molecular phylogeny of the virus to trace how it is evolving in India.
iii) Correlate sequence variation in viral RNA with disease severity and transmission efficiency, ultimately leading to the identification of strains associated with enhanced pathogenicity.
Long term:
i) Identify host genetic polymorphisms that confer either susceptibility to or protection from viral infection.
ii) Identify viral and host genetic determinants of disease severity.
The Pan-India consortium launched the programme on April 27, 2020, to map the countrywide landscape of variation in SARS-CoV-2 RNA genome sequences.
The programme is coordinated by the National Institute of Biomedical Genomics (NIBMG), Kalyani, West Bengal. Four other national clusters, ILS-Bhubaneswar, CDFD-Hyderabad, InStem-NCBS-Bangalore and NCCS-Pune, have actively participated in sequencing and analysis.
Other collaborating national institutes and clinical organizations are ICMR-NICED, IPGMER-Kolkata, IISc-Bangalore, AIIMS-Uttarakhand, MAMC-Delhi, THSTI-Faridabad, GMC-Aurangabad, MGIMS-Wardha, RMRC-Bhubaneswar, AFMC-Pune, BJMC-Pune and other hospitals.
4. Phylodynamic Time Tree:
A2a (20A/B/C) is predominant, along with a few genomes of the parental haplotypes 19A/B.
Multiple lineages were introduced and are evolving over time.
5. Haplotype diversity peaked between March and May, in the early part of the outbreak.
By June, A2a (20A/B/C) had emerged as the predominant haplotype.
The temporal landscape of haplotype diversity appears to be similar across India.
[Figure: Pan-India temporal clade diversities. Panels: North (Uttarakhand, Haryana & Delhi); West (Maharashtra); South (Karnataka & Telangana); East (West Bengal & Odisha); Pan India]
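The slide does not say which diversity statistic was plotted. As a minimal sketch, assuming Nei's haplotype diversity H = n/(n-1) * (1 - sum of squared haplotype frequencies), the calculation for one time window could look like this. The clade labels below are toy values for illustration, not consortium data, and the function name is hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for one sample of haplotype labels.

    H = n / (n - 1) * (1 - sum(p_i ** 2)), where p_i is the frequency
    of haplotype i among the n genomes in the sample.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two genomes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy samples (hypothetical): a mixed early window and a later window
# dominated by one haplotype.
early_window = ["19A", "19B", "20A", "20B", "20A", "19A"]
late_window = ["20A", "20B", "20A", "20A", "20B", "20A"]

print(round(haplotype_diversity(early_window), 4))
print(round(haplotype_diversity(late_window), 4))
```

With these toy inputs the mixed early window gives a higher value than the late window, which is the pattern the slide describes: diversity falls once one haplotype becomes predominant.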
7. Predicted Origin & Introduction of Viral Lineages
19A and 19B mostly came from China, and 20A, 20B and 20C from the United Kingdom, Italy and Saudi Arabia.
20A came from Italy and Saudi Arabia in all Indian regions; in Eastern India, 20A additionally came from the United Kingdom and Switzerland.
20B was introduced mostly from the United Kingdom in all regions, and additionally from Brazil in Western India and from Italy and Greece in Southern India.
8. Haplotype Networks (ILS team)
Some haplotype nodes contain a majority of genomes from West Bengal and Odisha, with a small percentage of samples from Uttarakhand. Odisha and West Bengal share a border, and the shared SARS-CoV-2 haplotypes might be the result of heavy interstate travel.
Maharashtra, Delhi, Haryana and Uttarakhand grouped together with 2-4 single nucleotide variants (SNVs), suggesting the infection might have spread over a short period of time.
A portion of the samples from Haryana and Karnataka share the same parent haplotype, representing possible transmission by migration.
The negative estimate of Tajima’s D (D = -2.26817, AMOVA p(D <= -2.26817) = 0.00281) [PMID: 2513255] is consistent with rapid expansion of the SARS-CoV-2 population in India.
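For readers unfamiliar with the Tajima's D statistic quoted on this slide, here is a minimal sketch of the standard calculation (Tajima 1989) from a set of aligned sequences. It is not the consortium's pipeline. It assumes equal-length aligned sequences without gaps or ambiguous bases, and the four short sequences are toy data.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns with more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (needs n >= 4)."""
    n = len(seqs)
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least 4 aligned sequences of equal length")
    s_sites = segregating_sites(seqs)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    pi = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    return (pi - s_sites / a1) / sqrt(variance)


# Toy alignment (hypothetical): three singleton variants, i.e. an excess
# of rare variants, which pushes D below zero.
toy_alignment = ["AAAA", "AAAT", "AATA", "ATAA"]
print(tajimas_d(toy_alignment))
```

An excess of low-frequency variants, as in the toy alignment, makes the mean pairwise difference smaller than expected from the number of segregating sites, so D is negative. That is the signature the slide interprets as population expansion.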
9. Differences Between SARS-CoV (2003) and SARS-CoV-2 (2019)
Both belong to Genus: Betacoronavirus (lineage B)
Overall base similarity: 82.3%
Emergence of a new stem-loop (SL4’) in the 5’UTR of SARS-CoV-2
[Figure: 5’UTR stem-loop structures SL1 to SL5 of SARS-CoV (2003) and SARS-CoV-2 (2019), with SL4’ labelled]
Two sequences contain the 134 (U to C) mutation, which disrupts the SL4’ stem-loop.
About 90% of SARS-CoV-2 sequences contain the 241 (C to U) mutation in SL5, which falls under the A2a clade.
Fatality rate of SARS-CoV (2003): 9.6% (WHO)
Fatality rate of SARS-CoV-2 (2019): ~3.9% (till now) (WHO)
10. Variant D614G in Spike, the most common across the country, is declining in frequency in Delhi, where a different set of variants (nsp3 T1198K, RdRp A97V, N P13L) is dominating. An increase in one is correlated with a decrease in the other across states.
Variant nsp3_A994D is highly prevalent, and increasing in frequency, in Maharashtra; less so in other states.
The effect of these variants on virulence or infectivity is being tested.
A sub-type seems to have attained high frequencies in multiple states; it is characterized by two consecutive point mutations in the Nucleocapsid gene (N RG203KR) and appears to have originated on the background of type ST4/A2a.
Pan India 42%
North 37%
East 9%
West 85%
South 70%
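The slide does not state how the regional percentages were computed. As a minimal sketch, assuming each is the share of sequenced genomes in a region that carry a given variant, a per-region frequency table could be built as follows. The records, variant labels and function name are hypothetical toy values, not consortium data.

```python
from collections import defaultdict


def variant_frequency_by_region(records, variant):
    """Fraction of genomes per region that carry `variant`.

    `records` is an iterable of (region, set_of_variant_labels) pairs,
    one pair per sequenced genome.
    """
    totals = defaultdict(int)
    carriers = defaultdict(int)
    for region, variants in records:
        totals[region] += 1
        if variant in variants:
            carriers[region] += 1
    return {region: carriers[region] / totals[region] for region in totals}


# Toy records (hypothetical): four genomes from two regions.
toy_records = [
    ("West", {"S:D614G", "N:RG203KR"}),
    ("West", {"S:D614G", "N:RG203KR"}),
    ("East", {"S:D614G"}),
    ("East", set()),
]

frequencies = variant_frequency_by_region(toy_records, "N:RG203KR")
for region, freq in sorted(frequencies.items()):
    print(f"{region}: {freq:.0%}")
```

Running the same function over records grouped by collection month would give the frequency trends the slide refers to.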
11. • Within Spike (S), two mutations specific to the A2a clade (D614G and G1124V) are found in most of the samples.
• D614G is thought to confer flexibility on the Sd domain, and G1124V might impart partial rigidity to the conformation of the S2 domain. Both mutations are away from the receptor binding domain (RBD), but can affect the positioning of the residues involved in receptor binding.
• In the Nucleocapsid (N), two mutations (R203K, G204R) are observed in samples from West Bengal, which might have consequences for virus assembly and packaging of the virus particle inside the host.
• A mutation (P323L) in the RNA-dependent RNA polymerase (RdRp) is also observed in India, which might have implications for viral RNA replication.
(Maitra and Chawla et al., 2020, J. Biosciences, Vol 45, 0076)
12. N RG203KR affects miRNA binding sites & N protein structure
Maitra, A., Sarkar, M.C., Raheja, H. et al. Mutations in SARS-CoV-2 viral RNA
identified in Eastern India: Possible implications for the ongoing outbreak in India
and impact on viral structure and host susceptibility. J Biosci 45, 76 (2020)
The reported mutations in the N gene might alter the ability of host miRNAs to bind viral RNA, which might affect host susceptibility.
13. 225 COVID-19 Genomes from Odisha (ILS team)
[Figures: Virus clade distribution; Clade evolution with time]
Salient findings
• Includes cases with migration from 9 Indian states.
• All five reported clades (19A, 19B, 20A, 20B and 20C) were found.
• A total of 247 single nucleotide variants were identified.
• Europe and Southeast Asia were identified as two major routes of disease transmission in India.
• The recently evolved clades 20A and 20B showed prevalence of four common mutations.
14. 90 COVID-19 Genomes from Maharashtra (NCCS Team)
[Figures: Virus clade distribution (chart values 6, 40, 44; legend: clades 19A, 20A, 20B); Clade evolution with time]
Salient findings
• Three clades (19A, 20A and 20B) were found.
• A total of 125 single nucleotide variants were identified.
• Europe and Southeast Asia were identified as two major routes of disease transmission in India.
• The recently evolved clades 20A and 20B showed prevalence of four common mutations.
15. SARS-CoV-2 genome analysis: Hyderabad & Telangana (CDFD team)
>200 COVID-19 genomes sequenced (including symptomatic and asymptomatic patients)
Patient samples represent Hyderabad and additional districts of Telangana
[Figure: number of samples with alteration, by genomic position]
Preliminary indications:
● Presence of A3i clade-defining mutations C6312A, C13730T and C28311T
● Relatively higher number of mutations in ORF1a (RdRp gene), and higher frequency of mutations in the N gene relative to the S gene
Technical inputs: Sabarinathan, NCBS
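The mutation labels on this slide follow the usual "reference base, 1-based genome position, alternate base" convention. As a minimal sketch, assuming a sample genome already aligned to reference coordinates with no insertions or deletions, checking for the three listed mutations could look like this. The helper names and the all-"C" toy sequence are hypothetical; a real analysis would use a proper alignment against the reference genome.

```python
import re

# Clade-defining mutations as listed on the slide.
A3I_MUTATIONS = ["C6312A", "C13730T", "C28311T"]


def parse_mutation(label):
    """Split a label such as 'C6312A' into (ref_base, position, alt_base)."""
    match = re.fullmatch(r"([ACGTU])(\d+)([ACGTU])", label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def carries_mutation(genome, label):
    """True if `genome` has the alternate base at the 1-based position."""
    _, pos, alt = parse_mutation(label)
    return len(genome) >= pos and genome[pos - 1] == alt


def has_all_mutations(genome, labels):
    """True if every mutation in `labels` is present in `genome`."""
    return all(carries_mutation(genome, label) for label in labels)


def apply_mutations(reference, labels):
    """Return a copy of `reference` with each substitution applied."""
    seq = list(reference)
    for label in labels:
        ref, pos, alt = parse_mutation(label)
        if seq[pos - 1] != ref:
            raise ValueError(f"reference base mismatch at {pos}")
        seq[pos - 1] = alt
    return "".join(seq)


# Toy reference: arbitrary length and all 'C', NOT the real genome.
toy_reference = "C" * 30000
toy_sample = apply_mutations(toy_reference, A3I_MUTATIONS)

print(has_all_mutations(toy_sample, A3I_MUTATIONS))     # True
print(has_all_mutations(toy_reference, A3I_MUTATIONS))  # False
```

Requiring all listed mutations is a design choice for this sketch; whether a partial match should count toward clade assignment depends on the clade definition being used.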
16. Highlights:
To date, the Consortium has achieved its initial goal of sequencing 1000 SARS-CoV-2 genomes from nasopharyngeal and oropharyngeal swabs collected from individuals who tested positive for COVID-19 by real-time PCR. The samples were collected across 10 states covering different zones within India.
Given the importance of this information for public health response initiatives investigating the transmission of COVID-19, the sequence data will soon be released in the public domain (GISAID database). This information will improve our understanding of how the virus is spreading, ultimately helping to interrupt transmission chains, prevent new cases of infection, and provide impetus to research on intervention measures.
Initial results indicate that multiple lineages of SARS-CoV-2 are circulating in India, probably introduced by travel from Europe, the USA and East Asia. In particular, there is a predominance of the A2a haplotype (20A/B/C) with the D614G mutation, which is emerging in almost all regions of the country. This haplotype is globally reported to be associated with enhanced transmission efficiency.
Additionally, mutations in important regions of the viral genome with significant geographical clustering have also been observed.
Detailed mutational analysis to understand the gradual emergence of mutants in different regions of the country, and its possible impact on disease management, is in progress.
17. Experimental & Computational Team
Coordinated by NIBMG, Kalyani
CDFD, Hyderabad
Dr Ashwin Dalal, Dr Murali Bashyam, Dr Pratyusha Bala, Mr Vinay Donipadi, Dr Divya Vashisht, Dr Debashis Mitra
InStem-NCBS, Bengaluru
Dr. Apurva Sarin, Dasaradhi Palakodeti, Aswin Sai Narain Seshasayee, Uma Ramakrishnan, Shah-e-Jahan Gulzar, Varadharajan Sundaramurthy, Srikar Krishna, Vanessa Molin Paynter, Awadhesh Pandit, Farhan Ali, Mohak Sharda, Dr. Satyajit Mayor
ILS, Bhubaneswar
Dr. Sunil Raghav, Mr. Arup Ghosh and Dr. Ajay Parida
NIBMG, Kalyani
Dr. Arindam Maitra, Dr. Nidhan K. Biswas, Dr. Sreedhar Chinnaswamy, Mr. Shekhar Ghosh, Mr. Sumanta Sarkar, Dr. Subrata Patra, Dr. Rajib Mondal, Dr. Trinath Ghosh, Mr. Arnab Ghosh, Mr. Shouvik Chakraborty, Dr. Saumitra Das
NCCS, Pune
Dhiraj Paul, Kunal Jani, Janesh Kumar, Radha Chauhan, Vasudevan Seshadri, Girdhari Lal, Dr. Arvind Sahu, Dr. Yogesh S Shouche, Dr. Manoj Kumar Bhat
18. Clinical Collaborators
IPGMER, Kolkata
Dr. Monimoy Banerjee, Dr. Raja Ray, Dr. Jayeeta Halder, Dr. Aritra Biswas
ICMR-NICED
Dr. Shanta Dutta, Dr. Mamta Chawla Sarkar, Ms. Ananya Chatterjee, Ms. Hasina Banu, Mr. Agniva Majumdar
GMC, Aurangabad
Dr. Jyoti Iravan, Mr Dhaval Khatri, Mr Maitrik Dave
AIIMS, Rishikesh
Prof Ravi Kant, Dr Deepjyoti Kalita, Dr Amit Mangla
MAMC, Delhi
Dr Sonal Saxena, Dr Vikas Manchanda, Dr Oves Siddiqui
MGIMS, Wardha
Dr Vijayshri Deotale, Dr Rahul Narang, Dr Deepashri Maraskolhe
IISc, Bengaluru
Dr. Bharath K Sundararaj, Harsha Raheja, Prof. N. Srinivasan, Prof.
Deepak K Saini, Prof. Amit Singh, Prof. K N Balaji, Prof. Umesh
Varshney
THSTI, Faridabad
Dr. Guruprasad Medigeshi, Dr. Gagandeep Kang, Sharanabasava
Patil, Anbalagan Ananthraj, Madhu Pareek, Imran Khan, ESIC
Hospital and Medical College, Faridabad, Gurugram Civil Hospital,
Gurugram and Palwal Civil Hospital, Palwal
NIMS, Hyderabad
Dr Madhumohan Rao, Dr Vijay Dharma Teja
AFMC, Pune
Sourav Sen, Santosh Karade, Kavita Bala Anand, Shelinder Pal Singh
Shergill, Rajiv Mohan Gupta
BJMC, Pune
Rajesh Karyakarte, Suvarna Joshi, Murlidhar Tambe
RMRC, Bhubaneswar
Dr. Sanghamitra Pari, Dr. Jyotirmayee Turuk