Page 71 - Science
RESEARCH | RESEARCH ARTICLE
Wadjet is an alternative condensin system involved in bacterial chromosome maintenance, our data imply that its role is defensive. This system is highly enriched within defense islands, undergoes extensive horizontal gene transfer, and is only sporadically found within strains of the same species, all of which is inconsistent with a core, essential role in chromosome maintenance. We hypothesize that the Wadjet system has been adapted from a MukBEF condensin ancestor to become a defense system. Possibly, the system identifies foreign plasmids and uses its condensin properties to interfere with proper plasmid segregation into daughter cells. Notably, plasmid transformation in B. subtilis takes place via the natural competence of this organism, during which the plasmid DNA is transformed to the cell through dedicated transporters as ssDNA (46). It is possible that the Wadjet system protects against rampant natural transformation or, alternatively, may specifically target ssDNA phages. However, because no ssDNA phage was reported for B. subtilis, we were not able to determine whether ssDNA phages are specifically blocked by the Wadjet systems cloned in B. subtilis BEST7003.

The Wadjet system is broadly spread in bacterial and archaeal genomes (found in ~6% of the genomes we studied), where it presents high sequence diversity (table S15 and fig. S6). Deletion of each of the four genes in type I Wadjet from B. cereus Q1 abolished its activity and restored plasmid transformation, indicating that each of the genes is essential for antiplasmid defense (Fig. 5C). Moreover, point mutations E59K/K60E in JetB, predicted to disrupt the MukE-MukF–like protein-protein interactions, resulted in loss of protective activity against plasmids and so has the E1025Q mutation in the Walker B motif of JetC that is predicted to abolish adenosine triphosphatase (ATPase) activity. The JetD gene, which has no homology to genes in the Muk system, has a putative topoisomerase VI domain based on structural predictions; a point mutation JetD:E226A, predicted to diminish binding of the topoisomerase VI domain to DNA, also abolished the protective activity of the system.

Discussion

Our study considerably expands the known arsenal of defense systems used by prokaryotes for protection against phages. However, our results do not yet expose the complete set of prokaryotic defense systems. Out of the 26 candidate systems we tested, nine were verified as antiphage defense systems, and an additional one showed protection against plasmids. The remaining 16, although not verified by our experiments, do not necessarily represent false predictions, as exemplified by the fact that only 50% of our positive control systems showed defense in our assays. Lack of activity of positive control systems or candidate systems could possibly stem from incompatibility of some tested systems with the recipient organism (E. coli or B. subtilis) or could be due to pseudogenization of some systems in their genome of origin. Some systems may be highly specific against a certain type of phages or foreign genetic element not represented in our phage set, whereas others may work in a specific condition not tested in our study. Clade-specific potential systems, such as those found only in archaea or cyanobacteria (table S3), were not tested in this study and can represent a more specialized defense arsenal specific only to a subset of organisms. Finally, we may have missed some true systems by falsely tagging them as belonging to the “mobilome” (table S2), as mobile genetic elements have an intimate evolutionary relationship with defense systems (47).

In the past, the discovery and mechanistic understanding of antiviral defense systems led to the development of important biotechnological tools. For example, the discovery of restriction enzymes resulted in a revolution in genetic engineering, and CRISPR-Cas now revolutionizes the genome editing field. Eukaryotic immune systems, such as RNAi and antibodies, have also become widely used tools. The tendency of defense systems to turn into revolutionary molecular tools stems from their intrinsic high degree of flexible molecular specificity (to differentiate between self and nonself), as well as their inherent capability to target the identified molecule. One may envision that some of the new systems we discovered, once their mechanism is deciphered, may also be adapted into useful molecular tools in the future.

Materials and methods

Computational prediction of defense systems

A set of gene families known to participate in defense

A set of pfams and COGs that are known to participate in antiphage defense was compiled based on the gene families present in table S10 from Makarova et al. 2011 (15) with the addition of pfams/COGs present in the BREX (7) and DISARM (9) antiphage systems. This set is found in table S1.

Identification of pfams enriched near defense genes

The genome sequences, gene annotations, and taxonomy annotations of all publicly available sequenced bacterial and archaeal genomes were downloaded from the NCBI FTP site (ftp.ncbi.nih.gov/genomes/genbank/bacteria/ and ftp.ncbi.nih.gov/genomes/genbank/archaea/, respectively) on April 2016. Pfam annotations for bacterial and archaeal genes were obtained from the Integrated Microbial Genomes (IMG) database (48) on December 2015, and cross-referenced to the genes in the genomes downloaded from NCBI using the locus_tag GenBank field. All pfams annotated in at least 20 genes (“members”) across the analyzed genomes (14,083 pfams) were scanned. For each pfam, the number of member genes for which a gene having an annotation of a known defense gene family (table S1) was present in proximity (up to 10 genes upstream and 10 genes downstream) was recorded. The fraction of defense-associated members out of total members (“defense score”) was calculated per pfam. A second score (“defense context variability score”) was calculated for each pfam as follows: for each member gene occurring with at least one defense gene in proximity, a list of the proximal defense genes was recorded, and the fraction of unique lists out of total number of lists for that pfam represents the score (for example: if pfamX is found within 20 genes in our set, with 15 of them having Cas9 nearby and 5 having type I R-M nearby, the number of unique lists is two, and the “defense context variability score” is 2/20 = 0.1). Pfams with defense score ≥ 65% and defense context variability score ≥ 0.1 were taken for further analysis. This list was supplemented with 35 non-pfam gene families that were predicted to be associated with defense by Makarova et al. 2011 (15), as well as 23 pfams that were predicted in the same study but did not pass the thresholds above (table S2).

From genes to systems

Each of the putative defense-related gene families was used as an anchor to search for multigene systems, as follows. The protein coding sequences for neighboring genes (±10 genes) for all family members were clustered based on sequence homology (for example, if pfamY is found within 50 genomes in our set, the 20 neighboring genes in each genome, plus the pfamY gene in each genome, were taken, altogether 50*21 = 1050 genes to be clustered). Clustering was done with OrthoMCL software v2.0.9 (49) with blastp parameters [-F 'm S' -v 100,000 -b 100,000 -e 1e-5 -m 8] and with mcl v12.068 downloaded from micans.org/mcl/ (50, 51) with inflation value of 1.1. When the number of blastp hits for a given anchor pfam was too large and prohibitive for OrthoMCL to generate clusters (>75 million blastp hits), a subset of genomes, containing only bacterial and archaeal genomes annotated as “complete” (rather than “draft”) was used for clustering.

To detect the most prevalent genes around the anchor pfam, only the 10% largest clusters (“frequent clusters”) were considered. For the sake of cluster size calculation, genes originating from the same species (derived from the strain name in the NCBI annotation) were counted as one gene, to prevent organisms for which many strains have been sequenced from inflating the cluster size. An edge between cluster(i) and cluster(j) was defined if a gene from cluster(j) followed a gene from cluster(i) in a given genome with no other genes belonging to frequent clusters found in between, with edge weight (“thickness”) defined as the number of such adjacency cases. Again, edge weights were adjusted such that multiple appearances of a cluster pair originating from the same species were recorded as a single appearance. Only the 10% thickest edges were retained for further analysis. In each genome, the maximal “path” that included the anchor pfam gene and was composed of the retained (largest) clusters and the retained (thickest) edges was recorded. Such a “path,” representing a set of genes appearing in a conserved order in multiple
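
The two per-pfam scores defined in the methods above ("defense score" and "defense context variability score") can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the input layout (one `(pfam, nearby_defense_genes)` pair per member gene) and the choice to compare lists of proximal defense genes as unordered sets are assumptions made for illustration. Only the two thresholds (65% and 0.1) and the pfamX example come from the text.

```python
from collections import defaultdict


def score_pfams(members):
    """Compute (defense score, defense context variability score) per pfam.

    `members` is a hypothetical input: one (pfam_id, nearby_defense_genes)
    pair per member gene, where nearby_defense_genes holds the known defense
    families found within 10 genes upstream or downstream (may be empty).
    """
    total = defaultdict(int)      # member genes per pfam
    contexts = defaultdict(list)  # one defense-gene list per defense-associated member
    for pfam, nearby in members:
        total[pfam] += 1
        if nearby:
            # Assumption: lists are compared as unordered sets of families.
            contexts[pfam].append(frozenset(nearby))

    scores = {}
    for pfam, n_members in total.items():
        lists = contexts[pfam]
        defense_score = len(lists) / n_members
        variability = len(set(lists)) / len(lists) if lists else 0.0
        scores[pfam] = (defense_score, variability)
    return scores


def select_pfams(scores, min_defense=0.65, min_variability=0.1):
    """Keep pfams that pass both thresholds stated in the text."""
    return sorted(
        pfam
        for pfam, (defense, variability) in scores.items()
        if defense >= min_defense and variability >= min_variability
    )


# The pfamX example from the text: 20 member genes, 15 near Cas9, 5 near type I R-M.
example = [("pfamX", ["Cas9"])] * 15 + [("pfamX", ["type I R-M"])] * 5
example_scores = score_pfams(example)
```

For the pfamX example the sketch yields two unique lists out of 20, so the variability score is 2/20 = 0.1, matching the value given in the text.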
Doron et al., Science 359, eaar4120 (2018) 2 March 2018 8 of 11
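
The edge definition in "From genes to systems" (adjacency between frequent clusters, counted once per species, keeping the 10% thickest edges) can be sketched as follows. This is an illustrative reading of the text rather than the published implementation. The input layout, the function names, the treatment of edges as directed, the disregard for strand, and the rounding and tie-breaking used for the top 10% are all assumptions.

```python
import math
from collections import defaultdict


def edge_weights(genomes, frequent_clusters):
    """Count species-deduplicated adjacencies between frequent clusters.

    `genomes` is a hypothetical input: (species, clusters) pairs, where
    `clusters` lists the cluster id of each gene in genomic order
    (None for genes that belong to no cluster).
    """
    species_per_edge = defaultdict(set)
    for species, clusters in genomes:
        # Dropping genes outside frequent clusters makes consecutive survivors
        # "adjacent with no other frequent-cluster gene in between".
        kept = [c for c in clusters if c in frequent_clusters]
        for first, second in zip(kept, kept[1:]):
            species_per_edge[(first, second)].add(species)
    # Several strains of one species contribute a single appearance.
    return {edge: len(species) for edge, species in species_per_edge.items()}


def thickest_edges(weights, fraction=0.10):
    """Keep the top `fraction` of edges by weight (assumed: round up, keep at least one)."""
    n_keep = max(1, math.ceil(len(weights) * fraction))
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n_keep])


# Toy input: two E. coli strains and one B. subtilis strain.
toy_genomes = [
    ("E. coli", ["A", None, "B", "C"]),
    ("E. coli", ["A", "B"]),
    ("B. subtilis", ["A", "X", "B"]),
]
toy_weights = edge_weights(toy_genomes, frequent_clusters={"A", "B", "C"})
```

In the toy input the A-to-B adjacency is seen in two species and therefore gets weight 2, even though it occurs in three genomes. The B-to-C adjacency gets weight 1.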