Viruses interact with hosts that span distantly related microbial domains in dense hydrothermal mats

Nature Microbiology volume 8, pages 946–957 (2023)

Many microbes in nature reside in dense, metabolically interdependent communities. We investigated the nature and extent of microbe-virus interactions in relation to microbial density and syntrophy by examining these interactions in a biomass-dense, deep-sea hydrothermal mat. Using metagenomic sequencing, we find numerous instances in which phylogenetically distant microbes (up to the domain level) encode CRISPR-based immunity against the same viruses in the mat. Evidence of viral interactions with hosts that cut across microbial domains is particularly striking among known syntrophic partners, for example those engaged in anaerobic methanotrophy. These patterns are corroborated by proximity-ligation-based (Hi-C) inference. Surveys of public datasets reveal additional viruses that interact with hosts across domains in diverse ecosystems known to harbour syntrophic biofilms. We propose that the entry of viral particles and/or DNA into non-primary host cells may be a common phenomenon in densely populated ecosystems, with eco-evolutionary implications for syntrophic microbes and for CRISPR-mediated, inter-population augmentation of resilience against viruses.

Most bacteria and archaea in nature occur as aggregates or biofilms1. These microbial aggregates often consist of phylogenetically distant organisms engaged in interdependent metabolisms (for example, syntrophy)2. However, most host-virus interactions are studied in homogeneous liquid cultures, and many gaps remain in our understanding of host-virus interactions in dense, substrate-bound and heterogeneous biofilms3. In particular, important questions remain regarding host range, viral life cycle, modes of dispersal and host-virus coevolution in complex microbial communities where genetically diverse and phylogenetically distant microbes coexist in close proximity and engage in highly nested metabolisms.

Viruses are generally thought to infect a narrow range of hosts. Recent studies, however, have suggested that broad-host-range viruses may be more common in nature and may have been overlooked because of cultivation bias4. So far, there are reports of viruses infecting multiple bacterial species5, orders6 and possibly phyla7,8,9. Viral host ranges have also been shown to be a dynamic trait10. Notably, a recent study11 reported that phage adsorption and entry into cells do not equate to completion of the lytic cycle, indicating that viruses may interact with a more diverse set of cells than those in which a complete infection cycle can be carried out.

We hypothesized that viruses with a broader host range may be prevalent in biofilms dominated by syntrophic metabolisms, owing to prolonged contact with phylogenetically diverse microbes and to the limited dispersal and/or habitat range of host and virus caused by extracellular polymeric substances (EPS) and spatial heterogeneity. To address this hypothesis, we characterized the viral genomes and any viral interactions with bacteria or archaea (hereafter referred to as host-virus interactions) in a deep-sea hydrothermal microbial mat. Such mats are chemoautotrophic biofilms that are ubiquitous around hydrothermal vents. They consist of very dense and metabolically coupled communities of bacteria and archaea12 and exhibit steep spatial gradients and temporal variability in temperature and geochemistry13. We show that phylogenetically distant microbes (that is, taxa from different phyla and even domains) with putative syntrophic metabolic capabilities often encode Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-based immunity against the same viruses in the mat. This pattern is not detected in physically adjacent hydrothermal plume samples, which are characterized by lower biomass of metabolically similar communities. Furthermore, these microbial genomes co-localize with the same viral genomes on the basis of Hi-C proximity-ligation sequencing. By surveying publicly available metagenomes, we also found viruses that interact with both bacterial and archaeal taxa in other ecosystems known to harbour syntrophic biofilms. We further investigated the eco-evolutionary implications of these host-virus interactions by examining auxiliary metabolic genes (AMGs) in the viral genomes and by identifying viral and microbial genes under selection. Finally, we propose four models of viral polyvalent interactions with syntrophic hosts and discuss their implications for microbial evolution, in particular with regard to horizontal gene transfer, genetic diversification and CRISPR-mediated, community-level immunological memory.

[...] 0.05) between microbial and viral compositions was identified. The rep_vMAGs recovered from this study exhibited very high taxonomic and gene content diversity relative to the genetic diversity space occupied by the reference viral genomes (Extended Data Fig. 2). Only 4 rep_vMAGs could be clustered at the ‘genus’ level20 with reference viral genomes, and could be classified as two (previously designated) Podoviridae, one Myoviridae and one Siphoviridae (Supplementary Table 4). Notably, a taxonomic cluster consisting of 7 rep_vMAGs was distantly associated with Flavobacterium phages, and 3 of the rep_vMAGs formed a novel genus-level cluster that shared no similar genes with any of the characterized reference viral sequences. A majority (29 out of 49) did not share high similarity in gene content with the reference or with each other. Many of the viral genomes contained novel auxiliary metabolic genes (AMGs) such as Rubisco large domain-containing protein, aldolase II domain-containing protein, nitroreductase domain-containing protein, phosphate starvation-inducible protein PhoH and tellurium resistance protein TerD (Extended Data Fig. 3a,b). We also detected evidence of a host-virus arms race, with some viral genomes encoding defence machinery such as RelE/StbE family toxin, HigA family antidote and a putative abortive infection protein (Extended Data Fig. 3c). A complete list of the annotated AMGs and other notable viral genes is provided in Supplementary Table 5.

[...] >95% nucleotide identity (ID)) into 102 clusters. Most (91%) of the detected CRISPRs were specific to a population and 80% of the CRISPR-encoding populations were associated with at most 2 unique CRISPRs. However, we observed identical or near-identical (>95% ID) CRISPR repeats shared among phylogenetically distant populations. It is possible that these CRISPR loci were horizontally transferred21, but we cannot rule out the possibility of binning errors resulting from their repetitive and divergent nature. 
Such CRISPR repeats detected across taxa were excluded from spacer-based host-virus matching due to the ambiguity in assigning a specific host taxon to a repeat. Additionally, we identified populations (Gammaproteobacteria_17_1, Desulfobacteria_193_1, Desulfobacteria_189_1) encoding as many as 6 distinct CRISPR repeats, probably representing within-population diversity of CRISPR loci. No correlation was found between the number of unique CRISPRs and the rep_mMAG size, relative abundance or habitat range.

[...] > 0.05). We binned 168 mid- to high-quality rep_mMAGs (see Supplementary Table 11 for the full description) across the 10 HW assemblies, and although taxonomically distinct from the rep_mMAGs recovered from the mat assemblies, the two datasets featured similar metabolic capabilities (Supplementary Table 12 and Extended Data Fig. 8a) and similar levels of species evenness (Extended Data Fig. 8b, Welch’s t-test, two-sided, n = 20, P > 0.05). The microbial communities of the HW samples were more homogeneous than those of the mat samples (Extended Data Fig. 8c) despite the larger physical distances between the HW samples. Similar to the mat samples, HW samples were dominated by two sulfur-oxidizing Gammaproteobacteria (HW_Gammaproteobacteria_164_1, HW_Gammaproteobacteria_163_1; Extended Data Fig. 9a). Interestingly, we observed an order of magnitude less frequent detection of CRISPR loci in the HW assemblies compared with the mat assemblies (Supplementary Table 13, Welch’s t-test, two-sided, n = 20, P = 0.001). Furthermore, only 12 of the CRISPRs in the HW assemblies could be associated with medium- to high-quality MAGs (Supplementary Table 14), resulting in a much sparser and less robust CRISPR-based immunity network (Extended Data Fig. 5b and Supplementary Table 15), with only one confident interaction between an SOB (HW_Gammaproteobacterira_162_1) and a virus. 
The similarities between the plume and mat samples, such as geographical proximity, community metabolic capabilities and sequencing depth, provide a rationale and opportunity for comparison. Lower abundances of the CRISPRs in the plume samples indicate that the plume communities are less reliant on CRISPR-based adaptive immunity. The transferability and specificity of CRISPR-based immunity confer ecological significance to this observation, raising the question of how such immunological memory is selected for in different environments. While this comparison illuminates key differences in the nature and extent of host-virus interactions between the mat and the plume, there are some caveats to consider for further interpretation. First, the sparseness in the plume CRISPR-based immunity network is likely due in part to the lower abundance and diversity of recovered viral contigs (Supplementary Table 16 and Extended Data Fig. 9b), where only the fraction of viruses that were infecting microbes and/or were attached to particles larger than 0.022 µm were recovered. Second, differences in the CRISPR-based immunity do not necessarily reflect the patterns of the underlying networks of in situ host and virus interactions.

[...] > 2.5); however, most could not be annotated with a function. Interestingly, 3 of the 4 annotated genes undergoing diversifying selection were involved in DNA and RNA metabolism, such as genes encoding DNA-directed RNA polymerase (RNAP) beta and beta prime (rep_vMAG_21), DNA ligase (rep_vMAG_31) and Superfamily II DNA/RNA helicase (rep_vMAG_6). We also detected a LamG domain-containing protein (vMAG_4), possibly involved in signalling and cell adhesion, to be undergoing diversifying selection. The gene encoding RNAP in rep_vMAG_21 (RNAP1; Extended Data Fig. 10a) featured the highest pN/pS ratio of 4.9, with 8 non-synonymous mutations scattered throughout the protein (Extended Data Fig. 10b). 
Notably, rep_vMAG_21 featured a second RNAP gene fragment encoding the beta subunit (RNAP2) (Extended Data Fig. 10a) that is not homologous to RNAP1 and not seemingly undergoing selection, possibly contributing to the relaxation of purifying selection on RNAP1. RNAP1 was highly divergent from the previously characterized RNAP sequences and was rooted at the base of the Caudoviricetes multimeric RNAP clade38 (Extended Data Fig. 10c). This example of diversifying selection on RNAP1 suggests that these viruses may play an important role in expediting the evolution of housekeeping proteins that typically undergo purifying selection in cellular organisms. Microbial genes undergoing diversifying selection (pN/pS > 2) included genes encoding products involved in various defence systems, such as type II toxin-antitoxin system RelE/ParE toxin, HindIII family type II restriction endonuclease, Type III-B CRISPR module RAMP protein Cmr1, as well as genes involved in the more recently characterized PARIS and Septu anti-phage arsenal39.

[...] >70% completeness and <10% contamination) MAGs were used for subsequent analysis. Mid- to high-quality MAGs were dereplicated at 97% ANI using dRep v3.0.1 (ref. 54) and were designated as representative MAGs (rep_mMAGs). rep_mMAGs were taxonomically classified using GTDB-Tk v1.7.0 (ref. 55). Genes were predicted using Prodigal v2.6.3 (ref. 56) and annotated by aligning them using Diamond v2.0.7.145 (ref. 57) against the UniRef100 database58 with an e-value cut-off of 1 × 10^-5. Additionally, METABOLIC v4 (ref. 59) and DefenseFinder v1 (ref. 60) were used to identify potential metabolic and antiviral genes, respectively.
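
The MAG quality cut-off quoted above (>70% completeness and <10% contamination) can be illustrated with a minimal Python sketch. The `Mag` record, the function names and the example bins are hypothetical and are not taken from the study's pipeline; only the two thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mag:
    """Hypothetical record for one binned genome (not the study's data model)."""
    name: str
    completeness: float   # estimated completeness, in percent
    contamination: float  # estimated contamination, in percent


def is_mid_to_high_quality(mag, min_completeness=70.0, max_contamination=10.0):
    """Apply the strict thresholds quoted in the text: >70% and <10%."""
    return (mag.completeness > min_completeness
            and mag.contamination < max_contamination)


def filter_mags(mags):
    """Keep only the bins that pass the quality cut-off."""
    return [m for m in mags if is_mid_to_high_quality(m)]


# Made-up example bins, for illustration only.
bins = [
    Mag("bin_a", 95.2, 1.3),   # passes both thresholds
    Mag("bin_b", 70.0, 2.0),   # fails: completeness is not strictly above 70
    Mag("bin_c", 88.0, 12.5),  # fails: contamination is too high
]
kept = filter_mags(bins)
print([m.name for m in kept])
```

In this sketch the comparisons are strict, mirroring the ">" and "<" signs in the text; a pipeline using inclusive thresholds would keep `bin_b` as well.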

[...] >5× and breadth (fraction of the rep_mMAG covered by at least one read) of >0.7, relative abundances in each sample were determined using the genome-wide average read-mapping coverage. For rep_vMAGs with an average coverage of >5× and breadth >0.7, normalized abundances in each sample were calculated by normalizing the average coverage of viral scaffolds in each rep_vMAG by the number of reads in each sample.
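
As a rough illustration of the detection and normalization rule described above, the following Python sketch computes average coverage and breadth from per-base read depths and divides the average coverage by the number of reads in the sample. The function names, the per-base depth input and the choice to return `None` for genomes below the thresholds are assumptions made for this example, not details from the study.

```python
def coverage_stats(depths):
    """Return (average coverage, breadth) for a list of per-base read depths.

    Breadth is the fraction of positions covered by at least one read.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    mean_coverage = sum(depths) / len(depths)
    breadth = sum(1 for d in depths if d > 0) / len(depths)
    return mean_coverage, breadth


def normalized_abundance(depths, reads_in_sample,
                         min_coverage=5.0, min_breadth=0.7):
    """Average coverage divided by the number of reads in the sample.

    Returns None when the genome does not exceed the >5x coverage and
    >0.7 breadth thresholds (treating it as not detected is an assumption).
    """
    mean_coverage, breadth = coverage_stats(depths)
    if mean_coverage > min_coverage and breadth > min_breadth:
        return mean_coverage / reads_in_sample
    return None


# Made-up depths: 8 of 10 positions covered at 10x, in a sample of 1,000,000 reads.
example_depths = [10] * 8 + [0] * 2
print(coverage_stats(example_depths))
print(normalized_abundance(example_depths, 1_000_000))
```

Dividing by the read count makes values comparable between samples sequenced to different depths, which is the purpose of the normalization described in the text.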

[...] >20 bp) than in Fig. 3 (spacer length >25 bp and each edge representing two distinct matches). Only interaction with spacer length >25 bp is highlighted with the red edge. Viral nodes are scaled to the rep_vMAG length, and rep_mMAGs with genomic capacity to carry out sulfur oxidation are colored in blue.

[...] >5, breadth >0.7). Viral nodes (circular) are labeled according to the corresponding rep_vMAG ID. Microbial nodes are colored according to the taxon, using the same color scheme as the main Fig. 3. Node sizes correspond to the sample-specific read-mapping coverages. Thickness of the edges represents the number of contig-to-contig linkages, while the darkness of the edges correlates to the maximal normalized strength of the Hi-C contacts between any two contigs in a host-virus pair. Host-virus pairs that were previously detected using CRISPR-spacer matches are colored in red. Identified Hi-C linkages between viruses are noted with blue edges.

[...] > 0.05). Box plot shows the quartiles (25, 50, 75 percentiles) with the upper and lower whiskers showing the max and min value within 1.5 times the interquartile range, respectively. (C) Principal coordinate analyses of the rep_mMAGs in the two datasets; hydrothermal mat samples are colored in red and hydrothermal water samples are colored in blue. The percentage of variance explained by each axis is shown in the axis label.

[...] >5 coverage and >0.7 breadth using read mapping are shown and proviruses are excluded.

[...] >25 with at most two mismatches.
Supplementary Table 9. Hi-C library statistics.
Supplementary Table 10. Hi-C normalized linkage between rep_vMAG and rep_mMAG. Contig-to-contig linkage information was consolidated by count of linkage, average residual (normalized ‘strength’) of the linkage and maximum residual of the linkage.
Supplementary Table 11. Statistics of the rep_mMAGs binned across the ten hydrothermal water samples.
Supplementary Table 12. Genome-based metabolic capabilities and other genetic features of rep_mMAGs in the hydrothermal water samples.
Supplementary Table 13. Number of high-confidence (evidence level = 4) CRISPR repeats binned across environments. For hydrothermal mat samples and hydrothermal water samples from this study, we include information on the total number of evidence level 4 CRISPRs detected across environments as well as those that were binned in mid- to high-quality MAGs.
Supplementary Table 14. Binned CRISPR repeats in hydrothermal water samples.
Supplementary Table 15. All CRISPR-spacer to protospacer matches in hydrothermal water samples with spacer length >20 with at most two mismatches.
Supplementary Table 16. Information of the high-quality and complete rep_vMAGs binned across the ten hydrothermal water samples.
Supplementary Table 17. List of UViG ID and their putative hosts and host-prediction methods.
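
The spacer-to-protospacer criterion quoted above (spacer length >20 with at most two mismatches) can be sketched as a naive sliding-window search. This is only an illustration of the matching rule: the function names and the example sequences are hypothetical, both strands are scanned by assumption, and the study itself may have used a dedicated aligner rather than this approach.

```python
# Complement table for unambiguous DNA bases; other characters pass through unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_protospacers(spacer, contig, min_spacer_len=20, max_mismatches=2):
    """Return (strand, start, mismatches) for every window of the contig that
    matches the spacer with at most `max_mismatches` substitutions.

    Spacers that are not longer than `min_spacer_len` are skipped, mirroring
    the ">20" length cut-off. Gaps are not considered in this sketch.
    """
    if len(spacer) <= min_spacer_len:
        return []
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for start in range(len(contig) - len(query) + 1):
            window = contig[start:start + len(query)]
            mismatches = sum(a != b for a, b in zip(query, window))
            if mismatches <= max_mismatches:
                hits.append((strand, start, mismatches))
    return hits


# Made-up 22-nt spacer and a viral contig carrying it with one substitution.
spacer = "ACGTTGCAAGGCTTAACCGGTA"
protospacer = spacer[:5] + "A" + spacer[6:]
viral_contig = "TTTT" + protospacer + "GGGG"
for hit in find_protospacers(spacer, viral_contig):
    print(hit)
```

Raising `min_spacer_len` to 25 reproduces the stricter length cut-off mentioned for the main network figure; the quadratic scan is adequate for illustration but would be too slow for full metagenomes.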