Interrogating the viral dark matter of the rumen ecosystem with a global virome database

Jun 18, 2023

Nature Communications volume 14, Article number: 5254 (2023)

The diverse rumen virome can modulate the rumen microbiome, but it remains largely unexplored. Here, we mine 975 published rumen metagenomes for viral sequences, create a global rumen virome database (RVD), and analyze the rumen virome for diversity, virus-host linkages, and potential roles in affecting rumen functions. Containing 397,180 species-level viral operational taxonomic units (vOTUs), RVD substantially increases the detection rate of rumen viruses from metagenomes compared with IMG/VR V3. Most of the classified vOTUs belong to Caudovirales and differ from those found in the human gut. The rumen virome is predicted to infect the core rumen microbiome, including fiber degraders and methanogens, and carries diverse auxiliary metabolic genes; it thus likely affects the rumen ecosystem in both a top-down and a bottom-up manner. RVD and the findings provide useful resources and a baseline framework for future research on how viruses may affect the rumen ecosystem and digestive physiology.

A flurry of recent virus-focused metagenomic studies has generated very large catalogs and databases of viral genomes for several ecosystems, including the ocean [1,2], the human gut [3,4,5], and soil [6]. These studies have revealed highly diverse viromes, identified numerous auxiliary metabolic genes, and shed new light on the ecological impact of viruses. In addition, studies of model systems have begun to reveal how viruses can reprogram the metabolism of their prokaryotic hosts by forming distinct virocells that alter the ecological fitness and metabolism of the hosts [7]. Emerging evidence supports potential impacts of viruses on ocean biogeochemistry [1,8], human physiology [4], and disease states [9]. No similar studies of the rumen virome, and no rumen-specific virome database, are available.

The rumen harbors a diverse multi-kingdom ecosystem containing bacteria, archaea, fungi, protozoa, and viruses. Collectively, the rumen microbiome digests and ferments otherwise indigestible feed and supplies most of the energy (as volatile fatty acids) and metabolizable nitrogen (as microbial protein) that ruminants need to grow and produce meat and milk. Strong associations of rumen bacteria, archaea, and protozoa with feed efficiency, methane (CH4) emissions, and animal health have been documented [10], but rumen viruses, although abundant, remain poorly understood, even though virus-focused studies have contributed to the characterization of the rumen virome [11,12]. Early studies using electron microscopy documented morphologically diverse bacteriophages and revealed the predominance of tailed phages [13,14]. Early culture-dependent studies found bacteriophages that could infect a wide range of species or strains of rumen bacteria, including prevalent species of Prevotella, Ruminococcus, and Streptococcus, and classified most of these phages by morphology into the families Myoviridae, Siphoviridae, Podoviridae, and Inoviridae (reviewed by Gilbert and Klieve [15]). Although these studies provided valuable information on rumen viruses, phage morphology alone does not allow reliable taxonomic classification, and the International Committee on Taxonomy of Viruses (ICTV: https://ictv.global/taxonomy) therefore no longer recognizes morphology-based classification of viruses.

Genomics, metagenomics, and metatranscriptomics have become the primary technologies for studying viromes, including the rumen virome. Recent culture-dependent whole-genome sequencing identified 10 phages infecting Prevotella ruminicola, Ruminococcus albus, Streptococcus bovis, and Butyrivibrio fibrisolvens [16,17], which play important roles in feed digestion and fermentation. These phage genomes show a modular genome organization, conserved viral genes, and the potential to be either lytic or lysogenic [17]. Rumen viruses have also been studied using metagenomes of virus-like particles (VLPs) (reviewed in [11]). However, the reference genome databases used underrepresent rumen viruses, which limits the identification and classification of rumen viruses and the prediction of their hosts. For example, rumen viruses of diverse genotypes have been found, but most could not be classified because they lacked matches to reference viral sequences [18,19,20]. Miller et al. [18] found clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) elements in some rumen microbial genomes and metagenomes, but found few spacer sequences matching rumen viral sequences for host prediction. It has therefore been difficult to characterize the rumen virome, especially with respect to novel viruses.

[…] 12-fold) and IMG/VR V3 and improving the identification of viral sequences based on rumen metagenomics, RVD will be useful as a new community resource and will provide new insights for future studies on the rumen virome and its implications for feed digestion, microbial protein synthesis, feed efficiency, and CH4 emissions.

[…] >5 kb each and clustered them into 411,125 vOTUs. After validation with VIBRANT [23], we constructed a rumen virome database (RVD, available for download at https://zenodo.org/record/7412085#.ZDsE2XbMK5c) representing 397,180 vOTUs (Supplementary Fig. 1), of which 193,327 were >10 kb. CheckV [21] identified 4400 complete vOTUs, 4396 high-quality vOTUs, and 32,942 medium-quality vOTUs. The completeness and quality of the RVD vOTUs were probably underestimated because CheckV is database dependent, and the databases it uses are primarily derived from other ecosystems. All the vOTUs in RVD meet the Minimum Information about an Uncultivated Virus Genome (MIUViG) standards [25].

[…] >50% completeness of the current study and the two largest human gut virome databases (MGV [4] and GPD [5]). For better visualization, only one representative vOTU (the longest and most complete) was included for each genus-level vOTU (714 in total). The branches were color-coded: green, the Caudovirales lineages found exclusively in the human virome; red, the lineages found exclusively in the rumen virome of the current study; blue, the lineages found in both the rumen and the human viromes. Lysogeny rates (proportion) were calculated with VIBRANT and are shown as the inner ring. The number of vOTUs representing each lineage is shown as a bar plot (red for rumen viruses, and black for human viruses). (d) Proportion of Caudovirales lineages unique to the human intestine, unique to the rumen, and shared. (e) A rarefaction curve of the vOTUs identified in the rumen virome. The upward trend of the rarefaction curve indicates that more rumen viruses remain to be identified at the species level.

[…] >1 phage per host genome.
The percentage of lysogenic viruses varied among the host genera, and it was low for most of them (Fig. 3c). Most ciliate SAGs presented multiple EVEs; all five SAGs of Isotricha sp. YL-2021b and Dasytricha ruminantium presented the greatest number of EVEs (>50 per SAG) (Supplementary Fig. 5). Little is known about viruses infecting ciliates, and no EVEs have been reported even for model ciliate species (e.g., Tetrahymena thermophila). However, EVEs have recently been found in Entamoeba and Giardia in human stool metagenomes [32]. Therefore, rumen ciliates probably carry EVEs. The large number of EVEs per ciliate SAG may correspond to the high polyploidy and the enormous numbers of chromosomes found in many rumen ciliates (e.g., >10,000 in Entodinium caudatum [33]).

[…] >12-fold). Based on the gene-sharing network, most rumen vOTUs were clustered into four groups (Fig. 3b). Groups I (the largest) and IV (the smallest) contained more classified vOTUs than groups II and III. Groups I and IV had a broader host range among bacterial phyla, including both gram-positive and gram-negative bacteria with different niches and capacities, but few of their host genera or families were predominant in the rumen. Groups II and III mainly infected Bacteroidota and Methanobacteriota, respectively (Fig. 3c), and most viruses of these two groups could not be classified with any of the current virome databases; thus, they represent new viral lineages. The narrow host range (a single phylum) of groups II and III supports the notion that phages with a high degree of gene sharing generally infect phylogenetically related hosts.

[…] >2400) and bacteriophages (>40,000) down to the species level, and many of the host species are known to play important roles in feed digestion, fermentation, and methane emissions. Advances in the prediction of hosts and virus-host linkages will aid in understanding the ecological roles of rumen viruses.
Such information will be especially useful when both the rumen metagenome and virome are investigated for their association with major rumen functions. Among the rumen vOTUs with a predicted host match, 99.5% were inferred to infect prokaryotes primarily found in the rumen, even though most of the reference prokaryote genomes used came from prokaryotes in other environments, which demonstrates the rigor and low false positive rate of our host prediction pipeline.

[…] >5 kb were verified using VirSorter2 [22] (option: --min-score 0.5), and the contigs that passed the verification procedure were input to CheckV [21] to trim off host sequences flanking prophages. We chose only viral contigs >5 kb because the currently available bioinformatics tools show a relatively high false positive rate when identifying viral contigs <5 kb [30]. Only the contigs falling into categories Keep1 and Keep2 were retained as putative viral contigs (708,580 in total) for further analyses.

[…] >10 kb to genus-level viral taxa based on a gene-sharing network using vConTACT2 [26], which uses NCBI RefSeq Viral (release 88) as reference genomes. The vOTUs that could be clustered with the reference genomes of a viral genus were assigned to that genus according to the vConTACT2 workflow. We assigned the vOTUs that could not be assigned to a viral genus, and those <10 kb, to family-level viral taxa using the majority rule, as applied previously [4]. Briefly, we predicted the ORFs of each vOTU using Prodigal [56] and then aligned the ORF sequences with those of NCBI RefSeq Viral using BLASTp with a bit score of ≥50. The vOTUs for which >50% of the protein sequences aligned with the NCBI RefSeq Viral genomes of a single viral family were assigned to that family.
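
The majority rule described above can be illustrated with a minimal Python sketch. It assumes the BLASTp hits of each vOTU have already been parsed into (protein ID, viral family, bit score) tuples. The function name `assign_family`, its parameters, and the choice to keep only the best-scoring hit per protein are illustrative assumptions, not details taken from the authors' pipeline.

```python
from collections import Counter


def assign_family(n_proteins, hits, min_bitscore=50.0, min_fraction=0.5):
    """Assign a vOTU to a viral family by majority rule.

    n_proteins: total number of predicted proteins (ORFs) in the vOTU.
    hits: iterable of (protein_id, family, bitscore) tuples (hypothetical
          pre-parsed BLASTp output against a reference viral database).
    Returns the family name, or None if no family exceeds the threshold.
    """
    best = {}
    for protein_id, family, bitscore in hits:
        if bitscore < min_bitscore:
            continue  # ignore weak alignments (bit score below 50)
        # Assumption: only the best-scoring hit of each protein counts.
        if protein_id not in best or bitscore > best[protein_id][1]:
            best[protein_id] = (family, bitscore)

    if n_proteins == 0 or not best:
        return None

    counts = Counter(family for family, _ in best.values())
    family, n_hits = counts.most_common(1)[0]
    # The family must account for more than 50% of ALL proteins in the vOTU.
    return family if n_hits / n_proteins > min_fraction else None
```

For example, a vOTU with four proteins, three of which best match the same family, would be assigned to that family, whereas a vOTU with exactly half of its proteins matching would remain unassigned under the strict ">50%" reading used here.
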
We identified crAss-like phages using BLASTn against 2478 crAss-like phage genomes identified in previous studies [57,58,59], with a threshold of ≥80% sequence identity along ≥50% of the length of previously identified crAss-like vOTUs.

[…] >50% were included in the search. We then aligned each of the marker genes from the three databases using MAFFT [62], sliced out the positions with >50% gaps using trimAl [63], concatenated each aligned marker gene, and filled the gap where a marker gene was absent. Only the concatenated marker genes that each showed >3 marker genes and were found in >5% of all the aligned concatemers were retained, resulting in 10,203 Caudovirales marker gene concatemers, each with 13,573 alignment columns. These marker gene concatemers were clustered into genus-level vOTUs as described previously [5], where benchmarking was performed to achieve high taxonomic homogeneity using NCBI RefSeq Viral genomes. We built a phylogenetic tree of Caudovirales viruses using FastTree v.2.1.9 (option: -mlacc 2 -slownni -wag) [64] and aligned the concatenated marker genes of the representative vOTU sequences of all the genus-level vOTUs with genome completeness >50% (based on CheckV analysis). The Caudovirales tree was visualized using iTOL [65]. The vOTUs identified as prophages or encoding an integrase were considered lysogenic. The lysogenic rate (%) was calculated from the VIBRANT results as the percentage of lysogenic viruses among all the viruses for each genus of their probable hosts.

[…] >2500 bp of a host genome or MAG matched a vOTU sequence at >90% sequence identity over 75% of the vOTU sequence length [4]. We predicted probable protozoal hosts of the rumen viruses by searching the 52 high-quality ciliate SAGs [68] for EVEs using BLASTn and the above criteria.

[…] >10 kb (5912 in total) for AMG identification using the criteria recommended in a benchmarking paper [30].
The selected vMAGs were then subjected to AMG identification and genome annotation using DRAMv [72] after processing with VirSorter2 with the option "--prep-for-dramv" applied. Second, the AMG-carrying vMAGs were removed if the AMGs were at an end of the vMAGs or if the AMGs were not flanked by both one viral hallmark gene and one viral-like gene or by two viral hallmark genes (category 1 and category 2 as determined by DRAMv). Third, the remaining vMAGs were further manually curated based on the criteria specified in the VirSorter2 SOP (https://doi.org/10.17504/protocols.io.bwm5pc86; also see https://github.com/yan1365/RVD/blob/main/vmags_check_helper/readme.txt). We eventually obtained 1880 vMAGs. To further minimize false identification, we manually checked the genomic context of these vMAGs and found that some of them could still be genomic islands. Therefore, we filtered the 1880 vMAGs based on the criteria established by Sun and Pratama et al. (unpublished data). Briefly, vMAGs with only integrases/transposases, tail fiber genes, or any nonviral genes were removed. The remaining vMAGs were filtered again to remove those that did not have at least one of the viral structural genes (i.e., capsid protein, portal protein, phage coat protein, baseplate, head protein, tail protein, virion structural protein, and terminase) and those containing genes encoding an endonuclease, plasmid stability protein, lipopolysaccharide biosynthesis enzyme, glycosyltransferase (GT) families 11 and 25, nucleotidyltransferase, carbohydrate kinase, or nucleotide sugar epimerase. We eventually obtained 504 vMAGs free of genomic islands. To benchmark our curation pipeline, 100 of the vMAGs were randomly selected for detailed manual curation based on their genomic context. According to the benchmarking results, we were confident that we retained only complete vMAGs for AMG prediction.
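
As a rough illustration of the second gene-based filter described above (at least one viral structural gene required, certain gene types disqualifying), the following Python sketch applies simple keyword matching to a list of gene annotation strings. The keyword lists paraphrase the gene names in the text; the exact annotation strings produced by DRAMv, the function name `passes_gene_filter`, and the use of substring matching are all assumptions made for illustration. The earlier filter steps and the manual curation are not covered.

```python
# Structural gene names taken from the text, lowercased for matching.
STRUCTURAL_KEYWORDS = (
    "capsid protein", "portal protein", "phage coat protein", "baseplate",
    "head protein", "tail protein", "virion structural protein", "terminase",
)

# Gene types whose presence disqualifies a vMAG. The exact wording of the
# glycosyltransferase family entries is a hypothetical annotation format.
EXCLUDED_KEYWORDS = (
    "endonuclease", "plasmid stability protein",
    "lipopolysaccharide biosynthesis",
    "glycosyltransferase family 11", "glycosyltransferase family 25",
    "nucleotidyltransferase", "carbohydrate kinase",
    "nucleotide sugar epimerase",
)


def passes_gene_filter(annotations):
    """Return True if a vMAG has >=1 structural gene and no excluded gene.

    annotations: list of free-text gene annotations for one vMAG.
    """
    lowered = [a.lower() for a in annotations]
    has_structural = any(k in a for a in lowered for k in STRUCTURAL_KEYWORDS)
    has_excluded = any(k in a for a in lowered for k in EXCLUDED_KEYWORDS)
    return has_structural and not has_excluded
```

Keyword matching of this kind is only a first pass; annotation vocabularies vary between tools, which is one reason the manual check of genomic context described in the text remains necessary.
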
Detailed results of each curation step, the full annotation of the final vMAGs, and the annotation of the identified AMGs are presented in Supplementary Data 4. We compared the AMGs identified in the rumen virome to the previously identified AMGs from other viromes, which are available in an expert-curated AMG database (https://github.com/WrightonLabCSU/DRAM/blob/master/data/amg_database.tsv). For the newly identified AMGs, we double-checked the annotations and searched the literature to ensure that they were truly AMGs.

[…] >50% concentrate). First, we transformed the raw abundance table into a binary matrix (presence or absence). Then, the prevalence of each vOTU in each sample was calculated. A vOTU was included in the core rumen virome if its prevalence exceeded 50% of the prevalence for each concentrate level or all cattle. Based on prevalence, the vOTUs were categorized as individualized (observed in only one sample), one concentrate level (observed in more than one sample but exclusively from a single concentrate level), two concentrate levels (observed in animals from two concentrate levels), and three concentrate levels (observed in all three concentrate levels). The numbers of vOTUs shared by the core viromes among the three concentrate levels were visualized with a Venn diagram in R. We examined whether animals fed the same diet or of the same breed share more vOTUs than animals fed different diets or of different breeds using subsets of data from Stewart et al. [78] and Li et al. [79], respectively. The Kruskal–Wallis test was used to compare the numbers of shared vOTUs in different groups in R.

[…] >12 metagenomes were retained for the analysis. The number of vOTUs shared by two studies was compared for every study pair, and the results were subjected to hierarchical clustering. The hierarchical clustering results were visualized in R with the ComplexHeatmap package [81] and annotated according to the metadata.
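
The presence/absence and prevalence steps used for the core-virome analysis above can be sketched in plain Python. The authors worked in R; this version is only an illustration. It assumes an abundance table stored as nested dictionaries and a mapping from each sample to its concentrate level, and it reads "prevalence exceeded 50%" as "present in more than 50% of the samples of a group". The function name `summarize_votus` and the output keys are hypothetical.

```python
def summarize_votus(abundance, sample_level, core_threshold=0.5):
    """Summarize vOTU prevalence across concentrate levels.

    abundance: {votu_id: {sample_id: abundance_value}}
    sample_level: {sample_id: concentrate_level}; every sample in
                  `abundance` is assumed to appear here.
    """
    levels = sorted(set(sample_level.values()))
    n_per_level = {lv: 0 for lv in levels}
    for lv in sample_level.values():
        n_per_level[lv] += 1

    summary = {}
    for votu, row in abundance.items():
        # Binarize: a vOTU is "present" in a sample if its abundance is > 0.
        present = [s for s, value in row.items() if value > 0]
        hits_per_level = {lv: 0 for lv in levels}
        for s in present:
            hits_per_level[sample_level[s]] += 1
        prevalence = {lv: hits_per_level[lv] / n_per_level[lv] for lv in levels}
        summary[votu] = {
            "n_samples": len(present),
            "n_levels": sum(1 for lv in levels if hits_per_level[lv] > 0),
            "individualized": len(present) == 1,  # seen in exactly one sample
            "core_in": [lv for lv in levels if prevalence[lv] > core_threshold],
            "core_overall": len(present) / len(sample_level) > core_threshold,
        }
    return summary
```

The `n_levels` value maps directly onto the one, two, and three concentrate-level categories in the text, and `core_in` lists the levels whose core virome would include the vOTU under the stated assumption.
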