SICB Meeting Abstract

P2-35 - Comparisons Between Bioinformatic Pipelines Reveal Key Differences in the Analysis of an Ecofunctional Gene
Brzezinski, HG*; Reynolds, MC; Cadillo-Quiroz, H; Arizona State University; Arizona State University; Arizona State University hbrzezin@asu.edu

Amplicon surveys using PCR commonly target a biomarker gene to determine the composition of a sample. Instead of a broad phylogenetic marker like the 16S rRNA gene, the target can be a gene specific to a narrow group of microbes that catalyze an ecologically relevant (“ecofunctional”) metabolism of interest (e.g., methanogenesis or nitrogen fixation). Bioinformatic analysis of ecofunctional genes commonly processes amplicons as either DNA or amino acid sequences. Yet detailed comparisons of these pipelines are scarce. We hypothesize that important disparities exist between pipelines, given the different algorithms used in key analysis steps (e.g., clustering, chimera checking, and taxonomy/classification). Here, we compared two established, open reference-based operational taxonomic unit (OTU) pipelines, QIIME 2-vsearch (q2-vsearch) and the FunGene Pipeline (FGP), using identical datasets, the same percent identity threshold (86%), and the same reference databases. Preliminary results showed that OTU and chimera counts varied between pipelines for the same dataset, despite similar read retention (~300,000 sequences). FGP generated roughly 11x as many OTUs (4,541 vs. 410) and excluded nearly 6,000 (20%) more chimeric sequences than q2-vsearch. Comparison of Shannon alpha diversity by a metadata category was more sensitive using reads processed with q2-vsearch. Specifically, a lower pairwise p-value (p=0.0021 vs. p=0.0131) was observed between dissimilar methanogenic microbiomes in landfill leachate, along with a lower standard error of the mean for Shannon diversity. Our results elucidate disparities across similar protocols that affect the retention and interpretation of ecofunctional amplicons. The conclusions from this study help microbial ecologists make more informed decisions about pipeline choice, lower bioinformatic barriers, and aid in establishing a unified framework for ecofunctional amplicon analysis.
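For readers unfamiliar with the diversity metric compared above, the following is a minimal sketch of how Shannon alpha diversity and its standard error of the mean can be computed from per-sample OTU counts. It is not the authors' code or either pipeline's implementation. The function names and the example counts are hypothetical, and the natural logarithm is an assumption (some tools report Shannon diversity in bits, using log base 2).

```python
import math
from statistics import stdev


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over OTUs with nonzero counts.

    Uses the natural log (an assumption; other log bases only rescale H').
    """
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    proportions = [c / total for c in otu_counts if c > 0]
    return sum(-p * math.log(p) for p in proportions)


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical OTU count vectors for three samples in one metadata group.
    group = [[120, 30, 50], [90, 60, 50], [150, 25, 25]]
    h_values = [shannon_index(sample) for sample in group]
    print("Shannon per sample:", h_values)
    print("SEM:", standard_error(h_values))
```

Because H' depends on both the number of OTUs and how evenly reads are spread among them, two pipelines that cluster the same reads into different numbers of OTUs can yield different per-sample values and different group-level standard errors.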