

phylosmith is a collection of functions written to process and analyze phyloseq-class objects. The phyloseq class is a convenient data standard for microbiome, gene-expression, and many other data types.

Many of these functions simply make data wrangling easier for the user. Others implement more complex routines in what I hope is an efficient and concise manner. I have also included functions that produce figures for quick examination of data; these may not be suitable for publication as-is, since some require parameter optimization.


Installation

library(devtools)
install_github('schuyler-smith/phylosmith')
library(phylosmith)

* On Windows, you need to install Rtools; when prompted, select the option to add Rtools to the system PATH.


Functions


Data Wrangling

Call                  Description
common_taxa           find taxa common to each treatment
conglomerate_samples  combines samples based on common factor within sample_data
conglomerate_taxa     combines taxa that have same classification
melt_phyloseq         melts a phyloseq object into a data.table
merge_treatments      combines multiple columns in meta-data into a new column
relative_abundance    transform abundance data to relative abundance
set_sample_order      sets the order of the samples of a phyloseq object
set_treatment_levels  sets the order of the factors in a sample_data column
taxa_filter           filter taxa by proportion of samples seen in
taxa_proportions      computes the proportion of a taxa classification
unique_taxa           find taxa unique to each treatment
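
To make two of the descriptions above concrete, here is a minimal base-R sketch of what "relative abundance" and "filter taxa by proportion of samples seen in" mean on a small abundance table. This is not the phylosmith implementation and does not call the package; the toy counts, the object names (counts, rel_abund, prevalence, kept), and the 0.5 threshold are all assumptions made for illustration.

```r
# Toy abundance table: rows are taxa, columns are samples (hypothetical counts).
counts <- matrix(
  c(10,   0, 30,
     5,   0,  0,
    85, 100, 70),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_A", "taxon_B", "taxon_C"),
                  c("sample_1", "sample_2", "sample_3"))
)

# Relative abundance: divide each count by its sample (column) total,
# so every sample sums to 1.
rel_abund <- sweep(counts, 2, colSums(counts), "/")

# Prevalence: proportion of samples in which each taxon is observed.
prevalence <- rowMeans(counts > 0)

# Keep taxa seen in at least half of the samples (arbitrary threshold).
kept <- counts[prevalence >= 0.5, , drop = FALSE]
```

With a phyloseq object, the corresponding phylosmith calls operate on the whole object rather than on a bare matrix, so the sample and taxonomy tables stay in sync.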

Graphs

Call                        Description
abundance_heatmap_ggplot    create a ggplot object of a heatmap of the abundance table
abundance_lines_ggplot      create a ggplot object of the abundance data as a line graph
network_ps                  creates a network of the co-occurrence of taxa
network_layout_ps           create a layout object for a network
nmds_phyloseq_ggplot        create a ggplot object of the NMDS from a phyloseq object
phylogeny_profile_ggplot    create a ggplot barplot object of the composition of each sample at a taxonomic level
taxa_abundance_bars_ggplot  create a ggplot object of the abundance of taxa in each sample
tsne_phyloseq_ggplot        create a ggplot object of the t-SNE from a phyloseq object

Calculations

Call                     Description
co_occurrence            calculate co-occurrence between taxa
permute_rho              runs permutations of the otu_table to calculate a significant \(\rho\) value
histogram_permuted_rhos  create a ggplot object of the distribution of rho values
quantile_permuted_rhos   calculate quantiles for the permuted rho values from the Spearman-rank co-occurrence
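
The idea behind the permutation functions can be sketched in base R: compute the Spearman rho for a pair of taxa, shuffle one of them across samples many times to see which rho values arise by chance, and take quantiles of that null distribution as cutoffs. This is only an illustration of the general technique under assumed toy data; the variable names are hypothetical, and how phylosmith permutes the otu_table internally may differ.

```r
set.seed(1)  # make the permutations reproducible

# Hypothetical abundances of two taxa across 20 samples.
taxon_x <- rpois(20, lambda = 50)
taxon_y <- taxon_x + rpois(20, lambda = 5)  # built to co-occur with taxon_x

# Observed Spearman rank correlation between the two taxa.
observed_rho <- cor(taxon_x, taxon_y, method = "spearman")

# Null distribution: shuffling one taxon across samples breaks any pairing.
permuted_rhos <- replicate(
  1000,
  cor(taxon_x, sample(taxon_y), method = "spearman")
)

# Two-sided cutoffs for the rho values expected by chance.
cutoffs <- quantile(permuted_rhos, probs = c(0.025, 0.975))

# The observed rho is notable if it falls outside the permuted range.
is_significant <- observed_rho < cutoffs[[1]] || observed_rho > cutoffs[[2]]
```

A histogram of permuted_rhos (for example with hist(permuted_rhos)) gives the same kind of picture that histogram_permuted_rhos is described as producing.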

Datasets

I originally created two mock phyloseq objects (mock_phyloseq and mock_phyloseq2) that contain no real-world data but serve as simple examples of how the functions work.

I later decided to include a real example of microbiome data (soil_column), because it is always nice to see real examples. soil_column is a published dataset from my lab group. The data come from an experiment that examined the microbial composition of farmland soil before and after manure application, over time, using 16S sequencing.



Schuyler Smith
Ph.D. Student - Bioinformatics and Computational Biology
Iowa State University. Ames, IA.