phylosmith is a collection of functions written to process and analyze phyloseq-class objects. Phyloseq objects are a great data standard for microbiome, gene-expression, and many other data types.

Many of these functions simply make “data-wrangling” easier for the user. Others implement complex routines in a, hopefully, efficient and concise manner. I have also included functions that make figures for quick examination of data, but these may not be suitable for publication as-is, since some require parameter optimization.

# Installation

```r
library(devtools)
install_github('schuyler-smith/phylosmith')
library(phylosmith)
```

*On Windows, you need to install Rtools; when prompted, select the option to add Rtools to the system PATH.

# Functions

### Data Wrangling

| Call | Description |
| --- | --- |
| common_taxa | find taxa common to each treatment |
| conglomerate_samples | combines samples based on common factor within sample_data |
| conglomerate_taxa | combines taxa that have same classification |
| melt_phyloseq | melts a phyloseq object into a data.table |
| merge_treatments | combines multiple columns in meta-data into a new column |
| relative_abundance | transform abundance data to relative abundance |
| set_sample_order | sets the order of the samples of a phyloseq object |
| set_treatment_levels | sets the order of the factors in a sample_data column |
| taxa_filter | filter taxa by proportion of samples seen in |
| taxa_proportions | computes the proportion of a taxa classification |
| unique_taxa | find taxa unique to each treatment |
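
To make "relative abundance" and "filter taxa by proportion of samples seen in" concrete, here is a minimal base-R sketch of the two ideas on a toy count matrix. It does not use phylosmith and is not the package's implementation; the matrix, the variable names, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(10,  0,  5,
     0,  0, 20,
    30, 10, 25),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_A", "taxon_B", "taxon_C"),
                  c("sample_1", "sample_2", "sample_3"))
)

# Relative abundance: divide every count by the total of its sample (column).
rel_abund <- sweep(counts, 2, colSums(counts), "/")

# Prevalence: the fraction of samples in which each taxon was observed.
prevalence <- rowMeans(counts > 0)

# Keep only taxa seen in at least half of the samples (assumed threshold).
min_prevalence <- 0.5
filtered <- counts[prevalence >= min_prevalence, , drop = FALSE]
```

Each column of `rel_abund` sums to 1, and `filtered` keeps only the taxa whose prevalence meets the threshold. On a phyloseq object the same operations would act on the otu_table, with treatments coming from sample_data.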

### Graphs

| Call | Description |
| --- | --- |
| abundance_heatmap_ggplot | create a ggplot object of the heatmaps of the abundance table |
| abundance_lines_ggplot | create a ggplot object of the abundance data as a line graph |
| network_ps | creates a network of the co-occurrence of taxa |
| network_layout_ps | create a layout object for a network |
| nmds_phyloseq_ggplot | create a ggplot object of the NMDS from a phyloseq object |
| phylogeny_profile_ggplot | create a ggplot barplot object of the compositions of each sample at a taxonomic level |
| taxa_abundance_bars_ggplot | create a ggplot object of the abundance of taxa in each sample |
| tsne_phyloseq_ggplot | create a ggplot object of the t-SNE from a phyloseq object |

### Calculations

| Call | Description |
| --- | --- |
| co_occurrence | calculate co-occurrence between taxa |
| permute_rho | runs permutations of the otu_table to calculate a significant $$\rho$$ value |
| histogram_permuted_rhos | create a ggplot object of the distribution of rho values |
| quantile_permuted_rhos | calculate quantiles for the permuted rho values from the Spearman-rank co-occurrence |
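
The idea behind permuting the abundance table to find a significant $$\rho$$ can be sketched in base R as follows. This is only an illustration of the general permutation approach under my own assumptions (random toy counts, 200 permutations, a two-sided 5% cutoff); it is not the code that permute_rho or quantile_permuted_rhos actually runs, and all names in it are hypothetical.

```r
set.seed(42)  # make the random data and the permutations reproducible

# Toy count matrix (hypothetical data): 5 taxa (rows) by 12 samples (columns).
counts <- matrix(rpois(5 * 12, lambda = 20), nrow = 5,
                 dimnames = list(paste0("taxon_", 1:5),
                                 paste0("sample_", 1:12)))

# Observed pairwise Spearman correlations between taxa.
# cor() correlates columns, so transpose to put taxa in columns.
observed_rho <- cor(t(counts), method = "spearman")

# Null distribution: shuffle each taxon's counts across samples independently,
# recompute the correlations, and keep one value per taxon pair.
n_permutations <- 200
permuted_rhos <- unlist(lapply(seq_len(n_permutations), function(i) {
  # apply() over rows returns a samples-by-taxa matrix of shuffled counts.
  shuffled <- apply(counts, 1, sample)
  rho <- cor(shuffled, method = "spearman")
  rho[upper.tri(rho)]
}))

# Two-sided cutoffs: observed rho values outside this range occur in fewer
# than 5% of the permuted taxon pairs.
rho_cutoffs <- quantile(permuted_rhos, probs = c(0.025, 0.975))
```

An observed correlation in `observed_rho` that falls outside `rho_cutoffs` would then be a candidate for a non-random co-occurrence, and `hist(permuted_rhos)` gives a quick look at the null distribution.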

### Datasets

Originally I had created 2 mock phyloseq objects (mock_phyloseq and mock_phyloseq2) that had no real-world data but served to show simple examples of how the functions worked.

Then I decided that I should include a real example of microbiome data (soil_column) because it’s always nice to see real examples. soil_column is a published dataset from my lab group. The data come from an experiment that examined the microbial composition of farmland soil before and after manure application, over time, using 16S sequencing.

Schuyler Smith
Ph.D. Student - Bioinformatics and Computational Biology
Iowa State University. Ames, IA.