The examples in this vignette use the GlobalPatterns dataset from phyloseq.

library(phyloseq)
data(GlobalPatterns)


Normalization Methods


library_size

Performs a library-size normalization on the phyloseq object, so that every sample ends up with approximately the same total count.


Usage

library_size(phyloseq_obj)


Arguments

Call Description
phyloseq_obj A phyloseq-class object.

Examples

phyloseq::sample_sums(GlobalPatterns)
##      CL3      CC1      SV1  M31Fcsw  M11Fcsw  M31Plmr  M11Plmr  F21Plmr 
##   864077  1135457   697509  1543451  2076476   718943   433894   186297 
##  M31Tong  M11Tong LMEpi24M SLEpi20M   AQC1cm   AQC4cm   AQC7cm      NP2 
##  2000402   100187  2117592  1217312  1167748  2357181  1699293   523634 
##      NP3      NP5  TRRsed1  TRRsed2  TRRsed3     TS28     TS29    Even1 
##  1478965  1652754    58688   493126   279704   937466  1211071  1216137 
##    Even2    Even3 
##   971073  1078241
normalized_obj <- library_size(GlobalPatterns)
phyloseq::sample_sums(normalized_obj)
##      CL3      CC1      SV1  M31Fcsw  M11Fcsw  M31Plmr  M11Plmr  F21Plmr 
##   813560   813238   812484   813642   812806   812915   813267   812818 
##  M31Tong  M11Tong LMEpi24M SLEpi20M   AQC1cm   AQC4cm   AQC7cm      NP2 
##   812618   812874   812751   813366   813325   812674   812123   813593 
##      NP3      NP5  TRRsed1  TRRsed2  TRRsed3     TS28     TS29    Even1 
##   813679   812232   813431   813620   813538   813343   813339   813499 
##    Even2    Even3 
##   813589   813631
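
The idea behind a library-size normalization can be sketched in base R on a toy count matrix. This is only an illustration, not the package's implementation: the matrix is hypothetical, and the choice of the mean library size as the target depth is an assumption (the target that library_size actually uses may differ).

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(10, 30, 60,      # sample1
     5,  5, 10,      # sample2
   200, 100, 100),   # sample3
  nrow = 3,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:3))
)

# Library size is the total count in each sample (column).
lib_sizes <- colSums(counts)

# Assumed target depth: the mean library size across samples.
target <- mean(lib_sizes)

# Rescale every sample so its total matches the target, then round back to
# whole counts. The rounding is why totals may differ by a few counts.
normalized <- round(sweep(counts, 2, target / lib_sizes, "*"))

colSums(normalized)
```

After scaling, the column sums agree with the target up to rounding, much like the near-identical sample sums in the example output above.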




relative_abundance

Transforms the otu_table count data to relative abundance. Relative abundance sets the count sum for each sample to 1 and assigns each taxon an abundance equal to its proportion of that sample's total count (very low abundance taxa may ).


Usage

relative_abundance(phyloseq_obj)


Arguments

Call Description
phyloseq_obj A phyloseq-class object that contains otu_table count data.

Examples

phyloseq::sample_sums(relative_abundance(GlobalPatterns, 10))
##       CL3       CC1       SV1   M31Fcsw   M11Fcsw   M31Plmr   M11Plmr   F21Plmr 
## 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 
##   M31Tong   M11Tong  LMEpi24M  SLEpi20M    AQC1cm    AQC4cm    AQC7cm       NP2 
## 1.0000000 1.0000000 1.0000000 1.0000000 0.9999999 0.9999999 1.0000000 1.0000000 
##       NP3       NP5   TRRsed1   TRRsed2   TRRsed3      TS28      TS29     Even1 
## 0.9999999 1.0000001 1.0000000 1.0000001 1.0000000 1.0000000 1.0000000 1.0000001 
##     Even2     Even3 
## 1.0000000 1.0000000
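
As a minimal base R sketch of the transformation described above, each sample's counts can be divided by that sample's total. The toy matrix is hypothetical, and this does not reproduce any rounding the package may apply.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(10, 30, 60,
     5,  5, 10,
   200, 100, 100),
  nrow = 3,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:3))
)

# Divide each column by its own total so every sample sums to 1.
rel_abund <- sweep(counts, 2, colSums(counts), "/")

rel_abund
colSums(rel_abund)
```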




Taxa


common_taxa

Identifies which entries in the tax_table are shared among treatment groups. It returns a vector of the names of the taxa that are seen in n groups.


Usage

common_taxa(phyloseq_obj, treatment = NULL, subset = NULL, n = 'all')


Arguments

Call Description
phyloseq_obj A phyloseq-class object.
treatment Column name as a string, or a vector of column names, in the sample_data.
subset A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.
n Number of treatment groups that must share a taxon for it to be considered a common taxon.

Examples

common_taxa(GlobalPatterns, treatment = 'SampleType', 
  subset = 'Tongue', n = 'all')[1:35]
##  [1] "100071" "100077" "100099" "100171" "100201" "100683" "100730" "100757"
##  [9] "100807" "100847" "100937" "100954" "100980" "101000" "101152" "101184"
## [17] "101210" "101215" "101310" "101369" "101411" "101437" "101444" "101464"
## [25] "101503" "101552" "101628" "101632" "101660" "101707" "101731" "101824"
## [33] "101837" "101880" "10189"
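
The notion of a taxon being shared among groups can be sketched in base R. The data and the group labels here are hypothetical, and two choices are assumptions rather than documented behavior: a taxon counts as "seen" in a group if it has a nonzero count in at least one of that group's samples, and n is matched exactly.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(5, 0, 2, 0,   # sample1
    3, 0, 0, 1,   # sample2
    0, 4, 6, 0,   # sample3
    1, 0, 7, 0),  # sample4
  nrow = 4,
  dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:4))
)

# Hypothetical treatment group of each sample.
group <- c(sample1 = "A", sample2 = "A", sample3 = "B", sample4 = "B")

# Pool counts within each group. rowsum() groups rows, so transpose first.
group_counts <- t(rowsum(t(counts), group))

# A taxon is "seen" in a group if its pooled count is above zero.
present <- group_counts > 0
n_groups_seen <- rowSums(present)

# n = 'all' is taken to mean every group.
n <- ncol(present)
common <- names(n_groups_seen)[n_groups_seen == n]
common
```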



taxa_core

Filters the taxa in a phyloseq object to include only core taxa.

Usage

taxa_core(phyloseq_obj, treatment = NULL, subset = NULL, frequency = 0.5, abundance_threshold = 0.01)


Arguments

Call Description
phyloseq_obj A phyloseq-class object.
treatment Column name as a string, or a vector of column names, in the sample_data.
subset A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.
frequency The proportion of samples in which a taxon must be found.
abundance_threshold The minimum relative abundance a taxon must have in a sample to count as found in that sample.

Examples

The GlobalPatterns data has 19,216 OTUs listed in its tax_table.

taxa_core(GlobalPatterns, frequency = 0.2, abundance_threshold = 0.01)
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 5 taxa and 26 samples ]
## sample_data() Sample Data:       [ 26 samples by 7 sample variables ]
## tax_table()   Taxonomy Table:    [ 5 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 5 tips and 4 internal nodes ]
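
One plausible reading of the frequency and abundance_threshold arguments is sketched below in base R. The data are hypothetical, and the interpretation is an assumption: a taxon is kept when its relative abundance is at or above the threshold in at least the given proportion of samples. The package's actual filter may differ in detail.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(50, 40,  9, 1,   # sample1
    60, 30, 10, 0,   # sample2
    70,  0, 29, 1,   # sample3
    80,  0, 20, 0),  # sample4
  nrow = 4,
  dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:4))
)

frequency <- 0.75            # must pass in at least 75% of samples
abundance_threshold <- 0.05  # must make up at least 5% of a sample

# Convert to relative abundance within each sample.
rel_abund <- sweep(counts, 2, colSums(counts), "/")

# Proportion of samples in which each taxon meets the abundance threshold.
prop_samples <- rowMeans(rel_abund >= abundance_threshold)

# Keep only the taxa that meet the threshold often enough.
core <- rownames(counts)[prop_samples >= frequency]
core_counts <- counts[core, , drop = FALSE]
core
```

Here taxon2 passes the abundance threshold in only half of the samples and taxon4 in none, so both are dropped.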



taxa_proportions

Computes the proportions of the taxa at a given classification level. This can be done by treatment, by sample, or across the whole dataset.

Usage

taxa_proportions(phyloseq_obj, classification, treatment = NA)


Arguments

Call Description
phyloseq_obj A phyloseq-class object.
classification Column name as a string or numeric in the tax_table for the proportions to be reported on.
treatment Column name as a string, or a vector of column names, in the sample_data.

Examples

taxa_proportions(GlobalPatterns, 'Phylum', treatment = "SampleType")
taxa_proportions(GlobalPatterns, 'Phylum', treatment = 'Sample')
taxa_proportions(GlobalPatterns, 'Phylum', treatment = NULL)
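
The three levels of aggregation (whole dataset, per sample, per treatment) can be sketched in base R. Everything here is hypothetical toy data, and pooling raw counts before taking proportions is an assumption; the package may instead work from relative abundances.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(10, 30, 60,
     5,  5, 10,
   200, 100, 100),
  nrow = 3,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:3))
)

# Hypothetical phylum of each taxon and treatment group of each sample.
phylum <- c(taxon1 = "Firmicutes", taxon2 = "Firmicutes",
            taxon3 = "Bacteroidetes")
group <- c(sample1 = "A", sample2 = "A", sample3 = "B")

# Pool counts of taxa that share a phylum (rows are grouped by phylum).
phylum_counts <- rowsum(counts, phylum)

# Proportions across the whole dataset.
overall <- rowSums(phylum_counts) / sum(phylum_counts)

# Proportions within each sample.
per_sample <- sweep(phylum_counts, 2, colSums(phylum_counts), "/")

# Proportions within each treatment group: pool samples first.
group_counts <- t(rowsum(t(phylum_counts), group))
per_group <- sweep(group_counts, 2, colSums(group_counts), "/")

overall
per_sample
per_group
```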




unique_taxa

Identifies which taxa are unique to a specific treatment group. It returns a list with one vector per group, each holding the names of the taxa that are seen only in that group.


Usage

unique_taxa(phyloseq_obj, treatment, subset = NULL)


Arguments

Call Description
phyloseq_obj A phyloseq-class object.
treatment Column name as a string, or a vector of column names, in the sample_data.
subset A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.

Examples

uniques <- unique_taxa(GlobalPatterns, treatment = "SampleType")
data.frame(lapply(uniques, "length<-", max(lengths(uniques))))
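
The same idea can be tried in base R on toy data, which also shows what the "length<-" call in the example is for: it pads the shorter vectors with NA so the list fits into a data.frame. The data and group labels are hypothetical, and treating any nonzero count as presence is an assumption.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(5, 0, 2, 0, 3,   # sample1
    3, 0, 0, 1, 0,   # sample2
    0, 4, 6, 0, 0,   # sample3
    1, 0, 7, 0, 0),  # sample4
  nrow = 5,
  dimnames = list(paste0("taxon", 1:5), paste0("sample", 1:4))
)

# Hypothetical treatment group of each sample.
group <- c(sample1 = "A", sample2 = "A", sample3 = "B", sample4 = "B")

# Pool counts per group and mark where each taxon is present.
group_counts <- t(rowsum(t(counts), group))
present <- group_counts > 0

# A taxon is unique to a group if that is the only group it appears in.
only_one <- rowSums(present) == 1
uniques <- lapply(colnames(present), function(g) {
  rownames(present)[present[, g] & only_one]
})
names(uniques) <- colnames(present)

# Pad every vector to the longest length with NA, then bind as columns.
padded <- data.frame(lapply(uniques, "length<-", max(lengths(uniques))))
padded
```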





Schuyler Smith
Ph.D. Bioinformatics and Computational Biology