Metagenomics & Bioinformatics

Metagenomics is the study of whole communities of microbes, using the entire “soup” of DNA present in a sample. We focus mostly on “shotgun metagenomics”, which involves sequencing the complete collection of DNA present rather than a subset of genes or genetic regions, as is done in 16S rRNA amplicon sequencing.

One advantage of this approach is that it provides the full genetic picture of a community, in principle capturing every gene region from every organism. This allows us to use powerful tools like comparative evolutionary genomics (looking at gene repertoires), functional comparative genomics (looking at the completeness of pathways), and population genomics (looking at mutations or SNPs across genomes over time and space).

One challenge is that short-read sequencing technologies (e.g. Illumina) require us to fragment the genomes before sequencing them, leaving us with an incredibly large puzzle of small sequenced pieces that must be computationally stitched back together, based on overlaps, to reveal the true original genomes. This process is inherently error-prone: natural mutations, sequencing mistakes, and repetitive DNA regions can all leave it uncertain which overlap is the correct one.
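To make the idea of stitching reads together by their overlaps concrete, here is a minimal toy sketch in Python. It is only an illustration, not the method used in our lab or in any real assembler: the function names (overlap_length, greedy_assemble), the min_overlap parameter, and the example reads are all made up for this example, and it assumes error-free reads from a single strand with no repeats. Real metagenomic assemblers use far more scalable graph-based approaches.

```python
def overlap_length(left, right, min_overlap=3):
    """Return the length of the longest suffix of `left` that is also a
    prefix of `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Toy greedy assembly: repeatedly merge the pair of sequences with
    the longest exact overlap until no pair overlaps any more.

    Assumes error-free, same-strand reads (hypothetical simplification).
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k = overlap_length(a, b, min_overlap)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge; remaining contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three overlapping fragments of a made-up 14-base "genome".
reads = ["ATGGCCT", "GCCTAAGT", "AAGTCGA"]
print(greedy_assemble(reads))  # ['ATGGCCTAAGTCGA']
```

With clean reads the puzzle has a single solution. As soon as a repeat is longer than the overlap, or a read carries a sequencing error or a strain-level mutation, several merges can look equally good (or none match exactly), which is the uncertainty described above.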

Thus, our work is rewarding, but it takes a lot of effort, both in assembling the genomes and in interpreting the results. If you are interested in this work, or have computational or molecular biology skills to contribute, we are looking for undergraduate, graduate, and postdoctoral researchers to join our lab! Please contact me.

We combine and complement our metagenomics with metatranscriptomics. Metatranscriptomics involves sequencing the entire “soup” of RNA; the mRNA fraction acts as a proxy for gene expression. We are developing and refining tools for small RNA metatranscriptomics, proteomics, metabolomics, and other related approaches. If you wish to help us in these exciting endeavours, please contact me.

Bioinformatics is a broad field addressing biological (and big data) problems computationally. We are particularly interested in tool development, especially as it relates to the two most challenging ends of our work: de novo assembly of mixed-strain microbial communities, and comparative functional genomics pipelines that make sense of the qualitative and quantitative data we obtain from our samples.

We are always interested in help from, and collaboration with, computer scientists! Please contact me if you have some thoughts on this or would like to be involved.