Whole Genome Sequencing for High-Resolution Investigation of Methicillin-Resistant Staphylococcus aureus Epidemiology and Genome Plasticity.

Publication Type:

Journal Article

Source:

Journal of Clinical Microbiology (2014)

Keywords:

2014, Human Biology Division, June 2014

Abstract:

Methicillin-resistant Staphylococcus aureus (MRSA) infections pose a major challenge to health care, yet limited heterogeneity within this group hinders molecular investigation of outbreaks. Pulsed-field gel electrophoresis (PFGE) has been a gold standard approach, but it is impractical for many clinical laboratories and is often replaced with PCR-based methods. Regardless, both approaches can prove problematic in identifying subclonal outbreaks. Here we explored the use of whole genome sequencing for clinical laboratory investigation of MRSA molecular epidemiology. We examined the relationships of 44 MRSA isolates collected over a period of 3 years using whole genome sequencing and two PCR-based methods: multiple-locus variable-number tandem-repeat analysis (MLVA) and spa-typing. We found that MLVA offers higher resolution than spa-typing, resolving 17 versus 12 discrete isolate groups, respectively. In contrast, whole genome sequencing reproducibly cataloged genomic variants (131,424 different single nucleotide polymorphisms and indels across the strain collection) that uniquely identified each MRSA clone, recapitulating those groups but enabling higher-resolution phylogenetic inference of epidemiological relationships. Importantly, whole genome sequencing detected significant numbers of variants distinguishing among groups considered identical by both spa-typing (minimum 1,124 polymorphisms) and MLVA (minimum 193 polymorphisms), suggesting that these more conventional approaches can lead to false-positive identification of outbreaks due to inappropriate grouping of genetically distinct strains. Analysis of the distribution of variants across the MRSA genome revealed 47 mutational hotspots (comprising ∼2.5% of the genome) that accounted for 23.5% of observed polymorphisms, and use of this selected data set successfully recapitulated most epidemiological relationships.
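
The hotspot figures in the abstract imply a variant density well above the genome average, and the arithmetic can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `hotspot_enrichment` is hypothetical, and the only inputs are the two rounded percentages quoted above (about 2.5% of the genome, 23.5% of polymorphisms), so the results are approximate.

```python
def hotspot_enrichment(genome_fraction, variant_fraction):
    """Return fold enrichment of variant density in hotspot regions.

    genome_fraction: share of the genome covered by hotspots (0 to 1, exclusive)
    variant_fraction: share of all variants that fall in hotspots (0 to 1, exclusive)

    Returns a tuple:
      (density relative to the genome-wide average,
       density relative to the non-hotspot remainder of the genome)
    """
    if not 0 < genome_fraction < 1:
        raise ValueError("genome_fraction must be between 0 and 1 (exclusive)")
    if not 0 < variant_fraction < 1:
        raise ValueError("variant_fraction must be between 0 and 1 (exclusive)")

    # Variant density inside hotspots, in units of the genome-wide average.
    vs_average = variant_fraction / genome_fraction
    # Variant density outside hotspots, in the same units.
    background = (1 - variant_fraction) / (1 - genome_fraction)
    # Hotspot density relative to the rest of the genome.
    vs_background = vs_average / background
    return vs_average, vs_background


# Rounded values taken from the abstract: ~2.5% of the genome, 23.5% of variants.
vs_average, vs_background = hotspot_enrichment(0.025, 0.235)
print(f"{vs_average:.1f}x the genome-wide average")
print(f"{vs_background:.1f}x the non-hotspot background")
```

With these rounded inputs, hotspots carry variants at roughly 9 times the genome-wide average density, and roughly 12 times the density of the remaining genome.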