When will someone call laST orders?


Sequence Typing (ST) is a widely established and widely used methodology. It is most certainly better than its predecessors from the era before affordable and accessible DNA sequencing, and its development and implementation were an enormous step forward for pathogen identification and for the study of the genetics of bacterial populations. In its original form, multilocus sequence typing (MLST), it was very much a DNA-based version of its protein-based predecessor, only better in every meaningful way. That conceptual predecessor, multilocus enzyme electrophoresis (MLEE), involved separating proteins on gels in slow ways that were hard to do reproducibly in the same lab, let alone in different ones, and what could be studied was limited to enzymes that could be detected after they were separated. There were other DNA-based tests, such as PFGE and RAPD, but these were also far from ideal or reproducible, with the added disadvantage that you often didn’t know what you were looking at, or why strains were different.

MLST changed this dramatically. Pieces of any gene could be selected, amplified, and sequenced. The same sequences could be amplified in any lab, and the sequences could be shared, so that a lab could get the same result on different days, and different labs could generate results that could be reliably compared. When it started it was quite expensive, but as amplification and sequencing costs came down it became cheaper as well as better. It was not perfect. Not everyone could afford it, and it uses genes that are always present and change at a useful rate (meaning slowly), so it is based on genes that are very highly conserved, where changes within them don’t change what the bacteria do. For this last reason I personally never liked it very much, and also because the STs it assigns don’t readily tell you how closely related different bugs are.

In the new world of whole genome sequencing ST has evolved, or at least got bigger. It is now cheaper to sequence a whole genome than to specifically amplify 6 to 8 genes and sequence them individually. You can get millions of bases of information for less than the cost of a few thousand, and if all you want are still those few thousand it’s easier to pull them out of the few million than to get them separately. This is probably why there was little to no resistance to converting MLST-based reference laboratories from focussed to whole genome sequencing. Since you are generating information on more than the original 8-ish genes used for MLST you might as well use it, and cgST and wgST are the result: cgST uses the genes thought to be in all strains (the core genome), and wgST uses the larger number of genes commonly present in more closely related strains. But they are all ST, which means that the information from each gene looked at is reduced to a single number, a gene version, and it is the string of these versions which makes up ‘the Sequence Type’.
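To make the "string of versions" idea concrete, here is a minimal Python sketch of a toy typing scheme. Everything in it is an assumption made for illustration: the class name `ToyTypingScheme`, the three made-up loci, and the short invented sequences. Real schemes get their allele and ST numbers from curated reference databases rather than handing them out on the fly like this.

```python
from typing import Dict, Iterable, Mapping, Tuple


class ToyTypingScheme:
    """Toy ST scheme (hypothetical): numbers are issued in order of first appearance."""

    def __init__(self, loci: Iterable[str]) -> None:
        self.loci: Tuple[str, ...] = tuple(loci)
        # Per locus: sequence -> allele number.
        self.allele_numbers: Dict[str, Dict[str, int]] = {
            locus: {} for locus in self.loci
        }
        # Allele profile -> ST number.
        self.st_numbers: Dict[Tuple[int, ...], int] = {}

    def allele_number(self, locus: str, sequence: str) -> int:
        """Return the number for this exact sequence, issuing a new one if unseen."""
        known = self.allele_numbers[locus]
        if sequence not in known:
            known[sequence] = len(known) + 1
        return known[sequence]

    def profile(self, genes: Mapping[str, str]) -> Tuple[int, ...]:
        """Reduce each gene sequence to its allele number, in locus order."""
        return tuple(self.allele_number(locus, genes[locus]) for locus in self.loci)

    def sequence_type(self, genes: Mapping[str, str]) -> int:
        """Return the ST for the profile, issuing a new one if the profile is unseen."""
        profile = self.profile(genes)
        if profile not in self.st_numbers:
            self.st_numbers[profile] = len(self.st_numbers) + 1
        return self.st_numbers[profile]


# Invented example data: three loci, three strains.
scheme = ToyTypingScheme(["geneA", "geneB", "geneC"])
strain_1 = {"geneA": "ATGGCT", "geneB": "GGTCTG", "geneC": "AAAACC"}
strain_2 = dict(strain_1, geneC="AAAACT")  # a single base changed in geneC
strain_3 = {"geneA": "TTGACT", "geneB": "GCTATG", "geneC": "CAATCC"}  # all genes differ

for name, strain in [("strain_1", strain_1), ("strain_2", strain_2), ("strain_3", strain_3)]:
    print(name, scheme.profile(strain), "ST", scheme.sequence_type(strain))
```

In this toy run the three strains come out as ST 1, ST 2 and ST 3. The step from ST 1 to ST 2 is a single base in one gene, while the step from ST 2 to ST 3 is a change in every gene, and the numbers alone give no hint of that.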

This is the blessing and the curse of ST methods. By reducing each individual version of a gene (an allele) to a single number it becomes possible to easily work out whether you are dealing with related strains or not. But the price is substantial: the detail of a gene, which averages about 800 or more nucleotides, is sacrificed to a single number, in a way that leaves the detail no longer accessible. It’s also not perfect. Sequencing error can generate a mistaken ST, and only some of the genome can be compared, even when it is the ‘core’ or a larger set of genes. Also, the ST numbers used to describe strains have nothing to do with how similar those strains are: ST1034 might be completely unrelated to ST1035. An additional problem is that strains often spread and evolve more quickly than the ST changes, and for the most successful clones the number of strains within an ST can be huge. If only you could compare all strains in ways that enabled you to track and trace them at greater resolution, while retaining the computational and practical deliverability of STs.

The problem is … you can’t. You can’t because the very thing that is most annoying about ST systems is the same thing that enables them to work. Whole genome sequencing isn’t complete genome sequencing. This might sound like splitting hairs, but it isn’t. Just because you put the whole of a genome into your sequencing process, it doesn’t mean that you get all of it out the other end, nor that it’s all joined up. In fact, it isn’t (a well sequenced genome might still be in 100 or more pieces at the end of the process). Also, different genomes contain different genes, and identifying which bits are equivalent and should be compared is an imperfect science. By focussing on a subset of parts which are more comparable than others (and coincidentally happen to have a lower tendency to contain sequencing errors) and then reducing this to an ST system, it becomes possible to compare the information in simple, computationally feasible ways. Whether the typing number has 8 parts or a thousand, it is still computationally easy and linear. And this is where we are STuck. At least for now.
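For anyone who wants to see why the comparison stays "easy and linear", here is a rough Python sketch that counts the positions at which two allele profiles differ. The function name `allelic_distance`, the use of `None` for a locus that could not be called, and all of the profiles are assumptions invented for illustration, not any particular tool’s method or real data.

```python
from typing import Optional, Sequence


def allelic_distance(
    profile_a: Sequence[Optional[int]],
    profile_b: Sequence[Optional[int]],
) -> int:
    """Count loci at which two profiles carry different allele numbers.

    One pass over the profiles, so the cost grows linearly with the number
    of loci and does not depend on how long the underlying genes are.
    Loci marked None (not called in one or both strains; a convention
    assumed here) are skipped rather than counted as differences.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci in the same order")
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


# Invented seven-locus profiles that differ at the last locus only.
mlst_a = (2, 3, 4, 3, 8, 4, 6)
mlst_b = (2, 3, 4, 3, 8, 4, 7)
mlst_distance = allelic_distance(mlst_a, mlst_b)

# Invented thousand-locus profiles that differ at every hundredth locus.
core_a = tuple(range(1, 1001))
core_b = tuple(n if n % 100 else n + 1 for n in core_a)
core_distance = allelic_distance(core_a, core_b)

# A locus missing from one strain is ignored in this sketch.
gappy_distance = allelic_distance((1, None, 3), (1, 2, 4))

print(mlst_distance, core_distance, gappy_distance)
```

The same few lines handle 7 loci or 1,000, which is the practical appeal. What they cannot tell you is whether a differing locus differs by one base or by hundreds, because that detail was discarded when the allele was reduced to a number.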

