Knowledge gap
13 April 2020
Whole exome sequencing is designed to evaluate all of the protein-producing genes in a patient to inform both diagnosis and treatment. However, researchers from UT Southwestern, Dallas, Texas, have identified significant gaps in the analysis conducted by laboratories offering this service, which substantially reduce the value of the results. Emma Green speaks to Jason Park, associate professor at UT Southwestern, about the implications of these findings.
Next-generation sequencing (NGS) refers to the deep, high-throughput, in-parallel DNA sequencing technologies that were developed a few decades after the Sanger DNA sequencing method was created in the late 1970s. These newer techniques differ from the Sanger method in that they can analyse DNA on a much larger scale at a reduced cost.
NGS aims to provide important insights about disease by analysing a patient’s genome. There are several types, such as panels of human genes, the evaluation of non-human pathogenic organisms and whole exome sequencing. The latter has become a routine clinical test for evaluating complex and rare diseases that have nonspecific patient presentations, particularly in children. There are also several emerging applications of whole exome sequencing, such as using the technique to provide information about the risk of future disease.
There are approximately 18,000 genes in an exome, and analysing them fully is not a straightforward process. In the decade that the test has been available, genomic research centres have published several studies demonstrating differences in how comprehensively exome sequencing evaluates genes. These differences arise for many reasons, including issues with kits and methods, sequencing technology and laboratory practices. Although these aspects are relatively well covered in the research literature, few studies have examined the consistency of clinical exome testing between laboratories.
The whole picture
Jason Park, associate professor at UT Southwestern, and other researchers noticed this lack of data and were keen to address it. “I am aware of one study in 2014 published in The Journal of the American Medical Association (JAMA) that examined the completeness of genes covered in whole-genome sequencing,” says Park.
The team decided to investigate the inter- and intra-laboratory variation in gene coverage for clinical samples analysed at multiple clinical exome laboratories. Researchers reanalysed exome tests conducted for 36 patients between 2012 and 2016 at three US clinical laboratories. A gene was not considered completely analysed unless the lab met the industry standard of at least 20 raw sequencing reads at each DNA base.
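The completeness rule described above can be illustrated with a minimal sketch. It assumes per-base read depths are already available for each gene; the gene names, depth values and function names are hypothetical and are not taken from the study.

```python
MIN_DEPTH = 20  # reads per base, the threshold described in the article


def is_completely_analysed(depths, min_depth=MIN_DEPTH):
    """Return True only if every base of the gene has at least min_depth reads."""
    return len(depths) > 0 and all(d >= min_depth for d in depths)


def fraction_complete(genes, min_depth=MIN_DEPTH):
    """Fraction of genes in which every base meets the depth threshold."""
    if not genes:
        return 0.0
    complete = sum(1 for d in genes.values() if is_completely_analysed(d, min_depth))
    return complete / len(genes)


# Hypothetical per-base read depths for three made-up genes
sample = {
    "GENE_A": [35, 40, 22, 20],
    "GENE_B": [50, 19, 60, 45],  # one base has fewer than 20 reads
    "GENE_C": [25, 30, 28, 31],
}

print(fraction_complete(sample))
```

Under this rule a single under-covered base is enough to mark a whole gene as incomplete, which is why GENE_B fails despite its high average depth.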
The results were published in Clinical Chemistry and were stark. “There were clear differences between clinical laboratories in both the completeness of genes evaluated as well as the consistency of testing,” says Park. “This study showed that whole-genome sequencing does not completely analyse many of the clinically relevant genes.”
This is significant because inadequate coverage may contribute to false-negative clinical exome results.
A lack of consistency can be especially problematic for certain types of exome tests, such as trio assessments. Trio analysis is a combined exome analysis of the proband and both biological parents to determine gene variant structure or whether a gene variant is inherited or new. Gene-specific variant locations need to have similar assessments across trio samples. If consistency is low, a trio analysis may suggest that a variant is de novo when, in fact, its location was simply not covered in either parent.
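The trio pitfall can be sketched in a few lines. This is a deliberately simplified illustration, not how any clinical pipeline works: it assumes a variant has already been found in the proband, and it borrows the 20-read threshold as a hypothetical cut-off for deciding whether a parent was adequately covered at that site.

```python
MIN_DEPTH = 20  # assumption: reuse the 20-read threshold for parental sites


def classify_proband_variant(parents, min_depth=MIN_DEPTH):
    """Classify a variant already observed in the proband.

    parents: list of dicts with 'depth' (reads at the variant site) and
    'has_variant' (True if the variant was observed in that parent).
    """
    if any(p["has_variant"] for p in parents):
        return "inherited"
    if any(p["depth"] < min_depth for p in parents):
        # Absence of evidence: the site was not adequately covered
        return "undetermined"
    return "de novo"


well_covered = [
    {"depth": 42, "has_variant": False},
    {"depth": 37, "has_variant": False},
]
poorly_covered = [
    {"depth": 42, "has_variant": False},
    {"depth": 6, "has_variant": False},
]

print(classify_proband_variant(well_covered))
print(classify_proband_variant(poorly_covered))
```

Without the coverage check, both examples would be reported as de novo, even though the second parent in the poorly covered trio could not have shown the variant at that depth.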
Researchers suspected that they would find issues but were taken aback by the scale of their findings. “What we did not anticipate was each lab’s degree of inconsistency between samples,” explains Park. “One lab was very consistent between samples, but another lab showed marked variation between samples.”
Notably, less than 1.5% of the genes were completely analysed in all the samples. A review of one lab’s tests demonstrated that 28% of the genes were never adequately examined and only 5% were always covered. Another lab consistently covered only 27% of the genes.
“An additional surprise was that a traditional quality threshold of 90% of coding nucleotides covered at 20x [20 sequencing reads at a nucleotide position] was not a quality indicator of complete gene coverage,” says Park. “A 90% 20x threshold was correlated with less than 50% of genes completely analysed.”
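A toy calculation shows why a base-level threshold can look healthy while gene-level completeness is poor. The data below are invented for illustration: ten hypothetical genes of 100 bases each, where half the genes contain a stretch of under-covered bases.

```python
def base_level_fraction(genes, min_depth=20):
    """Fraction of all bases, pooled across genes, with at least min_depth reads."""
    total = sum(len(d) for d in genes.values())
    covered = sum(1 for d in genes.values() for x in d if x >= min_depth)
    return covered / total


def gene_level_fraction(genes, min_depth=20):
    """Fraction of genes in which every base has at least min_depth reads."""
    complete = sum(1 for d in genes.values() if all(x >= min_depth for x in d))
    return complete / len(genes)


# Hypothetical exome: five fully covered genes, five with 20 low-depth bases
genes = {}
for i in range(5):
    genes[f"FULL_{i}"] = [30] * 100
for i in range(5):
    genes[f"GAPPY_{i}"] = [30] * 80 + [5] * 20

print(base_level_fraction(genes))  # 900 of 1,000 bases reach 20 reads
print(gene_level_fraction(genes))  # 5 of 10 genes are complete
```

In this made-up case the sample meets the 90% at 20x threshold, yet only half of the genes are completely analysed, because the shortfall is spread across several genes rather than concentrated in one.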
Exome testing can be powerful in identifying the cause of a genetic disease but this study raises important implications for professionals, patients and laboratories. It highlights several reasons that a negative exome result may be received.
Firstly, it may be because of limitations in medical knowledge. For example, the gene associated with a patient’s disease may not yet have been discovered. Secondly, it could be due to general limitations in exome testing, which might occur if the disease-causing DNA variant is located in non-coding DNA. Finally, it is possible to obtain a negative exome result because of low or inconsistent coverage of the disease-causing gene.
Regardless of the cause of a negative result, action needs to be taken. “If an exome test is negative, and a physician suspects a specific gene or group of genes may be related to a patient’s disease, then the physician needs to check with the lab to make sure that all of the suspected genes have been completely analysed,” says Park.
Above all, quality matters. “The quality of the initial exome test is just as important as periodic reinterpretation of test results over time,” explains Park.
A possible application of the study findings is to expand the current College of American Pathologists Proficiency Testing Programme for Next-Generation Sequencing (CAP NGS PT). Currently, this involves sending blinded genomic proficiency testing material to laboratories and grading laboratories on their ability to identify a specific variant at a genomic location. Researchers propose that this could be expanded to include laboratories submitting data files to investigate the coverage consistency between laboratories.
Scientists suggest that laboratories should be required to document the inter-sample consistency by measures demonstrated in this research to improve quality. These could be used by accreditation and regulatory inspectors to rapidly audit control plots to assess the inter-sample coverage quality and the overall performance of the laboratory.
Regulation aside, laboratories need to continually improve the consistency of their exome testing. Rather than relying on average measurements alone to assess performance, they should combine averages with summary statistics of spread – such as standard deviation and coefficient of variation – which is particularly important when using large data sets.
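As a rough sketch of why averages alone can mislead, the example below compares two hypothetical labs with the same mean coverage but very different spread. The figures are invented, and the function name is an assumption made for illustration.

```python
import statistics


def coverage_summary(values):
    """Mean, sample standard deviation and coefficient of variation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    cv = sd / mean if mean else float("nan")
    return {"mean": mean, "sd": sd, "cv": cv}


# Hypothetical percentage of genes completely analysed in each sample
lab_x = [70.0, 71.0, 69.0, 70.0]
lab_y = [95.0, 45.0, 90.0, 50.0]

summary_x = coverage_summary(lab_x)
summary_y = coverage_summary(lab_y)

print(summary_x)
print(summary_y)
```

Both labs average 70%, so a report based on the mean would rank them equally. The standard deviation and coefficient of variation expose the second lab's sample-to-sample inconsistency.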
More generally, healthcare professionals and patients need to demand more complete coverage information from the clinical exome laboratories they plan on using. Knowing their consistency is hugely valuable in determining whether they will be able to provide useful insights about a particular disease state.
The right test
It’s also important to carefully consider the particular genetic test that is being requested. While clinical exome tests can be hugely valuable in helping to diagnose more complex cases, they are not always necessary. If a diagnosis is already known and only additional issues need to be investigated, a smaller genetic test that analyses the panel of genes specific to that disease may be sufficient. Such panels are not only less likely to come back with a negative result but are also less expensive.
These findings also suggest that regulatory requirements and practices might need to be revised. “Accreditation and regulatory inspectors could rapidly audit control plots to assess the inter-sample coverage quality and the overall performance of the laboratory,” say Park and colleagues in their paper. “Although genomic testing is complex, traditional quality-control tools such as control charting may be useful in visually demonstrating the importance of inter-sample sequencing assessment for entire genes or specific variant locations.”
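The control charting mentioned in the quote can be sketched with a simple Shewhart-style rule. This assumes limits set at the mean plus or minus three standard deviations of a baseline run of samples; the study's own plots may be constructed differently, and all values below are hypothetical.

```python
import statistics


def control_limits(baseline, k=3.0):
    """Lower and upper control limits: baseline mean +/- k sample SDs."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return mean - k * sd, mean + k * sd


def out_of_control(samples, limits):
    """Indices of samples that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(samples) if v < lower or v > upper]


# Hypothetical percentage of genes completely analysed per sample
baseline = [70.0, 71.0, 69.0, 70.0, 72.0, 68.0]
new_samples = [71.0, 64.0, 73.0]

limits = control_limits(baseline)
print(limits)
print(out_of_control(new_samples, limits))
```

An inspector reviewing such a chart would see at a glance that the second new sample sits well below the lab's usual range.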
In light of the significance of these findings, researchers are keen to take their work forward. “We look forward to evaluating clinical exome results as new technologies are incorporated into clinical labs,” says Park. “A natural extension of this study is to examine the completeness and consistency of clinical whole-genome sequencing studies.”
What is clear is that, to advance this technique and other types of next-generation sequencing, the issues raised in this research need to be addressed promptly. “For clinical whole-genome sequencing to become a routine diagnostic test, the completeness and consistency of sequencing disease-associated genes needs to be assured,” Park concludes.
18,000
The approximate number of genes in an exome.
American Journal of Epidemiology
Less than 1.5%
The percentage of the genes that were completely analysed in all the samples from three laboratories from 2012–16.
The first-generation sequencing technologies, and the pioneering computing and bioinformatics tools, produced the initial sequencing data and information within a framework of structural and functional genomics in readiness for the following NGS developments. NGS provides substantially cheaper, friendlier, and more flexible high-throughput sequencing options with a quantum leap towards the generation of much more data on genomics, transcriptomics, and methylomics that translate more productively into proteomics, metabolomics and systeomics.
This major progression towards a more comprehensive characterisation of genomes, epigenomes, and transcriptomes of humans and other species provides even more data as a proxy to probe diverse molecular interactions in the era of ‘omics’ in many fields of biology, industry and healthcare. A few years ago, the McKinsey Global Institute produced a report predicting that NGS and genomics, including the sequencing of a million human genomes, would become an economically and socially disruptive technology as well as an annual trillion-dollar industry by 2025.
The authors assessed that next-generation genomics would affect many high-impact areas of molecular biology and bioindustry, such as improving genetic engineering tools to custom build organisms, genetically engineer biofuels, modify crops to improve farming practices and food stocks, and develop drugs to treat cancers and other diseases.
Although these technologies promise huge benefits, they also come with social, ethical and regulatory risks in regard to privacy and security of personal genetic information, the dangerous effects of modified organisms on the environment, the spectre of bioterrorism, eugenics, and concerns about the ownership and commercialisation of genomic information.
The application of prenatal genome sequencing for genetic screening already points to the potential of producing genetically modified babies with desired traits. Much will need to be done to educate and inform regulators and society about the risks and benefits when formulating the regulatory policies about the advances and applications of these next-generation technologies. However, many challenges still remain in regard to NGS data acquisition, storage, analysis, integration and interpretation.
Future advancements will undoubtedly rely on new technologies and large-scale collaborative efforts from multidisciplinary and international teams to continue generating comprehensive, high-throughput data production and analysis. The availability of economically friendlier bench-top sequencers and third-generation sequencing tools will allow smaller laboratories and individual scientists to participate in the genomics revolution and contribute new knowledge to the different fields of structural and functional genomics in the life sciences.
Source: ‘Next-Generation Sequencing — An Overview of the History, Tools, and “Omic” Applications’