Thursday, January 17, 2019

Let's talk about NGS, baby (part 2)

Find Part I here.

In late December 2018, an article published in Genetics in Medicine reported the high healthcare costs for patients with suspected pediatric genetic diagnoses. A summary of the article can be found here, but the take-home message was one that I'm all too familiar with: medical expenses can quickly get out of control when facing diseases that require long-term care. In addition, the longer the period of investigation leading to diagnosis, the greater the expenses, making a firm case for the urgent need for inexpensive early diagnostic tests.

Until recently, genetic testing for patients with suspected genetic diagnoses has been extremely expensive and of limited availability due to outdated guidelines for who qualifies for such testing, which has led most insurance companies to refuse to cover the costs. Never mind that some companies have artificially inflated the expense, leading to ethical debates. Within the past 5 years, there's been a dramatic shift, with companies like Invitae, Color and Veritas Genetics applying whole genome sequencing (WGS) strategies to genetic testing and precision medicine, fueling a new approach of patient-driven healthcare. The goal is a simple one: empower patients in their healthcare decisions instead of waiting for physicians (and insurance companies) to okay care based on traditional population data.

With the availability of this technology, though, has come a need for consumers to understand what is being ordered. What is the difference between WGS, exome sequencing, traditional genetic testing and genotyping (e.g. 23andMe)? What about companies that are using NGS technologies to do tissue analysis (biopsies or blood draws)? How are samples collected (and how should they be), and how are these companies pairing with clinicians (or not pairing) to ensure high-quality data? And what should patients be aware of as these tests become available?

Let's continue...

As mentioned previously, resolving the structure of DNA led to a shift in the science community. With DNA known to be the vehicle for heredity, another focus within the community, one that complemented the push to develop DNA sequencing technology, was understanding how that information was being translated to the cells for use.

In 1958, Francis Crick authored "On Protein Synthesis," where he laid the foundation for his proposal for the Central Dogma of Molecular Biology: DNA makes RNA and RNA makes protein. But outside of those who actively study biology, the Central Dogma is confusing, given that there's usually no context for what these molecules are and what their purpose in the function of a cell (let alone a person) would be.

So let's start by giving that context.

Whenever I introduce my students to DNA, I show them an image of the kind of grand library found on most university campuses, teeming with information in the form of books and archives on every topic imaginable. This is what DNA is to your cell: an archive of all the information that could possibly be needed to generate an organism, be that a single-celled bacterium or a human being. All of the books you see lining the shelves are genes, which are heritable units of information that each code for a different function or characteristic (e.g. hair color, lobed ears, freckles, etc). More impressive to think about is that, outside of a few exceptions, every single one of the cells in your body contains the exact same library of information, which is all tightly packaged inside that cell (we'll get to why that packaging is important later).

Despite having all this information, your cells won't use all of it. Just as most of us have areas of interest and career focuses (a medical doctor won't know much of anything about corporate accounting, and a licensed CPA will be inexperienced with chemical engineering), your muscle cells don't need the exact same information as your neurons or your skin cells or your liver cells. Yes, there will be some bits of knowledge that are shared, but even if they do access the same volumes for the needed information, they may require different chapters.

That's where RNA comes in. The purpose of RNA is to make available only the information needed by a cell. Think of it as going into that grand university library, but instead of checking out the book, you are only allowed to make copies. And since going through and making copies of all the pages takes too long and makes the information you actually need harder to find, only the information that is needed is copied.

I'm going to back up a bit and point out that up until the last decade, only 3 types of RNA (mRNA, tRNA, and rRNA) were known, and their role was thought to be solely passing on instructions to make protein. Over the last decade, though, new types of RNA that don't go on to make proteins, also known as non-coding RNAs, were discovered, and their presence has drastically changed our understanding of RNA's role in normal cellular functions and disease. I won't go into miRNAs, siRNAs, piRNAs, and long ncRNAs here (though you can read more here), but what I will say is that all RNA plays a role in determining not only what proteins are made in the cell, but also when and how long they will be around.

What are proteins? This article provides a nice overview, but the easiest way to think about them is that they are molecular machines in the cell that provide structure, carry out a function (e.g. metabolism, importing nutrients, exporting waste, etc) or oversee regulation (balancing pH, overseeing product development, etc), often doing a combination of these. If you think of each of your cells as a factory, with each factory being specialized depending on the cell type, you'll see similarities in how operations are performed, but you'll also see the differences that give rise to different tissue types. And though most factories need some similar things, there will be differences in the materials they need in order to generate a final product.
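For those who like to see an idea written out step by step, here is a toy sketch of the Central Dogma in Python: a stretch of DNA is copied into RNA, and the RNA is read three letters at a time to build a protein. Please treat it as an illustration only. The DNA sequence is made up, the function names are my own, and the codon table holds just the five codons this example needs (the real table has 64).

```python
# Toy illustration of the Central Dogma: DNA -> RNA -> protein.
# The codon table is deliberately partial and the DNA sequence is invented.

CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": None,   # stop codon
}


def transcribe(coding_dna):
    """Copy the coding strand of DNA into mRNA by swapping T for U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons not in the toy table
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein


gene = "ATGTTTGGCGCTTAA"      # hypothetical 15-letter "gene"
mrna = transcribe(gene)       # the photocopy taken out of the library
protein = translate(mrna)     # the machine built from that copy

print(mrna)                   # AUGUUUGGCGCUUAA
print("-".join(protein))      # Met-Phe-Gly-Ala
```

Real cells do far more than this (the copy is edited, regulated and eventually destroyed), but the direction of information flow is the same.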

So what does any of this have to do with sequencing your DNA and genetic testing? Why not just sequence your entire genome to identify diseases or underlying conditions, especially now that I've told you in my previous post that the technology has advanced to the point that we can? The answer to that question gets back to understanding how your cells function. Despite the fact that all humans are ~99.6% genetically identical to one another (and I want you to sit with that as you look at everyone sitting in the room with you, realizing how closely related you are to people who look different than you do), you also have ~6 billion base pairs of DNA in each of your cells (remember, you have 2 copies of your genome: one from mom and one from dad), which when put together is about 2 meters long (all the DNA in your body put together would be twice the diameter of our solar system). What that means is that despite the similarity, there's also a lot of variation. And that variation shows up not as you and I having different genes, but as different versions of the ~20,000 genes we have, resulting either in different versions of the proteins our cells will make (or not be allowed to make) or in a disruption in timing for when those proteins are present to do their work. Two added wrinkles in all of this are that 1) outside of a few cases, most proteins work in concert with one another, making the genetics behind many diseases complex, and 2) we know environment can also have an effect. Illustrating this, recent work published in Nature Genetics focused on untangling the genetic and environmental ties for 560 conditions, offering more insight into how this interplay is happening, but also highlighting that we still have a long way to go in understanding the molecular mechanisms behind many of these diseases.
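If you'd like to check the numbers above for yourself, here is a back-of-the-envelope calculation in Python. It rests on two assumptions of mine that aren't stated in the post: that each base pair adds roughly 0.34 nanometers of length (the usual textbook figure for the DNA double helix), and that the ~99.6% identity can be applied to a single copy of the genome to estimate how many positions differ between two people.

```python
# Back-of-the-envelope check of the genome figures quoted above.

BASE_PAIRS_PER_CELL = 6_000_000_000   # two genome copies, as stated above
RISE_PER_BASE_PAIR_M = 0.34e-9        # assumption: ~0.34 nm of length per base pair
SHARED_FRACTION = 0.996               # ~99.6% identity, as stated above

# Length of all the DNA in one cell if stretched end to end, in meters.
length_m = BASE_PAIRS_PER_CELL * RISE_PER_BASE_PAIR_M

# Rough number of positions that differ between two people,
# counted over one copy of the genome (assumption: identity applies per copy).
base_pairs_per_copy = BASE_PAIRS_PER_CELL // 2
differing_bp = round(base_pairs_per_copy * (1 - SHARED_FRACTION))

print(f"DNA per cell: about {length_m:.2f} meters")
print(f"Differences between two people: about {differing_bp:,} base pairs")
```

Under those assumptions you get about 2.04 meters of DNA per cell and on the order of 12 million differing positions, which is why "99.6% identical" and "a lot of variation" can both be true.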

This is where the choice of WGS vs exome sequencing vs genotyping comes in. With traditional genotyping, the focus is on looking at differences in your genetic code that have known effects, usually in the form of single base pair changes called single nucleotide polymorphisms (SNPs). Because these changes are known, identifying them is quick and fairly inexpensive. The downside is that this data provides only a limited snapshot (there are ~4-5 million SNPs in the human genome) and doesn't explore more of the complexities surrounding many conditions. Exome sequencing focuses on sequencing only the portions of your genome that will make the proteins in your cells. Though an extremely powerful approach for rare Mendelian diseases and more extensive than genotyping, exome sequencing only covers about 1% (~30 million bp) of the genome, excluding all the genetic information that controls the generation of those proteins. Finally, there's whole genome sequencing (WGS), which involves sequencing your entire genome and gives researchers (and now clinicians) access to every variation in your genetic code. But with all this information come challenges with interpretation, as the effects of many of these changes are not known, and even where there are known effects, it's not a guarantee that you will develop the condition, leading to questions about clinical significance.
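To put the three approaches side by side, here is a small Python sketch that turns the figures above into a share of the genome. The names are mine, and I'm assuming a single genome copy of ~3 billion base pairs (half of the ~6 billion per cell mentioned earlier). For genotyping I use 4.5 million, the midpoint of the ~4-5 million SNPs quoted above, as a generous upper bound, since an actual genotyping test typically looks at fewer positions than that.

```python
# Rough comparison of how much of the genome each approach examines.

GENOME_BP = 3_000_000_000  # assumption: one copy of the genome

approaches = {
    "genotyping (SNPs)": 4_500_000,         # upper bound: midpoint of ~4-5 million SNPs
    "exome sequencing": 30_000_000,         # ~30 million bp, as stated above
    "whole genome sequencing": GENOME_BP,   # everything
}


def percent_of_genome(positions):
    """Share of a single genome copy covered, as a percentage."""
    return 100 * positions / GENOME_BP


coverage = {name: percent_of_genome(n) for name, n in approaches.items()}

for name, pct in coverage.items():
    print(f"{name:<25} {pct:6.2f}% of the genome")
```

The jump from a fraction of a percent to 1% to 100% is the trade-off in a nutshell: each step up gives a fuller picture, and each step up leaves more variation that nobody yet knows how to interpret.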

One final caveat is that sometimes clinicians don't know what variations in your genome they are looking for, so they instead look at when the RNA is present. In this case, sample collection is critical, as RNA is easily damaged and degrades quickly, making it difficult to turn back into DNA and sequence. So in addition to the type of sequencing, sample collection has to be taken into account to ensure good results.

A final wrinkle in all of this is whether to even consider sequencing your genome. In an era of personal information leaks, the fears about one's genetic code being used to discriminate are more real today than ever, meaning anyone venturing into this terrain would be well advised to do their homework ahead of time.

As I type all of this out, it's hard not to feel a bit overwhelmed. Even as someone who is used to being surrounded by information like this, I find myself nervous and second-guessing whether I've covered this well and correctly. What's pushing me to get this information out there in this manner is the daily evidence that we are entering an era where biotech innovations can have a meaningful impact on our daily lives. And though the current focus for many in the US is on applying this information to healthcare (and I encourage you to learn more about the Precision Medicine Initiative), the truth is this is just the tip of the iceberg, given I haven't even touched on gene editing.

That said, my hope is that I have made this information a bit more approachable, or at least armed you with what you need to do your own research. Because with all that is happening with this technology, it likely won't be long before it becomes more a part of our daily lives. And that's an exciting prospect.

1 comment:

  1. I think I understand about 30% of what you say. Which means I'm smarter now than before I read your post.


