[HN Gopher] Whole Genome Sequencing
___________________________________________________________________
Whole Genome Sequencing
Author : faebi
Score  : 29 points
Date   : 2022-07-25 20:57 UTC (2 hours ago)
(HTM) web link (sequencing.com)
(TXT) w3m dump (sequencing.com)

| henearkr wrote:
| Important note: whole genome does not mean whole DNA.
|
| Their claim to sequence 100% of the genome could not be true if
| it was 100% DNA, as some locations like near the centromere or
| the telomeres are notoriously difficult to sequence (and just
| impossible with the technique of alignment that they use).
|
| It's not that bad, you can already know a lot of interesting
| things with a whole genome, but it won't be enough information to
| e.g. synthesize a copy of your DNA or be able to repair all of
| your adult cells' genetic damage (supposing this kind of tech
| exists in some future).
|
| There is also more and more research on how the exons (DNA _not_
| in some gene) participate in the regulation of genes and are
| involved in many diseases.
|
| I'd like to see some real (and affordable) 100% DNA sequencing in
| the near future!

| aroch wrote:
| I have no skin in this particular game (though I work in the
| sequencing industry), but... I don't think anyone is
| fooled/trying to trick anyone when they say "whole genome
| sequencing" or that they sequence 100% of the genome. That is
| the term of art for non-targeted/unenriched sequencing of DNA
| (nb: it may be RNA/ribosome depleted, so fine, it's technically
| enriched but not in a meaningful way). Also, exons are genic
| regions; are you thinking of introns (arguably genic or at least
| adjacent), or promoters/enhancers and chromatin state?

| brofallon wrote:
| Just to clarify, exons are the portions of genes that code for
| protein sequence (they are expressed). Introns span the
| distance between exons and may contain regulatory or splicing
| information.
| Areas between genes are referred to as intergenic,
| and also may contain regulatory sequences that affect how genes
| are expressed.
|
| Part of the issue with centromeric or telomeric sequence is
| that not only is it hard to sequence (being super repetitive),
| little is known about what a sequence variant in such an area
| might mean. It's kind of a chicken-and-egg issue: it doesn't
| get much attention because no one knows how to interpret
| variants there, and since no one knows how to interpret
| variants there it doesn't get much attention.

| mbreese wrote:
| Just to pile on to the existing comments... if you're getting
| the raw FASTQ files, you will have telomeric and centromeric
| DNA sequences. You just won't necessarily be able to accurately
| align the data. Sequencing those regions is easy... you just
| can't assemble the data into larger contiguous regions because
| of the lack of variation.
|
| And given the consistency of those regions, you will have
| 99(.999)% of all of the information you would need to make a
| copy of your DNA.
|
| There are some regions that are difficult to sequence, but the
| major problem isn't getting the raw sequence, but the alignment
| process.

| gillesjacobs wrote:
| A note on these WGS services: The industry is nascent, DNA
| extraction and the sequencing fail often. You'll be shipping
| samples across the globe and waiting on your data for about 12
| weeks to a few months at best. My advice: If you're not looking
| for a hobby you can follow up on every few weeks, don't buy these
| budget direct-to-consumer services yet.
|
| I heard good things about Sequencing.com, but then again also
| about Nebula Genomics when I bought their service at the end of
| 2021. Last week I ended up asking for a refund due to the
| painfully slow processes and communication. Dante Labs is a
| previous pioneer but has also fallen from grace with long delays
| and undelivered data. Caveat Emptor.
| torquemodwanted wrote:
| Do you think something like the Oxford Nanopore MinION [0] is a
| viable alternative, assuming you'd want to do just the
| sequencing yourself? I suspect most people wouldn't be able to
| do the prep without proper wet lab training and equipment [1],
| but the starter kit is $1000 for the device, a single flow
| cell, and one rapid sequencing kit (SQK-RAD004).
|
| [0]: https://nanoporetech.com/products/minion
|
| [1]: https://www.youtube.com/watch?v=iS1pz3IhJvU

| aroch wrote:
| Not for home gamers, no. ONT is fairly low throughput and the
| prep is surprisingly difficult to do well without experience.
| You would need to run tens of flow cells to get enough
| sequencing data.

| mbreese wrote:
| Not to mention there is a good chunk of required "extra"
| equipment that is expected (centrifuges, etc.) that brings
| the initial cost to well over $1000. You could buy most
| things used, but some of the reagents could be difficult to
| acquire outside of a traditional lab.

| ejstronge wrote:
| This is not strictly true - there is a transposase-based
| prep that does not require special equipment.
|
| Still, ONT users would need to understand how to convert
| raw reads into useable basecalls at loci of clinical
| interest.

| xipho wrote:
| I suspect many people that frequent HN could use the MinION
| to generate the raw data, but generating gigabytes of DNA
| reads != assembling a genome. Remember, only this year the
| very _first_ genome for any human was "completed". You're
| going to get thousands of overlapping reads, of varying
| lengths. Then you'll need serious processing power to combine
| these overlaps, then overlap the overlaps, so to speak, and
| on and on. What software to pick, how to parameterize/use it,
| this is the job of post-docs and others who like to tear
| their hair out.
| I haven't looked lately, but many one-off
| scientific machines have their own proprietary binary
| versions of the data, for vendor lock-in, so that you must
| also buy their crappy software to process it. Double check
| that this isn't the case.
|
| When your first run fails, are you willing to pay as much
| again for the kits to run it again (these machines are very
| much following the cheap printer/expensive ink model IMO)?
|
| Once you have some data, do you know how you will BLAST it
| against annotated genomes to figure out if you have mutation
| X? How do you interpret E-values, etc.?
|
| For the OP company, when they say "download the data" do they
| mean the raw reads, or assemblies? Make sure this is spelled
| out (it likely is, I haven't looked lately). Do the
| downloads/service provide adequate metadata on the data
| generation process so that you can tease out errors in reads
| from reality (real single-nucleotide mutations), etc.?
|
| All fun stuff, but only for a _very_ serious hobbyist, so to
| speak.

| aroch wrote:
| No one should be BLASTing individual reads... And pretty
| much no one is going to be assembling with OLC, even for
| ONT data.
|
| On the products page it says they provide FASTQs, aligned
| BAM and VCF. Which is exactly what one would expect.
| They're almost certainly just running the DRAGEN pipeline
| (or something similar) and giving you the output it
| creates.

| jefftk wrote:
| I think your typical programmer could automate the
| assembly, even without reading about existing algorithms.
| It isn't much harder than a standard interview question,
| especially if you're willing to use your hardware
| inefficiently?
|
| You aren't trying to sequence the human genome from
| nothing, but instead have a reference genome to work with.
| You take each of the reads you get from the sequencer and
| find where it best matches up with the reference genome.
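|
| A minimal sketch of that "match each read to the reference" idea,
| assuming the deliberately inefficient brute-force approach: slide
| the read along the reference and keep the offset with the fewest
| mismatches. The names `hamming` and `align_read` and the toy
| sequences are made up for illustration. Real pipelines use indexed
| aligners (BWA, minimap2, and the like) and also handle indels,
| reverse complements and base qualities, none of which this does.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def align_read(read: str, reference: str) -> tuple[int, int]:
    """Return (best_offset, mismatches) for a read against a reference.

    Brute force: try every offset and keep the one with the fewest
    mismatches. Ties go to the leftmost offset. Substitutions only;
    no indels, reverse complement, or quality scores (assumptions
    made to keep the sketch short).
    """
    if not read or len(read) > len(reference):
        raise ValueError("read must be non-empty and fit in the reference")
    best_offset, best_mismatches = 0, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = hamming(read, window)
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset, best_mismatches


# Toy data, invented for illustration only.
reference = "ACGTTGCATGCCGATAGCTAGGCTTAACGGAT"
reads = ["TGCATGCC", "GATAGCTTGG", "TTAACGGA"]
placements = {read: align_read(read, reference) for read in reads}

for read, (offset, mismatches) in placements.items():
    print(f"{read}: offset {offset}, {mismatches} mismatch(es)")
```

| The second toy read lands with one mismatch. From a single read
| there is no way to say whether that is a sequencing error or a
| real variant; it takes many overlapping reads at the same position
| to tell the two apart.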
|
| I think the wet portion is a far bigger challenge for the
| typical computer programmer hobbyist.
|
| (Also I don't think $1,000 will get you, in addition to the
| sequencer, a flow cell sufficient to gather enough reads to
| tell the difference between sequencing errors and
| mutations?)

| topynate wrote:
| I bought the Nebula 30x product and got my results quite
| quickly, in under three months. It's hard to tell if my
| experience is better than average or not, given that people
| tend not to speak up about services when they're delivered on
| time. There may be some issues with the data quality. I
| happened to post about it a couple of hours ago:
| https://www.reddit.com/r/Nebulagenomics/comments/w8013t/rate...
|
| My current best guess is that more data than usual were of
| bacterial origin, but I'll have to learn a bit more
| bioinformatics to test that.

| saxonww wrote:
| Chuckling a little at the prominent "Security Grade: A+" display
| on their privacy page. It's the Qualys score for their site,
| which they are kind of implying is an overall security metric for
| their business. Not very inspiring.

| walnutclosefarm wrote:
| Goes along with a big badge saying they are HIPAA compliant. I
| mean, if they are a covered entity (which is to say, if they
| are providing healthcare services), they have to be HIPAA
| compliant, or be prepared to be fined heavily by the Feds for
| breaking the law. If they are not a covered entity, they don't
| really have any meaningful obligations under HIPAA, so being
| compliant is meaningless.

| exact_string wrote:
| I assume this is yet another sequencing company outsourcing the
| actual sequencing?
|
| >> Do you sell or share my data with anyone?
|
| > No, we do not sell or share your data, including your DNA data,
| with anyone.
|
| However, in their privacy policy:
|
| > Your Personal Information may be shared in the following ways:
|
| > With our service providers, allowing them to provide their
| services to us.
|
| edit:
|
| More clearly stated here:
|
| https://sequencing.com/ordering-dna-test-kit-and-genome-sequ...
|
| > When you purchase the Ultimate DNA Test or Ultimate Genome
| Sequencing test, your test will be processed by our Laboratory
| Partners. We will provide our Laboratory Partner with only your
| basic contact information so that it can send you a DNA
| collection kit; we will not share your billing information with
| our Laboratory Partner but will pass on your payments to them.

| FollowingTheDao wrote:
| > However, in their privacy policy:
|
| That does not mean they are sharing your genetic data, maybe
| just address, phone number, etc.
|
| I had the WORST experience with Nebula Genomics, which ended up
| with them refunding my money after I had waited 6 months.
|
| I hope this is not another one of those companies. I still have
| some blank spots I am trying to understand from my 23andMe
| results and this would be great.

| adora wrote:
| I emailed Sequencing.com asking what the difference was
| between them and Nebula Genomics and this is what they wrote
| back:
|
| _(1) Nebula is our laboratory partner, and they sequence
| with high-throughput MGI DNBSEQ-T7 DNA sequencing machines.
|
| (2) They are our laboratory provider, but the product is not
| the same! The whole genome sequencing product sold by Nebula
| on their site is different from the Ultimate Genome
| Sequencing service we offer in some key ways. One big one is
| that our service includes more than $200 in DNA analysis apps
| and reports (such as the Healthcare Pro report designed for
| healthcare providers and the Rare Disease Screen that
| analyzes thousands of rare diseases).
|
| The biggest difference, however, is in the processing of the
| raw genetic data into data files that you can then use to get
| health insights.
| We have enhanced processing for these kits
| that is able to create insights not available from Nebula
| directly, including enhanced raw data processing through two
| special bioinformatics pipelines that provide comprehensive
| data and analysis of structural variations, copy number
| variations, and mitochondrial heteroplasmy. Both Telomere
| Length and HLA Typing are coming soon and will be
| retroactively added to all past purchases of our Ultimate
| Genome Sequencing service.
|
| This is all fed into our One Genome technology, which allows
| you to do analysis in ways that you can't do anywhere else.
| This technology takes all of your genetic data and turns it
| into an enhanced virtual genome. The advantage of this is
| that you can run analysis on everything all at once, where in
| some other cases you might have separate files that can't be
| analyzed simultaneously.
|
| The long and short of it is that it takes a lot of processing
| to turn raw genetic data into usable data, and then to
| analyze that data. That piece is where we come in, and are at
| the forefront of the field._

| omgwtfbyobbq wrote:
| I believe All of Us provides you with your WGS data if you opt
| into receiving it, and it's free.
|
| https://allofus.nih.gov/
|
| My mom did Nebula, which was kind of a pain, but she eventually
| got her results.

| yawnxyz wrote:
| If this is truly WGS then I'm amazed at how it's only a few
| hundred dollars to run.

| adora wrote:
| Potentially sending to China.

| toomuchtodo wrote:
| Feedback if someone from sequencing.com sees this: import from
| 23andme.com doesn't appear to work if you use Sign In with Apple
| for 23andme. Regardless, I love the broker functionality to pull
| data from other sequencing providers. Kudos!

| peter303 wrote:
| Venter's company was charging $25K. That included lots of other
| diagnostic tests too. (Venter was the first human sequenced. By
| his own company.)

| bobsmooth wrote:
| The sample report is quite a read.
___________________________________________________________________
(page generated 2022-07-25 23:00 UTC)