[HN Gopher] Whole Genome Sequencing
       ___________________________________________________________________
        
       Whole Genome Sequencing
        
       Author : faebi
       Score  : 29 points
       Date   : 2022-07-25 20:57 UTC (2 hours ago)
        
 (HTM) web link (sequencing.com)
 (TXT) w3m dump (sequencing.com)
        
       | henearkr wrote:
       | Important note: whole genome does not mean whole DNA.
       | 
       | Their claim to sequence 100% of the genome could not be true if
       | it was 100% DNA, as some locations like near the centromere or
       | the telomeres are notoriously difficult to sequence (and just
       | impossible with the technique of alignment that they use).
       | 
       | It's not that bad, you can already know a lot of interesting
       | things with a whole genome, but it won't be enough information to
       | e.g. synthesize a copy of your DNA or be able to repair all of
       | your adult cells' genetic damage (supposing this kind of tech
       | exists in some future).
       | 
       | There is also more and more research on how the exons (DNA _not_
       | in some gene) participate in the regulation of genes and are
       | involved in many diseases.
       | 
       | I'd like to see some real (and affordable) 100% DNA sequencing in
       | the near future!
        
         | aroch wrote:
         | I have no skin in this particular game (Though I work in the
         | sequencing industry), but... I don't think anyone is
         | fooled/trying to trick anyone when they say "whole genome
         | sequencing" or that they sequence 100% of the genome. That is
         | the term of art for non-targeted/unenriched sequencing of DNA
         | (nb: it may be RNA/ribosome depleted, so fine it's technically
         | enriched but not in a meaningful way). Also, exons are genic
         | regions; are you thinking of introns (arguably genic or at
         | least adjacent), or promoters/enhancers and chromatin state?
        
         | brofallon wrote:
         | Just to clarify, exons are the portions of genes that code for
         | protein sequence (they are expressed). Introns span the
         | distance between exons and may contain regulatory or splicing
         | information. Areas between genes are referred to as intergenic,
         | and also may contain regulatory sequences that affect how genes
         | are expressed.
         | 
         | Part of the issue with centromeric or telomeric sequence is
         | that not only is it hard to sequence (being super repetitive),
         | little is known about what a sequence variant in such an area
         | might mean. It's kind of a chicken-and-egg issue: it doesn't
         | get much attention because no one knows how to interpret
         | variants there, and since no one knows how to interpret
         | variants there it doesn't get much attention.
        
         | mbreese wrote:
         | Just to pile on to the existing comments... if you're getting
         | the raw FASTQ files, you will have telomeric and centromeric
         | DNA sequences. You just won't necessarily be able to accurately
         | align the data. Sequencing those regions is easy... you just
         | can't assemble the data into larger contiguous regions because
         | of the lack of variation.
         | 
         | And given the consistency of those regions, you will have
         | 99(.999)% of all of the information you would need to make a
         | copy of your DNA.
         | 
         | There are some regions that are difficult to sequence, but the
         | major problem isn't getting the raw sequence, but the alignment
         | process.
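         | 
         | A minimal sketch of the point about raw FASTQ files, assuming
         | plain uncompressed four-line FASTQ records: it counts reads
         | that contain several back-to-back copies of the human
         | telomeric repeat TTAGGG (or its reverse complement CCCTAA).
         | The function names and the min_repeats threshold are
         | illustrative choices, not part of any pipeline mentioned in
         | this thread.

```python
import io

TELOMERE = "TTAGGG"      # canonical human telomeric repeat unit
TELOMERE_RC = "CCCTAA"   # its reverse complement


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def looks_telomeric(seq, min_repeats=4):
    """True if the read contains min_repeats back-to-back telomere units."""
    return TELOMERE * min_repeats in seq or TELOMERE_RC * min_repeats in seq


def count_telomeric(handle, min_repeats=4):
    """Return (telomeric_reads, total_reads) for a FASTQ stream."""
    total = telomeric = 0
    for _name, seq, _qual in read_fastq(handle):
        total += 1
        telomeric += looks_telomeric(seq, min_repeats)
    return telomeric, total


# Tiny made-up example; real input would be an opened FASTQ file.
records = [
    ("read1", "TTAGGG" * 5),
    ("read2", "ACGTACGTACGTACGTACGTACGTACGTAC"),
    ("read3", "GG" + "CCCTAA" * 4 + "TT"),
]
demo = io.StringIO("".join(f"@{n}\n{s}\n+\n{'I' * len(s)}\n" for n, s in records))
print(count_telomeric(demo))
```

         | Such reads show up in the raw data even when an aligner
         | cannot place them uniquely on the reference.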
        
       | gillesjacobs wrote:
       | A note on these WGS services: The industry is nascent, DNA
       | extraction and the sequencing fail often. You'll be shipping
       | samples across the globe and waiting on your data for about 12
       | weeks to a few months at best. My advice: If you're not looking
       | for a hobby you can follow up on every few weeks, don't buy these
       | budget direct-to-consumer services yet.
       | 
       | I heard good things about Sequencing.com, but then again also
       | Nebula Genomics when I bought their service end of 2021. I ended
       | up asking for a refund due to the painfully slow processes and
       | communication last week. Dante Labs is a previous pioneer but has
       | also fallen from grace with long delays and undelivered data.
       | Caveat Emptor.
        
         | torquemodwanted wrote:
         | Do you think something like the Oxford Nanopore MinION [0] is a
         | viable alternative, assuming you'd want to do just the
         | sequencing yourself? I suspect most people wouldn't be able to
         | do the prep without proper wet lab training and equipment [1],
         | but the starter kit is $1000 for the device, a single flow
         | cell, and one rapid sequencing kit (SQK-RAD004).
         | 
         | [0]: https://nanoporetech.com/products/minion
         | 
         | [1]: https://www.youtube.com/watch?v=iS1pz3IhJvU
        
           | aroch wrote:
           | Not for home gamers, no. ONT is fairly low throughput and the
           | prep is surprisingly difficult to do well without experience.
           | You would need to run tens of flow cells to get enough
           | sequencing data.
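           | 
           | The arithmetic behind "tens of flow cells" can be sketched
           | as coverage = total bases sequenced / genome size. The
           | sketch assumes a haploid human genome of roughly 3.1 Gb;
           | the per-flow-cell yields are placeholders, not vendor
           | figures, and flow_cells_needed is a made-up helper name.

```python
import math

HUMAN_GENOME_GB = 3.1  # approximate haploid human genome size, in gigabases


def flow_cells_needed(target_coverage, yield_per_cell_gb, genome_gb=HUMAN_GENOME_GB):
    """Flow cells required so that total bases / genome size >= target coverage."""
    required_gb = target_coverage * genome_gb
    return math.ceil(required_gb / yield_per_cell_gb)


# Illustrative yields only; substitute what your own runs actually produce.
for yield_gb in (2, 5, 10):
    cells = flow_cells_needed(30, yield_gb)
    print(f"{yield_gb} Gb per cell -> {cells} cells for 30x")
```

           | Lower real-world yields (for example after a poor prep)
           | push the count up quickly.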
        
             | mbreese wrote:
             | Not to mention there is a good chunk of required "extra"
             | equipment that is expected (centrifuges, etc) that bring
             | the initial cost to well over $1000. You could buy most
             | things used, but some of the reagents could be difficult to
             | acquire outside of a traditional lab.
        
               | ejstronge wrote:
               | This is not strictly true - there is a transposase based
               | prep that does not require special equipment.
               | 
               | Still, ONT users would need to understand how to convert
               | raw reads into useable basecalls at loci of clinical
               | interest
        
           | xipho wrote:
           | I suspect many people that frequent HN could use the MinION
           | to generate the raw data, but generating gigabytes of DNA
           | reads != assembling a genome. Remember, only this year the
           | very _first_ genome for any human was "completed". You're
           | going to get thousands of overlapping reads, of varying
           | lengths. Then you'll need serious processing power to combine
           | these overlaps, then overlap the overlaps, so to speak, and
           | on and on. What software to pick, how to parameterize/use it,
           | this is the job of post-docs and others who like to tear
           | their hair out. I haven't looked lately, but many one-off
           | scientific machines have their own proprietary binary
           | versions of the data, for vendor lock-in, so that you must
           | also buy their crappy software to process it. Double-check
           | that this isn't the case.
           | 
           | When your first run fails, are you willing to pay again as
           | much for the kits to run it again (these machines are very
           | much following the cheap printer/expensive ink model IMO)?
           | 
           | Once you have some data, do you know how you will BLAST it
           | against annotated genomes to figure out if you have mutation
           | X? How do you interpret E-values, etc.?
           | 
           | For the OP company, when they say "download the data" do they
           | mean the raw reads, or assemblies? Make sure this is spelled
           | out (it likely is, I haven't looked lately). Do the
           | downloads/service provide adequate metadata on the data
           | generation process so that you can tease out errors in reads
           | from reality (real single-nucleotide mutations), etc.?
           | 
           | All fun stuff, but only for a _very_ serious hobbyist, so to
           | speak.
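           | 
           | A toy sketch of the "combine these overlaps, then overlap
           | the overlaps" idea, assuming error-free reads from a single
           | strand and exact suffix/prefix overlaps. It is a greedy
           | merge for illustration only; real assemblers handle
           | sequencing errors, repeats and reverse complements, and the
           | function names here are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a equal to a prefix of b (>= min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps any more; leave the contigs separate
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]
    return contigs


# Three made-up reads cut from one short sequence.
print(greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"]))
```

           | The all-pairs comparison is quadratic in the number of
           | reads, which hints at why this gets expensive at genome
           | scale.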
        
             | aroch wrote:
             | No one should be BLASTing individual reads... And pretty
             | much no one is going to be assembling with OLC, even for
             | ONT data.
             | 
             | On the products page it says they provide FASTQs, aligned
             | BAM and VCF. Which is exactly what one would expect.
             | They're almost certainly just running the DRAGEN pipeline
             | (or something similar) and giving you the output it
             | creates.
        
             | jefftk wrote:
             | I think your typical programmer could automate the
             | assembly, even without reading about existing algorithms.
             | It isn't much harder than a standard interview question,
             | especially if you're willing to use your hardware
             | inefficiently?
             | 
             | You aren't trying to sequence the human genome from
             | nothing, but instead have a reference genome to work with.
             | You take each of the reads you get from the sequencer and
             | find where it best matches up with the reference genome.
             | 
             | I think the wet portion is a far bigger challenge for the
             | typical computer programmer hobbyist.
             | 
             | (Also I don't think $1,000 will get you, in addition to the
             | sequencer, a flow cell sufficient to gather enough reads to
             | tell the difference between sequencing errors and
             | mutations?)
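             | 
             | A minimal sketch of "find where each read best matches
             | the reference", assuming a tiny in-memory reference,
             | forward-strand reads and substitutions only (no indels).
             | It indexes the reference by k-mers and scores candidate
             | placements by mismatch count. The names and the choice
             | of k are illustrative; production aligners such as
             | BWA-MEM or minimap2 are far more sophisticated.

```python
from collections import defaultdict


def build_index(reference, k=8):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def align(read, reference, index, k=8):
    """Return (position, mismatches) of the best ungapped placement, or None."""
    candidates = set()
    for offset in range(len(read) - k + 1):
        # Each exact k-mer hit proposes one start position for the whole read.
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)
    best = None
    for start in sorted(candidates):
        window = reference[start:start + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


# Made-up reference and a read copied from it with one base changed.
reference = "ACGTTGCAAGGCTTACCGATAGGCTAACGTTAGCCATGGATCCGATTACAGGCTTAAGC"
index = build_index(reference)
read = reference[20:32] + "T" + reference[33:44]
print(align(read, reference, index))
```

             | A single read with one mismatch cannot say whether that
             | base is a sequencing error or a real variant; that is
             | what stacking many reads over the same position is for.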
        
         | topynate wrote:
         | I bought the Nebula 30x product and got my results quite
         | quickly, in under three months. It's hard to tell if my
         | experience is better than average or not, given that people
         | tend not to speak up about services when they're delivered on
         | time. There may be some issues with the data quality. I
         | happened to post about it a couple of hours ago:
         | https://www.reddit.com/r/Nebulagenomics/comments/w8013t/rate...
         | 
         | My current best guess is that more data than usual were of
         | bacterial origin, but I'll have to learn a bit more
         | bioinformatics to test that.
        
       | saxonww wrote:
       | Chuckling a little at the prominent "Security Grade: A+" display
       | on their privacy page. It's the Qualys score for their site,
       | which they are kind of implying is an overall security metric for
       | their business. Not very inspiring.
        
         | walnutclosefarm wrote:
         | Goes along with a big badge saying they are HIPAA compliant. I
         | mean, if they are a covered entity (which is to say, if they
         | are providing healthcare services), they have to be HIPAA
         | compliant, or be prepared to be fined heavily by the Feds for
         | breaking the law. If they are not a covered entity, they don't
         | really have any meaningful obligations under HIPAA, so being
         | compliant is meaningless.
        
       | exact_string wrote:
       | I assume this is yet another sequencing company outsourcing the
       | actual sequencing?
       | 
       | >> Do you sell or share my data with anyone?
       | 
       | > No, we do not sell or share your data, including your DNA data,
       | with anyone.
       | 
       | However, in their privacy policy:
       | 
       | > Your Personal Information may be shared in the following ways:
       | 
       | > With our service providers, allowing them to provide their
       | services to us.
       | 
       | edit:
       | 
       | More clearly stated here:
       | 
       | https://sequencing.com/ordering-dna-test-kit-and-genome-sequ...
       | 
       | > When you purchase the Ultimate DNA Test or Ultimate Genome
       | Sequencing test, your test will be processed by our Laboratory
       | Partners. We will provide our Laboratory Partner with only your
       | basic contact information so that it can send you a DNA
       | collection kit; we will not share your billing information with
       | our Laboratory Partner but will pass on your payments to them.
        
         | FollowingTheDao wrote:
         | > However, in their privacy policy:
         | 
         | That does not mean they are sharing your genetic data, maybe
         | just address, phone number, etc.
         | 
         | I had the WORST experience with Nebula Genomics which ended up
         | with them refunding my money after waiting 6 months.
         | 
         | I hope this is not another one of those companies. I still have
         | some blank spots I am trying to understand from my 23andMe
         | results and this would be great.
        
           | adora wrote:
           | I emailed Sequencing.com asking what the difference was
           | between them and Nebula Genomics and this is what they wrote
           | back:
           | 
           |  _(1) Nebula is our laboratory partner, and they sequence
           | with high-throughput MGI DNBSEQ-T7 DNA sequencing machines.
           | 
           | (2) They are our laboratory provider, but the product is not
           | the same! The whole genome sequencing product sold by Nebula
           | on their site is different from the Ultimate Genome
           | Sequencing service we offer in some key ways. One big one is
           | that our service includes more than $200 in DNA analysis apps
           | and reports (such as the Healthcare Pro report designed for
           | healthcare providers and the Rare Disease Screen that
           | analyzes thousands of rare diseases).
           | 
           | The biggest difference, however, is in the processing of the
           | raw genetic data into data files that you can then use to get
           | health insights. We have enhanced processing for these kits
           | that is able to create insights not available from Nebula
           | directly, including enhanced raw data processing through two
           | special bioinformatics pipelines that provide comprehensive
           | data and analysis of structural variations, copy number
           | variations, and mitochondrial heteroplasmy. Both Telomere
           | Length and HLA Typing are coming soon and will be
           | retroactively added to all past purchases of our Ultimate
           | Genome Sequencing service.
           | 
           | This is all fed into our One Genome technology, which allows
           | you to do analysis in ways that you can't do anywhere else.
           | This technology takes all of your genetic data and turns it
           | into an enhanced virtual genome. The advantage of this is
           | that you can run analysis on everything all at once, where in
           | some other cases you might have separate files that can't be
           | analyzed simultaneously.
           | 
           | The long and short of it is that it takes a lot of processing
           | to turn raw genetic data into usable data, and then to
           | analyze that data. That piece is where we come in, and are at
           | the forefront of the field._
        
       | omgwtfbyobbq wrote:
       | I believe All of Us provides you with your WGS data if you opt into
       | receiving it, and it's free.
       | 
       | https://allofus.nih.gov/
       | 
       | My mom did Nebula, which was kind of a pain, but she eventually
       | got her results.
        
       | yawnxyz wrote:
       | If this is truly WGS then I'm amazed at how it's only a few
       | hundred dollars to run.
        
         | adora wrote:
         | Potentially sending to China.
        
       | toomuchtodo wrote:
       | Feedback if someone from sequencing.com sees this: import from
       | 23andme.com doesn't appear to work if you use Sign In with Apple
       | for 23andme. Regardless, I love the broker functionality to pull
       | data from other sequencing providers. Kudos!
        
       | peter303 wrote:
       | Venter's company was charging $25K. That included lots of other
       | diagnostic tests too. (Venter was the first human sequenced. By
       | his own company.)
        
       | bobsmooth wrote:
       | The sample report is quite a read.
        
       ___________________________________________________________________
       (page generated 2022-07-25 23:00 UTC)