[HN Gopher] Sequencing your DNA with a USB dongle and open sourc...
___________________________________________________________________
Sequencing your DNA with a USB dongle and open source code

Author : johntortugo
Score  : 218 points
Date   : 2021-12-26 18:31 UTC (4 hours ago)

(HTM) web link (stackoverflow.blog)
(TXT) w3m dump (stackoverflow.blog)

| m12k wrote:
| I'm really curious about what I could learn by getting my DNA sequenced, but I'm worried about my rights to not have it recorded and shared without my consent if I got someone else to do it for me - so any advance toward an affordable home test setup is very welcome.
| adabaed wrote:
| Imagine insurers refusing to give you a service due to your predisposition to certain diseases...
| foobarbecue wrote:
| If you haven't seen Gattaca, you should
| haihaibye wrote:
| There should be a director's cut where the mission fails because of Vincent's hidden heart condition.
|
| Gattaca shows eugenics has been so vilified that the audience will root for a character who selfishly commits fraud, risking lives and scientific progress for his own vanity.
|
| The really scary fact is that there would be no need for a police state and segregation. The genetically enhanced would just completely dominate an open and fair competition.
| adabaed wrote:
| Yeah super good!!
| meltedcapacitor wrote:
| Protection from this comes from laws that ban DNA-based policies, not from being secretive about sequencing. If it is allowed, insurers will have no need to obtain DNA sequences in devious ways, they will just ask and refuse cover or charge more when clients refuse to get sampled.
| m12k wrote:
| Sure, but being secretive about your DNA seems like the prudent course of action until those laws are in place
| toomuchtodo wrote:
| "Passed in 2008, a federal law called the Genetic Information Nondiscrimination Act (GINA) made it illegal for health insurance providers in the United States to use genetic information in decisions about a person's health insurance eligibility or coverage."
|
| Also prevents employment discrimination based on genetics.
|
| https://www.genome.gov/about-genomics/policy-issues/Genetic-...
|
| (disclosure: have had my DNA sequenced by multiple organizations, and it's publicly available)
| jrumbut wrote:
| What I worry about is having this data laundered through a couple of vendors.
|
| "How could we know our vendor's vendor was using genetic information in their proprietary risk score?"
|
| "How could we know our client's client was using our score for life, health, or auto insurance/employment/lending/etc decisions?"
|
| It's a "can't unring a bell" situation and the gaps in the regulations and the incentives for bad behavior are enormous.
| ajuc wrote:
| It's amazing how many problems you avoid by having a public health system.
| inglor_cz wrote:
| You avoid the problem with medical debt, to be precise.
|
| You cannot really avoid the fundamental constraints - anywhere in the world, there are only so many doctors and so much money available for treatments. IDK if USA has a shortage of doctors, but plenty of European countries do. A country like Romania just cannot give its doctors big enough wages to stop them from seeking employment elsewhere, where they will get five to ten times as much (UK, Germany, Switzerland). As a result, local hospitals are seriously understaffed.
|
| Where I live, having personal connections to good doctors gives you an advantage - you will be examined and treated faster. Then there is outright nepotism.
|
| The outgroups are different than in America, but there are always people for whom the system sucks.
| adabaed wrote:
| You resolve part of them, but immediately generate others. Hybrid systems are the way to go.
|
| In Spain, for example, we have a private system but it is extremely inefficient in some areas (and very good in others). Of course, you can have private insurance, but you still have to pay your social security. Curiously, the only ones who can decide which system they want are the public servants...
| Method-X wrote:
| When I had 23andme sequence my DNA, I used a fake last name and pre-paid credit card.
| biophysboy wrote:
| It's only valuable if somebody also interprets it for you, such as telling you whether you have a genetic predisposition for certain diseases.
| DoctorOW wrote:
| Is that not something software can theoretically provide?
| jacquesm wrote:
| Your DNA can tell you a lot about what _could_ happen, but not about what _is_ happening.
| m12k wrote:
| One of the other comment threads indicates that the data you need to do that kind of annotation of the sequence is to some extent available for home use as well: https://news.ycombinator.com/item?id=29695449
|
| I'm really hoping someone will work on an open source "23andme@home" solution that ties all this together in an accessible way.
| rumblerock wrote:
| Years ago I used Ancestry, then requested the .txt file and asked them to delete it from their records. Uploaded it to run a report at https://promethease.com/ that cross-references your SNPs against the existing body of genetic research.
|
| The results have been pretty astounding. I found markers that pointed to poor response to a specific blood thinner my grandfather was put on before he passed.
| Currently I'm researching the cluster of Bipolar / ADHD / SAD symptoms I experience that all seem to trace back to a certain genotype of circadian rhythm genes I have (thank you, Sci-Hub). To boot, some of the studies I've come across have been done on Han Chinese populations that match my descent.
|
| Perhaps going too far down this rabbit hole poses a self-diagnosis risk, but the correlations to my family history and my own life experience working with doctors to diagnose and treat symptoms are pretty undeniable. And given that your run-of-the-mill psychiatrist is going to treat you off of a DSM checklist, I feel much more confident knowing there have been genomic studies to back things up, since my doctor isn't up to date on this research, and finding one who is will be difficult and expensive. I've shared the papers with my doc and he's been supportive; sometimes I feel like I should be getting a discount on services rendered.
| ClumsyPilot wrote:
| >"poses a self-diagnosis risk"
|
| Self-diagnosing is not the problem it is made out to be - I live with my symptoms 24/7, doctor sees me for 5 minutes. The number of times doctors have missed fairly clear signs of trouble in my family is disturbingly high. A simple procedure, done in time, would have saved two people I know.
|
| Unfortunately our educational system teaches you about mitochondria, but not the practical difference between ibuprofen and paracetamol, or CRP.
| dekhn wrote:
| Note that you are literally shedding identifiable DNA from your body at all times and a truly motivated adversary would have no problem obtaining enough sample material to do high quality sequencing.
| nomercy400 wrote:
| It's not the motivated adversary I am worried about, who actually has to show up where I have physically been.
| It is the company on the other side of the world in a country with lax legislation, profiling me based on the data I 'shed' online, like a cloud-based DNA sequencing service.
| shukantpal wrote:
| At scale?
| hourislate wrote:
| I'm curious whether a Covid PCR test could be used to sequence your DNA. Is there enough of a specimen in the process?
| eurasiantiger wrote:
| Absolutely.
| dekhn wrote:
| Sure. I've worked with and know people who could carry this out at scale, although obviously individual sample collection isn't highly scalable.
|
| Edit: I used to help Google fund researchers like Joe DeRisi and others who develop technology to do this, and some of the people I worked with in my academic career are quite good at identifying serial killers from 30-year-old DNA. If you're downvoting because you think I'm making this up, you're wrong. If you're downvoting because you don't think large-scale individual detection using genetic sampling of the environment is possible, you're wrong. If you're downvoting because you think you couldn't do a whole genome sequence of an individual using a sample collected in the wild, you're wrong. If you're downvoting because you think this is a terrible idea (morally, ethically), that's fine but I didn't say anything about my own moral or ethical beliefs about this.
|
| It's simply factually correct to say that large-scale individual sample collection (on the order of tens of thousands, if not hundreds of thousands of individuals in a country the size of the US) is possible. All the technology is there to do this.
| ClumsyPilot wrote:
| The data monopolies and abuse originate from people giving these companies data for free. If they had to buy it, or pay goons to collect it, they wouldn't be profitable.
| russdill wrote:
| In the near future (or arguably now depending on your purpose) you don't even need that.
| Assuming enough of your relatives' sequences are available, the probability of you having certain genes/mutations can be narrowed down so much that having your individual genome doesn't add much.
| kingcharles wrote:
| So, how long before I can take my DNA "ROM" file and boot it in an emulator that would allow it to grow?
| dekhn wrote:
| It's unlikely we would ever be able to achieve this. Even simulating a single cell at high resolution is a serious challenge.
| 323 wrote:
| You seriously underestimate the continuous growth of computer power. And quantum computers after, which are perfect for simulating chemical reactions.
|
| What was unthinkable 50 years ago, playing chess better than a human, is now trivial for a $100 device.
|
| And simulating the growth of a human doesn't necessarily require simulating the entirety of chemical reactions in all 50 trillion cells and all that.
| dekhn wrote:
| It's possible I underestimate, but I have worked in all the relevant fields of simulation, ~20 years of running various simulations on large HPC, built the largest instance of Folding@home using idle cycles inside Google data centers, published papers simulating proteins, developed infrastructure to process the voluminous data, etc, etc. Quantum computing remains fantasy (in terms of being useful for science).
|
| It's unlikely, even if we improved computing hardware many orders of magnitude beyond all reasonable predictions, that the calculations would be able to simulate all the necessary details; most of our simulations now are based on many approximations due to hardware limitations.
|
| As to the question of "what level of fidelity is required to turn a FASTQ of somebody's genome into an accurate model of the resulting human, with some sort of realistic environment also provided", that's so far beyond what is even remotely comprehensible it's not worth speculating about in terms of science fact; it's just fiction.
| GistNoesis wrote:
| It's likely that you don't have to simulate even a single cell at high resolution to be able to simulate how an organism would grow. There are numerical shortcuts.
|
| For example today we can already predict the color of the eyes and other phenotypes from the DNA.
|
| If you are able to observe enough samples of cell growth and their associated DNA, you probably can model and predict the statistics of a cell from its DNA. Because the cell is itself the result of a lot of chemical processes, the law of large numbers will help smooth those statistics.
|
| Given that we have a lot of cells, the collective behavior is probably entirely governed by these statistics.
| Lev1a wrote:
| An idea just popped into my head reading your comment:
|
| What if you could take the (binary) data file of your DNA and use it as input in the (recently remastered) Monster Rancher games to generate a monster? Apparently those games use external user-provided data (like music CDs, game discs etc.) to generate the monsters the player would then train and use (something I only recently learned about through gaming livestreams).
|
| I'd actually like to see the level of jank that would come out of something like that.
| LinuxBender wrote:
| This is very cool. Are there by chance any associated projects that could evolve into something like 23andme but remain entirely within a private network meaning that the data is entirely in the hands of the individual?
| ampdepolymerase wrote:
| A used laboratory grade NGS system can be had for less than 10K
|
| https://www.ebay.com/itm/265148387179
|
| Nanopore is still not quite ready yet for precise and high accuracy sequencing. Give it another five years.
| anderspitman wrote:
| I work in a dry lab but I'm pretty sure you need a lot of expensive chemicals to actually make one of these work, yeah?
| mylons wrote:
| yup. that's the business model for Illumina. it's very much akin to video game consoles. Illumina might take a hit on selling the machine but makes it up in selling you proprietary reagents.
| rbartelme wrote:
| Cost/benefit analysis may dictate that, as other posters suggested, you'd be better served to get raw fastq files from a sequencing lab. Even better if you can send the lab a sample and they'll process the extractions for extra $$.
| mylons wrote:
| wow i didn't know they were that "cheap" now. i used to work for a major competitor to the sequencer you linked, the SOLiD.
|
| and i feel like nanopore is the VR of dna sequencing. it's always just another few years off.
| ampdepolymerase wrote:
| The one I linked to is a decade out of date and OEM discontinued.
| mylons wrote:
| ya my first thought was how hard are reagents to get, but probably not that hard. i wasn't in the lab, i was in bioinformatics so i'm generally clueless on reagent acquisition.
| joshuamcginnis wrote:
| What do you mean by it's always a few years off? Nanopore will allow you to do high-quality genomic sequencing _now_, in a home lab if you wanted, for less than $3K. If you amortize the 3K by the number of genomes you can sequence on the same flow cell, the price per base or per genome falls precipitously, depending on the size of the genome of course.
| divbzero wrote:
| > and i feel like nanopore is the VR of dna sequencing. it's always just another few years off.
|
| Is this also true for nanopores in protein sequencing?
| This HN comment from a few weeks back [1] pointed out recent progress but perhaps the tech is still not quite there.
|
| [1]: https://news.ycombinator.com/item?id=29481075
| joshuamcginnis wrote:
| That's not true. I just did a high-quality sequence and assembly of a new species of fungus from my home lab using nanopore. You can see all my code used for assembly and analysis that will be referenced in a paper I plan to publish in Jan here: https://github.com/EverymanBio/pestalotiopsis
| AstroDogCatcher wrote:
| Interested outsider here; I work with a lot of HCLS research customers but don't have a biology-related background. Can you explain the problems with the Nanopore sequencer accuracy in more detail? Basically, I was wondering if I could get one for myself and sequence my own genome, then use the data to learn about life-sciences computing techniques. If I were to buy one of the USB-attachable devices and run it, is the data simply not viable for use in a genomics pipeline, or is it just that the results would be questionable? Also, if accuracy is an issue, what about just running the same sample N times and doing some error correction?
| ampdepolymerase wrote:
| I recommend reading this review
|
| https://genomebiology.biomedcentral.com/articles/10.1186/s13...
|
| I guess there are limits to ensemble methods if the underlying accuracy doesn't increase. I don't work on gene sequencing algorithms but from what I understand of ML ensemble techniques, there are certain assumptions regarding the underlying independence of the errors. The errors for nanopore _should_ be uniform but I am not sure. Any molecular biologist here care to comment?
| biophysboy wrote:
| I know that the error rate of the Oxford Nanopore sequencer depends on GC content (guanine/cytosine nucleotides), and that the Pacific Biosciences sequencer uses a polymerase that gets worn down during reading.
| So there is some non-uniformity in the chemistry.
| ampdepolymerase wrote:
| GC rich regions as in hairpin loops? How would the sequencer deal with those?
| biophysboy wrote:
| The instruments do exactly as you say (run the sample N times), but this obviously comes at a cost. Also, keep in mind that sequencing needs to be very, very accurate to be useful. We share most of our DNA, and the small variations make up all the difference.
| netizen-936824 wrote:
| Sounds like a fediverse project?
| Malp wrote:
| Oh God, I would not want a distributed group of actors with limited trust to sequence my DNA. Maybe it's a project for a close group of friends that would be interested?
| netizen-936824 wrote:
| I wasn't thinking sequencing but rather comparison. Could even hash data for comparison to enforce privacy (unsure how effective that would be)
|
| But this could enable things like finding relatives which is what I got out of the comment about 23andme. Instead of all the data being centralized, storage and comparison could be distributed
| inciampati wrote:
| Your DNA is almost exactly the same as other people's, just a unique mix.
|
| Not sure what you are concerned about. What would you expect a bad actor to do with your DNA sequences? I'm genuinely curious.
| snovv_crash wrote:
| Using that analogy, all the 1s and 0s in your private key are the same as everyone else's as well. Genetic data can be used for all kinds of things, the worst of which would be things like targeted diseases or planting your DNA at a crime scene.
| LinuxBender wrote:
| _Your DNA is almost exactly the same as other people's, just a unique mix._
|
| Music is exactly the same notes, just a unique mix. So why is Sony upset that I want to stream their entire library? But jokes aside...
|
| A few decades ago I fought the military on collecting my DNA. I stalled them long enough to get my honorable discharge and avoid that altogether.
| It's funny you ask because the commander asked the same thing and joked _"Are you afraid we are going to clone you?!"_ to which I replied, _"No sir, you should be afraid you are going to clone me."_ and we both had a laugh because he knew I was right. The military are not fond of critical/free thinkers. One of me was plenty. I explained that insurance companies were already using this data to retroactively cancel people's policies even if they were not actively afflicted by something. The commander showed me how to use the FOIA request system.
|
| Laws have evolved a little since then but there are plenty of other risks. For starters, I can't easily change my DNA like I can change my debit card. That data can be used to tie me to others or _guilt by association_ which is undesirable drama. It can also be used to try to sell me things. It can also be used to target biological weapons against specific groups of people. There appears to be an imbalance of data sharing in this regard. [1] Then there is simply the matter of privacy. If I want to share my DNA with some lab that is in turn going to sell it out to hundreds of other companies over and over forever, I should at the very least be getting paid a vast amount of money and land and have legally binding contracts and NDAs that cover what is and is not allowed to be done with my data and how long it may be retained. That contract and the laws enforcing the contract must have some serious teeth with very serious ramifications for anyone violating it whether intentionally or by mistake.
|
| [1] - https://www.youtube.com/watch?v=biNxl7tiVSY
| dav_Oz wrote:
| From a more paranoid perspective:
|
| I'm curious about the possible abuse scenarios given the ubiquitous use of PCR-testing for nearly two years, now.
|
| If I'm informed correctly for a viable sample for NGS you need like 2 mL saliva (which sounds little but it really takes some time: >1 min) not those trace amounts which usually get collected by the swabs?
| fragmede wrote:
| A very practical reason not to want your DNA out there, unrestricted, is insurance costs. That spans car insurance, health insurance, mortgage lending rates, and life insurance, and while GINA from 2008 is supposed to protect that information, there are loopholes with the interpretation of that law that should give everybody pause.
| mylons wrote:
| yes. if you wanted to annotate your genome you could "easily" do it on your brand new macbook (this is ram intensive, you probably need 32G). you'd need a reference genome, like https://www.nist.gov/programs-projects/genome-bottle
|
| then you'd need a program like bwa http://bio-bwa.sourceforge.net/ to map your data.
|
| then use https://samtools.github.io/bcftools/howtos/variant-calling.h... or something else to produce variants from the mapping results.
|
| then compare your resultant vcf file to something like dbSNP: https://www.ncbi.nlm.nih.gov/snp/
|
| at this point you can start generating a raw version of a 23andMe report.
| tootie wrote:
| I'm unclear from this what kind of equipment you need to extract and analyze the material?
| mylons wrote:
| you'd likely have to get the nanopore sequencer in the article or find a lab using Next Generation Sequencing to sequence your DNA and give you "raw data" which are usually fastq files
| LinuxBender wrote:
| Nice! Thank you for the links. I will research all of this.
| mylons wrote:
| good luck! it's not that tough, just a lot of new vocabulary.
| GekkePrutser wrote:
| I don't see any reference to the "USB dongle" mentioned in the title. I was thinking this would be some cool thing you could do at home.
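(Editor's sketch, not from any commenter.) The last step mylons describes above, comparing your VCF against something like dbSNP, might look roughly like this in plain Python. Everything here is an assumption made for illustration: the function names `parse_vcf_records` and `annotate`, the tiny in-memory `KNOWN` table, and the placeholder IDs are all hypothetical, and a real run would use bcftools or a proper VCF library against the actual dbSNP release rather than a dictionary.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# A variant is identified here by (chromosome, 1-based position, ref allele, alt allele).
VariantKey = Tuple[str, int, str, str]


def parse_vcf_records(lines: Iterable[str]) -> Iterator[VariantKey]:
    """Yield one key per ALT allele from tab-separated VCF text lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines (##) and the column header (#CHROM)
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete record
        chrom, pos, _id, ref, alts = fields[:5]
        for alt in alts.split(","):  # multi-allelic sites list ALTs comma-separated
            if alt != ".":  # "." means no alternate allele was called
                yield (chrom, int(pos), ref, alt)


def annotate(variants: Iterable[VariantKey],
             known: Dict[VariantKey, str]) -> List[Tuple[VariantKey, str]]:
    """Return (variant, annotation) pairs for variants present in the lookup table."""
    return [(v, known[v]) for v in variants if v in known]


# Made-up sample data; the IDs below are placeholders, not real dbSNP entries.
SAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr2\t2000\t.\tC\tT,G\t60\tPASS\t.",
    "chr3\t3000\t.\tG\t.\t10\tPASS\t.",
]
KNOWN: Dict[VariantKey, str] = {
    ("chr1", 1000, "A", "G"): "placeholder_id_1",
    ("chr2", 2000, "C", "G"): "placeholder_id_2",
}

hits = annotate(parse_vcf_records(SAMPLE_VCF), KNOWN)
for variant, label in hits:
    print(variant, label)
```

Exact matching on (chrom, pos, ref, alt) is the simplest possible join; it assumes both files use the same reference build, chromosome naming, and variant normalization, which is often not true in practice.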
| dekhn wrote:
| https://nanoporetech.com/products/minion
| fragmede wrote:
| I don't know if this is the exact nanopore USB dongle used in the article, but this one is $1,000 for the base package, first released in 2014
|
| https://store.nanoporetech.com/us/minion.html
|
| https://www.extremetech.com/extreme/190409-minion-usb-stick-...
| koeng wrote:
| Yep that's the one. They update the flow cells over time. The bit they don't tell you is the stuff you need, like a Qubit, to properly run the thing.
| joshuamcginnis wrote:
| A Qubit or fluorometer isn't required. You can use a simple DNA ladder to measure the relative quantity and quality of DNA that's good enough for nanopore sequencing. I just did a full genome sequence of a novel fungus using this exact approach.
| koeng wrote:
| Huh, interesting. Did you fragment? I'd imagine comparison of high molecular weight gDNA wouldn't be too nice on a gel.
|
| You also still, in that case, need a gelbox + ladder + loading dye + sybrsafe or whatever, so it's still not nothing.
| joshuamcginnis wrote:
| I did a HMW extraction kit on the DNA and used a gel to estimate the volume of HMW DNA. Yes, you need to be able to run a gel, but I'm not sure what the expectation is from folks; that you just place a random piece of non-sterile tissue on a chip and have it do the extraction, sequencing and assembly? That seems like an unrealistic expectation.
| inglor_cz wrote:
| DNA sequencing bugs me quite a bit.
|
| On one hand, I would love to learn something new about my body.
|
| On the other hand, what if the results tell me that I am predisposed to some horrible untreatable disease? Will I spend the rest of my days observing every little pain or discomfort and thinking "is this IT?"
| nomercy400 wrote:
| How about affinities to possible health issues, which could be avoided if you started now and not in 20 years?
| inglor_cz wrote:
| I know. There are a lot of different scenarios.
| It is the worst one that bugs me. Human nature in action.
|
| Perhaps a trusted middleman would be a solution: "just don't tell me about anything that is totally beyond my control".
| wallacoloo wrote:
| well, build a whitelist of the conditions you are interested in knowing. then just run the report through a sed filter so that it strips out all the information you're not interested in. destroy the original report. problem solved: infohazards avoided.
| lend000 wrote:
| How does it get the DNA to go through the hole?
| Cyclical wrote:
| Initially, the DNA is brought near the pore through diffusive (Brownian) motion + any small attraction it'll have to the membrane. Close to the pore it uses a combination of the electrophoretic and electro-osmotic effects to draw the DNA molecules through. The application of an external electric field will cause the charged DNA molecules to migrate along the field (electrophoresis). This is independent of the fluid, and happens to any ions under voltage. The electro-osmotic flow, on the other hand, is a motion of the fluid itself, pulling the DNA molecules along with it. EOF is a really interesting phenomenon which is caused by the interaction between the surface chemistry (vis-a-vis charge distribution) and the concentration gradient of charge carriers in the fluid. I'd recommend Fundamentals and Applications of Microfluidics by Nguyen et al if you're looking for a good primer on electrically induced flows in microfluidics.
| dekhn wrote:
| Folks are free to analyze my genome, https://my.pgp-hms.org/profile/hu80855C
|
| Last time it was analyzed the conclusion was that there was nothing actionable.
| zmmmmm wrote:
| Have you ever encountered any insurance implications from it? eg: questioned whether you have ever had a genomic test etc. and had to answer yes and then them wanting to see results?
|
| I guess in your case where nothing actionable is found it's benign.
| It will be the cases where there are risk factors for late onset things - cancer, diabetes, heart disease etc. where it would get sticky.
| dekhn wrote:
| No, my health insurance company doesn't care about my whole genome data. Health insurance companies are already quite skilled at (and profitable due to) their ability to model life expectancy and health issues without genomic data, and they are legally prohibited from using this data, in my country anyway. Life insurance is different (they are allowed to incorporate much more information) but I've never been asked for anything like that.
|
| As for the case where nothing actionable is found - it's not benign. It's absence of information, not information of absence.
| Cyclical wrote:
| Nanopore sequencing is a really interesting technology. It utilizes fundamentally the same apparatus as a Coulter Counter [1], which is a general method of counting and sizing arbitrary particles that's frequently used in flow cytometry. Applying it to sequencing by drawing unwound DNA through the pore was a really excellent logical leap, and we're only now starting to see the benefits of it even though it was first ideated over 30 years ago.
|
| [1] https://en.wikipedia.org/wiki/Coulter_counter
| billiam wrote:
| TMI.
| a-dub wrote:
| the nanopore units are awesome! although if i recall, most of the device is a replaceable one time use consumable and that consumable is quite expensive (at least hundreds, if not thousands).
|
| when i looked i was interested, but was turned off when i saw that the cost far outstripped commercial sequencing services.
___________________________________________________________________
(page generated 2021-12-26 23:00 UTC)