[HN Gopher] AlphaFold won't revolutionise drug discovery ___________________________________________________________________ AlphaFold won't revolutionise drug discovery Author : panabee Score : 79 points Date : 2022-08-06 19:35 UTC (3 hours ago) (HTM) web link (www.chemistryworld.com) (TXT) w3m dump (www.chemistryworld.com) | pcrh wrote: | This article makes some good points, but is incorrect on some | others. | | In particular the statement " It is very, very rare for knowledge | of a protein's structure to be any sort of rate-limiting step in | a drug discovery project!" does not reflect the realities of drug | discovery. | | Knowing a protein's structure, and its structure when complexed | with ligands/drugs, is a massively important bit of data in the | armoury of medicinal chemists, which Derek Lowe knows all too | well. | | Of course, it may be that knowing which protein to target is more | important, and that problem isn't affected by AlphaFold. | aurizon wrote: | I think Alphafold will be of immense value. The article is too | pessimistic. Alphafold will reveal a huge number of structural | parameters that can be exploited by careful synthetic chemistry. | You can know what base pair to change and see if that changes or | blocks function. Things like Paxlovid can be tweaked towards an | optimum. It is an analog to the way the Rosetta stone unlocked a | few ancient languages. With this tool a huge number of testable | structures will be amenable to tweaking, since we now have online | ordering of almost any sequence. The tedious step will be the wet | testing, but that has been solved by the multiple well test | slides. We can now sequence any protein via dye/pore/emf methods. | https://en.wikipedia.org/wiki/Nanopore_sequencing | zmmmmm wrote: | This take seems to be somewhat over sceptical and slightly over- | reaching to me. 
| | > when your entire computational technique is built on finding | analogies to known structures, what can you do when there's no | structure to compare to | | Lots of people seem focused on the idea that deep networks can't | do anything novel and are just like fancy search engines that | find a similar example and copy it. This is _not true_. They do | learn from much deeper low level structures in the domain they | are exposed to. They can be aware of implicit correlations and | constraints that are totally outside what may be recognised in | the scientific understanding. Hence AlphaFold is quite capable of | predicting a structure for which there is no previous direct | "analogy". As long as the protein has to follow the laws of | physics then AlphaFold has at least a basis to work from in | successfully predicting the structure. | | > It is very, very rare for knowledge of a protein's structure to | be any sort of rate-limiting step in a drug discovery project! | | This and the following text are very reductive. It's like saying, | back in 1945, that nuclear weapons would not be any sort of | advantage in WW2 because it is very rare for weapons of mass | destruction to win a war. Well yes it was rare, because they | didn't exist. And so too did we not have a meaningfully accurate | way to predict protein structures until AlphaFold. We've barely | even begun to exploit the possible new opportunities for how to | use that. And people have barely scratched the surface in | adapting AlphaFold to tackle the related challenges downstream | from straight up structure prediction. Predicting formation of | complexes and interactions is the obvious next step and it's | exactly what people are doing. | | It's not to say that it _will_ revolutionise drug development, | but the author's argument here is that he is confident it _will | not_ and he really doesn't assert much evidence of that. 
| ramraj07 wrote: | If you're gonna get mad and quote a sentence to rail on the | author, at the least quote the full sentence: the author ends | it with "and there never will be." Because among other things | he's talking about intrinsically disordered proteins[1]. What | can the best prediction model do to predict the truly | unpredictable? Just tell us that it's unpredictable. | | And what is your second criticism exactly? The author comes | from the drug discovery industry. What the author said should | be generalized to: even if we know the perfect experimentally | confirmed 1A resolution structure of every protein out there | tomorrow, that won't exactly revolutionize drug discovery. | That's because protein structure gives maybe 10% of the context | you need to successfully design a drug. It's dynamics, higher | order interaction specifics, complex interplay in signaling | pathways in particular cells in particular contexts and what | entire cells and organ systems in THAT PARTICULAR ORGANISM do | when this protein is perturbed, are what truly affects drug | discovery. | | If you absolutely want to revolutionize DD, find us a better | model to test things on that's closer to the human body as a | whole. Currently mice and rats are used and they're not cutting | it anymore. | | This fundamentally goes back to the downfall of the | prototypical math or software guy trying to come and say "im | gonna cure cancer with MATH!" No you're not. You're gonna help, | and it's appreciated, but if you're gonna truly cure cancer you | better start stomping on a few thousand mice and maybe also get | an MD. | | 1. https://www.nature.com/articles/nrm3920 | vcdimension wrote: | This proposed improvement in the way RCT's are conducted could | have a big impact on the speed of drug discovery: | https://arxiv.org/pdf/1810.02876.pdf | bilsbie wrote: | > Forming these coils, loops, and sheets is what proteins | generally do, but 'why?' doesn't enter into it. 
| | How do we know the model hasn't figured out some of the 'whys' | somewhere in there? | bigdict wrote: | Because it learns a conditional distribution. It doesn't work | on figuring out why the distribution is the way it is. | lrem wrote: | Your question fundamentally falls into the area of unanswerable | philosophy akin to "do insects feel pain?" | | But there's a reasonable intuition suggesting that the answer | to your question is "no". What we're looking at is a non-linear | regression model reproducing the function (which according to | the article isn't really a function, but that's above both my | and the model's knowledge) from a gene sequence to a 3d | structure. It is heavily meta-optimised, so the "why's" would | only be in the model, if reproducing the process of folding the | protein was the cheapest way to guess the structure (). | Intuitively it introduces at least one extra dimension, so | should be way more expensive than finding analogues among known | sub-aspects of the function. Hence, I would expect none of the | "why's" to be in there. | | Sadly, if any _insight_ for the "why's" was there after all, | we don't have a method to extract it anyway. | | Disclaimer: I work in Google, far away from DeepMind, have no | internal knowledge on this. | salty_biscuits wrote: | "Sadly, if any insight for the "why's" was there after all, | we don't have a method to extract it anyway." | | This has been my central frustration with working in ML. | People always expect a "why" to exist, and by why I mean a | cogent narrative explanation to complex phenomena. Maybe | there is no "why" like this for a bunch of physical | phenomena, maybe it is just a bunch of low level intricate | stuff interacting in complex ways. There might be an emergent | model that you can get a useful predictive model for with an | ML model, then people get mad because the prediction doesn't | solve the real meta problem that they were expecting to solve | via the sub problem (e.g. 
solve folding then get mad because | folding itself turns out not to be super useful because we | don't know which protein to target, solve image | classification then get mad because that doesn't make it easy | to make a self driving car, etc, etc). "More is different" is | definitely an idea in physics that needs to propagate into | other fields to temper our expectations. | sgt101 wrote: | The "why" is a bit of an odd question anyway - the structure is | as the structure is. It's like asking why "red hears a | galaxy", just words. | freemint wrote: | Well, no. A bunch of mechanisms have models of lower | complexity that have almost exactly the same predictive power | but completely different structures. Those higher order | structures are the "why". | | Why does a cube on an inclined plane start to slide? You | act like the correct answer is "because the subatomic | particles and space time in the light cone of the experiment | made it that way" when one should expect "because the sine of | the incline angle times mass times local gravity became | bigger than the static friction between a cos(incline angle) | times the original cube weight and the surface at no incline" | which is a lot simpler. | pelorat wrote: | It's probably not even possible for a human to understand the | "why" | summerlight wrote: | The modern world is not simple enough to allow a single | paper or technology to revolutionize anything. I don't understand | why people are reiterating this obvious fact over and over? Most | of the technological breakthroughs are usually a culmination of | decades of research and investments. | evouga wrote: | My observation is that breathless hype pieces proclaiming that | a new technology will imminently revolutionize area X outnumber | the articles expressing common-sense skepticism about the | technology, by two to three orders of magnitude. 
| xiphias2 wrote: | While the article is correct that knowing the protein structures | in itself is not that interesting, it's a prerequisite step to | predicting interactions between proteins, which is super | interesting for drug discovery. | | What's encouraging is the rate of progress, not what has already | been done. | AlbertCory wrote: | In the early 2000s, I took a bunch of UCSC Extension courses on | mol bio, bioinformatics, and drug discovery. Back then, abundant | DNA information was the thing revolutionizing the field. | | What the scientists (all from Roche) said was, more or less, | "yeah, that helps a lot. It doesn't solve the whole problem, | though." | | 20 years later they've gotten yet more help with Alphafold. Once | again, they can do things faster, but it isn't a Moore's Law-type | of change. It's still a really hard problem demanding culture, | animal, and human tests, and those take time and money. | aabhay wrote: | The author doesn't answer the question. If not this, then what | will? Because as far as I can tell, we're nowhere near extracting | the full value of AI-generated protein structures. Why plant this | flag and be wrong later if you have no real idea of what should | be done instead? | ChrisRackauckas wrote: | The questions he asks here are exactly the questions that | quantitative systems pharmacology (QSP) seeks to answer (and as | a result, it's booming as a field). Just because you can build | a drug to inactivate said protein doesn't mean you should. 85% | of clinical trials fail as he states, and one of the main | reasons why is because the target ends up being incorrect. | Targeting some protein because a lot of it seems to exist when | a given disease is occurring might end up targeting the symptom | instead of the cause. Understanding how the complex systems | interact, their feedbacks and their nonlinearities, is | essential to knowing what needs to be targeted. 
We had already | been able to quickly create new drug candidates, and with | protein folding predictions we can now do that even faster. | Those drugs can be tested in a lab to see if they bind to the | proteins they're supposed to, and they keep getting quicker at | hitting exactly the function they expected. But without making | the billion dollar clinical trial more likely to be solving the | actual problem, we're still going to be limited by "okay, so | what in this pool of possible drugs should we risk trying | next"? We can accurately knock out protein function, but we're | still fishing in the dark when it comes to how to actually fix | and regulate bodies. | echelon wrote: | Because we have to be honest with ourselves. Don't tell the | crystallographers they're no longer necessary for structure | determination. If people flee the field and ML doesn't pan out, | then we're worse off. | | Treat this as it is. An exciting approach that may help some | now and yield fantastic results in the future. Don't count the | chickens before they hatch. | | Even if the structures were entirely correct - and they're | definitely not - there's a massive complex metabolome to figure | out. | | Google is certainly milking the PR as much as they can, and | that can be dangerous to the laymen approving research budgets. | tigershark wrote: | Alphafold demonstrated beyond any reasonable doubt that | crystallography by itself is useless in certain | circumstances. There are plenty of research groups working on | crystallography that found the correct solution only | combining their data with Alphafold data. In the last | competition, if I remember correctly, there was one protein | that escaped crystallography for many years until they used | Alphafold predicted structure. I'm not really sure how can | you simply discount these really groundbreaking results when | crystallography provided much less wins in many more years. 
| l33tman wrote: | You are aware that almost all known protein structures | come from crystallography? | pas wrote: | How much of the AlphaFold training data is from | crystallography results? | evouga wrote: | The crystallographically-determined structures are the | _ground truth_! [1] | | Saying that AlphaFold makes x-ray crystallography useless | is like saying DALL-E makes photography useless or Copilot | makes GitHub useless. You've got the dependency chain | backwards. | | [1] (Or at least, they're treated as the ground truth--- | they don't necessarily predict the conformation of proteins | in solution, but that's a separate topic for another | thread). | freemint wrote: | Near real time (max 100 times slower than real time) | differentiable, stochastic multi organ simulations with | chemically accurate time and environment depending dynamic | structure changes at all possible binding targets or | interactions with body own components and third party drugs. | | Without machine learning at every atom is dynamic precision we | are at 10^-18 L (liters) at 20 microseconds a week with a | specialised supercomputer | https://dl.acm.org/doi/abs/10.1145/3458817.3487397 . | | A solution does not need that precision everywhere. However a | machine learning proxy of such precision in every relevant | environment including 2d surface along non mixing fluid etc for | every likely type of interaction is required so we can be | certain of the possible outcomes. | | That would allow humanity to pre-screen a bunch of edge | conditions and check for unintended or previously explained | side effects. 
The derived surrogates for environment dependent | reaction rates could be used in a spatially distributed event | based simulations with level of precision ranging from atoms | with position and electrons in orbits subject to electro- | magnetic force interaction, molecules as things with position | and rotation and folding state, concentration gradients of | those as stochastic 3d PDEs, 2d PDEs, 1d PDEs and ODEs of the | number of moles with relevant boundary conditions. If we had | those reaction rates down and knew of all the proteins and | other structures I am positive that a proxy model of relevant | parts of the human body could achieve enough accuracy to be | practical at pre-screening drugs with today's supercomputers. | dekhn wrote: | To revolutionize drug discovery, you need to solve a number of | problems that ML can't really address right now. | | We do not have well-formed theories of the molecular details of | many diseases. There is no immediate computational approach that | addresses this defect. The community has had fairly simplified | models for some time, and there's a lot of historical belief that | by knowing protein structures in detail, we can understand the | nature of a disease through its molecular etiology, and from | that, we can make drugs (either small molecules or biomolecules) | that modulate proteins in rational ways to eliminate the disease | with a minimum of side effects. | | In my mind, much of the problem is similar to modern deep | learning compared to previous techniques. Several extremely | challenging problems (high accuracy voice recognition, image | recognition, object detection) simply were not solvable through | the statistical techniques and mental models adopted by the | practitioners. 
It is now abundantly obvious that stupidly simple | deep networks can be pretrained on enormous amounts of labelled | data, or even unlabelled data, but we didn't even have the | ability to know this confidently until we had the right network | architectures, enough high quality labelled data, and adequate | compute power to train them. | | I believe that by starting to think about disease modelling from | the same mindset as deep learning (simple models with many | parameters, the models don't actually represent the assumed | mechanism, large amounts of high quality data, lots of CPU, GPU, | and RAM) and also thinking of the disease treatment process in | the same way will greatly increase our ability to "understand" | and "treat" diseases, while knowing far less about their | underlying mechanism than we thought. | | A common example is disease/patient stratification. If you've | developed drugs that treat disease A, but it turns out later, | there are really two diseases, A1 and A2 with different | underlying mechanisms but superficially similar exterior | symptoms, you'll realize why some percentage of your population | didn't get better (and often got worse, given the underlying | toxicity of some medicines). If we could just stratify diseases | better, and classify patients into the right bins, the | effectiveness of medicine will go up (and drugs will get through | clinical trials faster/better). | | None of this addresses the later-stage issues, such as | successfully running all the phases of a clinical trial and the | other gauntlets you must pass in order to get a drug FDA- | approved. | | I would continue to expect marginal improvements for the | foreseeable future. But be aware: some companies already have | managed to do a good enough job developing new medicines that | they routinely create multi-billion-dollar blockbuster drugs year | after year after year (my employer, Genentech, is a perfect | example of that). 
It maintains an enormous and well-funded R&D | arm that expends untold neurons attempting to understand | disease better even before we start to consider something as | "druggable". | curious_cat_163 wrote: | Mapping DNA sequences to 3D protein structure is the problem that | AlphaFold tries to solve. I don't think it tries to solve for | "drug discovery". | | I suspect that, like any ML problem, this one is a small part of | the whole solution of drug discovery. There are always system- | level dynamics at play. | | To me, some relevant questions before deciding to take on an ML | problem tend to be: | | [x] Does solving it eliminate manual labor from the process? [x] | Does it save $ in the progress towards solving the whole problem? | [x] Is it fun to solve it? | microSnowball wrote: | I think alphafold gets hated on too much. It won't revolutionize | things but I bet people are out there right now looking at | different structures and motifs only seen on alphafold to get a | better idea on how existing drugs bind and affect them. And then | designing analogues and so on. Time will tell, I guess. | | It's kind of like anything in research, lots of small steps | enable revolutionary breakthroughs every so often. | fabian2k wrote: | You can assume that any known drug target has experimentally | determined structures available, once you spend the enormous | amounts of effort necessary to put a drug through real clinical | trials the effort to determine the target structure is pretty | much irrelevant. | | Of course there are plenty of drugs where we either don't know | where they bind or we're probably wrong about where we think | they bind. Or they bind at multiple places and some desirable | or non-desirable effects are due to binding at places we don't | know yet. | | There are real uses to having lots of high-quality structure | predictions for proteins. Drug development is something that | only gets limited benefits here. 
If you want to know how drugs | or drug candidates bind to proteins you first create a protein | structure with X-ray crystallography. Then you soak your | crystals with your drugs or drug candidates and determine even | more structures. The interesting part here is not necessarily | the overall fold of the protein (which is mostly what AlphaFold | gives you) but e.g. a single hydrogen bond to the drug in the | active pocket of the target protein. You need really high- | quality data if you want to do any kind of rational drug | design, most of the time we still just semi-randomly vary | structures until they bind better as far as I understand. | epistasis wrote: | I think it gets marketed too much and hated on too much. | | Given the utter dominance of Google advertising, I think the | hating is a necessary counter in order to at least place it in | its right place. | | Whatever skill Google has computationally is more than matched | by their media dominance and public relations prowess. | mtlmtlmtlmtl wrote: | I find this view very strange. If you apply the same logic to | politics, the outcome is pretty grim. And we've been seeing | more and more of that. | | I don't like hype or hate that's devoid of nuance. But actual | scientists working in these fields don't generally pay | attention to these things as much as we might. They read the | papers, and they have years of training to help them decide | what is overhyped and what isn't. I'm not sure what happens | on HN or in advertising channels has such a huge bearing on | this. | [deleted] | lrem wrote: | I _love_ this article. It nicely answers the question I posed | (https://news.ycombinator.com/threads?id=lrem#32263287) in the | discussion of the original announcement: is today's db good | enough to be a breakthrough for something useful, e.g. pharma or | agriculture? And the answer, somewhat unsurprisingly, seems to be | "useful, but not life-changing". 
And that's a perfectly good | result in my eyes :) | frozencell wrote: | OpenAI's GPT-3 and DALL*E2 might be life-changing for their | creative users, writers and illustrators or beginner creators, | I can't remember any life-changing use case for groups (outside | of the creators themselves). For ML researchers, transformers | seem to not be used as AGI at all (despite general or multi- | modal potential) but mostly used for test and probability tool. | p1esk wrote: | _For ML researchers, transformers seem to not be used as AGI | at all (despite general or multi-modal potential) but mostly | used for test and probability tool._ | | What do you mean? | SilasX wrote: | That link goes to the top of your comment history. I think you | want this link to ensure you see the right comment: | | https://news.ycombinator.com/item?id=32263287 ___________________________________________________________________ (page generated 2022-08-06 23:00 UTC)