[HN Gopher] AlphaFold won't revolutionise drug discovery
       ___________________________________________________________________
        
       AlphaFold won't revolutionise drug discovery
        
       Author : panabee
       Score  : 79 points
       Date   : 2022-08-06 19:35 UTC (3 hours ago)
        
 (HTM) web link (www.chemistryworld.com)
 (TXT) w3m dump (www.chemistryworld.com)
        
       | pcrh wrote:
       | This article makes some good points, but is incorrect on some
       | others.
       | 
       | In particular the statement "It is very, very rare for knowledge
       | of a protein's structure to be any sort of rate-limiting step in
       | a drug discovery project!" does not reflect the realities of drug
       | discovery.
       | 
       | Knowing a protein's structure, and its structure when complexed
       | with ligands/drugs, is a massively important bit of data in the
       | armoury of medicinal chemists, which Derek Lowe knows all too
       | well.
       | 
       | Of course, it may be that knowing which protein to target is more
       | important, and that problem isn't affected by AlphaFold.
        
       | aurizon wrote:
       | I think Alphafold will be of immense value. The article is too
       | pessimistic. Alphafold will reveal a huge number of structural
       | parameters that can be exploited by careful synthetic chemistry.
       | You can know what base pair to change and see if that changes or
       | blocks function. Things like Paxlovid can be tweaked towards an
       | optimum. It is analogous to the way the Rosetta Stone unlocked a
       | few ancient languages. With this tool a huge number of testable
       | structures will be amenable to tweaking, since we now have online
       | ordering of almost any sequence. The tedious step will be the wet
       | testing, but that has been solved by the multiple well test
       | slides. We can now sequence any protein via dye/pore/emf methods.
       | https://en.wikipedia.org/wiki/Nanopore_sequencing
        
       | zmmmmm wrote:
       | This take seems to be somewhat over sceptical and slightly over-
       | reaching to me.
       | 
       | > when your entire computational technique is built on finding
       | analogies to known structures, what can you do when there's no
       | structure to compare to
       | 
       | Lots of people seem focused on the idea that deep networks can't
       | do anything novel and are just like fancy search engines that
       | find a similar example and copy it. This is _not true_. They do
       | learn from much deeper low level structures in the domain they
       | are exposed to. They can be aware of implicit correlations and
       | constraints that are totally outside what may be recognised in
       | the scientific understanding. Hence AlphaFold is quite capable of
       | predicting a structure for which there is no previous direct
       | "analogy". As long as the protein has to follow the laws of
       | physics then AlphaFold has at least a basis to work from in
       | successfully predicting the structure.
       | 
       | > It is very, very rare for knowledge of a protein's structure to
       | be any sort of rate-limiting step in a drug discovery project!
       | 
       | This and the following text are very reductive. It's like saying,
       | back in 1945 that nuclear weapons would not be any sort of
       | advantage in WW2 because it is very rare for weapons of mass
       | destruction to win a war. Well yes it was rare, because they
       | didn't exist. And so too did we not have a meaningfully accurate
       | way to predict protein structures until AlphaFold. We've barely
       | even begun to exploit the possible new opportunities for how to
       | use that. And people have barely scratched the surface in
       | adapting AlphaFold to tackle the related challenges downstream
       | from straight up structure prediction. Predicting formation of
       | complexes and interactions is the obvious next step and it's
       | exactly what people are doing.
       | 
       | It's not to say that it _will_ revolutionise drug development,
       | but the author's argument here is that he is confident it _will
       | not_ and he really doesn't assert much evidence of that.
        
         | ramraj07 wrote:
         | If you're gonna get mad and quote a sentence to rail on the
         | author, at the least quote the full sentence: the author ends
         | it with "and there never will be." Because among other things
         | he's talking about intrinsically disordered proteins[1]. What
         | can the best prediction model do to predict the truly
         | unpredictable? Just tell us that it's unpredictable.
         | 
         | And what is your second criticism exactly? The author comes
         | from the drug discovery industry. What the author said should
         | be generalized to: even if we know the perfect experimentally
         | confirmed 1 Å resolution structure of every protein out there
         | tomorrow, that won't exactly revolutionize drug discovery.
         | That's because protein structure gives maybe 10% of the context
         | you need to successfully design a drug. Its dynamics, higher
         | order interaction specifics, complex interplay in signaling
         | pathways in particular cells in particular contexts, and what
         | entire cells and organ systems in THAT PARTICULAR ORGANISM do
         | when this protein is perturbed are what truly affect drug
         | discovery.
         | 
         | If you absolutely want to revolutionize DD, find us a better
         | model to test things on that's closer to the human body as a
         | whole. Currently mice and rats are used and they're not cutting
         | it anymore.
         | 
         | This fundamentally goes back to the downfall of the
         | prototypical math or software guy trying to come and say "im
         | gonna cure cancer with MATH!" No you're not. You're gonna help,
         | and it's appreciated, but if you're gonna truly cure cancer you
         | better start stomping on a few thousand mice and maybe also get
         | an MD.
         | 
         | 1. https://www.nature.com/articles/nrm3920
        
       | vcdimension wrote:
       | This proposed improvement in the way RCTs are conducted could
       | have a big impact on the speed of drug discovery:
       | https://arxiv.org/pdf/1810.02876.pdf
        
       | bilsbie wrote:
       | > Forming these coils, loops, and sheets is what proteins
       | generally do, but 'why?' doesn't enter into it.
       | 
       | How do we know the model hasn't figured out some of the 'whys'
       | somewhere in there?
        
         | bigdict wrote:
         | Because it learns a conditional distribution. It doesn't work
         | on figuring out why the distribution is the way it is.
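         | 
         | A minimal sketch of that point, using made-up toy data (the
         | motif strings, the labels and the fit_conditional helper are
         | all hypothetical, not anything from AlphaFold): a model can
         | estimate P(structure | sequence) purely by counting, without
         | holding any account of why the counts come out as they do.

```python
from collections import Counter, defaultdict


def fit_conditional(pairs):
    """Estimate P(y | x) from (x, y) pairs by counting co-occurrences."""
    counts = defaultdict(Counter)
    for x, y in pairs:
        counts[x][y] += 1
    # Normalise each row of counts into a probability distribution.
    return {
        x: {y: n / sum(ys.values()) for y, n in ys.items()}
        for x, ys in counts.items()
    }


# Hypothetical observations: (sequence motif, observed local structure).
observations = [
    ("AEEL", "helix"), ("AEEL", "helix"), ("AEEL", "helix"), ("AEEL", "loop"),
    ("VTVT", "sheet"), ("VTVT", "sheet"),
]

model = fit_conditional(observations)
print(model["AEEL"])  # distribution over structures for this motif
```

         | The fitted table predicts well on motifs it has seen, yet
         | nothing in it encodes a mechanism.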
        
         | lrem wrote:
         | Your question fundamentally falls into the area of unanswerable
         | philosophy akin to "do insects feel pain?"
         | 
         | But there's a reasonable intuition suggesting that the answer
         | to your question is "no". What we're looking at is a non-linear
         | regression model reproducing the function (which according to
         | the article isn't really a function, but that's above both my
         | and the model's knowledge) from a gene sequence to a 3d
         | structure. It is heavily meta-optimised, so the "why's" would
         | only be in the model, if reproducing the process of folding the
         | protein was the cheapest way to guess the structure.
         | Intuitively it introduces at least one extra dimension, so
         | should be way more expensive than finding analogues among known
         | sub-aspects of the function. Hence, I would expect none of the
         | "why's" to be in there.
         | 
         | Sadly, if any _insight_ for the "why's" was there after all,
         | we don't have a method to extract it anyway.
         | 
         | Disclaimer: I work at Google, far away from DeepMind, have no
         | internal knowledge on this.
        
           | salty_biscuits wrote:
           | "Sadly, if any insight for the "why's" was there after all,
           | we don't have a method to extract it anyway."
           | 
           | This has been my central frustration with working in ML.
           | People always expect a "why" to exist, and by why I mean a
           | cogent narrative explanation to complex phenomena. Maybe
           | there is no "why" like this for a bunch of physical
           | phenomena, maybe it is just a bunch of low level intricate
           | stuff interacting in complex ways. There might be emergent
           | behaviour that an ML model can usefully predict, and then
           | people get mad because the prediction doesn't
           | solve the real meta problem that they were expecting to solve
           | via the sub problem (e.g. solve folding then get mad because
           | folding itself turns out not to be super useful because we
           | don't know which protein to target, solve image
           | classification then get mad because that doesn't make it easy
           | to make a self driving car, etc, etc). "More is different" is
           | definitely an idea in physics that needs to propagate into
           | other fields to temper our expectations.
        
         | sgt101 wrote:
         | The "why" is a bit of an odd question anyway - the structure is
         | as the structure is. It's like asking why "red hears a
         | galaxy", just words.
        
           | freemint wrote:
           | Well, no. A bunch of mechanisms have models of lower
           | complexity that have almost exactly the same predictive power
           | but a completely different structure. Those higher order
           | structures are the "why".
           | 
           | Why does a cube on an inclined plane start to slide? You
           | act like the correct answer is "because the subatomic
           | particles and space time in the light cone of the experiment
           | made it that way" when one should expect "because the sine of
           | the incline angle times mass times local gravity became
           | bigger than the static friction between a cos(incline angle)
           | times the original cube weight and the surface at no incline"
           | which is a lot simpler.
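           | 
           | As a rough sketch of that simpler model (function names and
           | the friction coefficient are illustrative assumptions): the
           | cube slides once m*g*sin(theta) exceeds mu_s*m*g*cos(theta),
           | so mass and gravity cancel and only the angle and the static
           | friction coefficient mu_s matter.

```python
import math


def starts_to_slide(angle_deg, mu_static):
    """True when the pull along the incline exceeds maximum static friction.

    Condition: m*g*sin(theta) > mu_s * m*g*cos(theta).
    Mass and gravity appear on both sides, so they are omitted.
    """
    theta = math.radians(angle_deg)
    return math.sin(theta) > mu_static * math.cos(theta)


def critical_angle_deg(mu_static):
    """Angle at which sliding begins: tan(theta) = mu_s."""
    return math.degrees(math.atan(mu_static))


# Assumed coefficient of static friction, chosen only for illustration.
MU = 0.5
print(starts_to_slide(20, MU), starts_to_slide(30, MU))
print(round(critical_angle_deg(MU), 1))
```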
        
         | pelorat wrote:
         | It's probably not even possible for a human to understand the
         | "why"
        
       | summerlight wrote:
       | The modern world is not simple enough to allow a single
       | paper or technology to revolutionize anything. I don't understand
       | why people are reiterating this obvious fact over and over. Most
       | of the technological breakthroughs are usually a culmination of
       | decades of research and investments.
        
         | evouga wrote:
         | My observation is that breathless hype pieces proclaiming that
         | a new technology will imminently revolutionize area X outnumber
         | the articles expressing common-sense skepticism about the
         | technology, by two to three orders of magnitude.
        
       | xiphias2 wrote:
       | While the article is correct that knowing the protein structures
       | in itself is not that interesting, it's a prerequisite step to
       | predicting interactions between proteins, which is super
       | interesting for drug discovery.
       | 
       | What's encouraging is the rate of progress, not what has already
       | been done.
        
       | AlbertCory wrote:
       | In the early 2000s, I took a bunch of UCSC Extension courses on
       | mol bio, bioinformatics, and drug discovery. Back then, abundant
       | DNA information was the thing revolutionizing the field.
       | 
       | What the scientists (all from Roche) said was, more or less,
       | "yeah, that helps a lot. It doesn't solve the whole problem,
       | though."
       | 
       | 20 years later they've gotten yet more help with Alphafold. Once
       | again, they can do things faster, but it isn't a Moore's Law-type
       | of change. It's still a really hard problem demanding culture,
       | animal, and human tests, and those take time and money.
        
       | aabhay wrote:
       | The author doesn't answer the question. If not this, then what
       | will? Because as far as I can tell, we're nowhere near extracting
       | the full value of AI-generated protein structures. Why plant this
       | flag and be wrong later if you have no real idea of what should
       | be done instead?
        
         | ChrisRackauckas wrote:
         | The questions he asks here are exactly the questions that
         | quantitative systems pharmacology (QSP) seeks to answer (and as
         | a result, it's booming as a field). Just because you can build
         | a drug to inactivate said protein doesn't mean you should. 85%
         | of clinical trials fail as he states, and one of the main
         | reasons is that the target ends up being incorrect.
         | Targeting some protein because a lot of it seems to exist when
         | a given disease is occurring might end up targeting the symptom
         | instead of the cause. Understanding how the complex systems
         | interact, their feedbacks and their nonlinearities, is
         | essential to knowing what needs to be targeted. We had already
         | been able to quickly create new drug candidates, and with
         | protein folding predictions we can now do that even faster.
         | Those drugs can be tested in a lab to see if they bind to the
         | proteins they're supposed to, and they keep getting quicker at
         | hitting exactly the function they expected. But without making
         | the billion dollar clinical trial more likely to be solving the
         | actual problem, we're still going to be limited by "okay, so
         | what in this pool of possible drugs should we risk trying
         | next"? We can accurately knock out protein function, but we're
         | still fishing in the dark when it comes to how to actually fix
         | and regulate bodies.
        
         | echelon wrote:
         | Because we have to be honest with ourselves. Don't tell the
         | crystallographers they're no longer necessary for structure
         | determination. If people flee the field and ML doesn't pan out,
         | then we're worse off.
         | 
         | Treat this as it is. An exciting approach that may help some
         | now and yield fantastic results in the future. Don't count the
         | chickens before they hatch.
         | 
         | Even if the structures were entirely correct - and they're
         | definitely not - there's a massive complex metabolome to figure
         | out.
         | 
         | Google is certainly milking the PR as much as they can, and
         | that can be dangerous to the laymen approving research budgets.
        
           | tigershark wrote:
           | Alphafold demonstrated beyond any reasonable doubt that
           | crystallography by itself is useless in certain
           | circumstances. There are plenty of research groups working on
           | crystallography that found the correct solution only by
           | combining their data with Alphafold data. In the last
           | competition, if I remember correctly, there was one protein
           | that escaped crystallography for many years until they used
           | an Alphafold predicted structure. I'm not really sure how you
           | can simply discount these really groundbreaking results when
           | crystallography provided far fewer wins in many more years.
        
             | l33tman wrote:
             | You are aware that almost all known protein structures
             | come from crystallography?
        
             | pas wrote:
             | How much of the AlphaFold training data is from
             | crystallography results?
        
             | evouga wrote:
             | The crystallographically-determined structures are the
             | _ground truth_! [1]
             | 
             | Saying that AlphaFold makes x-ray crystallography useless
             | is like saying DALL-E makes photography useless or Copilot
             | makes GitHub useless. You've got the dependency chain
             | backwards.
             | 
             | [1] (Or at least, they're treated as the ground truth---
             | they don't necessarily predict the conformation of proteins
             | in solution, but that's a separate topic for another
             | thread).
        
         | freemint wrote:
         | Near real time (max 100 times slower than real time)
         | differentiable, stochastic multi organ simulations with
         | chemically accurate time and environment depending dynamic
         | structure changes at all possible binding targets or
         | interactions with the body's own components and third party
         | drugs.
         | 
         | Without machine learning, at "every atom is dynamic" precision,
         | we are at 10^-18 L (liters) at 20 microseconds a week with a
         | specialised supercomputer
         | https://dl.acm.org/doi/abs/10.1145/3458817.3487397 .
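         | 
         | A back-of-the-envelope sketch of what those figures imply,
         | taking the 20 microseconds of simulated time per week and the
         | 100x slowdown target at face value (the constant and function
         | names are made up for illustration):

```python
SECONDS_PER_WEEK = 7 * 24 * 3600   # wall-clock seconds in one week
SIMULATED_PER_WEEK = 20e-6         # simulated seconds per week, as quoted above
TARGET_SLOWDOWN = 100              # "max 100 times slower than real time"


def slowdown_factor():
    """How many times slower than real time the quoted simulation runs."""
    return SECONDS_PER_WEEK / SIMULATED_PER_WEEK


def years_to_simulate(simulated_seconds):
    """Wall-clock years needed to cover a span of simulated time."""
    weeks = simulated_seconds / SIMULATED_PER_WEEK
    return weeks * 7 / 365.25


print(f"slowdown vs real time: {slowdown_factor():.3e}x")
print(f"gap to the 100x target: {slowdown_factor() / TARGET_SLOWDOWN:.3e}x")
print(f"years to simulate 1 s: {years_to_simulate(1.0):.0f}")
```

         | On these assumptions the gap to the stated target is around
         | eight orders of magnitude, for a volume of only 10^-18 L.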
         | 
         | A solution does not need that precision everywhere. However a
         | machine learning proxy of such precision in every relevant
         | environment including 2d surface along non mixing fluid etc for
         | every likely type of interaction is required so we can be
         | certain of the possible outcomes.
         | 
         | That would allow humanity to pre-screen a bunch of edge
         | conditions and check for unintended or previously unexplained
         | side effects. The derived surrogates for environment dependent
         | reaction rates could be used in a spatially distributed event
         | based simulations with level of precision ranging from atoms
         | with position and electrons in orbits subject to electro-
         | magnetic force interaction, molecules as things with position
         | and rotation and folding state, concentration gradients of
         | those as stochastic 3d PDEs, 2d PDEs, 1d PDEs and ODEs of the
         | number of moles with relevant boundary conditions. If we had
         | those reaction rates down and knew of all the proteins and
         | other structures I am positive that a proxy model of relevant
         | parts of the human body could achieve enough accuracy to be
         | practical at pre-screening drugs with today's supercomputers.
        
       | dekhn wrote:
       | To revolutionize drug discovery, you need to solve a number of
       | problems that ML can't really address right now.
       | 
       | We do not have well-formed theories of the molecular details of
       | many diseases. There is no immediate computational approach that
       | addresses this defect. The community has had fairly simplified
       | models for some time, and there's a lot of historical belief that
       | by knowing protein structures in detail, we can understand the
       | nature of a disease through its molecular etiology, and from
       | that, we can make drugs (either small molecules or biomolecules)
       | that modulate proteins in rational ways to eliminate the disease
       | with a minimum of side effects.
       | 
       | In my mind, much of the problem is similar to modern deep
       | learning compared to previous techniques. Several extremely
       | challenging problems (high accuracy voice recognition, image
       | recognition, object detection) simply were not solvable through
       | the statistical techniques and mental models adopted by the
       | practitioners. It is now abundantly obvious that stupidly simple
       | deep networks can be pretrained on enormous amounts of labelled
       | data, or even unlabelled data, but we didn't even have the
       | ability to know this confidently until we had the right network
       | architectures, enough high quality labelled data, and adequate
       | compute power to train them.
       | 
       | I believe that by starting to think about disease modelling from
       | the same mindset as deep learning (simple models with many
       | parameters, the models don't actually represent the assumed
       | mechanism, large amounts of high quality data, lots of CPU, GPU,
       | and RAM) and also thinking of the disease treatment process in
       | the same way will greatly increase our ability to "understand"
       | and "treat" diseases, while knowing far less about their
       | underlying mechanism than we thought.
       | 
       | A common example is disease/patient stratification. If you've
       | developed drugs that treat disease A, but it turns out later,
       | there are really two diseases, A1 and A2 with different
       | underlying mechanisms but superficially similar exterior
       | symptoms, you'll realize why some percentage of your population
       | didn't get better (and often got worse, given the underlying
       | toxicity of some medicines). If we could just stratify diseases
       | better, and classify patients into the right bins, the
       | effectiveness of medicine will go up (and drugs will get through
       | clinical trials faster/better).
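       | 
       | A small sketch of the stratification arithmetic, with entirely
       | hypothetical shares and response rates: if a drug works on
       | subtype A1 but barely on A2, the trial-wide response rate is a
       | weighted average that understates how well it works on the
       | patients it actually suits.

```python
def overall_response_rate(strata):
    """Weighted average response rate.

    `strata` is a list of (population share, response rate) pairs.
    """
    total_share = sum(share for share, _ in strata)
    return sum(share * rate for share, rate in strata) / total_share


# Hypothetical numbers: 60% of patients have A1 (70% respond),
# 40% have A2 (5% respond).
a1 = (0.6, 0.70)
a2 = (0.4, 0.05)

unstratified = overall_response_rate([a1, a2])
stratified_a1 = overall_response_rate([a1])

print(f"unstratified trial: {unstratified:.2f}")
print(f"A1-only trial:      {stratified_a1:.2f}")
```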
       | 
       | None of this addresses the later-stage issues, such as
       | successfully running all the phases of a clinical trial and the
       | other gauntlets you must pass in order to get a drug FDA-
       | approved.
       | 
       | I would continue to expect marginal improvements for the
       | foreseeable future. But be aware: some companies already have
       | managed to do a good enough job developing new medicines that
       | they routinely create multi-billion-dollar blockbuster drugs year
       | after year after year (my employer, Genentech, is a perfect
       | example of that). It maintains an enormous and well-funded R&D
       | arm that expends untold neurons attempting to understand
       | disease better even before we start to consider something as
       | "druggable".
        
       | curious_cat_163 wrote:
       | Mapping DNA sequences to 3D protein structure is the problem that
       | AlphaFold tries to solve. I don't think it tries to solve for
       | "drug discovery".
       | 
       | I suspect that, like any ML problem, this one is a small part of
       | the whole solution of drug discovery. There are always system-
       | level dynamics at play.
       | 
       | To me, some relevant questions before deciding to take on an ML
       | problem tend to be:
       | 
       | [x] Does solving it eliminate manual labor from the process?
       | [x] Does it save $ in the progress towards solving the whole
       |     problem?
       | [x] Is it fun to solve it?
        
       | microSnowball wrote:
       | I think alphafold gets hated on too much. It won't revolutionize
       | things but I bet people are out there right now looking at
       | different structures and motifs only seen on alphafold to get a
       | better idea on how existing drugs bind and affect them. And then
       | designing analogues and so on. Time will tell, I guess.
       | 
       | It's kind of like anything in research, lots of small steps
       | enable revolutionary breakthroughs every so often.
        
         | fabian2k wrote:
         | You can assume that any known drug target has experimentally
         | determined structures available; once you spend the enormous
         | amounts of effort necessary to put a drug through real clinical
         | trials the effort to determine the target structure is pretty
         | much irrelevant.
         | 
         | Of course there are plenty of drugs where we either don't know
         | where they bind or we're probably wrong about where we think
         | they bind. Or they bind at multiple places and some desirable
         | or non-desirable effects are due to binding at places we don't
         | know yet.
         | 
         | There are real uses to having lots of high-quality structure
         | predictions for proteins. Drug development is something that
         | only gets limited benefits here. If you want to know how drugs
         | or drug candidates bind to proteins you first create a protein
         | structure with X-ray crystallography. Then you soak your
         | crystals with your drugs or drug candidates and determine even
         | more structures. The interesting part here is not necessarily
         | the overall fold of the protein (which is mostly what AlphaFold
         | gives you) but e.g. a single hydrogen bond to the drug in the
         | active pocket of the target protein. You need really high-
         | quality data if you want to do any kind of rational drug
         | design; most of the time we still just semi-randomly vary
         | structures until they bind better as far as I understand.
        
         | epistasis wrote:
         | I think it gets marketed too much and hated on too much.
         | 
         | Given the utter dominance of Google advertising, I think the
         | hating is a necessary counter in order to at least place it in
         | its right place.
         | 
         | Whatever skill Google has computationally is more than matched
         | by their media dominance and public relations prowess.
        
           | mtlmtlmtlmtl wrote:
           | I find this view very strange. If you apply the same logic to
           | politics, the outcome is pretty grim. And we've been seeing
           | more and more of that.
           | 
           | I don't like hype or hate that's devoid of nuance. But actual
           | scientists working in these fields don't generally pay
           | attention to these things as much as we might. They read the
           | papers, and they have years of training to help them decide
           | what is overhyped and what isn't. I'm not sure what happens
           | on HN or in advertising channels has such a huge bearing on
           | this.
        
         | [deleted]
        
       | lrem wrote:
       | I _love_ this article. It nicely answers the question I posed
       | (https://news.ycombinator.com/threads?id=lrem#32263287) in the
       | discussion of the original announcement: is today's db good
       | enough to be a breakthrough for something useful, e.g. pharma or
       | agriculture? And the answer, somewhat unsurprisingly, seems to be
       | "useful, but not life-changing". And that's a perfectly good
       | result in my eyes :)
        
         | frozencell wrote:
         | OpenAI's GPT-3 and DALL-E 2 might be life-changing for their
         | creative users, writers and illustrators or beginner creators,
         | I can't remember any life-changing use case for groups (outside
         | of the creators themselves). For ML researchers, transformers
         | seem to not be used as AGI at all (despite general or multi-
         | modal potential) but mostly used for test and probability tool.
        
           | p1esk wrote:
           | _For ML researchers, transformers seem to not be used as AGI
           | at all (despite general or multi-modal potential) but mostly
           | used for test and probability tool._
           | 
           | What do you mean?
        
         | SilasX wrote:
         | That link goes to the top of your comment history. I think you
         | want this link to ensure you see the right comment:
         | 
         | https://news.ycombinator.com/item?id=32263287
        
       ___________________________________________________________________
       (page generated 2022-08-06 23:00 UTC)