[HN Gopher] AlphaFold reveals the structure of the protein universe
       ___________________________________________________________________
        
       AlphaFold reveals the structure of the protein universe
        
       Author : MindGods
       Score  : 376 points
       Date   : 2022-07-28 11:18 UTC (5 hours ago)
        
 (HTM) web link (www.deepmind.com)
 (TXT) w3m dump (www.deepmind.com)
        
       | codedokode wrote:
       | Today I learned that there are bacteria that have a protein
       | helping to form ice on plants [1] to destroy them and extract
       | nutrients (however I didn't understand how bacteria themselves
       | survive this).
       | 
       | Machine learning typically uses existing data to predict new
       | data. Please explain: Does it mean that AlphaFold can only use
       | known types of interactions between atoms and will mispredict the
       | structure of proteins that use not yet known interactions?
       | 
       | And why we cannot just simulate protein behaviour and
       | interactions using quantum mechanics?
       | 
       | [1] https://pubs.acs.org/doi/10.1021/acs.jpcb.1c09342
        
         | flobosg wrote:
         | > And why we cannot just simulate protein behaviour and
         | interactions using quantum mechanics?
         | 
         | QM calculations have been done in proteins, but they're
         | computationally very expensive. IIRC, there are hybrid
         | approaches where only a small portion of interest in the
         | protein structure is modelled by QM and the rest by classical
         | molecular mechanics.
        
         | beanwood wrote:
         | >And why we cannot just simulate protein behaviour and
         | interactions using quantum mechanics?
         | 
         | If you wanted to simulate the behaviour of an entire protein
         | using quantum mechanics, the sheer number of calculations
         | required would be infeasible.
         | 
         | For what it's worth, I have a background in computational
         | physics and am studying for a PhD in structural biology. For any
         | system (of any size) that you want to simulate, you have to
         | consider how much information you're willing to 'ignore' in
         | order to focus on the information you would like to 'get out'
         | of a set of simulations. Being aware of the approximations you
         | make and how this impacts your results is crucial.
         | 
         | For example, if I am interested in how the electrons of a group
         | of Carbon atoms (radius ~ 170 picometres) behave, I may want to
         | use Density Functional Theory (DFT), a quantum mechanical
         | method.
         | 
         | For a single, small protein (e.g. ubiquitin, radius ~ 2
         | nanometres), I may want to use atomistic molecular dynamics
         | (AMD), which models the motion of every single atom in response
         | to thermal motion, electrostatic interactions, etc using
         | Newton's 2nd law. Electron/proton detail has been approximated
         | away to focus on overall atomic motion.
         | 
         | In my line of work, we are interested in how big proteins (e.g.
         | the dynein motor protein, ~ 40 nanometres in length) move
         | around and interact with other proteins at longer time (micro-
         | to millisecond) and length (nano- to micrometre) scales than
         | DFT or AMD. We 'coarse-grain' protein structures by
         | representing groups of atoms as tetrahedra in a continuous mesh
         | (continuum mechanics). We approximate away atomic detail to
         | focus on long-term motion of the whole protein.
         | 
         | Clearly, it's not feasible to calculate the movement of dynein
         | for hundreds of nanoseconds using DFT! The motor domain alone
         | in dynein contains roughly one million atoms (and it has
         | several more 'subunits' attached to it). Assuming these are
         | mostly Carbon, Oxygen or Nitrogen, then you're looking at
         | around ten million electrons in your DFT calculations, for a
         | single step in time (rounding up). If you're dealing with the
         | level of atomic bonds, you're probably going to use time
         | steps between a femto- (10^-15 s) and a picosecond (10^-12 s). The
         | numbers get a bit ridiculous. There are techniques that combine
         | QM and AMD, although I am not too knowledgeable in this area.
         | 
         | Some further reading, if you're interested (I find Wikipedia
         | articles on these topics to generally be quite good):
         | 
         | DFT: https://en.wikipedia.org/wiki/Density_functional_theory
         | 
         | Biological continuum mechanics:
         | https://doi.org/10.1371/journal.pcbi.1005897
         | 
         | Length scales in biological simulations:
         | https://doi.org/10.1107/S1399004714026777
         | 
         | Electronic time scales:
         | https://www.pnas.org/doi/10.1073/pnas.0601855103
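
The scale argument in the comment above can be sketched as simple
arithmetic. The atom count, femtosecond time step and 100 ns duration
come from the comment; the average of 7 electrons per atom (C, N, O
only, which the comment rounds up to ten million in total), the
100-electron reference system, and the cubic cost scaling often quoted
for conventional DFT are assumptions made here for illustration only.

```python
# Back-of-the-envelope estimate. Every input is an assumption for illustration.
ATOMS = 1_000_000                     # dynein motor domain, figure from the comment
ELECTRONS_PER_ATOM = 7                # assumed average for C (6), N (7) and O (8)
TIME_STEP_FS = 1                      # one femtosecond per step
SIMULATED_TIME_FS = 100 * 1_000_000   # 100 ns expressed in femtoseconds

electrons = ATOMS * ELECTRONS_PER_ATOM
steps = SIMULATED_TIME_FS // TIME_STEP_FS

# Assumed cubic scaling: cost of one step relative to a hypothetical
# 100-electron system that is comfortably within reach of DFT.
REFERENCE_ELECTRONS = 100
relative_cost_per_step = (electrons // REFERENCE_ELECTRONS) ** 3

print(f"electrons in the system:       {electrons:.1e}")
print(f"time steps for 100 ns:         {steps:.1e}")
print(f"cost per step vs. reference:   {relative_cost_per_step:.1e}")
```

Under these assumptions both factors multiply: each step is many orders
of magnitude more expensive than a routine calculation, and there are a
hundred million steps to take.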
        
       | kache_ wrote:
       | This is an incredible gift to humanity. A huge positive impact.
       | The team should be proud
        
       | carbocation wrote:
       | The press release is a bit difficult to place into historical
       | context. I believe that the first AlphaFold release was mostly
       | human and mouse proteins, and this press release marks the
       | release of structures for additional species.
        
         | azangru wrote:
         | > I believe that the first AlphaFold release was mostly human
         | and mouse proteins,
         | 
         | More than that. The press release actually contains an
         | infographic comparing the amount of published protein models
         | for different clades of organisms. The infographic shows that
         | the previous release (~1mln proteins) contained proteins of
         | some animal, plant, bacterial, and fungal species.
        
       | cm2187 wrote:
       | How do you know that the predicted structure will be correct? I
       | presume researchers will need to validate the structure
       | empirically. Do we know how good the model has been at predicting
       | so far?
        
       | donut2d wrote:
       | A fun way I've been thinking about all this is what
       | nanotech/nanobots are actually going to look like. Tiny little
       | protein machines doing what they've been doing since the dawn of
       | life. We now have a library of components, and as we start
       | figuring out what they can do, and how to stack them, we can
       | start building truly complex machinery for whatever crazy tasks
       | we can imagine. The impact goes so far beyond drugs and
       | treatments.
        
       | candiddevmike wrote:
       | Is folding@home obsolete now?
        
         | foxhop wrote:
         | I would say no, the two approaches may be used to validate each
         | other.
        
         | flobosg wrote:
         | Folding@home answers a related but different question. While
         | AlphaFold returns the picture of a folded protein in its most
         | energetically stable conformation, Folding@home returns a video
         | of the protein undergoing folding, traversing its energy
         | landscape.
        
         | dekhn wrote:
         | It's not, but the question is (and has long been) whether the
         | energy expended by folding@home is worth the scientific result.
         | IMHO- probably not.
        
         | seventytwo wrote:
         | Good question... I'd imagine that other methods of folding
         | solutions are still valuable, because AlphaFold needs to be
         | checked.
        
       | klemola wrote:
       | As an aside, the protein structure visualizations in the article
       | are pretty. Is there a good source for more?
        
         | alphabetting wrote:
         | https://alphafold.ebi.ac.uk/
        
         | flobosg wrote:
         | * https://pdb101.rcsb.org/motm/
         | 
         | * https://ccsb.scripps.edu/goodsell/
         | 
         | * https://pdb101.rcsb.org/sci-art/geis-archive/irving-geis
         | 
         | * https://www.digizyme.com/portfolio.html
         | 
         | * https://www.drewberry.com/
         | 
         | * https://biochem.web.utah.edu/iwasa/projects.html
         | 
         | * http://onemicron.com/
         | 
         | * The art of Jane Richardson, of which I couldn't find a link
         | 
         | * This blog has plenty of good links:
         | https://blogs.oregonstate.edu/psquared/
        
       | yuan43 wrote:
       | > Today, I'm incredibly excited to share the next stage of this
       | journey. In partnership with EMBL's European Bioinformatics
       | Institute (EMBL-EBI), we're now releasing predicted structures
       | for nearly all catalogued proteins known to science, which will
       | expand the AlphaFold DB by over 200x - from nearly 1 million
       | structures to over 200 million structures - with the potential to
       | dramatically increase our understanding of biology.
       | 
       | And later:
       | 
       | > Today's update means that most pages on the main protein
       | database UniProt will come with a predicted structure. All 200+
       | million structures will also be available for bulk download via
       | Google Cloud Public Datasets, making AlphaFold even more
       | accessible to scientists around the world.
       | 
       | This is the actual announcement.
       | 
       | UniProt is a large database of protein structure and function.
       | The inclusion of the predicted structures alongside the
       | experimental data makes it easier to include the predictions in
       | workflows already set up to work with the other experimental and
       | computed properties.
       | 
       | It's not completely clear from the article whether any of the
       | 200+ million predicted structures deposited to UniProt have not
       | been previously released.
       | 
       | Protein structure determines function. Before AlphaFold,
       | experimental structure determination was the only option, and
       | that's very costly. AlphaFold's predictions appear to be good
       | enough to jumpstart investigations without an experimental
       | structure determination. That has the potential to accelerate
       | many areas of science and could percolate up to therapeutics.
       | 
       | One area that doesn't get much discussion in the press is the
       | difference between solid state structure and solution state
       | structure. It's possible to obtain a solid state structure
       | determination (x-ray) that has nothing to do with actual behavior
       | in solution. Given that AlphaFold was trained to a large extent
       | on solid state structures, it could be propagating that bias into
       | its predicted structures.
       | 
       | This paper talks about that:
       | 
       | > In the recent Critical Assessment of Structure Prediction
       | (CASP) competition, AlphaFold2 performed outstandingly. Its worst
       | predictions were for nuclear magnetic resonance (NMR) structures,
       | which has two alternative explanations: either the NMR structures
       | were poor, implying that Alpha-Fold may be more accurate than
       | NMR, or there is a genuine difference between crystal and
       | solution structures. Here, we use the program Accuracy of NMR
       | Structures Using RCI and Rigidity (ANSURR), which measures the
       | accuracy of solution structures, and show that one of the NMR
       | structures was indeed poor. We then compare Alpha-Fold
       | predictions to NMR structures and show that Alpha-Fold tends to
       | be more accurate than NMR ensembles. There are, however, some
       | cases where the NMR ensembles are more accurate. These tend to be
       | dynamic structures, where Alpha-Fold had low confidence. We
       | suggest that Alpha-Fold could be used as the model for NMR-
       | structure refinements and that Alpha-Fold structures validated by
       | ANSURR may require no further refinement.
       | 
       | https://pubmed.ncbi.nlm.nih.gov/35537451/
        
         | sseagull wrote:
         | > Before AlphaFold, experimental structure determination was
         | the only option
         | 
         | Other computational methods have existed for a long time.
         | Folding@home was founded 22 years ago.
        
           | dekhn wrote:
           | folding@home doesn't predict structures, it simulates protein
           | folding. Different area with some overlap.
        
       | COGlory wrote:
       | Before my comment gets dismissed, I will disclaim I am a
       | professional structural biologist that works in this field every
       | day.
       | 
       | These threads are always the same: lots of comments about protein
       | folding, how amazing DeepMind is, how AlphaFold is a success
       | story, how it has flipped an entire field on its head, etc. The
       | language from Google is so deceptive about what they've actually
       | done, I think it's actually intentionally disingenuous.
       | 
       | At the end of the day, AlphaFold is amazing homology modeling. I
       | love it, I think it's an awesome application of machine learning,
       | and I use it frequently. But it's doing the same thing we've been
       | doing for 2 decades: pattern matching sequences of proteins with
       | unknown structure to sequences of proteins with known structure,
       | and about 2x as well as we used to be able to.
       | 
       | That's extremely useful, but it's not knowledge of protein
       | folding. It can't predict a fold de novo, it can't predict folds
       | that haven't been seen (EDIT: this is maybe not strictly true,
       | depending on how you slice it), it fails in a number of edge
       | cases (remember, in biology, edge cases are everything) and
       | again, I can't stress this enough, we have no new information on
       | how proteins fold. We know all the information (most of it, at
       | least) for a protein's final fold is in the sequence. But we don't know
       | much about the in-between.
       | 
       | I like AlphaFold, it's convenient and I use it (although for
       | anything serious or anything interacting with anything else, I
       | still need a real structure), but I feel as though it has been
       | intentionally and deceptively oversold. There are 3-4 other deep
       | learning projects I think have had a much greater impact on my
       | field.
       | 
       | EDIT: See below: https://news.ycombinator.com/item?id=32265662
       | for information on predicting new folds.
        
         | flobosg wrote:
         | > AlphaFold is amazing homology modeling
         | 
         | If it is homology modelling, then how can it work without input
         | template structures?
        
           | COGlory wrote:
           | It has template structures. AlphaFold uses the following
           | databases:
           | 
           | * BFD
           | 
           | * MGnify
           | 
           | * PDB70
           | 
           | * PDB (structures in the mmCIF format)
           | 
           | * PDB seqres - only for AlphaFold-Multimer
           | 
           | * Uniclust30
           | 
           | * UniProt - only for AlphaFold-Multimer
           | 
           | * UniRef90
        
             | flobosg wrote:
             | Those databases are used to derive the evolutionary
             | couplings and distance matrices used by the algorithm.
             | Several of those databases aren't even structural ones.
             | Furthermore, AlphaFold can function with only a MSA as an
             | input, without retrieving a single PDB coordinate.
        
               | COGlory wrote:
               | It's all about boosting signal by finding other proteins
               | that are similar, until you get to the point that you can
               | identify a fold to assign to a region of the protein.
               | That's why some are structural, and some are not.
               | 
               | >Furthermore, AlphaFold can function with only a MSA as
               | an input, without retrieving a single PDB coordinate.
               | 
               | Yes, it has a very nice model of what sequences should
               | look like in 3D. That model is derived from experimental
               | data. So if I give AlphaFold an MSA of a new, unknown
               | protein fold (substantively away from any known fold), it
               | cannot predict it.
        
               | flobosg wrote:
               | > Yes, it has a very nice model of what sequences should
               | look like in 3D.
               | 
               | A structural model, you would say.
               | 
               | > That model is derived from experimental data.
               | 
               | That doesn't make it a template-based model, or a
               | homology one.
               | 
               | > if I give AlphaFold an MSA of a new, unknown protein
               | fold (substantively away from any known fold), it cannot
               | predict it
               | 
               | That will depend on the number of effective sequences
               | found to derive couplings. Domains with novel folds
               | usually have a low number of remotely homologous sequences
               | and for that reason the method will fail, not just
               | because they are novel.
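
A minimal sketch of what "number of effective sequences" can mean in
practice. It uses one common convention (each row of the alignment is
down-weighted by the number of rows at least 80% identical to it); the
threshold, the function names and the toy alignment are assumptions
for illustration and are not taken from AlphaFold itself. Gap
characters are treated like any other symbol to keep the example short.

```python
def identity(a, b):
    """Fraction of aligned columns at which two equal-length rows agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def effective_sequences(msa, threshold=0.8):
    """Sum of weights 1/n_i, where n_i counts the rows that are at least
    `threshold` identical to row i (row i itself included)."""
    total = 0.0
    for row in msa:
        neighbours = sum(identity(row, other) >= threshold for other in msa)
        total += 1.0 / neighbours
    return total


# Hypothetical toy alignment: three near-duplicates and one unrelated row.
toy_msa = [
    "MKTAYIAKQR",
    "MKTAYIAKQR",  # exact duplicate of the first row
    "MKTAYIAKQK",  # differs from the first row in one column
    "GSHWLEPDNV",  # shares no column with the others
]
neff = effective_sequences(toy_msa)
print(len(toy_msa), "rows,", neff, "effective sequences")
```

The three near-identical rows together count as roughly one sequence, so
a deep alignment of close relatives can still carry little coupling
signal.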
        
         | zack-m wrote:
         | Yup. It's great, but there are still many aspects to unpack and
         | work on. Hence why Rosetta is a thing.
        
           | flobosg wrote:
           | Rosetta methods are also moving towards ML. Here's an article
           | from last week:
           | https://www.science.org/doi/10.1126/science.abn2100
        
         | [deleted]
        
         | Lloydksk wrote:
         | I mean like whats this about AlphaFold is gone
        
         | adamsmith143 wrote:
         | > it can't predict folds that haven't been seen
         | 
         | This seems strange to me. The entire point of these types of
         | models is to predict things on unseen data. Are you saying
         | Deepmind is completely lying about their model?
         | 
         | Deepmind solved CASP, isn't the entire point of that
         | competition to predict unseen structures?
         | 
         | If AlphaFold doesn't predict anything then what are you using
         | it to do?
        
           | COGlory wrote:
           | AlphaFold figures out that my input sequence (which has no
           | structural data) is similar to this other protein that has
           | structural data. Or maybe different parts of different
           | proteins. It does this extremely well.
        
             | flobosg wrote:
             | This is a gross misrepresentation of the method.
        
               | COGlory wrote:
               | Perhaps you'd care to explain how? AlphaFold does not
               | work on new folds. It ultimately relies on mapping
               | sequence to structure. It does it better than anyone
               | else, and in ways a human probably couldn't, but if you
               | give it a brand new fold with no relation to other folds,
               | it cannot predict it. I routinely see areas of extremely low
               | confidence in many of my AlphaFold models. I work in
               | organisms that have virtually 0 sequence identity. This
               | is a problem I deal with every day. I wish AlphaFold
               | worked in the way you are suggesting, but it just flat
               | out does not.
        
               | dekhn wrote:
               | No organisms have virtually 0 sequence identity. That's
               | nonsense. Can you give an example? Even some random
               | million-year-isolated archaeon shares the majority of its
               | genes with common bacteria.
        
               | flobosg wrote:
               | > It ultimately relies on mapping sequence to structure.
               | 
               | So does every structural prediction method.
               | 
               | > if you give it a brand new fold with no relation to
               | other folds, it cannot predict it
               | 
               | That will depend on the number of effective sequences,
               | not the actual fold.
               | 
               | > I work in organisms that have virtually 0 sequence
               | identity.
               | 
               | Then the problem is low sequence coverage, not the
               | protein fold. On a side note, there are sensitive
               | homology search protocols that rely very little on actual
               | sequence identity.
        
         | [deleted]
        
         | mupuff1234 wrote:
         | > There are 3-4 other deep learning projects I think have had a
         | much greater impact on my field.
         | 
         | Don't leave us hanging... which projects?
        
           | COGlory wrote:
           | 1) Isonet - takes low SNR cryo-electron tomography images
           | (that are extremely dose limited, so just incredibly blurry
           | and frequently useless) and does two things:
           | 
           | * Deconvolutes some image aberrations and "de-noises" the
           | images
           | 
           | * Compensates for missing wedge artifacts (the missing wedge
           | arises because the tomography isn't done from -90deg to
           | +90deg, but usually from -60deg to +60deg instead, leaving a
           | 30deg wedge on the top and bottom with basically no
           | information), which usually show up as some sort of
           | directionality in image density. So if you have a sphere, the
           | top and bottom will be extremely noisy and stretched up and
           | down (in Z).
           | 
           | https://www.biorxiv.org/content/10.1101/2021.07.17.452128v1
           | 
           | 2) Topaz, but topaz really counts as 2 or 3 different
           | algorithms. Topaz has denoising of tomograms and of flat
           | micrographs (i.e. images taken with a microscope, as opposed
           | to 3D tomogram volumes). That denoising is helpful because it
           | increases contrast (which is the fundamental problem in Cryo-
           | EM for looking at biomolecules). Topaz also has a deep
           | learning particle picker which is good at finding views of
           | your protein that are under-represented, or otherwise
           | missing, which again, normally results in artifacts when you
           | build your 3D structure.
           | 
           | https://emgweb.nysbc.org/topaz.html
           | 
           | 3) EMAN2 convolutional neural network for tomogram
           | segmentation/Amira CNN for segmentation/flavor of the week
           | CNN for tomogram segmentation. Basically, we can get a 3D
           | volume of a cell or virus or whatever, but then they are
           | noisy. To do anything worthwhile with it, even after
           | denoising, we have to say "this is cell membrane, this is
           | virus, this is nucleic acid" etc. CNNs have proven to be
           | substantially better at doing this (provided you have an
           | adequate "ground truth") than most users.
           | 
           | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5623144/
        
         | [deleted]
        
         | dekhn wrote:
         | I've directly told the leaders of CASP and people at DM
         | that they should stop representing this as a form of protein
         | folding and just call it "crystal/cryoEM structure prediction"
         | (they filter out all the NMR structures from PDB since they
         | aren't good for prediction). They know it's disingenuous and
         | they do it on purpose to give it more impact than it really
         | deserves.
         | 
         | I would like to correct something here- it does predict
         | structures de novo and predict folds that haven't been seen
         | before. That's because of the design of the NN- it uses
         | sequence information to create structural constraints. If those
         | constraints push the modeller in the direction of a novel fold,
         | it will predict that.
         | 
         | To me what's important about this is that it demonstrated the
         | obvious (I predicted this would happen eventually, shortly
         | after losing CASP in 2000).
        
           | COGlory wrote:
           | >I would like to correct something here- it does predict
           | structures de novo and predict folds that haven't been seen
           | before. That's because of the design of the NN- it uses
           | sequence information to create structural constraints. If
           | those constraints push the modeller in the direction of a
           | novel fold, it will predict that.
           | 
           | Could you expand on this? Basically it looks at the data, and
           | figures out what's an acceptable position in 3D space for
           | residues to occupy, based on what's known about other
           | structures?
           | 
           | I will update my original post to point out I may not be
           | entirely correct there.
           | 
           | The distinction I'm trying to make is that there's a
           | difference between looking at pre-existing data and modeling
           | (ultimately homology modeling, but maybe slightly different)
           | and understanding how protein folding works, being able to
           | predict de novo how an amino acid sequence will become a 3D
           | structure.
           | 
           | Also thank you for contacting CASP about this.
        
             | dekhn wrote:
             | From what I can tell, the model DM built is mining subtle
             | relationships between aligned columns of multiple sequence
             | alignments and any structural information which is tangibly
             | related to those sequences. Those relationships can be used
             | to infer rough atomic distances ("this atom should be
             | within 3 and 7 angstroms of this other atom"). A large
             | matrix (partially filled out) of distances is output, and
             | those distances are used as constraints in a force field
             | (which also includes lots of prior knowledge about protein
             | structure) and then they run simulations which attempt to
             | minimize both the force field and constraint terms.
             | 
             | In principle you don't even need a physical force field- if
             | you have enough distance information between pairs of
             | atoms, you can derive a plausible structure by embedding
             | the distances in R3
             | (https://en.wikipedia.org/wiki/Distance_geometry and
             | https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21...).
             | 
             | Presumably, the signal they extract includes both rich
             | local interactions (amino acids near in sequence) and
             | distant ones inferred through sequence/structure
             | relationships, and the constraints could in fact push a
             | model towards a novel fold, presumably through some
             | extremely subtle statistical relationships to other
             | evolutionarily related proteins that adopt a different
             | fold.
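
The idea of deriving a structure purely from pairwise distances can be
sketched with a small build-up (trilateration) routine. This is an
illustration of distance geometry under strong assumptions, not the
procedure used by AlphaFold: it assumes a complete, exact distance
matrix (the comment describes a partially filled one), that the first
three points are not collinear, and that the fourth is not in their
plane. The function names and toy coordinates are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def embed(D):
    """Place points in R^3 from a complete distance matrix D.

    Point 0 goes to the origin, point 1 on the x axis, point 2 in the
    xy plane, and every later point is located from its distances to
    those three. Coordinates are only defined up to rotation,
    translation and reflection.
    """
    d01 = D[0][1]
    x2 = (D[0][2] ** 2 - D[1][2] ** 2 + d01 ** 2) / (2 * d01)
    y2 = math.sqrt(max(D[0][2] ** 2 - x2 ** 2, 0.0))
    coords = [(0.0, 0.0, 0.0), (d01, 0.0, 0.0), (x2, y2, 0.0)]
    for k in range(3, len(D)):
        r0, r1, r2 = D[0][k], D[1][k], D[2][k]
        x = (r0 ** 2 - r1 ** 2 + d01 ** 2) / (2 * d01)
        y = (r0 ** 2 - r2 ** 2 + x2 ** 2 + y2 ** 2 - 2 * x2 * x) / (2 * y2)
        z = math.sqrt(max(r0 ** 2 - x ** 2 - y ** 2, 0.0))
        if k > 3:
            # Three reference points leave a mirror ambiguity in z;
            # resolve it with the distance to point 3.
            up, down = (x, y, z), (x, y, -z)
            if abs(dist(down, coords[3]) - D[3][k]) < abs(dist(up, coords[3]) - D[3][k]):
                z = -z
        coords.append((x, y, z))
    return coords


# Hypothetical toy coordinates, used only to generate a distance matrix.
original = [(1.0, 2.0, 3.0), (4.8, 2.0, 3.0), (6.0, 5.5, 3.0),
            (7.1, 6.0, 6.6), (3.0, 7.5, 0.5)]
D = [[dist(p, q) for q in original] for p in original]
rebuilt = embed(D)
n = len(original)
max_error = max(abs(dist(rebuilt[i], rebuilt[j]) - D[i][j])
                for i in range(n) for j in range(n))
print("largest distance mismatch:", max_error)
```

The rebuilt coordinates differ from the originals by a rigid motion, but
every pairwise distance is reproduced. With sparse or noisy distances
one would instead minimize a penalty over all constraints, which is
closer to what the comment describes.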
        
         | ramraj07 wrote:
         | Not sure if you should be reminded of how alpha fold started,
         | it started by winning a competition thought unwinnable by
         | academics. Top labs working in protein structure prediction
         | have fundamentally changed direction after alpha fold and are
         | working to do the same even better.
         | 
         | This is not the first (or even tenth) time I'm seeing an
         | academic trying to undermine genuine progress almost to the
         | level of gaslighting. Comparing alphafold to conventional
         | homology modeling is disingenuous at its most charitable
         | interpretation.
         | 
         | Not sure what else to say. Structural biology has always been
         | the weirdest field I've seen, the way students are abused
         | (crystallize and publish in nature or go bust), and how every
         | nature issue will have three structure papers as if that cures
         | cancer every day. I suppose it warps one's perception of
         | outsiders after being in such a bubble?
         | 
         | signed, someone with a PhD in biomedical engineering, did a ton
         | of bio work.
        
           | shpongled wrote:
           | > Comparing alphafold to conventional homology modeling is
           | disingenuous at its most charitable interpretation.
           | 
           | It's really not - have you played around with AF at all? Made
           | mutations to protein structures and asked it to model them?
           | Go look up the crystal structures for important proteins like
           | FOXA1 [1], AR [2], EWSR1 [3], etc (i.e. pretty much any
           | protein target we really care about and haven't previously
           | solved) and tell me with a straight face that AF has "solved"
           | protein folding - it's just a fancy language model that's
           | pattern matching to things it's already seen solved before.
           | 
           | signed, someone with a PhD in biochemistry.
           | 
           | [1] https://alphafold.ebi.ac.uk/entry/P55317 [2]
           | https://alphafold.ebi.ac.uk/entry/P10275 [3]
           | https://alphafold.ebi.ac.uk/entry/Q01844
        
           | COGlory wrote:
           | > Not sure if you should be reminded of how alpha fold
           | started, it started by winning a competition thought
           | unwinnable by academics. Top labs working in protein structure
           | prediction have fundamentally changed direction after alpha
           | fold and are working to do the same even better.
           | 
           | Not sure what part of "it does homology modeling 2x better"
           | you didn't see in my comment? AlphaFold scored something like
           | 85% in CASP in 2020, in CASP 2016, I-TASSER had I think 42%?
           | So it's ~2x as good as I-TASSER which is exactly what I said
           | in my comment.
           | 
           | >This is not the first (or even tenth) time I'm seeing an
           | academic trying to undermine genuine progress almost to the
           | level of gaslighting. Comparing alphafold to conventional
           | homology modeling is disingenuous at its most charitable
           | interpretation.
           | 
           | It literally is homology modeling. The deep learning aspect
           | is to boost otherwise unnoticed signal that most homology
           | modeling software couldn't tease out. Also, I don't think I'm
           | gaslighting, but maybe I'm wrong? If anything, I felt gaslit
           | by the language around AlphaFold.
           | 
           | >Not sure what else to say. Structural biology has always
           | been the weirdest field I've seen, the way students are
           | abused (crystallize and publish in nature or go bust), and
           | how every nature issue will have three structure papers as if
           | that cures cancer every day. I suppose it warps one's
           | perception of outsiders after being in such a bubble?
           | 
           | What on earth are you even talking about? The vast, VAST
           | majority of structures go unpublished ENTIRELY, let alone
           | published in nature. There are almost 200,000 structures on
           | deposit in the PDB.
        
             | underdeserver wrote:
             | > Not sure what part of "it does homology modeling 2x
             | better" you didn't see in my comment? AlphaFold scored
             | something like 85% in CASP in 2020, in CASP 2016, I-TASSER
             | had I think 42%? So it's ~2x as good as I-TASSER which is
             | exactly what I said in my comment.
             | 
             | Wait, stop, I don't know anything about proteins but 84%
             | success is not ~2x better than 42%.
             | 
             | It doesn't really make sense to talk about 2x better in
             | terms of success percentages, but if you want a feel, I
             | would measure 1/error instead (a 99% correct system is 10
             | times better than a 90% correct system), making AlphaFold
             | around 3.6 times better.
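             | 
             | A minimal Python sketch of that 1/error comparison, assuming
             | the 84% and 42% figures quoted in this thread (they are
             | recollections, not official CASP scores; the function name is
             | made up for illustration):

```python
def error_ratio(acc_new: float, acc_old: float) -> float:
    """Return how many times smaller the new system's error rate is."""
    return (1.0 - acc_old) / (1.0 - acc_new)


# Figures as quoted in the thread, not official CASP scores.
print(round(error_ratio(0.84, 0.42), 3))  # 3.625
# The 99% vs 90% example from the comment above.
print(round(error_ratio(0.99, 0.90), 3))  # 10.0
```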
        
               | palmtree3000 wrote:
               | I think odds ratio ( p/(1-p) ) is the thing I'd use here.
               | It gives the right limiting behavior (at p ~= 0, doubling
               | p is twice as good, and at p~=1, halving 1-p is twice as
               | good) and it's the natural way to express Bayes rule,
               | meaning you can say "I'm twice as sure (in odds ratio
               | terms) based on this evidence" and have that be solely a
               | property of the update, not the prior.
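               | 
               | A small Python sketch of the odds-based comparison, again
               | assuming the 84% and 42% figures quoted in this thread. The
               | helper names are illustrative only. The last two lines show
               | the same 2x odds update applied to different priors:

```python
def odds(p: float) -> float:
    """Odds in favour of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_new: float, p_old: float) -> float:
    """Ratio of the odds of two success probabilities."""
    return odds(p_new) / odds(p_old)


def bayes_update(prior_p: float, likelihood_ratio: float) -> float:
    """Posterior probability after multiplying the prior odds by a likelihood ratio."""
    posterior_odds = odds(prior_p) * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


print(round(odds_ratio(0.84, 0.42), 2))  # 7.25
print(round(bayes_update(0.5, 2.0), 3))  # 0.667
print(round(bayes_update(0.9, 2.0), 3))  # 0.947
```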
        
             | dekhn wrote:
             | What ramraj is talking about: if you go into a competitive
             | grad program to get a PhD in structural biology, your
             | advisor will probably expect that in 3-4 years you will:
             | crystallize a protein of interest, collect enough data to
             | make a model, and publish that model in a major journal.
             | Many people in my program could not graduate until they had
             | a Nature or Science paper (my advisor was not an asshole, I
             | graduated with just a paper in Biochemistry).
             | 
             | In a sense both of you are right- DeepMind is massively
             | overplaying the value of what they did, trying to expand
             | its impact far beyond what they actually achieved (this is
             | common in competitive biology), but what they did was such
             | an improvement over the state of the art that it's
             | considered a major accomplishment. It also achieved the
             | target of CASP- which was to make predictions whose scores
             | are indistinguishable from experimentally determined
             | structures.
             | 
             | I don't think academics thought CASP was unwinnable but
             | most groups were very surprised that an industrial player
             | using 5 year old tech did so well.
        
               | flobosg wrote:
               | Hear, hear. This is probably the best take.
        
               | valarauko wrote:
               | > What ramraj is talking about: if you go into a
               | competitive grad program to get a PhD in structural
               | biology, your advisor will probably expect that in 3-4
               | years you will: crystallize a protein of interest,
               | collect enough data to make a model, and publish that
               | model in a major journal.
               | 
               | All of that applies to molecular biology in general, and
               | I don't see how the field of structural biology is
               | especially egregious, the way ramraj is making it out to
               | be.
        
               | flobosg wrote:
               | Protein crystallization can be very difficult and there
               | is no general solution. Kits that screen for crystal
               | growth conditions usually help but optimization is needed
               | in most cases. Then, that crystal must have certain
               | properties that allow for good data acquisition at the
               | X-ray facility. That's another problem by itself and
               | months or years can pass until you get a suitable protein
               | crystal and X-ray diffraction dataset where you can model
               | your structure.
        
               | valarauko wrote:
               | I'm familiar with protein crystallization and the
               | difficulties associated with it. What I don't agree with
               | is the characterization of the field as especially
               | difficult, above and beyond modern biology in general.
               | Nor can I support the assertion that structural biology
               | students are subject to special abuse that regular grad
               | students are not.
               | 
               | > ... can be very difficult and there is no general
               | solution
               | 
               | This is true of pretty much any graduate work in
               | molecular biology.
        
               | dekhn wrote:
               | I did rotations in multiple types of lab as part of my
               | program and I can't say I ever found that students in
               | regular molecular biology labs had nearly as hard a time
               | as structural biologists; SB is its own class of hell.
               | Given the number of papers published in molecular biology
               | that turn out to be "gel was physically cut and
               | reassembled to show the results the authors desired"
               | (it's much harder to cheat on a protein structure)...
        
               | lucidrains wrote:
               | To add to this, the deep learning field has already moved
               | on towards MSA-less structure prediction. None of this
               | would be possible without building on top of the work
               | open sourced by DeepMind.
               | 
               | https://www.biorxiv.org/content/10.1101/2022.07.21.500999v1
               | https://www.biorxiv.org/content/10.1101/2022.07.20.500902v1
               | 
               | To be overly dismissive is to lack imagination.
        
       | jakosz wrote:
       | Now we can start guessing what futures they are betting on:
       | these, in which open-sourcing the whole thing commoditises
       | critical complements.
       | 
       | ---
       | 
       | https://www.gwern.net/Complement
        
       | jarenmf wrote:
       | This is probably one of the best applications of AI in science in
       | terms of impact so far. I can't think of any other problem with
       | the same potential impact.
       | 
       | EDIT: grammar
        
         | 323 wrote:
         | AI translation has probably had a bigger worldwide impact so far.
        
           | jebarker wrote:
           | jarenmf said "in science" - but it is an interesting question
           | how much automated translation has helped scientists
           | translate papers from other languages.
        
             | hijodelsol wrote:
             | It even goes both ways - it allows non-native English
             | speakers to publish their work in correct
             | technical/scientific English with far fewer barriers.
        
         | calebkaiser wrote:
         | This is definitely one of the most exciting spaces in AI right
         | now. Another somewhat-related startup is PostEra (medicinal
         | chemistry for drug discovery via AI) https://postera.ai/about/
        
         | axg11 wrote:
         | AlphaFold is the best counterpoint to tech cynics.
         | 
         | One of the largest public tech companies in the world funded a
         | multi-year scientific project, executed the research flawlessly
         | and moved forward an entire scientific field. They then went on
         | to openly release the code _and_ data, working with a publicly
         | funded organization (EMBL-EBI) to ensure researchers across the
         | globe can easily access the outputs.
         | 
         | I'm not arguing that every tech company is a net positive for
         | humanity. Google itself isn't perfect. Google + DeepMind is
         | setting a bloody high bar though.
        
         | t00 wrote:
         | You are right and when thinking about it I can see 2 problems
         | which I hope in the future can have even more impact:
         | 
         | 1. Using AI to determine the most efficient ways of evaluating
         | mathematical expressions, transformations and computational
         | algorithms - division, square root, maybe traveling salesman -
         | those which take a relatively high number of CPU cycles to
         | compute and are used everywhere. If inputs and outputs can be
         | assigned to it, AI can eventually build a transformation which
         | can be reproduced in silicon.
         | 
         | 2. Physical phenomena in general, not only organic proteins, can
         | be measured, and with sufficient ability to quantize them into
         | inputs and experimentally obtained outputs to train the
         | network, we could in theory establish new formulas or constants
         | and progress the understanding of the Universe.
        
           | lrhegeba wrote:
           | The groundwork is, at least partially, being laid as you type
           | this: https://www.nature.com/articles/d41586-021-01627-2
        
       | dekhn wrote:
       | Demis and John will probably win either the Chemistry or Physics
       | Nobel Prize in the next couple of years.
        
         | thomasahle wrote:
         | Some people are using "AI wins a Nobel Prize" as the new Turing
         | test. Maybe that is going to happen sooner than they expect. Or
         | maybe the owners of the AI will always claim it on its behalf.
        
           | dekhn wrote:
           | There's no AI here. This is just ML. All DeepMind did here
           | was use multiple excellent resources- large numbers of
           | protein sequences, and small numbers of protein structures,
           | to create an approximation function of protein structure,
           | without any of the deep understanding of "why".
           | Interestingly, the technology they used to do this didn't
           | exist 5 years ago!
        
       | dalbasal wrote:
       | Can someone put AlphaFold's problem space into perspective for
       | me?
       | 
       | Why is protein folding important? Theoretical importance? Can we
       | do something with protein folding knowledge? If so, what?
       | 
       | I've been hearing about AlphaFold from the CS side. There they
       | seem to focus on protein folding primarily as an interesting
       | space to apply their CS efforts.
        
         | axg11 wrote:
         | If we knew:
         | 
         | (a) the structure of every protein (what DeepMind is doing
         | here)
         | 
         | (b) how different protein structures interact (i.e. protein
         | complexes - DeepMind is working on this but not there yet)
         | 
         | Then we could use those two building blocks to design new
         | proteins (drugs) that do what we want. If we solve those two
         | problems with very high accuracy, we can also reduce the time
         | it takes to go from starting a drug discovery programme to
         | approved medicine.
         | 
         | Obtaining all protein structures and determining how they
         | interact is a key step towards making biology more predictable.
         | Previously, solving the structure of a protein was very time
         | consuming. As a result, we didn't know the structure for a
         | majority of proteins. Now that it's much faster, downstream
         | research can move faster.
         | 
         | Caveat: we should remember that these are all computational
         | predictions. AlphaFold's predictions can be wrong and protein
         | structures will still need to be validated. Having said that,
         | lots of validation has already occurred and confidence in the
         | predictions grows with every new iteration of AlphaFold.
        
           | lamename wrote:
           | How are the predictions validated? Waiting for the old
           | fashioned way for... very difficult crystal structure
           | experiments? Or something else?
        
             | RandomLensman wrote:
             | If the predictions are generally good enough, one could also
             | skip the validation and directly try to get a desired
             | effect or reaction. That isn't strictly speaking validating
             | the structure, but depending on the use case might be
             | easier to just go for an outcome - really a question of
             | application and cost efficiency.
        
               | siver_john wrote:
               | I mean, nothing is stopping you from skipping validation
               | with pre-AlphaFold techniques and, say for drug discovery,
               | going straight to drug screening using the predicted
               | structure. It's just that the drug screening software is
               | already error prone, so you are still going to have to do
               | some validation. However, having an idea of a potential
               | structure means that you can validate it with other
               | techniques that are simpler and less expensive/time
               | consuming (I'm thinking of things similar to FRET).
               | 
               | Another idea is that these may come into play for
               | anti-verification: if you are drug screening against a
               | known structure, you could potentially use these more
               | flawed structures of proteins that you don't want to target
               | but that may be similar, and try to reduce the drug's
               | efficacy at binding them. Or something to that effect. All
               | of those are fun ideas that are currently being explored in
               | that space, but we'll see where it takes us.
        
             | flobosg wrote:
             | > very difficult crystal structure experiments?
             | 
             | Apart from X-ray crystallography there are other methods
             | for structure determination such as nuclear magnetic
             | resonance (NMR) or cryo-electron microscopy (cryo-EM). The
             | latter has seen a dramatic improvement in resolution over
             | the last decade.
        
             | misnome wrote:
             | For a lot of X-ray crystallography cases, some of the
             | difficulty is working out the actual structure from the
             | collected data with no prior information. This makes a lot
             | of that... much easier, because with
             | https://en.wikipedia.org/wiki/Molecular_replacement
             | something that is "close, but not correct" can be used to
             | bootstrap the actual structure.
        
             | f38zf5vdt wrote:
             | Most of them are not; they are just estimates based on
             | previous results for sequences with known structure.
             | 
             | Every couple of years there is a massive competition called
             | CASP where labs submit previously unresolved protein
             | structures derived from experimental EM, x-ray
             | crystallography, or NMR studies and other labs attempt to
             | predict these structures using their software. AlphaFold2
             | absolutely destroyed the other labs in the main contest
             | (regular monomeric targets, predominantly globular) for
             | structure resolution two years ago, in CASP 14.
             | 
             | https://predictioncenter.org/casp14/zscores_final.cgi
             | 
             | The latest contest, CASP15, is currently underway and
             | expected to end this year. As with all ML, the usual
             | caveats apply to the models Google generated -- the dangers
             | of overfitting to existing structures, artifacts based on
             | the way the problem was modelled, etc.
        
           | jcranmer wrote:
           | > Then we could use those two building blocks to design new
           | proteins (drugs) that do what we want. If we solve those two
           | problems with very high accuracy, we can also reduce the time
           | it takes to go from starting a drug discovery programme to
           | approved medicine.
           | 
           | Drugs are usually not proteins, but small molecules that are
           | designed to help or interfere with the operation of
           | proteins.
        
             | axg11 wrote:
             | That is only true because of our current tools and
             | capabilities. With improved manufacturing techniques and
             | AlphaFold++ I think biologics will dominate. Even still,
             | there are ~2000 approved biologics [0].
             | 
             | [0] - https://purplebooksearch.fda.gov/advanced-search
        
               | bckr wrote:
               | Yep, proteins are so much more flexible / precise than
               | small molecules. Also we can get the body to produce
               | them. Think mRNA vaccines.
        
         | empiricus wrote:
         | The endgame of protein folding is nanotechnology. Life is also
         | nanotechnology. So basically the end game is to take full
         | control of life: change existing life forms, design new life
         | forms, new ecosystems. Eat the earth first, then the universe.
         | Not yet there, but getting much closer.
        
         | green-eclipse wrote:
         | There are a million articles and podcasts explaining exactly
         | your question. Those will be better than HN responses. I
         | suggest you take 15 seconds to Google it.
        
           | dev_tty01 wrote:
           | Yes there are a million articles. That is why asking a
           | question here on HN is useful. The HN community more often
           | than not offers intelligent insight as well as curated
           | recommended links for learning more about a topic. Yes, the
           | signal-to-noise ratio isn't perfect on HN, but it is a lot
           | better than random Google searches. If nothing else, it often
           | leads to enough basic understanding so that someone can then
           | perform more refined and therefore more productive Google
           | searches. I appreciate the basic questions and the generous
           | nature of many HN commenters who offer thoughtful responses.
        
         | feet wrote:
         | As others have already mentioned, proteins are the machinery of
         | the cell. They perform an immense array of functions and they
         | must fold in a certain way to perform these functions. This is
         | part of what's known as the structure-function relationship.
         | 
         | Misfolded proteins are contributors to numerous pathological
         | conditions and the more we can understand about how and why
         | this folding happens, the better we can treat these conditions.
         | 
         | Another aspect is that while we can at least partially
         | determine the primary structure (the amino acid sequence) of
         | proteins from DNA and RNA, we don't necessarily know their
         | secondary or tertiary structures (3 dimensional conformation).
         | This is a key piece of the puzzle for figuring out how these
         | proteins do their proteiny things and how they interact with
         | other proteins and even how they form quaternary structures
         | with other proteins (an assembly of multiple proteins that
         | perform some function, many pores are assemblies like this).
         | Once we know these structures and understand how they work on a
         | structural and chemical level, we can manipulate them far more
         | easily.
         | 
         | In order to do rational drug design, which is designing a drug
         | for a specific target or active site on a protein, we need to
         | understand these structures. Working to solve protein folding
         | is a key step in treating disease states and understanding how
         | cells work on a fundamental level. The impact is hard to
         | overstate.
        
         | turkeygizzard wrote:
         | My understanding is that protein folding is a major cost
         | bottleneck in drug design.
         | 
         | Researchers can come up with candidate molecule formulas that
         | might work as good drugs, but the problem is that these
         | proteins organize/fold themselves physically in a hard-to-
         | predict way. And how they fold directly affects their
         | properties as drugs.
         | 
         | If AlphaFold can accurately predict folding, it'll allow
         | researchers to prioritize drug candidates more accurately which
         | will reduce research time and costs. Supposedly the major
         | pharmaceutical companies can spend up to billions when
         | designing a single drug. Optimistically, predicting protein
         | folding better will allow for much more rapid and cheaper drug
         | development.
        
         | matheusmoreira wrote:
         | Proteins are the molecular machines of all living beings on
         | this planet. They do almost everything. We need to understand
         | their structure and how that structure changes when they
         | perform some function. Essentially we have to reverse engineer
         | nature's machines.
         | 
         | Biology has traditionally studied these things from the outside
         | in, top down. Advances in biology are correlated with advances
         | in microscope technology. We went from seeing with the naked
         | eye to being able to magnify tissue to see cells and even
         | what's inside of them.
         | 
         | Studying biology from the inside out, bottom up is
         | significantly harder. You have chemistry, which is a hard
         | science. On top of that you have biochemistry, which is equally
         | rigorous. Above that level of abstraction things start to get a
         | lot less well defined. We go from organic chemistry (carbon,
         | hydrogen, oxygen, etc.) to more complex things found in life
         | (amino acids, carbohydrates, lipids) to polymers of those things
         | (proteins, polysaccharides, triglycerides), to a point where
         | they get so big and complex that their tridimensional
         | structure starts influencing how they interact with other
         | molecules. Then we start seeing aggregates (quaternary
         | proteins) of these complex things!
         | 
         | Important proteins like ATP synthase are well understood (it's
         | a friggin' turbine like in a hydroelectric power plant!) but
         | there are so many others that haven't received the same attention.
         | Bioinformatics technology is changing that. The human genome's
         | been mapped. Now it's time to map the proteome.
        
         | Freebytes wrote:
         | The applications and importance have been discussed, but let me
         | explain why what we are doing right now does not work, which
         | will also emphasize the importance of this.
         | 
         | At this time, we create drugs, test them on animals, and see
         | what the side effects and results actually are. We are very
         | limited in our capabilities and basically throw mud at the wall
         | and see what sticks. This would allow us to try potential drug
         | candidates without so much randomness.
        
         | fabiospampinato wrote:
         | You are basically made of proteins, which are basically folded
         | sequences of amino acids. Proteins are molecular machines that
         | are the fundamental building blocks of animals, plants,
         | bacteria, fungi, viruses etc.
         | 
         | So yeah the applications are enormous, from medicine to better
         | industrial chemical processes, from warfare to food
         | manufacturing.
        
           | jebarker wrote:
           | > proteins are molecular machines
           | 
           | Does that imply proteins have some dynamics that need to be
           | predicted too? I remember seeing animations of molecular
           | machines that appeared to be "walking" inside the body - are
           | those proteins or more complex structures?
        
             | gilleain wrote:
             | Yes, very much so. Even proteins that seem like they are
             | just scaffolding for a catalytic centre can have important
             | dynamics.
             | 
             | A classic example is haemoglobin, that 'just' binds to
             | oxygen at the iron in the middle of the haem. Other binding
             | sites remote from the oxygen binding one can bind to other
             | molecules - notably carbon dioxide. The 'Bohr effect'
             | mechanism is outlined here:
             | https://en.wikipedia.org/wiki/Bohr_effect#Allosteric_interac...
             | 
             | Even at the lowest level, there is some evidence that
             | ligand binding can affect the structure of the backbone of
             | the protein. For example, peptide plane flipping
             | https://en.wikipedia.org/wiki/Peptide_plane_flipping
             | although I'm not sure where the research is on this
             | nowadays.
        
               | jebarker wrote:
               | Fascinating. I'm a mathematician turned AI researcher by
               | day. The more I learn about biology the more I feel that
               | our human engineering (hard and soft) is trivially simple
               | compared to what evolution has already created and left
               | for us to study.
        
               | gilleain wrote:
               | Heh, I'm somewhat the opposite, as I went from a degree
               | in biochemistry into bioinformatics (protein folds and
               | structure) and have an amateur interest in maths.
               | 
               | I often recommend the book 'Cats' Paws and Catapults' by
               | Steven Vogel, which discusses the differences between
               | human and biological engineering. There are very
               | different constraints, as well as the more obvious
               | difference between intelligently directed design (by
               | humans) and evolutionary design.
               | 
               | It's not totally true that our engineering is simpler
               | than biological systems. After all, we can now manipulate
               | matter down to the atomic level. There have been attempts
               | to move towards self-assembling systems, or even self-
               | repairing ones. Not quite on the same scale or
               | sophistication as biological systems, of course.
               | 
               | Is a modern computer chip more complex than a virus? Is a
               | large connected system of computers more complex than a
               | mycelial network? Are there more parts connected in more
               | ways in a nuclear power station than there are in a simple
               | bacterial cell?
               | 
               | I think it is definitely worth considering the
               | similarities and differences.
        
               | jebarker wrote:
               | Definitely interesting comparisons.
               | 
               | I think my sense that biology is more complex than human
               | engineering is that our engineering seems much more
               | homogeneous. Computer chips are certainly very dense but
               | the number of different component types is very small.
               | Biology on the other hand seems to have a huge number of
               | fundamental building blocks, e.g. proteins, with
               | different behaviors. I suppose that maybe
               | that's the wrong comparison though and software is where
               | our engineering diversity is present.
               | 
               | It may well just be the case that my lack of familiarity
               | with biology makes it seem more complex to me.
        
             | fabiospampinato wrote:
             | Yes proteins can be dynamic, and multiple proteins can come
             | together to form bigger structures (like a cell's membrane
             | for example), and they can very much interact with other
             | proteins and molecules in complex ways.
             | 
             | I think AlphaFold gives us more of a static picture that
             | doesn't tell us much about how the protein will interact
             | with the world. Predicting these dynamic aspects, protein-
             | to-protein interactions, and ultimately an entire cell, is
             | probably what they'd like to do next.
             | 
             | I don't know if the walking buddy is a single protein, it
             | may be a handful of different ones joined together.
        
         | andrepd wrote:
         | https://en.m.wikipedia.org/wiki/Protein_folding
        
         | epups wrote:
         | Proteins are what make everything in a cell work. They are
         | produced as a "linear" structure that must fold into a proper
         | shape to execute its function, such as acting as a pore that
         | only lets a specific chemical through the cell membrane.
         | 
         | The importance here is to figure out potential targets for
         | treatments that take into account particularities of certain
         | proteins. That could produce better drugs with fewer side
         | effects.
        
         | flobosg wrote:
         | This might be an interesting resource for you:
         | https://pdb101.rcsb.org/
        
         | pbw wrote:
         | The genome, all of our DNA combined, is just a bunch of 1D
         | strings like "cgtattctgcttgta". Those strings encode proteins,
         | which fold up into a 3D shape once created. This 3D shape is
         | what determines what the protein actually does inside the cell.
         | Without understanding protein folding we don't understand what
         | the DNA actually does.
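         | 
         | As a rough illustration of the "1D string encodes a protein"
         | step, here is a minimal Python sketch that translates the example
         | string above with the standard genetic code. It assumes the
         | string is coding-strand DNA read from its first base, which is
         | purely an assumption for illustration; real genes also involve
         | start codons, introns and strand choice, and the 3D folding
         | itself is the hard part this sketch does not touch:

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with T
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA to one-letter amino acids, reading from the first base.

    Trailing bases that do not fill a whole codon are ignored; '*' marks a stop.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(translate("cgtattctgcttgta"))  # RILLV
```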
        
       | swayvil wrote:
       | I had a dream about this a few days ago. About complexly
       | wrinkled/crumpled/convolved things.
       | 
       | Like a fresh crepe stuffed into the toe of a boot. Bewilderingly
       | complex.
       | 
       | But I have a question. Does such contortion work for 3d
       | "membranes" in a 4d space? It's something I'm chewing on. Hard to
       | casually visualize, obviously.
        
         | gspr wrote:
         | Of course! The term you might wanna start off googling is
         | "curvature of manifolds". What's even neater than "3d thing
         | curving in 4d space" is that these notions can be made precise
         | also without the "in [whatever] space" part (see "intrinsic
         | curvature" and "Riemannian manifold").
        
           | swayvil wrote:
           | Thank you very much.
        
       | naves wrote:
       | Just imagine if the tech world puts all programmatic advertising
       | development on hold for a year and the collective brain power is
       | channeled to science instead...
        
       | alphabetting wrote:
       | _Obtaining this dataset prior to alphafold would have cost on the
       | order of $200 trillion._
       | https://twitter.com/wintonARK/status/1552653527670857729
       | 
       | Anyone knowledgeable know if this estimate is accurate? Insane if
       | true
        
         | green-eclipse wrote:
         | It's impossible to really put a number on it, because the task
         | itself was impossible. PhDs and the field's top scientists
         | simply couldn't figure out many complicated protein structures
         | after years of attempts, and the fact that there are so many
         | (200M+) means that the problem space is vast.
        
         | dekhn wrote:
         | It doesn't make any sense on multiple levels. This is a
         | computational prediction and there was no computational
         | alternative; many of these proteins would never have had a
         | structure solved even if you spent the money. They are just
         | taking $cost_per_structure_solved *
         | number_of_remaining_structures and assuming that things scale
         | linearly like that.
         | 
         | Note that crystallographers are now using these predictions to
         | bootstrap models of proteins they've struggled to work with,
         | which indicates the level of trust in the structural community
         | for these predictions is pretty high.
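
The linear extrapolation criticised above can be written out as a quick sanity check. This sketch only uses the two figures quoted in the thread ($200 trillion and roughly 200 million structures); the variable and function names are made up for illustration, and the per-structure cost is merely what those two numbers imply, not a sourced figure.

```python
# Figures quoted in the thread (not independently verified).
QUOTED_TOTAL_USD = 200 * 10**12      # "$200 trillion"
NUMBER_OF_STRUCTURES = 200 * 10**6   # "200M+" predicted structures

# Per-structure cost implied by dividing one figure by the other.
implied_cost_per_structure = QUOTED_TOTAL_USD // NUMBER_OF_STRUCTURES


def linear_estimate(cost_per_structure_solved, number_of_remaining_structures):
    """The naive model: total cost scales linearly with structure count.

    It ignores that many proteins may never yield an experimental
    structure at any price, which is the objection raised above.
    """
    return cost_per_structure_solved * number_of_remaining_structures


print(implied_cost_per_structure)  # 1000000
print(linear_estimate(implied_cost_per_structure, NUMBER_OF_STRUCTURES))
```

In other words, the quoted total is consistent with assuming about $1 million per solved structure and multiplying.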
        
         | shauryamanu wrote:
         | Even if that's exaggerated, it might have taken significant
         | time to reach this stage. Probably on the order of >50
         | years.
        
       | [deleted]
        
       | crispyambulance wrote:
       | I got a 5th grader question about how proteins are
       | used/represented graphically that I've never been able to find a
       | satisfying answer for.
       | 
       | Basically, you see these 3D representations of specific proteins
       | as a crumple of ribbons-- literally like someone ran
       | multi-colored ribbons through scissors to make curls and dumped
       | them on the floor (like a grade school craft project).
       | 
       | So... I understand that proteins are huge organic molecules
       | composed of thousands of atoms, right? Their special capabilities
       | arise from their structure/shape. So basically the molecule
       | contorts itself to a low energy state which could be very complex
       | but which enables it to "bind?" to other molecules expressly
       | because of this special shape and do the special things that
       | proteins do-- that form the basis of living things. Hence the
       | efforts, like Alphafold, to compute what these shapes are for any
       | given protein molecule.
       | 
       | But what does one "do" with such 3D shapes?
       | 
       | They seem intractably complex. Are people just browsing these
       | shapes and seeing patterns in them? What do the "ribbons"
       | signify? Are they just some specific arrangement of C,H,O? Why
       | are some ribbons different colors? Why are there also thread-like
       | things instead of all ribbons?
       | 
       | Also, is that what proteins would really look like if you could
       | see at sub-optical wavelength resolutions? Are they really like
       | that? I recall from school the equipartition theorem-- 1/2 KT of
       | kinetic energy for each degree of freedom. These things obviously
       | have many degrees of freedom. So wouldn't they be "thrashing
       | around" like rag doll in a blender at room temperature? It seems
       | strange to me that something like that could be so central to
       | life, but it is.
       | 
       | Just trying to get myself a cartoonish mental model of how these
       | shapes are used! Anyone?
        
         | dekhn wrote:
         | The ribbons and helices you see in those pictures are abstract
         | representations of the underlying positions of specific
         | arrangements of carbon atoms along the backbone.
         | 
         | There are tools such as DSSP
         | https://en.wikipedia.org/wiki/DSSP_(hydrogen_bond_estimation...
         | which will take the 3D structure determined by
         | crystallography and spit out the ribbons and helices- for
         | example, for helices, you can see a specific arrangement of
         | carbons along the protein's backbone in 3d space (each carbon
         | interacts with a carbon 4 amino acids down the chain).
         | 
         | Protein motion at room temperature varies depending on the
         | protein- some proteins are rocks that stay pretty much in the
         | same single conformation forever once they fold, while others
         | do thrash around wildly and others undergo complex, whole-
         | structure rearrangements that almost seem magical if you try to
         | think about them using normal physics/mechanical rules.
         | 
         | Having a magical machine that could output the full manifold of
         | a protein during the folding process at subatomic resolution
         | would be really nice, but there would be a lot of data to
         | process.
        
           | crispyambulance wrote:
           | Thanks, awesome! So what do molecular biologists do with
           | these 3D representations once they have them? Do they
           | literally just see how they fit to other proteins?
        
             | buntsai wrote:
             | Often the ribbons (alpha-helices and beta-sheets) form
             | "protein domains". Canonically, these are stable, folded
             | structures with conserved shapes and functions that serve
             | as the building blocks of proteins, like lego pieces. These
             | protein domains can be assembled in different ways to form
             | proteins of different function. Different protein domains
             | that have the same evolutionary origin have conserved
             | structure even when the underlying amino acid sequence, or
             | DNA sequence has changed beyond recognition over millions
             | of years of evolution. In other words, molecular biologists
             | use structure as a proxy for function. Looking at how the
             | same protein domains work in different proteins in
             | different species can give us clues as to how a protein
             | might work in human biology or disease.
        
             | dekhn wrote:
             | There are many uses for structure. Personally, I find the
             | 3d structures to be useful as a mental guide for picturing
             | things, and certainly people do try to "dock" proteins that
             | have complementary structures, but unfortunately, the
             | biophysics of protein complexes suggests that the
             | conformation change on binding is so large that the
             | predicted structures aren't super-helpful.
             | 
             | Certainly, in a corpo like mine (Genentech/Roche) protein
             | structures have a long history of being used in drug
             | discovery- not typically a simple "dock a ligand to a
             | protein" but more for constructing lab experiments that
             | help elucidate the actual mechanistic biology going on.
             | That is only a tiny part of a much larger process to work
             | on disease targets to come up with effective treatments.
             | Genentech is different from most pharma in that their
             | treatments are themselves typically proteins, rather than
             | small molecules.
        
         | cletus wrote:
         | The ability for another molecule (probably another protein) to
         | "react" or interact with the protein depends not only on the
         | chemistry but also the shape. An otherwise compatible sequence
         | of atoms might not be able to react because it and the binding
         | site are just incompatibly shaped.
         | 
         | This is hugely important for developing drugs and vaccines.
         | 
         | To see the effect of this look no further than prions. Prions
         | are the exact same proteins, just folded in weird ways.
         | Worse, they can "transmit" this misfolded shape to other
         | otherwise normal proteins. Prions behave differently just
         | because of the different shape and can lead to disease. This is
         | exactly what Mad Cow Disease (BSE) is.
         | 
         | What we get taught in high school about chemistry is incredibly
         | oversimplified.
         | 
         | One example of this I like is the geometry of a water molecule.
         | When we first learn about atoms, we learn the "solar system"
         | model (aka Bohr). The reality is instead that we have 3D
         | probability distributions of where electrons might be. These
         | clouds are in pairs. I believe this is to do with the inverted
         | wavefunction, but really we're getting beyond my knowledge of
         | quantum mechanics here so that's just a guess.
         | 
         | Well, those clouds additionally form valence shells. We learn
         | about these and how atoms want to form complete valence
         | shells. So oxygen has 8 electrons, i.e. 4 pairs of electrons.
         | When bonding with 2 hydrogen atoms we end up with a weird
         | geometry of ~104.5 degrees between the two hydrogen atoms
         | because of how these pairs interact. The naive assumption might
         | be that the two hydrogen atoms are 180 degrees apart.
         | 
         | So back to proteins, you may have learned about hydrogen bonds.
         | This affects molecular shape because when a hydrogen atom
         | shares an electron, it is often positively charged. That
         | positive charge pushes away other positive charges. This is the
         | real difficulty in protein folding because with a molecule of
         | thousands of atoms and weird geometry you may find distant
         | parts of the molecule interacting with hydrogen bonds.
         | 
         | So a single cell consists of thousands (IIRC) of different
         | proteins. Figuring out those interactions is important but
         | incredibly difficult.
        
         | comicjk wrote:
         | All of the loops and swirls are summary representations of
         | known atomic positions: really, knowing a protein structure
         | means knowing the position of every atomic nucleus, relative to
         | the other nuclei, down to some small resolution, and assuming a
         | low temperature.
         | 
         | The atoms do wiggle around a bit at room temperature (and even
         | more at body temperature), which means that simulating them
         | usefully typically requires sampling from a probability
         | distribution defined by the protein structure and some prior
         | knowledge about how atoms move (often a potential energy
         | surface fitted to match quantum mechanics).
         | 
         | There are many applications of these simulations. One of the
         | most important is drug design: knowing the structure of the
         | protein, you can zoom in on a binding pocket and design a set
         | of drug molecules which might disable it. Within the computer
         | simulation, you can mutate a known molecule into each of your
         | test molecules and measure the change in binding affinity,
         | which tells you pretty accurately which ones will work. Each of
         | these simulations requires tens of millions of samples from the
         | atomic probability distribution, which typically takes a few
         | hours on a GPU given a good molecular dynamics program.
        
         | bckr wrote:
         | > Are people just browsing these shapes and seeing patterns in
         | them
         | 
         | That's one approach.
         | 
         | The thing to understand is that proteins form "binding sites":
         | areas that are more likely to attract other particular regions
         | of proteins or other molecules, or even atoms. Think about
         | hemoglobin. The reason it holds onto oxygen is that it has
         | binding sites.
         | 
         | Binding sites are great because they represent more freedom to
         | do things than molecules typically have. Normal chemistry
         | consists of forming strong electronic bonds between atoms, or
         | forming rigid lattices/crystals.
         | 
         | Binding sites allow molecules to do things like temporarily
         | attach to each other and let each other go under certain
         | circumstances, for instance when another binding site is
         | active/inactive. This can happen through "conformation change",
         | where a molecule bound/unbound on some binding site makes the
         | protein change shape slightly. This is how proteins can act
         | like machines.
         | 
         | > What do the "ribbons" signify
         | 
         | Different regions of the protein have different sequences of
         | amino acids. Amino acids have somewhat different shapes from
         | each other. The ribbons are actually broader than the spindles
         | (or threads), and less flexible. Not sure about the different
         | colors, maybe someone else can fill in.
         | 
         | > Also, is that what proteins would really look like if you
         | could see at sub-optical wavelength resolutions?
         | 
         | Not really, it's an abstraction. They're big molecules, so if
         | you look closely they're made of atoms, which are (kinda, sorta
         | not really, quantum stuff) spherical.
         | 
         | > So wouldn't they be "thrashing around" like rag doll in
         | blender at room temperature?
         | 
         | Yes, but the attractions between the different parts of the
         | molecule keeps it somewhat under control. So more like an
         | undulating little creature, jellyfish perhaps.
         | 
         | > It seems strange to me that something like that could be so
         | central to life
         | 
         | Yep, gotta remember that it's all statistical. These things are
         | getting made, doing their job, breaking, and getting degraded
         | some insane number of times per second. Swarm behavior, sort of.
        
         | ghkbrew wrote:
         | In addition to /u/dekhn 's excellent description, this
         | phenomenon is referred to as a protein's "secondary structure"
         | [0]
         | 
         | [0] https://en.m.wikipedia.org/wiki/Protein_secondary_structure
        
         | biofox wrote:
         | Short answer is that the ribbon representation is a visual
         | simplification based on known structures -- they are actually
         | composed of atoms.
         | 
         | They certainly do "thrash around", but that thrashing is
         | constrained by the bonds that are formed, which greatly limits
         | the degrees of freedom. Here's a short video of a simulation to
         | demonstrate:
         | 
         | https://www.youtube.com/watch?v=fggqPtaZj8g
        
         | paraph1n wrote:
         | > I recall from school the equipartition theorem-- 1/2 KT of
         | kinetic energy for each degree of freedom. These things
         | obviously have many degrees of freedom. So wouldn't they be
         | "thrashing around" like rag doll in a blender at room
         | temperature?
         | 
         | It's funny you say that, because the first image on the English
         | Wikipedia page for Equipartition Theorem[1] is an animation of
         | the thermal motion of a peptide.
         | 
         | [1]: https://en.wikipedia.org/wiki/Equipartition_theorem
        
           | dekhn wrote:
           | BTW, in terms of protein dynamics, before you even think
           | about the thrashing around- 1/2 kT at room temperature is
           | enough to form and break hydrogen bonds in real time (around
           | 1-2 kcal/mol) so presumably, protein h-bonds are breaking and
           | reforming spontaneously at scale.
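
The energy scales mentioned above can be put side by side with a back-of-the-envelope calculation. This is only a sketch: it uses the molar gas constant so that kT comes out in kcal/mol, takes 298.15 K as "room temperature", and uses a bare Boltzmann factor as a crude indicator of how often a thermal fluctuation reaches a given energy. The 1-2 kcal/mol hydrogen bond range is the figure from the comment, and none of this models real bond kinetics.

```python
import math

GAS_CONSTANT_KCAL_PER_MOL_K = 1.987204e-3  # R in kcal/(mol*K)
ROOM_TEMPERATURE_K = 298.15                # assumed "room temperature"

# Thermal energy per mole at room temperature, roughly 0.59 kcal/mol.
kT = GAS_CONSTANT_KCAL_PER_MOL_K * ROOM_TEMPERATURE_K
# Equipartition share per quadratic degree of freedom.
half_kT = 0.5 * kT


def boltzmann_factor(energy_kcal_per_mol, temperature_k=ROOM_TEMPERATURE_K):
    """exp(-E / kT): relative weight of a state E above the ground state."""
    return math.exp(
        -energy_kcal_per_mol / (GAS_CONSTANT_KCAL_PER_MOL_K * temperature_k)
    )


print(f"kT     = {kT:.3f} kcal/mol")
print(f"1/2 kT = {half_kT:.3f} kcal/mol")
for bond_energy in (1.0, 2.0):
    print(f"E = {bond_energy} kcal/mol -> exp(-E/kT) = "
          f"{boltzmann_factor(bond_energy):.3f}")
```

On these assumptions a 1-2 kcal/mol bond sits only a few kT above thermal energy, so fluctuations large enough to break one are common rather than rare, which fits the picture of h-bonds constantly breaking and reforming.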
        
       | sabujp wrote:
       | Many thanks to Deepmind for releasing predicted structures of all
       | known protein _monomers_. What I'd like next is for Alphafold
       | (or some other software) to be able to show us multimeric
       | structures based on the single monomer/subunit predictions and
       | protein-protein interactions (i.e. docking). For example the one
       | I helped work on back in my structural biology days was the
       | circadian clock protein KaiC :
       | https://www.rcsb.org/structure/2GBL, that's the "complete"
       | hexameric structure that shows how each of the subunits pack. The
       | prediction for the single monomer that forms a hexamer is very
       | close to the experimental
       | https://alphafold.ebi.ac.uk/entry/Q79PF4 and in fact shows the
       | correct structure of AA residues 500 - 519 which we were never
       | able to validate until 12 years later
       | (https://www.rcsb.org/structure/5C5E) when we expressed those
       | residues along with another protein called KaiA which we knew
       | binds to the "top" CII terminal (AAs 497-519) of KaiC. If we
       | had had this data then, it would have allowed us to not
       | only make better predictions about biological function and
       | protein-protein interactions but would have helped better guide
       | future experiments.
       | 
       | What we can do with this data now is use methods such as cryo-em
       | to see the "big picture", i.e. multi-subunit protein-protein
       | interactions where we can plug in the Alphafold predicted
       | structure into the cryo-em 3d density map and get predicted
       | angstrom level views of what's happening without necessarily
       | having to resort to slower methods such as NMR or x-ray
       | crystallography to elucidate macromolecular interactions.
       | 
       | A small gripe about the alphafold ebi website: it doesn't seem to
       | show the known experimental structure, it just shows
       | "Experimental structures: None available in PDB". For example the
       | link to the alphafold structure above should link to the 2GBL,
       | 1TF7, or any of the other KaiC structures from organism PCC7942
       | at RCSB. This would require merging/mapping data from RCSB with
       | EBI and at least doing some string matching, hopefully they're
       | working on it!
        
       | inspirerhetoric wrote:
       | Does anyone know what it would cost to download this whole
       | dataset? Google Cloud Datasets only allow 1 TB/month for free to
       | download, I believe
        
       | [deleted]
        
       | bifftastic wrote:
       | How do they know their structures are correct?
        
         | tomrod wrote:
         | This is the right line of questioning.
         | 
         | As we solve viewability into the complex coding of proteins, we
         | need to be right. Next, hopefully, comes causal effect
         | identification, then construction ability.
         | 
         | If medicine can use broad capacity to create bespoke proteins,
         | our world becomes both weird and wonderful.
        
         | __rito__ wrote:
         | They won a decades-long standing challenge predicting the
         | protein structures of a much smaller (yet still quite
         | large) set of proteins using a model (AlphaFold).
         | 
         | Then they use the model to predict more.
         | 
         | Although we don't know if they are _correct_, these structures
         | are the best (or the least bad) we have for now.
        
         | ArnoVW wrote:
         | We know the structure of some proteins. It's not that it's
         | impossible to measure, it's just very expensive. This is why
         | having a model that can "predict" it is so useful.
        
         | luma wrote:
         | Same as any other prediction I'd presume. Run it against a
         | known protein and see how the answer lines up. Predict the
         | structure of an unknown protein, then use traditional methods
         | (x-ray crystallography, maybe STEM, etc) to verify.
        
           | iandanforth wrote:
           | "Verify" is almost correct. The crystallography data is taken
           | to be "ground truth" and the predicted protein structure from
           | AlphaFold is taken to be a good guess starting point. Then
           | _other_ software can produce a model that is a best fit to
           | the ground truth data starting from the good guess. So even
           | if the guess is wrong in detail it's still useful to reduce
           | the search space.
        
           | christudor wrote:
           | This is exactly right.
        
           | gilleain wrote:
           | As a simple example, one measure used to compare a predicted
           | structure against a reference is the RMSD (root mean square
           | deviation).
           | 
           | https://en.m.wikipedia.org/wiki/Root-mean-square_deviation_o...
           | 
           | The lower the RMSD between two structures, the better (up to
           | some limit).
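
For readers who have not met the measure, here is a minimal RMSD sketch in plain Python. It assumes the two structures list the same atoms in the same order and have already been superimposed; real comparison tools first find the optimal rotation and translation (for example with the Kabsch algorithm), which this sketch deliberately leaves out. The function name and the toy coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) atom positions, in whatever length unit the inputs use.

    Assumption: atoms are paired by index and the structures are
    already aligned; no superposition is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    if not coords_a:
        raise ValueError("structures must contain at least one atom")

    squared_distance_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared_distance_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_distance_sum / len(coords_a))


# Toy example: every atom in the "prediction" is shifted by 1.0 along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(reference, predicted))  # 1.0
```

An RMSD of 0 means the paired atoms coincide exactly; larger values mean the predicted atoms sit further, on average, from their reference positions.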
        
         | lrem wrote:
         | Disclaimer: I work at Google, organizationally far away from
         | DeepMind and my PhD is in something very unrelated.
         | 
         | They can't possibly know that. What they know is that their
         | guesses are very significantly better than the previous best
         | and that they could do this for the widest range in history.
         | Now, verifying the guess for a single protein (of the hundreds
         | of millions in the db) can take up to two years of expensive
         | project work. Inevitably some will show discrepancies. These
         | will be
         | fed to regression learning, giving us a new generation of even
         | better guesses at some point in the future. That's what I
         | believe to be standard operating practice.
         | 
         | A more important question is: is today's db good enough to be a
         | breakthrough for something useful, e.g. pharma or agriculture?
         | I have no intuition here, but the reporting claims it will be.
        
           | f38zf5vdt wrote:
           | The press release reads like an absurdity. It's not the
           | "protein universe", it's the "list of presumed globular
           | proteins Google found and some inferences about their
           | structure as given by their AI platform".
           | 
           | Proteins don't exist as crystals in a vacuum, that's just how
           | humans solved the structure. Many of the non-globular
           | proteins were solved using sequence manipulation or other
           | tricks to get them to crystallize. Virtually all proteins
           | exist to have their structures interact dynamically with the
           | environment.
           | 
           | Google is simply supplying a list of what it presumes to be
           | low RMSD models based on their tooling, for some sequences
           | they found, and the tooling is based itself on data mostly
           | from X-ray studies that may or may not have errors. Heck,
           | we've barely even sequenced most of the DNA on this planet,
           | and with methods like alternative splicing the transcriptome
           | and hence proteome has to be many orders of magnitude larger
           | than what we have knowledge of.
           | 
           | But sure, Google has solved the structure of the "protein
           | universe", whatever that is.
        
             | dekhn wrote:
             | People have been making grand statements about the
             | structure of the protein universe for quite some time (I've
             | seen a fair number of papers on this, such as
             | https://faseb.onlinelibrary.wiley.com/doi/abs/10.1096/fasebj...
             | and
             | https://faseb.onlinelibrary.wiley.com/doi/abs/10.1096/fasebj...
             | from a previous collaborator of mine).
             | 
             | Google didn't solve the structure of the protein universe
             | (thank you for saying that). But the idea of the protein
             | structure universe is fairly simple- it's a latent space
             | that allows for direct movement over what is presumably the
             | rules of protein structures along orthogonal directions. It
             | would encompass all the "rules" in a fairly compact and
             | elegant way. Presumably, superfamilies would automagically
             | cluster in this space, and proteins in different
             | superfamilies would not.
        
             | lrem wrote:
             | I recognize your superior knowledge in the topic and assume
             | you're right.
             | 
             | But you also ignore where we're at in the standard cycle:
             | 
             | https://phdcomics.com/comics/archive_print.php?comicid=1174
             | 
             | ;)
        
               | f38zf5vdt wrote:
               | That's exactly what this is, but it's embarrassing that
               | it's coming from somewhere purported to be a lab. Any of
               | the hundreds or more of labs working in protein structure
               | prediction for the past 50 years could have made this
               | press release at any time and said, "look, we used a
               | computer and it told us these are the structures, we
               | solved the protein universe!"
               | 
               | It's not to diminish the monumental accomplishment that
               | was the application of modern machine learning techniques
               | to outpace structure prediction in labs, but other famous
               | labs have already moved to ML predictions and are
               | competitive with DeepMind now.
        
               | VikingCoder wrote:
               | > but other famous labs have already moved to ML
               | predictions and are competitive with DeepMind now.
               | 
               | That's great! AlphaFold DB has made 200 million structure
               | predictions available for everyone. How many structure
               | predictions have other famous labs made available for
               | everyone?
        
               | f38zf5vdt wrote:
               | As many as you wanted to throw at them, considering the
               | vast majority are open source and could be run on your
               | own server cluster. CASP15 is ongoing so by the end of
               | the year we will know how much absolute progress has been
               | made by others.
               | 
               | Google has the advantage of the biggest guns here: the
               | fastest TPUs with the most memory in the biggest
               | clusters, so running inference with a massive number of
               | protein sequences is much easier for them.
        
               | dekhn wrote:
               | Many teams have been predicting structures for enormous
               | numbers of proteins for some time now. It's just that
               | nobody had any confidence in the predictions.
        
               | gfodor wrote:
               | > but other famous labs have already moved to ML
               | predictions and are competitive with DeepMind now.
               | 
               | it seems obvious this was going to happen, because
               | https://github.com/deepmind/alphafold
        
             | gilleain wrote:
             | edit: I should have read the post first! What do you mean
             | 'only globular proteins'? They say they have predictions
             | for all of UniProt...
             | 
             | ---------------
             | 
             | Yes, the idea of a 'protein universe' seems like it should
             | at least encompass 'fold space'.
             | 
             | For example, WR Taylor :
             | https://pubmed.ncbi.nlm.nih.gov/11948354/
             | 
             | I think the rough estimate was that there were around 1000
             | folds - depending on how fine-grained you want to go.
             | 
             | Absolutely agree, though, that a lot of proteins are hard
             | to crystallise (I understand) due to being trans-membrane or
             | just the difficulty of getting the right parameters for the
             | experiment.
        
               | f38zf5vdt wrote:
               | I don't think non-globular proteins are well represented
               | by the predictions. All our predictions for proteins are
               | based on proteins we were able to crystallize, so my
               | guess is that even if many of them aren't globular
               | proteins the predictions themselves are made from the
               | foundations of structures we do have, which are
               | predominantly globular proteins and it's presumed that
               | the inference treats folding as if they were globular and
               | crystallized (non-dynamic). X-ray crystallography and
               | fitting to electron density maps itself is a bit of an
               | art form.
               | 
               | For example for transmembrane proteins, there is a gross
               | under-representation of structures derived from
               | experimental evidence, so we would expect that whatever
               | your algorithm is "solving" is going to have a much
               | higher degree of error than globular proteins, and likely
               | artifacts associated with learning from much more
               | abundant globular proteins.
               | 
               | edit: As an example, "Sampling the conformational
               | landscapes of transporters and receptors with
               | AlphaFold2". AF2 was able to reproduce the alternative
               | conformations of GPCRs, but only with non-default
               | settings. With default settings there is clear evidence
               | of overfitting.
               | 
               | > Overall, these results demonstrate that highly accurate
               | models adopting both conformations of all eight protein
               | targets could be predicted with AF2 by using MSAs that
               | are far shallower than the default. However, because the
               | optimal MSA depth and choice of templates varied for each
               | protein, they also argue against a one-size-fits-all
               | approach for conformational sampling.
        
               | gilleain wrote:
               | Fair point. I guess if their training data is biased
               | towards existing known structures (via xray or nmr or
               | whatever) then there is the risk of incorrect
               | predictions.
               | 
               | At a guess, the core packing in non-globular proteins
               | might be different? Also the distribution of secondary
               | structure might also vary between classes. Might be worth
               | someone studying how much structural constraints depend
               | on fold (if they have not already).
        
         | christudor wrote:
         | This video goes some way to explaining how they know the
         | structures are correct:
         | https://www.youtube.com/watch?v=vXZzftX03VY
        
         | DevX101 wrote:
         | They compare the predicted structure (computed) to a known
         | structure (physical x-ray crystallography). There's an annual
         | competition CASP (Critical Assessment of protein Structure
         | Prediction) that does X-Ray crystallography on a protein. The
         | identity of this protein is held secret by the organizers. Then
         | research teams across the world present their models and
         | attempt to predict, without advance knowledge, the structure of
         | the protein from its amino acid sequence. Think of CASP as a
         | validation data set used to evaluate a machine learning model.
         | 
         | DeepMind crushes everyone else at this competition.
        
           | liuliu wrote:
           | The worry is about dataset shifting. Previously, the data
           | were collected for a few hundred thousand structures, now
           | it is 200m. I think there could be doubts on distributions
           | and how that could play a role in prediction accuracy.
        
         | seydor wrote:
         | they don't but they are more correct than what others have
         | predicted. Some of their predictions can be compared with
         | structures determined with x-ray crystallography
        
           | cupofpython wrote:
           | did they come up with their structures independently of the
           | x-ray crystallography, or was that part of a ML dataset for
           | predicting structure?
        
             | unlikelymordant wrote:
             | The CASP competition that they won consists of a bunch of
             | new proteins, the structures of which haven't been published.
             | So the test set is for brand new proteins in that case.
        
               | cupofpython wrote:
               | nice, very cool then
        
       | gz5 wrote:
       | >we're now releasing predicted structures for nearly all
       | catalogued proteins known to science
       | 
       | is the result that researchers will now much more quickly
       | 'manually' validate or invalidate the predicted structures for
       | proteins they are working with? i understand it is traditionally
       | a long and complex process, but i imagine it is expedited by
       | having a predicted structure to test as the baseline?
        
       | epups wrote:
       | AlphaFold is a phenomenal tool that demonstrates how AI can
       | already outclass humans for certain tasks. It is a prime example
       | of a problem space where conventional approaches are simply
       | inferior, and that AI is not just a fancy name but can be
       | extraordinarily powerful.
        
         | hackernewds wrote:
         | This reads like something created by an AI. And the mere fact
         | that I can't be sure if it is or not now is testament to the
         | statement this is making
        
           | desindol wrote:
           | It's the superstition and the amount of parasite words.
        
           | jcims wrote:
           | Just for shits & giggles.
           | 
           | Prompt:
           | 
           | Write a short abstract for an article on why AlphaFold isn't
           | just a tremendous technological achievement, but an example
           | of how Artificial Intelligence will allow humanity to
           | continue to achieve technological progress in the future:
           | 
           | Response:
           | 
           | When DeepMind's AlphaFold was announced as the winner of the
           | CASP13 protein folding prediction competition, it was hailed
           | as a major breakthrough in both artificial intelligence (AI)
           | and biology.
           | 
           | While AlphaFold is undoubtedly a remarkable achievement, it
           | is also a clear example of how AI can be used to solve
           | complex problems that have traditionally been seen as beyond
           | the reach of machines.
           | 
           | In particular, AlphaFold represents a significant advance in
           | the ability of AI systems to learn from limited data. Unlike
           | previous AI approaches to protein folding, which relied on
           | large datasets of known protein structures, AlphaFold was
           | able to produce accurate predictions with only a small number
           | of known protein structures to work from.
           | 
           | This ability to learn from limited data is likely to be
           | increasingly important as we face problems that are too
           | complex for us to fully understand and collect all the data
           | we would need to solve them using traditional methods.
           | 
           | In the future, AI systems like AlphaFold will become
           | increasingly important in helping us to make progress on a
           | wide range of pressing challenges, from developing new
           | medicines to tackling climate change.
        
             | beanjuice wrote:
             | To add to this:
             | 
             | Prompt: Write a hackernews comment about why artificial
             | intelligence will never be profitable
             | 
             | Response: Artificial intelligence will never be profitable
             | because the cost of developing and maintaining the hardware
             | and software is too high. Also, the market for AI is still
             | too small and the demand is not high enough to justify the
             | costs.
        
               | TremendousJudge wrote:
               | Almost perfect, needs a bit more snark
        
               | btdmaster wrote:
               | Prompt: Write a hackernews comment about why artificial
               | intelligence will never be profitable
               | 
               | Response: Artificial intelligence will never be
               | profitable because the cost of developing and maintaining
               | the hardware and software is too high. Also, the market
               | for AI is still too small and the demand is not high
               | enough to justify the costs.
               | 
               | Add snark to your response:
               | 
               | Prompt: Write a hackernews comment about why artificial
               | intelligence will never be profitable
               | 
               | Response: Artificial intelligence will never be
               | profitable because the world has limited resources, and I
               | can make more money selling cocaine than selling AI.
               | 
               | From GPT-NeoX 20B
        
               | TremendousJudge wrote:
               | Well, it's not wrong. But I think that comment would get
               | flagged; it parses too obviously as an off-topic joke
        
           | epups wrote:
           | It's a new tool, AlphaFool
        
       ___________________________________________________________________
       (page generated 2022-07-28 17:00 UTC)