[HN Gopher] Learning to think critically about machine learning
___________________________________________________________________

Learning to think critically about machine learning

Author : SleekEagle
Score  : 99 points
Date   : 2022-05-02 15:13 UTC (7 hours ago)

(HTM) web link (news.mit.edu)
(TXT) w3m dump (news.mit.edu)

| xyzzy21 wrote:
| It would also be nice to remove the "magical thinking" around machine learning. It's mathematically related to all prior signal processing techniques (mostly a proper superset), but it also has fundamental limits that no one talks about seriously. ML et al. are NOT MAGIC, but they are treated as if they were.
|
| And that is in itself a dangerous moral and ethical lapse.
| vincentmarle wrote:
| When you have a complex system that produces nth-order effects, then the only approach is to treat it as an empirical phenomenon (aka black-box magic), and that is what most research papers in this field do.
| throwawaygh wrote:
| In the 80s and 90s it was really common to anthropomorphize spaghetti code.
|
| Just because something is difficult to analyze doesn't mean it has limitless power.
| ravi-delia wrote:
| Maybe this is just my soft, theory-laden pure math brain talking, but I'd be a lot less impressed with machine learning if we had a decent formal understanding of these models. As is, they're way weirder than I think most engineering types give them credit for. But then again, that's how I feel about a lot of applied stuff; it all feels a little magic. I can read the papers, I can mess around with it, but somehow it's still surprising how well it can work.
| SleekEagle wrote:
| Ultimately it comes down to gradient-based descent (which is pretty magical in its own right), but what's most surprising to me is that the loss landscape is actually organized enough to yield impressive results. Obviously the difficulties of training large NNs are well documented, but I'm surprised it's even that easy.
| nonrandomstring wrote:
| > It would also be nice to remove the "magical thinking" around machine learning.
|
| To be honest, it would be a morally and ethically less dangerous world if we could get our feet back on the ground in relation to digital technologies in general.
|
| > fundamental limits that no one talks about seriously.
|
| I am starting to touch and stumble into the invisible cultural walls that I think make people "afraid" to talk about limitations. I am not yet done analysing that, but I suspect it has something to do with the maxim that people are reluctant to question things on which their salary depends. That seems to be a difference between "scientists" and "hackers" in some way.
|
| Going back to Hal Abelson's philosophy, "magic" _is_ a legitimate mechanism in coding, because we _suppose_ that something is possible, and by an inductive/deductive interplay (abduction) we create the conditions for the magic to be true.
|
| The danger comes when that "trick" (which is really one of Faith) is mixed with ignorance and monomaniacal fervour, and so inflated into a general philosophy about technology.
| time_to_smile wrote:
| > suspect it has something to do with the maxim that people are reluctant to question things on which their salary depends.
|
| I once worked on a team that spent a lot of time building models to optimize parts of the app for user behavior (trying to intentionally remain vague for anonymity reasons).
| Through an easy experiment I ran, I ended up (accidentally) demonstrating that the majority of the DS work was adding no more than minimal improvements, and so little monetary value that it did not justify any of the time spent on it.
|
| I was let go not long after this, despite having helped lead the team to record revenues by using a simple model (which ultimately was what proved the futility of much of the work the team did).
|
| Just a word of caution as you
|
| > start to touch and stumble into the invisible cultural walls that I think make people "afraid" to talk about limitations
| hotpotamus wrote:
| It's long been my suspicion that much of tech is just throwing more and more effort into ever-diminishing returns, and I think a lot of us at least feel that too. But the pay is good and you don't have to dig ditches, so what are you going to do?
| nonrandomstring wrote:
| Good story. I guess you were done with your work there. Sometimes teams/places have a way of naturally helping us move to the next stage.
|
| Competences work at multiple levels, visible and invisible. Being good at your job. Showing you're good at your job. Believing in your job. Getting other people to believe in your job. Getting other people to believe that you believe in your job... and so on _ad absurdum_. Once one part of that slips, the whole game can unravel fast.
| arcticfox wrote:
| > ML et al. are NOT MAGIC, but they are treated as if they were.
|
| They're not magic - nothing is, but what _are_ they?
|
| > but it also has fundamental limits that no one talks about seriously
|
| What are these fundamental limits? 20 years ago I imagine skeptics in your camp would have set these "fundamental" limits lower than DALL-E 2, GPT-3, AlphaStar etc. Or are you talking about limits today? In which case, sure, but I think "fundamental" is the wrong word to use there given they change continuously.
|
| > It's mathematically related to all prior signal processing techniques (mostly a proper superset)
|
| And human brains are what, if not signal processing machines?
| amelius wrote:
| > They're not magic - nothing is, but what are they?
|
| Emergent magic.
| woopwoop wrote:
| Honestly, at this point it kind of is magic. These things are knocking out astonishing novel tasks every month, but the state of our knowledge is "why does SGD even work lol". There is no coherent theory.
| srean wrote:
| > "why does SGD even work lol"
|
| I find this hand a little overplayed.
|
| It depends on the degree of fidelity we demand of the answer and how deep we want to go questioning the layers of answers. However, if one is happy with a LOL CATS fidelity, which suffices in many cases, we do have a good enough understanding of SGD -- change the parameters slightly in the direction that makes the system work a little bit better, rinse and repeat.
|
| No one would be astonished that using such a system leads to better parameter settings than one's starting point, or at least not significantly worse ones.
|
| It's only when we ask more questions, and deeper questions, that we get to "we do not understand why SGD works so astonishingly well".
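srean's "rinse and repeat" description above is easy to make concrete. Below is a minimal sketch of that loop in plain NumPy, using mini-batch gradient steps to fit a noisy line; the data, learning rate, and step count are made up purely for illustration.

    # Minimal sketch of stochastic gradient descent: nudge the parameters a
    # little in the direction that reduces the loss, rinse and repeat.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, size=256)
    y = 3.0 * x - 0.5 + rng.normal(scale=0.1, size=x.shape)  # noisy line

    w, b = 0.0, 0.0          # parameters to learn
    lr = 0.1                 # step size
    for step in range(200):
        i = rng.integers(0, len(x), size=32)       # a random mini-batch
        err = (w * x[i] + b) - y[i]                # prediction error
        grad_w = 2 * np.mean(err * x[i])           # d(loss)/dw on the batch
        grad_b = 2 * np.mean(err)                  # d(loss)/db on the batch
        w -= lr * grad_w                           # small step "downhill"
        b -= lr * grad_b

    print(round(w, 2), round(b, 2))   # should land near 3.0 and -0.5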
| woopwoop wrote:
| Yeah, I didn't mean to imply that "why does SGD result in lower training loss than the initial weights?" is an open question. But I don't think even lolcatz would call that a sufficient explanation. After all, if the only criterion is "improves on initial training loss", you could just try random weights and pick the best one. The non-convexity makes SGD already pretty mysterious, and that is without even getting into the generalization performance, which seems to imply that somehow SGD is implicitly regularizing.
| srean wrote:
| I don't disagree, except perhaps with the lolcatz's demand for rigour. _Improve with small and simple steps till you can't_ is not a bad idea after all.
|
| BTW, your randomized algorithm with a minor tweak is surprisingly (unbelievably) effective -- randomize the weights of the hidden layers, then do gradient descent on just the final layer. Note the loss is even convex in the last-layer weights if a matching/canonical activation function is used. In fact you don't even have to try different random choices, but of course that would help. The random kitchen sink line of results is a more recent heir to this line of work.
|
| I suspect that you already know this -- and the fact that the noise in SGD does indeed regularize, and the way it does so for convex functions, has been well understood since the 70s -- so I am leaving this tidbit for others who are new to this area.
| Filligree wrote:
| Why are there so few local minima, you mean?
|
| I think it'd have to be related to the huge number of dimensions it works on. But I have no idea how I'd even begin to prove that.
| srean wrote:
| It's not even certain that they are few. What's rather unsettling is that with these local moves of SGD the parameters settle on a good enough local minimum in spite of the fact that we know many local minima exist that have zero or near-zero training loss. There are glimmers of insight here and there, but the thing is yet to be fully understood.
| SemanticStrengh wrote:
| No, neural networks are stagnant on most key NLP tasks. While there have been some advances on cool tasks, the tasks needed for NLU remain firmly in winter.
| 300bps wrote:
| _Honestly at this point it kind of is magic._
|
| How much of that magic is smoke and mirrors? For example, the First Tech Challenge (from FIRST Robotics) used TensorFlow to train a library to detect the difference between a white sphere and a golden cube using a mobile phone's on-board camera.
|
| The first time I saw it, it did seem pretty magical. Then in testing I realized it was basically a glorified color sensor.
|
| I think these things make for great and astonishing demos but don't hold up to their promise. Happy to hear real-world examples that I can look into, though.
| woopwoop wrote:
| Even if it were practically useless (which it is not, although the practical applications are less impressive than the research achievements at this point), it would be magical. Deep learning has dominated ImageNet for a decade now, for example. One reason this is magical is that the SOTA models are extremely overparametrized. There exist weights that perform perfectly on the training data but give random answers on the test data [0]. But in practice these degenerate weights are not found during SGD. What's going on there? As far as I know there is no satisfying explanation.
|
| [0] https://arxiv.org/abs/1611.03530
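A small sketch of the random-feature trick srean describes above: freeze a random hidden layer and fit only the final linear layer, which is a convex problem. Plain NumPy; the toy target function, layer width, and ridge penalty are all made up for illustration.

    # "Random kitchen sink" style fit: random, untrained hidden ReLU features,
    # then a closed-form ridge regression on the last layer only.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0])                      # toy target to approximate

    W = rng.normal(size=(1, 200))            # random hidden weights (never trained)
    b = rng.uniform(-3, 3, size=200)
    H = np.maximum(0, X @ W + b)             # fixed random ReLU features

    lam = 1e-3                               # ridge penalty keeps the solve stable
    beta = np.linalg.solve(H.T @ H + lam * np.eye(200), H.T @ y)

    pred = H @ beta
    print("train MSE:", float(np.mean((pred - y) ** 2)))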
| wgd wrote:
| I mentored an FTC team that was using the vision system this year, and my overall impression was that the TensorFlow model was absolute garbage and probably performed worse than a simple "identify blobs by color" algorithm would have.
|
| The vision model was tolerably decent at tracking incremental updates to object positioning, but for some reason it would take 2+ seconds to notice that a valid object was now in view (which is quite a lot in the context of a 30s autonomous period), and it frequently identified the back walls of the game field as giant cubes.
| dekhn wrote:
| There's a big difference between a glorified color sensor and a well-trained deep learning library (I can say this with authority because I hired an intern at Google to help build one of those detectors). It's still not magic, but a well-trained network is robust and generalizable in a way that a color sensor cannot be.
| SemanticStrengh wrote:
| NNs are just glorified logistic regression. People should simply understand that neural networks cannot emulate a dumb calculator accurately; this simple fact is enough to realize that being a universal approximator is in practice a fallacy, and that true causal NLU or AGI is essentially out of reach of neural networks, by design. Only a brain-faithful architecture would have hope; however, C. elegans reverse engineering is underfunded and spiking neural networks are untrainable.
| mpfundstein wrote:
| Knock knock. Some critic from the 70s arrived. How's GOFAI going?
| SemanticStrengh wrote:
| Oh yes, it's not GOFAI that has won the ARC challenge, it's neural networks, right? Right?
| https://www.kaggle.com/c/abstraction-and-reasoning-challenge
|
| I have more expertise in deep learning than anyone else here, and the delusions of the incoming transformer winter will be painful to watch. In the meantime, enjoy your echo chamber.
| nuclearnice1 wrote:
| > the delusions of the incoming transformer winter will be painful to watch
|
| Meaning?
| SemanticStrengh wrote:
| Meaning that HN in ten years will mock current HN.
| Der_Einzige wrote:
| Using gradient-based techniques does a LOT to force neural network weights to resemble surfaces that they do not at all look like when using global optimization and gradient-free techniques to optimize them.
|
| Most of the stupid crap that people cite about degenerate cases where deep learning doesn't work (cartpole in reinforcement learning, sine/infinite unbounded functions) showcases how bad gradient-based training is - not how bad deep learning is at solving these problems. I can solve cartpole within seconds with neural networks using neuroevolution of weights....
| [deleted]
| visarga wrote:
| > NNs are just glorified logistic regression.
|
| 2015 called, they want you back! Now seriously, "just" does an amazing amount of work for you. How do you "just" make logistic regression write articles on politics, convert queries into SQL statements, or draw a daikon radish in a tutu?
|
| Humans are "just" chemistry and electricity, and the whole universe just a few types of forces and particles. But that doesn't explain our complexity at all.
| SemanticStrengh wrote:
| Neural networks do achieve impressive things, but they also fail at essential things, and that precludes them from any AGI or causal NLU ambition -- for example, they are unable to approximate a dumb calculator without significant accuracy loss.
| Filligree wrote:
| _I_ can't approximate a dumb calculator without significant accuracy loss. Not without emulating symbolic computation, which current AI is perfectly capable of doing if you ask it the right way.
|
| Whatever makes you think it's necessary for AGI, when we don't have it?
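Der_Einzige's neuroevolution point a few comments above is easy to try at small scale. A rough sketch, assuming the gymnasium package is installed: gradient-free random hill-climbing over the weights of a linear CartPole policy, a crude stand-in for full neuroevolution.

    # Solve CartPole without gradients: mutate policy weights, keep the better ones.
    import numpy as np
    import gymnasium as gym

    def episode_return(env, w):
        obs, _ = env.reset(seed=0)            # same seed, so candidates are comparable
        total, done = 0.0, False
        while not done:
            action = int(obs @ w > 0)         # threshold a linear policy
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        return total

    env = gym.make("CartPole-v1")
    rng = np.random.default_rng(0)
    best_w = rng.normal(size=4)
    best_r = episode_return(env, best_w)

    for _ in range(200):                      # mutate, keep if better
        cand = best_w + 0.5 * rng.normal(size=4)
        r = episode_return(env, cand)
        if r > best_r:
            best_w, best_r = cand, r

    print("best episode return:", best_r)     # 500 is the max for CartPole-v1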
| SemanticStrengh wrote:
| NNs fail to do anything algorithmic like pathfinding, sorting, etc. The point is not that you already have it, it's that you can acquire it by learning and by using pen and paper. Natural language understanding requires both neural-network-like pattern recognition abilities and advanced algorithmic calculations. Since neural networks are pathetically bad at algorithmics, we need neuro-symbolic software. However, the symbolic part is rigid and program synthesis is exponential. Therefore the brain is the only technology on earth able to dynamically code algorithmic solutions. Neural networks have only solved a subset of the class of automated programs.
| visarga wrote:
| There are about 3,610 results for "neural network pathfinding" in Google Scholar since 2021. Try a search.
| SemanticStrengh wrote:
| And as you can trivially see, it outputs nonsense values: https://www.lovebirb.com/Projects/ANN-Pathfinder?pgid=kqe249... (see last slide), at least in this implementation.
|
| Even if it had 80% accuracy (optimistic), it would still be too mediocre to be used at any serious scale.
| visarga wrote:
| It's a model mismatch, not an inherent impossibility. A calculator needs to have an adaptive number of intermediate steps. Usually our models have fixed depth, but in auto-regressive modelling the tape can become longer as needed by the stepwise algorithm. Recent models show LMs can do arithmetic, symbolic math, and common-sense chain-of-thought step-by-step reasoning, and reach much higher accuracies.
|
| In other words, we too can't do three-digit multiplication in our heads reliably, but can do it much better on paper, step by step. The problem you were mentioning is caused by the bad approach - LMs need intermediate reasoning steps to get from problem to solution, like us. We just need to ask them to produce the whole reasoning chain.
|
| - Chain of Thought Prompting Elicits Reasoning in Large Language Models https://arxiv.org/abs/2201.11903
|
| - Deep Learning for Symbolic Mathematics https://arxiv.org/abs/1912.01412
| drdeca wrote:
| Do you mean that a network _trained_ to imitate a calculator won't do so accurately, or that there is no combination of weights which would produce the behaviors of a calculator?
|
| Because, with ReLU activation, I'm fairly confident that the latter, at least, is possible.
|
| (Where inputs are given using digits (where each digit could be represented with one floating point input), and the output is also represented with digits.)
|
| Like, you can implement a lookup table with a neural net architecture. That's not an issue.
|
| And composing a lookup table with itself a number of times lets one do addition, etc.
|
| ... ok, I suppose for multiplication you would have to, like, use more working space than what would effectively be a convolution, and one might complain that this extra structure of the network is "what is really doing the work", but I don't think it is more complicated than the existing NN architectures?
| SemanticStrengh wrote:
| I am talking about training a neural network to achieve calculations. And yes, look-up tables might be fit for addition but not for multiplication. The accuracy would be <90%, which is a joke for any serious use.
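drdeca's lookup-table argument above can be made concrete. A toy sketch in plain NumPy: a two-layer ReLU network whose weights are written by hand (not trained) so that it adds two decimal digits exactly; the one-hot encoding and layer sizes are chosen purely for illustration.

    # Hand-built weights: ReLU units act as AND gates over digit one-hots,
    # and the output layer routes each (a, b) pair to the slot for a + b.
    import numpy as np

    def one_hot(i, n):
        v = np.zeros(n)
        v[i] = 1.0
        return v

    # Hidden layer: one unit per (a, b) pair; fires only when both digits match.
    W1 = np.zeros((100, 20))
    b1 = -np.ones(100)
    for a in range(10):
        for b in range(10):
            W1[a * 10 + b, a] = 1.0        # looks at the one-hot of the first digit
            W1[a * 10 + b, 10 + b] = 1.0   # and the one-hot of the second digit

    # Output layer: sums 0..18, one slot each.
    W2 = np.zeros((19, 100))
    for a in range(10):
        for b in range(10):
            W2[a + b, a * 10 + b] = 1.0

    def add_digits(a, b):
        x = np.concatenate([one_hot(a, 10), one_hot(b, 10)])
        h = np.maximum(0, W1 @ x + b1)     # ReLU(x_a + x_b - 1) == AND gate
        return int(np.argmax(W2 @ h))

    print(add_digits(7, 8))                # -> 15, exactly, for every digit pair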
| wolverine876 wrote:
| People will (and I'm sure do) use this magical thinking politically, persuading people to trust the computer and therefore, unwittingly, trust the persons who control the computer. That, to me, is the greatest threat - it is an obvious way to grab power, and most people I know don't even question it. It's a major consequence of mass public surveillance.
| heavyset_go wrote:
| Bureaucracies would love a black box to delegate all of their decisions and responsibilities to, in an effort to shift liability away from themselves.
|
| You can't be liable for anything: you were just doing what the computer told you to do, and computers aren't fallible like people are.
| godelski wrote:
| I think this is a common problem, and it comes about because we stressed how these models are not interpretable. It is kinda like talking about Schrodinger's cat. After a game of telephone, people think the cat is both alive and dead, not that our models can't predict definite outcomes, only probabilities. Similarly with ML, people do not understand that "not interpretable" doesn't mean we can't know anything about the model's decision making, but that we can't know everything the model is choosing to do. Worse though, I think a lot of ML folks themselves don't know a lot of stats and signal processing. Those things just aren't taught in undergrad, and frequently not in grad school either.
| mirntyfirty wrote:
| Along with that, it becomes remarkably more difficult to distinguish causation vs correlation, although I'm sure that point is heavily debated.
| godelski wrote:
| > difficult to distinguish causation vs correlation
|
| I mean, this is an extremely difficult thing to disentangle in the first place. It is very common for people in one breath to recite that correlation does not equate to causation and then in the next breath propose causation. Cliches are cliches because people keep making the error. People really need to understand that developing causal graphs is really difficult, and that there's almost always more than one causal factor (a big sticking point for politics and the politicization of science, to me, is that people think there is one and only one causal factor).
|
| Developing causal models is fucking hard. But there is work in that area in ML. It just isn't as "sexy" because the models aren't as good. The barrier to entry is A LOT higher than for other types of learning, so this prevents a lot of people from pursuing this area. But still, it is a necessary condition if we're ever going to develop AGI. It's probably better to judge how close we are to AGI by causal learning than by something like Dall-E. But most people aren't aware of this because they aren't in the weeds.
|
| I should also mention that causal learning doesn't necessitate that we can understand the causal relationships within our model, just in the data. So our model wouldn't be interpretable, although it could interpret the data and form causal DAGs.
| bell-cot wrote:
| In my wishful thinking, by far the best way to do that would be for the courts to stick companies with full legal liability for the shortcomings of their "machine learning" systems. And if it's fairly easily demonstrated that GiantCo's ML decision-making system is sexist, racist, ageist... then GiantCo is not just guilty, but also presumed to have known in advance that they were systematically and deliberately on the wrong side of the law.
| Jenk wrote:
| Supplant "magic" with "not understood".
|
| Suddenly it becomes a lot more palatable that many don't know how it works.
| photochemsyn wrote:
| ML is really cool technology with incredible applications and potential. For example, biological genomes can now be sequenced relatively easily, but finding the actual protein-coding sequences hidden in these massive genome sequences can be difficult. ML provides some novel approaches to this problem, and can potentially mark up these genomes and even classify the probable structure/function of the resulting proteins. Amazing stuff, really.
|
| However, anyone who thinks this tech couldn't go off the rails in the hands of nefarious actors should go read Ed Black's "IBM and the Holocaust" or Josef Teboho Ansorge's "Identify and Sort".
| noasaservice wrote:
| Perhaps this is a hot take, but when I see ML being used (for the most part), it's because people do not fundamentally understand OR know how to solve the problem in question.
|
| And instead of understanding what's going on and solving the problem efficiently, it's "throw GBs or TBs of data at an algo that we tweak and hope for the best".
|
| Sure, it gets results, for massive processing times and data. And sure, it's "fuzzy" and can fail on really stupid stuff and provide bad answers confidently. But it'll get the next VC funding line, won't it?
| axg11 wrote:
| Let's look at a few of the most impressive applications of machine learning:
|
| - Classification in computer vision
|
| - Protein folding (AlphaFold)
|
| - Image generation (Dall-E 2)
|
| - Answering general language queries (GPT-3)
|
| It's unclear how any of these applications could have been tackled _without_ machine learning. We had no solid grasp on these problems before ML, with the exception of protein folding. I think you are being cynical. Of course not every ML project will result in a home-run success. Should that make us sceptical of the entire field?
| Jensson wrote:
| But most people using neural nets in industry aren't working on those things. They just slap them on any data inference task, even though in most cases traditional statistical/data science models work much better.
| PartiallyTyped wrote:
| How would you solve problems like Q/A? How would you solve problems in RL scenarios? How would you solve image recognition problems?
|
| In the end, ML is not that different from what we are doing with all models, i.e. use a set of data to create a crude model of the process that we are trying to learn/figure out.
| jimbokun wrote:
| I'm not sure engineering types will be satisfied with these kinds of conclusions:
|
| > "It is not someone else's job to figure out the why or what happens when things go wrong. It is all of our responsibility and we can all be equipped to do it. Let's get used to that. Let's build up that muscle of being able to pause and ask those tough questions, even if we can't identify a single answer at the end of a problem set," Kaiser says.
|
| Hopefully the actual course content is more concrete than this. But this kind of language strikes me as encouraging people to feel the "correct" way about a problem, without really emphasizing coming up with concrete, actionable solutions.
|
| And without actionable solutions, I feel the value of this content would be very limited.
| SemanticStrengh wrote:
| I would prefer machine learning to learn critical thinking.
| oxff wrote:
| It is just a super massive graph optimization; don't get confused about it like the OpenAI guys or whoever thinks matrix multiplication is achieving consciousness.
| SleekEagle wrote:
| Why are the two necessarily unrelated? Can human beings just be considered to be learning via optimization, and perhaps consciousness is an emergent property of an agent with a large enough world model, or a world model that includes the agent itself?
|
| While I don't think a majority really thinks current systems are conscious, SOTA results are absolutely astounding (check out DALL-E 2 if you haven't seen it already). Whether or not an agent is conscious doesn't really matter from a practical standpoint (but obviously it does from a moral one) in the long run - it is intelligence that matters with these agents, and they're getting absurdly more intelligent by the half-decade.
| SemanticStrengh wrote:
| We are in an AGI winter in NLU. Cool OpenAI demos are cool and irrelevant. The HN crowd should really learn to go look at the leaderboards for themselves on paperswithcode.com; then they would realize the reality: we are stagnant on the key basic tasks (e.g. coreference resolution). GPT-3 is just a subtle bullshit generator that pushes to its paro(t)xysm the illusion of understanding that mere amalgamation of collocation statistics provides; Dall-E 2, on the other hand, is very impressive, but it does not advance the key NLU tasks and just shows how far a smart trick (contrastive learning) can go before plateauing.
|
| The idea that consciousness emerges in proportion to the accuracy of your isomorphic mental world representation is cute, however we don't become more conscious by becoming more erudite, and the most intense magical qualia, such as orgasms, are accessible to the simplest mammals and are unrelated to activity in the higher cognitive regions of the brain. Even a newborn that has no understanding of its surroundings experiences qualia.
| johnsimer wrote:
| You can make the argument that stagnant progress isn't actually a lack of progress, when it comes to AI. Kilcher and Karpathy recently had a video where they discussed how some new model (PaLM or Dall-E 2, I forget which) showed zero progress during X thousand training cycles, and then suddenly rapid progress after those training cycles. It was as if the model was spending thousands of training cycles on grokking the concept, and then finally grokked it. It could simply be that as we continue to increase the number of parameters and the data quality of these models, we will continue to see progress on the route to AGI as a whole, but only in step-change functions that require many training cycles.
| iMage wrote:
| Out of curiosity, what was the video?
| SemanticStrengh wrote:
| How many more parameters do you need? PaLM is 530 billion parameters and underperforms on NLP tasks vs XLNet (300 million); as such, very large language models are extreme failures. They do not improve the state of the art once you have proper datasets and do full-shot learning, and I'm not even talking about fine-tuning.
|
| Very large language models hide from the layman that they are the most gigantic failure in NLP ever, by showing that they improve the state of the art only in zero- or few-shot learning. Who cares? This is so cringe.
| Full-size learning is what matters most, and even full-size learning does not yield satisfying accuracy on most NLP tasks (but close enough). Therefore the only use of PaLM is to get mediocre (70-80%) accuracy, which is better than the previous SOTA, only for tasks that have no good-quality existing datasets. And 530 billion is close to the max we can realistically achieve; it already costs ~10 million in hardware and underperforms a 300-million-parameter model in full-size learning (e.g. dependency parsing, word sense disambiguation, coreference resolution, NER, etc.).
|
| It's crazy people don't realize this gigantic failure, but as always it's because they don't care enough.
| oneoff786 wrote:
| Feelings aren't an emergent property of intelligence, with intelligence being defined as some nth derivative of optimization capability. So I think no.
|
| Human consciousness isn't just a brain. It's a system of which the brain is a part, occurring through time.
| jasfi wrote:
| For those interested in NLU/AGI, see LxAGI: https://lxagi.com. No demo available just yet.
| lacker wrote:
| It's interesting to look at the actual course content here.
|
| https://ocw.mit.edu/courses/res-tll-008-social-and-ethical-r...
|
| _More generally, what does it mean for a model to be "fair"?_
|
| _LIT Company's Definition of Fairness (Group Unaware): The company believes that a fair process and, therefore, a fair model, would not account for gender or race at all._
|
| _Advocacy Group's Definition (Demographic parity): An advocacy group believes that a model is fair if the distribution of outcomes for each demographic, gender, or other subgroup is the same among those that applied and those that were accepted. For example, in the example above, 30% of the applicants for loan applications come from women. In the demographic parity definition of fairness, this means 30% of the approved loan applications should come from women._
|
| I feel like the course content is somewhat slanted here. It is missing the definition of "fairness" in which you treat race and gender just like any other feature. Many systems work this way in practice - for example, car insurance charges you differently by gender, because the statistics for genders are different. Ad-matching by gender and race is a longstanding practice. And for any new system that you just train from scratch, by default it will not know to treat gender or race differently from anything else.
|
| It is an interesting question, though. The main problems, I think, are practical ones - large enough AI models cannot be "race-blind", because if you remove race as a feature they will be able to infer it anyway from proxy features. Whereas the only real way to enforce that a system achieves the same percentage results for different groups is to add a "quota system" where you explicitly use different thresholds for different groups. So the practical alternatives often become "quota" or "nothing".
| pfortuny wrote:
| The second definition assumes that the law of large numbers applies to any instance. It is impossible to satisfy each and every time. And it may also be blind to inherent inequalities (as insurance companies know).
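The two fairness definitions lacker quotes above are easy to state as code. A minimal sketch of a demographic-parity check on the 30% loan example, using entirely made-up synthetic data (the group labels, scores, and threshold are illustrative only):

    # Compare the share of a group among applicants vs among approvals.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    is_woman = rng.random(n) < 0.30          # 30% of applicants are women
    score = rng.random(n)                    # a model's approval score

    approved = score > 0.5                   # single "group unaware" threshold

    share_of_applicants = is_woman.mean()
    share_of_approvals = is_woman[approved].mean()

    print(f"women among applicants: {share_of_applicants:.1%}")
    print(f"women among approvals:  {share_of_approvals:.1%}")

    # Demographic parity asks these two shares to match. They roughly do here
    # because the synthetic scores are independent of the group; with real data
    # they often will not, and enforcing parity then means group-specific
    # thresholds -- the "quota or nothing" tradeoff lacker describes.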
| andersource wrote:
| Overall agree, although regarding
|
| > large enough AI models cannot be "race-blind", because if you remove race as a feature they will be able to infer it anyway from proxy features
|
| In theory, using a gradient reversal layer and an adversarial classifier you could do just that, to an extent. It could hurt the model's performance, which is exactly your point (should we ignore features with signal because they can be used to discriminate?).
| wolverine876 wrote:
| > It is missing the definition of "fairness" in which you treat race and gender just like any other feature.
|
| The real issue here, of course, is whether they are just like every other feature:
|
| Certainly in our society they are not perceived that way. People perceive very serious issues and have very strong feelings around race and gender. We see that right here on HN, of course.
|
| There is also, of course, a lot of discrimination by humans based on race and gender. If we want an unbiased, fair (and accurate) system, we have to correct for that. And the discrimination creates higher-order effects: if there is discrimination against group X in K-12 education funding, then fewer of X will go to college, and fewer will have higher-paying jobs. If we then select blindly for income, we incorporate that bias (which might be appropriate if studying income by group, but not if we use it as a proxy for intelligence or effort).
|
| > the practical alternatives often become "quota" or "nothing".
|
| Those aren't practical alternatives; they are logical extremes creating a Manichean choice. Those are alternatives for a political debate, not for practical problem-solving.
| RangerScience wrote:
| I'm pretty fascinated by all of this, although I have barely dipped my toes in. IMHO -
|
| "Fairness" is a technical term of art meaning "the outcome _should not_ be affected by inputs X, Y and Z", and the collected science around making a system behave that way. It closely but not quite matches the colloquial meaning, i.e. "that's not fair!" - kinda like how "a fair coin" has a specific meaning that mostly tracks with how people use the word, but not quite, and with _a lot_ more specificity.
|
| It's typically applied when you want to correct for real-world "unfair biases" in the training data, which in practical application typically means race, gender and the other legally protected categories - but AFAIK it is just whichever inputs you decide you want to not have an impact on the outcome.
|
| AFAIK what you get out of the AI/ML "fairness" science is a way to measure, and correct for, dependency on the inputs that you (exterior to the system) have decided that you want to _not_ impact the outcome.
___________________________________________________________________
(page generated 2022-05-02 23:01 UTC)