[HN Gopher] The Gap: Where Machine Learning Education Falls Short
___________________________________________________________________
The Gap: Where Machine Learning Education Falls Short

Author : andreyk
Score  : 43 points
Date   : 2020-10-10 17:21 UTC (5 hours ago)

(HTM) web link (thegradient.pub)
(TXT) w3m dump (thegradient.pub)

| glitchc wrote:
| "Ah I see, your college degree is in Arts."
|
| "Wait, what do you mean you can't sing, or act, or draw? What do
| they teach you in school??!?!?"
| _RPL5_ wrote:
| The author makes a point that I relate to. I've been on the
| receiving end of a couple 'Statistical Learning 101' courses.
| These courses go roughly as he describes it in his blog post:
| they first teach you how to multiply two matrices, then launch
| into linear and logistic regression, then classification via
| clustering, then decision trees and SVMs, then CNNs and deep
| learning. Along the way, they do a lecture or two on
| reinforcement learning and HMMs.
|
| In the end, I ended up with a thin smear of half-baked knowledge
| in my head, where I stop understanding the math once we are half-
| way through the material.
|
| So how do you achieve this level of deep/intuitive understanding:
|
| > Without understanding the mathematical underpinnings of key
| models and techniques in full detail, students aren't able to
| quickly choose the right models for certain scenarios.
|
| Does anyone have a good study plan with MOOCs and so on? If you
| have any practical advice, I would appreciate it!
| orange3xchicken wrote:
| I mean it really depends how deep you want to go. As you
| point out, the classes you took are 101 courses. These are
| really just "tasting" courses. I'm sure if you decided to take
| more advanced/grad numbered courses, or unnumbered "topics"
| courses, you would have a better idea of what's going on.
|
| In general, "having a deep understanding of models & techniques
| in full detail" is not well-defined.
| For example, analysis of
| linear regression is often offered as a full year-long sequence
| for graduate students in math/stats depts. Is this necessary
| for doing linear regression in practice? Not really, but who
| cares - it's interesting stuff in its own right. Most people
| need just enough understanding to finish a job.
|
| In general, the precise medium that you use to study something
| isn't that important as long as it works for you, but there is
| a good reason that there are longstanding classic textbooks
| that people swear by in most fields of mathematics. I do
| strongly feel that, in any mathematical subject, there are
| few substitutes for the grind - doing proofs and
| solving problems on your own.
|
| Okay, but just to have at least one link in my post, I want to
| share this guy MathematicalMonk who used to make really great
| videos on ML-related stuff:
|
| https://www.youtube.com/channel/UCcAtD_VYwcYwVbTdvArsm7w
| Exuma wrote:
| So how about actually giving suggestions of courses or resources
| that fix this, instead of just saying what's broken and then
| printing your name.
| marketingPro wrote:
| Keep reading the article
| lukeplato wrote:
| He proposes changing the syllabus of advanced/graduate-level
| courses to skip reviewing linear classification and backprop
| and make that a pre-requisite.
| shekharshan wrote:
| The author highlights not teaching enough mathematical theory
| behind the various techniques. I have tried Andrew Ng's course on
| Coursera. It uses Octave from what I remember. After a point the
| lack of mathematical background started to show up. I have always
| wondered where I can find a course that teaches both the
| mathematical background and the hands-on programming in a
| balanced way.
| bonoboTP wrote:
| This complaint comes up time and time again. "Universities should
| prepare students better for jobs!" "Teach more real life skills!"
| "Turn CS uni into trade school!"
|
| But a computer science university program is not about scikit-
| learn or TensorFlow! It's about long-lasting principles,
| underlying mathematics, mental models and ways of thinking.
|
| None of my computer science lectures were about how to apply that
| particular part of CS knowledge in some hot new Python library.
| It's expected that there will be some amount of time required to
| adjust to a company's software setup. That's not a big hurdle
| usually.
|
| I'm not saying it should be only theory, though. University
| courses often have accompanying assignments or projects.
| Depending on the country in question, they often offer more
| hands-on, practical courses ("lab courses") as well, where you do
| actually go through the steps of making the theory work in real
| life. I had such courses where we played with microcontrollers
| and FPGAs to understand CPU instructions, assembly and low-level
| C concepts (but even there the goal wasn't to learn exactly the
| thing that you will use on the job. Most CS graduates will never
| need to program FPGAs in their day job.)
|
| But sure, there is a place for even more data engineering
| training, but I don't think it's computer science university
| programs. Where do people like network engineers learn how to
| configure Cisco routers and use whatever config software they
| use? Where do sysadmins learn Bash, Unix, backup management etc?
| Not in university courses. Wherever they learn those skills,
| that's where the data cleaning, parallelization engineering etc
| aspects of machine learning should be taught as well.
| andreyk wrote:
| To be fair, this piece is mostly arguing against just teaching
| TensorFlow or PyTorch and is in favor of more general skills
| (with data engineering being fairly general, though as you
| point out also something that can be taught via concrete
| assignments).
| And as to your last point, that's pretty much the
| conclusion of the piece itself:
|
| "Based on the current state of machine learning courses it is
| clear that AI courses will get you through the door in your
| effort to perform cutting edge research or landing a machine
| learning job, but they won't teach you everything you need to
| know. To fill in the knowledge gaps that remain you will have
| to put in outside effort on your own."
|
| I guess the question is whether the outside effort needs to be
| addressed by universities, or by other resources.
| tracer4201 wrote:
| I'm an engineer who helps quite a bit with hiring interviews.
| In my humble opinion, there's a surprising number of fresh grad
| candidates who are not very skilled with theory nor practice.
|
| It has been many, many years since I was in school. I think
| it's fine that your computer science education focuses on
| fundamental CS concepts and the mathematics so you can easily
| pick up areas that require that math (ML, for example). I do
| think universities can do better. At my school, we had
| mandatory "block" classes in arts and humanities, which, in my
| opinion, offered no value.
|
| To be clear, I'm not saying these subjects aren't valuable. I
| am saying, however, that the quality of these courses was very
| poor, and they could have been substituted entirely with
| classes related more to my discipline.
|
| I remember sitting in some political science classes that were
| part of this required pool. I have no idea what we learned in
| there. As far as I can remember, we read political science
| papers that were very poorly written/extremely inaccessible. It
| was impossible to differentiate the author's personal opinion
| from objective truth of any kind. It was more or less a
| checkbox - I had to have so many credit hours from this pool in
| their curriculum to graduate. Did it force me to think
| critically in some manner? Not at all.
| It actually gave me a
| false sense of how "intelligent people" write.
|
| Yes - all these years later I understand that it wasn't just me
| who couldn't understand those papers - it was half rambling,
| pretentious nonsense. It was the opposite of effective
| communication, and it provided no value.
| bonoboTP wrote:
| > there's a surprising number of fresh grad candidates who
| are not very skilled with theory nor practice.
|
| Is there any "entrance point" where people don't complain
| about this? Companies complain universities don't prepare
| students. Universities complain that first-year students come
| out of high school without the necessary background. High schools
| complain that elementary school does not prepare their
| entrants etc. Elementary schools complain kindergarten
| doesn't prepare kids to be mature enough. Kindergarten
| probably complains that parents don't prepare the kids
| enough.
|
| Broadly speaking, I think there is more demand for highly
| skilled, highly intelligent people than can be produced by
| any given cross-section of the population born in a given
| year. Sure, universities could do better, but beyond a
| certain point teaching doesn't work. There are people who are
| intrinsically motivated and soak up knowledge and seek it out
| in books and online (so much high quality content can be
| found online, especially for CS!), and there are those who
| just coast and do the bare minimum. I don't think you can
| radically improve the outcomes by changing the curriculum.
| jagger27 wrote:
| There's a big gap between the whiteboard in a classroom and a
| blinking terminal cursor. As a teaching assistant a few years
| ago I spent a good chunk of my time in the lab showing
| otherwise brilliant computer science students things like basic
| terminal commands, how to read error messages, and just overall
| basic problem solving skills.
| Almost all of the students I
| worked with were much stronger than me in terms of theory and
| what I call "whiteboard computer science", but many of those
| same students who aced written tests really struggled with
| basic roadblocks like learning language syntax to do practical
| assignments.
|
| It sounds really silly, but some of the best instruction I gave
| was "Tab" to autocomplete a command and "Up Arrow" to re-run
| the last command. Whenever I would do a class demo on the
| projector someone would always stop me to ask how I was running
| commands so quickly and fluidly and how on Earth I could
| remember them all.
| p1esk wrote:
| Someone (Knuth?) said only 2% of the students in any CS
| class are good. I worked as a CS TA as well, and I found
| that to be true. The top 2% of students already had practical
| skills (shell, vim, git, networking, etc), usually because
| they'd been coding since they were 12 and learned by doing
| cool projects. Sometimes I saw bright students without prior
| experience picking up those skills quickly (much quicker than
| I did).
| dunkelheit wrote:
| This complaint has nothing to do with teaching hot new Python
| libraries.
|
| The thing is, data cleaning is no less fundamental than
| backpropagation. Maybe more so - learning algorithms come and
| go, but real-world data is always going to be inherently messy.
| The difference is that we have a beautiful mathematical
| theory for backpropagation but not for data cleaning. So the
| courses that teach the former but not the latter are akin to
| the proverbial drunkard who searches for the lost keys under
| the street light - beautiful mathematical theories are easier
| to lecture on so they teach them instead of messier (but not
| less fundamental or useful) topics such as data cleaning.
| mlthoughts2018 wrote:
| I think the article is actually saying the exact opposite -
| that TensorFlow / PyTorch / sklearn code soup from "trade
| school" sources like bootcamps or quick online programs is not
| very valuable out in the world.
|
| You might be misunderstanding the focus on data cleaning and
| feature engineering as being less specialized than, say, PyTorch
| coding, but it's exactly the opposite.
|
| The most critical aspects of ML engineering for production are
| all about advanced statistics: understanding multicollinearity,
| overfitting, dimensionality reduction, convergence, and time
| series issues like assumptions of stationarity or conditional
| independence effects.
|
| Any engineer can crank out neural network software - that has
| pretty much zero value.
|
| Value lies in realizing some stratification error in the data
| and following that lead to use a multi-level model to control
| for it. Value lies in realizing several key feature inputs are
| correlated on a seasonal basis - leading to multicollinearity -
| and then setting up some adaptive feature aggregation to
| mitigate it and dashboards with things like variance inflation
| factor to be able to raise alerts on it across time.
|
| Value lies in working on small data problems and using
| literature review to determine the best prior to use for a
| Bayesian model, and doing robust posterior predictive checks to
| validate it.
|
| These things require many years of education and experience
| dealing with statistical irregularities, understanding
| confounders and causal inference, understanding missing data
| treatments, understanding time series forecasting.
|
| You cannot learn that in 101 courses that overly focus on the
| mechanics of how to type TensorFlow or sklearn code - that part
| can be picked up by anyone in a month or two. And a mere intro to
| data cleaning and plotting distributions or proportions of
| missing data is not a substitute for actual statistical
| knowledge.
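
A minimal sketch of the variance inflation factor (VIF) that the comment above names as a dashboard metric. It is added here for illustration and is not taken from the thread or the article. It assumes the simplest case of exactly two features, where VIF = 1 / (1 - r^2) and r is their Pearson correlation; with more features, r^2 would be replaced by the R^2 from regressing one feature on all the others. The function names and the sample numbers are hypothetical.

```python
from math import fsum


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def vif_two_features(xs, ys):
    """VIF for a model with exactly two features (assumption: only two)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical monthly readings: two inputs that move together seasonally.
temperature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ice_cream_sales = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
# A hypothetical input constructed to be exactly uncorrelated with temperature.
noise = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]

vif_seasonal = vif_two_features(temperature, ice_cream_sales)  # large
vif_unrelated = vif_two_features(temperature, noise)           # 1.0

print(f"seasonal pair:  VIF = {vif_seasonal:.1f}")
print(f"unrelated pair: VIF = {vif_unrelated:.1f}")
```

A commonly cited rule of thumb treats a VIF somewhere above 5 to 10 as a sign of troublesome multicollinearity; the right alert threshold for a given dashboard is a judgment call, not something this sketch settles.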
| p1esk wrote:
| Or you can just use 100x more weights in a transformer, and
| it will learn how to write human level quality texts without
| much data cleaning or fancy statistics. /s
| tester756 wrote:
| >None of my computer science lectures were about how to apply
| that particular part of CS knowledge in some hot new Python
| library.
|
| Mine were.
|
| I had C#, .NET Core, Docker, MongoDB, MSSQL, Postgres, GraphQL,
| OData, Neo4j, Redis, WebAssembly (Blazor), React, Vue.js and
| stuff like Git.
|
| That was covered in "Web apps", "Databases", "Non-relational
| databases" and, alongside, some bigger/smaller programming
| projects.
|
| Public school, studying at weekends.
| jasim wrote:
| I've oscillated between these two positions for a few years
| now, when in truth the two positions are not really in conflict.
|
| When we complain about universities not preparing students
| better for jobs, what we really mean is that universities are
| not doing the bare minimum that they should be doing - in the
| case of CS, students should at least know how to program well,
| and be well versed in the practicalities of computing. That does
| not exclude learning the fundamentals (which is often
| denigrated as "theory").
|
| It is just that students often have neither the theory nor the
| practice, and at a minimum, we're asking, they should know the
| practice so they can at least be useful in their jobs.
| marketingPro wrote:
| The fact that there is somehow a notable number of computer
| science grads who don't know how to program is the biggest
| red flag.
|
| Are students cheating? Is the curriculum group-based? Is the
| content not hard enough?
|
| If people need to do coding interviews, I see no reason why
| something similar can't be done in college at a 200 and 300
| level checkpoint.
|
| Programming/logic is easy if you understand it. It doesn't need
| to be directly tested often.
| freeone3000 wrote:
| A computer science graduate degree is not a programming
| course certificate and should not be treated as a
| substitute. If you're willing to hire people who studied
| four years of theory with no practical applications or
| experience, you need to have a plan to onboard them from
| theory to software development.
|
| You wouldn't hire a metallurgist as a welder, so you
| shouldn't be hiring a computer scientist as a programmer.
| MattGaiser wrote:
| I graduated in the class of 2019. Plenty of CS courses had
| no programming requirement at all and a few only had on-
| paper coding requirements (i.e. nobody ever checked if the
| code worked).
|
| I once submitted code that did not compile as I ran out of
| time. I got 100% on that assignment.
|
| Whether you get a good grade on the programming portions is
| almost random.
| bonoboTP wrote:
| There's lots of hard CS content you can learn and take
| exams in without writing code. Logic design, complexity
| theory and automata, graph theory theorem proofs, linear
| algebra, complex analysis, coding theory (compression,
| encryption, ...), algorithms, proofs about data structures,
| operating system theory (scheduling algorithms, deadlocks,
| race conditions, virtual memory), database theory, etc.
| zdragnar wrote:
| When I was at a major US university in the early 2000s,
| anything higher than C wasn't taught because the field
| changes faster than a 4 year degree would make sense -
| modern languages were seen as the domain of 2 year tech
| schools.
|
| I believe that has since changed, but I am not sure to what
| extent.
| mlthoughts2018 wrote:
| I think this is more of an expression that coding
| interviews are horribly poor for assessing effectiveness at
| delivering code to solve business problems.
|
| Those people you say "can't code" actually can code very
| well - it's just that the question "can you pass this timed
| hazing trivia test in CoderPad or on a whiteboard?"
| has no
| relationship to "can you code?"
| bonoboTP wrote:
| Which universities don't teach the bare minimum? I assume the
| article was about the US. The US has the very best CS
| universities in the world; do they not teach these basics? Or
| are we talking about smaller lower-tier American
| universities? I think there's also a difference in which
| programs you look at. There are, for example, Computer
| Engineering programs and also Computer Science programs,
| which are not the same. In Germany, there are universities
| (Universität) and "universities of applied sciences"
| (Fachhochschule), which differ in the balance of theory and
| practice.
|
| My big picture point is that the complaint is really general
| and _isn't specific to machine learning_ (ML is more of a
| click magnet here). The same could be said about other parts
| of CS and about the general computer-handling skills of CS
| graduates.
| vkou wrote:
| > Which universities don't teach the bare minimum?
|
| My university did not teach source control, or the basics
| of good programming practices.
|
| There were plenty of practical courses, with plenty of
| programming assignments among them, but the only thing that
| you were evaluated on was whether or not the resulting code
| worked.
| bonoboTP wrote:
| I don't consider teaching source control to be the job of
| a university.
| MattGaiser wrote:
| Why not? Mechanical engineering teaches CAD use. Chemical
| engineering teaches you lab skills.
| ganafagol wrote:
| That's not the job of a university. A reasonably smart
| student picks up how to work git in a few evenings with
| some online tutorial and some open source project. Or
| just when doing their homework.
|
| In fact, a proper theoretical foundation makes this
| really easy. Graph theory and algebra will have taught
| them about DAGs and partial order, which is what git
| branches are. A crypto class will have taught them about
| hashes and signatures.
| A distributed systems class will
| have taught them about issues with synchronisation. With
| all that background it doesn't matter whether it's git or
| whatever system will be en vogue in 10 years.
|
| Imagine a student having learnt CVS 20 years ago at
| university. Completely useless knowledge today. But the
| same student with the above fundamentals will pick up git
| in no time. _That's_ what universities are for.
| vvanders wrote:
| If you don't actively make the real-world connections to
| the theory then most students will just memorize the
| coursework and then forget it later.
|
| The number of times I have to walk through why linear
| memory access matters and how caches and branch predictors
| work is staggeringly high. In every single case they all
| knew the theory but never made the connection to how it
| applied to the task at hand.
| [deleted]
| vivekhaz wrote:
| Let's continue the anecdotal train: my computer science
| major requires a class that does teach source control.
| Better yet, it's a liberal arts school and I'll graduate
| with a B.A., so instead of focusing on anecdotal
| evidence, why don't we talk about what ought to be?
| codelord wrote:
| I got my BSc in computer science and PhD in machine learning, and
| ended up working in a top FAANG AI research lab.
|
| In hindsight, both when doing research for my PhD and also
| when working as an engineer I felt the most useful courses from
| undergrad were linear algebra, algorithms, calculus, operating
| systems, and statistics in that order. I ended up filling the
| gaps in my math education later by reading textbooks and taking
| online courses.
|
| IMO an undergrad program should focus on very fundamental theory.
| If I was in charge of designing CS programs I would quadruple the
| number of credits required in math and specifically in linear
| algebra. You would be surprised how handy and applicable linear
| algebra is in ML, CV, robotics, computer graphics, finance, etc.
| etc.
| Calculus is also important but to a lesser degree.
|
| It's a waste of time to teach TensorFlow or teach the trendiest
| neural network architecture at school. The knowledge becomes
| irrelevant in a few years, and it's fairly easy to pick it up by
| reading docs/papers if you know the fundamentals.
| throwawaygh wrote:
| _> It's a waste of time to teach TensorFlow or teach the
| trendiest neural network architecture at school. The knowledge
| becomes irrelevant in a few years, and it's fairly easy to pick
| it up by reading docs/papers if you know the fundamentals._
|
| Well, kind of. You teach one or two instances of such things as
| a case study in how to learn a framework. Usually Software
| Engineering courses are the best place to do this. The point
| is, your ML course should probably not be spending any time on
| things like PyTorch. A sophomore level engineering course
| should have already taught students how to go through the
| process of learning a new framework.
___________________________________________________________________
(page generated 2020-10-10 23:00 UTC)