[HN Gopher] The Gap: Where Machine Learning Education Falls Short
       ___________________________________________________________________
        
       The Gap: Where Machine Learning Education Falls Short
        
       Author : andreyk
       Score  : 43 points
       Date   : 2020-10-10 17:21 UTC (5 hours ago)
        
 (HTM) web link (thegradient.pub)
 (TXT) w3m dump (thegradient.pub)
        
       | glitchc wrote:
       | "Ah I see, your college degree is in Arts."
       | 
       | "Wait, what do you mean you can't sing, or act, or draw? What do
       | they teach you in school??!?!?"
        
       | _RPL5_ wrote:
       | The author makes a point that I relate to. I've been on the
       | receiving end of a couple 'Statistical Learning 101' courses.
       | These courses go roughly as he describes it in his blog post:
       | they first teach you how to multiply two matrices, then launch
       | into linear and logistic regression, then classification via
       | clustering, then decision trees and SVMs, then CNNs and deep
       | learning. Along the way, they do a lecture or two on
       | reinforcement learning and HMMs.
       | 
       | In the end, I ended up with a thin smear of half-baked knowledge
       | in my head, where I stop understanding the math once we are half-
       | way through the material.
       | 
       | So how do you achieve this level of deep/intuitive understanding:
       | 
       | > Without understanding the mathematical underpinnings of key
       | models and techniques in full detail, students aren't able to
       | quickly choose the right models for certain scenarios.
       | 
       | Does anyone have a good study plan with MOOCs and so on? If you
       | have any practical advice, I would appreciate it!
        
         | orange3xchicken wrote:
         | I mean it really depends how deep you want to go. Like you
         | point out that the classes you took are 101 courses. These are
         | really just "tasting" courses. I'm sure if you decided to take
         | more advanced/grad numbered courses, or unnumbered "topics"
         | courses, you would have a better idea of what's going on.
         | 
         | In general, "having a deep understanding of models & techniques
         | in full detail" is not well-defined. For example, analysis of
         | linear regression is often offered as a full year-long sequence
         | for graduate students in math/stats depts. Is this necessary
         | for doing linear regression in practice? Not really, but who
         | cares - it's interesting stuff in its own right. Most people
         | just need enough understanding to finish a job.
         | 
         | In general, the precise medium that you use to study something
         | isn't that important as long as it works for you, but there is
         | a good reason that there are longstanding classic textbooks
         | that people swear by in most fields of mathematics. I do
         | strongly feel that, in the context of any mathematical subject,
         | there are few substitutes for the grind - doing proofs and
         | solving problems on your own.
         | 
         | Okay, but just to have at least one link in my post, I want to
         | share this guy MathematicalMonk who used to make really great
         | videos on ml-related stuff:
         | 
         | https://www.youtube.com/channel/UCcAtD_VYwcYwVbTdvArsm7w
        
       | Exuma wrote:
       | So how about actually giving suggestions of courses or resources
       | that fix this, instead of just saying what's broken and then
       | printing your name.
        
         | marketingPro wrote:
         | Keep reading the article
        
         | lukeplato wrote:
         | He proposes changing the syllabus of advanced/graduate-level
         | courses to skip reviewing linear classification and backprop
         | and make that a pre-requisite.
        
       | shekharshan wrote:
       | The author highlights not teaching enough mathematical theory
       | behind the various techniques. I have tried Andrew Ng's course on
       | Coursera. It uses Octave from what I remember. After a point the
       | lack of mathematical background started to show up. I have always
       | wondered where I can find a course that teaches both the
       | mathematical background as well as the hands-on programming in a
       | balanced way.
        
       | bonoboTP wrote:
       | This complaint comes up time and time again. "Universities should
       | prepare students better for jobs!" "Teach more real life skills!"
       | "Turn CS uni into trade school!"
       | 
       | But a computer science university program is not about scikit-
       | learn or TensorFlow! It's about long-lasting principles,
       | underlying mathematics, mental models and ways of thinking.
       | 
       | None of my computer science lectures were about how to apply that
       | particular part of CS knowledge in some hot new Python library.
       | It's expected that there will be some amount of time required to
       | adjust to a company's software setup. That's not a big hurdle
       | usually.
       | 
       | I'm not saying it should be only theory, though. University
       | courses often have accompanying assignments or projects.
       | Depending on the country in question, they often offer more
       | hands-on, practical courses ("lab courses") as well, where you do
       | actually go through the steps of making the theory work in real
       | life. I had such courses where we played with microcontrollers
       | and FPGAs to understand CPU instructions, assembly and low-level
       | C concepts (but even there the goal wasn't to learn exactly the
       | thing that you will use on the job. Most CS graduates will never
       | need to program FPGAs in their day job.).
       | 
       | But sure, there is a place for even more data engineering
       | training, but I don't think it's computer science university
       | programs. Where do people like network engineers learn how to
       | configure Cisco routers and use whatever config software they
       | use? Where do sysadmins learn Bash, Unix, backup management etc?
       | Not at university courses. Wherever they learn those skills,
       | that's where data cleaning, parallelization engineering etc
       | aspects of machine learning should be taught as well.
        
         | andreyk wrote:
         | To be fair, this piece is mostly arguing against just teaching
         | Tensorflow or Pytorch and is in favor of more general skills
         | (with data engineering being fairly general, though as you
         | point out also something that can be taught via concrete
         | assignments). And as to your last point, that's pretty much the
         | conclusions of the piece itself:
         | 
         | "Based on the current state of machine learning courses it is
         | clear that AI courses will get you through the door in your
         | effort to perform cutting edge research or landing a machine
         | learning job, but they won't teach you everything you need to
         | know. To fill in the knowledge gaps that remain you will have
         | to put in outside effort on your own. "
         | 
         | I guess the question is whether the outside effort needs to be
         | addressed by universities, or by other resources.
        
         | tracer4201 wrote:
         | I'm an engineer who helps quite a bit with hiring interviews.
         | In my humble opinion, there's a surprising number of fresh grad
         | candidates who are not very skilled with theory nor practice.
         | 
         | It has been many, many years since I was in school. I think
         | it's fine that your computer science education focuses on
         | fundamental CS concepts and the mathematics so you can easily
         | pick up areas that require that math (ML, for example). I do
         | think universities can do better. At my school, we had
         | mandatory "block" classes in arts and humanities, which in my
         | opinion, offered no value.
         | 
         | To be clear, I'm not saying these subjects aren't valuable. I
         | am saying, however, that the quality of these courses was very
         | poor, and they could have been substituted entirely with
         | classes related more to my discipline.
         | 
         | I remember sitting in some political science classes that were
         | part of this required pool. I have no idea what we learned in
         | there. As far as I can remember, we read political science
         | papers that were very poorly written/extremely inaccessible. It
         | was impossible to differentiate the authors' personal opinion
         | from objective truth of any kind. It was more or less a
         | checkbox - I had to have so many credit hours from this pool in
         | their curriculum to graduate. Did it force me to think
         | critically in some manner? Not at all. It actually gave me a
         | false sense of how "intelligent people" write.
         | 
         | Yes - all these years later I understand that it wasn't that I
         | just couldn't understand those papers - it was half-rambling,
         | pretentious nonsense. It was the opposite of effective
         | communication, and it provided no value.
        
           | bonoboTP wrote:
           | > there's a surprising number of fresh grad candidates who
           | are not very skilled with theory nor practice.
           | 
           | Is there any "entrance point" where people don't complain
           | about this? Companies complain universities don't prepare
           | students. Universities complain that first-year students come
           | out of high school without necessary background. High schools
           | complain that elementary school does not prepare their
           | entrants etc. Elementary schools complain kindergarten
           | doesn't prepare kids to be mature enough. Kindergarten
           | probably complains that parents don't prepare the kids
           | enough.
           | 
           | Broadly speaking, I think there is more demand for highly
           | skilled, highly intelligent people than can be produced by
           | any given cross-section of the population born in a given
           | year. Sure, universities could do better, but beyond a
           | certain point teaching doesn't work. There are people who are
           | intrinsically motivated and soak up knowledge and seek it out
           | in books and online (so much high quality content can be
           | found online, especially for CS!), and there are those who
           | just coast and do the bare minimum. I don't think you can
           | radically improve the outcomes by changing the curriculum.
        
         | jagger27 wrote:
         | There's a big gap between the whiteboard in a classroom and a
         | blinking terminal cursor. As a teaching assistant a few years
         | ago I spent a good chunk of my time in the lab showing
         | otherwise brilliant computer science students things like basic
         | terminal commands, how to read error messages, and just overall
         | basic problem solving skills. Almost all of the students I
         | worked with were much stronger than me in terms of theory and
         | what I call "whiteboard computer science", but many of those
         | same students who aced written tests really struggled with
         | basic roadblocks like learning language syntax to do practical
         | assignments.
         | 
         | It sounds really silly, but some of the best instruction I gave
         | was "Tab" to autocomplete a command and "Up Arrow" to re-run
         | the last command. Whenever I would do a class demo on the
         | projector someone would always stop me to ask how I was running
         | commands so quickly and fluidly and how on Earth I could
         | remember them all.
        
           | p1esk wrote:
           | Someone (Knuth?) said only 2% of the students in any CS
           | class are good. I worked as a CS TA as well, and I found
           | that to be true. The top 2% of students already had practical
           | skills (shell, vim, git, networking, etc), usually because
           | they've been coding since they were 12 and learned by doing
           | cool projects. Sometimes I saw bright students without prior
           | experience picking up those skills quickly (much quicker than
           | I did).
        
         | dunkelheit wrote:
         | This complaint has nothing to do with teaching hot new python
         | libraries.
         | 
         | The thing is, data cleaning is no less fundamental than
         | backpropagation. Maybe more so - learning algorithms come and
         | go, but real-world data is always going to be inherently messy.
         | The difference is in that we have a beautiful mathematical
         | theory for backpropagation but not for data cleaning. So the
         | courses that teach the former but not the latter are akin to
         | the proverbial drunkard that searches for the lost keys under
         | the street light - beautiful mathematical theories are easier
         | to lecture on so they teach them instead of messier (but not
         | less fundamental or useful) topics such as data cleaning.
        
         | mlthoughts2018 wrote:
         | I think the article is actually saying the exact opposite -
         | that Tensorflow / PyTorch / sklearn code soup from "trade
         | school" sources like bootcamps or quick online programs are not
         | very valuable out in the world.
         | 
         | You might be misunderstanding the focus on data cleaning and
         | feature engineering as being less specialized than say PyTorch
         | coding but it's exactly the opposite.
         | 
         | The most critical aspects of ML engineering for production are
         | all about advanced statistics. Understanding multicollinearity,
         | overfitting, dimensionality reduction, convergence, and time
         | series issues like assumptions of stationarity or conditional
         | independence effects.
         | 
         | Any engineer can crank out neural network software - that has
         | pretty much zero value.
         | 
         | Value lies in realizing some stratification error in the data
         | and following that lead to use a multi-level model to control
         | for it. Value lies in realizing several key feature inputs are
         | correlated on a seasonal basis - leading to multicollinearity -
         | and then setting up some adaptive feature aggregation to
         | mitigate it and dashboards with things like variance inflation
         | factor to be able to raise alerts on it across time.
         | 
         | Value lies in working on small data problems and using
         | literature review to determine the best prior to use for a
         | Bayesian model, and doing robust posterior predictive checks to
         | validate it.
         | 
         | These things require many years of education and experience
         | dealing with statistical irregularities, understanding
         | confounders and causal inference, understanding missing data
         | treatments, understanding time series forecasting.
         | 
         | You cannot learn that in 101 courses that overly focus on the
         | mechanics of how to type Tensorflow or sklearn code - that part
         | can be picked up by anyone in a month or two. And mere intro to
         | data cleaning and plotting distributions or proportions of
         | missing data is not a substitute for actual statistical
         | knowledge.
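
The variance inflation factor mentioned in the comment above can be sketched in a few lines. This is a minimal illustration, not the commenter's method: it assumes only two features (so the R^2 of regressing one on the other equals the squared Pearson correlation), and the `temperature` and `daylight` series are synthetic, hypothetical data built to share a 12-month seasonal cycle.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def vif_two_features(xs, ys):
    """VIF = 1 / (1 - R^2). With a single other predictor, R^2 == r^2."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)

# Synthetic (hypothetical) monthly features that share a seasonal cycle.
months = range(24)
season = [math.sin(2 * math.pi * m / 12) for m in months]
temperature = [15 + 10 * s for s in season]
# Daylight follows the same cycle plus a small deterministic wobble.
daylight = [12 + 4 * s + 0.3 * ((m * 7) % 5 - 2) for m, s in zip(months, season)]

vif = vif_two_features(temperature, daylight)
print(f"VIF between temperature and daylight: {vif:.1f}")
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity. With more than two features, each feature's R^2 would come from regressing it on all the others, which needs a proper least-squares fit rather than this shortcut.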
        
           | p1esk wrote:
           | Or you can just use 100x more weights in a transformer, and
           | it will learn how to write human level quality texts without
           | much data cleaning or fancy statistics. /s
        
         | tester756 wrote:
         | >None of my computer science lectures were about how to apply
         | that particular part of CS knowledge in some hot new Python
         | library.
         | 
         | Mine were
         | 
         | I had C#, .NET Core, Docker, MongoDb, MSSQL, Postgres, GraphQL,
         | OData, Neo4j, Redis, WebAssembly (Blazor), React, Vuejs and
         | stuff like Git.
         | 
         | that was covered on "Web apps", "Databases", "Non relational
         | databases" and meanwhile some bigger/smaller programming
         | projects.
         | 
         | Public school, studying at weekends.
        
         | jasim wrote:
         | I've oscillated between these two positions for a few years
         | now, when in truth the two positions are not really in conflict.
         | 
         | When we complain about universities not preparing students
         | better for jobs, what we really mean is that universities are
         | not doing the bare minimum that they should be doing - in case
         | of CS, students should at least know how to program well, and
         | be well versed in the practicalities of computing. That does
         | not exclude learning the fundamentals (which is often
         | denigrated as "theory").
         | 
         | It is just that students often have neither the theory nor the
         | practice, and at a minimum, we're asking, they should know the
         | practice so they can at least be useful in their jobs.
        
           | marketingPro wrote:
           | The fact that there is somehow a notable size of computer
           | science grads that don't know how to program is the most
           | major red flag.
           | 
           | Are students cheating? Is curriculum group based? Is the
           | content not hard enough?
           | 
           | If people need to do coding interviews, I see no reason why
           | similar can't be done in college at a 200 and 300 level
           | checkpoint.
           | 
           | Programming/logic is easy if you understand. It doesn't need
           | to be directly tested often.
        
             | freeone3000 wrote:
             | A computer science graduate degree is not a programming
             | course certificate and should not be treated as a
             | substitute. If you're willing to hire people who studied
             | four years of theory with no practical applications or
             | experience, you need to have a plan to onboard them from
             | theory to software development.
             | 
             | You wouldn't hire a metallurgist as a welder, so you
             | shouldn't be hiring a computer scientist as a programmer.
        
             | MattGaiser wrote:
             | I graduated in the class of 2019. Plenty of CS courses had
             | no programming requirement at all and a few only had on-
             | paper coding requirements (i.e. nobody ever checked if the
             | code worked).
             | 
             | I once submitted code that did not compile as I ran out of
             | time. I got 100% on that assignment.
             | 
             | Whether you get a good grade on the programming portions is
             | almost random.
        
             | bonoboTP wrote:
             | There's lots of hard CS content you can learn and take
             | exams in without writing code. Logic design, complexity
             | theory and automata, graph theory theorem proofs, linear
             | algebra, complex analysis, coding theory (compression,
             | encryption, ...), algorithms, proofs about data structures,
             | operating system theory (scheduling algorithms, deadlocks,
             | race conditions, virtual memory), database theory, etc.
        
             | zdragnar wrote:
             | When I was at a major US university in the early 2000s,
             | anything higher than C wasn't taught because the field
             | changes faster than a 4 year degree would make sense-
             | modern languages were seen as the domain of 2 year tech
             | schools.
             | 
             | I believe that has since changed, but I am not sure to what
             | extent.
        
             | mlthoughts2018 wrote:
             | I think this is more of an expression that coding
             | interviews are horribly poor for assessing effectiveness at
             | delivering code to solve business problems.
             | 
             | Those people you say "can't code" actually can code very
             | well - it's just that the question "can you pass this timed
             | hazing trivia test in coderpad or on a whiteboard?" has no
             | relationship to "can you code?"
        
           | bonoboTP wrote:
           | Which universities don't teach the bare minimum? I assume the
           | article was about the US. The US has the very best CS
           | universities in the world, do they not teach these basics? Or
           | are we talking about smaller lower-tier American
           | universities? I think there's also a difference in which
           | programs you look at. There are, for example Computer
           | Engineering programs and also Computer Science programs,
           | which are not the same. In Germany, there are universities
           | (Universität) and "universities of applied sciences"
           | (Fachhochschule), which differ in the balance of theory and
           | practice.
           | 
           | My big picture point is that the complaint is really general
           | and _isn't specific to machine learning_ (ML is more of a
           | click magnet here). The same could be said about other parts
           | of CS and about the general computer-handling skills of CS
           | graduates.
        
             | vkou wrote:
             | > Which universities don't teach the bare minimum?
             | 
             | My university did not teach source control, or the basics
             | of good programming practices.
             | 
             | There were plenty of practical courses, with plenty of
             | programming assignments among them, but the only thing that
             | you were evaluated on was whether or not the resulting code
             | worked.
        
               | bonoboTP wrote:
               | I don't consider teaching source control to be the job of
               | a university.
        
               | MattGaiser wrote:
               | Why not? Mechanical engineering teaches CAD use. Chemical
               | engineering teaches you lab skills.
        
               | ganafagol wrote:
               | That's not the job of a university. A reasonably smart
               | student picks up how to work git in a few evenings with
               | some online tutorial and some open source project. Or
               | just when doing their homework.
               | 
               | In fact, a proper theoretical foundation makes this
               | really easy. Graph theory and algebra will have taught
               | them about DAGs and partial order, which is what git
               | branches are. A crypto class will have taught them about
               | hashes and signatures. Distributed systems class will
               | have taught them about issues with synchronisation. With
               | all that background it doesn't matter whether it's git or
               | whatever system will be en vogue in 10 years.
               | 
               | Imagine a student having learnt CVS 20 years ago at
               | university. Completely useless knowledge today. But the
               | same student with the above fundamentals will pick up git
               | in no time. _That's_ what universities are for.
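
The DAG, partial-order and hashing ideas in the comment above can be made concrete with a toy model. This is only a sketch under stated assumptions: `ToyRepo` and `commit_id` are hypothetical names, and the payload being hashed is an invented format, not git's actual commit object layout.

```python
import hashlib

def commit_id(message, parents):
    """Toy content address: SHA-1 over the parent ids plus the message.

    Assumption: this payload format is made up for illustration and is
    not what git actually hashes.
    """
    payload = "\n".join(parents) + "\n" + message
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

class ToyRepo:
    """A commit history stored as a DAG: commit id -> tuple of parent ids."""

    def __init__(self):
        self.parents = {}

    def commit(self, message, *parents):
        cid = commit_id(message, parents)
        self.parents[cid] = tuple(parents)
        return cid

    def ancestors(self, cid):
        """Every commit reachable from cid, including cid itself."""
        seen, stack = set(), [cid]
        while stack:
            cur = stack.pop()
            if cur not in seen:
                seen.add(cur)
                stack.extend(self.parents[cur])
        return seen

    def is_ancestor(self, a, b):
        """The partial order on commits: a <= b iff a is reachable from b."""
        return a in self.ancestors(b)

repo = ToyRepo()
root = repo.commit("initial")
feature = repo.commit("add feature", root)  # one branch
fix = repo.commit("fix bug", root)          # a sibling branch
merge = repo.commit("merge feature", fix, feature)

print(repo.is_ancestor(root, merge))   # root precedes everything
print(repo.is_ancestor(feature, fix))  # sibling branches are incomparable
```

Because each id depends on the parent ids, changing any earlier commit would change every descendant's id. The two sibling branches being incomparable is what makes the order partial rather than total.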
        
               | vvanders wrote:
               | If you don't actively make the real-world connections to
               | the theory then most students will just memorize the
               | coursework and then forget it later.
               | 
               | The number of times I have to walk through why linear
               | memory access matters, how caches and branch predictors
               | work is staggeringly high. In every single case they all
               | knew the theory but never made the connection to how it
               | applied to the task at hand.
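
One way to connect the theory of caches to the "linear memory access" point above is a small simulation. This is a rough sketch, not a model of any real CPU: the cache is a hypothetical fully associative LRU cache with 8 lines of 64 bytes, and the matrix is assumed to be 64x64 with 8-byte elements stored in row-major order.

```python
from collections import OrderedDict

def count_misses(addresses, line_bytes=64, num_lines=8):
    """Count misses in a tiny fully associative LRU cache.

    The sizes are hypothetical and chosen only to keep the example small.
    """
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        if line in cache:
            cache.move_to_end(line)        # mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses

ROWS, COLS, ELEM = 64, 64, 8  # 64x64 matrix of 8-byte values

def address(r, c):
    """Byte offset of element (r, c) in a row-major layout."""
    return (r * COLS + c) * ELEM

# Same elements, visited in two different orders.
row_major = [address(r, c) for r in range(ROWS) for c in range(COLS)]
col_major = [address(r, c) for c in range(COLS) for r in range(ROWS)]

row_misses = count_misses(row_major)
col_misses = count_misses(col_major)
print(row_misses, col_misses)  # 512 4096
```

Both loops touch the same 4096 elements, but walking along rows reuses each 64-byte line for 8 consecutive elements, while walking down columns jumps 512 bytes per step and evicts every line before it is needed again.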
        
               | [deleted]
        
               | vivekhaz wrote:
               | Let's continue the anecdotal train: my Computer science
               | major requires a class that does teach source control.
               | Better yet, it's a liberal arts school and I'll graduate
               | with a B.A., so instead of focusing on anecdotal
               | evidence, why don't we talk about what ought to be?
        
       | codelord wrote:
       | I got my BSc in computer science and PhD in machine learning, and
       | ended up working in a top FAANG AI research lab.
       | 
       | In hindsight, both when doing research for my PhD and also
       | when working as an engineer I felt the most useful courses from
       | undergrad were linear algebra, algorithms, calculus, operating
       | systems, and statistics in that order. I ended up filling the
       | gaps in my math education later by reading textbooks and taking
       | online courses.
       | 
       | IMO an undergrad program should focus on very fundamental theory.
       | If I was in charge of designing CS programs I would quadruple the
       | amount of credits required in math and specifically in linear
       | algebra. You would be surprised how handy and applicable linear
       | algebra is in ML, CV, robotics, computer graphics, finance, etc.
       | etc. Calculus is also important but to a lesser degree.
       | 
       | It's a waste of time to teach TensorFlow or teach the trendiest
       | neural network architecture at school. The knowledge becomes
       | irrelevant in a few years, and it's fairly easy to pick it up by
       | reading docs/papers if you know the fundamentals.
        
         | throwawaygh wrote:
         | _> It's a waste of time to teach TensorFlow or teach the
         | trendiest neural network architecture at school. The knowledge
         | becomes irrelevant in a few years, and it's fairly easy to pick
         | it up by reading docs/papers if you know the fundamentals._
         | 
         | Well, kind of. You teach one or two instances of such things as
         | a case study in how to learn a framework. Usually Software
         | Engineering courses are the best place to do this. The point
         | is, your ML course should probably not be spending any time on
         | things like pytorch. A sophomore level engineering course
         | should have already taught students how to go through the
         | process of learning a new framework.
        
       ___________________________________________________________________
       (page generated 2020-10-10 23:00 UTC)