[HN Gopher] My story as a self-taught AI researcher
       ___________________________________________________________________
        
       My story as a self-taught AI researcher
        
       Author : emilwallner
       Score  : 144 points
       Date   : 2020-01-20 18:39 UTC (4 hours ago)
        
 (HTM) web link (blog.floydhub.com)
 (TXT) w3m dump (blog.floydhub.com)
        
       | exdsq wrote:
       | Survivorship bias or reality:
       | 
       | 3 months learning FastAI, 3-12 months personal projects and
       | consulting, 2 months flashcards of ~100 papers, 6 months to
       | publish a paper
       | 
       | What does he mean by 'paper'? A Medium post? NeurIPS?
        
       | qntty wrote:
       | "Many are realizing that education is a zero-sum credential
       | game."
       | 
        | Can this silly meme die already? Maybe it's understandable
        | coming from an economist who values education for no other
        | reason than its economic effects, but it's strange coming
        | from someone who clearly understands the value of personal
        | development.
        
         | lern_too_spel wrote:
          | It seems correct to me. If I get a degree from MIT,
          | somebody else can't. They have limited spots. He is
          | promoting models of education for signaling to employers
          | that are not zero-sum.
        
         | gwern wrote:
         | No one doubts the value of personal development, least of all
         | the interviewee.
         | 
         | But I'm not sure what that has to do with buying expensive
         | formal education credentials.
        
         | Nasrudith wrote:
         | It is pretty strange even from an economist really - they of
         | all people should be able to understand and articulate the
         | difference between signaling value and direct utility value of
         | a given good or service.
        
           | koube wrote:
           | Economists have been debating the skills vs signaling value
           | of education, especially since Bryan Caplan released his book
           | The Case Against Education. If you want to get a smattering
            | of opinion on the issue, the book's reviews and
            | discussions would be a good starting point.
           | 
            | https://en.wikipedia.org/wiki/The_Case_Against_Education#Rev...
           | 
            | Bryan Caplan back and forth with Noah Smith on the book:
            | https://www.econlib.org/archives/2015/04/educational_sig_1.h...
           | 
            | Bryan Caplan back and forth with Bill Dickens on the book:
            | https://www.econlib.org/archives/2010/08/education_and_s.htm...
        
           | Gimpei wrote:
           | It's not a majority view among economists. Caplan is the only
           | person I can think of who holds this view.
        
           | jessaustin wrote:
           | If anyone could realize the tenuous value of education, it
           | might be someone paying student loans for an economics
           | degree...
        
         | codebolt wrote:
          | My prediction is that whoever comes up with the next leap
          | forward in AI will be someone who at minimum has a firm
          | grasp of the various branches of undergraduate-level
          | maths. Naively
         | tinkering with heuristic statistical ML methods like neural
         | nets and hoping that higher level intelligence somehow
         | magically pops out isn't the way forward. We need a more
         | sophisticated approach.
        
           | why-el wrote:
            | This is already being done in places such as the
            | University of Arizona (Chomsky and his former students).
            | The subject is narrower of course (computational
            | linguistics and some neuroscience), but they are taking
            | an approach that is more Galilean in nature, designing
            | experiments that _reduce_ externalities rather than
            | simply looking at massive amounts of data. I think
            | that's what's going to be the most useful, at least in
            | areas that remain challenging for the current trends in
            | AI, namely language.
        
       | narenst wrote:
        | This is a really good time to be an Independent Scientist
        | (aka gentleman scientist) in this field because of how
        | nascent deep learning and similar techniques are. It
        | requires a lot of trial and error and time/cost investment
        | to bring AI techniques to the masses.
       | 
        | The FAANGs are trying to hire all the top talent (including
        | Emil, who wrote the post), but I believe these independent
        | researchers will be the ones finding new opportunities to
        | make AI useful in the real world (like colorizing b&w photos
        | or creating website code from mockups).
       | 
        | The biggest challenge I see for these folks is access to
        | high-quality data. There is a reason Google is releasing so
        | many ML models into production compared to smaller
        | companies. Bridging the data gap requires effort from the
        | community to build high-quality open-source datasets for
        | common applications.
        
         | andreyk wrote:
          | wrt the data point, to be fair most research still comes
          | out of universities, where students have access to the
          | same data as anyone else. So from a research perspective
          | it's not a huge deal; much as with compute, industry can
          | scale up known techniques while individual researchers do
          | more interesting stuff.
        
           | tnecniv wrote:
           | A lot of research data sets are publicly available, but many
           | researchers based at universities have relationships with
           | private companies where they can get access to data or other
           | resources useful for research (e.g. Google has a big room of
           | robotic arms generating data for pick and place tasks).
           | 
           | There is still plenty you can do with a reasonable personal
           | budget, however.
        
           | K0SM0S wrote:
           | So if I understand correctly, to reformulate in my own
           | words/views:
           | 
           | while the "big data" (datasets) formed and thus owned by big-
           | tech, big-ads, big-brother, etc. may be instrumental to build
           | at-scale solutions for real-world usage (for profit,
           | knowledge, control, whatever actionable goal),
           | 
           | fundamental research itself, as done in universities, can
           | move forward without these datasets: using what's publicly
           | available is _enough_.
           | 
            | Did I read this right? It would effectively add much-needed
           | nuance to the common perception that big data is necessary to
           | train innovative models, that there might be some sort of
           | monopoly on oil (data, the 'fuel' of ML) by a few champions
           | of data collection.
        
             | andreyk wrote:
              | yep, you read that right. Source: I am a PhD student
              | at the Stanford Vision and Learning Lab
              | (http://svl.stanford.edu/) and read a ton of AI
              | papers. The vast majority of papers are done with
              | datasets anyone can just download / request, as far as
              | I've seen.
        
             | yorwba wrote:
             | It's not exactly true that research institutions don't have
             | access to the same big datasets as companies. For example,
             | I took a course that involved tracking soccer players using
             | videos provided by a streaming company that specializes in
             | amateur soccer. They promised to give us access to their
             | internal API under an NDA, which they wouldn't have done
             | for just anyone.
             | 
             | On the other hand, they never actually gave our API keys
             | the necessary privileges, so in the end I just reverse-
             | engineered the URL scheme of their streams and scraped
             | them. Many datasets used in academia are just collections
             | of publicly available data (e.g. Wikipedia, images found by
             | googling), optionally annotated for cheap using Amazon
             | Mechanical Turk. Experimenting with that kind of data is
             | also open to independent researchers. You don't need to
             | work at a data-hoarding company if you can get what you
             | need by scraping their website.
        
         | woah wrote:
          | On the other hand, the lack of data for independent
          | researchers may encourage the development of low-data
          | techniques, which is much more exciting in the long term
          | since humans are able to learn with much less data than
          | most machine learning techniques require.
        
           | TrainedMonkey wrote:
           | Arguably humans have a lifetime of data which was used to
           | develop a model of the world that is amazingly efficient at
           | interpreting new data.
        
             | cygaril wrote:
             | Or our entire evolutionary history of data.
        
               | AlanSE wrote:
                | ...which fits into less than 700 MB compressed.
               | Some of the most exciting stories I've read recently for
               | machine learning are cases where learning is re-used
               | between different problems. Strip off a few layers, do
               | minimal re-training and it learns a new problem, quickly.
               | In the next decade, I can easily see some unanticipated
               | techniques blowing the lid off this field.
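                | 
                | As a rough sketch of that strip-and-retrain recipe
                | in PyTorch (my example; the 10-class head and
                | learning rate are arbitrary placeholders):
                | 
                |     import torch
                |     import torch.nn as nn
                |     from torchvision import models
                | 
                |     # Start from a network pre-trained on ImageNet.
                |     model = models.resnet18(pretrained=True)
                | 
                |     # Freeze the re-used layers.
                |     for param in model.parameters():
                |         param.requires_grad = False
                | 
                |     # "Strip off" the final layer; replace it with
                |     # a fresh head sized for the new problem.
                |     model.fc = nn.Linear(model.fc.in_features, 10)
                | 
                |     # Minimal re-training: only the new head's
                |     # parameters get updated.
                |     optimizer = torch.optim.Adam(
                |         model.fc.parameters(), lr=1e-3)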
        
               | K0SM0S wrote:
               | It indeed strikes me as particularly domain-narrow when I
               | hear neuro or ML scientists claim as self-evident that
               | "humans can learn new stuff with just a few examples!.."
               | when the hardware upon which said learning takes place
               | has been exposed to such 'examples' likely _trillions of
               | times_ over _billions of years_ before -- encoded as DNA
               | and whatever else runs the  'make' command on us.
               | 
               | The usual corollary (that ML should "therefore" be able
               | to learn with a few examples) may only apply, as I see
               | it, if we somehow _encode previous "learning" about the
                | problem in the very structure (architecture, hardware,
               | design) of the model itself_.
               | 
               | It's really intuition based on 'natural' evolution, but I
               | think you don't get to train much "intelligence" in 1
               | generation of being, however complex your being might be
               | (or else humans would be rising exponentially in
               | intelligence every generation by now, and think of what
               | that means to the symmetrical assumption about silicon-
               | based intelligence).
        
               | tprice7 wrote:
               | "The usual corollary (that ML should "therefore" be able
               | to learn with a few examples) may only apply, as I see
               | it, if we somehow encode previous "learning" about the
                | problem in the very structure (architecture, hardware,
               | design) of the model itself."
               | 
                | Yes, and they do. They aren't choosing completely
                | arbitrary algorithms when they attempt to solve an
                | ML problem; they are typically using approaches that
                | have already been proven to work well on related
                | problems, or at least are variants of proven
                | approaches.
        
             | echelon wrote:
             | Humans can transfer learn across domains because we can
             | draw on an incredible wealth of past experience. We can
             | understand and abstractly reason about the architecture of
             | problem landscapes and map our understanding into new
             | spaces.
             | 
             | That isn't even counting our hardwired animal intelligence.
        
             | lallysingh wrote:
             | Is that in a csv.gz I can torrent somewhere?
        
             | rhizome wrote:
             | Are you referring to empiricism?
        
             | eanzenberg wrote:
              | Humans don't start with a randomly scrambled brain.
        
             | ummonk wrote:
             | Transfer learning for the win.
        
           | gdubs wrote:
           | This would be a great area, IMHO, for the government to step
           | in and fund an initiative to provide huge, rich datasets for
           | anyone to use for ML research.
        
           | mendeza wrote:
            | I think an exciting area that can address the lack of
            | data is domain randomization and synthetic data
            | generation.
            | 
            | These slides from Josh Tobin are a great introduction:
            | http://josh-tobin.com/assets/pdf/randomization_and_the_reali...
            | 
            | http://josh-tobin.com/assets/pdf/BeyondDomainRandomization_T...
           | 
           | And a really cool project implementing synthetic generation
           | of text in images: https://github.com/ankush-me/SynthText
        
           | SQueeeeeL wrote:
            | Low-data techniques are just another name for
            | algorithms/equations. Dijkstra's algorithm required zero
            | training graphs to create.
           | 
           | Any other kind of method will get killed by low statistical
           | information in the data (can't get blood from a stone)
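            | 
            | To illustrate (my sketch; nothing in it is learned from
            | data):
            | 
            |     import heapq
            | 
            |     def dijkstra(graph, source):
            |         # graph: {node: [(neighbor, weight), ...]}
            |         dist = {source: 0}
            |         heap = [(0, source)]
            |         while heap:
            |             d, node = heapq.heappop(heap)
            |             if d > dist.get(node, float("inf")):
            |                 continue  # stale queue entry
            |             for nbr, w in graph.get(node, []):
            |                 nd = d + w
            |                 if nd < dist.get(nbr, float("inf")):
            |                     dist[nbr] = nd
            |                     heapq.heappush(heap, (nd, nbr))
            |         # Shortest distance from source to each
            |         # reachable node; no training set in sight.
            |         return dist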
        
             | ssivark wrote:
             | Agree with your first statement and disagree with your
             | second; I don't think the former implies the latter.
             | 
             | I think there's a lot of room to be clever with encoding
             | domain-specific inductive biases into models/algorithms,
             | such that they can perform fast+robust inference.
              | Exploiting this trade-off as a design parameter to be
              | tuned, rather than sitting at one of the two extremes,
              | is potentially going to generate a lot of value. And
              | this is
             | highly under-appreciated currently since most people are
             | obsessed with "data". I'm willing to bet that this will
             | become big in a few years when the current AI hype machine
             | falters, and will serve as a huge competitive advantage.
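              | 
              | A stock example of such an encoded bias (my
              | illustration, not anyone's production model): a conv
              | layer bakes in the prior that local patterns matter
              | regardless of position, so it covers a 32x32 RGB
              | input with orders of magnitude fewer parameters than
              | a dense layer:
              | 
              |     import torch.nn as nn
              | 
              |     # Dense map over the whole image: ~50M params.
              |     dense = nn.Linear(3 * 32 * 32, 16 * 32 * 32)
              |     # Conv with the same output shape: 448 params
              |     # (16 * 3 * 3 * 3 weights + 16 biases).
              |     conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
              | 
              |     count = lambda m: sum(p.numel()
              |                           for p in m.parameters())
              |     print(count(dense), count(conv))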
        
               | btrettel wrote:
               | These types of techniques are already big in certain
               | fields. E.g., in fluid dynamics and heat transfer,
               | "dimensional analysis" is frequently used to simplify and
               | generalize models. Sometimes models can be nearly fully
               | specified up to a constant of proportionality based
               | solely on dimensional considerations. Beyond what is
               | typically seen as "data" the information here is a list
               | of variables involved in the problem and the dimensions
               | of the variables.
               | 
               | As far as I can tell "dimensions" in this sense are a
               | purely human construct. For two variables to have
               | different dimensions, it means that they can not be
               | meaningfully added, e.g., apples and oranges.
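                | 
                | A textbook worked example (mine, not from the
                | fields above): for a pendulum the variables are
                | the period T [s], length L [m], mass m [kg], and
                | gravity g [m/s^2]. The only combination of L, m,
                | and g with units of time is sqrt(L/g) (mass drops
                | out, since nothing else can cancel kg), so
                | dimensional analysis alone forces
                | 
                |     T = C * sqrt(L / g)
                | 
                | with C a dimensionless constant (2*pi for small
                | swings, as it turns out).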
        
       | LemonAndroid wrote:
        | I don't see how this is self-taught, as the person got
        | picked up for an internship and could learn from experts
        | first-hand.
       | 
       | FAKE.
        
       | rmah wrote:
       | Is this guy actually a _researcher_ in the way most people would
       | think of it? That is, someone who pushes the boundaries of
       | science; who develops new AI techniques or finds the hard
        | boundaries of existing AI techniques; who finds new ways to
        | compose multiple AI techniques cohesively; who explores the
        | theoretical foundations of AI.
       | 
       | Or is he someone who uses AI techniques to solve problems (and
       | then wrote a paper about it)? I can't help but wonder a bit.
        
         | ssivark wrote:
         | For better or worse, the definition of researcher has morphed
         | into a combination of
         | 
         | 1. Solves previously unsolved problems
         | 
         | 2. Publishes papers sharing those solutions
         | 
         | without regard to the kind/spirit/scope of problems solved.
         | 
         | Since conference publications don't have the same number
         | constraints as journal papers, and are accepting of
         | application-specific results, this explosion of what is
         | considered "research" is somewhat inevitable. Also, there are a
         | lot of people chasing this given the prestige associated with
         | the title.
        
         | rlayton2 wrote:
          | Research needs people across the entire spectrum, from
          | those making fundamental improvements to underlying theory
          | all the way to people running the thing to see if it works
          | on actual problems people have (obviously in a robust and
          | verifiable way).
        
       | bluetwo wrote:
       | The thing that disappoints me about the aspirations of being a
       | researcher is that the goal is to get paid to study AI, not solve
       | real-world problems.
       | 
       | I would rather build a small company by solving a real problem
       | than work for a big company spinning my wheels.
        
         | currymj wrote:
         | for a lot of people who end up in research-type jobs, a sense
         | of curiosity is one of their strongest motivators, and they
         | want work that will let them pursue their curiosity. it sounds
         | like you're motivated by something else.
        
         | hogFeast wrote:
          | This is why I didn't go into academia, but good luck
          | convincing other people of the value in that. I am sure
          | there are cultural differences here, but where I am, the
          | goal of most people who study CS is: leave me alone while
          | I mess about with X (evidence: the local college was doing
          | speech processing/NLP in the 60s; they actively turned
          | down paid work... unsurprisingly, they got left in the
          | dust; professors are now being encouraged to do commercial
          | work but, of course, most of it is totally nonviable and
          | is just more messing about with complex nonsense that
          | doesn't work).
         | 
          | I think if you look at history this is also evident: the
          | inventions of the late 18th century were a function of
          | necessity, as was the invention of semiconductors (not
          | just in the US, but in how Taiwan developed)... this isn't
          | to say academia is pointless, but there is just far more
          | going on (I think if you look at some of the East Asian
          | nations that get great academic results, their progress on
          | actual R&D innovation is far less impressive).
        
           | ssivark wrote:
           | (American) Academia is a complicated matter, so I'll elide
           | commenting on that.
           | 
           | For a thoughtful counterpoint to the necessity argument, see:
           | https://jnd.org/technology_first_needs_last/ (previously
           | discussed on HN)
        
       | wigl wrote:
       | This reeks of survivorship bias to me. I much prefer Andreas
       | Madsen's more sober and self-conscious take on independent
       | research [0].
       | 
       | > I'd spend 1-2 months completing Fast.ai course V3, and spend
       | another 4-5 months completing personal projects or participating
       | in machine learning competitions... After six months, I'd
       | recommend doing an internship. Then you'll be ready to take a job
       | in industry or do consulting to self-fund your research.
       | 
       | Where are these internships that will hire you based on your
       | completion of Fast.ai (if done in 1-2 months by a beginner I
       | assume it's only part 1) alone, especially in 2020? How many are
       | going to place in a Kaggle competition with just half a year of
       | experience? More importantly, just how many people are
       | privileged/secure enough to put their all into learning, with no
       | sense of security or peer support?
       | 
       | > I started working with Google because I reproduced an ML paper,
       | wrote a blog post about it, and promoted it. Google's brand
       | department was looking for case studies of their products,
       | TensorFlow in this case. They made a video about my project.
        | Someone at Google saw the video, thought my skill set could
        | be useful, and pinged me on Twitter.
       | 
       | So what really mattered was self-promotion, good timing, and
       | luck.
       | 
       | > Tl;dr, I spent a few years planning and embarking on personal
       | development adventures. They were loosely modeled after the
       | Jungian hero's journey with the influences of Buddhism and
       | Stoicism.
       | 
       | Why does the author have to present his life like one would in a
       | fucking college essay?
       | 
        | [0] https://medium.com/@andreas_madsen/becoming-an-independent-r...
        
         | drongoking wrote:
         | > So what really mattered was self-promotion, good timing, and
         | luck.
         | 
         | Yes. He seems like someone who is good at self-promotion and
         | networking. Well, good for him, but I think he underplays the
         | role these have in his success.
         | 
         | > Why does the author have to present his life like one would
         | in a fucking college essay?
         | 
         | I guess that's the self-promotion. And humble-bragging. Like
         | this bit:
         | 
         | "I started working as a teacher in the countryside, but after
          | invoking the spirit of their dead chief, they later
          | anointed me the king of their village."
        
           | wigl wrote:
           | > Well, good for him, but I think he underplays the role
           | these have in his success.
           | 
           | Exactly. Good for Emil, but it's always frustrating to hear
           | survivorship bias preaching. Even the interviewer starts off
           | by saying:
           | 
           | > By the way, I really love your CV - the quirks section was
           | especially fun to read.
           | 
            | It's even more frustrating when I hear non-POCs talk about
           | their journey to some non-western country (and subsequent
           | conquering of fantastical goals like gaining the approval of
           | locals) or pursuit of some sense of foreign culture. It's
           | almost a given that they have internalized and appropriated
           | the ideas (i.e. Buddhism or even worse post-retreat
           | Buddhism). Good for the author to receive such positive
           | feedback for such signaling, but it makes me sad to know that
           | I might not receive the same.
        
       | K0SM0S wrote:
       | This was a great read (and great nuggets, like that paper on
       | Intelligence by Chollet).
       | 
       | I wonder:
       | 
       | -- Is math a problem for non-academic researchers?
       | 
        | Most papers strike me as requiring non-trivial knowledge of
        | linear algebra, for instance; topology sits right behind;
        | the bold seem to take it one step further into category
        | theory as we speak, and geometric algebra is quickly gaining
        | traction too. Lots of math: cool math, but math nonetheless.
       | 
        | Not that you can't learn these on your own, but how big is
        | the gap _in practice_, on the job, compared with actual PhDs
        | in ML/math? (How much of a hindrance is it for the self-
        | taught researcher?)
       | 
       | -- "Contracting" in the field of AI sounds great but, how
       | exactly? Especially solo: what type of clients and how/where to
       | find them, what type of 'business proposition' as a freelancer do
       | you offer, what's the pricing structure of such gigs?
       | 
       | I mean, I can sell you websites and visuals and stuff, but AI? I
       | know first-hand most SMBs (IME the only real customers for
       | freelancers) are a tough sell: their datasets are tiny and demand
       | scripting skills to sort out (extract business value), not AI, so
       | the value proposition is low for both parties; it's still early
       | adoption so 90% don't even consider spending 1 cent on "AI"
       | unless as a SaaS (they actually don't need to know if it's AI or
       | programming).
       | 
       | I can imagine tons of fantastic research to do with SMBs, as
       | partners or 'interested sponsors' (should they reap benefits on a
       | low investment), but really not much yet in the way of
       | "freelancer products" to market and sell for a living. I'm
       | eagerly anticipating those days, but it's more like 2025-2030 as
       | I see it.
       | 
       | I would love to hear first hand takes on this.
        
         | deepnotderp wrote:
         | To be honest, linear algebra is not that difficult to learn on
         | your own, and plenty of people do. Gilbert Strang's course on
         | OCW has made introductory linear algebra quite accessible.
         | 
         | Things like topology (e.g. TDA, persistent homology, etc.)
         | aren't really mainstream yet, but even then most of it isn't
         | really "hardcore" math in the sense that you can get away with
         | a basic understanding, e.g. what a Vietoris-Rips complex is and
         | why we use it instead of a Cech complex in TDA. Plus most DL
         | research nowadays is pretty (advanced) math-light. That being
         | said, taking the time to understand the math is absolutely
         | worthwhile in my experience.
         | 
         | It should also be noted that a lot of real world ML/AI projects
         | in industry aren't really about brand new algorithms using
         | advanced math, but rather more about applying mostly existing
         | techniques to messy, noisy real world data and taking the time
         | to understand the domain you are applying it to.
        
         | jph00 wrote:
         | > Is math a problem for non-academic researchers?
         | 
         | It takes a while to figure out how to read academic papers, but
         | it's largely about learning the notation. In the end, it maps
         | back to the code you write anyway in most cases, so it's just
         | another way of writing stuff you already know.
         | 
         | It's not so much linear algebra you need, since much of that is
         | not relevant to AI. It's really matrix calculus. Which is
         | largely about multiplying things together and adding them up.
          | Terence Parr and I tried to create an "all you need to
          | know" tutorial here: https://explained.ai/matrix-calculus/ .
         | 
         | You certainly don't need topology (unless you happen to be
         | interested in that particular sub-field).
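          | 
          | For instance (a toy sketch of mine, not from the
          | tutorial): for y = x @ W with loss L = sum(y), the
          | matrix-calculus result dL/dW = x^T @ dL/dy is exactly
          | what autograd computes:
          | 
          |     import torch
          | 
          |     x = torch.randn(4, 3)
          |     W = torch.randn(3, 2, requires_grad=True)
          |     loss = (x @ W).sum()
          |     loss.backward()
          | 
          |     # dL/dy is all ones here, so dL/dW = x^T @ ones.
          |     analytic = x.t() @ torch.ones(4, 2)
          |     assert torch.allclose(W.grad, analytic)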
        
           | anjc wrote:
            | Your tutorial is very good, but to be able to read even
            | a few paragraphs you need to be proficient with linear
            | algebra and calculus already.
           | 
           | > Most papers strike me as requiring a non-trivial knowledge
           | of linear algebra
           | 
           | I think this is correct, if you consider college level linear
           | algebra and an intuition for applying it to novel problems to
           | be non-trivial knowledge
        
         | JamesBarney wrote:
          | It's my understanding that dirty datasets that "demand
          | scripting skills to sort out" are pretty common and that
          | most data scientists spend 80% of their time "sorting this
          | out".
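          | 
          | The "sorting out" is usually mundane pandas work along
          | these lines (file and column names invented for the
          | example):
          | 
          |     import pandas as pd
          | 
          |     df = pd.read_csv("orders.csv")
          |     df = df.drop_duplicates()
          |     # Coerce malformed dates/amounts to NaN, then drop.
          |     df["date"] = pd.to_datetime(df["date"],
          |                                 errors="coerce")
          |     df["amount"] = pd.to_numeric(df["amount"],
          |                                  errors="coerce")
          |     df = df.dropna(subset=["date", "amount"])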
        
       | ineedasername wrote:
       | I think in these sorts of discussions two concepts with the same
       | name tend to get conflated, so I think it's important to make a
       | distinction between:
       | 
        | 1) _AI Research_ as applying/tweaking known ML/DL methods to
        | a novel problem. I would term this something like "AI
        | Engineering Research"
       | 
        | 2) _AI Research_ as examining the theoretical frameworks &
       | approaches to ML/DL in a way that may itself lead to shifts in
       | the understanding of ML/DL as a whole and/or develop
       | fundamentally new tools for the purpose of #1. What might be
       | termed "basic" or "pure" research.
       | 
        | I'm not placing one of these above the other in terms of
        | importance. They are both necessary, and they form a
        | virtuous feedback loop; one without the other would wither
        | on the vine.
       | 
       | In the example of this particular person, Emil Wallner, he
       | appears to be doing #1, and perhaps doing so in a way that might
       | help inform more of #2.
        
       ___________________________________________________________________
       (page generated 2020-01-20 23:00 UTC)