[HN Gopher] Deep Learning Interviews book: Hundreds of fully sol... ___________________________________________________________________ Deep Learning Interviews book: Hundreds of fully solved job interview questions Author : piccogabriele Score : 376 points Date : 2022-01-10 15:57 UTC (7 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | angarg12 wrote: | I have been working as an ML Engineer for a few years now and I | am baffled by the bar to entry for these positions in the | industry. | | Not only do I need to perform at the Software Engineer level | expected for the position (with your standard leetcode-style | interviews), but I need to pass extra ML-specific (theory and | practice) rounds. Meanwhile the vast majority of my work consists | of getting systems production-ready and hunting bugs. | | If I have to jump through so many hoops when changing jobs I'll | seriously consider a regular non-ML position. | 1970-01-01 wrote: | This book has fun problems! Example: | | During the Cold War, the U.S.A. developed a speech-to-text (STT) | algorithm that could theoretically detect the hidden dialects of | Russian sleeper agents. These agents (Fig. 3.7) were trained to | speak English in Russia and subsequently sent to the US to gather | intelligence. The FBI was able to apprehend ten such hidden | Russian spies and accused them of being "sleeper" agents. | | The algorithm relied on the acoustic properties of the Russian | pronunciation of the word (v-o-k-s-a-l), which was borrowed from | English V-a-u-x-h-a-l-l. It was alleged that it is impossible for | Russians to completely hide their accent, and hence when a Russian | would say V-a-u-x-h-a-l-l, the algorithm would yield the text | "v-o-k-s-a-l". To test the algorithm at a diplomatic gathering | where 20% of participants are sleeper agents and the rest | Americans, a data scientist randomly chooses a person and asks | him to say V-a-u-x-h-a-l-l.
A single letter is then chosen | randomly from the word that was generated by the algorithm, and it | is observed to be an "l". What is the probability that the person | is indeed a Russian sleeper agent? | 8note wrote: | Really small? | | How many russians in america are actually sleeper agents? | [deleted] | whatshisface wrote: | A single letter is chosen randomly? Huh? Why would you do that? | renewiltord wrote: | Seems a bit pointless to ask. You want them to make up a | story? "The data scientist's radio link degrades to static | while he waits for the answer and all he hears is the letter | 'l'". There. | whatshisface wrote: | It's just a bit funny to come up with a clever | justification for 50% of the problem only to quit at the | last moment with tacked-on math-problem stuff. | renewiltord wrote: | Haha fair enough. | TrackerFF wrote: | Likewise, in the military, countersigns have been | designed to make non-native speakers stand out - should the | countersign be compromised. For example, in WW2, Americans | would use "Lollapalooza", as Japanese speakers really struggled | with that word. | PeterisP wrote: | That's more of a shibboleth than a secret, which is literally | a practice as old as the Bible - "And the Gileadites took the | passages of Jordan before the Ephraimites: and it was so, | that when those Ephraimites which were escaped said, Let me | go over; that the men of Gilead said unto him, Art thou an | Ephraimite? If he said, Nay; Then said they unto him, Say now | Shibboleth: and he said Sibboleth: for he could not frame to | pronounce it right. Then they took him, and slew him at the | passages of Jordan: and there fell at that time of the | Ephraimites forty and two thousand." | kragen wrote: | Hmm, I'd think that in a rhotic accent a word like | "furlstrengths" or "fatherlands" would work better?
In | Japanese they sound like [FWrWrWsWtWriisW] or [harWsWtWriisW] | and [hazarWrandozW] respectively, rather than the native | [f@rlstriNGths] or [f@rlstriNGkths] and [fad@laendz]. | Adjacent /rl/ pairs are a special challenge; there are | multiple unvoiced fricatives that don't exist at all in | Japanese, and consonant clusters totally violate Japanese | phonotactics to the point where it's hard for Japanese people | to even detect the presence of some of the consonants. By | contrast Japanese [raraparWza] is only slightly wrong, | requiring a little bit more bilateral bypass on the voiced | taps and a slight rounding of the W sound. | | Some Japanese-American soldiers would be SOL tho. | kilotaras wrote: | Bayes' rule with odds ratios makes it pretty easy:

        base odds = 20:80 = 1:4
        relative odds = (1 letter / 6 letters) : (2 letters / 8 letters) = 2:3
        posterior odds = (1*2):(4*3) = 2:12 = 1:6
        final probability = 1/(6+1) = 1/7, or roughly 14.3%

| Bayes' rule with raw probabilities is a lot more involved. | thaumasiotes wrote: | Odds are usually represented with a colon -- the base odds | are 1:4 (20%), not 1/4 (25%). | [deleted] | Aethylia wrote: | Assuming that the algorithm is 100% accurate! | mattkrause wrote: | I was also distracted by the fact that you can't (usually) | hear the difference between English words _written_ with | one 'l' and those with two consecutive 'l's. | | "Voksal" and "Vauxhall" seem like they should each have six | phonemes. | hervature wrote: | I don't know about "a lot more". It is essentially the same | calculation without having to know 3 new terms. Let:

        A = the event they are a spy
        B = the event that an "l" appears

| and let ^c denote the complement of these events. Then,

        P(A) = 1/5
        P(A^c) = 4/5
        P(B|A) = 1/6
        P(B|A^c) = 1/4
        P(A|B) = P(B|A)P(A)/P(B)

| By the law of total probability,

        P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)

| which is a very standard formulation and really just your | equation, as you can rewrite everything I have done as:

        P(A|B) = 1 / (1 + P(B|A^c)P(A^c) / (P(B|A)P(A)))

| which is the base odds, posterior odds, and odds-to-probability | conversion all in one. The reason this method is strictly better, | in my opinion, is that the odds method breaks down if we introduce | a third type of person who doesn't pronounce l's. Also, after | doing one homework's worth of these problems, you just skip to the | final equation, in which case my post is just as short as yours. | FabHK wrote: | Hmm, a bit more involved maybe, but not that much. But your | calculation sure seems short. | | With S = sleeper and L = the letter "l", and remembering "total | probability":

        P(L) = P(L|S)P(S) + P(L|-S)P(-S)

| (where -S is not S), we have by Bayes:

        P(S|L) = P(L|S)P(S) / P(L)
               = P(L|S)P(S) / (P(L|S)P(S) + P(L|-S)P(-S))
               = (1/6 * 1/5) / (1/6 * 1/5 + 1/4 * 4/5)
               = (1/30) / (1/30 + 6/30)
               = 1/7

| la_fayette wrote: | Question aside: using arXiv for distributing such interview | questions seems inappropriate to me. Is there any SEO trick | behind it? | time_to_smile wrote: | Fisher Information is under the "Kindergarten" section? | | Maybe I've just been interviewing at the wrong places; I'd be | very curious if anyone here has been asked to even explain Fisher | information in any DS interview. | | It's not that Fisher information is a particularly tricky topic, | but I certainly wouldn't put it as a "must know" for even the | most junior of data scientists. Not that I wouldn't mind living | in a world where this was the case... just not sure I live in the | same world as the authors.
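The calculations in this sub-thread are easy to check mechanically. A minimal sketch (assuming, as Aethylia notes, that the algorithm itself is perfectly accurate, and taking the letter counts from the problem: 1 "l" in the 6 letters of "voksal", 2 "l"s in the 8 letters of "Vauxhall"):

```python
from fractions import Fraction

# Prior from the problem statement: 20% of guests are sleeper agents.
p_spy = Fraction(1, 5)

# Likelihood of drawing an "l": a spy's word is transcribed "voksal"
# (1 "l" out of 6 letters), an American's is "Vauxhall" (2 out of 8).
p_l_given_spy = Fraction(1, 6)
p_l_given_american = Fraction(2, 8)

# Law of total probability, then Bayes' rule.
p_l = p_l_given_spy * p_spy + p_l_given_american * (1 - p_spy)
p_spy_given_l = p_l_given_spy * p_spy / p_l

print(p_spy_given_l)  # 1/7
```

Exact rational arithmetic reproduces both kilotaras's odds-based answer and FabHK's direct calculation: 1/7, roughly 14.3%.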
| sdenton4 wrote: | When I was a mathematician it was pretty common to make jokes | whenever we actually had to evaluate an integral, along the | lines of 'think back to your elementary-school calculus...' | lp251 wrote: | "integrate by parts, like you learned in middle school" | | tf middle school did you go to?! | light_hue_1 wrote: | It's a joke. Like, we joke that the more math you learn the | less arithmetic you can do (ok, maybe that one isn't a | joke). | rindalir wrote: | In my undergrad abstract algebra class our professor | asked us a question about finding the order of a group | that involved dividing 32/8 and we all just sat there for | ten seconds before someone bravely ventured "...four?" | jpindar wrote: | I've experienced that many times among groups of | electrical engineers - we're all fine discussing | equations but once it's time to plug in the numbers no one | wants to volunteer an answer. | master_yoda_1 wrote: | My problem with this line of numerous shallow books and courses | is that they are 1) written by people who have no experience in | industry or are not working on "real" machine learning jobs, and | 2) they think the standard in industry is pretty low and any BS | works. For example, the concept of the "Lagrange multiplier" is | missing from the book. One needs this concept to understand | training convergence guarantees. | Raphaellll wrote: | I actually bought this as a physical book on Amazon. Naturally it | came as a print-on-demand book. Unfortunately it has many | problems in this format. E.g. the lack of margins makes it hard | to read the ends of sentences towards the gutter. Also some text | overlaps. Not sure what source file format you | have to provide to Amazon, but it's certainly not the pdf | provided in the repo.
| | Edit: | | It seems the overlapping text also occurs on some pdf readers: | https://github.com/BoltzmannEntropy/interviews.ai/issues/2 | ruph123 wrote: | The last 5 textbooks I bought new on amazon had similar | problems. Totally unacceptable. I started returning them and | (because most were exclusive to amazon) started buying them new | on ebay with great results. | spekcular wrote: | This is amazing. I am ecstatic. | | I've been looking for something exactly like this - and it's | executed better than I could have imagined. | | (Needs a good proofreader still, though! Also, whatever custom | LaTeX template the authors are using is misbehaving a bit in | various places. Still great content.) | abul123 wrote: | mcemilg wrote: | ML/DS positions are highly competitive these days. I don't get | why ML positions require harder preparation for interviews than | other CS positions when you do similar things. People expect you | to know a lot of theory, from statistics, probability, and | algorithms to linear algebra. I am ok with knowing the basics of | these topics, which are the foundations of ML and DL. But I don't | get being asked about eigenvectors and challenging algorithm | problems for an ML Engineering position when you have already | proven yourself with a Master's degree and enough professional | experience. I am not defending my PhD there. We will just build | some DL models, maybe read some DL papers and maybe try to | implement some of them. The theory is only 10% of the job; the | rest is engineering, data cleaning, etc. Honestly I am looking | for a soft way to get back to Software Engineering. | uoaei wrote: | In part because ML fails silently by design. Even if the code | runs flawlessly with no errors, the outputs could be completely | bunk, useless, or even harmful, and you won't have any idea | whether that is true just from watching The Number go down during | training. It's not enough to know how to build it but also _how | it works_.
It's the difference between designing the JWST and | assembling it. | mattkrause wrote: | I'm sure this happens, but do you think the problem is | actually one of mathematical savvy? | | My guess would be that more machine learning projects go off | the rails for want of understanding the data or the | {business, research} problem. | borroka wrote: | But the OP was asking something different, that is, why | someone should excessively focus on theory when, by the way, | DL theory is very far from being solid and trial and error is | the common way of operating in ML and AI. | | The "the model is in place, but I have no clue what it's doing, | so it can fail without me understanding when and how" argument | is a straw man. Especially for supervised learning, that is, | when we have labels for the data, it is immediately clear whether | the output of the model is "bunk, useless, or even harmful". | There is no "fail silently by design". | | I have been working in the field for almost 20 years in | academia and in industry, and it is not that I start every | PCA thinking about eigenvectors and eigenvalues; if you | asked me now, without preparation, what those are, I would be | somewhere between approximately right and wrong. But I fit many, | many very accurate models. | minimaxir wrote: | > In part because ML fails silently by design. | | That's why there's so much iteration and feedback gathering | (e.g. A/B tests) as a part of DS/ML, which incidentally is | rarely a part of the interview loop. | | Anyone who claims they can get a good model the first time | they train it is dangerously optimistic. Even the "how it | works" aspect has become more and more marginal due to black- | boxing. | hintymad wrote: | A reason for such requirements is similar to the reason that | software engineers need to grind leetcode: supply and demand. | Prestigious companies get hundreds, if not thousands, of | applications every day.
The companies can afford to look for candidates who have | raw talent, such as the capability of mastering many concepts | and being able to solve hard mathematical problems in a short | time. Case in point: you may not need to use eigenvectors | directly on the job, but the concept is so essential in linear | algebra that I as a hiring manager would expect a candidate to | explain and apply it in their sleep. That is, knowing | eigenvectors is an indirect filter to get people who are deeply | geeky. Is it the best strategy for a company? That's up for | discussion. I'm just explaining the motives behind such | requirements. | lumost wrote: | My guess is that this type of interview is partly why the ML | space is full of loud explainers who can't execute. | | When I approach a science problem at work with other folks | who have "scientist" in their title, I assume that some portion | between 30 and 60% will have no meaningful contribution to | the project other than discussion and disagreement with the | direction the effort is taking. Most of the time, these | individuals will not dirty themselves with the details | sufficiently to know how the algorithm/data/training process | works. | vsareto wrote: | I can't help but think there have been a ton of filters used in | the past to figure out if someone is deeply geeky, and we'll | continue to invent more in the future. | | It's really looking like another rat race. Especially since | there's no central authority, every hiring manager has the | potential to invent their own filter and make it arbitrarily | harder or easier based on supply and demand (and then the | filter drifts away from the intended purpose). | jollybean wrote: | But if there is an abundance of supply, the company has to | use some kind of filter. | | Testing for geekiness and the ability to solve tricky coding and | math problems seems like a rational way to do that.
| | If companies were starving for talent because 'nobody could | pass the test' - it would be another thing. | | But they have to set the bar on something, somewhere. | | I can't speak to AI/ML but I would imagine it might be hard | to hire there, given the very deep and broad concepts, | alongside grungy engineering. | | I've rarely had such fascination and interest in a field | that I would _never_ actually want to work in. | crate_barre wrote: | There's an abundance of supply of people with master's | degrees in machine learning? How's that possible? I | thought this shit was supposed to be hard. | | Has humanity just scaled way too hard or something? | Because if we're having an abundance of supply in | difficult cutting-edge fields to the point where they | also have their own version of Leetcode, then what hope | do average people have of getting _any_ job in this | world? | | Or, is it at all possible that companies are | disrespecting the candidate pool by being stingy and | picky? | | Maybe the truth is gray. | Mehdi2277 wrote: | I currently work as an ML engineer and have interviewed | on both sides for some well-known companies. | | The absolute demand in number of people is small compared | to popularity. It would not surprise me at all if many | computer science master's programs had a majority of the | students studying machine learning. I remember in | undergrad we had to ration computer science classes due | to too much demand from students. I think the school | tripled its CS majors over a couple of years. | | The number of needed ML engineers is much smaller than | total software engineers. When a lot of students decide | ML is coolest, we have an imbalanced CS pool with too many | wanting to do ML. Especially since, for ML to work, you | normally need good data engineering, backend engineering, | and infra, and the actual ML is only a small subset of the | service using ML.
| | At the same time, the supply of experienced ML engineers is | still low due to the recent growth of the field. Hiring ML | engineers with 5+ years of professional experience is more | challenging. The main place where supply is excessive is | for new graduates. | hintymad wrote: | It will be a rat race when there are so many interview books | and courses and websites. It was not a rat race before | 2005, when there were only two reasons that one could solve | problems like Pirate Coins or Queen Killing Infidel | Husbands: the person is so mathematically mature that such | problems are easy for them, or the person is so geeky that | they read Scientific American or Gardner's columns and | remembered everything they read. | littlestymaar wrote: | You're missing the third category: people like myself who | absolutely love this kind of riddle and destroy them in | a few minutes, without any significance for their actual | work abilities. | | I don't think I'm a bad engineer, but I'm certainly not | the rock star you absolutely need for your team; yet when | it comes to this kind of "cleverness" test, I'm really | really good. | | I had the "Queen Killing Infidel Husbands" (under | another name) in an interview last year and I aced it in | a few minutes. I didn't know about "Pirate Coins", | but when I read your comment HN said your comment was "35 | minutes ago" and now it says "40 minutes", which means I | googled the problem, figured out the solution and then | found the answer online to check that I was right, in less | than 6 minutes, all while putting my son to bed! | | It's really sad because many engineers much | better at their jobs than me will get rejected because | of pointless tests like this... | Der_Einzige wrote: | Given that PCA is heavily antiquated these days, I'd say that | asking your candidates to know algebraic topology (the basis | behind many much more effective non-linear DR algorithms like | UMAP) is far better...
But in spite of the field having long | ago advanced beyond PCA, you're still using it to gatekeep. | selimthegrim wrote: | The initialization strategy for UMAP is important enough | that asking about that in practice is probably more | important than anything out of Ghrist's book as an | interview question | | cf. | https://twitter.com/hippopedoid/status/1356906342439669761 | MontyCarloHall wrote: | > Case in point, you may not need to use eigenvectors | directly in the job, but the concept is so essential in | linear algebra and I as a hiring manager would expect a | candidate to explain and apply it in their sleep. | | Exactly. Whenever eigenvectors come up during interviews, | it's usually in the context of asking a candidate to explain | how something elementary like principal components analysis | works. If they claim on their CV to understand PCA, then | they'd better understand what eigenvectors are. If not, it | means they don't actually know how PCA works, and the | knowledge they profess on their CV is superficial at best. | | That said, if they don't claim to know PCA or SVD or other | analysis techniques requiring some (generalized) form of | eigendecomposition, then I won't ask them about eigenvectors. | But given how fundamental these techniques are, this is rare. | nerdponx wrote: | Maybe "eigenvectors" is a bad example, because it's a pretty | foundational linear algebra concept. | | But there is a threshold where it stops being a test of | foundational knowledge and starts being a test of arbitrary | trivia, and favors who has the most free time to study and | memorize said trivia. | whimsicalism wrote: | Having recently completed an MLE interview loop successfully | at a top company, I'm wondering where you are getting asked | complicated linear algebra questions in interview? | fault1 wrote: | Hopefully you aren't equating "eigenvectors" to | "complicated linear algebra question". 
| | But I agree, a lot of MLE roles don't get asked such | things. | | I think the OP's guide is closer to interviews I've seen | for PhD programs. | uoaei wrote: | The difference between trivia and meaty knowledge is somewhat | contextually dependent, but an understanding of how core | probability and statistics concepts are integrated into the | framework of machine learning by means of linear algebra and | the other analytical tools is pretty damn useful for having | substantive conversations about ML design decisions. It helps | when everyone on the team speaks that language, to keep up the | momentum. | devoutsalsa wrote: | I figure the best way to prepare for an ML job is to pull out | the nastiest working rat's nest of if statements you've ever | written & claim it was autogenerated by an adversarial network | (which was you fighting with your coworkers over your spaghetti | code). | barry-cotter wrote: | > But I don't get being asked about eigenvectors and challenging | algorithm problems for an ML Engineering position when you have | already proven yourself with a Master's degree and enough | professional experience. | | People know pity passes exist for Master's degrees. You can't | trust that someone actually knows what they should know just | because they have a degree. Ditto professional experience. The | entire reason FizzBuzz exists is that people with years of | professional experience can't program. | vanusa wrote: | We aren't talking about FizzBuzz here, but rather the | fashionable practice of subjecting people to 4-6 hours of | grilling on "medium-to-hard" problems that you absolutely | cannot fail, or even be slightly halting in your delivery on. | And which can only be effectively prepared for by investing | substantial amounts of time in by-the-book cramming.
| | On top of the fact that these problems are often poorly | selected, poorly communicated, conducted under completely | unrealistic time pressure, often as pile-ons (with 3-4 | strangers, as if just to add pressure and distraction), and | (these days) over video conferencing (so you have to stare into | the camera and pretend to make eye contact with people while | supposedly thinking about your problem, on top of shitty | acoustics), etc, etc. | | It's just fucking ridiculous. | vidarh wrote: | I'm quite happy these places make it so clear they're not | places I would be happy to work. I always ask about the | interview process and tell the recruiters I'm not | interested if they expect really lengthy processes. I'm | fine with things dragging out if they have additional | questions after the initial interviews, but not if their | default starting position is that they need that. | jstx1 wrote: | Data science and ML interviews can be tough because it's very | difficult to prepare for everything and cover all the theory. A | lot of the value you add comes from knowing the theory, so it's | understandable to test it, but it's still hard to prepare well. | And you have a take-home and/or LC-style problem(s) in addition | to the theory interview. | minimaxir wrote: | The hard questions in DS/ML interviews I've received over the | years aren't the theory questions (which I rarely get asked), | but the trick SQL questions that often depend on obscure syntax | and/or dialect-specific features, or "implement binary search" | when I'm not in the mindset for that, as that isn't what DS/ML | is in the real world. | jstx1 wrote: | I think they're fine as long as you know the format and have | an opportunity to prepare or just get in the right mindset | for it. And some things (like binary search) should be easy | to write anyway.
| | The SQL questions can also be a symptom of the type of job - | Facebook's first data science round focuses a lot on SQL, but | that's because it's a very product/analytics/decision-making | focused role without that much coding or ML. With data | science you have to be more careful about these things when | searching for a job; you can't just use the job title as a | descriptor. | minimaxir wrote: | > And some things (like binary search) should be easy to | write anyway. | | It's a different story when a) your mind is set on | statistics/linear algebra, b) you've never had to actually | implement binary search by hand since college, and c) even | if you do implement the algorithm and demonstrate that you | have a general understanding, it must work perfectly and | pass test cases, otherwise it doesn't count. | | FWIW I was rarely asked about algorithmic complexity, which | is more relevant in DS/ML, albeit usually in the | context of whiteboarding another algorithm and the | interviewer mocking me for doing it in O(n) instead of | O(log n). | kragen wrote: | Binary search in particular is surprisingly tricky, which | is precisely what makes it useful for telling if someone | knows how to program. To a significant extent, though, | you can cheat by studying binary search itself, which is | a surprisingly beautiful thing. | | I like this formulation for finding the first index in a | half-open range where p is true, assuming p stays true | thereafter:

        bsearch p i j := i                      if i == j
                         else bsearch p i m     if p m
                         else bsearch p (m + 1) j
          where m := i + (j - i)//2

| Or in Python:

        def bsearch(p, i, j):
            m = i + (j - i) // 2
            return (i if i == j
                    else bsearch(p, i, m) if p(m)
                    else bsearch(p, m+1, j))

| The only tricky thing about this formulation is that m < | j if i < j, thus the asymmetric +1 in only one case to | ensure progress. If invoked with a p such as a[m] >= k it | gives the usual binary search on an array without early | termination. The i + (j - i) // 2 formulation is not | needed in modern Python, but historically an overflowing | (i + j) // 2 was a bug in lots of binary search library | functions, notably in Java and C. | | (Correction: I said a[m] <= k. This formulation is less | tricky than the usual ones, but it's still tricky!) | minimaxir wrote: | > Binary search in particular is surprisingly tricky, | which is precisely what makes it useful for telling _if | someone knows how to program_. | | That's the problem. There are many other ways to do that | without risking false negatives and annoying potential | candidates (e.g. I would not reapply to places that have | rejected me due to skepticism about my programming | abilities, using tests blatantly irrelevant to day-to- | day work, because it's a bad indication of the engineering | culture). | | Even FizzBuzz is better at accomplishing that task. | kragen wrote: | There are levels of not knowing how to program that go | beyond FizzBuzz. But sure, many programming jobs don't | require them. | minimaxir wrote: | If that's the case for the DS/ML domain, then a short | take-home exam should provide a better test of | practical coding ability (the common counterargument that | "take-home exams can be gamed" is a strawman; that would | be more the interviewer's fault for creating a flawed | exam). | | In my case, I typically got the "implement binary search" | questions in a technical interview _after_ I passed a | take-home exam, which just makes me extra annoyed. | kragen wrote: | Agreed. | | If you're gaming the take-home exam by looking up the | answer on Stack Overflow, you could game the same exam in | person by reading books of interview questions ahead of | time, and the interviewer can avoid that by making up new | questions. (OTOH if you're gaming the take-home exam by | paying someone else to solve the problem for you, that | might be harder to tell.) | nerdponx wrote: | FizzBuzz (or equivalent) is actually great IMO.
It weeds | out the people who lied on their resume, without | punishing the people who never learned CS because they | were too busy learning things that were actually useful | to DS, like statistics or data visualization. | jstx1 wrote: | I've actually been given fizzbuzz in a DS interview! Up | to that point I thought that fizzbuzz was just a meme | because it's obviously too easy. | rightbyte wrote: | I tried to write FizzBuzz on paper when I first heard of | it, and it had a bug, printing fizzbuzzfizzbuzz on 15. | | If you want a correct program without a compiler/computer | I don't think anything is too easy. Maybe like, "make a | function returning the sum of two float parameters". | kragen wrote: | That would just test syntax, though. Fizzbuzz tests | logic. Your bug was a logic bug. | | To a certain extent you can dispense with mental logic by | using a compiler. But the feedback loop is much slower. | Thinking your logic through before feeding it to a | compiler is like looking at a map when you're driving a | car; you can cut off whole branches of exploration. | | Binary search is a particularly tricky logic problem in | part because it's so deceptively simple. In a continuous | domain it's easy to get right, but the discrete domain | introduces three or four boundary cases you can easily | get wrong. | | But the great-grandparent is surely correct that many | programming jobs don't require that level of thinking | about program logic. For many that do, it's because the | codebase is shitty, not because they're working in an | inherently mentally challenging domain. | rightbyte wrote: | Ye, I meant running it and then correcting the error. | | Concerning binary search, I actually implemented that in | an ECU for message sorting. It took like a whole day, | including getting the outer boundaries off by one (too | big) in the first test run. Funnily enough the vehicle | ran fine anyway.
| | I would never pull that algorithm off correctly in an | interview without, like, training for it first, I think. | disgruntledphd2 wrote: | Facebook Product Data Science has always been a Product | Analyst role more than anything else. I did the interviews | a while back, and it was a pretty fun experience, but it's | not what a lot of people call data science. | jstx1 wrote: | > but it's not what a lot of people call data science | | I think that's changed a bit over time and the term has | expanded to mean more things. In addition to Facebook, | another great example is this article from Lyft in 2018 | where they say that they're renaming all their data | analysts to data scientists and all their data scientists | to research scientists - | https://medium.com/@chamandy/whats-in-a-name-ce42f419d16c | kevinventullo wrote: | In my experience, it varied greatly from team to team. | nerdponx wrote: | I had an "implement binary search" interview once. I came | away feeling like I was being interviewed for the wrong role. | I don't understand how anyone could think that's an | appropriate interview task for a DS position. | whimsicalism wrote: | I'm an MLE and I get asked much harder questions than that. | Implement a binary search seems ... fine? | nerdponx wrote: | But it makes sense for MLE! IMO you should ask a stats or | probability question in a DS interview. | jstx1 wrote: | The distinction between the two roles isn't that clear. | Some data science jobs are very focused on engineering. | fault1 wrote: | Agreed. MLE in very ML-heavy companies tends to mean SWEs | who work on ML systems, and sometimes that can mean as | much working on infrastructure as on modeling. | pietromenna wrote: | Wow! Great resource! Thank you! | kragen wrote: | Why are all the em dashes missing from the PDF? | aesthesia wrote: | This may be a rendering issue.
Some interaction of the Computer | Modern font, the TeX layout algorithm, and Chrome's rendering | engine sometimes ends up making em-dashes and minus signs | invisible. | pradn wrote: | I think I know the answer to this, but how bad should I feel for | being a software engineer with little-to-no knowledge of deep | learning? I suspect it's not bad at all since the software | engineering field has split into a few camps, and mine - backend | systems work - isn't in the same universe as the machine learning | one, for the most part. | jstx1 wrote: | Not bad at all. I'm a data scientist and my not knowing React | doesn't affect me one bit. | abul123 wrote: | pugio wrote: | I'm really enjoying the discussion here, as I've been thinking a | lot about what a full modern ML/DS curriculum would look like. | | I currently work for a non-profit investigating making a free, | high-quality set of courses in this space, and would love to talk | to as many people as I can who are either working in ML/DS or looking | to get into the field. (I have ideas but would prefer to ground them in as | many real-world experiences as I can collect.) | | If anyone here wouldn't mind chatting about this, or even just | sharing an experience or opinion, please drop me an email (in my | profile). | | EDIT: We already have Intro to DS, and a Deep RL sequence far | along in our pipeline, but are looking to see where we can help | the most with available resources. | | I really appreciate this Interviews book as an example of what | topics might be necessary (and at what level), taking into | account the qualifying discussion here, of course. | lvl100 wrote: | In my 20s, I was doing data science at a very high level spanning | multiple disciplines. Truly state of the art. I would like to | think I was quite good at my job. | | I am 99% certain I would not have passed the interview bars set | today. More specifically, the breadth they expect you to master | is very puzzling (and seemingly unrealistic).
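The binary-search boundary cases discussed above (kragen's "three or four boundary cases you can easily get wrong", rightbyte's off-by-one on the outer boundaries) mostly come down to keeping one loop invariant straight. A minimal sketch in Python; the function name and the half-open-interval convention are my choices, not from the thread:

```python
def binary_search(xs, target):
    """Return the index of target in sorted list xs, or -1 if absent.

    Uses a half-open interval [lo, hi): the invariant is that target,
    if present, lies somewhere in xs[lo:hi].
    """
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2  # lo <= mid < hi, so the range always shrinks
        if xs[mid] < target:
            lo = mid + 1      # target can only be to the right of mid
        elif xs[mid] > target:
            hi = mid          # mid is excluded; target is to its left
        else:
            return mid
    return -1                 # empty interval: target is not in xs
```

The half-open convention sidesteps the classic confusion between `hi = mid` and `hi = mid - 1`: because `hi` is exclusive, `hi = mid` is always correct, and the empty-list and single-element cases fall out of the same loop with no special handling.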
| light_hue_1 wrote: | I've interviewed well over 100 people for DL/ML positions. This | may be a good roadmap to what some people ask, but it's a | terrible guide to what you should ask. It's like a collection of | class exam questions. | | Just as in programming, the world is full of people who can | recite facts but don't understand them. There is no point in | asking what an L1 norm is and asking for its equation. Or, say, | giving someone the C++ code that corresponds to computing the | norm of a vector and asking them "what does this do". Or even | worse, showing them some picture of some cross-validation scheme | and asking them to name it. Yes, your candidates should be able | to do this, but positive answers to these kinds of questions are | nearly useless. These are the kinds of questions you get answers | to by Googling. | | It's far more critical to know what your candidate can do, | practically. Create a hypothetical dataset from your domain where | the answer is that they need to use an L1 norm. Do they realize | this? Do they even realize that the distance metric matters? Are | they proposing reasonable distance metrics? Do they understand | what goes wrong with different distance metrics? Etc. Or problems | where they need to use a network but, say, padding matters a lot. | Or where the particulars of cross validation matter a lot. | | This also gives you depth. "Name this cross validation scheme" | gives you a binary answer: "yes, they can do it, or no they can't." | And you're done. If you have a hypothetical dataset, you can keep | prodding. "Ok, but how about if I unbalance the data" or "what if | we now need to fine tune" or "what if the payoffs for precision | and recall change in our domain", "what if my budget is limited", | etc. It also lets you transition smoothly to other kinds of | questions. And to discover areas of deeper expertise than you | expected.
For example, even for the cross validation questions, | if you ask that binary question, you might never discover that a | candidate knows about how to use generalized cross validation, | which might actually be very useful for your problem. | | The uninformative, tedious mess that we see in programming | interviews? This is the equivalent for ML/DL interviews! | flubflub wrote: | A problem with these questions is that people can answer a lot of | them without knowing ML/DL; admittedly cherry-picked, but | still. | | For example: what is the definition of two events being | independent in probability? | | Or the L1 norm example: 'Which norm does the following equation | represent? |x1 - x2| + |y1 - y2|' | | Find the Taylor series expansion of e^x (this is high-school | maths). | | Find the partial derivatives of f(x, y) = 3 sin^2(x - y) | | Limits etc... | | These aren't specific to deep learning or machine learning, not | that I claim to be a practitioner. | MichaelRazum wrote: | Exactly, I thought the same. Not sure what a really good | alternative is. But you may be at risk of getting bad candidates, | since they might be the ones with the best interview practice. | | Maybe that kind of question is OK for people without | experience, but not for seniors. | coliveira wrote: | Do you have any books/material that can help the learner | acquire this deeper understanding? | master_yoda_1 wrote: | I know one good reference. | | https://www.deeplearningbook.org/ | | Also, there are various courses and lectures, but those need | time and effort. There are no shortcuts like the book posted | by OP. | light_hue_1 wrote: | Yeah. You just have to build models, experiment, | intentionally make bad decisions, and get a feel for how | things work. There's no clear shortcut. | | But this is also what you will practically be doing. | erwincoumans wrote: | Wow, nice resource! Wish it had some sections about (deep) | reinforcement learning and its algorithms.
Looks like it is in | the plan, though. | jstx1 wrote: | RL is still kind of niche - the number of companies that ship | anything using RL and the number of jobs that require it are | both quite low. | master_yoda_1 wrote: | Just a clarification: I think you are conflating RL and | robotics. RL algorithms can be used anywhere: in ads, | NLP, computer vision, etc. | mrfusion wrote: | Are there deep learning roles that focus more on software | engineering and using the tools rather than having a deep | understanding of statistics? | fault1 wrote: | I would say that, on average, MLE roles tend to be more SWEng-heavy. | But some roles involve as much creating infrastructure as running | the tools. | jstx1 wrote: | There are. But | | 1) the titles will vary a lot (software engineer, ML engineer, | research engineer, data scientist, etc.), which makes it hard to | locate those jobs and to move in the job market in general | | 2) you still need a reasonable amount of theory (not | necessarily too much statistics) to use the tools well. And in | all likelihood you will be tested on it in some way during the | interviews. | | 3) the interviews/job descriptions that don't emphasise the | theory will often be for jobs where you get a title like | Machine Learning Engineer but you focus more on the | infrastructure than on the ML code | throwaway6734 wrote: | I think they're called research engineering roles or ML | engineering | time_to_smile wrote: | > having a deep understanding of statistics? | | As someone with a strong background in statistics, please tell | me where I can find DS jobs that require this. | | For me and all my statistics friends in DS, much of the | frustration is in how hard it is to pass DS interviews when you | understand problems more deeply than "use XGBoost".
I have found | that very few data scientists really even understand basic | statistics. I once failed an interview because an interviewer | did not believe that logistic regression could be used to answer | statistical inference questions (when it, and more generally the | GLM, is the workhorse of statistical work). | | And to answer your question, whenever I'm in a hiring-manager | position, I very strongly value strong software engineering | skills. DS teams made up of people who are closer to curious | engineers tend to greatly outperform teams made up of | researchers who don't know you can write code outside of a | notebook. | disgruntledphd2 wrote: | A good conceptual understanding of statistics is always | helpful. | | It's not really tested for in most places, though, where they | regard a DS as a service that produces models.
___________________________________________________________________
(page generated 2022-01-10 23:00 UTC)
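On time_to_smile's point that logistic regression answers inference questions: with a single binary covariate, the unregularized MLE slope of a logistic regression is exactly the log odds ratio of the underlying 2x2 table, an inference quantity rather than just a prediction score. A pure-Python sketch; the toy dataset and the simple gradient-ascent fitter are illustrative, not from the thread:

```python
import math

def fit_logistic(xs, ys, steps=5000, lr=1.0):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the
    log-likelihood (no regularization, so this recovers the MLE)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p        # score contribution for the intercept
            g1 += (y - p) * x  # score contribution for the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical 2x2 table: exposed (x=1): 30 events, 10 non-events;
# unexposed (x=0): 10 events, 30 non-events.
xs = [1] * 40 + [0] * 40
ys = [1] * 30 + [0] * 10 + [1] * 10 + [0] * 30
b0, b1 = fit_logistic(xs, ys)

# The slope converges to the table's log odds ratio, log((30/10)/(10/30)).
log_or = math.log((30 / 10) / (10 / 30))
```

In practice one would use a GLM library that also reports standard errors and p-values for the coefficients; the point of the sketch is only that the fitted coefficient itself carries a direct inferential interpretation (a log odds ratio), which is why the GLM family is a statistical workhorse and not merely a predictor.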