[HN Gopher] Deep Learning Interviews book: Hundreds of fully sol... ___________________________________________________________________ Deep Learning Interviews book: Hundreds of fully solved job interview questions Author : piccogabriele Score : 376 points Date : 2022-01-10 15:57 UTC (7 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | angarg12 wrote: | I have been working as an ML Engineer for a few years now and I | am baffled by the bar to entry for these positions in the | industry. | | Not only do I need to perform at the Software Engineer level | expected for the position (with your standard leetcode-style | interviews), but I need to pass extra ML-specific (theory and | practice) rounds. Meanwhile the vast majority of my work consists | of getting systems production-ready and hunting bugs. | | If I have to jump through so many hoops when changing jobs I'll | seriously consider a regular non-ML position. | 1970-01-01 wrote: | This book has fun problems! Example: | | During the Cold War, the U.S.A. developed a speech-to-text (STT) | algorithm that could theoretically detect the hidden dialects of | Russian sleeper agents. These agents (Fig. 3.7) were trained to | speak English in Russia and subsequently sent to the US to gather | intelligence. The FBI was able to apprehend ten such hidden | Russian spies and accused them of being "sleeper" agents. | | The algorithm relied on the acoustic properties of the Russian | pronunciation of the word (v-o-k-s-a-l), which was borrowed from | English V-a-u-x-h-a-l-l. It was alleged that it is impossible for | Russians to completely hide their accent, and hence when a Russian | would say V-a-u-x-h-a-l-l, the algorithm would yield the text | "v-o-k-s-a-l". To test the algorithm at a diplomatic gathering | where 20% of participants are sleeper agents and the rest | Americans, a data scientist randomly chooses a person and asks | him to say V-a-u-x-h-a-l-l.
A single letter is then chosen | randomly from the word that was generated by the algorithm, and it | is observed to be an "l". What is the probability that the person | is indeed a Russian sleeper agent? | 8note wrote: | Really small? | | How many russians in america are actually sleeper agents? | [deleted] | whatshisface wrote: | A single letter is chosen randomly? Huh? Why would you do that? | renewiltord wrote: | Seems a bit pointless to ask. You want them to make up a | story? "The data scientist's radio link degrades to static | while he waits for the answer and all he hears is the letter | 'l'". There. | whatshisface wrote: | It's just a bit funny to come up with a clever | justification for 50% of the problem only to quit at the | last moment with tacked-on math-problem stuff. | renewiltord wrote: | Haha fair enough. | TrackerFF wrote: | Likewise, in the military, countersigns have been | designed to make non-native speakers stand out - should the | countersign be compromised. For example, in WW2, Americans | would use "Lollapalooza", as Japanese speakers really struggled | with that word. | PeterisP wrote: | That's more of a shibboleth than a secret, which is literally | a practice as old as the Bible - "And the Gileadites took the | passages of Jordan before the Ephraimites: and it was so, | that when those Ephraimites which were escaped said, Let me | go over; that the men of Gilead said unto him, Art thou an | Ephraimite? If he said, Nay; Then said they unto him, Say now | Shibboleth: and he said Sibboleth: for he could not frame to | pronounce it right. Then they took him, and slew him at the | passages of Jordan: and there fell at that time of the | Ephraimites forty and two thousand." | kragen wrote: | Hmm, I'd think that in a rhotic accent a word like | "furlstrengths" or "fatherlands" would work better?
In | Japanese they sound like [FWrWrWsWtWriisW] or [harWsWtWriisW] | and [hazarWrandozW] respectively, rather than the native | [f@rlstriNGths] or [f@rlstriNGkths] and [fad@laendz]. | Adjacent /rl/ pairs are a special challenge; there are | multiple unvoiced fricatives that don't exist at all in | Japanese, and consonant clusters totally violate Japanese | phonotactics to the point where it's hard for Japanese people | to even detect the presence of some of the consonants. By | contrast Japanese [raraparWza] is only slightly wrong, | requiring a little bit more bilateral bypass on the voiced | taps and a slight rounding of the W sound. | | Some Japanese-American soldiers would be SOL tho. | kilotaras wrote: | Bayes' rule with odds ratios makes it pretty easy:

        base odds = 20:80 = 1:4
        relative odds = (1 letter / 6 letters) : (2 letters / 8 letters) = 2:3
        posterior odds = (1*2):(4*3) = 2:12 = 1:6
        final probability = 1/(6+1) = 1/7, or roughly 14.3%

| Bayes' rule with raw probabilities is a lot more involved. | thaumasiotes wrote: | Odds are usually represented with a colon -- the base odds | are 1:4 (20%), not 1/4 (25%). | [deleted] | Aethylia wrote: | Assuming that the algorithm is 100% accurate! | mattkrause wrote: | I was also distracted by the fact that you can't (usually) | hear the difference between English words _written_ with | one 'l' and those with two consecutive 'l's. | | "Voksal" and "Vauxhall" seem like they should each have six | phonemes. | hervature wrote: | I don't know about "a lot more". It is essentially the same | calculation without having to know 3 new terms. Let:

        A = the event they are a spy
        B = the event that an "l" appears

| and let ^c denote the complement of these events. Then,

        P(A) = 1/5
        P(A^c) = 4/5
        P(B|A) = 1/6
        P(B|A^c) = 1/4
        P(A|B) = P(B|A)P(A)/P(B)

| By the law of total probability,

        P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)

| which is a very standard formulation and really just your | equation, as you can rewrite everything I have done as:

        P(A|B) = 1 / (1 + P(B|A^c)P(A^c) / (P(B|A)P(A)))

| which is the base odds, posterior odds, and odds-to-probability | conversion all in one. The reason this method is strictly better, | in my opinion, is that the odds method breaks down if we introduce | a third type of person who doesn't pronounce l's. Also, after | doing one homework's worth of these problems, you just skip to the | final equation, in which case my post is just as short as yours. | FabHK wrote: | Hmm, a bit more involved maybe, but not that much. But your | calculation sure seems short. | | With S = sleeper and L = the letter "l", and remembering "total | probability":

        P(L) = P(L|S)P(S) + P(L|-S)P(-S)

| (where -S is not S), we have by Bayes:

        P(S|L) = P(L|S)P(S) / P(L)
               = P(L|S)P(S) / (P(L|S)P(S) + P(L|-S)P(-S))
               = (1/6 * 1/5) / (1/6 * 1/5 + 1/4 * 4/5)
               = (1/30) / (1/30 + 6/30)
               = 1/7

| la_fayette wrote: | Question aside: using arXiv for distributing such interview | questions seems inappropriate to me. Is there any SEO trick | behind it? | time_to_smile wrote: | Fisher Information is under the "Kindergarten" section? | | Maybe I've just been interviewing at the wrong places; I'd be | very curious if anyone here has been asked to even explain Fisher | information in any DS interview. | | It's not that Fisher information is a particularly tricky topic, | but I certainly wouldn't put it as a "must know" for even the | most junior of data scientists. Not that I wouldn't mind living | in a world where this was the case... just not sure I live in the | same world as the authors.
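The calculations in this sub-thread are easy to check mechanically. A minimal sketch (assuming, as Aethylia notes, that the algorithm itself is perfectly accurate, and taking the letter counts from the problem: 1 "l" in the 6 letters of "voksal", 2 "l"s in the 8 letters of "Vauxhall"):

```python
from fractions import Fraction

# Prior from the problem statement: 20% of guests are sleeper agents.
p_spy = Fraction(1, 5)

# Likelihood of drawing an "l": a spy's word is transcribed "voksal"
# (1 "l" out of 6 letters), an American's is "Vauxhall" (2 out of 8).
p_l_given_spy = Fraction(1, 6)
p_l_given_american = Fraction(2, 8)

# Law of total probability, then Bayes' rule.
p_l = p_l_given_spy * p_spy + p_l_given_american * (1 - p_spy)
p_spy_given_l = p_l_given_spy * p_spy / p_l

print(p_spy_given_l)  # 1/7
```

Exact rational arithmetic reproduces both kilotaras's odds-based answer and FabHK's direct calculation: 1/7, roughly 14.3%.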
| sdenton4 wrote: | When I was a mathematician it was pretty common to make jokes | whenever we actually had to evaluate an integral, along the | lines of 'think back to your elementary-school calculus...' | lp251 wrote: | "integrate by parts, like you learned in middle school" | | tf middle school did you go to?! | light_hue_1 wrote: | It's a joke. Like, we joke that the more math you learn the | less arithmetic you can do (ok, maybe that one isn't a | joke). | rindalir wrote: | In my undergrad abstract algebra class our professor | asked us a question about finding the order of a group | that involved dividing 32/8 and we all just sat there for | ten seconds before someone bravely ventured "...four?" | jpindar wrote: | I've experienced that many times among groups of | electrical engineers - we're all fine discussing | equations but once it's time to plug in the numbers no one | wants to volunteer an answer. | master_yoda_1 wrote: | My problem with this line of numerous shallow books and courses | is that they are 1) written by people who have no experience in | industry or are not working on "real" machine learning jobs, and | 2) they think the standard in industry is pretty low and any BS | works. For example, the concept of the "Lagrange multiplier" is | missing from the book. One needs this concept to understand | training convergence guarantees. | Raphaellll wrote: | I actually bought this as a physical book on Amazon. Naturally it | came as a print-on-demand book. Unfortunately it has many | problems in this format. E.g. the lack of margins makes it hard | to read the ends of sentences towards the gutter. Also some text | overlaps. Not sure what source file format you | have to provide to Amazon, but it's certainly not the pdf | provided in the repo.
| | Edit: | | It seems the overlapping text also occurs on some pdf readers: | https://github.com/BoltzmannEntropy/interviews.ai/issues/2 | ruph123 wrote: | The last 5 textbooks I bought new on amazon had similar | problems. Totally unacceptable. I started returning them and | (because most were exclusive to amazon) started buying them new | on ebay with great results. | spekcular wrote: | This is amazing. I am ecstatic. | | I've been looking for something exactly like this - and it's | executed better than I could have imagined. | | (Needs a good proofreader still, though! Also, whatever custom | LaTeX template the authors are using is misbehaving a bit in | various places. Still great content.) | abul123 wrote: | mcemilg wrote: | ML/DS positions are highly competitive these days. I don't get | why ML positions require harder preparation for interviews than | other CS positions when you do similar things. People expect you | to know a lot of theory, from statistics, probability, and | algorithms to linear algebra. I am ok with knowing the basics of | these topics, which are the foundations of ML and DL. But I don't | get being asked about eigenvectors and challenging algorithm | problems for an ML Engineering position when you have already | proven yourself with a Master's degree and enough professional | experience. I am not defending my PhD there. We will just build | some DL models, maybe read some DL papers and maybe try to | implement some of them. The theory is only 10% of the job; the | rest is engineering, data cleaning, etc. Honestly I am looking | for a soft way to get back to Software Engineering. | uoaei wrote: | In part because ML fails silently by design. Even if the code | runs flawlessly with no errors, the outputs could be completely | bunk, useless, or even harmful, and you won't have any idea | whether that is true just from watching The Number go down during | training. It's not enough to know how to build it but also _how | it works_.
It's the difference between designing the JWST and | assembling it. | mattkrause wrote: | I'm sure this happens, but do you think the problem is | actually one of mathematical savvy? | | My guess would be that more machine learning projects go off | the rails for want of understanding the data or the | {business, research} problem. | borroka wrote: | But the OP was asking something different, that is, why | someone should excessively focus on theory when, by the way, | DL theory is very far from being solid and trial and error is | the common way of operating in ML and AI. | | The "the model is in place, but I have no clue what it's doing, | so it can fail without me understanding when and how" argument | is a straw man. Especially for supervised learning, that is, | when we have labels for the data, it is immediately clear whether | the output of the model is "bunk, useless, or even harmful". | There is no "fail silently by design". | | I have been working in the field for almost 20 years in | academia and in industry, and it is not that I start every | PCA thinking about eigenvectors and eigenvalues; if you | asked me now, without preparation, what those are, I would be | somewhere between approximately right and wrong. But I fit many, | many very accurate models. | minimaxir wrote: | > In part because ML fails silently by design. | | That's why there's so much iteration and feedback gathering | (e.g. A/B tests) as a part of DS/ML, which incidentally is | rarely a part of the interview loop. | | Anyone who claims they can get a good model the first time | they train it is dangerously optimistic. Even the "how it | works" aspect has become more and more marginal due to black- | boxing. | hintymad wrote: | A reason for such requirements is similar to the reason that | software engineers need to grind leetcode: supply and demand. | Prestigious companies get hundreds, if not thousands, of | applications every day.
The companies can afford to look for candidates who have | raw talent, such as the capability of mastering many concepts | and being able to solve hard mathematical problems in a short | time. Case in point: you may not need to use eigenvectors | directly on the job, but the concept is so essential in linear | algebra that I as a hiring manager would expect a candidate to | explain and apply it in their sleep. That is, knowing | eigenvectors is an indirect filter to get people who are deeply | geeky. Is it the best strategy for a company? That's up for | discussion. I'm just explaining the motives behind such | requirements. | lumost wrote: | My guess is that this type of interview is partly why the ML | space is full of loud explainers who can't execute. | | When I approach a science problem at work with other folks | who have "scientist" in their title, I assume that some portion | between 30 and 60% will have no meaningful contribution to | the project other than discussion and disagreement with the | direction the effort is taking. Most of the time, these | individuals will not dirty themselves with the details | sufficiently to know how the algorithm/data/training process | works. | vsareto wrote: | I can't help but think there have been a ton of filters used in | the past to figure out if someone is deeply geeky, and we'll | continue to invent more in the future. | | It's really looking like another rat race. Especially since | there's no central authority, every hiring manager has the | potential to invent their own filter and make it arbitrarily | harder or easier based on supply and demand (and then the | filter drifts away from the intended purpose). | jollybean wrote: | But if there is an abundance of supply, the company has to | use some kind of filter. | | Testing for geekiness and the ability to solve tricky coding and | math problems seems like a rational way to do that.
| | If companies were starving for talent because 'nobody could | pass the test' - it would be another thing. | | But they have to set the bar on something, somewhere. | | I can't speak to AI/ML but I would imagine it might be hard | to hire there, given the very deep and broad concepts, | alongside grungy engineering. | | I've rarely had such fascination and interest in a field | that I would _never_ actually want to work in. | crate_barre wrote: | There's an abundance of supply of people with master's | degrees in machine learning? How's that possible? I | thought this shit was supposed to be hard. | | Has humanity just scaled way too hard or something? | Because if we're having an abundance of supply in | difficult cutting-edge fields to the point where they | also have their own version of Leetcode, then what hope | do average people have of getting _any_ job in this | world? | | Or, is it at all possible that companies are | disrespecting the candidate pool by being stingy and | picky? | | Maybe the truth is gray. | Mehdi2277 wrote: | I currently work as an ML engineer and have interviewed | on both sides for some well-known companies. | | The absolute demand in number of people is small compared | to popularity. It would not surprise me at all if many | computer science master's programs had a majority of the | students studying machine learning. I remember in | undergrad we had to ration computer science classes due | to too much demand from students. I think the school | tripled its CS majors over a couple of years. | | The number of needed ML engineers is much smaller than | total software engineers. When a lot of students decide | ML is coolest, we have an imbalanced CS pool with too many | wanting to do ML. Especially since, for ML to work, you | normally need good data engineering, backend engineering, | and infra, and the actual ML is only a small subset of the | service using ML.
| | At the same time, the supply of experienced ML engineers is | still low due to the recent growth of the field. Hiring ML | engineers with 5+ years of professional experience is more | challenging. The main place where supply is excessive is | for new graduates. | hintymad wrote: | It will be a rat race when there are so many interview books | and courses and websites. It was not a rat race before | 2005, when there were only two reasons that one could solve | problems like Pirate Coins or Queen Killing Infidel | Husbands: the person is so mathematically mature that such | problems are easy for them, or the person is so geeky that | they read Scientific American or Gardner's columns and | remembered everything they read. | littlestymaar wrote: | You're missing the third category: people like myself who | absolutely love this kind of riddle and destroy them in | a few minutes, without any significance for their actual | work abilities. | | I don't think I'm a bad engineer, but I'm certainly not | the rock star you absolutely need for your team; yet when | it comes to this kind of "cleverness" test, I'm really | really good. | | I had the "Queen Killing Infidel Husbands" (under | another name) in an interview last year and I aced it in | a few minutes. I didn't know about "Pirate Coins", | but when I read your comment HN said your comment was "35 | minutes ago" and now it says "40 minutes", which means I | googled the problem, figured out the solution and then | found the answer online to check that I was right, in less | than 6 minutes, all while putting my son to bed! | | It's really sad because many engineers much | better at their jobs than me will get rejected because | of pointless tests like this... | Der_Einzige wrote: | Given that PCA is heavily antiquated these days, I'd say that | asking your candidates to know algebraic topology (the basis | behind many much more effective non-linear DR algorithms like | UMAP) is far better...
But in spite of the field having long | ago advanced beyond PCA, you're still using it to gatekeep. | selimthegrim wrote: | The initialization strategy for UMAP is important enough | that asking about that in practice is probably more | important than anything out of Ghrist's book as an | interview question | | cf. | https://twitter.com/hippopedoid/status/1356906342439669761 | MontyCarloHall wrote: | > Case in point, you may not need to use eigenvectors | directly in the job, but the concept is so essential in | linear algebra and I as a hiring manager would expect a | candidate to explain and apply it in their sleep. | | Exactly. Whenever eigenvectors come up during interviews, | it's usually in the context of asking a candidate to explain | how something elementary like principal components analysis | works. If they claim on their CV to understand PCA, then | they'd better understand what eigenvectors are. If not, it | means they don't actually know how PCA works, and the | knowledge they profess on their CV is superficial at best. | | That said, if they don't claim to know PCA or SVD or other | analysis techniques requiring some (generalized) form of | eigendecomposition, then I won't ask them about eigenvectors. | But given how fundamental these techniques are, this is rare. | nerdponx wrote: | Maybe "eigenvectors" is a bad example, because it's a pretty | foundational linear algebra concept. | | But there is a threshold where it stops being a test of | foundational knowledge and starts being a test of arbitrary | trivia, and favors who has the most free time to study and | memorize said trivia. | whimsicalism wrote: | Having recently completed an MLE interview loop successfully | at a top company, I'm wondering where you are getting asked | complicated linear algebra questions in interview? | fault1 wrote: | Hopefully you aren't equating "eigenvectors" to | "complicated linear algebra question". 
| | But I agree, a lot of MLE roles don't get asked such | things. | | I think the OP's guide is closer to interviews I've seen | for PhD programs. | uoaei wrote: | The difference between trivia and meaty knowledge is somewhat | contextually dependent, but an understanding of how core | probability and statistics concepts are integrated into the | framework of machine learning by means of linear algebra and | the other analytical tools is pretty damn useful for having | substantive conversations about ML design decisions. It helps | when everyone on the team speaks that language, to keep up the | momentum. | devoutsalsa wrote: | I figure the best way to prepare for an ML job is to pull out | the nastiest working rat's nest of if statements you've ever | written & claim it was autogenerated by an adversarial network | (which was you fighting with your coworkers over your spaghetti | code). | barry-cotter wrote: | > But I don't get being asked about eigenvectors and challenging | algorithm problems for an ML Engineering position when you have | already proven yourself with a Master's degree and enough | professional experience. | | People know pity passes exist for Master's degrees. You can't | trust that someone actually knows what they should know just | because they have a degree. Ditto professional experience. The | entire reason FizzBuzz exists is that people with years of | professional experience can't program. | vanusa wrote: | We aren't talking about FizzBuzz here, but rather the | fashionable practice of subjecting people to 4-6 hours of | grilling on "medium-to-hard" problems that you absolutely | cannot fail, or even be slightly halting in your delivery on. | And which can only be effectively prepared for by investing | substantial amounts of time in by-the-book cramming.
| | On top of the fact that these problems are often poorly | selected, poorly communicated, conducted under completely | unrealistic time pressure, often as pile-ons (with 3-4 | strangers, as if just to add pressure and distraction), and | (these days) over video conferencing (so you have to stare into | the camera and pretend to make eye contact with people while | supposedly thinking about your problem, on top of shitty | acoustics), etc, etc. | | It's just fucking ridiculous. | vidarh wrote: | I'm quite happy these places make it so clear they're not | places I would be happy to work. I always ask about the | interview process and tell the recruiters I'm not | interested if they expect really lengthy processes. I'm | fine with things dragging out if they have additional | questions after the initial interviews, but not if their | default starting position is that they need that. | jstx1 wrote: | Data science and ML interviews can be tough because it's very | difficult to prepare for everything and cover all the theory. A | lot of the value you add comes from knowing the theory, so it's | understandable to test it, but it's still hard to prepare well. | And you have a take-home and/or LC-style problem(s) in addition | to the theory interview. | minimaxir wrote: | The hard questions in DS/ML interviews I've received over the | years aren't the theory questions (which I rarely get asked), | but the trick SQL questions that often depend on obscure syntax | and/or dialect-specific features, or "implement binary search" | when I'm not in the mindset for that, as that isn't what DS/ML | is in the real world. | jstx1 wrote: | I think they're fine as long as you know the format and have | an opportunity to prepare or just get in the right mindset | for it. And some things (like binary search) should be easy | to write anyway.
| | The SQL questions can also be a symptom of the type of job - | Facebook's first data science round focuses a lot on SQL, but | that's because it's a very product/analytics/decision-making | focused role without that much coding or ML. With data | science you have to be more careful about these things when | searching for a job; you can't just use the job title as a | descriptor. | minimaxir wrote: | > And some things (like binary search) should be easy to | write anyway. | | It's a different story when a) your mind is set on | statistics/linear algebra, b) you've never had to actually | implement binary search by hand since college, and c) even | if you do implement the algorithm and demonstrate that you | have a general understanding, it must work perfectly and | pass test cases, otherwise it doesn't count. | | FWIW I was rarely asked about algorithmic complexity, which | is more relevant in DS/ML, albeit usually in the | context of whiteboarding another algorithm and the | interviewer mocking me for doing it in O(n) instead of | O(log n). | kragen wrote: | Binary search in particular is surprisingly tricky, which | is precisely what makes it useful for telling if someone | knows how to program. To a significant extent, though, | you can cheat by studying binary search itself, which is | a surprisingly beautiful thing. | | I like this formulation for finding the first index in a | half-open range where p is true, assuming p stays true | thereafter:

        bsearch p i j := i                      if i == j
                         else bsearch p i m     if p m
                         else bsearch p (m + 1) j
          where m := i + (j - i)//2

| Or in Python:

        def bsearch(p, i, j):
            m = i + (j - i) // 2
            return (i if i == j
                    else bsearch(p, i, m) if p(m)
                    else bsearch(p, m+1, j))

| The only tricky thing about this formulation is that m < | j if i < j, thus the asymmetric +1 in only one case to | ensure progress. If invoked with a p such as a[m] >= k it | gives the usual binary search on an array without early | termination. The i + (j - i) // 2 formulation is not | needed in modern Python, but historically an overflowing | (i + j) // 2 was a bug in lots of binary search library | functions, notably in Java and C. | | (Correction: I said a[m] <= k. This formulation is less | tricky than the usual ones, but it's still tricky!) | minimaxir wrote: | > Binary search in particular is surprisingly tricky, | which is precisely what makes it useful for telling _if | someone knows how to program_. | | That's the problem. There are many other ways to do that | without risking false negatives and annoying potential | candidates (e.g. I would not reapply to places that have | rejected me due to skepticism about my programming | abilities, using tests blatantly irrelevant to day-to- | day work, because it's a bad indication of the engineering | culture). | | Even FizzBuzz is better at accomplishing that task. | kragen wrote: | There are levels of not knowing how to program that go | beyond FizzBuzz. But sure, many programming jobs don't | require them. | minimaxir wrote: | If that's the case for the DS/ML domain, then a short | take-home exam should provide a better test of | practical coding ability (the common counterargument that | "take-home exams can be gamed" is a strawman; that would | be more the interviewer's fault for creating a flawed | exam). | | In my case, I typically got the "implement binary search" | questions in a technical interview _after_ I passed a | take-home exam, which just makes me extra annoyed. | kragen wrote: | Agreed. | | If you're gaming the take-home exam by looking up the | answer on Stack Overflow, you could game the same exam in | person by reading books of interview questions ahead of | time, and the interviewer can avoid that by making up new | questions. (OTOH if you're gaming the take-home exam by | paying someone else to solve the problem for you, that | might be harder to tell.) | nerdponx wrote: | FizzBuzz (or equivalent) is actually great IMO.
It weeds | out the people who lied on their resume, without | punishing the people who never learned CS because they | were too busy learning things that were actually useful | to DS, like statistics or data visualization. | jstx1 wrote: | I've actually been given fizzbuzz in a DS interview! Up | to that point I thought that fizzbuzz was just a meme | because it's obviously too easy. | rightbyte wrote: | I tried to write FizzBuzz on paper when I first heard of | it, and it had a bug, printing fizzbuzzfizzbuzz on 15. | | If you want a correct program without a compiler/computer | I don't think anything is too easy. Maybe like, "make a | function returning the sum of two float parameters". | kragen wrote: | That would just test syntax, though. Fizzbuzz tests | logic. Your bug was a logic bug. | | To a certain extent you can dispense with mental logic by | using a compiler. But the feedback loop is much slower. | Thinking your logic through before feeding it to a | compiler is like looking at a map when you're driving a | car; you can cut off whole branches of exploration. | | Binary search is a particularly tricky logic problem in | part because it's so deceptively simple. In a continuous | domain it's easy to get right, but the discrete domain | introduces three or four boundary cases you can easily | get wrong. | | But the great-grandparent is surely correct that many | programming jobs don't require that level of thinking | about program logic. For many that do, it's because the | codebase is shitty, not because they're working in an | inherently mentally challenging domain. | rightbyte wrote: | Ye, I meant running it and then correcting the error. | | Concerning binary search, I actually implemented that in | an ECU for message sorting. It took like a whole day, | including getting the outer boundaries off by one (too | big) in the first test run. Funnily enough the vehicle | ran fine anyway.
| | I would never pull that algorithm off correctly in an | interview without, like, training for it first, I think. | disgruntledphd2 wrote: | Facebook Product Data Science has always been a Product | Analyst role more than anything else. I did the interviews | a while back, and it was a pretty fun experience, but it's | not what a lot of people call data science. | jstx1 wrote: | > but it's not what a lot of people call data science | | I think that's changed a bit over time and the term has | expanded to mean more things. In addition to Facebook, | another great example is this article from Lyft in 2018 | where they say that they're renaming all their data | analysts to data scientists and all their data scientists | to research scientists - | https://medium.com/@chamandy/whats-in-a-name-ce42f419d16c | kevinventullo wrote: | In my experience, it varied greatly from team to team. | nerdponx wrote: | I had an "implement binary search" interview once. I came | away feeling like I was being interviewed for the wrong role. | I don't understand how anyone could think that's an | appropriate interview task for a DS position. | whimsicalism wrote: | I'm an MLE and I get asked much harder questions than that. | Implement a binary search seems ... fine? | nerdponx wrote: | But it makes sense for MLE! IMO you should ask a stats or | probability question in a DS interview. | jstx1 wrote: | The distinction between the two roles isn't that clear. | Some data science jobs are very focused on engineering. | fault1 wrote: | Agreed. MLE in very ML-heavy companies tends to mean SWEs | who work on ML systems, and sometimes that can mean as | much working on infrastructure as on modeling. | pietromenna wrote: | Wow! Great resource! Thank you! | kragen wrote: | Why are all the em dashes missing from the PDF? | aesthesia wrote: | This may be a rendering issue.
Some interaction of the Computer | Modern font, the TeX layout algorithm, and Chrome's rendering | engine sometimes ends up making em-dashes and minus signs | invisible. | pradn wrote: | I think I know the answer to this, but how bad should I feel for | being a software engineer with little-to-no knowledge of deep | learning? I suspect it's not bad at all since the software | engineering field has split into a few camps, and mine - backend | systems work - isn't in the same universe as the machine learning | one, for the most part. | jstx1 wrote: | Not bad at all. I'm a data scientist and my not knowing React | doesn't affect me one bit. | abul123 wrote: | pugio wrote: | I'm really enjoying the discussion here, as I've been thinking a | lot about what a full modern ML/DS curriculum would look like. | | I currently work for a non-profit investigating making a free, | high-quality set of courses in this space, and would love to talk | to as many people as I can who are either working in ML/DS or looking | to get into the field. (I have ideas but would prefer to ground them in as | many real-world experiences as I can collect.) | | If anyone here wouldn't mind chatting about this, or even just | sharing an experience or opinion, please drop me an email (in my | profile). | | EDIT: We already have Intro to DS, and a Deep RL sequence far | along in our pipeline, but are looking to see where we can help | the most with available resources. | | I really appreciate this Interviews book as an example of what | topics might be necessary (and at what level), taking into | account the qualifying discussion here, of course. | lvl100 wrote: | In my 20s, I was doing data science at a very high level spanning | multiple disciplines. Truly state of the art. I would like to | think I was quite good at my job. | | I am 99% certain I would not have passed the interview bars set | today. More specifically, the breadth they expect you to master | is very puzzling (and seemingly unrealistic).
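The binary-search boundary cases discussed above (kragen's "three or four boundary cases you can easily get wrong", rightbyte's off-by-one on the outer boundaries) mostly come down to keeping one loop invariant straight. A minimal sketch in Python; the function name and the half-open-interval convention are my choices, not from the thread:

```python
def binary_search(xs, target):
    """Return the index of target in sorted list xs, or -1 if absent.

    Uses a half-open interval [lo, hi): the invariant is that target,
    if present, lies somewhere in xs[lo:hi].
    """
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2  # lo <= mid < hi, so the range always shrinks
        if xs[mid] < target:
            lo = mid + 1      # target can only be to the right of mid
        elif xs[mid] > target:
            hi = mid          # mid is excluded; target is to its left
        else:
            return mid
    return -1                 # empty interval: target is not in xs
```

The half-open convention sidesteps the classic confusion between `hi = mid` and `hi = mid - 1`: because `hi` is exclusive, `hi = mid` is always correct, and the empty-list and single-element cases fall out of the same loop with no special handling.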
| light_hue_1 wrote: | I've interviewed well over 100 people for DL/ML positions. This | may be a good roadmap to what some people ask, but it's a | terrible guide to what you should ask. It's like a collection of | class exam questions. | | Just as in programming, the world is full of people who can | recite facts but don't understand them. There is no point in | asking what an L1 norm is and asking for its equation. Or, say, | giving someone the C++ code that corresponds to computing the | norm of a vector and asking them "what does this do". Or even | worse, showing them some picture of some cross-validation scheme | and asking them to name it. Yes, your candidates should be able | to do this, but positive answers to these kinds of questions are | nearly useless. These are the kinds of questions you get answers | to by Googling. | | It's far more critical to know what your candidate can do, | practically. Create a hypothetical dataset from your domain where | the answer is that they need to use an L1 norm. Do they realize | this? Do they even realize that the distance metric matters? Are | they proposing reasonable distance metrics? Do they understand | what goes wrong with different distance metrics? Etc. Or problems | where they need to use a network but, say, padding matters a lot. | Or where the particulars of cross validation matter a lot. | | This also gives you depth. "Name this cross validation scheme" | gives you a binary answer: "yes, they can do it, or no they can't." | And you're done. If you have a hypothetical dataset, you can keep | prodding. "Ok, but how about if I unbalance the data" or "what if | we now need to fine tune" or "what if the payoffs for precision | and recall change in our domain", "what if my budget is limited", | etc. It also lets you transition smoothly to other kinds of | questions. And to discover areas of deeper expertise than you | expected.
For example, even for the cross validation questions, | if you ask that binary question, you might never discover that a | candidate knows about how to use generalized cross validation, | which might actually be very useful for your problem. | | The uninformative, tedious mess that we see in programming | interviews? This is the equivalent for ML/DL interviews! | flubflub wrote: | A problem with these questions is that people can answer a lot of | them without knowing ML/DL; admittedly cherry-picked, but | still. | | For example: what is the definition of two events being | independent in probability? | | Or the L1 norm example: 'Which norm does the following equation | represent? |x1 - x2| + |y1 - y2|' | | Find the Taylor series expansion of e^x (this is high-school | maths). | | Find the partial derivatives of f(x, y) = 3 sin^2(x - y) | | Limits etc... | | These aren't specific to deep learning or machine learning, not | that I claim to be a practitioner. | MichaelRazum wrote: | Exactly, I thought the same. Not sure what a really good | alternative is. But you may be at risk of getting bad candidates, | since they might be the ones with the best interview practice. | | Maybe that kind of question is OK for people without | experience, but not for seniors. | coliveira wrote: | Do you have any books/material that can help the learner | acquire this deeper understanding? | master_yoda_1 wrote: | I know one good reference. | | https://www.deeplearningbook.org/ | | Also, there are various courses and lectures, but those need | time and effort. There are no shortcuts like the book posted | by OP. | light_hue_1 wrote: | Yeah. You just have to build models, experiment, | intentionally make bad decisions, and get a feel for how | things work. There's no clear shortcut. | | But this is also what you will practically be doing. | erwincoumans wrote: | Wow, nice resource! Wish it had some sections about (deep) | reinforcement learning and its algorithms.
Looks like it is in | the plan, though. | jstx1 wrote: | RL is still kind of niche - the number of companies that ship | anything using RL and the number of jobs that require it are | both quite low. | master_yoda_1 wrote: | Just a clarification: I think you are conflating RL and | robotics. RL algorithms can be used anywhere: in ads, | NLP, computer vision, etc. | mrfusion wrote: | Are there deep learning roles that focus more on software | engineering and using the tools rather than having a deep | understanding of statistics? | fault1 wrote: | I would say that, on average, MLE roles tend to be more SWEng-heavy. | But some roles involve as much creating infrastructure as running | the tools. | jstx1 wrote: | There are. But | | 1) the titles will vary a lot (software engineer, ML engineer, | research engineer, data scientist, etc.), which makes it hard to | locate those jobs and to move in the job market in general | | 2) you still need a reasonable amount of theory (not | necessarily too much statistics) to use the tools well. And in | all likelihood you will be tested on it in some way during the | interviews. | | 3) the interviews/job descriptions that don't emphasise the | theory will often be for jobs where you get a title like | Machine Learning Engineer but you focus more on the | infrastructure than on the ML code | throwaway6734 wrote: | I think they're called research engineering roles or ML | engineering | time_to_smile wrote: | > having a deep understanding of statistics? | | As someone with a strong background in statistics, please tell | me where I can find DS jobs that require this. | | For me and all my statistics friends in DS, much of the | frustration is in how hard it is to pass DS interviews when you | understand problems more deeply than "use XGBoost".
I have found | that very few data scientists really even understand basic | statistics. I once failed an interview because an interviewer | did not believe that logistic regression could be used to answer | statistical inference questions (when it, and more generally the | GLM, is the workhorse of statistical work). | | And to answer your question, whenever I'm in a hiring-manager | position, I very strongly value strong software engineering | skills. DS teams made up of people who are closer to curious | engineers tend to greatly outperform teams made up of | researchers who don't know you can write code outside of a | notebook. | disgruntledphd2 wrote: | A good conceptual understanding of statistics is always | helpful. | | It's not really tested for in most places, though, where they | regard a DS as a service that produces models.
___________________________________________________________________
(page generated 2022-01-10 23:00 UTC)
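On time_to_smile's point that logistic regression answers inference questions: with a single binary covariate, the unregularized MLE slope of a logistic regression is exactly the log odds ratio of the underlying 2x2 table, an inference quantity rather than just a prediction score. A pure-Python sketch; the toy dataset and the simple gradient-ascent fitter are illustrative, not from the thread:

```python
import math

def fit_logistic(xs, ys, steps=5000, lr=1.0):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the
    log-likelihood (no regularization, so this recovers the MLE)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p        # score contribution for the intercept
            g1 += (y - p) * x  # score contribution for the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical 2x2 table: exposed (x=1): 30 events, 10 non-events;
# unexposed (x=0): 10 events, 30 non-events.
xs = [1] * 40 + [0] * 40
ys = [1] * 30 + [0] * 10 + [1] * 10 + [0] * 30
b0, b1 = fit_logistic(xs, ys)

# The slope converges to the table's log odds ratio, log((30/10)/(10/30)).
log_or = math.log((30 / 10) / (10 / 30))
```

In practice one would use a GLM library that also reports standard errors and p-values for the coefficients; the point of the sketch is only that the fitted coefficient itself carries a direct inferential interpretation (a log odds ratio), which is why the GLM family is a statistical workhorse and not merely a predictor.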