[HN Gopher] Andreessen-Horowitz craps on "AI" startups from a gr... ___________________________________________________________________ Andreessen-Horowitz craps on "AI" startups from a great height Author : dostoevsky Score : 253 points Date : 2020-02-24 20:31 UTC (2 hours ago) (HTM) web link (scottlocklin.wordpress.com) (TXT) w3m dump (scottlocklin.wordpress.com) | bryanrasmussen wrote: | Generally the use of the phrase "from a great height" implies the | height is one of morality, intellect, or valor (each of these | decreasing in usage). I'm not exactly sure what the great height | Andreessen-Horowitz craps from is composed of - maybe money? | | I think they may just be crapping on them from a reasonable | vantage point. | KaiserPro wrote: | The height is not really about morals. It's more about the blast | radius of the shit. | darwingr wrote: | Or like "nuked from orbit" | raiyu wrote: | The number of places where machine learning can be used | effectively from both a cost perspective and a return perspective | is small. They are usually tremendously large datasets at | gigantic companies, and they probably have to build in-house | expertise because it's hard to package this up into a product and | resell it for various industries, datasets, etc. | | Certainly something like autonomous driving needs machine | learning to function, but again, these are going to be owned by | large corporations, and even when a startup is successful, it's | really about the layered technology on top of machine learning | that makes it interesting. | | It's kind of like what Kelsey Hightower said about Kubernetes. | It's interesting and great, but what will really matter is what | service you put on top of it, so much so that whether you use | Kubernetes becomes irrelevant. | | So I think companies that are focusing on a specific problem, | providing that value-added service, building it through machine | learning, can be successful.
Meanwhile, just broadly deploying machine | learning as a platform in and of itself can be very challenging. | | And I think the autonomous driving space is a great example of | that. They are building a value-added service in a particular | vertical, with tremendous investment, progress, and potentially | life-changing tech down the road. But as a consumer it's really | the autonomous driving that is interesting, not whether they are | using AI/machine learning to get there. | andreilys wrote: | _"The number of places where machine learning can be used | effectively from both a cost perspective and a return | perspective is small."_ | | Thankfully transfer learning and super convergence invalidate | this claim. | | Using pre-trained models + specific training techniques | significantly reduces the amount of data you need, your | training time, and the cost to create near-state-of-the-art | models. | | Both Kaggle and Google Colab offer free GPUs. | ska wrote: | >Thankfully transfer learning and super convergence | invalidate this claim. | | IME it is nowhere near as universally successful as this | suggests. | Q6T46nT668w6i3m wrote: | How would you explain the rise (and success) of machine | learning in science? A lab that uses some learning-based method | will likely be limited to just one or two people (responsible | for data acquisition, feature engineering, evaluation, etc.) | and extremely finite data. | PeterisP wrote: | Can you elaborate on what you mean by "A lab that uses some | learning-based method will likely be limited to just one or | two people (responsible for data acquisition, feature | engineering, evaluation, etc.)"?
I know a bunch of labs that | apply machine learning to specific tasks, and the parts you | list _each_ can easily take up multiple people for years for | a single task - not counting data acquisition, because data | is definitely not "extremely finite". You need lots of | quality data, and improving data is something that always | yields improvements and can easily eat up more manpower than | any budget allows. | semi-extrinsic wrote: | How do you define success? Adoption? Because right now, | writing "we will use machine learning to solve X" in a grant | proposal is an easy way to increase chances of getting | funding. | Barrin92 wrote: | I'm not sure there is a rise. 'Science' is a huge domain. | Machine learning, if I had to guess, maybe plays a role in < 1% | of it, and that may be overstating it. | | Also, it's doubtful whether machine learning should even be | categorized as science. The goal of science is to generate insight and | knowledge; ML solves particular engineering problems or | searches problem spaces, it doesn't build fundamental | scientific models. | ska wrote: | It's not clear there has been any deep impact actually, but | there has been a lot of discussion (and grant proposals). | | I've seen a lot of cross-pollination of ML and AI techniques | into various disciplines. A large percentage just didn't work | at all; most of the rest were more "kind of interesting, | but". Nothing earthshaking happened, although the pop-sci press | likes to talk about it a lot. | | If you have more digital data than you used to, using modern | free frameworks and toolkits to do basic (i.e. older, boring, | but understood) ML stuff to understand it seems to have a | reasonable return. Mostly I think this is because it becomes | accessible to someone without much background in the area, | and you can do reasonable things without having to put 6 | months of reading and implementing together before starting.
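ska's point about "older, boring, but understood" ML with free frameworks can be made concrete. A minimal sketch using scikit-learn on synthetic data (the dataset and numbers are illustrative assumptions, not anything from the thread):

```python
# A hedged sketch of the "boring but understood" baseline: an
# off-the-shelf scikit-learn pipeline a non-specialist can stand up
# quickly. Synthetic data stands in for "more digital data than you
# used to have".
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Scaling + logistic regression: older, boring, well understood.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(baseline, X, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.3f}")
```

The point is not the model but the effort profile: no months of reading, no GPU bill, and a result that is easy to explain.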
| joshuaellinger wrote: | I just spent $50K on colo hardware. I'm taking a $10K/mo Azure | spend down to a $1K/mo hosting cost. | | But the real kicker is that I get x5 the cores, x20 RAM, x10 | storage, and a couple of GPUs. I'm running last-generation | Infiniband (56 Gb/sec) and modern U.2 SSDs (say 500 MB/sec per | device). | | I figure it is going to take me about $10K in labor to move and | then $1K/mo to maintain and pay for services that are bundled in | the cloud. And because I have all this dedicated hardware, I | don't have to mess around with docker/k8s/etc. | | It's not really a big data problem but it shows the ROI on owning | your own hardware. If you need 100 servers for one day per month, | the cloud is amazing. But I do a bunch of resampling, simple | models, and interactive BI type stuff, so colo wins easily. | seibelj wrote: | I published an article a week ago about how AI is the | biggest misnomer in tech history https://medium.com/@seibelj/the- | artificial-intelligence-scam... | | I wrote it to be tongue-in-cheek in a ranting style, but | essentially "AI" businesses and the technology underpinning them | are not the silver bullet the media and marketing hype has made | them out to be. The linked article about a16z shows how AI is the | same story everywhere - enormous capital to get the data and | engineers to automate, but even the "good" AI still gets it wrong | much of the time, necessitating endless edge cases, human | intervention, and eventually it's a giant ball of poorly- | understood and impossible-to-maintain pipelines that don't even | provide a better result than a few humans with a spreadsheet. | ativzzz wrote: | I agree with the author's opinion here: | | > I'll go out on a limb and assert that most of the up front data | pipelining and organizational changes which allow for it are | probably more valuable than the actual machine learning piece. | | Especially at non-tech companies with outdated internal | technology.
I've consulted at one of these and the biggest wins | from the project (I left before the whole thing finished, | unfortunately) were overall improvements to the internal data | pipeline, such as standardization and consolidation of similar or | identical data from different business units. | noelsusman wrote: | I do data science at a non-tech company with outdated internal | technology and I've seen this over and over again. Honestly, | though, it's worth every penny, because often the only way to | get the resources to truly solve data pipeline issues is to get | an executive to buy some crap from a vendor and force everyone | to make it work. | fxtentacle wrote: | I predict a great future for startups that sell pickaxes, err, | tools for AI. | | AI is like the new gold rush. And just like back then, it's not | the gold diggers that will get rich. | | "Most people in AI forget that the hardest part of building a new | AI solution or product is not the AI or algorithms -- it's the | data collection and labeling." | | https://medium.com/startup-grind/fueling-the-ai-gold-rush-7a... | | (from 2017) | moksly wrote: | Is it the new gold rush, though? I work in a large organisation | that has a lot of data and inefficient processes, and we | haven't bought anything. | | It hasn't been for lack of trying. We've had everyone from | IBM and Microsoft to small local AI startups try to sell us | their magic, but no one has come up with anything meaningful to | do with our data that our analysis department isn't already | doing without ML/AI. I guess we could replace some of our | analysis department with ML/AI, but working with data is only | part of what they do; explaining the data and helping our | leadership make sound decisions is their primary function, and | it's kind of hard for ML/AI to do that (trust me).
| | What we have learned, though, is that even though we have a | truckload of data, we can't actually use it unless we have | someone on deck who actually understands it. IBM had a run at | it, and they couldn't get their algorithms to understand | anything, not even when we tried to help them. I mean, they did | come up with some basic models that their machine | spotted/learned by itself by trawling through our data, but | nothing we didn't already have. Because even though we have a | lot of data, the quality of it is absolute shite. Which is | anecdotal, but it's terrible because it was generated by | thousands of human employees over 40 years, and even though I'm | guessing, I doubt we're unique in that respect. | | We'll continue to do various proofs of concept and listen to | what suppliers have to say, but I fully expect most of it to go | the way Blockchain did, which is where we never actually find a | use for it. | | With a gold rush, you kind of need the nuggets of gold to sell, | and I'm just not seeing that with ML/AI. At least not yet. | m0zg wrote: | "Huge compute bills" usually come from training, or to be more | precise, the hyperparameter search that's required before you find a | model that works well. You could also fail to find such a model, | but that's another discussion. | | So yeah, you could spend one or two FTE salaries' (or one deep | learning PhD's) worth of cash on finding such models for your | startup if you insist on helping Jeff Bezos to wipe his tears | with crisp hundred dollar bills. That's if you know what you're | doing, of course. Literally unlimited amounts could be spent if | you don't. Or you could do the same for a fraction of the cost by | stuffing a rack in your office with consumer-grade 2080ti's. Just | don't call it a "datacenter" or NVIDIA will have a stroke. Is | that too much money? Not in most typical cases, I'd think.
If the | competitive advantage of what you're doing with DL does not | offset the cost of 2 meatspace FTEs, you're doing it wrong. | | That, once again, assumes that you know what you're doing, and | aren't doing deep learning for the sake of deep learning. | | Also, if your startup is venture funded, AWS will give you $100K | in credit, hoping that you waste it by misconfiguring your | instances and not paying attention to their extremely opaque | billing (which is what most of their startup customers proceed to | do pretty much straight away). If you do not make these | mistakes, that $100K will last for some time, after which you | could build out the aforementioned rack full of 2080ti's on prem. | zitterbewegung wrote: | I was training ML models on AWS / Google Colab. After racking | up a few hundred dollars on AWS I bought a Titan RTX (I also | play video games, so it does that very well too). | artsyca wrote: | Slow clap | bob1029 wrote: | I find it fun how the cost of the cloud is forcing people to | consider what absolutely must run in the cloud (presumably for | stability and compliance reasons) and what can be brought back | on-prem. | | We don't train ML models, but we are in a similar boat | regarding cloud compute costs. Building our solutions for our | clients is a compute-heavy task which is getting expensive in | the cloud. We are considering options such as building | commodity Threadripper rigs, throwing them in various | developers' (home) offices, installing a VPN client on each and | then attaching them as build agents to our AWS-hosted Jenkins | instance. In this configuration we could drop down to a | t3a.micro for Jenkins and still see much faster builds. The | reduction in iteration time over a month would easily pay for | the new hardware. An obvious next step up from this is to do | proper colocation, but I am of a mindset that if I have to | start racking servers I am bringing 100% of our infrastructure | out of the cloud.
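The cloud-vs-colo arithmetic running through this subthread reduces to a break-even calculation. A sketch using joshuaellinger's figures from upthread ($50K hardware, ~$10K migration labor, $1K/mo hosting vs. a $10K/mo Azure spend), purely illustrative:

```python
# Back-of-the-napkin break-even for moving a steady workload off the
# cloud. Inputs are the figures quoted upthread, not a general model.
def breakeven_months(hardware, migration_labor, onprem_monthly, cloud_monthly):
    """Months until the one-time spend is recouped by the monthly saving."""
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        raise ValueError("on-prem must actually be cheaper per month")
    return (hardware + migration_labor) / monthly_saving

months = breakeven_months(
    hardware=50_000,         # colo hardware
    migration_labor=10_000,  # one-time cost of the move
    onprem_monthly=1_000,    # hosting + bundled services
    cloud_monthly=10_000,    # prior Azure spend
)
print(f"Break-even after ~{months:.1f} months")  # ~6.7 months with these inputs
```

Everything after the break-even point is saving, which is why the calculus flips for steady 24x7 workloads but not for the "100 servers for one day per month" case.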
| buckminster wrote: | I once had a borrowed Sun blade server in my home office. The | fan in it sounded like an industrial vacuum cleaner. It got | moved to a different room and was powered on as little as | possible. | | Your plan makes sense, but be mindful of the acoustics or your | devs may grow to hate you. | m0zg wrote: | BTW, the only reason consumer vacuum cleaners are so | loud is that consumers associate loudness with suction | power. | | "Backpack" style commercial vacuum cleaners have more | suction, and are barely audible in comparison. | buckminster wrote: | I stand corrected. It was much louder than a consumer | vacuum but my analogy skills are weak. | ggm wrote: | Your analogy skills were strong, because analogy is | rooted in myth, not fact. Achilles did not actually | _have_ an Achilles tendon because Achilles _did not | exist_. | | There does not have to be an incredibly loud functional | industrial vacuum cleaner for figuratively _everyone_ to | get your analogy, because the Herculean reality of vacuum | cleaners is that you cannot clean an Augean stable of | Lego on the floor without a lot of noise. If you get my | analogy. | bob1029 wrote: | Excellent point. If we are building these rigs by hand | (which is a likely option considering the initial usage | context), the cooling solution would probably be a Noctua | NH-U14S or similar. I already have one of these in my | office attached to a 2950X and it is dead silent. You can | definitely hear it when every core is pegged, but it's | hardly noticeable over any other arbitrary workstation. The | sound is nowhere near as intrusive as something like a | blower on a GPU (or, God forbid, a Sun Microsystems blade). | m0zg wrote: | This is not a new phenomenon.
As early as 2009 I worked | for a company (ads, but not Google) which outgrew the typical | "cloud" cost structure at the time and moved everything to a | more traditional datacenter, saving substantial money even | counting the 3 extra SREs they had to hire to absorb the | increased support needs. AWS charges what the market will | bear, and as such it was never designed to make sense for | everyone. One needs to redo the back-of-the-napkin math from | time to time. | blt wrote: | If I worked from home and my employer asked me to install a | server in my home, I would tell them to go fuck themselves. | | It's noisy, it takes up space, and presumably I'm on call to | fix it if it breaks. | | You should pay them an extra 24x(PSU wattage)x(peak $/Wh in | area) per day for the electricity too. | | I'm alarmed that someone in your company felt this idea was | appropriate enough to propose. | pridkett wrote: | There's also the issue that data scientists often want to go | running to hyperparameter optimization and neural architecture | search. In most cases improving your data pipelines and | ensuring the data are clean and efficient will pay off much | more quickly. | fxtentacle wrote: | But manually improving the data pipeline requires an | understanding of the problem, whereas doing a hyperparameter- | optimized architecture search just needs $$$ hardware and no | clue on the side of the operator. | derefr wrote: | Or, to put that another way: if you knew what algorithm the | AI would be using to discriminate the signal from the noise | in your data, why would you need the AI? Just write that | algorithm. | fxtentacle wrote: | Exactly :) | | In most cases, unsupervised learning is nothing more than | having the AI try to approximate the solution of your | highly non-linear loss function. So if there's any way of | solving that loss function directly, it will perform like | a well-trained AI. | fxtentacle wrote: | No, also inference is quite expensive.
You'll have 100% usage | on a $10,000 GPU for 3s per customer image for a decently sized | optical flow network. That's 3 hours of compute time for 1 | minute of 60fps video. | | Now let's say your customer wants to analyze 2 hours = 120 | minutes of video and doesn't want to wait more than those 3 | hours; then suddenly you need 120 servers with one $10k GPU | each to service this one customer within 3 hours of waiting. | | Good luck reaching that $1,200,000 customer lifetime value to | get a positive ROI on your hardware investment. | | When I talk about AI, I usually call it "beating the problem to | death with cheap computing power". And looking at the average | cleverness of AI algorithm training formulas, that seems to be | exactly what everyone else is doing, too. | | And since I'm being snarky anyway, there are two subdivisions to | AI: | | supervised learning => remember this | | unsupervised learning => approximate this | | Neither approach puts much emphasis on intelligence ;) And | both can usually be implemented more efficiently | without AI, if you know what you are doing. | m0zg wrote: | Some kinds of inference are expensive, yes, not going to | dispute that. But 99.95% of it is actually surprisingly | inexpensive. Hell, a lot of useful workloads can be deployed | on a cell phone nowadays, and that fraction will increase | over time, further reducing inference costs or eliminating | them outright (or rather moving them to the consumer). | | For the vast majority of people the main expense is creating | the combination of a dataset and model that works for their | practical problem, with the dataset being the harder (and | sometimes more expensive) problem of the two. | | The dataset is also their "moat", even though most of them | don't realize it, and don't put enough care into that part of | the pipeline. | fxtentacle wrote: | The algorithms that run on cell phones tend to be specially | optimized and quality-reduced neural networks.
For example, | https://arxiv.org/abs/1704.04861 | ignoramous wrote: | As someone hoping to build a world-wide footprint, say 25 to 50 | DCs, of servers to deploy to with unmetered bandwidth, what are | some alternatives to the usual suspects? | | I have come across fly.io, Vultr, Scaleway, Stackpath, Hetzner, | and OVH, but either they are expensive (in that they charge for | bandwidth and uptime) or do not have a wide enough footprint. | | I guess colos are the way to go, but how does one work with | colos, allocate servers, deploy to them, ensure security and | uptime, and so on from a single place? Because dealing with them | individually might slow down the process. Is there tooling that | deals with multiple colos, like the ones for multi-cloud | (min.io, k8s, Triton, etc.)? | KaiserPro wrote: | > As someone hoping to build a world-wide footprint | | Does adding an extra 100ms to the response time cost you that | much business-wise? | | As for colos, it depends on scale. If you have 30k servers | worldwide, it pays to have someone manage the contracts for | you. If not, it pays to go with the painful arseholes like | Vodafone, or whoever bought Cable & Wireless's stuff. | | As for security, it gets very difficult. You need to make | sure that each machine is actually running _what_ you told | it, and know if someone has inserted a hypervisor shim | between you and your bare metal. | | None of that is off the shelf. | | Which is why people pay the big boys, so that they can prove | chain of custody and have very big locks on the cages. | | K8s gives you scheduling and a datastore. For a large | globally distributed system it's going to scale like treacle. | mrkurt wrote: | (Hi, I'm from fly.io) | | It depends what you need in your datacenters! If you just | want servers, and don't care about doing something like | anycast, you can find a bunch of local dedicated server | providers in a bunch of cities and go to town.
But you can't | get them all from one provider, really, not with any kind of | reasonable budget. | | You _could_ buy colo from a place like Equinix in a bunch of | cities, and then either use their transit or buy from other | transit providers. | | But also, unmetered bandwidth isn't a very sustainable | service, so I'm curious what you're after? You're usually | either going to have to pay for usage, or pay large monthly | fixed prices to get reasonable transit connections in each | datacenter. | | In our case, we're constrained by anycast. To expand past the | 17 usual cities you end up needing to do your own network | engineering, which we'd rather not do yet. | ignoramous wrote: | (thanks mrkurt) | | It is anycast that I'm going after. The requirement for | unmetered bandwidth (or cheaper than AWS et al) is because | the kinds of workloads (TURN relays, proxies, tunnels, etc.) | we'd deal with get expensive otherwise. For another | related workload, per-request pricing gets expensive, | again, due to the nature of the workload (to the tune of 100k | requests per user per month). | | So far, for the former (TURN relays etc), I've found using | AWS Global Accelerator and/or GCP's GLB to be the _easiest_ | way to do anycast, but the bandwidth is slightly on the | expensive side. Fly.io matches the pricing in terms of | network bandwidth (as promised on the website), so that's a | positive, but GCP/AWS have a wider footprint. Cloudflare's | Magic Transit is another potential solution, but it requires | an enterprise plan and you need to bring your own anycast | IP and origin servers. | | For the latter (a latency-sensitive workload with ~100k+ reqs | / month), Cloudflare Workers (200+ locations minus China) | are a great fit, though they would get expensive once we hit a | certain scale. Plus, they're limited to L7 HTTP reqs only. | Whilst, I believe, fly.io can do L4.
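The per-request vs. flat-rate tension ignoramous describes is a simple crossover calculation. A sketch; all prices are hypothetical placeholders, not any provider's actual rates (only the 100k requests/user/month figure comes from the comment above):

```python
# Crossover between per-request billing and a flat-rate server.
# Prices are made-up placeholders for illustration.
def monthly_cost_per_request(users, reqs_per_user, price_per_million):
    """Monthly bill under per-request pricing."""
    return users * reqs_per_user / 1_000_000 * price_per_million

def crossover_users(flat_monthly, reqs_per_user, price_per_million):
    """User count at which per-request billing matches the flat rate."""
    per_user = reqs_per_user / 1_000_000 * price_per_million
    return flat_monthly / per_user

# 100k requests per user per month; a hypothetical $0.50 per million
# requests vs. a hypothetical $200/mo flat-rate server.
users = crossover_users(200, 100_000, 0.50)
print(f"crossover at ~{users:.0f} users")  # 4000 users with these inputs
```

Past the crossover, per-request pricing grows linearly with users while the flat server does not, which is the "expensive once we hit a certain scale" effect.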
| avip wrote: | For balance, all the big cloud providers - AWS, GCP, Azure, Oracle | [0] - have pretty similar startup plans. Y$$MV | | (I'm in full agreement with everything you've written + it's | well-phrased and funny. gj!) | | [0] that's not a typo - there is such a thing as "Oracle cloud" | alephnan wrote: | > Just don't call it a "datacenter" or NVIDIA will have a | stroke. | | Context please :) ? | mereel wrote: | Just a guess, but maybe it's some licensing issue? | https://www.nvidia.com/en-us/drivers/geforce-license/ | | _No Datacenter Deployment. The SOFTWARE is not licensed for | datacenter deployment, except that blockchain processing in a | datacenter is permitted._ | derefr wrote: | > except that blockchain processing in a datacenter is | permitted | | Well, Nvidia, y'see, my new blockchain does AI training as | its Proof-of-Work step... | mam2 wrote: | well they are the ones writing the rules, so i'd side with | OP | ThePadawan wrote: | Datacenter GPUs are mostly identical to the much cheaper | consumer versions. The only thing preventing you from running | a datacenter with consumer hardware is the licensing | agreement you accept. | KaiserPro wrote: | And the cooling, the amount of RAM, and the double-precision | performance. | | The chip might be the same, but the rest of it isn't. | | Granted, it's not worth the $3k price bump, but that's a | different issue. | zwaps wrote: | Nah, that's not really it. The reason NVIDIA doesn't allow | this is precisely that the additional RAM - | functionally the only difference - is not cost efficient. | People wanted to (and did) use a bunch of consumer | 1080s, which is why NVIDIA disallowed precisely that. You | had to buy the equivalent pro-grade card, which costs | easily two or three times as much and offers a couple more | GB of RAM. | endorphone wrote: | "The only thing preventing you from running a datacenter | with consumer hardware is the licensing agreement you | accept."
| | The consumer cards don't use ECC, and memory errors are a | common issue (GDDR6 running at the edge of its | capabilities). In a gaming situation that means a polygon | might be wrong, a visual glitch occurs, a texture isn't | rendered right -- things that just don't matter. For | scientific purposes that same glitch could be catastrophic. | | The "datacenter" cards offer significantly higher | performance for some cases (tensor cores, double precision), | are designed for data center use, are much more scalable, | etc. They also come with over double the memory (which is | one of the primary limitations forcing scale-outs). | | Going with the consumer cards is one of those things that | might be Pyrrhic. If funds are super low and you want to | just get started, sure, but any implication that the only | difference is a license is simply incorrect. | luisfmh wrote: | Learned a new word today. Pyrrhic. | [deleted] | OkGoDoIt wrote: | NVIDIA forces you to buy significantly more expensive cards | that perform marginally better if you are using them for | datacenter use. They try to enforce not letting businesses | use consumer-grade gaming cards. I assume this is so cloud | providers don't buy up all the supply of graphics cards and | make it hard for gamers to get decent cards, like what | happened during the Bitcoin craze. | nkassis wrote: | No, it's just pure price discrimination. They don't care | about gamers; they just know businesses will pay more if | forced to, while gamers can't. | aj7 wrote: | Exactly. | [deleted] | [deleted] | fizixer wrote: | - Or AMD could change their policy of 'never miss an | opportunity to miss an opportunity' and offer high-performance | OpenCL GPGPU products. Then nVidia could have all the stroke | they wanted.
| | - Or Tensorflow/Pytorch could've crapped on OpenCL a little | less by releasing a fully functional OpenCL version every time | they released a fully functional CUDA version, instead of | worshipping CUDA year in and year out. | | - Or Google could start selling their TPUv2, if not TPUv3, | while they're on the verge of releasing TPUv4. | | - Or one of the other big techs - Facebook/Microsoft/Intel - could | make and start selling a TPU-equivalent device. | | - Or I could finish school and get funded to do all/most of the | above ;) | | edit: On a more serious note, a cloud/on-prem hybrid is | absolutely the right way to go. You should have a 4x 2080 Ti | rig available 24x7 for every ML engineer. It costs about $6k-8k | apiece [0]. Prototype the hell out of your models on on-prem | hardware. Then when your setup is in working condition and | starts producing good results on small problems, you're ready | to do a big computation for final model training. Then you send | it to the cloud for the final production run. (Guess what: on a | majority of your projects, you might realize the final | production run could be carried out on-prem itself; you just | have to keep it running 24 hours a day for a few days or up to | a couple of weeks.) | | [0]: https://l7.curtisnorthcutt.com/the-best-4-gpu-deep- | learning-... | m0zg wrote: | As someone who has actually worked on this stuff soup to | nuts, it's not as easy as people imagine, because you can't | just support some subset of available ops and call it a day. | If you want to make OpenCL pie from scratch, you must first | make the universe, and support every single stupid thing | (among thousands) and even mimic some of the bugs so that | models work "the same". | | This is hard and time consuming, and this field is hard | enough as it is. What makes it even harder is that only | NVIDIA has decent, mature tooling. There is some work on ROCm, | though, so AMD is not _totally_ dead in the water.
I'd say | they're about 90% dead in the water. | derefr wrote: | > support every single stupid thing (among thousands) and | even mimic some of the bugs so that models work "the same". | | Do you need to do the stupid things performantly, though? | Because that sounds like a case for skipping microcode | shims, and going straight to instructions that trap into a | software implementation. Or just running the whole compute- | kernel in a driver-side software emulator that then | compiles real sub-kernels for each stretch of non-stupid | GPGPU instructions, uploads those, and then calls into them | from the software emulation at the right moments. Like a | profile-guided JIT, but one that can't actually JIT | everything, only some things. | fizixer wrote: | I know Tensorflow decided to be CUDA-exclusive for the | silly reason that the matrix library they were using | (Eigen) only supported CUDA. | | I have never recovered from that. | simonebrunozzi wrote: | Are you in the Bay Area? Would love to chat. Thinking of an | idea where your expertise could be very handy. $my-hn- | username at gmail. | marmaduke wrote: | I sometimes contribute to methodology projects in neuroscience | ("AI" for scientists). The most tiring part of it is explaining | essentially these things over and over. Very interesting to see | the sentiment vindicated in Startupistan. | atulkum wrote: | On the other hand, some startups are committing absolute fraud | in the name of AI. I went to a self-checkout store (AIFI.io). I | did not touch anything, but they charged me $35.10. According | to the receipt I took 17 packs of snacks :) These guys are | committing fraud in the name of AI. They have no technology and | no software; they just put up some cameras and opened a store | so that they can defraud investors.
Anyone can try if interested: https://www.aifi.io/loop- | case-study | mtkd wrote: | AI on the algo side is only half the story -- it has to sit in a | domain-specific framework to be most effective. | | I see a lot of 'bolt-on' tech emerging -- it looks mostly like snake | oil -- there is no obvious way for it to be competitive against teams | that baked it into the bare-metal design. | | Also, most commercial use-cases I've seen need effective ML more | than anything else. | dang wrote: | A thread about the original article, from a few days ago: | https://news.ycombinator.com/item?id=22352750 | rotrux wrote: | This is a terrific article. Two thumbs up. | moab wrote: | I found it fun to read this after reading this other post that | made the rounds today about AI automating most programming work | and making program optimization irrelevant: | https://bartoszmilewski.com/2020/02/24/math-is-your-insuranc... | blueyes wrote: | The A16Z piece makes all these points quite clearly. This | editorial is trying to put a finer point on a sharp knife. | correlator wrote: | No need to look at AZ for this. If you're building "AI" I wish | you a speedy road to being acquired by a company that can put it | to use. You've become a high-priced recruiting firm. | | If you're solving a real problem and use ML in service of solving | that problem, then you've got a great moat... happy, trusting | customers. | | It's not complicated. | motohagiography wrote: | Sssh! Valuations are a function of projected market size and | opacity of the problem. Clarity like this collapses the | uncertainty and destroys value. If you pour enough capital into | rooms full of PhDs, something's gotta hit. | | My way of saying: you're very, very right. | lazzlazzlazz wrote: | Is the misspelling of "Andreessen-Horowitz" and use of "A19H" | instead of "a16z" intentional? | scottlocklin wrote: | I suck at spelling. If I were one of the cool kids I'd claim to | be dyslexic. | yubozhao wrote: | Hi OP.
We built an open-source library called | BentoML (https://github.com/bentoml/bentoml) to make model | inference/serving a lot easier for data scientists in | various serving scenarios. | | We'd love to hear your thoughts on our library | khazhoux wrote: | You mean the fact that they left out an "s" in Andreessen? | dang wrote: | We've squeezed another s above. | leetrout wrote: | That is a great write-up and very accurate description of both | the costs and human intervention based on my experience with "AI" | tools. | allovernow wrote: | All of this might be true currently, but that's because this | current first generation "AI" (technically should just be called | ML) is mostly bullshit. To clarify, I don't mean anyone is lying | or selling snake oil - what I mean by bullshit is that the vast | majority of these services are cooked up by software developers | without any background in mathematics, selling adtechy services | in domains like product recommendation and sentiment analysis. | They are single-discipline applications accessible to devs | without science backgrounds and do not rely on substantial | expertise from other fields. That makes them narrow in technical | scope and easy to rip off (hence no moat, lots of competition, | and human reliance and lack of actual software). | | The next generation of Machine Learning is just emerging, and | looks nothing like this. Funds are being raised, patents are | being filed, and everything is in early stage development, so you | probably haven't heard much yet - but these ML startups are going | after real problems in industry: cross-disciplinary applications | leveraging the power of heuristic learning to make | cross-disciplinary designs and decisions currently still limited | to the human domain. | | I'm talking about the kind of heuristics which currently exist | only as human intuition expressed most compactly as concept | graphs and, especially, mathematical relationships - e.g.
| component design with stress and materials constraints, geologic | model building, treatment recommendation from a corpus of patient | data, etc. ML solutions for problems like these cannot be | developed without an intimate understanding of the problem | domain. This is a generalist's game. I predict that the most | successful ML engineers of the next decade will be those with | hard STEM backgrounds, MS and PhD level, who have transitioned to | ML. [Un]Fortunately for us, the current buzzwordy types of ML | services give the rest of us a bad name, but looking at _these_ | upcoming applications the answers to the article tl;dr look | different: | | >Deep learning costs a lot in compute, for marginal payoffs | | The payoffs here are far greater. Designs are in the pipeline | which augment industry roles - accelerate design by replacing | finite methods with vastly quicker ML for unprecedented | iteration. Produce meaningful suggestions during the development | of 3D designs. Fetch related technical documents in real time by | scanning the progressive design as the engineer works, parsing | and probabilistically suggesting alternative paths to research | progression. Think Bonzi Buddy on steroids... this is a place for | recurring software licenses, not SaaS. | | >Machine learning startups generally have no moat or meaningful | special sauce | | For solving specific, technical problems, neural network design | requires a certain degree of intuition with respect to the flow | of information through the network, which both optimizes and | limits the kind of patterns that a given net can learn. Thus | designing NNs for hard-industry applications is predicated upon | an intimate understanding of domain knowledge, and these highly | specialized neural nets become patentable secret sauces.
That's | half of the moat - the other half comes from competition for the | software developers with first-hand experience in these fields, | or a general enough math-heavy background to capture the | relationships that are being distilled into nets. | | >Machine learning startups are mostly services businesses, not | software businesses | | Again only true because most current applications are NLP adtechy | bullshit. Imagine coding in an IDE powered by an AI (multiple | interacting neural nets) which guides the structure of your code | at a high level and flags bugs as you write. This, at a more | practical level, is the type of software that will eventually | change every technical discipline, and you can sell licenses! | | >Machine learning will be most productive inside large | organizations that have data and process inefficiencies | | This next generation goes far past simply optimizing production | lines or counting missed pennies or extracting a couple extra | percent of value from analytics data. This style of applied ML | operates at a deeper level of design which will change | everything. | scottlocklin wrote: | >The next generation of Machine Learning is just emerging, and | looks nothing like this. Funds are being raised, patents are | being filed, and everything is in early stage development, so | you probably haven't heard much yet ... | | Citations needed. Large claims: presumably you can name one | example of this, and hopefully it's not a company you work at. | | I've seen projects on literally all the things you mention: | materials science, medical stuff, geology/prospecting - none of | them worked well enough to build a standalone business around | them. I do know the oil companies are using DL ideas with some | small successes, but this only makes sense for them, as they've | been working on inverse problems for decades. None of them buy | canned software/services: it's all done in-house.
Probably | always will be, same as their other imaging efforts. | allovernow wrote: | >Citations needed. Large claims: presumably you can name one | example of this, and hopefully it's not a company you work | at. | | Unfortunately this is all emerging just now and yes, I do | work at such a company, but I'm old enough to not be naively | excited by some hot fad. There's something profound just | starting to happen, but everyone is keeping the tech rather | secret because it isn't developed/differentiated enough yet | to keep a competitor from running off with an idea. | Disclosure is probably 1-3 years out, by my estimate. | | >I do know the oil companies are using DL...as their other | imaging efforts. | | You're correct, and I happen to have experience in this | domain - except there are a handful of up-and-comers | courting funds from global majors like Shell and BP, and | seismic inversion is near the end of the list of novel | applications. Petroleum is ground zero for a potential | revolution right now, if we can come up with something before | the U.S. administration clamps down on fossil fuels. | | But we're talking complex algorithms which consist of | multiple interacting neural networks. We are rapidly moving | toward rudimentary reasoning systems which represent | conceptual information encoded in vectors. I'm jaded enough | that I wouldn't say we're developing AGI, but if the | progressing ideas I'm familiar with and working on personally | pan out, they will be massive baby steps towards something | like AGI. | | The space is evolving at least as rapidly as the academic | side, which I think is an unprecedented pace of development | for a novel field of study. I can't help but feel like these | are the first steps towards some kind of singularity. There's | no question that we are on to something civilization-changing | with neural networks; what remains to be seen is whether | compute scaling will keep up with the needs of this | next-generation ML.
Even if research stopped today, the modern ML | zoo has exploded with architectures with fruitful | applications across domains. The future is here! | rossdavidh wrote: | So, way back in the last millennium, I did my Master's thesis | (way smaller deal than a Ph.D. thesis) on neural networks. Since | then, I have looked in on it every few years. I think they're | cool, I like using them, and writing multi-level backpropagation | neural networks used to be one of the first things I'd do in a | new language, just to get a feel for how it worked (until pytorch | came along and I decided for the first time that using their | library was easier than writing my own). | | So, it's not like I dislike ML. But saying an investment is an | "AI" startup ought to be like saying it's a python startup, or | saying it's a postgres startup. That ought not to be something | you tell people as a defining characteristic of what you do, not | because it's a secret but rather because it's not that important | to your odds of success. If you used a different language and | database, you would probably have about the same odds of success, | because it depends more on how well you understand the problem | space, and how well you architect your software. | | Linear models or other more traditional statistical models can | often perform just as well as DL or any other neural network, for | the same reason that when you look at a kaggle leaderboard, the | difference between the leaders is usually not that big after a | while. The limiting factor is in the data, and how well you have | transformed/categorized that data, and all the different methods | of ML that get thrown at it all end up with similar-looking | levels of accuracy. | | There used to be a saying: "If you don't know how to do it, you | don't know how to do it with a computer." AI boosters sometimes | sound as if they are suggesting that this is no longer true. | They're incorrect.
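The "multi-level backpropagation network as a first exercise" that rossdavidh describes fits in a page of numpy. The following is a toy illustration of that exercise (a one-hidden-layer net trained on XOR), not anyone's production code; the hyperparameters and seed are arbitrary choices:

```python
import numpy as np

# Toy two-layer network trained with plain backpropagation on XOR.
# All hyperparameters below are arbitrary, illustrative choices.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)   # input  -> hidden
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: chain rule through each layer (squared-error loss)
    d_out = (out - y) * out * (1.0 - out)
    d_h = (d_out @ W2.T) * h * (1.0 - h)
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print(out.ravel())  # four predictions, which should end up near [0, 1, 1, 0]
```

The same structure generalizes to more layers by repeating the `d_h` step per layer, which is why it makes a good get-a-feel-for-the-language exercise.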
ML is, absolutely, a technique that a good | programmer should know about, and may sometimes wish to use, kind | of like knowing how a state machine works. It makes no great | difference to how likely a business is to succeed. | 7532yahoogmail wrote: | Thank you for the perspective. Now when we talk machine | learning are we talking: | | L. Pachter and B. Sturmfels. Algebraic Statistics for | Computational Biology. Cambridge University Press 2005. | | G. Pistone, E. Riccomagno, H. P. Wynn. Algebraic Statistics. | CRC Press, 2001. | | Drton, Mathias, Sturmfels, Bernd, Sullivant, Seth. Lectures | on Algebraic Statistics, Springer 2009. | | Or more like: | | Watanabe, Sumio. Algebraic Geometry and Statistical Learning | Theory, Cambridge University Press 2009. | | My understanding (I do not do AI or machine learning) is that | AI is distinct from these more mathematical analytic | perspectives. | | Finally, might we argue that generally AI/ML is more easily | suited to data that's already high quality, e.g. CERN data, | trade data, drug trial data, as opposed to unconstrained data, | e.g. find the buses in these 1MM jpegs? | aj7 wrote: | "Embrace services. There are huge opportunities to meet the | market where it stands. That may mean offering a full-stack | translation service rather than translation software or running a | taxi service rather than selling self-driving cars. Building | hybrid businesses is harder than pure software, but this approach | can provide deep insight into customer needs and yield | fast-growing, market-defining companies. Services can also be a | great tool to kickstart a company's go-to-market engine - see | this post for more on this - especially when selling complex | and/or brand new technology. The key is to pursue one strategy in | a committed way, rather than supporting both software and | services customers."
| | Exactly wrong, and contradicts most of the thesis of the article | - that AI often fails to achieve acceptable models because of the | individuality, finickiness, edge cases, and human involvement | needed to process customer data sets. | | The key to profitability is for AI to be a component in a | proprietary software package, where the VENDOR studies, | determines, and limits the data sets and PRESCRIBES this to the | customer, choosing applications many customers agree upon. Edge | cases and cat-guacamole situations are detected and ejected, and | the AI forms a smaller, but critical, efficiency-enhancing | component of a larger system. | whoisjuan wrote: | And many times all these AI computations go into solving mundane | problems like "What's the likelihood that this ad will perform | well". | | AI is so shiny that it makes people want to jump as fast as they | can into that boat, but a reasonable, objective analysis shows | that a huge number of software problems can still be solved | without relying on the "AI black box". | harias wrote: | >That's right; that's why a lone wolf like me, or a small team | can do as good or better a job than some firm with 100x the head | count and 100m in VC backing. | | goes on to say | | >I agree, but the hockey stick required for VC backing, and _the | army of Ph.D.s required to make it work_ doesn't really mix well | with those limited domains, which have a limited market. | | Choose one? | | It also assumes running your own data center is easy. Some people | don't want to be up 24x7 monitoring their data center, or to buy | hardware to accommodate the rare 10-minute peaks in usage. | detaro wrote: | > _Some people don't want to be up 24x7 monitoring their data | center or to buy hardware to accommodate the rare 10 minute | peaks in usage._ | | Do you need that for training workloads, and what percentage of | a startup's workload is training?
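The rent-vs-buy question being debated here is easy to put in back-of-envelope numbers. Every figure below is an assumed, rough 2020-era price for illustration (a quad-GPU cloud instance versus a purchased server in colocation), not a quote from any provider:

```python
# Back-of-envelope cloud vs. buy-your-own comparison for GPU training.
# Every number is an assumed, illustrative 2020-era figure, not a quote.
cloud_per_hour = 12.24        # assumed on-demand price of a quad-GPU instance
hours_per_month = 24 * 30
cloud_per_month = cloud_per_hour * hours_per_month   # ~$8,813 if busy 24/7

server_cost = 30000.0         # assumed price of a comparable 4-GPU server
colo_per_month = 400.0        # assumed colo fee (power, network, remote hands)

# Months until owning beats renting, assuming the box is busy around the clock.
months = 1
while server_cost + colo_per_month * months > cloud_per_month * months:
    months += 1
print(months)  # -> 4: at full utilization, owning wins within a few months

# At low utilization (the "rare 10 minute peaks" case) the picture flips:
utilization = 0.05            # assumed 5% duty cycle
print(round(cloud_per_month * utilization))  # -> 441: near the colo fee alone
```

Which side of the break-even you land on depends almost entirely on the duty cycle, which is exactly the question being asked about training workloads.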
| jjeaff wrote: | >rare 10 minute peaks | | But is that really the use case here? I haven't worked in ML. | But I'm not seeing where you are going to need to handle a | 10-minute spike that requires a whole datacenter. | | A month's worth of a quad-GPU instance on AWS costs enough that a | few months of usage would pay for a server with similar capacity. | | And hardware is pretty resilient these days. Especially if you | co-locate it in a datacenter that handles all the internet, | power, and uptime for you. And when something does go wrong, they | offer "magic hands" service to go swap out hardware for you. | Colocation is surprisingly cheap. As is leasing 'managed' | equipment. | icheishvili wrote: | I don't think these are necessarily contradictory. With | pytorch-transformers, you can use a full-blown BERT model like | the best in the world. And yet, to make this novel and | defensible, you would need to build on top of it and innovate | significantly, which would require significant capital to | achieve. | brundolf wrote: | > Training a single AI model can cost hundreds of thousands of | dollars (or more) in compute resources | | Why don't they buy their own hardware for this part? The training | process doesn't need to be auto-scalable or failure-resistant or | distributed across the world. The value proposition of cloud | hosting doesn't seem to make sense here. Surely at this price the | answer isn't just "it's more convenient"? | KaiserPro wrote: | because you are trading speed for cash. | | Say you have $8M in funding, and you need to train a model to | do x | | You can either: | | a) gain access to a system that scales on demand and allows | instant, actionable results. | | b) hire an infrastructure person, someone to write a K8s | deployment system. Another person to come in and throw that all | away. Another person to negotiate and buy the hardware, and | another to install it.
| | Option b can be the cheapest in the long term, but it | carries the most risk of failing before you've even trained a | single model. It also costs time, and if speed to market is | your thing, then you're shit out of luck. | brundolf wrote: | Why in the world do you need a Kubernetes deployment system | to run a single, manual, one-time (or a handful of times), | high-compute job? | PeterisP wrote: | Because that high-compute job needs to be distributed on | many, many machines, and if you're using cheap preemptible | instances you have to handle machines dropping off and | joining in while you're running that single job. | | It's definitely not something that you can launch manually | - perhaps Kubernetes is not the best solution, but you | definitely need some automation. | dsl wrote: | Because when all you have is a hammer, everything looks | like a nail. | | We have become so DevOps- and cloud-dependent that everyone | has forgotten how to just run big systems cheaply and | efficiently. | GaryNumanVevo wrote: | If you're in a position where you need to train a large | network: first, I feel bad for you; second, you'll need | additional machines to train in a reasonable amount of time. | | ML distributed training is all about increasing training | velocity and searching for good hyperparameters. ___________________________________________________________________ (page generated 2020-02-24 23:00 UTC)