[HN Gopher] Launch HN: Slai (YC W22) - Build ML models quickly and deploy them as apps
       ___________________________________________________________________
        
       Launch HN: Slai (YC W22) - Build ML models quickly and deploy them
       as apps
        
       Hi HN, we're Eli and Luke from Slai
       (https://www.slai.io/hn/62203ae9ee716300083c879b). Slai is a fast
       ML prototyping platform designed for software engineers. We make it
       easy to develop and train ML models, then deploy them as
       production-ready applications with a single link.
       
       ML applications
       are increasingly built by software engineers rather than data
       scientists, but getting ML into a product is still a pain. You have
       to set up local environments, manage servers, build CI/CD
       pipelines, and self-host open-source tools. Many engineers just
       want to leverage ML for their products without doing any of that.
       Slai takes care of all of it, so you can focus on your own work.
       
       Slai
       is opinionated: we are specifically for software developers who
       want to build models into products. We cover the entire ML
       lifecycle, all the way from initial exploration and prototyping to
       deploying your model as a REST API. Our sandboxes contain all the
       code, dataset, dependencies, and application logic needed for your
       model to run.
       
       We needed this product ourselves. A year ago, Luke was working as
       a robotics engineer on a computationally intensive problem on a
       robot arm (force vector estimation). He
       started writing an algorithm, but realized a neural network could
       solve the problem faster and more accurately. Many people had
       solved this before, so it wasn't difficult to find an example
       neural net and get the model trained. You'd think that would be the
       hard part--but actually the hard part was getting the model
       available via a REST API. It didn't seem sensible to write a Flask
       app and spin up an EC2 instance just to serve up this little ML
       microservice. The whole thing was unnecessarily cumbersome.
       
       After
       researching various MLOps tools, we started to notice a pattern--
       most are designed for data scientists doing experimentation, rather
       than software engineers who want to solve a specific problem using
       ML. We set out to build an ML tool that is designed for developers
       and organized around SWE best practices. That means leaving
       notebooks entirely behind, even though they're still the preferred
       form factor for data exploration and analysis. We've made the bet
       that a normal IDE with some "Jupyter-lite" functionality (e.g.
       splitting code into cells that can be run independently) is a fair
       trade-off for software engineers who want easy and fast product
       development.
       
       Our browser-based IDE uses a project structure with
       five components: (1) a training section, for model training
       scripts, (2) a handler, for pre- and post-processing logic for the
       model and API schema, (3) a test file, for writing unit tests, (4)
       dependencies, which are interactively installed Python libraries,
       and (5) datasets used for model training. By modularizing the
       project in this way, we ensure that ML apps are functional end-to-
       end (if we didn't do this, you can imagine a scenario where a data
       scientist hands off a model to a software engineer for deployment,
       who's then forced to understand how to create an API around the
       model, and how to parse a funky ML tensor output into a JSON
       field). Models can be trained on CPUs or GPUs, and deployed to our
       fully-managed backend for invoking via a REST API.
       
       Each browser-
       based IDE instance ("sandbox") contains all the source code,
       libraries, and data needed for an ML application. When a user lands
       on a sandbox, we remotely spin up a Docker container and execute
       all runtime actions in the remote environment. When a model is
       deployed, we ship that container onto our inference cluster, where
       it's available to call via a REST API.
       
       Customers have so far used
       Slai to categorize bills and invoices for a fintech app; recognize
       gestures from MYO armband movement data; detect anomalies in
       electrocardiograms; and recommend content in a news feed based on
       previous content a user has liked/saved.
       
       If you'd like to try it, here are three projects you can play
       with:
       
       _Convert any image into stylized art_ -
       https://www.slai.io/hn/62203ae9ee716300083c879b
       
       _Predict Peyton Manning's Wikipedia page views_ -
       https://www.slai.io/hn/6215708345d19a0008be3f25
       
       _Predict how happy people are likely to be in a given country_ -
       https://www.slai.io/hn/621e9bb3eda93f00081875fc
       
       We don't have
       great documentation yet, but here's what to do: (1) Click "train"
       to train the model; (2) Click the test tube icon to try out the
       model - this is where you enter sentences for GPT-2 to complete, or
       images to transform, etc; (3) Click "test model" to run unit tests;
       (4) Click "package" to, er, package the model; (5) Deploy, by
       clicking the rocket ship icon and selecting your packaged model.
       "Deploy" means everything in the sandbox gets turned into a REST
       endpoint, for users to consume in their own apps. You can do the
       first 3 steps without signup and then there's a signup dialog
       before step 4.
       
       We make money by charging subscriptions to our
       tool. We also charge per compute hour for model training and
       inference, but (currently) that's just the wholesale cloud cost--we
       don't make any margin there.
       
       Our intention with Slai is to allow
       people to build small, useful applications with ML. Do you have any
       ideas for an ML-powered microservice? We'd love to hear about apps
       you'd like to create. You can create models from scratch, or use
       pretrained models, so you can be really creative. Thoughts,
       comments, feedback welcome!
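       
       To make the handler idea above concrete: a rough sketch of the
       kind of pre- and post-processing a handler owns. The function
       signature, the way the model arrives, and the request/response
       shapes below are illustrative assumptions, not Slai's documented
       interface:
       
           import numpy as np
       
           def handler(model, request: dict) -> dict:
               # Pre-processing: turn the JSON request into the tensor
               # the model expects.
               features = np.asarray(request["features"],
                                     dtype=np.float32).reshape(1, -1)
       
               prediction = model.predict(features)
       
               # Post-processing: turn the raw tensor output into
               # JSON-friendly fields.
               return {"prediction": prediction.ravel().tolist()}
       
       Once a model is deployed, the endpoint can be called like any
       other REST API - something along these lines, with a made-up URL
       and auth header:
       
           import requests
       
           resp = requests.post(
               "https://api.slai.io/<your-model-route>",  # hypothetical URL
               json={"features": [5.1, 3.5, 1.4, 0.2]},
               headers={"Authorization": "Bearer <api-key>"},  # hypothetical
           )
           print(resp.json())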
        
       Author : Mernit
       Score  : 81 points
       Date   : 2022-03-03 16:39 UTC (6 hours ago)
        
       | lysecret wrote:
       | Congrats on the launch. I'm quite impressed.
       | 
       | Here are my unordered thoughts.
       | 
       | So it seems a lot like an improved Colab with a deployment
       | stage, which sounds good to me, though it will be much more
       | expensive than Colab.
       | 
       | I like the pitch of SWEs doing ML instead of pitching towards
       | data scientists. As a data scientist turned SWE, I still miss
       | Jupyter-like cell-based execution (you said it exists, but I
       | couldn't find it).
       | 
       | In general I'm quite sceptical when it comes to online IDEs.
       | However, for text- and image-based models it might be enough
       | (since you don't need too much code).
       | 
       | There might be a valid niche between Colab on the one side and
       | building it yourself with the AWS CLI on the other.
       | 
       | I wonder, though: in your target market it really isn't such a
       | big deal to spin up a REST API (there are no Lambdas with GPUs
       | yet, but that should just be a matter of time), or to use
       | something like AWS Batch for remote training. It will come down
       | to whether it's more convenient to code in your IDE and have you
       | handle Lambda, Batch, Docker, and CD, or to code in one's own
       | IDE and handle all of that oneself.
       | 
       | Wish you all the best!
        
       | crsn wrote:
       | Your website pitches this product SO well. Kudos.
        
         | Mernit wrote:
         | Thanks, that means a lot! We've spent a lot of time thinking
         | about the pitch, given how many other ML tools are out there.
         | We're hoping to strike a chord with simplicity and developer
         | experience.
        
       | omarhaneef wrote:
       | Firstly, I can't believe you have enough instances to resist the
       | HN hug of death, with so many people presumably running tests. So
       | that is impressive.
       | 
       | Secondly, I ran the train -> test cycle and I didn't see any
       | error metrics. Is the idea that if we were spinning up our own we
       | would be outputting these ourselves? Or would we have trained up
       | the model somewhere else and we would transfer it to SLAI to do a
       | final test and then package it?
        
         | llom2600 wrote:
         | We tried to keep our testing workflow as flexible as possible.
         | There are a couple of use cases we wanted to allow:
         | 
         | - User is working with a pre-trained model that already went
         | through extensive testing during training. In this case our
         | test utilities are useful as e2e tests. Once you integrate the
         | model into your handler, you can specify a bunch of test cases
         | to be sure your API is going to behave as expected (like a unit
         | test).
         | 
         | - User wants to train the model on our platform - they can add
         | error metrics directly in their training script and prevent the
         | model from being saved if any error metric exceeds a certain
         | threshold. They can then additionally use the test.py script to
         | run tests against the model + handler.
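         | 
         | For the second case, a rough sketch of what that training
         | script could look like (the train() entry point and the
         | "return the model to save it" convention are illustrative
         | here, not the exact API):
         | 
         |     from sklearn.linear_model import LogisticRegression
         |     from sklearn.metrics import accuracy_score
         |     from sklearn.model_selection import train_test_split
         | 
         |     MAX_ERROR = 0.10  # refuse to save a model worse than this
         | 
         |     def train(dataset):
         |         X, y = dataset["features"], dataset["labels"]
         |         X_tr, X_val, y_tr, y_val = train_test_split(
         |             X, y, test_size=0.2)
         | 
         |         model = LogisticRegression(max_iter=1000)
         |         model.fit(X_tr, y_tr)
         | 
         |         preds = model.predict(X_val)
         |         error = 1.0 - accuracy_score(y_val, preds)
         |         if error > MAX_ERROR:
         |             # Raising keeps an under-performing model from
         |             # being saved and deployed.
         |             raise ValueError(f"error {error:.3f} > {MAX_ERROR}")
         | 
         |         return model  # the returned model is what gets packaged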
        
       | timmit wrote:
       | I had a similar idea in 2018, to turn AI models into API
       | endpoints, but I did not do anything with it. :cry
        
       | thegginthesky wrote:
       | Congrats on the launch!
       | 
       | Overall I like the idea, and I agree with you: either the tools
       | are too focused on data scientists, or there's a lot of DevOps
       | involved to get things started.
       | 
       | I work in the field, so I have some questions:
       | 
       | - Are there any plans to connect the project into a git repo?
       | 
       | - Is there any option for me to pass trained binaries to your
       | product? For example I have a beast of a machine and I can easily
       | train things locally, but I'd like to host the inference with you
       | guys
       | 
       | - Do you intend to allow automated testing and linting?
        
         | Mernit wrote:
         | Hi, thanks for the questions! We're planning on adding a way to
         | synchronize a git repo to a sandbox; it should be out by April.
         | 
         | Right now, you can upload a trained binary to a sandbox and
         | return it from the train function, and then use it for
         | inference. So it's a bit manual at this point, but we're
         | planning on improving that workflow shortly.
         | 
         | We built linting and testing into the sandbox, but testing is
         | currently triggered manually - we're planning on building both
         | into our CI/CD system (scheduled training).
        
       | rish1_2 wrote:
       | This is essentially HuggingFace models + AWS CDK deployed over
       | Lambda. They are your biggest competition, but there is likely
       | room for more. I think the key difference here is the training
       | part, which can be done by SageMaker. If AWS makes it user-
       | friendly, they will be a serious threat. Good luck!
        
         | Mernit wrote:
         | We think developer experience is the factor that has been
         | sorely lacking from the ML tooling space. I'm curious how your
         | experience using SageMaker has been?
        
           | gleenn wrote:
           | I heard that SageMaker wasn't great and was not prioritized
           | by Amazon anymore. Definitely could be wrong.
        
       | sandGorgon wrote:
       | This is pretty cool! Especially the opinionated structuring part.
       | 
       | Now SageMaker allows you to download your running code and Docker
       | container
       | (https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangle...).
       | It also allows you to simulate running locally -
       | https://github.com/aws/sagemaker-tensorflow-training-toolkit
       | 
       | More than anything else, this is basically just a way to calm
       | worries about lock-in. Google ML resisted this for a long time,
       | but even they finally had to do it -
       | https://cloud.google.com/automl-tables/docs/model-export
       | 
       | Are you planning something similar?
        
         | llom2600 wrote:
         | We have been planning an "eject" feature that would let you
         | develop in our app, but then export your artifact as a docker
         | image you can spin up in your own cluster (or whatever). This
         | is a necessity for customers who require on-premise
         | deployments.
         | 
         | However, this is probably a few months out since we're
         | currently focused on startups/developers that don't have that
         | requirement.
        
       | 5cotts wrote:
       | This seems pretty cool! I deployed a model to a REST endpoint and
       | am trying to test it out now using a Jupyter notebook running
       | Python.
       | 
       | Two things that happened to me:
       | 
       | 1) I wasn't able to install `slai` using pip and PyPI. I ended up
       | downloading the source tarball from
       | https://pypi.org/project/slai/#files and installing locally.
       | 
       | 2) I am following the example for how to "Integrate" my model
       | using Python under the "Metrics" tab. However, the call to `model
       | = slai.model("foobarbaz")` is failing. It looks like the regex
       | check for `MODEL_ROUTE_URI` from line 21 in `model.py` doesn't
       | like my custom email address :(. For example, the following model
       | endpoint isn't valid according to the regex: "s@slai.io/foo-bar-
       | baz/initial" (My custom email is very similar to `s@slai.io`).
       | I'll post the regex below.
       | 
       | `MODEL_ROUTE_URI = r"([A-Za-z0-9]+[\\._]?[A-Za-z0-9]+[@]\w+[.]\w{
       | 2,3})/([a-zA-Z0-9\\-\\_]+)/?([a-zA-Z0-9\\-\\_]*)"`
       | 
       | Just wanted to let you know! Looking forward to experimenting
       | with this more.
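       | 
       | For reference, the rejection is reproducible with the regex
       | alone - the first capture group requires at least two characters
       | before the "@", so a one-letter address can never match
       | (escaping normalized from the quoted constant; whether the SDK
       | uses match or search doesn't change the outcome here):
       | 
       |     import re
       | 
       |     MODEL_ROUTE_URI = (
       |         r"([A-Za-z0-9]+[\._]?[A-Za-z0-9]+[@]\w+[.]\w{2,3})"
       |         r"/([a-zA-Z0-9\-\_]+)/?([a-zA-Z0-9\-\_]*)"
       |     )
       | 
       |     bad = "s@slai.io/foo-bar-baz/initial"
       |     ok = "someone@slai.io/foo-bar-baz/initial"
       |     print(re.search(MODEL_ROUTE_URI, bad))  # None - rejected
       |     print(re.search(MODEL_ROUTE_URI, ok))   # <re.Match ...> - accepted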
        
         | llom2600 wrote:
         | The second SDK issue w/ the regex is resolved. Just bumped
         | latest version to 0.1.70. If you upgrade you should be good to
         | go. Still haven't been able to reproduce your issue with pip.
        
         | llom2600 wrote:
         | Thanks for the heads up - yep, looks like a bug in our SDK.
         | Should have a new version out shortly that handles it.
         | 
         | Weird that you weren't able to install the slai SDK via pip -
         | we just released a new version of the SDK this morning; unsure
         | if that's related. I'll take a look at that this afternoon.
         | 
         | Thanks for trying it out!
        
       | kamikazeturtles wrote:
       | Very interesting!
       | 
       | So when a user trains a model, you guys start up a Docker
       | container with everything in it. You guys bind the container's
       | ports to the host and add it to some key-value store that a
       | reverse proxy references. Is that correct?
       | 
       | Sorry, I'm just really curious. It's a really interesting
       | project. Do you guys have anything open source?
        
         | llom2600 wrote:
         | Much of the complexity in the sandbox is in ensuring that the
         | development environment behaves as it would in production -
         | but also that it loads as fast as possible. So we have to do a
         | bit of stuff behind the scenes involving dynamically linking
         | libraries, provisioning Kubernetes resources, etc. But
         | generally that's about right.
         | 
         | We haven't open-sourced any of it yet, but there are
         | definitely a few components of our system that we'll open
         | source once we feel they're stable enough.
        
           | kamikazeturtles wrote:
           | I'm sorry I don't have any experience with Kubernetes
           | 
           | What benefit would Kubernetes bring to this architecture? You
           | can create and destroy Docker containers using the API.
           | 
           | What do you guys use Kubernetes for?
        
             | llom2600 wrote:
             | Kubernetes gives us a ton of extra tools to manage the
             | lifecycle of our sandboxes, deployed models, asynchronous
             | training jobs, etc.
             | 
             | Internally, each pod is just running a Docker image. You
             | could probably throw something together with Docker/the
             | Docker API - but in our case we needed a bit more control.
        
               | kamikazeturtles wrote:
               | Hey Eli!
               | 
               | Thanks for the helpful responses!
               | 
               | I sent you an email regarding a possible internship
               | opportunity. Are you guys open to interns?
        
       | luke-stanley wrote:
       | This is cool. It took me a while to figure out that you want
       | people to click the test button on the sidebar to try it out, not
       | the "Test model" buttons unit tests in the bottom right side.
       | Unit tests might benefit from a different kind of icon. I tried
       | the "Interactive Mode" toggle button too, and that didn't do
       | anything obvious.
        
         | Mernit wrote:
         | Thanks for checking us out! We sort of anticipated that -- we
         | just launched the unit test feature this week, and it's not
         | easily distinguished from the test panel. We'll probably rename
         | the test panel to something more descriptive, like API runner.
         | Interactive mode is a feature to set interactive breakpoints in
         | your code, it's really useful for iterating on individual
         | blocks of code without having to run the entire app E2E.
        
       | chrisweekly wrote:
       | Awesome! This or something like it is going to bring ML to the
       | (SWE) masses. Congrats and good luck and thanks!
        
       | [deleted]
        
       | Oras wrote:
       | Congratulations on the launch. How's Slai different from
       | HuggingFace?
        
         | Mernit wrote:
         | Hi, thanks! The main difference is that HuggingFace contains a
         | huge repository of pretrained models, whereas we're providing
         | the scaffolding to build your own end-to-end applications. For
         | example, in Slai you could actually embed a HuggingFace model
         | (or maybe two models), combine them into one application, along
         | with API serialization/deserialization, application logic,
         | CI/CD, versioning, etc.
         | 
         | You can think of us as being a store of useful ML based
         | microservices, and not just a library of pre-trained models.
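         | 
         | As a rough sketch of that (only the HuggingFace pipeline call
         | below is the real HF interface; the handler shape is
         | illustrative, not the exact API), embedding a pretrained model
         | plus the JSON serialization around it could look like:
         | 
         |     from transformers import pipeline
         | 
         |     sentiment = pipeline("sentiment-analysis")  # pretrained model
         | 
         |     def handler(request: dict) -> dict:
         |         text = request["text"]        # parse/validate API input
         |         result = sentiment(text)[0]   # raw output: label + score
         |         return {                      # JSON-friendly response
         |             "label": result["label"],
         |             "confidence": round(float(result["score"]), 4),
         |         }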
        
           | Oras wrote:
           | Sounds good, but you can actually train and deploy with
           | HuggingFace.
           | 
           | https://huggingface.co/autonlp
           | 
           | Looking forward to seeing your success, good luck.
        
       | ayanb wrote:
       | Cool product! are you guys using wasm under the hood?
        
         | llom2600 wrote:
         | Hi - Luke (CTO) here. Thanks! Yeah, we're using WASM for some
         | things in the editor, like syntax highlighting. Planning on
         | moving most of the network logic into WASM shortly as well.
        
       | icyfox wrote:
       | Congratulations on the launch guys. Product need seems clear to
       | me & is a painpoint that I've felt most acutely in side projects
       | that I've worked on outside of our company's devoted CI
       | infrastructure.
       | 
       | Are you planning any git or IDE integration? Most of the magic
       | here seems to happen in the backend with easier training,
       | scheduling, and inference. Could this be enabled locally so devs
       | iterate in an environment that's more comfortable to them?
        
         | llom2600 wrote:
         | Hey - thanks!
         | 
         | Yeah we've been thinking about this quite a bit. We've explored
         | a couple of options here - I think our first pass is going to
         | be a way to synchronize an external git repository with a
         | sandbox. Would love to hear your thoughts here on what kind of
         | workflow might make the most sense.
         | 
         | I think long term we'll also add VSCode integration through an
         | extension, but that might be a few months out.
        
       ___________________________________________________________________
       (page generated 2022-03-03 23:00 UTC)