[HN Gopher] Launch HN: Nyckel (YC W22) - Train and deploy ML classifiers in minutes
       ___________________________________________________________________
        
       Launch HN: Nyckel (YC W22) - Train and deploy ML classifiers in
       minutes
        
       Hi HN, we're George, Oscar, and Dan, the founders of Nyckel
       (https://nyckel.com). Nyckel allows developers with no ML
       experience to train and deploy ML functions to classify images and
       text with very little training data. We let you go from a few
       labeled data points to a serverless machine-learned classifier in
       minutes.  The ML-as-a-Service space is dense, including some recent
       YC companies; so why did we create Nyckel? Our goal is to create a
       tool that is light, fast, fun, and accessible. Training takes
       only seconds, you need only 10s of annotations, and we avoid ML
       lingo, abstracting away concepts that make developers feel like
       outsiders.
       Our pricing is transparent, signup is instant, and the platform is
       100% self-serve.  Dan, an experienced engineer without any ML
       background, was building a social website that required manual
       curation of user-contributed content. He looked into automating
       this curation with ML and found offerings that required complicated
       setup and knowledge of ML concepts. He talked to his AI-researcher
       friend Oscar, and together they realized that the current solutions
       were unnecessarily complex and didn't expose the right developer-
       friendly abstractions. We think there are many engineers like Dan
       who leave similar problems unsolved because of the effort required.
       Using Nyckel, you upload a small (or large) amount of data,
       annotate a minimum of 2 examples per class, and have a trained
       model deployed in the cloud and callable via a REST API. All of
       this happens in seconds. As you use the model, you can continue to
       improve it by providing more data points as you encounter them. You
       can also explore and annotate your data in the UI.  The Nyckel
       AutoML engine is based on meta transfer learning. It's "transfer"
       because it leverages a large set of pre-trained neural networks to
       represent your data, and it's "meta" because we make informed
       decisions on which of the networks to try based on your particular
       problem. The design allows for highly parallel execution:
       features are extracted and models trained by 100s of compute
       nodes at once, so training takes only 10s of seconds even with
       1,000s of samples. We keep abreast of the latest deep-learning
       networks
       and add new networks to the system to improve existing and new
       models. Your trained model is immediately deployed on an elastic
       inference infrastructure.  Our customers are using us to do things
       like: tag and organize photos in a used-car marketplace; triage
       customer responses and support tickets for CRM (in multiple
       languages); determine fake vs real profile pictures to help with
       user verification; analyze blood sugar charts to suggest corrective
       actions; and build a barcode-less scanner for bulk foods.  Oscar
       has over 4k scientific citations of his AI research, as well as
       several industry applications behind him. Dan has designed multiple
       developer APIs throughout his career, most recently Square's
       developer APIs. I led the Functions-as-a-Service team at Oracle
       Cloud and have extensive experience building large cloud systems.
       We think that ML, cloud, and API expertise is the right combination
       for this problem!  We have elastic pricing with an always-free
       tier. Beyond the free tier, we make money when you invoke your
       function to make a prediction.  We're really happy we get to show
       this to you all. Thank you for reading it all! We'd love for you to
       check us out, and share your thoughts on Nyckel and your
       experiences in the ML tooling space in general.
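
The meta transfer learning recipe described in the post - a set of
pretrained networks used as frozen feature extractors, a lightweight
classifier trained on top of each, and cross-validated accuracy picking
the winner - can be sketched in miniature. The "pretrained networks"
below are stood in for by simple scikit-learn transforms, since the
actual model zoo isn't public; everything here is illustrative.

```python
# Toy "meta transfer learning": try several candidate representations,
# fit a lightweight classifier on each, keep the best by cross-validated
# accuracy. Real pretrained nets are replaced by simple transforms here.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)

# Stand-ins for a zoo of pretrained feature extractors.
extractors = {
    "raw-pixels": StandardScaler(),
    "pca-32": make_pipeline(StandardScaler(),
                            PCA(n_components=32, random_state=0)),
}

# "Meta": evaluate each candidate representation; in production each
# candidate could run on its own compute node in parallel.
scores = {}
for name, extractor in extractors.items():
    model = make_pipeline(extractor, LogisticRegression(max_iter=1000))
    scores[name] = cross_val_score(model, X, y, cv=5).mean()

best = max(scores, key=scores.get)
print(f"best extractor: {best} (accuracy {scores[best]:.3f})")
```

Swapping in real image or text embeddings from a pretrained network is
the "transfer" half; choosing which candidates to try based on the
problem is the "meta" half.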
        
       Author : saintarian
       Score  : 75 points
       Date   : 2022-01-10 14:18 UTC (8 hours ago)
        
       | gamerDude wrote:
       | This is a very interesting product, and since I have some data
        | that could really benefit from this, I tried it out.
       | 
       | I went through the upload process. But then I don't really know
       | what to do from there. I tried some filters. I went to the invoke
       | page, but I had no idea what invoke does or what the example
       | output is. (Eventually figured out that I can just put text in
       | the invoke and run it). All in all, there are a bunch of things
       | that I don't really know what they are. I was a statistician
       | before ml became popular, so I understand the underlying
       | premises, but none of the modern language.
       | 
        | I would also really have liked to be able to filter by, say:
        | if the confidence level is over 80%, how accurate is the
        | model?
       | Because then I can say, well, if we use this, I can knock out
       | tons of work at the 80% confidence rate and then just manually
       | work with the rest.
       | 
        | I'm also not sure if you are separating training/test data. All
       | in all, looks nice, it was very easy to get started, but I'm a
       | bit lost on what to do next and I'm having trouble judging how
       | useful this will be to me and if I should invest more time.
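
The filter gamerDude asks for boils down to two numbers per threshold:
coverage (what share of items clear the confidence bar) and accuracy on
that covered slice. A tiny numpy sketch with made-up labels and
confidences:

```python
# Accuracy restricted to confident predictions, plus how much of the
# data those predictions cover. All values below are example data.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
confidence = np.array([0.95, 0.91, 0.88, 0.55, 0.97, 0.83, 0.60, 0.99])

def accuracy_above(threshold):
    mask = confidence >= threshold
    coverage = mask.mean()                       # share auto-handled
    accuracy = (y_true[mask] == y_pred[mask]).mean()
    return coverage, accuracy

coverage, accuracy = accuracy_above(0.8)
print(f"coverage={coverage:.0%} accuracy={accuracy:.0%}")
# → coverage=75% accuracy=100%
```

The appeal is exactly as described: items above the bar are handled
automatically, and only the remaining 25% need manual review.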
        
         | saintarian wrote:
         | Thanks for trying us out and for the feedback! I agree that our
         | filters are a little confusing right now and we're working on
         | fixing it. In the meantime, here are a couple of filters you
         | could try:
         | 
         | - To see all cases where the model disagrees with your
         | annotation: Function Output = Disagrees, Desired Output = Any.
         | 
         | - To see the least confident predictions from the model:
         | Function Output = Any, Desired Output = Any, Sort By = Least
         | Confident Prediction.
         | 
          | Your idea of helping you pick a confidence threshold is a good
         | one. We'll get that into our near-term roadmap.
         | 
          | We use a technique called cross-validation to separate training
         | and test data. We have that documented here:
         | https://www.nyckel.com/docs#cross-validation
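
As a concrete illustration of the cross-validation idea above: in
k-fold cross-validation the data is split into k folds, each fold is
predicted by a model trained on the other k-1, and every sample
therefore gets an out-of-fold prediction. A toy scikit-learn sketch,
not Nyckel's internals:

```python
# 5-fold cross-validation: every sample is scored by a model that never
# saw it during training, so no permanent test set needs reserving.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
y_pred = np.empty_like(y)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    y_pred[test_idx] = model.predict(X[test_idx])

# All predictions are out-of-fold, so this estimates held-out accuracy.
print(f"cross-validated accuracy: {(y_pred == y).mean():.3f}")
```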
        
           | gamerDude wrote:
           | So, yeah. I could actually use some help on language here. Is
           | Desired Output what I tagged it as?
           | 
           | I think output is confusing me a bit. Output being predicted
           | value? And then desired output is user tagged value?
        
             | saintarian wrote:
             | Desired Output is what you tagged it as. Function Output is
             | what the model predicted.
             | 
             | We tried to make the lingo developer-friendly. We think of
             | models as functions that transform inputs to outputs.
             | Instead of writing code to do so, as developers usually do,
             | you train the function by providing desired outputs to
             | sample inputs.
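
The "function" framing above, made concrete: same input/output
contract, but the mapping is defined by examples instead of code. This
is a generic scikit-learn sketch with made-up sample data, not Nyckel's
actual machinery.

```python
# Two ways to define the same function: write the mapping by hand, or
# supply (input, desired output) pairs and let training produce it.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-written function: the developer writes the mapping.
def is_positive_handwritten(text: str) -> bool:
    return "great" in text.lower()

# Machine-learned function: the examples define the mapping.
inputs = ["great product", "awful service", "really great", "awful, avoid"]
desired_outputs = [True, False, True, False]
is_positive_learned = make_pipeline(
    TfidfVectorizer(), LogisticRegression()
).fit(inputs, desired_outputs)

print(bool(is_positive_learned.predict(["what a great idea"])[0]))
```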
        
         | danott wrote:
         | Thanks for the feedback! We actually had a feature for "what is
         | the accuracy if I only consider >80% confident samples" but we
         | iterated away from it because people found it complicated.
         | We'll definitely bring it back when we can make it simple
         | enough.
         | 
          | We've also found that people can get lost in the filters; in
          | particular, we probably need to remove the "Not assigned"
          | annotation filter for people who have annotated all of their
          | data.
         | 
         | In terms of separating training / test data: we use cross-
         | validation so that we can abstract away the concept of train
         | vs. test vs. validate sets.
        
       | blululu wrote:
       | This is very cool. A few quick questions:
       | 
       | 1.) Would it be possible to buy the model and integrate/host it
       | on my own machines?
       | 
       | 2.) Would you consider making solutions for embedded ML in the
       | future?
        
         | danott wrote:
         | We do think "model export" is important, but we're still
         | getting our heads around how to do it in the most non-ML-expert
         | friendly way. We don't think the persona we're building for
         | wants a weights file dropped in their lap. What output / format
         | would be ideal from your perspective?
        
           | blululu wrote:
           | I was thinking of something like an ONNX file or something
           | that can easily slot into different runtimes.
           | 
           | Makes sense that this would be less beginner friendly so
           | maybe you're correct that this is a P2 feature.
           | 
           | I guess I was thinking more in terms of pricing models and
           | scaling up a service which is obviously a complicated
           | decision for a startup so I'm not really sure what makes
           | sense here. My rationale for wanting to buy/rent the model is
           | that as a service scales it becomes increasingly important to
           | own the model and the hosting. One of my concerns with
           | building on top of a service like this is that it will
           | potentially reach a chokepoint in the future. In general
           | training a model is expensive and unique but hosting it is a
           | commodity service. This will incentivize customers to use the
           | service when they are small and then drop it when they grow
           | to a certain size which is not necessarily ideal for either
           | party.
        
             | danott wrote:
             | Yep. The pricing model does basically break down for model
             | export, but I think there's a solution there. Or, said
             | another way, if we could make it really easy to do then
             | there's an adjacent business we could move into.
             | 
             | In terms of keeping customers as they grow, our view
             | (hope?) is that these models will be continually updated
             | because of new annotations on their end, and from new
             | training techniques on ours. And that concept of continuous
             | improvement will push people toward a SaaS model.
             | 
             | When you say chokepoint, are you referring to cost, or
             | latency, or something else?
        
           | isaacimagine wrote:
           | Maybe a cross-language library that takes a binary weights
           | file (with embedded model information) and exposes an
           | interface similar to that of the web API? Or a local
           | lightweight version of Nyckel that one can run on their own
           | infrastructure (that exposes the same REST API)?
           | 
            | Just spitballing here; these two would be the most convenient
           | for the use-cases I have in mind.
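
The "local lightweight Nyckel" idea above can be sketched as a local
function that mirrors the hosted invoke contract but is backed by a
locally loaded model. The endpoint shape, field names, and stub model
below are all hypothetical, not the real Nyckel export format.

```python
# Local stand-in for a hosted invoke endpoint: same request/response
# contract, model loaded from a local export instead of the cloud.
import json

class StubModel:
    """Stands in for a model loaded from an exported weights file
    (e.g. an onnxruntime.InferenceSession over an ONNX export)."""

    def predict(self, text: str) -> tuple[str, float]:
        label = "positive" if "great" in text.lower() else "negative"
        return label, 0.9

MODEL = StubModel()

def invoke(request_body: str) -> str:
    """Local stand-in for POST /v1/functions/{id}/invoke."""
    data = json.loads(request_body)["data"]
    label, confidence = MODEL.predict(data)
    return json.dumps({"labelName": label, "confidence": confidence})

print(invoke(json.dumps({"data": "this is great"})))
# → {"labelName": "positive", "confidence": 0.9}
```

Keeping the local contract identical to the hosted one would let
callers switch between the two with a URL change.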
        
             | saintarian wrote:
             | Agreed that they would be convenient. We are looking at
             | both those options. There are devils in the details like
             | seamlessly taking advantage of available hardware
             | acceleration.
             | 
             | Would love to talk more about your use case so we
             | prioritize the right things for model export. Drop me a
             | line (george at nyckel dot com).
        
       | pyrrhotech wrote:
       | This looks cool, I signed up. I may have to employ this in my
       | algotrading pipeline. I've collected historical data from 100s of
       | different indicators and I'm currently using a largely heuristic
       | approach with some light ML on subsets, but so far I haven't had
        | the ML breakthrough I've been looking for:
        | https://grizzlybulls.com/models/vix-ta-macro-advanced
        
         | saintarian wrote:
         | Thanks for trying us out. We just added a beta for classifying
         | tabular inputs (a mixture of text and numbers) - this may be of
         | interest to you. We have seen some people use our platform to
         | detect stock market trends. Let us know how it goes and reach
         | out if we can help (george at nyckel dot com).
        
       | montenegrohugo wrote:
        | Heyo, cool product. I know from my own experience that setting
        | up and deploying ML on your own is a big PITA, so ML as a
        | service is a no-brainer.
       | 
       | Questions: What do you guys see as the long term vision here?
       | Where would you like to go, if you could? Obviously the value
       | prop is simple and clear right now, and you have tons of
       | competition. But assuming you're profitable, get traction, what
       | would you LIKE to be in 5-10 years? What's your dream as
       | entrepreneurs?
       | 
       | Best of luck. Also, for some small feedback, the UI currently
       | looks a bit generic, bootstrap-y. I'd recommend giving it a small
       | facelift if you can.
        
         | danott wrote:
          | Haha, yea, I'm to blame for the crappy UI... I believe in the
          | principle of shipping things you're uncomfortable with, but it
          | also hurts my soul.
        
         | saintarian wrote:
         | We think the number of use-cases for ML is going to grow
         | drastically as 1) ML state-of-the-art continues to get better;
         | and 2) Developers realize how accessible it can be. More
         | problems will be solved by a "machine-learned function" rather
         | than imperatively coded functions. Taking it one step at a
          | time, in 5-10 years we'd love to be the default place
          | developers go to solve their ML problems. This would mean that
         | we go beyond image/text classification to a broader set of
         | input and output types.
         | 
         | Thanks for the feedback on the UI! None of us are UI experts
         | and we agree that it isn't great. We're working to make it
         | better.
        
       | isusmelj wrote:
       | Hi George, Oscar and Dan. Congrats on the launch! Nyckel sounds
       | super interesting. It looks like you mostly focus on
       | classification tasks. Any plans to also support other tasks such
       | as object detection or image segmentation?
        
         | danott wrote:
         | Thanks. They're both on the roadmap; in fact we've got a
         | handful of users in a private beta for object detection; let me
         | know if you're interested in getting in on that!
        
           | Melting_Harps wrote:
           | I'd like to hear more how I could contribute to the beta, I'm
           | studying AI and ML after a career in fintech and would like
           | to see how things are done to train these models first hand.
           | 
            | I'm playing with Nyckel right now and it looks pretty
           | straightforward with a clear and simple UI.
        
             | beijbom wrote:
             | Thanks! I'd love to chat more about your use-case and how
             | we can help you. Drop me a line at oscar at nyckel dot com
             | to set something up.
        
       | petters wrote:
       | Great to see alumni from Lund here! Oscar Beijbom created the ML
       | tech for the airbag bicycle helmet Hovding right after college.
        
         | beijbom wrote:
         | Haha, that's right. Thanks for the shout-out @petters!
        
       | [deleted]
        
       | Cyril_HN wrote:
       | Would this let a complete novice tag and classify words, say, to
       | only highlight verbs? Would that task be easier than typical
       | methods?
        
         | stuartaxelowen wrote:
         | Check out Spacy, it provides PoS tagging (among other things):
         | https://spacy.io
        
         | beijbom wrote:
         | Hi Cyril_HN! Thanks for your question. What you are asking for
         | is sometimes called "part of speech" tagging. We currently
         | don't support that but will add it down the road along with
         | more advanced image outputs like detection.
        
           | beijbom wrote:
           | I'm Oscar, btw. :)
        
       | colincooke wrote:
        | Interesting product strategy, but I can't help but laugh at some
       | the product examples like "barcodeless scanner" or "quality
       | inspection" (I can only comment on imaging as that's my
       | background). First of all the idea of replacing a perfectly
       | functional barcode scanner with an ML model is not a great sell
       | (having how much I pay at the register dependent on the lighting
       | at the grocery store is not going to be a fun time). Second, ML
       | models are great when failure is OK at a specific rate, none of
       | the image classification examples shown have that characteristic.
       | 
       | While I'm sure the founders are competent and understand these
       | limitations, it's unfortunate that they've chosen such flawed
       | examples to show off on their home page.
        
         | danott wrote:
         | Actually, all these examples are straight from our userbase. I
         | can't speak to how successful they're being in their own
         | businesses, but they seem happy with the ML. The 'barcodeless
         | scanner' in particular is about scanning bulk foods that don't
         | have barcodes, and it seems to work for them.
        
       | dangledangle wrote:
       | I'm curious to know how this is different from Microsoft's
       | CustomVision and other such tools. In fact, they've been around
       | for much longer with the exact same product.
       | 
       | Thanks.
        
         | danott wrote:
          | Well, there's a continuum of offerings: on one side, lots of
          | custom control over the training pipeline; on the other,
          | things like CustomVision that try to make it easy / hide the
          | complexity. We consider ourselves even further to the "hide
         | complexity" side, since we try every domain automatically vs.
         | making you choose, re-train automatically, etc. Initial
         | feedback from our users is that even recall vs. precision is
         | something they don't want to have to think about. In addition,
         | we don't limit ourselves to only vision - we'd like to be the
         | one stop shop for ML as a service.
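
For readers unfamiliar with the recall vs. precision tradeoff mentioned
above: precision asks "of the items I flagged, how many were right?",
while recall asks "of the items that were there, how many did I find?".
A toy computation on made-up binary predictions:

```python
# Precision and recall from raw binary predictions.
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 1, 0])

tp = ((y_pred == 1) & (y_true == 1)).sum()  # flagged and correct
fp = ((y_pred == 1) & (y_true == 0)).sum()  # flagged but wrong
fn = ((y_pred == 0) & (y_true == 1)).sum()  # missed positives

precision = tp / (tp + fp)   # of the flagged items, how many were right
recall = tp / (tp + fn)      # of the true positives, how many were found
print(f"precision={precision:.2f} recall={recall:.2f}")
# → precision=0.75 recall=0.75
```

Raising the confidence bar typically trades recall for precision, which
is exactly the knob users reportedly didn't want to think about.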
        
       | zschuessler wrote:
       | I couldn't stop chuckling over the example use cases' screenshot
       | for Sentiment Analysis. Please don't remove it, it made my day
       | and it's only 10am here.
        
         | danott wrote:
         | Oh man, we literally just got new images from our new designer
         | and I think they went a different direction with that one :)
        
         | saintarian wrote:
         | Ha - that one makes us chuckle too! But we can't promise to not
         | remove it.
        
       | psimm wrote:
       | Very cool! I signed up and uploaded data for a text classifier.
       | 3000 examples of social media posts on a binary annotation task.
       | Got 91% initially, then looked through the annotations and
       | corrected a few errors that had snuck in. The UI for that is
       | great. That got it to 92%.
       | 
       | Easy to use UI, easy data upload and the training was quick. A
       | great tool for testing new ideas for classifiers. For bigger
       | projects I'd be concerned about long term cost with pay per
       | invocation.
       | 
       | Is weak labeling via labeling functions (snorkel, skweak)
       | something that's on the roadmap for Nyckel? Also, do you plan to
       | add named entity recognition?
        
         | saintarian wrote:
          | Thank you for the kind words and feedback! You basically went
         | through most of the UI flow that we designed for. You're spot-
         | on about testing new classifiers - answering the question "Can
         | ML even help with my problem?" is much easier with Nyckel and
          | prototyping and rapid iteration start with that.
         | 
         | Our goal is to be cost-competitive, even for bigger projects.
         | Given how early we are, our pricing structure is still being
         | worked on, especially for high-volume.
         | 
          | Integrating with labeling solutions is on our roadmap. In the
         | meantime, our API should enable any data/labeling integrations.
         | 
          | Named entity recognition is also on the roadmap. Would love to
         | hear more about your use-case and we can give you access to the
         | beta when ready.
        
           | beijbom wrote:
           | Chiming in on the weak labeling question: As of right now,
           | you can use outside libraries like skweak to create weak
           | labels offline and then PUT those using our API
           | (https://www.nyckel.com/docs#update-annotation). This
           | wouldn't cost anything since we only charge for invokes, but
           | it requires some work.
           | 
           | We may look at adding weak labeling as a first class feature
           | of our site down the road, but we are not yet sure we need
           | to. With the powerful semantic representations offered by the
            | latest deep nets, we find that a smaller number of hand-
            | annotated samples often suffices for the desired accuracy,
            | which makes the whole annotation process simpler and faster.
           | Of course, if you have data & evidence to the contrary, we'd
           | love to take a look.
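
The offline weak-labeling flow described above can be sketched as:
simple labeling functions vote on each sample, and the aggregated
labels are shaped into annotation payloads that could then be uploaded
via the API's PUT endpoint. The labeling functions and payload field
names here are illustrative, not skweak's or Nyckel's real interfaces.

```python
# Weak labeling by majority vote over hand-written labeling functions,
# then packaging the results as annotation payloads.
import json
from collections import Counter

# Each labeling function returns a class name, or None to abstain.
def lf_refund(text):
    return "complaint" if "refund" in text.lower() else None

def lf_thanks(text):
    return "praise" if "thank" in text.lower() else None

def lf_broken(text):
    return "complaint" if "broken" in text.lower() else None

LABELING_FUNCS = (lf_refund, lf_thanks, lf_broken)

def weak_label(text):
    votes = [label for lf in LABELING_FUNCS
             if (label := lf(text)) is not None]
    return Counter(votes).most_common(1)[0][0] if votes else None

samples = ["I want a refund, it arrived broken", "Thanks, it works!"]
annotations = [
    {"data": s, "labelName": weak_label(s)}   # hypothetical PUT body
    for s in samples
    if weak_label(s) is not None
]
print(json.dumps(annotations, indent=2))
```

Since invokes are the only billed operation, uploading annotations this
way would indeed be free, as noted above.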
        
       ___________________________________________________________________
       (page generated 2022-01-10 23:00 UTC)