[HN Gopher] Launch HN: Humanloop (YC S20) - A platform to annota...
       ___________________________________________________________________
        
       Launch HN: Humanloop (YC S20) - A platform to annotate, train and
       deploy NLP
        
       Hey HN. We're Peter, Raza and Jordan of Humanloop
       (https://humanloop.com) and we're building a low-code platform
       to annotate data, rapidly train and then deploy Natural
       Language Processing (NLP) models. We use active learning
       research to make this possible with 5-10x less labelled data.

       We've worked on large machine learning products in industry
       (Alexa, text-to-speech systems at Google and in insurance
       modelling) and seen first-hand the huge efforts required to
       get these systems trained, deployed and working well in
       production. Despite huge progress in pretrained models (BERT,
       GPT-3), one of the biggest bottlenecks remains getting enough
       _good quality_ labelled data.

       Unlike annotations for driverless cars, the data being
       annotated for NLP often requires domain expertise that's hard
       to outsource. We've spoken to teams using NLP for medical
       chatbots, legal contract analysis, cyber security monitoring
       and customer service, and it's not uncommon to find teams of
       lawyers or doctors doing text labelling tasks. This is an
       expensive barrier to building and deploying NLP.
       We aim to solve this problem by providing a text annotation
       platform that trains a model as your team annotates. Coupling
       data annotation and model training has a number of benefits:

       1) we can use the model to select the most valuable data to
       annotate next - this "active learning" loop (sketched in code
       below) can often reduce data requirements by 10x

       2) a tight iteration cycle between annotation and training
       lets you pick up on errors much sooner and correct annotation
       guidelines

       3) as soon as you've finished the annotation cycle you have a
       trained model ready to be deployed.
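       Roughly, that loop looks like this (a simplified sketch for
       illustration only, not our implementation; `texts`, `n_rounds`,
       `get_label` and `select` are placeholders for your unlabelled
       corpus, the annotation budget, the human annotator and an
       acquisition function):

         import random
         from sklearn.feature_extraction.text import TfidfVectorizer
         from sklearn.linear_model import LogisticRegression
         from sklearn.pipeline import make_pipeline

         model = make_pipeline(TfidfVectorizer(),
                               LogisticRegression(max_iter=1000))
         labelled, pool = [], list(texts)
         batch = random.sample(pool, 10)   # cold start: random batch
         for _ in range(n_rounds):
             # 2) a domain expert labels just this small batch
             labelled += [(x, get_label(x)) for x in batch]
             pool = [x for x in pool if x not in batch]
             X, y = zip(*labelled)
             model.fit(list(X), list(y))   # 3) retrain straight away
             # 1) the model picks the most valuable data to label next
             batch = select(model, pool, k=10)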
       Active learning is far from a new idea, but getting it to work
       well in practice is surprisingly challenging, especially for
       deep learning. Simple approaches use the ML models' predictive
       uncertainty (the entropy of the softmax) to select what data
       to label... but in practice this often selects genuinely
       ambiguous or "noisy" data that both annotators and models have
       a hard time handling. From a usability perspective, the
       process needs to be cognizant of the annotation effort, and
       the models need to quickly update with new labelled data,
       otherwise it's too frustrating to have a human-in-the-loop
       training session.
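       Concretely, that simple baseline just ranks the unlabelled
       pool by predictive entropy - something like this toy version
       (any classifier with a predict_proba method works; it's the
       kind of thing you'd plug in as `select` above):

         import numpy as np

         def entropy_select(model, pool, k=10):
             # probs: (n_examples, n_classes) predicted probabilities
             probs = model.predict_proba(pool)
             ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
             top = np.argsort(-ent)[:k]     # most uncertain first
             return [pool[i] for i in top]

       The catch is that an example can have maximal entropy simply
       because it's inherently ambiguous, and labelling lots of those
       doesn't teach the model much.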
       Our approach uses Bayesian deep learning to tackle these
       issues. Raza and Peter worked on this during their PhDs at
       University College London alongside fellow cofounders David
       and Emine [1, 2]. With Bayesian deep learning, we're
       incorporating uncertainty in the parameters of the models
       themselves, rather than just finding the best model. This can
       be used to find the data where the model is uncertain, not
       just where the data is noisy. And we use a rapid approximate
       Bayesian update to give quick feedback from small amounts of
       data [3]. An upside of this is that the models have
       well-calibrated uncertainty estimates -- they know when they
       don't know -- and we're exploring how this could be used in
       production settings for a human-in-the-loop fallback.
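       To give a flavour of the difference: if you approximate a
       posterior over the model parameters (MC dropout is the
       simplest stand-in here, not necessarily what we ship), you can
       score examples by how much the sampled models disagree rather
       than by raw entropy - the BALD mutual-information score:

         import numpy as np

         def bald_scores(mc_probs):
             # mc_probs: (n_passes, n_examples, n_classes), i.e.
             # predictions from several stochastic forward passes
             mean_p = mc_probs.mean(axis=0)
             total = -(mean_p * np.log(mean_p + 1e-12)).sum(-1)
             noise = -(mc_probs * np.log(mc_probs + 1e-12)).sum(-1)
             noise = noise.mean(axis=0)
             # high only where different plausible models disagree
             return total - noise

       A genuinely ambiguous example is uncertain under every sampled
       model, so the two terms cancel and it stops dominating the
       selection.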
       Since starting we've been working with data science teams at
       two large law firms to help build out an internal platform for
       cyber threat monitoring and data extraction. We're now opening
       up the platform to train text classifiers and span-tagging
       models quickly and deploy them to the cloud. A common use case
       is classifying support tickets or chatbot intents.

       We came together to work on this because we kept seeing data
       as the bottleneck for the deployment of ML, and were inspired
       by ideas like Andrej Karpathy's Software 2.0 [4]. We
       anticipate a future in which the barriers to ML deployment
       become sufficiently lowered that domain experts are able to
       automate tasks for themselves through machine teaching, and we
       view data annotation tools as a first step along this path.

       Thanks for reading. We love HN and we're looking forward to
       any feedback, ideas or questions you may have.

       [1] https://openreview.net/forum?id=Skdvd2xAZ - a scalable
       approach to estimating uncertainty in deep learning models

       [2] https://dl.acm.org/doi/10.1145/2766462.2767753 - work
       combining uncertainty with representativeness when selecting
       examples for active learning

       [3] https://arxiv.org/abs/1707.05562 - a simple Bayesian
       approach to learning from small amounts of data

       [4] https://medium.com/@karpathy/software-2-0-a64152b37c35 -
       Andrej Karpathy's "Software 2.0" essay
        
       Author : jordn
       Score  : 89 points
       Date   : 2020-07-29 14:57 UTC (8 hours ago)
        
       | ZeroCool2u wrote:
       | This looks pretty great, though the SaaS model is an absolute
       | non-starter for my own usage unfortunately. We've been pretty
       | prolific users of Explosion AI's (makers of spaCy) Prodigy
       | [1], and actually the interfaces look very similar. What
       | would you say
       | the core differences are between Humanloop and Prodigy?
       | 
       | 1: https://prodi.gy/
        
         | razcle wrote:
         | Thanks! Prodigy is a good tool and we definitely were inspired
         | by some of their UX decisions. Reducing each decision to a
         | small atomic unit and avoiding context switching makes a lot of
         | sense.
         | 
         | Our starting place is similar to Prodigy in that we also
         | see active learning as a key piece of the puzzle, but we
         | think that making active learning work reliably really
         | does require taking parameter uncertainty into account. As
         | far as I know Prodigy doesn't do this. We are also working
         | to make our active learning work at the level of batches
         | and be cost-aware. Often the most valuable examples to
         | label for the model are the most time-consuming for
         | humans, and we work to trade this off.
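         | 
         | A toy version of the cost-aware idea (illustrative only;
         | `info_score` and `est_seconds` are placeholders for an
         | informativeness measure and a per-example estimate of
         | annotation time):
         | 
         |     def cost_aware_select(pool, k=10):
         |         def value(x):
         |             return info_score(x) / est_seconds(x)
         |         return sorted(pool, key=value, reverse=True)[:k]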
         | 
         | A few other differences are that we do offer a cloud-
         | hosted solution, so getting set up is much faster and it's
         | more natural for us to accommodate team annotation and
         | quality assurance. By providing a hosted model we also
         | give you the option of deploying features very quickly and
         | continuing to improve them post-deployment.
         | 
         | I'd be curious to know what barriers SaaS introduces for
         | you?
        
       | an_ml_engineer wrote:
       | Cool! I'm curious, how do you compare your service to Scale
       | (scale.com)?
        
         | razcle wrote:
         | Hi,
         | 
         | Raza here (one of the other co-founders). Good question! I
         | think our visions are quite different even if our starting
         | points look similar.
         | 
         | Scale has always positioned themselves as an API to human
         | labour, and their goal is to abstract the labelling task
         | away from the end user as much as possible. So Scale works
         | really well when you can easily outsource your annotation
         | task.
         | 
         | Our ultimate goal is to try and give domain experts the
         | ability to teach ML models themselves. We're much more
         | focussed on NLP and on tasks that require domain expertise
         | and are hard to outsource. For teams where deep domain
         | expertise matters or there are privacy concerns, Scale
         | isn't really an option, and we're building tools for them.
         | 
         | On another point, Scale makes its money by charging per
         | annotation, so we think they aren't as incentivised to
         | reduce how much you need to label.
         | 
         | thanks!
        
       | jeffbarg wrote:
       | Humanloop is such a great name for an AI platform :) Congrats on
       | the launch!
        
         | jordn wrote:
         | Haha, so great to hear! For a while Google search kept
         | trying to autocorrect it to 'human poop'.
        
       | ml_basics wrote:
       | Great stuff!
        
       | [deleted]
        
       | gauravsc wrote:
       | https://jacobbuckman.com/2020-01-17-a-sober-look-at-bayesian...
       | 
       | "But in practice, BNNs do generalize to test points, and do seem
       | to output reasonable uncertainty estimates. (Although it's worth
       | noting that simpler approaches, like ensembles, consistently
       | outperform BNNs.)"
        
       | Grimm1 wrote:
       | Neat! How do you compare yourself on annotation capabilities
       | with Datasaur.ai, which launched in the last YC batch?
       | 
       | In terms of training the models for deployment -- do we own
       | the artifact? Can I move that into my own model repository?
       | 
       | Also, how do you feel this compares to fine-tuning a
       | publicly available BERT-family model, which is already
       | fairly fast and easy and doesn't require a huge corpus?
       | (Speaking from experience, having recently done so.)
       | 
       | Are the benefits more from the tight feedback loop and the
       | already-standing infrastructure?
        
         | jordn wrote:
         | All great questions!
         | 
         | Datasaur are great. I hope Ivan would think it's fair for
         | me to describe their current product as a modern, cloud-
         | hosted Brat (https://brat.nlplab.org/ - this remains very
         | popular!) with the features to make that work with teams.
         | As you point out, we're focusing on the tight integration
         | of annotation and training, enabling you to move faster
         | and iterate on NLP ideas... essentially trying to move
         | from a waterfall ML lifecycle to an agile one.
         | 
         | Fine-tuning BERT is the way to go. It's what we do, and
         | that already reduces the data annotation requirements by
         | an order of magnitude. Some people still want to do that
         | offline in a notebook (you can use our tool just as the
         | annotation platform and download the data - you'll still
         | get the efficiency benefit of active learning), but
         | integrating or deploying that model is still a time-suck.
         | Having the model deployed in the cloud immediately has a
         | load of supplementary benefits too, we hope (easy to
         | update, always able to use the latest models, etc.).
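         | 
         | For anyone who hasn't tried it, the offline route looks
         | roughly like this (a minimal sketch with the open-source
         | transformers library, nothing Humanloop-specific; the
         | example texts and labels are made up):
         | 
         |     import torch
         |     from transformers import (
         |         AutoTokenizer, AutoModelForSequenceClassification)
         | 
         |     name = "bert-base-uncased"
         |     tok = AutoTokenizer.from_pretrained(name)
         |     model = AutoModelForSequenceClassification \
         |         .from_pretrained(name, num_labels=2)
         |     model.train()
         |     opt = torch.optim.AdamW(model.parameters(), lr=2e-5)
         | 
         |     batch = tok(["reset my password", "cancel my plan"],
         |                 padding=True, return_tensors="pt")
         |     labels = torch.tensor([0, 1])
         |     loss = model(**batch, labels=labels)[0]  # 1st output
         |     loss.backward(); opt.step(); opt.zero_grad()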
         | 
         | (edit: typos)
        
           | Grimm1 wrote:
           | Great answers, thank you very much!
        
           | julvo wrote:
           | Firstly, congrats on the launch! Active learning is a super
           | interesting space.
           | 
           | You say it's possible to download the data and use Humanloop
           | for annotation only while still benefitting from active
           | learning. I'm curious about your experience with how much
           | active learning depends on the model. Are the examples that
           | the online model selects for labelling generally also the
           | most useful ones for a different model trained offline?
        
             | jordn wrote:
             | Cheers. It's a good thing to be wary of. Poor use of
             | active learning will end up biasing the data towards
             | the model it was selected with - so that data won't be
             | the best X samples for training a different model.
             | Most of this issue comes from bad active learning
             | selection methods. If you have well-calibrated
             | uncertainty estimates and also sample for diversity
             | and representativeness, it's far less of a concern.
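             | 
             | A toy version of the diversity part (illustrative
             | only, not our pipeline): embed the pool, cluster it,
             | and take the most uncertain example from each cluster
             | so the batch stays spread out as well as informative.
             | 
             |     import numpy as np
             |     from sklearn.cluster import KMeans
             | 
             |     def diverse_batch(emb, unc, k=10):
             |         # emb: (n, d) embeddings of the pool
             |         # unc: (n,) per-example uncertainty scores
             |         cl = KMeans(n_clusters=k).fit_predict(emb)
             |         best = []
             |         for c in range(k):
             |             in_c = np.where(cl == c, unc, -np.inf)
             |             best.append(int(np.argmax(in_c)))
             |         return best   # indices into the pool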
        
       | alihabib123 wrote:
       | This is really cool! Wish you all the best of luck!!
        
       | foobaw wrote:
       | #1 and #2, if they work as advertised, are great features,
       | but a lot of other companies have claimed to do this and
       | failed.
       | 
       | One of the biggest problems I have is image annotation using
       | CVAT - the tool works when the task is simple annotation,
       | but outputting the annotation data and integrating it has
       | been a pain point. Also, CVAT as a tool is great but has a
       | lot of missing features :/
        
       | caiobegotti wrote:
       | Is it English-only, or true NLP that would work with
       | multiple languages? Congrats on the launch!
        
         | razcle wrote:
         | We wrap a lot of popular frameworks and have implementations of
         | most SOTA models. By default we use a multilingual BERT model
         | so it should work out of the box on different languages.
        
       | haffi112 wrote:
       | What type of annotations do you offer?
        
         | jordn wrote:
         | Right now, document-level classification and span tagging
         | within text documents. These can also be combined (as in
         | the landing page screenshot) so that for a given input,
         | you're learning multiple tasks at once as you annotate.
         | 
         | The core of this platform should generally be independent
         | of the data input type and the output labels, so we're
         | building out other annotation options for our business
         | customers. If there's a use case you would like it to
         | support, it would be great to chat:
         | jordan[at]humanloop.com :)
        
           | hbcondo714 wrote:
           | >> text documents
           | 
           | Congrats on the launch! Would Humanloop be able to support
           | HTML files or URLs? A client of ours has a need to annotate
           | verbose web pages.
        
             | razcle wrote:
             | At the moment we don't support rendering the HTML, but
             | it is something that has come up before. One of the
             | teams we're speaking to wants to classify blog posts
             | and would like to be able to preserve their
             | formatting. If this is something that's important to
             | you, we would consider adding it - maybe drop me an
             | email at raza[at]humanloop.com and we can discuss?
        
               | hbcondo714 wrote:
               | Thank you for your reply. Yes, preserving the formatting
               | is important for us too.
        
       | Rickasaurus wrote:
       | Is this something we will be able to buy and run on our servers?
       | I don't think we're the only ones wary of working hard to develop
       | IP for a different company.
       | 
       | Also predictions/month pricing is just really challenging and
       | incompatible with many downstream business models. The value has
       | to be really huge to justify that.
        
         | razcle wrote:
         | The model you train and the data you upload are yours to
         | own; they're unique to you and don't get shared across
         | users or reused for any other tasks, so hopefully you
         | shouldn't feel too much like you're building our IP ;)
         | 
         | In terms of deployment options, we're trying to lead with
         | cloud hosting by default, but we know that for a lot of
         | people the whole reason they're annotating in house is
         | privacy, so we've been exploring deploying in your VPC
         | and, for larger enterprises, on-prem.
         | 
         | Interested to hear more of your thoughts on the pricing
         | model - it's something we're still iterating on, so what
         | do you think would be most compatible with your use cases?
        
       | stuartaxelowen wrote:
       | Do you allow for on-premise inference?
        
         | peadarohaodha wrote:
         | Our default deployment option is cloud-first for both
         | training and inference at the moment, but we have thought
         | about the ability for users to export a trained model:
         | either the model parameters in some standardised format, a
         | compiled predict function, or a Docker image that
         | encapsulates a full inference service. If you could use
         | that kind of export within your application, it would
         | allow on-premise inference. This is something we could
         | probably make available pretty quickly if necessary for
         | your use case.
        
       ___________________________________________________________________
       (page generated 2020-07-29 23:00 UTC)