[HN Gopher] Launch HN: Deepnote (YC S19) - A better data science...
       ___________________________________________________________________
        
       Launch HN: Deepnote (YC S19) - A better data science notebook
        
       Hello HN,

       I'm Jakub and I'm the founder of Deepnote
       (https://deepnote.com/). We're building a better data science
       notebook.

       As an engineer, I spent most of my time working on developer
       tools, building IDEs, and studying human-computer interaction. I
       helped build a couple of startups, I built tools for JavaScript
       development, and worked on Firefox DevTools. But once I started to
       work with data scientists, all those code editors and IDEs that I
       knew as a software engineer suddenly stopped being the right tool
       for the job. Notebooks were.

       Notebooks as we know them today have many pain points (versioning,
       reproducibility, collaboration). They don't work well with other
       tools. They don't exactly encourage best practices. But none of
       these are fundamental flaws of the notebook paradigm. They are
       signs of a new computational medium. Much like spreadsheets in the
       1980s.

       Two years ago, my co-founders and I started to think about a better
       data science notebook. Deepnote is built on top of the Jupyter
       ecosystem. We are using the same format, and we intend to remain
       fully compatible in both directions. But to solve the above
       problems, we've introduced significant changes.

       First, we made collaboration a first-class citizen. To allow for
       this, Deepnote runs in the cloud by default. Every Deepnote
       notebook is easily shareable (like Google Docs) and easy to
       understand even by non-technical users.

       Second, we completely redesigned the interface to encourage best
       practices, write clean code, define dependencies, and create
       reproducible notebooks. We also built a really good autocomplete
       system, and added a variable explorer.

       Third, we made Deepnote easy to integrate with other services. We
       didn't want to build another data science platform where people
       work with an iframed notebook. We want to build an amazing notebook
       that plays well with other services, databases, ML platforms, and
       the Jupyter ecosystem.

       Check out a 2-min demo here:
       https://www.loom.com/share/b7e05ecca78047c2a2f687d77be8ecea

       Building a new computational medium is hard. It takes time. Today,
       we're launching a public beta of Deepnote. Not everything works
       yet. Some pieces are missing. But we also have a lot in store,
       including versioning, code reviews, visualizations. We still have a
       lot to learn too, so I'd love to hear your thoughts and feedback.
        
       Author : Equiet
       Score  : 196 points
       Date   : 2020-10-30 14:49 UTC (8 hours ago)
        
       | amirathi wrote:
       | > Notebooks as we know them today have many pain points
       | (versioning, reproducibility, collaboration)
       | 
       | Absolutely. We're solving a small part of this by making
       | notebooks play nicely with GitHub (https://reviewnb.com). Code
       | reviews & collaboration for Jupyter Notebooks, essentially.
       | 
       | Happy to see more products taking a stab at this problem. I'd be
       | curious to know how you implement version control (git or
       | something else) & what kind of experiences does that translate to
       | for the user. Congrats on the launch!
        
         | Equiet wrote:
         | At the moment we have GitHub integration, so you can easily
         | commit changes like you're used to. We also have project
         | history (so you can see all the actions that led to the
         | current state of the project and review what happened while you
         | were away).
         | 
         | But I'd like to improve on this experience. There are many ways
         | to do it (great job btw), but we want to explore what a
         | versioning system native to notebooks would look like. We're
         | still iterating on that.
        
       | rsweeney21 wrote:
       | Kind of a random question: why wait so long to do your launch HN?
       | It looks like you've had a working product for quite a while now.
       | 
       | When I see a Launch HN from a YC batch from over a year ago, I
       | assume there was a pivot or they had trouble getting traction.
       | That doesn't seem to have happened in your case and it might not
       | happen in most cases, which is why I'm asking.
       | 
       | Either way, looks like an awesome product. I sent it over to the
       | Data Science team at my company and they were pretty impressed.
        
         | Equiet wrote:
         | We're building a pretty hard product. We had a nice working
         | demo a year ago, but there's a lot of work to make a platform
         | like this stable. Real-time collaboration is pretty difficult
         | by itself (especially when you're not syncing just text), but
         | we also had to build a computing platform where users can run
         | arbitrary code. That opens us up to everything from a large
         | attack surface to a huge number of quite inventive crypto
         | miners. So we kept building in a private beta until we were
         | confident enough to launch publicly.
         | 
         | Interestingly, ever since we started almost 2 years ago we've
         | been pretty laser focused and there were minimal changes to the
         | vision overall. But we also knew what we were getting into and
         | that it'd take time.
        
       | loehnsberg wrote:
       | This looks very promising! Great work and I'd love to give it a
       | try. While I do use Python, I also use Scala and R kernels. Does
       | or will Deepnote support other kernels?
        
         | epiteton wrote:
         | Hi, Deepnote supports any other Jupyter-compatible kernel.
         | Check out the docs for details (we have guides for both Scala and
         | R) https://docs.deepnote.com/environment/custom-
         | environments/ru...
        
       | 29athrowaway wrote:
       | Please add corgi mode and kitty mode.
       | 
       | https://mobile.twitter.com/googlecolab/status/11899381904522...
        
       | threatofrain wrote:
       | Deepnote + Racket kernel + docker:
       | 
       | https://twitter.com/dkvasnickajr/status/1321901316411711490?...
        
       | tpetry wrote:
       | A notebook service without any GPU machines is a rather strange
       | decision. What's your target audience with this service? Any
       | machine learning workload would practically take forever.
        
         | Equiet wrote:
         | Ship early, ship often. GPUs are coming.
        
       | jensidean wrote:
       | Good job!
        
       | setgree wrote:
       | This looks great, thanks for sharing! I just signed up.
       | 
       | My main question is how/if Deepnote addresses issues of
       | reproducibility. Is this a priority for your team? You mention it
       | a few times in your post here, but there is not much in the
       | docs -- I looked it up and found only this:
       | 
       | > Even though the Custom environment cache is implemented using
       | Docker images, it doesn't primarily serve the reproducibility
       | problem. The aim of the feature is to significantly speed up the
       | start time of your projects. In other words, you should consider
       | it to be only a cache at this point.
       | 
       | My experience with Notebooks suggests that the main
       | (computational) reproducibility challenges were
       | 
       | A) 'hidden state' information (e.g. cells executed out of order,
       | variables changed and then reverted but not re-run); and
       | 
       | B) no clear infrastructure for documenting/caching dependencies
       | (I see you have a terminal option, and the web-based access
       | should address some of this, but something like `conda install
       | environment.yml` doesn't seem possible out of the box.)
       | 
       | I would understand if these issues are not priorities for you, I
       | don't think most data science projects _need_ to be run in the
       | far future and most teams can informally sync their dependencies.
       | 
       | If reproducibility is a core priority, do you plan to write
       | something about how DN serves that purpose? I'd be glad to take a
       | close look if you do (I have written/worked a fair bit on this in
       | the past).
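
       The "hidden state" problem in (A) can be partly illustrated with
       a small check. This is only a sketch, not how Deepnote or Jupyter
       handle it: it assumes a notebook already loaded as a dict in the
       standard .ipynb layout, where each code cell carries an
       `execution_count`, and it flags cells whose count does not
       increase from top to bottom. The function name
       `out_of_order_cells` and the sample notebook are made up for
       illustration.

```python
from typing import List


def out_of_order_cells(notebook: dict) -> List[int]:
    """Return indices of code cells whose execution_count breaks top-to-bottom order."""
    flagged = []
    last_count = 0
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are never executed
        count = cell.get("execution_count")
        if count is None:
            continue  # cell was never run, so there is nothing to compare
        if count <= last_count:
            flagged.append(index)  # ran earlier than a cell that sits above it
        last_count = max(last_count, count)
    return flagged


# Hypothetical notebook: the last cell was executed before the one above it.
example = {
    "cells": [
        {"cell_type": "code", "execution_count": 1, "source": "x = 1"},
        {"cell_type": "markdown", "source": "notes"},
        {"cell_type": "code", "execution_count": 3, "source": "y = x + 1"},
        {"cell_type": "code", "execution_count": 2, "source": "x = 10"},
    ]
}

print(out_of_order_cells(example))  # prints [3]
```

       A check like this only sees the saved counters. It cannot detect
       a variable that was changed and later reverted without a re-run,
       which is why restarting the kernel and running everything top to
       bottom remains the stronger test.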
        
         | Equiet wrote:
         | Re Dockerfiles: Right now, we need to rebuild your docker
         | images from time to time (e.g. when we make some changes to the
         | kernel). That means that if you create a docker image with `RUN
         | pip install numpy` and we need to rebuild it in a year, you
         | might get a different version which might break things. The
         | correct solution here is to encourage users to always use `RUN
         | pip install numpy==1.19.3`. We already do this in Python cells
         | (when you run `!pip install numpy` we query PyPI and suggest
         | the latest version to you), but we haven't added it to
         | Dockerfiles yet. So to set the expectations we have this notice
         | in the docs.
         | 
         | Regarding other issues: We currently record every execution in
         | project history. That means even if you run cells out of order,
         | you can still get a list of commands that shows how you got to
         | the current state.
         | 
         | The next step for us is to start subtly notifying users when
         | they are doing something that could be an issue later down the
         | road (for example executing cells out of order). We already
         | built this, but decided not to ship it yet because it needed
         | more love. The second thing we are working on is
         | interactive/reactive execution. This is very very very cool and
         | brings the experience from the notebook to the next level (at
         | least for me), but needs much more testing.
         | 
         | Reproducibility vs flexibility (in the sense of letting the
         | user do whatever they want if they know what they're doing) is
         | a difficult problem. In the end, it's going to be a combination
         | of friendly nudges and a much better experience if users are
         | following the "reproducible" path. However, we never want to
         | limit users in what they are able to do.
         | 
         | I spent a lot of time thinking about this and would be happy to
         | chat about what you're thinking. Feel free to email me at
         | jakub@deepnote.com.
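
       The version-pinning suggestion described above could look roughly
       like the sketch below. It is not Deepnote's implementation. It
       assumes PyPI's public JSON endpoint
       (https://pypi.org/pypi/<name>/json) for the lookup, and the
       helper names `latest_version_from_pypi` and `pin_requirements`
       are hypothetical. It only handles plain package names and flags
       that take no argument.

```python
import json
import re
import urllib.request
from typing import Callable

PYPI_URL = "https://pypi.org/pypi/{name}/json"


def latest_version_from_pypi(name: str) -> str:
    """Look up the latest released version via PyPI's JSON API (needs network access)."""
    with urllib.request.urlopen(PYPI_URL.format(name=name), timeout=10) as response:
        return json.load(response)["info"]["version"]


def pin_requirements(
    command: str, lookup: Callable[[str], str] = latest_version_from_pypi
) -> str:
    """Rewrite a `pip install ...` line so every unpinned package gets `==version`."""
    prefix, separator, packages = command.partition("pip install")
    if not separator:
        return command  # not a pip install line, leave it alone
    pinned = []
    for token in packages.split():
        # Keep flags (-U, --no-cache-dir) and already constrained specs as they are.
        if token.startswith("-") or re.search(r"[=<>~!]", token):
            pinned.append(token)
        else:
            pinned.append(f"{token}=={lookup(token)}")
    return f"{prefix}pip install {' '.join(pinned)}"


# Offline demo with a stand-in index instead of a real PyPI query.
fake_index = {"numpy": "1.19.3"}
print(pin_requirements("RUN pip install numpy", lookup=fake_index.__getitem__))
# prints: RUN pip install numpy==1.19.3
```

       Passing the lookup as a parameter keeps the rewrite testable
       without network access; the default performs the real PyPI query.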
        
           | setgree wrote:
           | Thanks so much for the thoughtful reply! I will follow up
           | when I've had a deeper dive into the product.
        
       | dfsegoat wrote:
       | This looks great!
       | 
       | I'd be curious to see a detailed feature comparison between this
       | and Google Colab / Colab pro [1,2]? I think others might find
       | this useful as well.
       | 
       | 1 -
       | https://colab.research.google.com/notebooks/intro.ipynb#rece...
       | 
       | 2 - https://colab.research.google.com/signup
        
         | Equiet wrote:
         | Quick summary:
         | - real time collaboration
         | - integrations (databases, S3 buckets, environment variables)
         | - persistent (and much much faster) filesystem
         | - hardware doesn't shut off
         | - many more features like variable explorer or automatic
         |   visualizations
         | - much nicer interface so you can share with non-technical
         |   people
         | - paid plan so you can build your data science team around it
         | - no GPU/TPU machines yet, but that's coming
        
       | desmap wrote:
       | No GPUs with tensor cores?
        
         | adlha wrote:
         | Hi from Deepnote! We've pulled GPUs temporarily so that we can
         | focus the roadmap on improving the notebook experience. They
         | will be back sooner or later for sure.
        
       | [deleted]
        
       | WClayFerguson wrote:
       | Hey Jakub, check out this platform:
       | 
       | https://quanta.wiki
       | 
       | A "collaborative notebook" would be one very good way to describe
       | what Quanta is as well. I'm the developer of it, by the way.
        
       | marapuru wrote:
       | Very nice. As a UX person who recently got into data science this
       | is a breath of fresh air compared to the traditional Jupyter
       | notebooks. Website looks really fresh and I love the colors and
       | font usage. Video could be slightly shorter to really show me the
       | highlights and maybe show a bit more of the realtime collab
       | (although the gif below shows it well).
       | 
       | One thing that annoyed me a bit is that I could only register
       | with github or google. Why can't I just create an account
       | directly with your service?
        
         | Equiet wrote:
         | Thanks! Honest answer: it was faster to implement. Regular sign
         | up via email should be coming soon.
        
           | marapuru wrote:
           | Understood. I was thinking it might have been a GDPR tackle
           | tactic :-)
        
         | marapuru wrote:
         | Additional comments:
         | 
         | The tutorial is nice, I like how it guides me through the tool.
         | But I struggled to find the publish button, as it was under the
         | Share text. It would be a quick win to make it more of a CTA
         | (make it blue or something like that). Look at Figma for an
         | example.
        
       | kndjckt wrote:
       | This looks great. We're very much fed up with collaborating on
       | Jupyter notebooks. Sadly my team can't use Deepnote yet because
       | all our datastores are behind VPN. Is there a future where we
       | could run Deepnote on our own AWS instances?
        
       | epiteton wrote:
       | Saw this back when it started vs today, amazing job!
       | https://twitter.com/DeepnoteHQ/status/1315375717526507522/ph...
        
       | abalaji wrote:
       | I'm playing around with it right now and right off the bat, I can
       | say that load times are significantly faster than something like
       | MyBinder (maybe due to scale or caching?)
       | 
       | Overall this seems pretty cool! The realtime editing seems to be
       | killer, Google Colab is close but not as good from my initial
       | testing. Some of the python package integrations may be able to
       | be replicated with open source tools (e.g. table visualization
       | and https://github.com/quantopian/qgrid)
       | 
       | My big question comes down to vendor lock in. What's the vision
       | here for compatibility with the Jupyter ecosystem in the long
       | haul? (e.g. do we see Deepnote features contributed back to
       | Jupyter)
        
         | Equiet wrote:
         | Thanks! The difference between Deepnote and MyBinder is that we
         | keep the pool of Docker images as small as possible. That means
         | they are always in cache. You can still write your own
         | Dockerfile, but they are layered on our base image. MyBinder
         | has a lot of work that needs to be done (pulling the image,
         | sometimes building it, etc) which we thankfully mostly avoided.
         | 
         | Regarding the lock-in, it's in our best interest to remain
         | fully compatible. So yes, there'll always be a way to
         | export your project and run it in plain Jupyter. The hope is
         | the more advanced features (comments, output visualizations,
         | different cell types) will appear in Jupyter over time as well,
         | but it's also up to Jupyter whether they want those features.
        
       | joebo wrote:
       | Looks good. We use CoCalc for similar collaboration benefits.
       | There is a self-hosted option which was important to us. CoCalc
       | has been a game changer as we've all moved remote. Once Deepnote
       | adds the self-hosted / cloud option (I see it coming soon) we'll
       | check it out.
        
       | lowdanie wrote:
       | Looks very useful!
        
       | jsty wrote:
       | Quick FYI - clicking the NavBar's "pricing" link on the "about"
       | page links to "/about#pricing" rather than "/#pricing"
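
       One plausible cause of this report, offered only as a guess about
       the site's markup, is a fragment-only href ("#pricing") in the
       nav bar, which a browser resolves against the current page. The
       sketch below reproduces that resolution with Python's standard
       `urllib.parse.urljoin`; the "about" page URL is an assumption.

```python
from urllib.parse import urljoin

ABOUT_PAGE = "https://deepnote.com/about"  # assumed URL of the "about" page

# A fragment-only link stays on the current page.
broken = urljoin(ABOUT_PAGE, "#pricing")
# A root-relative link points at the landing page section instead.
fixed = urljoin(ABOUT_PAGE, "/#pricing")

print(broken)  # prints https://deepnote.com/about#pricing
print(fixed)   # prints https://deepnote.com/#pricing
```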
        
         | Equiet wrote:
         | Thanks, fixing
        
         | [deleted]
        
       | zopper wrote:
       | Loving Deepnote so far. Being able to seamlessly collaborate with
       | others during the pandemic is a lifesaver!
       | 
       | It would be great if you had automatic versioning similar to
       | Google Docs, using Git with notebooks is a nightmare.
        
         | Equiet wrote:
         | Thank you! Actually I'm working on that right now.
        
       | the21st wrote:
       | Deepnote team member here. It took a lot of effort to get to
       | where we are right now - Deepnote is one of the most complex
       | products I've worked on. If you have any questions, engineering,
       | product or otherwise, ask away!
        
         | ZephyrBlu wrote:
         | What makes it so complex compared to other products you've
         | worked on?
        
         | lhnz wrote:
         | Super cool product.
         | 
         | Do you hire remote engineers? I'm London based.
        
           | anna_sco wrote:
           | Thanks! We are open to that, get in touch at
           | work@deepnote.com and we'll take it from there.
        
       | beisner wrote:
       | Looks interesting, there are definitely a lot of pain points when
       | using Jupyter notebooks for more complex explanation. One
       | thing I would love to see in a Jupyter notebook, which some
       | of the various deep learning experimentation startups (wandb,
       | etc) have, is a local visual sink for time series data. For
       | instance it would be great to be able to dynamically plot (maybe
       | in the left or right margin of the window) loss over time for
       | multiple runs, maybe with some dynamic ability to group graphs.
       | 
       | I could envision hooking up the outputs of multiple executions of
       | the same (or different) notebooks to these visualizations.
       | 
       | You can kinda get something like this with matplotlib or plotly
       | but it has always felt like something was missing.
        
         | Equiet wrote:
         | Agreed. The way I see it, it's not a core competency of a
         | notebook though (it's important to keep the medium itself
         | versatile enough), but of an extension. As you mentioned, there
         | are some startups working exactly on this (and thus very likely
         | doing a much better job than a notebook would), so it's probably
         | a problem of a missing link (like a nice API for UI extension).
         | 
         | Speaking on behalf of Deepnote, there's no such API yet, but
         | it's something that I'd definitely like to see and build.
        
       | prashp wrote:
       | How does debugging work in Deepnote vs Jupyter?
        
         | Equiet wrote:
         | We are using the same kernels as Jupyter, so features like
         | debugging work out of the box. However, we don't have an
         | interface for visual debugging yet.
        
       | Reebz wrote:
       | Congratulations on the launch! Do you see Gradient (paperspace)
       | as a competitor? How do you compare?
        
         | Equiet wrote:
         | Paperspace is doing a great job providing infrastructure for
         | data science workloads and mlops. The target users are data
         | scientists/engineers. The ability to share with non-technical
         | users is quite limited.
         | 
         | We built Deepnote so that the work you do as a data scientist
         | can be shared with both engineers and non-technical folks.
         | We're not really an mlops platform. We make a really good
         | notebook that integrates with other platforms.
        
           | Reebz wrote:
           | Thanks for the reply. To clarify, I was referring to their
           | Gradient Notebook specifically [1], which seems to have
           | feature parity and has the additional benefit of vertical
           | integration.
           | 
           | [1] https://gradient.paperspace.com/notebooks
        
             | Equiet wrote:
             | Got it. I can't speak to Gradient's roadmap, but as of
             | right now they are using Jupyter as a notebook and focusing
             | on infrastructure around it. We are innovating on the
             | notebook itself.
             | 
             | A huge part of it is simply the UX. There's a wide range of
             | what kind of work a data scientist does. Some train models
             | that go into production, some analyze the datasets and
             | build reports. Probably best to try both products with your
             | workload and see what works better.
        
       | taigi100 wrote:
       | Been using this for 1-2 weeks now. Compared it with pretty much
       | all the other options I found out there. The experience is just
       | so much better.
       | 
       | I fully recommend you try it - it's awesome.
       | 
       | All that's left on my wish list is mainly dark mode and maybe a
       | cheaper alternative to a more powerful GPU / something along those
       | lines. Though, with the long-running tasks I don't really mind.
       | 
       | Great job and congrats on launching!
        
       | bravura wrote:
       | Once you add GPU support, how do you differentiate with respect
       | to paperspace and floydhub?
        
       | chrisaycock wrote:
       | The Jupyter hosting space is getting crowded. Even with
       | collaboration and versioning, there's Saturn Cloud and CoCalc.
       | How does Deepnote plan on differentiating?
        
         | Equiet wrote:
         | Well, we are not a Jupyter hosting service. There's definitely
         | a lot of work being put into embedding Jupyter into data
         | science platforms (mostly putting Jupyter into an iframe). But
         | at the end of the day, there are limitations to this approach
         | so some things won't work that well.
        
       ___________________________________________________________________
       (page generated 2020-10-30 23:00 UTC)