hngopher.com

       [HN Gopher] Nbdev: A literate programming environment that democ...
       ___________________________________________________________________
        
       Nbdev: A literate programming environment that democratizes best
       practices
        
       Author : pbowyer
       Score  : 124 points
       Date   : 2020-11-20 17:40 UTC (5 hours ago)
        
 (HTM) web link (github.blog)
 (TXT) w3m dump (github.blog)
        
       | polyrand wrote:
       | I used nbdev when it was first released. Some things must have
       | improved since then, but I was already amazed by the experience.
       | I think having code + docs + tests in the same document makes a
       | huge difference in the effort needed to get those 3 done
       | properly.
        
       | zomglings wrote:
       | nbdev looks promising. I'm wondering if it solves what I see as
       | the biggest pain of using notebooks when I'm doing data science
       | work.
       | 
       | When a notebook gets large, it can be difficult to keep track of
       | dependencies between cells, especially in workflows where you
       | have to run cells n_1, n_2, ..., n_k before running cell n.
       | 
       | I try to organize my cells so that if I run them from first to
       | last, all dependencies are covered (e.g. "Restart kernel and run
       | all cells").
       | 
       | Unfortunately, this doesn't help when I discover a bug in cell
       | n_2 and don't want to run ALL cells n_2 + 1, ..., n-1, n because
       | some of them carry out expensive operations.
       | 
       | When working in my editor, the way I resolve this is to make a
       | light CLI wrapper around my program (if __name__ == "__main__":
       | import argparse; ...) and my CLI commands encode all this
       | dependency information.
       | 
       | Is it possible to get this kind of experience in a Jupyter
       | notebook without building a custom plugin (I think a frontend
       | plugin would suffice)?
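       The CLI-wrapper idea described above can be sketched as follows.
       This is an illustration under assumptions, not zomglings's actual
       code: the step names (load, clean, features, train), the
       DEPENDENCIES table, and the helpers resolve, downstream, and main
       are all hypothetical. Each step declares the steps it needs, and
       the wrapper works out what has to run.

```python
import argparse

# Hypothetical pipeline: each step maps to the steps it depends on.
DEPENDENCIES = {
    "load": [],
    "clean": ["load"],
    "features": ["clean"],
    "train": ["features"],
}


def resolve(step, deps=DEPENDENCIES, seen=None):
    """Return `step` preceded by everything it depends on, in run order."""
    if seen is None:
        seen = []
    for dep in deps[step]:
        resolve(dep, deps, seen)
    if step not in seen:
        seen.append(step)
    return seen


def downstream(step, deps=DEPENDENCIES):
    """Return the steps that would need rerunning after `step` changes."""
    return [s for s in deps if s != step and step in resolve(s, deps)]


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Run one pipeline step and its prerequisites."
    )
    parser.add_argument("step", choices=sorted(DEPENDENCIES))
    args = parser.parse_args(argv)
    order = resolve(args.step)
    for name in order:
        print(f"running {name}")  # placeholder for the real work
    return order
```

       For example, main(["clean"]) runs load and then clean, while
       downstream("clean") lists features and train as the steps to
       rerun after fixing clean. Calling main() under
       if __name__ == "__main__": would turn the file into a
       command-line entry point. The sketch does no cycle detection and
       does not cache the results of expensive steps.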
        
       | tmabraham wrote:
       | I really enjoy using nbdev. I am a fastai user but also a
       | researcher, and I have been using nbdev for my research projects.
       | Because everything is in Jupyter Notebooks, it's much easier to
       | document your work as you are working. Nbdev will build docs and
       | run tests based on the jupyter notebooks and everything is quite
       | flexible!
       | 
       | Here is a small project of mine highlighting some of the
       | capabilities of nbdev: https://github.com/tmabraham/UPIT
        
       | nestorD wrote:
       | I have used Nbdev and I am not a fan. It creates friction when
       | one wants to contribute (in my case, to fast.ai) and forces you
       | to write your code with notebooks which is the point but also not
       | great when you are writing code rather than producing a display
       | mixing text and pictures. Plus, while notebooks should favour
       | documentation in theory, you can also end up with notebooks full
       | of blobs of code with transition text that does not help you
       | understand what's going on.
       | 
       | Case in point, here is a random notebook from the fastai
       | repository, a python file would be simpler to read and shorter:
       | https://github.com/fastai/fastai/blob/master/nbs/09b_vision....
        
       | themusicgod1 wrote:
       | Sorry, using NSA/Microsoft Github is not a "best practice". This
       | project should be dead in the water if that's their starting
       | point.
        
         | hpeinar wrote:
         | Willing to be more specific with a comment like this?
        
           | j0k3r_85 wrote:
           | While the grandparent's tone is not appropriate and has been
           | duly downvoted, they do point out that nbdev is currently
           | closely tied to pushing code and docs to GitHub. This
           | is something which threw me at first but isn't a requirement.
           | You can set it up to work only locally.
           | 
           | The GitHub flavor is likely just because that is what the
           | author was familiar with and what they were using.
           | 
           | If there are enough people interested we could get together
           | to make PRs to add other remote version control systems and
           | other static site hosts. I know an integration into the
           | Atlassian world would really help me at work as that's my
           | employer's chosen code repo and doc manager.
        
       | fulafel wrote:
       | So you can put these in a GH action to ensure they are in working
       | order across deps / dataset updates etc? Seems like exactly the
       | missing piece for using notebooks for serious work.
        
       | trash3 wrote:
       | Has anyone from a SE background used this? I come from data
       | science so I and peers use Jupyter notebooks for everything. I've
       | never used a proper IDE so I wouldn't have anything to compare it
       | to. But I need to start wrangling in people's notebooks and am
       | hoping nbdev does the trick.
        
         | AlexCoventry wrote:
         | I've done both. I haven't used nbdev, but it looks like it
         | addresses many of the drawbacks notebooks have, from a
         | software-development perspective.
        
         | Jugurtha wrote:
         | I will use it soon. We want to support fast.ai[0] courses on
         | our machine learning platform[1], and wanted a way to easily
         | test the notebooks. I asked one of the people behind fast.ai,
         | and they told me they use nbdev to test their notebooks.
         | 
         | - [0]: https://www.fast.ai/
         | 
         | - [1]: https://iko.ai
        
       | Avaclon wrote:
       | I'm wondering how one would incorporate important practices such
       | as TDD into this development methodology.
        
       | rwhaling wrote:
       | This is really exciting! I've been building a data engineering
       | practice around jupyter notebooks, Netflix's papermill, and k8s
       | cronjobs for scheduling, and it's been great...except for code
       | review, weird dependency/virtualenv glitches, tests, and
       | documentation.
       | 
       | At first glance, this seems like it would address all of my pain
       | points? Will be interesting to try it out.
        
       | kowlo wrote:
       | The author writes
       | 
       | > we decided to assist fastai in their development of a new,
       | literate programming environment for Python, called nbdev.
       | 
       | but this is followed by:
       | 
       | > nbdev builds on top of Jupyter notebooks to fill these gaps and
       | provides the following features
       | 
       | is it a new environment, or is it an extended Jupyter Notebook?
       | It looks like Jupyter Notebook to me. Why not Jupyter Lab?
       | 
       | > JupyterLab: Jupyter's Next-Generation Notebook Interface
       | https://jupyter.org
        
         | rwhaling wrote:
         | for background, the original author of nbdev has a good post
         | outlining exactly how it relates to jupyter -
         | 
         | https://www.fast.ai/2019/12/02/nbdev/
        
           | kowlo wrote:
           | Strange... "Nbdev is a system for something that we call
           | exploratory programming." but no citation... the phrasing
           | suggests this is something they believe they've come up with
           | a name for?
        
             | tmabraham wrote:
             | The term has a wikipedia page:
             | https://en.wikipedia.org/wiki/Exploratory_programming
        
               | kowlo wrote:
               | I'm aware, what I'm confused about is why they've phrased
               | it as if it's something _they_ call exploratory
               | programming, as if they've coined the term.
        
               | jph00 wrote:
               | I coined the term - and it turns out someone else did
               | too, for something else. So be it. If someone else can
               | think of a better term that's never been used before,
               | then I'll happily use that instead.
               | 
               | The earlier usage mentioned in Wikipedia is entirely
               | uncited there however, and seems to have only been used
               | in one academic project AFAICT.
        
               | kowlo wrote:
               | How are you sure you didn't just read it somewhere and
               | then forget? It has been mentioned many times in related
               | literature (even in publication titles):
               | https://scholar.google.com/scholar?q=%22exploratory+programm...
               | 
               | Here's one from 1988
               | https://dl.acm.org/doi/abs/10.1145/51607.51614
               | 
               | Are these unrelated? Is Nbdev not only a "new programming
               | environment", but also a new concept that needs a new
               | name?
        
               | tlarkworthy wrote:
               | "In some cases the estimates may be obvious. Perhaps the
               | story is similar to others that have already been
               | completed. In other cases the story may be very difficult
               | to estimate and may require exploratory programming."
               | 
               | Kent Beck and Martin Fowler
               | 
               | http://index-of.es/Java/Planning%20Extreme%20Programming.pdf
               | 
               | It's a commonly understood term AFAIK
        
               | kowlo wrote:
               | > It's a commonly understood term AFAIK
               | 
               | I agree - it's what made me raise an eyebrow... However,
               | based on their comment above, the lead author of Nbdev
               | believes they coined the term.
               | 
               | My opinion is that due diligence and attribution are
               | important. If I believed I'd coined a new term, I'd check
               | first. Mistakes are easy to make, but when highlighted,
               | perhaps corrections are more appropriate than negotiating
               | with the person highlighting them:
               | 
               | From the lead author (jph00): _... If someone else can
               | think of a better term that's never been used before,
               | then I'll happily use that instead._
        
         | hansvm wrote:
         | Guessing at plausible answers:
         | 
         | - Nbdev started right around the 1.0 release of jupyterlab, and
         | it might not have been on their radar
         | 
         | - Nbdev came out of fast.ai, and I wouldn't be surprised if
         | they were using a ton of jupyter-specific features already
         | which weren't supported by jupyterlab
        
           | jph00 wrote:
           | It works fine in lab too, although I prefer using notebooks
           | on the whole.
           | 
           | (I'm the lead author of nbdev).
        
             | hansvm wrote:
             | Awesome! Thanks for chiming in :)
        
         | madenine wrote:
         | >is it a new environment, or is it an extended Jupyter
         | Notebook? It looks like Jupyter Notebook to me. Why not Jupyter
         | Lab?
         | 
         | Neither? It doesn't change the features of jupyter notebooks,
         | and it's not an improved/expanded UI like jupyter lab (you could
         | use nbdev with jupyter lab). It's utilities and automation to
         | make package/library development a better experience if jupyter
         | is where you write your code.
         | 
         | From https://github.com/fastai/nbdev:
         | 
         | "nbdev is a library that allows you to develop a python library
         | in Jupyter Notebooks, putting all your code, tests and
         | documentation in one place."
        
           | kowlo wrote:
           | Sure, but the OP link that I'm commenting on says it's a "new
           | literate programming environment". Based on what you're
           | quoting, the OP article is incorrect and needs correcting?
        
       | pmdulaney wrote:
       | I love democracy, but somehow there is something cloying about
       | the use of "democratize" in this context.
        
         | justnotworthit wrote:
         | It suggests you've stolen fire from the gods (or the arcane
         | halls of wizards/academics) and fed the starving masses.
         | 
         | Pretty good for an ide.
        
         | ipsum2 wrote:
         | For some strange reason, everyone in the ML community refers to
         | 'increasing adoption' as 'democratizing'. It's my pet peeve.
        
           | [deleted]
        
           | nvrspyx wrote:
           | I mean, increased adoption is a result of democratization
           | (e.g. more accessible). I think the usage here is fine
           | because Jupyter notebooks are definitely more accessible to
           | those new to programming than a typical Python environment
           | and this expands Jupyter to be used for more development
           | purposes, like building libraries.
        
           | jph00 wrote:
           | There are two lead definitions for "democratize" in the
           | Oxford English Dictionary. One of them is:
           | 
           | "make (something) accessible to everyone"
           | 
           | So the usage here is entirely consistent with standard
           | English usage. It is also consistent with the French
           | etymology (democratiser), which has as a dictionary
           | definition "Rendre democratique, populaire" (i.e. to make
           | popular).
        
             | justnotworthit wrote:
             | If I let everyone on my street borrow my bike, can I say
             | "I've democratized my bike"?
        
             | kubanczyk wrote:
             | I think that definition is a part of parent's pet peeve.
             | 
             |  _demo-_ (people) _-kratia_ (rule) has only indirect
             | relation to popularity.
        
           | hobofan wrote:
           | Increased adoption in ML comes with more open implementations
           | and more freedom, which in contrast to FAANG being the only
           | ones to employ advanced ML can indeed be seen as a form of
           | democratization.
        
           | minimaxir wrote:
           | "Democratizing" makes more sense when the industry evolved
           | from AI/ML frameworks which required a Ph.D to use to
           | allowing anyone to train a model (e.g. Theano -> TensorFlow
           | -> Keras/fast.ai), and allowing people to train models on
           | GPUs without a grant-funded supercomputer cluster, for very
           | little cost (spot/preemptible cloud GPUs, Google Colab)
           | 
           | In this case, I agree it's not equitable.
        
           | Asooka wrote:
           | Yeah, my first guess was that the tool allowed everybody to
           | have an input on what best practices should look like and
           | enable distributing the final consensus to everybody. So if
           | today best practice is 4 space indent, this would be the
           | default, but if enough people changed it to one tab, that
           | would change everybody's default and reformat their code to
           | conform with the new best practice.
        
         | contravariant wrote:
         | I'm still struggling to find the link with democracy. Sure you
         | can't have democracy if the source code is inscrutable but
         | that's a somewhat tenuous link.
         | 
         | The word 'democratize' here doesn't seem to add any meaning
         | that 'literate programming' doesn't already cover.
        
       | davidkell wrote:
       | IMO no one has done more to make deep learning accessible than
       | Jeremy + fast.ai team. Thanks for the amazing work!
       | 
       | My question is about the coding style - @jph00 I've read your
       | fast.ai style guide and worked with APLs like q/KDB (written by
       | Arthur Whitney who you cite).
       | 
       | My experience is that brevity is great, until you need to
       | collaborate or have individuals working on small parts. That was
       | my experience as well trying to write an extension to the fast.ai
       | code (where I had to read large amounts of source to understand
       | how to implement a small change).
       | 
       | Given that a key motivator for literate programming is
       | collaboration/communication, how do you think about this?
        
       | konjin wrote:
       | This is not literate programming.
       | 
       | Literate programming involves having a meta language that is
       | extended into the target source code through nested macros.
       | 
       | The killer feature was the ability to see everywhere a piece of
       | code was used on dead paper by looking at the auto-generated
       | index, with the chunks being _logical_ rather than language
       | driven. In a literate program you wouldn't care that something
       | was a class or a function, you would just have it be described by
       | what it does, not how it does it.
       | 
       | This is marginally better documentation for python notebooks.
        
         | musingsole wrote:
         | No true scotsman.
         | 
         | It's more literate programming than not and chasing the promise
         | of the ideal literate programming environment is what created
         | the novel notebook environment in the first place.
        
           | konjin wrote:
           | 1). Notebooks were invented by Mathematica in the 80s.
           | 
           | 2). Words have meanings and literate programming is defined
           | extremely well by Knuth in his 1983 paper. This is _not_ what
           | was described there any more than the WWW is Xanadu.
           | 
           | 3). That is not what No True Scotsman means.
        
             | contravariant wrote:
             | > That is not what No True Scotsman means.
             | 
             | Setting aside whether you are or are not right, can we just
             | appreciate the irony in that statement for a moment?
        
       | st1x7 wrote:
       | Sigh. I really think that the only reason notebooks are so
       | popular is that Python never had a popular IDE with a high-
       | quality REPL where people can learn to work interactively while
       | writing code in plain text.
       | 
       | R illustrates this very well - as an R user you can have your
       | pick between RStudio, Jupyter and Rmarkdown and the overwhelming
       | majority of users pick RStudio and notebooks are reserved only
       | for a niche set of use cases. It also speaks volumes that almost
       | no one writes R in Jupyter even though it's supported very well -
       | R users just have better options available to them.
        
         | minimaxir wrote:
         | You bring up RStudio but not the R Notebooks which it supports
         | natively (https://bookdown.org/yihui/rmarkdown/notebook.html),
         | which IMO is a far-superior way of handling notebooks than
         | Jupyter. (namely, the files are plain text so you can actually
         | commit to Git without fuss)
         | 
         | I wrote a detailed blog post about the differences between
         | Jupyter and R Notebooks years ago:
         | https://minimaxir.com/2017/06/r-notebooks/
        
           | st1x7 wrote:
           | Yes, what I meant by Rmarkdown is these notebooks.
        
           | davidkell wrote:
           | You can do the same thing with the jupytext extension. But
           | sometimes it is helpful to have the rendered results in
           | version control, eg internally we use it to discuss data
           | science findings on Gitlab.
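           As a rough illustration of the plain-text representation, here
           is what a small notebook might look like in jupytext's
           "percent" format, where "# %%" starts a code cell and
           "# %% [markdown]" starts a Markdown cell. The notebook
           contents and variable names are made up for this sketch.

```python
# %% [markdown]
# # Order totals
# A made-up example notebook stored as plain text.

# %%
# Hypothetical data for the sketch.
orders = [12.5, 7.5, 30.0]

# %%
total = sum(orders)
print(f"total: {total}")
```

           The file is also an ordinary Python script, so it diffs
           cleanly in Git, but it carries no rendered outputs, which is
           the trade-off mentioned above.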
        
         | fulafel wrote:
         | This is an interesting view in light of the fact that Jupyter
         | is actually a direct evolution of the most popular bells-and-
         | whistles REPL in Python land (IPython) - .ipynb files were just
         | saved IPython REPL sessions.
         | 
         | (try "sudo apt-get install ipython && ipython" on your
         | Ubuntu/Debian system to try it out)
        
         | slightwinder wrote:
         | > Sigh. I really think that the only reason notebooks are so
         | popular is that Python never had a popular IDE with a high-
         | quality REPL where people can learn to work interactively while
         | writing code in plain text.
         | 
         | I would not call Idle unpopular. But it had a limit for growing
         | for sure.
         | 
         | But Notebooks are mostly coming from the science-corner of
         | python. People there used notebook-like tools and workflows for
         | decades and some brought that over to ipython-project. I
         | remember 15(?) years ago when the project started focusing more
         | and more on the cluster-aspect of their shell, they brought up
         | many different tools for this. One of them was notebook-like
         | and what later became Jupyter. It quickly became popular in
         | certain groups for those reasons.
        
         | jph00 wrote:
         | No that's not the reason - or at least, not for everyone. I
         | created nbdev. I've been coding for over 30 years, and spent
         | over 10 years using R and S-PLUS. I've used Delphi, Visual
         | Studio, Emacs, vim, vscode, and many other editors and IDEs,
         | including many with integrated line-oriented REPLs.
         | 
         | There's a big difference between a line-oriented plain text
         | REPL, and a Mathematica/Jupyter-style notebook REPL, especially
         | when you want to mix and match your charts, image outputs, rich
         | table outputs, interactive JS outputs, and so forth. Also, for
         | experimentation, where you want to go back and change things to
         | see what happens (e.g. very common in data science) I find it
         | much easier and more understandable in a notebook.
         | 
         | I have a video where I show the difference between these styles
         | of working in some detail:
         | https://www.youtube.com/watch?v=9Q6sLbz37gk
        
           | sbelskie wrote:
           | Yea, I don't really get what repls have to do with what
           | notebooks offer. They are similar-ish, obviously, but they
           | accommodate different workflows and use cases.
        
             | fulafel wrote:
             | People mean different things by REPL - the nicer Lisps had
             | richer reader prompts that were not totally text based,
             | could show you graphical stuff and accept commands other
             | than just source code, have interactive features etc - see
             | eg
             | https://upload.wikimedia.org/wikipedia/commons/c/c6/Listener...
             | or some youtube videos of Lisp machines.
        
             | st1x7 wrote:
             | That's getting to my point though - they _should_
             | accommodate different workflows and use cases. But what's
             | happening instead is that people are overusing notebooks in
             | cases where plain text + REPL is more appropriate.
        
               | sbelskie wrote:
               | That makes sense as a possible situation. I'm just not
               | sure that it's one that's familiar to me. I can't
               | honestly say I have my finger on the pulse of where
               | people are using notebooks vs repls but the former seem
               | really great for 1) step by step examples 2) scripts-in-
               | progress where certain steps are more in flux than
               | others.
               | 
               | Though, I think I get what you mean as I reflect on my
               | own dev experience. As a mainly C# dev which has a very
               | very limited repl experience (I would and should say no
               | repl experience but someone will yell at me about csi.exe
               | or dotnet-script) I have seen people using notebooks for
               | want of a good repl, but I'm curious why anyone writing
               | python would.
        
       | zimpenfish wrote:
       | Have only had a quick look at the examples and docs but there
       | doesn't seem to be any support for reordered chunks? (ie. there's
       | no weaving involved.)
        
       | erikgaas wrote:
       | So I use this in production at my company. It's an awesome tool.
       | Personally when I'm coding in python I like to prototype in
       | jupyter, copy code over, and then reimport anyway. Nbdev
       | streamlines everything so I can write docs, tests, and code all
       | in one place. And since the docs are just a jekyll site I can
       | copy it to our documentation aws bucket in continuous
       | integration. And with one command I can run all the notebook
       | tests in CI as well.
       | 
       | The packaging is also really well thought out. I don't have to
       | stress out about connecting setup.py with whatever publishing
       | system we have. The settings.ini makes things sane and I can bump
       | the version whenever I want.
       | 
       | I get a lot of skeptical looks when I say the source code is in
       | notebooks, but that's just syntactic sugar for the raw source
       | code. You still get to edit the raw code files and with one
       | command sync everything with the notebooks. From my point of view
       | it is close to a pareto improvement over traditional python
       | library development.
        
         | mloncode wrote:
         | Really interesting! Do you mind sharing what your company is?
         | (I am the author of the blog post)
        
       ___________________________________________________________________
       (page generated 2020-11-20 23:00 UTC)