[HN Gopher] Nbdev: A literate programming environment that democ...
___________________________________________________________________
Nbdev: A literate programming environment that democratizes best practices

Author : pbowyer
Score  : 124 points
Date   : 2020-11-20 17:40 UTC (5 hours ago)

(HTM) web link (github.blog)
(TXT) w3m dump (github.blog)

| polyrand wrote:
| I used nbdev when it was first released. Some things must have improved since then, but I was already amazed by the experience. I think having code + docs + tests in the same document makes a huge difference in the effort needed to get those 3 done properly.
| zomglings wrote:
| nbdev looks promising. I'm wondering if it solves what I see as the biggest pain of using notebooks when I'm doing data science work.
|
| When a notebook gets large, it can be difficult to keep track of dependencies between cells, especially for workflows in which you have to run cells n_1, n_2, ..., n_k before running cell n.
|
| I try to organize my cells so that if I run them from first to last, all dependencies are covered (e.g. "Restart kernel and run all cells").
|
| Unfortunately, this doesn't help when I discover a bug in cell n_2 and don't want to run ALL cells n_2 + 1, ..., n-1, n because some of them carry out expensive operations.
|
| When working in my editor, the way I resolve this is to make a light CLI wrapper around my program (if __name__ == "__main__": import argparse; ...) and my CLI commands encode all this dependency information.
|
| Is it possible to get this kind of experience in a Jupyter notebook without building a custom plugin (I think a frontend plugin would suffice)?
| tmabraham wrote:
| I really enjoy using nbdev. I am a fastai user but also a researcher, and I have been using nbdev for my research projects. Because everything is in Jupyter Notebooks, it's much easier to document your work as you are working.
| Nbdev will build docs and run tests based on the Jupyter notebooks and everything is quite flexible!
|
| Here is a small project of mine highlighting some of the capabilities of nbdev: https://github.com/tmabraham/UPIT
| nestorD wrote:
| I have used Nbdev and I am not a fan. It creates friction when one wants to contribute (in my case, to fast.ai) and forces you to write your code in notebooks, which is the point, but is also not great when you are writing code rather than producing a display mixing text and pictures. Plus, while notebooks should favour documentation in theory, you can also end up with notebooks full of blobs of code with transition text that does not help you understand what's going on.
|
| Case in point, here is a random notebook from the fastai repository; a Python file would be simpler to read and shorter: https://github.com/fastai/fastai/blob/master/nbs/09b_vision....
| themusicgod1 wrote:
| Sorry, using NSA/Microsoft Github is not a "best practice". This project should be dead in the water if that's their starting point.
| hpeinar wrote:
| Willing to be more specific with a comment like this?
| j0k3r_85 wrote:
| While the grandparent's tone is not appropriate and has been duly downvoted, they do point out that nbdev is currently closely tied to pushing code and docs to GitHub. This is something which threw me at first but isn't a requirement. You can set it up to work only locally.
|
| The GitHub flavor is likely just because that is what the author was familiar with and what they were using.
|
| If there are enough people interested we could get together to make PRs to add other remote version control systems and other static site hosts. I know an integration into the Atlassian world would really help me at work as that's my employer's chosen code repo and doc manager.
| fulafel wrote:
| So you can put these in a GH action to ensure they are in working order across deps / dataset updates etc? Seems like exactly the missing piece for using notebooks for serious work.
| trash3 wrote:
| Has anyone from a SE background used this? I come from data science so I and peers use Jupyter notebooks for everything. I've never used a proper IDE so I wouldn't have anything to compare it to. But I need to start wrangling in people's notebooks and am hoping nbdev does the trick.
| AlexCoventry wrote:
| I've done both. I haven't used nbdev, but it looks like it addresses many of the drawbacks notebooks have, from a software-development perspective.
| Jugurtha wrote:
| I will use it soon. We want to support fast.ai[0] courses on our machine learning platform[1], and wanted a way to easily test the notebooks. I asked one of the people behind fast.ai, and they told me they use nbdev to test their notebooks.
|
| - [0]: https://www.fast.ai/
|
| - [1]: https://iko.ai
| Avaclon wrote:
| I'm wondering how one would incorporate important practices such as TDD into this development methodology.
| rwhaling wrote:
| This is really exciting! I've been building a data engineering practice around Jupyter notebooks, Netflix's papermill, and k8s cronjobs for scheduling, and it's been great...except for code review, weird dependency/virtualenv glitches, tests, and documentation.
|
| At first glance, this seems like it would address all of my pain points? Will be interesting to try it out.
| kowlo wrote:
| The author writes
|
| > we decided to assist fastai in their development of a new, literate programming environment for Python, called nbdev.
|
| but this is followed by:
|
| > nbdev builds on top of Jupyter notebooks to fill these gaps and provides the following features
|
| Is it a new environment, or is it an extended Jupyter Notebook? It looks like Jupyter Notebook to me. Why not Jupyter Lab?
|
| > JupyterLab: Jupyter's Next-Generation Notebook Interface https://jupyter.org
| rwhaling wrote:
| for background, the original author of nbdev has a good post outlining exactly how it relates to jupyter -
|
| https://www.fast.ai/2019/12/02/nbdev/
| kowlo wrote:
| Strange... "Nbdev is a system for something that we call exploratory programming." but no citation... the phrasing suggests this is something they believe they've come up with a name for?
| tmabraham wrote:
| The term has a wikipedia page: https://en.wikipedia.org/wiki/Exploratory_programming
| kowlo wrote:
| I'm aware, what I'm confused about is why they've phrased it as if it's something _they_ call exploratory programming, as if they've coined the term.
| jph00 wrote:
| I coined the term - and it turns out someone else did too, for something else. So be it. If someone else can think of a better term that's never been used before, then I'll happily use that instead.
|
| The earlier usage mentioned in Wikipedia is entirely uncited there however, and seems to have only been used in one academic project AFAICT.
| kowlo wrote:
| How are you sure you didn't just read it somewhere and then forgot? It has been mentioned many times in related literature (even in publication titles) https://scholar.google.com/scholar?q=%22exploratory+programm...
|
| Here's one from 1988 https://dl.acm.org/doi/abs/10.1145/51607.51614
|
| Are these unrelated? Is Nbdev not only a "new programming environment", but also a new concept that needs a new name?
| tlarkworthy wrote:
| "In some cases the estimates may be obvious. Perhaps the story is similar to others that have already been completed. In other cases the story may be very difficult to estimate and may require exploratory programming."
|
| Kent Beck and Martin Fowler
|
| http://index-of.es/Java/Planning%20Extreme%20Programming.pdf
|
| It's a commonly understood term AFAIK
| kowlo wrote:
| > It's a commonly understood term AFAIK
|
| I agree - it's what made me raise an eyebrow... However, based on their comment above, the lead author of Nbdev believes they coined the term.
|
| My opinion is that due diligence and attribution are important. If I believed I'd coined a new term, I'd check first. Mistakes are easy to make, but when highlighted, perhaps corrections are more appropriate than negotiating with the person highlighting them:
|
| From the lead author (jph00): _... If someone else can think of a better term that's never been used before, then I'll happily use that instead._
| hansvm wrote:
| Guessing at plausible answers:
|
| - Nbdev started right around the 1.0 release of jupyterlab, and it might not have been on their radar
|
| - Nbdev came out of fast.ai, and I wouldn't be surprised if they were using a ton of jupyter-specific features already which weren't supported by jupyterlab
| jph00 wrote:
| It works fine in lab too, although I prefer using notebooks on the whole.
|
| (I'm the lead author of nbdev).
| hansvm wrote:
| Awesome! Thanks for chiming in :)
| madenine wrote:
| > is it a new environment, or is it an extended Jupyter Notebook? It looks like Jupyter Notebook to me. Why not Jupyter Lab?
|
| Neither? It doesn't change the features of jupyter notebooks, and it's not an improved/expanded UI like jupyter lab (you could use nbdev with jupyter lab). It's utilities and automation to make package/library development a better experience if jupyter is where you write your code.
|
| From https://github.com/fastai/nbdev:
|
| "nbdev is a library that allows you to develop a python library in Jupyter Notebooks, putting all your code, tests and documentation in one place."
| kowlo wrote:
| Sure, but the OP link that I'm commenting on says it's a "new literate programming environment". Based on what you're quoting, the OP article is incorrect and needs correcting?
| pmdulaney wrote:
| I love democracy, but somehow there is something cloying about the use of "democratize" in this context.
| justnotworthit wrote:
| It suggests you've stolen fire from the gods (or the arcane halls of wizards/academics) and fed the starving masses.
|
| Pretty good for an IDE.
| ipsum2 wrote:
| For some strange reason, everyone in the ML community refers to 'increasing adoption' as 'democratizing'. It's my pet peeve.
| [deleted]
| nvrspyx wrote:
| I mean, increased adoption is a result of democratization (e.g. more accessible). I think the usage here is fine because Jupyter notebooks are definitely more accessible to those new to programming than a typical Python environment and this expands Jupyter to be used for more development purposes, like building libraries.
| jph00 wrote:
| There are two lead definitions for "democratize" in the Oxford English Dictionary. One of them is:
|
| "make (something) accessible to everyone"
|
| So the usage here is entirely consistent with standard English usage. It is also consistent with the French etymology (démocratiser), which has as a dictionary definition "Rendre démocratique, populaire" (i.e. to make popular).
| justnotworthit wrote:
| If I let everyone on my street borrow my bike, can I say "I've democratized my bike"?
| kubanczyk wrote:
| I think that definition is a part of parent's pet peeve.
|
| _demo-_ (people) _-kratia_ (rule) has only indirect relation to popularity.
| hobofan wrote:
| Increased adoption in ML comes with more open implementations and more freedom, which in contrast to FAANG being the only ones to employ advanced ML can indeed be seen as a form of democratization.
| minimaxir wrote:
| "Democratizing" makes more sense when the industry evolved from AI/ML frameworks which required a Ph.D to use to allowing anyone to train a model (e.g. Theano -> TensorFlow -> Keras/fast.ai), and allowing people to train models on GPUs without a grant-funded supercomputer cluster, for very little cost (spot/preemptible cloud GPUs, Google Colab)
|
| In this case, I agree it's not equitable.
| Asooka wrote:
| Yeah, my first guess was that the tool allowed everybody to have an input on what best practices should look like and enable distributing the final consensus to everybody. So if today best practice is 4 space indent, this would be the default, but if enough people changed it to one tab, that would change everybody's default and reformat their code to conform with the new best practice.
| contravariant wrote:
| I'm still struggling to find the link with democracy. Sure you can't have democracy if the source code is inscrutable but that's a somewhat tenuous link.
|
| The word 'democratize' here doesn't seem to add any meaning that 'literate programming' doesn't already cover.
| davidkell wrote:
| IMO no one has done more to make deep learning accessible than Jeremy + fast.ai team. Thanks for the amazing work!
|
| My question is about the coding style - @jph00 I've read your fast.ai style guide and worked with APLs like q/KDB (written by Arthur Whitney who you cite).
|
| My experience is that brevity is great, until you need to collaborate or have individuals working on small parts. That was my experience as well trying to write an extension to the fast.ai code (where I had to read large amounts of source to understand how to implement a small change).
|
| Given that a key motivator for literate programming is collaboration/communication, how do you think about this?
| konjin wrote:
| This is not literate programming.
|
| Literate programming involves having a meta language that is extended into the target source code through nested macros.
|
| The killer feature was the ability to see everywhere a piece of code was used on dead paper by looking at the auto-generated index, with the chunks being _logical_ rather than language driven. In a literate program you wouldn't care that something was a class or a function, you would just have it be described by what it does, not how it does it.
|
| This is marginally better documentation for python notebooks.
| musingsole wrote:
| No true scotsman.
|
| It's more literate programming than not and chasing the promise of the ideal literate programming environment is what created the novel notebook environment in the first place.
| konjin wrote:
| 1). Notebooks were invented by Mathematica in the 80s.
|
| 2). Words have meanings and literate programming is defined extremely well by Knuth in his 1983 paper. This is _not_ what was described there any more than the WWW is Xanadu.
|
| 3). That is not what No True Scotsman means.
| contravariant wrote:
| > That is not what No True Scotsman means.
|
| Setting aside whether you are or are not right, can we just appreciate the irony in that statement for a moment?
| st1x7 wrote:
| Sigh. I really think that the only reason notebooks are so popular is that Python never had a popular IDE with a high-quality REPL where people can learn to work interactively while writing code in plain text.
|
| R illustrates this very well - as an R user you can have your pick between RStudio, Jupyter and Rmarkdown and the overwhelming majority of users pick RStudio and notebooks are reserved only for a niche set of use cases. It also speaks volumes that almost no one writes R in Jupyter even though it's supported very well - R users just have better options available to them.
| minimaxir wrote:
| You bring up RStudio but not the R Notebooks which it supports natively (https://bookdown.org/yihui/rmarkdown/notebook.html), which IMO is a far-superior way of handling notebooks than Jupyter. (namely, the files are plain text so you can actually commit to Git without fuss)
|
| I wrote a detailed blog post about the differences between Jupyter and R Notebooks years ago: https://minimaxir.com/2017/06/r-notebooks/
| st1x7 wrote:
| Yes, what I meant by Rmarkdown is these notebooks.
| davidkell wrote:
| You can do the same thing with the jupytext extension. But sometimes it is helpful to have the rendered results in version control, eg internally we use it to discuss data science findings on Gitlab.
| fulafel wrote:
| This is an interesting view in light of the fact that Jupyter is actually a direct evolution of the most popular bells-and-whistles REPL in Python land (IPython) - .ipynb files were just saved IPython REPL sessions.
|
| (try "sudo apt-get install ipython && ipython" on your Ubuntu/Debian system to try it out)
| slightwinder wrote:
| > Sigh. I really think that the only reason notebooks are so popular is that Python never had a popular IDE with a high-quality REPL where people can learn to work interactively while writing code in plain text.
|
| I would not call Idle unpopular. But it had a limit for growing for sure.
|
| But Notebooks are mostly coming from the science-corner of python. People there used notebook-like tools and workflows for decades and some brought that over to the ipython project. I remember 15(?) years ago when the project started focusing more and more on the cluster-aspect of their shell, they brought up many different tools for this. One of them was notebook-like and what later became Jupyter. It quickly became popular in certain groups for those reasons.
| jph00 wrote:
| No that's not the reason - or at least, not for everyone. I created nbdev.
| I've been coding for over 30 years, and spent over 10 years using R and S-PLUS. I've used Delphi, Visual Studio, Emacs, vim, vscode, and many other editors and IDEs, including many with integrated line-oriented REPLs.
|
| There's a big difference between a line-oriented plain text REPL, and a Mathematica/Jupyter-style notebook REPL, especially when you want to mix and match your charts, image outputs, rich table outputs, interactive JS outputs, and so forth. Also, for experimentation, where you want to go back and change things to see what happens (e.g. very common in data science) I find it much easier and more understandable in a notebook.
|
| I have a video where I show the difference between these styles of working in some detail: https://www.youtube.com/watch?v=9Q6sLbz37gk
| sbelskie wrote:
| Yea, I don't really get what repls have to do with what notebooks offer. They are similar-ish, obviously, but they accommodate different workflows and use cases.
| fulafel wrote:
| People mean different things by REPL - the nicer Lisps had richer reader prompts that were not totally text based, could show you graphical stuff and accept commands other than just source code, have interactive features etc - see eg https://upload.wikimedia.org/wikipedia/commons/c/c6/Listener... or some youtube videos of Lisp machines.
| st1x7 wrote:
| That's getting to my point though - they _should_ accommodate different workflows and use cases. But what's happening instead is that people are overusing notebooks in cases where plain text + REPL is more appropriate.
| sbelskie wrote:
| That makes sense as a possible situation. I'm just not sure that it's one that's familiar to me. I can't honestly say I have my finger on the pulse of where people are using notebooks vs repls but the former seem really great for 1) step by step examples 2) scripts-in-progress where certain steps are more in flux than others.
|
| Though, I think I get what you mean as I reflect on my own dev experience. As a mainly C# dev which has a very very limited repl experience (I would and should say no repl experience but someone will yell at me about csi.exe or dotnet-script) I have seen people using notebooks for want of a good repl, but I'm curious why anyone writing python would.
| zimpenfish wrote:
| Have only had a quick look at the examples and docs but there doesn't seem to be any support for reordered chunks? (ie. there's no weaving involved.)
| erikgaas wrote:
| So I use this in production at my company. It's an awesome tool. Personally when I'm coding in python I like to prototype in jupyter, copy code over, and then reimport anyway. Nbdev streamlines everything so I can write docs, tests, and code all in one place. And since the docs are just a Jekyll site I can copy it to our documentation AWS bucket in continuous integration. And with one command I can run all the notebook tests in CI as well.
|
| The packaging is also really well thought out. I don't have to stress out about connecting setup.py with whatever publishing system we have. The settings.ini makes things sane and I can bump the version whenever I want.
|
| I get a lot of skeptical looks when I say the source code is in notebooks, but that's just syntactic sugar for the raw source code. You still get to edit the raw code files and with one command sync everything with the notebooks. From my point of view it is close to a Pareto improvement over traditional python library development.
| mloncode wrote:
| Really interesting! Do you mind sharing what your company is? (I am the author of the blog post)
___________________________________________________________________
(page generated 2020-11-20 23:00 UTC)