[HN Gopher] Writing a Book with Pandoc, Make, and Vim
       ___________________________________________________________________
        
       Writing a Book with Pandoc, Make, and Vim
        
       Author : halst
       Score  : 349 points
       Date   : 2020-04-13 09:33 UTC (13 hours ago)
        
 (HTM) web link (keleshev.com)
 (TXT) w3m dump (keleshev.com)
        
       | hmcamp wrote:
       | Well done. Thanks for sharing your process! Looking forward to
       | the completed book.
        
       | ggerules wrote:
       | A couple of things that might be of interest:
       | 
       | 1) pandoc is awesome. 2) There are integrated development
       | environments that allow you to write in markdown and output to
       | pdf, html, and word with the flick of a switch. RStudio with
       | knitr, bookdown, and markdown has some nice functionality. Plus
       | you can do graphs and drawings and embed them in the rmd (R
       | markdown) text. 3) There is an earlier post in HN from Gilles
       | Castel on how to speedily write text through the ultisnips
       | package. Very much a game changer on how I use vim to work with
       | anything text related.
       | 
       | https://castel.dev/post/lecture-notes-1/
       | 
       | Nice post!
        
         | lozf wrote:
         | > https://castel.dev/post/lecture-notes-1/
         | 
         | Very nice! Thank you.
        
       | MattBlissett wrote:
       | Here's a result [1] from a system I've put together, primarily
       | using AsciiDoc(tor) and PO4A[3], to allow us to write a source
       | document then translate it into multiple languages. It produces
       | HTML and PDF, but ePUB is an option too.
       | 
       | Using AsciiDoc rather than Markdown has several benefits. The
       | language supports many common book features, especially for
       | technical books, like those "! Warning here" callouts, cited
       | quotes, captioned figures/tables/codeblocks, internal links, I
       | think even an index. It's also a lot more stable; I'm not
       | concerned that there will be significant syntax changes in 5
       | years time. The user manual [2] is the quickest way to see what
       | AsciiDoc can do.
       | 
       | PO4A is an adaptation of GNU GetText to use on prose. PO4A's
       | output can feed into a typical translation workflow --
       | distributing the files, or using online translation services. It
       | mostly supports AsciiDoc, though there are some bugs, and
       | outputting a PO file directly from AsciiDoctor (with a plugin)
       | might be better -- PO4A parses AsciiDoc itself.
       | 
       | The code is at [4]. It's in slow development when necessary for
       | new documents; I don't particularly intend to polish it for
       | release or wider use.
       | 
       | KiCAD's documentation [5] was the best example of something similar
       | (AsciiDoc + PO4A) to what I've put together.
       | 
       | The missing pieces, which are closely related, are translatable
       | and flexible diagrams. AsciiDoctor supports plenty of diagram
       | tools, but none of them can do this. For example, the diagram at
       | [6] is an SVG, which (since it's XML) can be translated using
       | PO4A. However, in French the longer text spills out of the boxes.
       | The previous diagram is an image, for this reason.
       | 
       |  _Is there an open-format (preferably open source) diagramming
       | tool, which supports wrapping text, and even resizing "too long"
       | text?_ I would be very interested!
       | 
       | [1] https://docs.gbif.org/collections-idea-paper/ or (in
       | progress) https://docs.gbif.org/effective-nodes-guidance/1.0/
       | 
       | [2] https://asciidoctor.org/docs/user-manual/
       | 
       | [3] https://po4a.org/
       | 
       | [4] https://github.com/gbif/gbif-asciidoctor-toolkit/
       | 
       | [5] https://gitlab.com/kicad/services/kicad-doc
       | 
       | [6] https://docs.gbif.org/effective-nodes-guidance/1.0/en/#box-e...
        
       | DyslexicAtheist wrote:
       | pretty cool. I'm using a similar setup that allows a real-time
       | preview of every change by means of the `entr`[1] command and
       | gets triggered by saving the markdown.
       | 
       |     ls ./presentation.md | entr -c bash -c "pandoc \
       |       --pdf-engine=xelatex --toc -N presentation.md -t beamer \
       |       -o presentation.pdf; killall -HUP mupdf"
       | 
       | this would reload the pipeline and update the content of the pdf
       | output. easy as 1-2-3 (no Makefile though which would be another
       | step).
       | 
       | [1] https://www.systutorials.com/docs/linux/man/1-entr/
        
       | BooneJS wrote:
       | Can you refer to figures and have the name rendered? For
       | instance, a piece of text referring to a figure and the figure's
       | label will both get rendered to "Figure 2.8" regardless of
       | paragraph edits and figures inserted or deleted before it?
        
         | Symbiote wrote:
         | For anything beyond the most basic document, use AsciiDoc
         | rather than Markdown.
         | 
         | I prefer to use the Asciidoctor toolchain, but it's compatible
         | with AsciiDoc.
        
         | halst wrote:
         | So far, I got away with "In the following figure..."
        
       | btreecat wrote:
       | Nice to learn how others approach this task. Curious, with so
       | many extensions to markdown, why not use something like AsciiDoc
       | or rst instead?
        
       | snide wrote:
       | Somewhat related. I highly suggest the Goyo plugin for Vim if you
       | want distraction free writing.
       | 
       | https://github.com/junegunn/goyo.vim
        
       | marvindanig wrote:
       | Books are not files though!
       | 
       | Pandoc is great. make and vim are great too, but as you can see
       | these tools will produce PDF files, HTML files, text files,
       | markdown files and a lot of jargon that the readers simply aren't
       | interested in. I mean normal readers here and not tech folks
       | holed up inside a terminal with a homebrew theme.
        
       | JoshMcguigan wrote:
       | Thanks for sharing. Your book sounds very interesting to me. I
       | like that it targets generating real ARM assembly.
       | 
       | I've signed up for updates, looking forward to the release!
        
       | roland35 wrote:
       | Thanks for sharing this work flow and also for writing this book!
       | I sometimes think I would like to write a book about
       | microcontroller basics (on which I've collected knowledge from
       | countless blogs and white papers) but I know it's a huge project!
       | 
       | Also it is cool that you can run draw.io yourself! I've used yed
       | for documentation but this looks nice and is capable.
        
       | sevensor wrote:
       | I did this too, although the makefile presented in the article is
       | _much_ cleaner than mine was. Definitely recommend XeLaTeX. You're
       | not going to get very far without Unicode. I had to drop down
       | to LaTeX often to control formatting, but Pandoc helpfully lets
       | you do that.
        
       | leephillips wrote:
       | I'm always interested in reading articles like this, as I like to
       | see the setups that people come up with to produce books and
       | documents. I didn't know about set virtualedit=all in vim!
       | 
       | If you learn how to extend Pandoc with your own filters, which
       | you can write in several languages, there is no limit to what you
       | can do. Here's the description, published in the sadly defunct
       | _Linux Journal_ , of the system I created to help me write a book
       | about gnuplot:
       | 
       | https://lee-phillips.org/panflute-gnuplot/
        
         | ddrt wrote:
         | Everything in that link can be done with inDesign and having
         | data. Finding a way to complete using console or alternative
         | applications would take hacking the inDes app or finding some
         | sort of IFTTT sort of automation when needed, then saving as a
         | high res image, and referencing as a link in your console
         | layout doc. At the end it would have to compile as an image
         | into something (might as well be inDesign) and at that point
         | why not just layout the book with inDesign from the start?
         | Writing the book in a text doc with some tagged markdown for
         | rules, linking text to connected and flowed into styled text
         | boxes that have rules assigned to them, and generating all the
         | charts and sheets necessary to complete. Visual communication
         | isn't a strong point in code interfaces.
        
           | leephillips wrote:
           | I'm not sure I understand your comment, but I believe
           | inDesign is a proprietary, closed-source product, probably
           | driven mainly through a GUI. My goal was to write my book in
           | vim. All I need to do is type, and the book comes out,
           | including a visual index of all the plots in the book. Every
           | link in the chain, and every tool I used, is open source (and
           | free). The result is _exactly_ what I want. To each his own,
           | but the project, described in my article, is to create an
           | interface for me as an author. That interface is typing in
           | vim, using a set of tags I created for the purpose.
        
       | Finnucane wrote:
       | "SVG is well supported with EPUB"
       | 
       | SVG is part of the standard, but not well supported by all epub
       | reading systems. Some displays will fail, some will display as
       | small non-scalable images. Apple's iBooks reader is one of the
       | better ones in that regard.
        
       | frozenlettuce wrote:
       | After reading a couple posts here on HN about building a "second
       | brain", I found a surprisingly effective setup to do that:
       | 
       | - Vim with vimwiki (https://github.com/vimwiki/vimwiki)
       | 
       | - A private Gitlab repo
       | 
       | - A simple cron job to commit all changes in `~/.vimwiki` to my
       | private repo
       | 
       | And this is it! It would be possible to publish the wiki on the
       | web using Gitlab pages, but so far it is working nicely for me.
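
A minimal sketch of the cron-driven commit step described above. The function name `wiki_autocommit` and the commit message format are assumptions for illustration, not frozenlettuce's actual script; the only detail taken from the comment is the `~/.vimwiki` directory.

```shell
#!/bin/sh
# Hypothetical helper: commit every pending change in a wiki directory.
# Does nothing when the working tree has no changes.
wiki_autocommit() {
    dir=${1:-"$HOME/.vimwiki"}

    # Stage additions, modifications and deletions.
    git -C "$dir" add --all || return 1

    # `git diff --cached --quiet` exits 0 when nothing is staged.
    if git -C "$dir" diff --cached --quiet; then
        return 0
    fi

    git -C "$dir" commit --quiet -m "Auto-commit $(date '+%Y-%m-%d %H:%M')" || return 1

    # Push only when a remote is configured (assumes an upstream is set).
    if [ -n "$(git -C "$dir" remote)" ]; then
        git -C "$dir" push --quiet
    fi
}
```

Calling `wiki_autocommit` at the end of a small script, and running that script from a crontab entry at whatever interval suits, would give roughly the behaviour described.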
        
       | andrepd wrote:
       | >It allows to move the cursor past the last character. If you
       | insert a new character there, it is automatically padded with
       | spaces. It is easier to see it than to explain it:
       | 
       | >My first programming environment was Turbo Pascal, and this is
       | exactly how the cursor works there, which I grew accustomed to.
       | 
       | Holy shit! What a rush of memories reading that unlocked :)
        
       | pianomanfrazier wrote:
       | For any interested, here is my Pandoc book writing setup.
       | 
       | I have a couple bash scripts that I use to call pandoc to
       | generate PDFs, HTML, or ePub.
       | 
       | Here is the repo
       | https://gitlab.com/pianomanfrazier/pandoc-markdown-book
       | and here is my blog post
       | https://pianomanfrazier.com/post/write-a-book-with-markdown/
        
       | LeonM wrote:
       | I was building a new API recently, and was looking for a good
       | documentation solution.
       | 
       | The commercial cloud based solutions (Gitlab, Confluence, et al)
       | are pretty good, but you have to keep paying or your
       | documentation disappears. Self hosted Wiki or documentation
       | solutions were also out, due to the pain of migrating content in
       | and out.
       | 
       | We ended up with a very simple solution of Markdown + CSS +
       | Pandoc + make. Pandoc takes the CSS and MD files as input, and
       | outputs HTML. The MD files are in the API repository, and
       | deployment has been set up so that the latest documentation is deployed
       | automatically with each API update.
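
The Markdown + CSS + Pandoc step above can be sketched roughly as follows. This is not LeonM's actual build: the function name `build_docs`, the `docs`/`build` directories and the `style.css` filename are assumptions, and the timestamp check only imitates in shell what a make pattern rule would do.

```shell
#!/bin/sh
# Hypothetical sketch: convert each Markdown file to standalone HTML,
# rebuilding only when the source is newer than the existing output.
build_docs() {
    src=${1:-docs}      # directory holding the .md files (assumed name)
    out=${2:-build}     # output directory (assumed name)
    css=${3:-style.css} # stylesheet linked from every page (assumed name)

    mkdir -p "$out"
    for md in "$src"/*.md; do
        [ -e "$md" ] || continue   # the glob matched nothing
        html="$out/$(basename "$md" .md).html"
        # Rebuild when the output is missing or older than the source.
        if [ ! -e "$html" ] || [ "$md" -nt "$html" ]; then
            pandoc --standalone --css="$css" --output="$html" "$md"
        fi
    done
}
```

Running `build_docs docs build style.css` from the deployment job would regenerate only the pages whose sources changed since the last run.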
        
         | bryan2 wrote:
         | Excuse me if this is a dumb question but did you consider
         | swagger?
        
           | LeonM wrote:
           | There are no dumb questions.
           | 
           | I did have a look at swagger, but it felt way too bloated and
           | complex for what we wanted. With Markdown we know that even
           | in 10 years time when services like swagger are long gone,
           | it'll be possible to view markdown files. Also, there is
           | barely any learning curve with Markdown.
        
       | napsy wrote:
       | This reminds me of my own project to use pandoc for generating
       | blog posts https://outfloor.org/
        
       | airstrike wrote:
       | I know this is tangential, but I would love for someone to talk
       | about writing a more visual type of book, full of images, tables
       | and charts for the business world.
       | 
       | A table like the one in the first screenshot of this post works
       | well because the author is not repeatedly iterating on it,
       | there's very little text and information flows top-to-bottom very
       | neatly. That's great, but it's also extremely basic.
       | 
       | Take a look at something like
       | https://www.jpmorgan.com/jpmpdf/1320605428574.pdf and imagine
       | writing _that_. How do you lay things out on a page? How do you
       | make content fit a layout? There's no grid.
       | 
       | The reality is people use PowerPoint to do that, but PowerPoint
       | is a slide authoring tool that assumes you have a few bullets,
       | maybe one or two images per slide.
       | 
       | Dense presentations make its shortcomings obvious and quite
       | painful.
       | 
       | It boggles the mind that with all of the resources dumped into
       | CSS/JS and web development in general, nobody has leveraged that
       | experience to build an authoring tool that's 21st-century ready,
       | with version control, with a clear separation but nonetheless
       | linked relationship of raw data, actual content output and
       | formatting and final publishing into PDF.
       | 
       | What am I missing?
       | 
       | EDIT: one more example for good measure
       | https://www.jefferies.com/CMSFiles/Jefferies.com/files/W%201...
        
         | wvh wrote:
         | I assume organisations that need such complex layout also have
         | a budget to pull together immersive HTML pages, infographic
         | design or comprehensive reports. For instance the WHO has some
         | pretty complex and visually pleasing reports; they seem to be
         | using InDesign and I'm sure they use actual designers and
         | researchers to produce them.
         | 
         | I like the idea of using open-source tools to create books and
         | documentation because you could incorporate the process into a
         | workflow that pulls in actual code or on-the-fly generated
         | graphics. I don't own any commercial software, so I don't know
         | how feasible that process would be with something like
         | InDesign.
        
         | marvindanig wrote:
         | You mean
         | https://bubblin.io/book/bookiza-documentation-by-marvin-dani... ?
        
           | ggerules wrote:
           | Learning LaTeX and TikZ helps with this. It looks like a
           | presentation, so the LaTeX beamer package with some custom
           | templates would fit. The downside is that LaTeX and TikZ have a
           | bit of a learning curve, but it is worth it in the long run.
        
             | marvindanig wrote:
             | Have you used KaTeX in place of LaTeX? The former is much
             | lighter, especially if you're trying to publish a tome for
             | the web.
        
           | airstrike wrote:
           | That's part of it, but the content still flows top-to-bottom
           | for the most part and is not quite enterprise-ready, packaged
           | into a standalone app.
        
         | Symbiote wrote:
         | Those aren't books, they are presentation slides.
         | 
         | Using Powerpoint, for every slide the author chose
         | (potentially) a different Powerpoint template (2x1 columns, 2x2
         | etc). They have complete freedom to "break" the structure, such
         | as with callouts pointing to the "other" column, images going
         | beyond the margins.
         | 
         | An automatic template removes this flexibility, but allows
         | scripting or rebuilding the document with different text/data.
         | That's the compromise.
         | 
         | Remark.js achieves some of the most basic parts of this, but
         | would need some fiddling to add some CSS grid support and/or
         | default templates: https://remarkjs.com/ (Except for being
         | ugly, http://mobmad.github.io/js-tdd-erfaringer/ shows some
         | possible structure with Remark.js).
        
           | airstrike wrote:
           | I'm not so sure... I make them pretty much daily, and we
           | print them and call them "books".
           | 
           | I'm not saying you shouldn't be able to tweak them manually,
           | but there's got to be a more ergonomic language for drafting
           | pages than literally dragging objects pixel by pixel,
           | especially when most of the content comes in four forms:
           | tables pasted in from Excel, charts pasted in from Excel,
           | bullet lists and simple graphics around text like circles and
           | squares
        
           | marvindanig wrote:
           | Going by the strict definition of a book [1] a file, a
           | webpage or a website isn't a book either.
           | 
           | [1] https://en.m.wikipedia.org/wiki/Book
        
       | RMPR wrote:
       | To emulate the live preview, there is a neat piece of software
       | called entr [1]. From their main page, you can do something like:
       | 
       |     ls | entr make
       | 
       | And whenever you save a change, the build is triggered and the
       | preview is updated.
       | 
       | [1]: http://eradman.com/entrproject/
        
         | JoshMcguigan wrote:
         | The pipeline approach here is interesting, but it seems you'd
         | be on your own for filtering out changes in the build
         | directory, etc. I typically use watchexec [1] for this.
         | 
         |     watchexec make
         | 
         | By default, watchexec will filter out changes in files based on
         | `.gitignore`.
         | 
         | [1]: https://github.com/watchexec/watchexec
        
           | RMPR wrote:
           | This is pretty good, didn't know about watchexec, but you can
           | achieve the same by choosing carefully the command you pipe
           | from, for example:
           | 
           |     ls *.md | entr make
           | 
           | Will only trigger the build if a md file is modified, which
           | is what I think the author is interested in.
        
         | 0az wrote:
         | I personally use fswatch for this. The invocation is probably
         | something like this:
         | 
         |     fswatch -0 ***.md | xargs -0 make
         | 
         | I invoke a variant of this from my Makefile with a phony watch
         | target. I think the main change is that I also echo a bell
         | character, to provide some feedback.
         | 
         | Fun tip: Preview will automatically reload PDFs on disk, though
         | with some limitations that I work around by waiting for the
         | bell.
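
Where none of entr, watchexec or fswatch is installed, the same rebuild-on-save loop can be approximated by polling. The sketch below is a swapped-in technique, not what any commenter above uses: the function names `md_fingerprint` and `watch_and_make` are hypothetical, and it trades the event-driven behaviour of those tools for a periodic checksum comparison.

```shell
#!/bin/sh
# Print one checksum line summarising every Markdown file under a directory.
md_fingerprint() {
    # cksum prints "<crc> <size> <name>" per file; sorting makes the list
    # order stable, and the final cksum reduces it to a single line.
    find "${1:-.}" -name '*.md' -type f -exec cksum {} + | sort | cksum
}

# Poll at a fixed interval and run make whenever the fingerprint changes.
watch_and_make() {
    dir=${1:-.}
    interval=${2:-1}   # seconds between polls (assumed default)
    last=$(md_fingerprint "$dir")
    while sleep "$interval"; do
        now=$(md_fingerprint "$dir")
        if [ "$now" != "$last" ]; then
            last=$now
            # Ring the terminal bell after a successful build.
            make && printf '\a'
        fi
    done
}
```

`watch_and_make . 2` would poll the current directory every two seconds. Changes to non-Markdown files, such as build output, do not alter the fingerprint and so do not trigger a rebuild.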
        
       | Klasiaster wrote:
       | It's even possible to replace (Xe)LaTeX with WeasyPrint [1], a
       | Python HTML-to-PDF converter. It supports two-column layouts
       | via CSS, automatic CSS hyphens, CSS page counters and embedding
       | SVGs. I just needed an HTML header with CSS in the markdown file.
       | 
       |     $ pandoc --filter pandoc-citeproc --csl ieee.csl \
       |         --bibliography=paper.bib --smart --normalize \
       |         -f markdown+multiline_tables+inline_notes -t html5 \
       |         -V margin-top:0.5in -V margin-bottom:0.5in \
       |         -V margin-left:0.5in -V margin-right:0.5in \
       |         -o output.html input.md
       |     $ python3 -c "from weasyprint import HTML; HTML('output.html').write_pdf('output.pdf', presentational_hints=True)"
       | 
       | For LaTeX-style math equations I added mathjax-pandoc-filter [2]
       | as a filter to the pandoc args:
       | 
       |     --filter ~/node_modules/.bin/mathjax-pandoc-filter \
       |     -Mmathjax.centerDisplayMath -Mmathjax.noInlineSVG
       | 
       | [1] https://weasyprint.org/
       | 
       | [2] https://github.com/lierdakil/mathjax-pandoc-filter
        
         | leephillips wrote:
         | This is a very interesting (open source) project that I didn't
         | know about; thank you for mentioning it.
         | 
         | But it doesn't replace LaTeX, as it doesn't produce the same
         | results. A glance at the sample documents reveals the ugly
         | typography resulting from the word-processing layout strategy
         | employed in web browsers. This is confirmed in the
         | documentation. So this could be useful if you have an existing
         | set of HTML pages that you need to convert to PDFs, but, if
         | you're starting a project where you want to produce both HTML
         | and PDF, this should not be part of the solution.
        
           | snazz wrote:
           | It looks nice for graphics-heavy documents, but the quality
           | of the typographical output doesn't come close to LaTeX with
           | microtype. I do wish that LaTeX had something similar to CSS,
           | however. The separation of markup and styling makes the web
           | easier to use for complex layouts, which are not generally
           | TeX's strong suit.
        
           | j88439h84 wrote:
           | I can't tell the difference between this layout quality and
           | LaTeX. What are you noticing?
        
             | leephillips wrote:
             | The first things that jump out are the large and uneven
             | gaps between words and the "color" variations among
             | paragraphs. What I mean by the "word-processing layout
             | strategy" is the algorithm where, when you run out of space
             | on a line, you simply break the line at the end of the
             | previous word, fill up the space (for justified text) by
             | expanding the spaces between words, and begin the next
             | line. When you get to the end of the paragraph you go on to
             | the next one. The TeX layout engine, in contrast, makes
             | several passes over each paragraph, adjusting the line
             | breaking (including hyphenation) in order to optimize its
             | appearance (which includes such things as trying to avoid
             | successive hyphenated lines); then, when the page is set,
             | it goes over the entire page to try to equalize the
             | density, or color, among paragraphs.
        
               | yiyus wrote:
               | Maybe you already knew about it, but the microtype
               | package improves the aspect of your documents even more:
               | https://ctan.org/pkg/microtype
        
         | tarleb wrote:
         | Pandoc can even free you of the second step by using WeasyPrint
         | as the PDF engine:
         | 
         |     pandoc --pdf-engine=weasyprint -t html ...
        
         | andrepd wrote:
         | Skimming the examples the typographical quality is that of a
         | webpage (which is to be expected), miles below TeX-quality
         | typesetting.
        
           | Klasiaster wrote:
           | I think it really depends on the font you use and the CSS
           | rules you apply. LaTeX needs tweaking, too, even with a good
           | template.
           | 
           | E.g., if your font supports it, you can enable ligatures:
           | 
           |     text-rendering: optimizeLegibility;
        
       | jojo14 wrote:
       | Nice article. I'm all for it! I use Pandoc and a Makefile as well.
       | Except I use Emacs and Inkscape for SVG graphics. IMHO this is
       | the way to produce documents in the 21st century.
        
       | ggambetta wrote:
       | Similar story here. I wrote and self-published a novel, both for
       | e-readers and paperback, using only open-source tools, mainly
       | around Pandoc. I wrote some more details here:
       | https://gabrielgambetta.com/tgl_open_source.html
        
       | 0az wrote:
       | I use something very similar for mathematical homework and notes:
       | MacVim, with a Makefile that runs Pandoc with the Eisvogel
       | template.
       | 
       | I also have a script that runs fswatch to run make on save.
       | 
       | Didn't know about virtualedit, though: tables are going to be so
       | much easier now.
        
       | maxmunzel wrote:
       | Vim + Pandoc + Beamer + pdfpc is also the best way to write
       | presentations I have found so far:
       | 
       | https://github.com/maxmunzel/talk-algorithms-for-np-hard-pro...
        
       | asicsp wrote:
       | Nice and thanks for sharing your setup. The footer is very
       | informative, but I use GitHub style markdown, need to check if
       | there's some workaround. For epub customization, this article [0]
       | might help. Good luck with your book.
       | 
       | Here's how I generate PDFs with pandoc+xelatex [1]. I use gvim as
       | my editor and have mapped a key (which then executes a shell
       | script) to generate the book.
       | 
       | [0] https://cmichel.io/how-to-create-beautiful-epub-programming-...
       | 
       | [1] https://learnbyexample.github.io/tutorial/ebook-generation/c...
        
       ___________________________________________________________________
       (page generated 2020-04-13 23:00 UTC)