[HN Gopher] Writing a Book with Pandoc, Make, and Vim
___________________________________________________________________
Writing a Book with Pandoc, Make, and Vim

Author : halst
Score  : 349 points
Date   : 2020-04-13 09:33 UTC (13 hours ago)

(HTM) web link (keleshev.com)
(TXT) w3m dump (keleshev.com)

| hmcamp wrote:
| Well done. Thanks for sharing your process! Looking forward to
| the completed book.

| ggerules wrote:
| A couple of things that might be of interest:
|
| 1) pandoc is awesome.
| 2) There are integrated development environments that allow you
| to write in markdown and output to pdf, html, and word with the
| flick of a switch. RStudio with knitr, bookdown, and markdown
| has some nice functionality. Plus you can do graphs and drawings
| and embed them in the rmd (R Markdown) text.
| 3) There is an earlier post on HN from Gilles Castel on how to
| speedily write text through the ultisnips package. Very much a
| game changer on how I use vim to work with anything text
| related.
|
| https://castel.dev/post/lecture-notes-1/
|
| Nice post!

| lozf wrote:
| > https://castel.dev/post/lecture-notes-1/
|
| Very nice! Thank you.

| MattBlissett wrote:
| Here's a result [1] from a system I've put together, primarily
| using AsciiDoc(tor) and PO4A [3], to allow us to write a source
| document then translate it into multiple languages. It produces
| HTML and PDF, but ePUB is an option too.
|
| Using AsciiDoc rather than Markdown has several benefits. The
| language supports many common book features, especially for
| technical books, like those "! Warning here" callouts, cited
| quotes, captioned figures/tables/codeblocks, internal links, I
| think even an index. It's also a lot more stable; I'm not
| concerned that there will be significant syntax changes in 5
| years' time. The user manual [2] is the quickest way to see what
| AsciiDoc can do.
|
| PO4A is an adaptation of GNU GetText to use on prose.
| PO4A's output can be fed into a typical translation workflow --
| distributing the files, or using online translation services. It
| mostly supports AsciiDoc, though there are some bugs, and
| outputting a PO file directly from AsciiDoctor (with a plugin)
| might be better -- PO4A parses AsciiDoc itself.
|
| The code is at [4]. It's in slow development when necessary for
| new documents; I don't particularly intend to polish it for
| release or wider use.
|
| KiCAD's documentation [5] was the best example of something
| similar (AsciiDoc + PO4A) to what I've put together.
|
| The missing pieces, which are closely related, are translatable
| and flexible diagrams. AsciiDoctor supports plenty of diagram
| tools, but none of them can do this. For example, the diagram at
| [6] is an SVG, which (since it's XML) can be translated using
| PO4A. However, in French the longer text spills out of the boxes.
| The previous diagram is an image, for this reason.
|
| _Is there an open-format (preferably open source) diagramming
| tool, which supports wrapping text, and even resizing "too long"
| text?_ I would be very interested!
|
| [1] https://docs.gbif.org/collections-idea-paper/ or (in
| progress) https://docs.gbif.org/effective-nodes-guidance/1.0/
|
| [2] https://asciidoctor.org/docs/user-manual/
|
| [3] https://po4a.org/
|
| [4] https://github.com/gbif/gbif-asciidoctor-toolkit/
|
| [5] https://gitlab.com/kicad/services/kicad-doc
|
| [6] https://docs.gbif.org/effective-nodes-guidance/1.0/en/#box-e...

| DyslexicAtheist wrote:
| pretty cool. I'm using a similar setup that allows a real-time
| preview of every change by means of the `entr`[1] command and
| gets triggered by saving the markdown.
|
|     ls ./presentation.md | entr -c bash -c "pandoc --pdf-engine=xelatex --toc -N presentation.md -t beamer -o presentation.pdf; killall -HUP mupdf"
|
| this would reload the pipeline and update the content of the pdf
| output.
| easy as 1-2-3 (no Makefile though which would be another
| step).
|
| [1] https://www.systutorials.com/docs/linux/man/1-entr/

| BooneJS wrote:
| Can you refer to figures and have the name rendered? For
| instance, a piece of text referring to a figure and the figure's
| label will both get rendered to "Figure 2.8" regardless of
| paragraph edits and figures inserted or deleted before it?

| Symbiote wrote:
| For anything beyond the most basic document, use AsciiDoc
| rather than Markdown.
|
| I prefer to use the Asciidoctor toolchain, but it's compatible
| with AsciiDoc.

| halst wrote:
| So far, I got away with "In the following figure..."

| btreecat wrote:
| Nice to learn how others approach this task. Curious, with so
| many extensions to markdown, why not use something like AsciiDoc
| or rst instead?

| snide wrote:
| Somewhat related. I highly suggest the Goyo plugin for Vim if you
| want distraction free writing.
|
| https://github.com/junegunn/goyo.vim

| marvindanig wrote:
| Books are not files though!
|
| Pandoc is great. make and vim are great too, but as you can see
| these tools will produce PDF files, HTML files, text files,
| markdown files and a lot of jargon that the readers simply aren't
| interested in. I mean normal readers here and not tech folks
| holed up inside a terminal with a homebrew theme.

| JoshMcguigan wrote:
| Thanks for sharing. Your book sounds very interesting to me. I
| like that it targets generating real ARM assembly.
|
| I've signed up for updates, looking forward to the release!

| roland35 wrote:
| Thanks for sharing this workflow and also for writing this book!
| I sometimes think I would like to write a book about
| microcontroller basics (on which I've collected knowledge from
| countless blogs and white papers) but I know it's a huge project!
|
| Also it is cool that you can run draw.io yourself! I've used yEd
| for documentation but this looks nice and is capable.

| sevensor wrote:
| I did this too, although the makefile presented in the article is
| _much_ cleaner than mine was. Definitely recommend XeLaTeX.
| You're not going to get very far without unicode. I had to drop
| down to LaTeX often to control formatting, but Pandoc helpfully
| lets you do that.

| leephillips wrote:
| I'm always interested in reading articles like this, as I like to
| see the setups that people come up with to produce books and
| documents. I didn't know about `set virtualedit=all` in vim!
|
| If you learn how to extend Pandoc with your own filters, which
| you can write in several languages, there is no limit to what you
| can do. Here's the description, published in the sadly defunct
| _Linux Journal_, of the system I created to help me write a book
| about gnuplot:
|
| https://lee-phillips.org/panflute-gnuplot/

| ddrt wrote:
| Everything in that link can be done with InDesign and having
| data. Finding a way to complete using console or alternative
| applications would take hacking the InDesign app or finding some
| sort of IFTTT sort of automation when needed, then saving as a
| high res image, and referencing as a link in your console
| layout doc. At the end it would have to compile as an image
| into something (might as well be InDesign) and at that point
| why not just lay out the book with InDesign from the start?
| Writing the book in a text doc with some tagged markdown for
| rules, linking text to connected and flowed into styled text
| boxes that have rules assigned to them, and generating all the
| charts and sheets necessary to complete. Visual communication
| isn't a strong point in code interfaces.

| leephillips wrote:
| I'm not sure I understand your comment, but I believe
| InDesign is a proprietary, closed-source product, probably
| driven mainly through a GUI. My goal was to write my book in
| vim. All I need to do is type, and the book comes out,
| including a visual index of all the plots in the book.
| Every link in the chain, and every tool I used, is open source
| (and free). The result is _exactly_ what I want. To each his own,
| but the project, described in my article, is to create an
| interface for me as an author. That interface is typing in vim,
| using a set of tags I created for the purpose.

| Finnucane wrote:
| "SVG is well supported with EPUB"
|
| SVG is part of the standard, but not well supported by all epub
| reading systems. Some displays will fail, some will display as
| small non-scalable images. Apple's iBooks reader is one of the
| better ones in that regard.

| frozenlettuce wrote:
| After reading a couple posts here on HN about building a "second
| brain", I found a surprisingly effective setup to do that:
|
| - Vim with vimwiki (https://github.com/vimwiki/vimwiki)
|
| - A private Gitlab repo
|
| - A simple cron job to commit all changes in `~/.vimwiki` to my
| private repo
|
| And this is it! It would be possible to publish the wiki on the
| web using Gitlab pages, but so far it is working nicely for me.

| andrepd wrote:
| >It allows to move the cursor past the last character. If you
| insert a new character there, it is automatically padded with
| spaces. It is easier to see it than to explain it:
|
| >My first programming environment was Turbo Pascal, and this is
| exactly how the cursor works there, which I grew accustomed to.
|
| Holy shit! What a rush of memories reading that unlocked :)

| pianomanfrazier wrote:
| For any interested, here is my Pandoc book writing setup.
|
| I have a couple bash scripts that I use to call pandoc to
| generate PDFs, HTML, or ePub.
|
| Here is the repo
| https://gitlab.com/pianomanfrazier/pandoc-markdown-book and here
| is my blog post
| https://pianomanfrazier.com/post/write-a-book-with-markdown/

| LeonM wrote:
| I was building a new API recently, and was looking for a good
| documentation solution.
|
| The commercial cloud-based solutions (Gitlab, Confluence, et al)
| are pretty good, but you have to keep paying or your
| documentation disappears. Self-hosted Wiki or documentation
| solutions were also out, due to the pain of migrating content in
| and out.
|
| We ended up with a very simple solution of Markdown + CSS +
| Pandoc + make. Pandoc takes the CSS and MD files as input, and
| outputs HTML. The MD files are in the API repository; deployment
| has been set up so that the latest documentation is deployed
| automatically with each API update.

| bryan2 wrote:
| Excuse me if this is a dumb question but did you consider
| swagger?

| LeonM wrote:
| There are no dumb questions.
|
| I did have a look at swagger, but it felt way too bloated and
| complex for what we wanted. With Markdown we know that even
| in 10 years' time when services like swagger are long gone,
| it'll be possible to view markdown files. Also, there is
| barely any learning curve with Markdown.

| napsy wrote:
| This reminds me of my own project to use pandoc for generating
| blog posts https://outfloor.org/

| airstrike wrote:
| I know this is tangential, but I would love for someone to talk
| about writing a more visual type of book, full of images, tables
| and charts for the business world.
|
| A table like the one in the first screenshot of this post works
| well because the author is not repeatedly iterating on it,
| there's very little text and information flows top-to-bottom very
| neatly. That's great, but it's also extremely basic.
|
| Take a look at something like
| https://www.jpmorgan.com/jpmpdf/1320605428574.pdf and imagine
| writing _that_. How do you lay things out on a page? How do you
| make content fit a layout? There's no grid.
|
| The reality is people use PowerPoint to do that, but PowerPoint
| is a slide authoring tool that assumes you have a few bullets,
| maybe one or two images per slide.
|
| Dense presentations make its shortcomings obvious and quite
| painful.
|
| It boggles the mind that with all of the resources dumped into
| CSS/JS and web development in general, nobody has leveraged that
| experience to build an authoring tool that's 21st-century ready,
| with version control, with a clear separation but nonetheless
| linked relationship of raw data, actual content output and
| formatting and final publishing into PDF.
|
| What am I missing?
|
| EDIT: one more example for good measure
| https://www.jefferies.com/CMSFiles/Jefferies.com/files/W%201...

| wvh wrote:
| I assume organisations that need such complex layout also have
| a budget to pull together immersive HTML pages, infographic
| design or comprehensive reports. For instance the WHO has some
| pretty complex and visually pleasing reports; they seem to be
| using InDesign and I'm sure they use actual designers and
| researchers to produce them.
|
| I like the idea of using open-source tools to create books and
| documentation because you could incorporate the process into a
| workflow that pulls in actual code or on-the-fly generated
| graphics. I don't own any commercial software, so I don't know
| how feasible that process would be with something like
| InDesign.

| marvindanig wrote:
| You mean
| https://bubblin.io/book/bookiza-documentation-by-marvin-dani... ?

| ggerules wrote:
| Learning LaTeX and TikZ helps out with this. It looks like a
| presentation. So the LaTeX beamer package with some custom
| templates. The downside is that LaTeX and TikZ have a little
| bit of a learning curve. But it is worth it in the long run.

| marvindanig wrote:
| Have you used KaTeX in place of LaTeX? The former is much
| lighter, especially if you're trying to publish a tome for
| the web.

| airstrike wrote:
| That's part of it, but the content still flows top-to-bottom
| for the most part and is not quite enterprise-ready, packaged
| into a standalone app.

| Symbiote wrote:
| Those aren't books, they are presentation slides.
|
| Using PowerPoint, for every slide the author chose
| (potentially) a different PowerPoint template (2x1 columns, 2x2
| etc). They have complete freedom to "break" the structure, such
| as with callouts pointing to the "other" column, images going
| beyond the margins.
|
| An automatic template removes this flexibility, but allows
| scripting or rebuilding the document with different text/data.
| That's the compromise.
|
| Remark.js achieves some of the most basic parts of this, but
| would need some fiddling to add some CSS grid support and/or
| default templates: https://remarkjs.com/ (Except for being
| ugly, http://mobmad.github.io/js-tdd-erfaringer/ shows some
| possible structure with Remark.js).

| airstrike wrote:
| I'm not so sure... I make them pretty much daily, and we
| print them and call them "books".
|
| I'm not saying you shouldn't be able to tweak them manually,
| but there's got to be a more ergonomic language for drafting
| pages than literally dragging objects pixel by pixel,
| especially when most of the content comes in four forms:
| tables pasted in from Excel, charts pasted in from Excel,
| bullet lists and simple graphics around text like circles and
| squares.

| marvindanig wrote:
| Going by the strict definition of a book [1] a file, a
| webpage or a website isn't a book either.
|
| [1] https://en.m.wikipedia.org/wiki/Book

| RMPR wrote:
| To emulate the live preview, there is a neat piece of software
| called entr[1]. From their main page, you can do something like:
|
|     ls | entr make
|
| And whenever you save a change, the build is triggered and the
| preview is updated.
|
| [1]: http://eradman.com/entrproject/

| JoshMcguigan wrote:
| The pipeline approach here is interesting, but it seems you'd
| be on your own for filtering out changes in the build
| directory, etc. I typically use watchexec [1] for this.
|
|     watchexec make
|
| By default, watchexec will filter out changes in files based on
| `.gitignore`.
|
| [1]: https://github.com/watchexec/watchexec

| RMPR wrote:
| This is pretty good, didn't know about watchexec, but you can
| achieve the same by carefully choosing the command you pipe
| from, for example:
|
|     ls *.md | entr make
|
| This will only trigger the build if an md file is modified,
| which is what I think the author is interested in.

| 0az wrote:
| I personally use fswatch for this. The invocation is probably
| something like this:
|
|     fswatch -0 ***.md | xargs -0 make
|
| I invoke a variant of this from my Makefile with a phony watch
| target. I think the main change is that I also echo a bell
| character, to provide some feedback.
|
| Fun tip: Preview will automatically reload PDFs on disk, though
| with some limitations that I work around by waiting for the
| bell.

| Klasiaster wrote:
| It's even possible to replace (Xe)LaTeX with WeasyPrint [1], a
| Python HTML-to-PDF converter. It supports two columns via CSS,
| automatic CSS hyphens, CSS page counters and embedding SVGs. I
| just needed an HTML header with CSS in the markdown file.
|
|     $ pandoc --filter pandoc-citeproc --csl ieee.csl --bibliography=paper.bib --smart --normalize -f markdown+multiline_tables+inline_notes -t html5 -V margin-top:0.5in -V margin-bottom:0.5in -V margin-left:0.5in -V margin-right:0.5in -o output.html input.md
|     $ python3 -c "from weasyprint import HTML; HTML('output.html').write_pdf('output.pdf', presentational_hints=True)"
|
| For LaTeX-style math equations I added mathjax-pandoc-filter [2]
| as a filter to the pandoc args:
|
|     --filter ~/node_modules/.bin/mathjax-pandoc-filter -Mmathjax.centerDisplayMath -Mmathjax.noInlineSVG
|
| [1] https://weasyprint.org/
|
| [2] https://github.com/lierdakil/mathjax-pandoc-filter

| leephillips wrote:
| This is a very interesting (open source) project that I didn't
| know about; thank you for mentioning it.
|
| But it doesn't replace LaTeX, as it doesn't produce the same
| results.
| A glance at the sample documents reveals the ugly
| typography resulting from the word-processing layout strategy
| employed in web browsers. This is confirmed in the
| documentation. So this could be useful if you have an existing
| set of HTML pages that you need to convert to PDFs, but, if
| you're starting a project where you want to produce both HTML
| and PDF, this should not be part of the solution.

| snazz wrote:
| It looks nice for graphics-heavy documents, but the quality
| of the typographical output doesn't come close to LaTeX with
| microtype. I do wish that LaTeX had something similar to CSS,
| however. The separation of markup and styling makes the web
| easier to use for complex layouts, which are not generally
| TeX's strong suit.

| j88439h84 wrote:
| I can't tell the difference between this layout quality and
| LaTeX. What are you noticing?

| leephillips wrote:
| The first things that jump out are the large and uneven
| gaps between words and the "color" variations among
| paragraphs. What I mean by the "word-processing layout
| strategy" is the algorithm where, when you run out of space
| on a line, you simply break the line at the end of the
| previous word, fill up the space (for justified text) by
| expanding the spaces between words, and begin the next
| line. When you get to the end of the paragraph you go on to
| the next one. The TeX layout engine, in contrast, makes
| several passes over each paragraph, adjusting the line
| breaking (including hyphenation) in order to optimize its
| appearance (which includes such things as trying to avoid
| successive hyphenated lines); then, when the page is set,
| it goes over the entire page to try to equalize the
| density, or color, among paragraphs.
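
The contrast drawn above, between breaking each line as soon as it fills up and choosing breaks for the paragraph as a whole, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: fixed-width characters, no hyphenation, no stretchable spaces, and a cost equal to the squared leftover space on every line except the last. It is not TeX's actual algorithm, and the function names are made up for the example.

```python
def greedy_breaks(words, width):
    """First-fit: start a new line only when the next word no longer fits."""
    lines, current, length = [], [], 0
    for word in words:
        # Length of the current line if this word were appended to it.
        needed = len(word) if not current else length + 1 + len(word)
        if current and needed > width:
            lines.append(current)
            current, length = [word], len(word)
        else:
            current.append(word)
            length = needed
    if current:
        lines.append(current)
    return lines


def optimal_breaks(words, width):
    """Whole-paragraph breaking by dynamic programming.

    Minimizes the sum of squared leftover space over all lines but the
    last. Assumption for this sketch: every word fits within `width`.
    """
    if any(len(word) > width for word in words):
        raise ValueError("a word is longer than the line width")
    n = len(words)
    best = [0.0] * (n + 1)   # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)      # nxt[i]: index of the word that starts the next line
    for i in range(n - 1, -1, -1):
        best[i] = float("inf")
        length = -1          # cancels the space counted before the first word
        for j in range(i, n):
            length += 1 + len(words[j])
            if length > width:
                break
            cost = 0 if j == n - 1 else (width - length) ** 2
            if cost + best[j + 1] < best[i]:
                best[i], nxt[i] = cost + best[j + 1], j + 1
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


def raggedness(lines, width):
    """Sum of squared leftover space, ignoring the last line."""
    return sum((width - len(" ".join(line))) ** 2 for line in lines[:-1])


if __name__ == "__main__":
    words = "aaa bb cc ddddd".split()
    for breaker in (greedy_breaks, optimal_breaks):
        lines = breaker(words, 6)
        print(breaker.__name__, [" ".join(l) for l in lines], raggedness(lines, 6))
```

With the words `aaa bb cc ddddd` and a width of 6, the greedy breaker produces `aaa bb / cc / ddddd` (cost 16), while the whole-paragraph breaker produces `aaa / bb cc / ddddd` (cost 10): it accepts a shorter first line in exchange for more even lines overall.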

| yiyus wrote:
| Maybe you already knew about it, but the microtype
| package improves the aspect of your documents even more:
| https://ctan.org/pkg/microtype

| tarleb wrote:
| Pandoc can even free you of the second step by using WeasyPrint
| as PDF engine:
|
|     pandoc --pdf-engine=weasyprint -t html ...

| andrepd wrote:
| Skimming the examples the typographical quality is that of a
| webpage (which is to be expected), miles below TeX-quality
| typesetting.

| Klasiaster wrote:
| I think it really depends on the font you use and the CSS
| rules you apply. LaTeX needs tweaking, too, even with a good
| template.
|
| E.g., if your font supports it, you can enable ligatures:
|
|     text-rendering: optimizeLegibility;

| jojo14 wrote:
| Nice article. I'm all for it! I use Pandoc and Makefile as well.
| Except I use Emacs and Inkscape for SVG graphics. IMHO this is
| the way to produce documents in the 21st century.

| ggambetta wrote:
| Similar story here. I wrote and self-published a novel, both for
| e-readers and paperback, using only open-source tools, mainly
| around Pandoc. I wrote some more details here:
| https://gabrielgambetta.com/tgl_open_source.html

| 0az wrote:
| I use something very similar for mathematical homework and notes:
| MacVim, with a Makefile that runs Pandoc with the Eisvogel
| template.
|
| I also have a script that runs fswatch to run make on save.
|
| Didn't know about virtualedit, though: tables are going to be so
| much easier now.

| maxmunzel wrote:
| Vim + Pandoc + Beamer + pdfpc is also the best way to write
| Presentations I have found so far:
|
| https://github.com/maxmunzel/talk-algorithms-for-np-hard-pro...

| asicsp wrote:
| Nice and thanks for sharing your setup. The footer is very
| informative, but I use GitHub style markdown, need to check if
| there's some workaround. For epub customization, this article [0]
| might help. Good luck for your book.
|
| Here's how I generate PDF with pandoc+xelatex [1]. I use gvim as
| my editor and have mapped a key (which then executes a shell
| script) to generate the book.
|
| [0] https://cmichel.io/how-to-create-beautiful-epub-programming-...
|
| [1] https://learnbyexample.github.io/tutorial/ebook-generation/c...
___________________________________________________________________
(page generated 2020-04-13 23:00 UTC)