[HN Gopher] Writing my PhD using groff
___________________________________________________________________

Author : yockyrr
Score  : 199 points
Date   : 2022-07-23 11:01 UTC (11 hours ago)

(HTM) web link (jstutter.netlify.app)

| zvr wrote:
| Having used many *roff variants (e.g., troff, nroff, ditroff,
| groff) over decades and also having rather extensive experience
| with LaTeX, I'd now definitely choose the latter for any serious
| typesetting task.
|
| Pain points include many customization points: re-creating exact
| document specifications provided externally, using specific
| typefaces, creating your own macros... Oh, and leaving ASCII (or
| ISO-8859-1) for multi-script characters.
|
| Today's groff is very fine software, if you are satisfied with
| its default settings and your task is in the domain it handles.
| baruchel wrote:
| Fond of Heirloom Troff (rather than GNU Troff). See
| https://n-t-roff.github.io/heirloom/doctools.html Native support
| for TTF and OTF fonts. Knuth's algorithm for formatting paragraphs.
| ant6n wrote:
| The article is making me want to try it, but it's a bit light on
| technical details and I'm concerned about having to go down a
| rabbit hole of learning a bunch of new tech.
|
| Perhaps posting a git repository of a sample PhD thesis (with a
| couple of empty chapters, sample figures/images, tables) could be
| something that others would really benefit from.
| amtamt wrote:
| try this
| https://raw.githubusercontent.com/amit-tewari/random/master/...
| anthk wrote:
| groff -step -k -Tpdf file.groff > file.pdf
|
| For mom, there's pdfmom -step -k. No need to waste time and SSD
| space with pandoc.
| [deleted]
| [deleted]
| kbr- wrote:
| Ahh the procrastination when writing a PhD.
|
| Instead of actually writing it, you research a million different
| ways to render it, and then you write a blog post on it :)
| oefrha wrote:
| > then you write a blog post on it
|
| Not so fast. You also need to try a million different SSGs for
| your blog, then eventually write your own.
| vinay_ys wrote:
| I have been guilty of doing this in so many contexts. Obviously
| with LaTeX while in college, later with logging libraries, with
| config libraries, with server frameworks etc. Instead of doing
| the actual functional task at hand, the compulsion to play with
| tools, libraries and frameworks was strong. Lately, I'm
| convinced all these tools with their own DSL syntax, complex
| mental models and configuration options are just unnecessary
| cognitive overhead, especially for a one-off task you do once in
| a while. We should make it easy to learn concepts with rich
| visual representation of the graphical interface and apply them
| on the fly - this saves cognitive load for the initial few
| iterations of the task. If the user finds themselves doing the
| task many times then they can advance to learn a DSL specific
| to that tool. Even here, I wish there was a widely accepted
| universal declarative syntax/grammar for all kinds of DSLs -
| configurations, policies, typesetting etc.
| analog31 wrote:
| I finished my PhD almost 30 years ago, and finding the
| motivation to actually finish was as much of a problem as it is
| today. And tinkering with typesetting was a familiar
| procrastination magnet. My friends who used LaTeX easily tacked
| multiple months onto their effort, and my hunch is that CPU
| time was not the limiting factor.
|
| My advice is: Remember, nobody's going to read it.
|
| My parents' theses were about 50 pages, typewritten, equations
| and chemical diagrams entered by hand. They got the same grade
| as I did.
| ;-)
| tannhaeuser wrote:
| It's a miracle that the developer of GNU roff actually managed
| to complete the suite before procrastinating big-time into
| SGML. AFAICT, he resisted even the urge to implement the roff
| dot command syntax on top of SGML SHORTREF.
| [deleted]
| passerine wrote:
| I can relate all too well to the procrastination involved in
| fiddling with the typesetting engine! When I was writing my
| bachelor's thesis (in Philosophy, no less!), I spent an
| inordinate amount of time on tweaking my LaTeX template and
| workflow.
|
| I ended up falling way too deep into the rabbit hole, and
| started using NixOS just to write the thesis itself. It did
| eventually result in a fun blog post, though!
|
| https://shen.hong.io/nixos-for-philosophy-installing-firefox...
| jonnydubowsky wrote:
| Great blog post! "Plato didn't write his dialogues on
| Microsoft Word, and neither should you".
| FRidh wrote:
| Ha, familiar. I decided I wanted the pdf of my dissertation
| to be reproducible and hence packaged it with Nix.
| unixhero wrote:
| I did exactly the same for my master's thesis. This is why I am
| so not interested in doing a PhD.
| messe wrote:
| Same for my undergrad thesis. I learned so much more about
| the art of typesetting that year than I did about
| Hypergeometric Integrals or String Amplitudes: the topics of
| the thesis. That's why I left before doing a masters.
| unixhero wrote:
| I even took the Artificial Intelligence with Andrew Ng
| course at Stanford through Coursera. That's how much I was
| procrastinating. :)
|
| My conclusion is that writing a thesis fucking sucks and
| damaged me.
| ramraj07 wrote:
| It's a rite of passage with LaTeX though. Each one of us has to
| spend the time and set up our own LaTeX setup the first time we
| write using it. Hopefully never again though.
| zomglings wrote:
| Spent my teens and twenties in mathematics departments. This
| is a very polarized behavioral pattern:
|
| 1.
|    The majority of people just don't care very much. Just get
|    LaTeX working locally, get it to run the packages you need -
|    amsmath, etc. - and start working on your mathematics.
|
| 2. There is a large minority who dive deep into the
|    rabbit hole on their editing environment, typography,
|    diagramming, etc.
|
| Amusingly, you can tell how likely someone is to fall into
| either camp based on the amount of care they take with and
| over their notes (mostly hand-written in my day).
| cinntaile wrote:
| Ever since Overleaf (and sharelatex) I would say this is no
| longer the case. It's not optimal but it works well enough.
| amelius wrote:
| In the same way, Knuth wrote a set of books on typesetting.
| Procrastination at its finest.
|
| https://en.wikipedia.org/wiki/Computers_and_Typesetting
| db48x wrote:
| He didn't just write books on typesetting, he took a year off
| of work on Volume 2 of The Art of Computer Programming to
| _write a new typesetting system_. Naturally, he had to
| invent/implement vector fonts along the way. But even that
| wasn't quite enough, so he also took the time to invent
| literate programming, a style of programming where the source
| code is also a book with narrative structure to guide the
| user to a complete understanding of every nuance of the
| source code. If you compile the TeX source code one way, you
| get a program for typesetting documents. Compile it another
| way and you get a TeX document containing TeX: The Book. Same
| with the Metafont source code as well. All together I think
| it delayed Volume 2 by a decade.
|
| And then he continued publishing books about typography, in
| his spare time, for the next few decades.
| copperx wrote:
| TeX itself took Knuth 10 years to complete, if I recall
| correctly.
| the_duck wrote:
| I wrote mine in Word and it's SO UGLY. I _wish_ I'd
| procrastinated more by fiddling with different rendering
| methods!
| bmacho wrote:
| But starting with LaTeX is almost always ~0 minutes.
|
| You can just write the text, and write the equations between
| dollar signs, and it gives you book-quality output by
| default.
|
| (You can also tweak it as much as you want, and spend as
| much time as you want on it.)
| frozenport wrote:
| Blog post will likely have a wider audience than hundreds of
| pages droning about "The Clausula as Fundamental Unit"
| Iwan-Zotow wrote:
| Well, following great ones - Knuth started doing TeX before
| writing TAOCP
| ModernMech wrote:
| I wrote my dissertation in Word, and I found it more than
| sufficient. WYSIWYG is still the best way to edit documents, but
| it's not great for version control. Word's equation editor is
| great though, and I enjoyed the ability to precisely place
| figures. Resolving references can take a while when there are
| hundreds, though; I think they could improve that.
| passerine wrote:
| I can relate quite well to the author's pursuit in tinkering with
| their typesetting workflow. When I wrote my bachelor's thesis, I
| also spent a great deal of time coming up with a custom LaTeX
| template and workflow. Like the author, one of the pain-points
| was the relatively slow edit-compile-review cycle of modern LaTeX
| engines like LuaLaTeX.
|
| In my case, I was mainly concerned with making the resulting
| thesis.pdf PDF/A compliant. PDF/A is an archival compliance
| standard that's dedicated to the long term digital preservation
| of PDF files.
|
| Predictably, I got way too carried away as well, and ended up
| trying to create fully-reproducible LaTeX PDFs as well. It was
| probably overkill for my use-case, but it did result in a fun
| blog post where I documented the process [1]
|
| [1] https://shen.hong.io/reproducible-pdfa-compliant-latex/
| yockyrr wrote:
| Surely PDF/A can be created just by passing any PDF through
| Ghostscript with the right flags such as -dPDFA and
| -dPDFACompatibilityPolicy=1 ?
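For anyone wanting to try the Ghostscript route mentioned above, here is a minimal sketch of what the invocation tends to look like. It assumes `gs` is installed and that `PDFA_def.ps` is a local copy of the definition file shipped with Ghostscript, edited to point at an ICC profile; the function name `to_pdfa` and the `DRY_RUN` variable are made up for this example. Whether the result actually validates as PDF/A depends on the input document, so treat it as a starting point rather than a guarantee.

```shell
#!/bin/sh
# Sketch only: wraps a typical Ghostscript PDF/A conversion.
# Assumptions (not from the thread): PDFA_def.ps sits in the current
# directory and references a valid ICC profile; to_pdfa and DRY_RUN
# are hypothetical names chosen for illustration.

to_pdfa() {
    # $1 = input PDF, $2 = output PDF, $3 = PDF/A level (default 2)
    set -- "$1" "$2" "${3:-2}"
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} gs -dPDFA="$3" -dBATCH -dNOPAUSE \
        -sDEVICE=pdfwrite -sColorConversionStrategy=RGB \
        -dPDFACompatibilityPolicy=1 \
        -sOutputFile="$2" PDFA_def.ps "$1"
}

# Example (prints the command rather than running it):
#   DRY_RUN=1 to_pdfa thesis.pdf thesis-pdfa.pdf
```

The compatibility policy takes an integer, which is why it is passed with `-d` here. Recent Ghostscript releases may also need extra read permissions for the ICC profile, and the output is still worth checking with a separate PDF/A validator.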
| passerine wrote:
| I don't have any experience with using Ghostscript as a
| post-processing step, and I am curious to know if it works well
| for complex documents.
|
| LaTeX does have a native way of generating PDF/A compliant
| documents, using the pdfx package. It's still in beta, but
| it is quite stable and works very well. The advantage of
| enforcing PDF/A compliance using native LaTeX is that it
| allows you to take the further step of implementing
| reproducible builds. Once that is done, you can be certain
| that given a LaTeX source file, you will be able to generate
| a bit-for-bit identical document.
|
| Additional post-processing steps will have to be at least
| documented, and will probably tie the output to the specific
| version of your post-processor.
| nicce wrote:
| > Like the author, one of the pain-points was the relatively
| slow edit-compile-review cycle of modern LaTeX engines like
| LuaLaTeX.
|
| This depends a lot. In most cases the delay is only about 1
| second on modern PCs. A bit more when you cite and build the
| document twice.
|
| You can use LaTeX in many different ways. There are built-in
| editors and web services such as Overleaf. In the end, they all
| use the same workflow or dependencies for building the
| document, but might add an additional delay.
|
| I too have ended up tweaking my environment a lot. I ended up
| testing almost every LaTeX workflow.
|
| I finally ended up just using vim and zathura. An optimised
| Docker image with LuaLaTeX builds the document. Second favorite
| would be the LaTeX plugin for JetBrains products. Overleaf is only
| good for collaborating.
|
| On my desktop PC, which has 16 CPU cores, there is only very
| little latency when compiling. But for text editing, it is a
| bit rare that you need such a PC...
| passerine wrote:
| > This depends a lot. In most cases the delay is only
| about 1 second on modern PCs. A bit more when you cite and
| build the document twice.
|
| I agree that for many (or even most) documents, LaTeX's
| compilation delay is generally manageable. However, when it
| comes to documents with bibliography management, footnotes,
| margin-notes, and multiple figures, the compilation delay can
| get quite high.
|
| In my own experience, I had a document of notes containing
| over a hundred citations managed by biblatex and bibmla [1].
| It also had footnotes and margin-notes, requiring an
| additional repaint. The compilation time on that document was
| well over several seconds on my laptop, up to dozens of
| seconds when on battery-power.
|
| > I finally ended up just using vim and zathura. An
| optimised Docker image with LuaLaTeX builds the document.
| Second favorite would be the LaTeX plugin for JetBrains products.
| Overleaf is only good for collaborating.
|
| I'm very curious to hear about the Docker image that you are
| using. What purpose does the Docker image serve in the build
| pipeline? I know that for compiled software, sometimes having
| a build environment allows you to better define the
| environment variables, but to my understanding this is not a
| worry for LaTeX.
|
| [1] https://github.com/ShenZhouHong/sartre-notes
| nicce wrote:
| > I'm very curious to hear about the Docker image that you
| are using. What purpose does the Docker image serve in the
| build pipeline? I know that for compiled software,
| sometimes having a build environment allows you to better
| define the environment variables, but to my understanding
| this is not a worry for LaTeX.
|
| Using Docker brings several benefits. It allows me to share
| the same build environment across multiple different machines.
| I can even use my desktop remotely for building the
| documents if I want, just by sharing the Docker daemon.
|
| Sometimes some package breaks after an update, and Docker
| allows me to roll back to a working environment. I can also
| declare additional packages and fonts deterministically if
| I need them.
| Overall, LaTeX is quite a complicated and huge
| system, and I'd rather keep it away from my host machine.
| Maybe Docker is a bit overkill, but I have never wasted
| time on fighting with package conflicts or installing LaTeX
| once again with extra packages for a different machine.
| nonrandomstring wrote:
| Nice tour of student typesetting today. Not surprising to find
| roff still in service too. My thesis in the late 80s was set
| using nroff, fig and eqn, all of which I've fond memories.
|
| Surely WYSIWYG and "office" suites were a disaster for writing.
| Students seem to spend lost weeks and months fiddling with
| MS-Word only to create mediocre looking output.
|
| Personally I'd say it's hard to beat Org-mode, separate plain
| text files, then adding the desired exporter and style files at
| the last minute.
| GiovanniP wrote:
| > Students seem to spend lost weeks and months fiddling with
| MS-Word only to create mediocre looking output.
|
| I am surprised, and keep being surprised, that people haven't
| yet figured out that there is an excellent tool, that is
| TeXmacs, that manages to make WYSIWYG the _best_ way to write
| structured documents while having complete control on the
| output and never having to fiddle with details.
| copperx wrote:
| It's always been a mystery to me why TeXmacs is so good yet
| obscure. I've stayed away from it because I feared there was
| a caveat that I didn't know about.
| GiovanniP wrote:
| I have been using it for several years and I have yet to find
| the caveat :-)
|
| It has low discoverability, so you have to go through the
| manuals, the mailing list and the forum (and maybe the blog
| too, at https://texmacs.gitee.io/notes/docs/main.html) to
| figure out all that it can do, and to have complete
| control. On the other hand, it is quite usable with default
| settings.
| LEARAX wrote:
| This was not my experience with TeXmacs.
| I tried it recently
| after using LaTeX for all of my papers, and while it is nice,
| it is not replacing LaTeX for me.
|
| Table output in particular was much lower quality than LaTeX
| with booktabs. I think I had to manually resize columns,
| which was tedious, and I never had to do that with LaTeX.
| There were a lot of similar situations I found myself in,
| where I wound up needing to fight TeXmacs quite a bit to get
| it to output what I wanted.
|
| I prefer my LaTeX workflow where I can edit markup in Emacs,
| and have a preview almost instantly generated next to my
| editor by a filesystem watcher & makefile. TeXmacs
| necessitates using its own interface (which lacks my vim
| keybindings and Emacs customizations) and I could not find
| many resources on editing TeXmacs documents in external
| programs.
|
| I did appreciate that the general typesetting in TeXmacs was
| high quality, and the ability to type TeX macros and get e.g.
| enumerated lists quickly was very nice. But overall, I prefer
| LaTeX.
| GiovanniP wrote:
| > TeXmacs necessitates using its own interface (which lacks
| my vim keybindings and Emacs customizations)
|
| TeXmacs's own interface is deeply customizable by the user
| via Scheme.
|
| I think you can set it up to have vim keybindings---see
| experimental code at
| https://github.com/chxiaoxn/texmacs-vi-experiment and comments at
| http://forum.texmacs.cn/t/a-very-tiny-vim-in-texmacs/176 (I
| know that the lack of a block cursor has put off someone,
| but I did not find that comment in the brief search I did
| now).
| dmd wrote:
| I wrote my Masters thesis in LaTeX, which is why I wrote my PhD
| thesis in Word.
| otherme123 wrote:
| I've seen people crying over Word for not being able to work
| with proper styles or dealing correctly with cross references,
| bibliography included, all of which are relatively easy in
| LaTeX.
| Bibliographies in Word are almost impossible without a
| third party plugin like Zotero, and less experienced people
| don't even know they exist.
|
| There's a thriving line of business at my uni that consists
| of properly final-formatting theses with Word.
|
| Luckily for Microsoft "easy" products, there is a legion of
| people that work for free as technical support.
| verytrivial wrote:
| Nitpick here, but this should really be titled "Using groff to
| render markdown to PDF faster".
| uneekname wrote:
| I may have misunderstood the article, but by the end I think
| they were writing directly in groff.
| vlovich123 wrote:
| I found that those who can't get consistent styling and have
| laggy behavior on large documents don't know how to configure
| Word. I regularly wrote hundred page reports with embedded Excel
| and images, all in Word with Math, and got pretty proficient.
| There's basically several things you need to do:
|
| * Actually set up a named style for every type of content you
|   have. Creating shortcuts for the common ones doesn't hurt.
| * Use whatever the paid version is of the tool that powers the
|   free equation editor. It was miles better about 10 years ago.
| * Use a master document / subdocument approach for categorizing
|   things. You wouldn't have a single text file that's 100 pages
|   long. Split up Word that way too.
|
| I'm pretty sure I got to a state where I was using the tooling as
| intended because I wasn't actually fighting the WYSIWYG. Now I
| did switch to LaTeX at the end because I was tired of not having
| easy version control. Word has it if you enable change tracking
| but it can't beat normal tooling. Also I wanted to learn LaTeX
| because it felt like a worthwhile investment (it was - writing
| formulas in LaTeX is wayyy faster to write and easier to
| maintain).
|
| So I liked LaTeX just fine. Prefer Markdown / wiki these days
| because I don't work with math formulas.
|
| Disclaimer: I have zero experience with the web version and have
| no idea how it scales. I imagine it still does quite well on
| large documents but maybe browser rendering is not so good.
| hyperdimension wrote:
| > writing formulas in LaTeX is wayyy faster to write and easier
| to maintain
|
| I was told by a friend that the Equation Editor in Word would
| silently accept LaTeX math-mode equation syntax and convert it
| automatically. Besides trying it out briefly, I never used it
| extensively, so I'm not sure how complete it is. Still, it's
| there.
| ygra wrote:
| I think the LaTeX compatibility came in a fairly recent
| version, but the usual Unicode-based syntax is already a lot
| easier to type (especially on non-US keyboard layouts) and
| read, since it's a lot more compact and uses whitespace to
| separate things by default instead of requiring large amounts
| of curly braces.
| ahmadmijot wrote:
| Can confirm. I use the LaTeX mode in equation editor in Word
| quite religiously for both equations and symbols.
| amelius wrote:
| I think the point with LaTeX is that you can automate the
| document generation process to a great extent. For example, if
| you have some data, some python scripts that process the data,
| and some other scripts that generate figures, you can put all
| of that in a pipeline and build a new version of your document
| automatically after the data changes.
| bo1024 wrote:
| Oh man, I completely forgot that wouldn't even be possible
| (presumably) in Word. What a pain!
| tengwar2 wrote:
| The equivalent would be OLE, I think. Yes, Word can't
| trivially use the text output of a Python program, but then
| LaTeX can't trivially embed a chunk of an Excel document
| and have it updated automatically. Both systems make
| assumptions about what they will integrate with.
| bo1024 wrote:
| Thanks for the info. I still consider LaTeX more modular
| since the only assumption it makes is the name of the file
| you want to include.
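To make the "pipeline" idea above a little more concrete, here is a small sketch of one way results can flow into a LaTeX document by file name alone. It assumes a two-column CSV of `name,value` pairs where each name consists of letters only (so it is a legal macro name); the function name `csv_to_macros` and the file names in the comments are invented for the example.

```shell
#!/bin/sh
# Sketch only: turn "name,value" lines into LaTeX macro definitions,
# so the document can \input the generated file and use \name in text.
# csv_to_macros and the file names below are hypothetical.

csv_to_macros() {
    # $1 = CSV file; lines without exactly two fields are skipped
    awk -F, 'NF == 2 { printf "\\newcommand{\\%s}{%s}\n", $1, $2 }' "$1"
}

# Typical use inside a Makefile rule or build script:
#   csv_to_macros results.csv > results.tex
# and in the thesis preamble:
#   \input{results}
```

With this in place, rerunning the analysis and then the build updates every number quoted in the text, and the only contract between the two halves is the name of the generated file.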
| 0101010110 wrote:
| This is possible in Word (and Excel).
|
| You can link Word tables to Excel and, provided your
| analysis updates your spreadsheet, can refresh all data
| instantly.
|
| You can also refer to values in tables in the body of the
| text.
|
| I know this because I worked in an environment where Word
| was the only option!
| vlovich123 wrote:
| I've also programmatically generated Word
| documents using CPAN modules in a past life. I'm sure
| there are better packages these days.
| ramraj07 wrote:
| I wrote multiple papers during my PhD. The theoretical one with
| lots of equations I wrote with LaTeX. It'd be stupid not to.
| Overleaf helped, though I wrote the paper over 6 years so it
| only helped in the end.
|
| Then I wrote two bio heavy papers. Using Word. My thesis was in
| Word too. If you have a ton of figures and not a ton of
| equations it's not the best choice to use LaTeX.
| tkuraku wrote:
| I find that lots of figures are good reason to use LaTeX. You
| can programmatically scale and crop, captions never get
| separated from the figures, and figure placement is handled
| in a sane predictable way.
| copperx wrote:
| > figure placement is handled in a sane predictable way.
|
| That is not the LaTeX I know.
| gnatolf wrote:
| It does take some getting used to, I'll give you that.
| But it's very predictable.
|
| Annoyingly, 'new' users, especially those led by KOMA-Script
| hints, are drawn to use the 'total' positioning that is
| promised by using the H B P and other options. However,
| since these don't keep their promises - at least not
| the misunderstood promised 'absolute' positioning -
| frustration very often creeps in.
|
| I've had too many co-authors and friends ask me how to
| push figures to certain positions where all I saw was
| premature optimization in terms of positions and tons of
| wasted cycles (CPU and user) to get to intermediate
| solutions that are completely unnecessary if only they
| were saved until the last layout runs. The change in
| document creation paradigm (to not care about the layout
| until the far end) is what 'manages' expectations and
| where the perceived errors mostly come from.
| gattilorenz wrote:
| > If you have a ton of figures and not a ton of equations
| it's not the best choice to use LaTeX.
|
| Last time I used Word for anything significant (a thesis) it
| was either Word 2003 or 2007, and adding a table or inserting
| a new paragraph somewhere before a figure could mess up all
| the figure placement (sometimes the thing would literally
| disappear).
|
| After that I switched to LaTeX and never looked back. Has
| this recently become better, or was I just
| unlucky/inexperienced?
| el_oni wrote:
| I wrote my thesis in Word (graduated last Friday, viva'd
| last March). The key to stop things moving around in Word
| is to use the "in line with text" option for images. And
| treat images like text. If you want it centred then justify
| it to the centre, don't try to place things manually.
|
| In previous documents I've written with Word I would do things
| like tight layouts on images, maybe with anchors, but
| that's a recipe for things moving around.
|
| When I came to compile my chapters to a final document I
| used master document-subdocument to pull everything
| together. I only had a few issues with blank pages being
| added when exporting to PDF and that was due to my use of
| page breaks and section breaks.
| GiovanniP wrote:
| > Has this recently become better?
|
| You should try TeXmacs; it is not recent, but it has become
| smooth and it is superior to both LaTeX and Word from
| every point of view.
| teakettle42 wrote:
| > If you have a ton of figures and not a ton of equations
| it's not the best choice to use LaTeX.
|
| Strong disagree; WYSIWYG editing in Word is an exercise in
| endless frustration, and Word's typesetting and fonts are so
| ugly that it's painfully obvious when a paper has been
| written using Word.
|
| Just write LaTeX and let it do the typesetting, figure
| layout, citation formatting, etc. It's less painful than Word
| WYSIWYG editing, and the result is far more polished.
| chipotle_coyote wrote:
| While I have a lot of complaints with Word, I have to very
| tepidly take issue with the accusation of ugly fonts. You
| may like TeX's default typefaces more than Word's, but
| those are just the defaults. You can set a LaTeX document
| in Calibri and (presuming you have an OTF version of the
| fonts) a Word document in Computer Modern.
|
| Word doesn't really do _typesetting,_ though. You can make
| credible camera-ready output with it if you're rigorous
| with styles and learn how to anchor figures and images
| correctly, but the line and page breaks will still say "hi,
| I'm a word processor."
| lizknope wrote:
| W. Richard Stevens wrote some excellent Unix and TCP/IP books in
| the 1990s. He used troff / groff
|
| RIP Richard, your books were amazing and his son thought he was
| cool because his book was in Wayne's World 2
|
| http://www.kohala.com/start/
|
| https://www.salon.com/2000/09/01/rich_stevens/
| kepler1 wrote:
| Yeah there was a time when I thought my speed of typing was the
| thing slowing down my thesis writing. So I spent a week training
| Dragon Naturally Speaking to be able to transcribe my voice.
|
| Turns out that really wasn't the bottleneck, and I had just spent
| another week distracting myself with technology to avoid writing.
| fegu wrote:
| Will definitely try this. I sometimes used LaTeX at work for
| things like contracts and other documents that should look
| formal.
| But occasionally you need to share it with someone to get
| their input before it is final. Lots of people are unfamiliar
| with LaTeX. So I switched to markdown. Markdown does not get in
| your way, so even those unfamiliar with it get the hang of it.
| balddenimhero wrote:
| Similar to the experiences of other commenters, I find the LaTeX
| edit-compile-review cycle to only grow unreasonably slow when
| none of the incremental compilation features are used. For larger
| documents I recommend (i) splitting the document to leverage the
| \include and \includeonly commands, and (ii) using the TikZ
| library "external" to avoid the unnecessary recompilation of
| unchanged graphics. PGF/TikZ is often a bottleneck.
|
| I agree though that it would be nice if the compilation (esp.
| from scratch) were generally faster.
| gnatolf wrote:
| Still, even using all of that, my thesis with heavy inline TikZ
| took about 5 minutes per run (about 120 pages). A full
| rerun with all TikZ graphs redone (about 20) took just shy
| of 20 minutes if the indexes existed already. That was all on a
| Surface Pro 4 from ~2015.
| yakubin wrote:
| Wow. That's competing with some C++ projects.
| 41b696ef1113 wrote:
| The number one reason to lean heavily on sectioning via
| \include is for debugging. Debugging LaTeX is a disaster, and
| it is only by compartmentalizing code into smaller sections do
| you have a hope of isolating the problem.
| ncphil wrote:
| In the mid-70s, I typed my senior thesis on a reconditioned
| manual (Underwood) and a borrowed electric typewriter. By the
| time I did my masters in the late 80s, all my papers were
| composed in vde on CP/M and formatted with TeX.
| seanhunter wrote:
| Groff is great. I went through a phase of doing reports for work
| using groff and one of the cool things is that on W Richard
| Stevens' website there were all his groff macros he used to
| produce all the beautiful diagrams in _TCP/IP Illustrated_ etc.
| So I used to have lovely diagrams with spline curves etc thanks
| to W Richard Stevens.
|
| The great thing about groff (compared in my experience with
| LaTeX) is that you spend basically zero time on
| formatting/messing about once you have a set of macros you like,
| and the document production cycle is really fast so you edit with
| zero distractions using basically plain text (a lot like
| markdown) and then any time you want to see the finished product
| it's very quick to see it.
| mgaunard wrote:
| Writing a PhD is much easier than writing a book.
|
| To begin with, it has a very simple and constrained structure:
| abstract, problem description, state of the art, interesting
| subtopic 1, interesting subtopic 2, interesting subtopic 3,
| results and perspectives.
|
| Interesting subtopics are also just previous articles that you
| can just recycle.
| jszymborski wrote:
| > Writing a PhD is much easier than writing a book.
|
| IMHO, I think that's a pretty broad statement that's wrong as
| often as it's true. Surely it depends on the book, the thesis,
| and the discipline/genre.
|
| Firstly, not all disciplines have theses broken down in the way
| you've outlined. Theses in certain humanities often more
| resemble non-fiction than those of other disciplines.
|
| Of course, some disciplines or departments or schools or
| supervisors will have you write a "thesis by manuscript" in
| which you present manuscripts you've written as chapters and
| write little interludes connecting them as well as a unifying
| intro and conclusion.
|
| This on the face of it might seem like "just recycling"
| previous articles, but I think it overlooks the fact that those
| manuscripts must be written, at least in majority, by you. Even
| when you aren't writing a "thesis by manuscript", most people I
| know write chapters as they go along their PhD.
|
| And finally, a bit of digression, but I don't think it's
| reasonable to exclude the amount of research it takes to write
| "books" or theses from the estimation of effort it takes to
| write them. It's an integral part of the process.
| tpoacher wrote:
| > I wrote my PhD thesis using groff
|
| I was hoping it would be via gnu ed too, but they used vim.
| Shame.
| tpoacher wrote:
| @OP: incidentally, I'd love to see examples (thesis or
| otherwise) of your groff documents and related makefiles if you
| have any publicly available
| wazoox wrote:
| Wow, that looks nice :)
| PopAlongKid wrote:
| I wrote my masters thesis using _troff_ in the early 1980s. Later
| that decade, I used a version of _nroff_ on PC-DOS for my job. It
| seems, viewed from a sufficient distance, that this wheel has
| been re-invented a number of times since then.
| jszymborski wrote:
| I first tried writing my MSc thesis as a set of AsciiDoc(tor)
| files. I really enjoyed how much more flexibility AsciiDoc gave
| me over MD so I was pretty set on it. I _really_ hated the
| equations it generated and AsciiDoc isn't a Pandoc source, sadly.
| Even worse, the tooling was monstrous. I had entire build scripts
| that were getting more and more convoluted.
|
| I relented and went to LaTeX, and while the limitations mentioned
| here resonate with me, I've found it totally doable.
| noisy_boy wrote:
| After years of using hacks in MS Word trying to make my CV look
| the way I wanted, one day I bit the bullet and wrote it in LaTeX.
| The 3+ hours spent learning LaTeX basics and doing the
| rewrite were disproportionately low compared to the huge jump in
| the quality of the output. Having used troff for writing man
| pages eons ago, this blog makes me interested in learning groff
| to re-write my CV in it and compare the experience with that of
| LaTeX.
| scrlk wrote: | As a counterpoint, I had to ditch my LaTeX CV when I realised | that applicant tracking systems were struggling to properly | parse the PDF. | | Switching back to a simple Word template (no use of tables; | just heading styles and bullet points) and submitting the .docx | resolved these issues. | tpoacher wrote: | yikes! you must have felt so dirty having to do that! | reknih wrote: | The LaTeX criticisms of the article really resonated with me. | Long compile times and a narrow "happy path" are the things where | I feel LaTeX makes me less productive. | | This is a pity because, otherwise, it is a great tool with its | focus on document structure and output quality. I'm currently | working on a LaTeX successor which seeks to address these issues, | but it is really hard to make the right design compromises here | -- what can be programmed? What is accessible through dedicated | syntax? How does the structure drive presentation? | | Computer typesetting is a rabbit hole, but a fascinating one. And | I'm sure the last word on it has not been spoken yet :) | andrepd wrote: | Splitting a long document in chunks and using the `draft` | option while writing speed up compilation times considerably. | Otherwise you're producing a finalised typeset document of 100+ | pages every time you hit F5, no wonder it takes ~10s to finish | ;) | reknih wrote: | IMO draft seems like a crutch: Because TeX has to reprocess | the whole document and the sty-files of each imported package | every time, you do not have a huge budget for your document | content. Instead, you are given the option to sacrifice image | decoding, plot drawing, and output fidelity to keep TeX | humming along. | | Sure, that's better than nothing but I cannot help but wonder | whether there could be an architecture where you cut down on | repeating work and get faster recompiles that way! 
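A minimal sketch of the setup andrepd describes above (split the document into chunks, compile in draft mode), written as a shell script that just generates the files. The directory name, the chapter names and the `report` class are assumptions for illustration and do not come from the thread; `draft` and `\includeonly` are standard LaTeX.

```sh
#!/bin/sh
# Sketch only: one driver file plus one file per chapter.
# All file and chapter names are hypothetical.
make_skeleton() {
    dir=$1
    mkdir -p "$dir/chapters"
    for ch in intro methods results; do
        # one small placeholder file per chapter
        printf '\\chapter{%s}\nText of the %s chapter.\n' "$ch" "$ch" \
            > "$dir/chapters/$ch.tex"
    done
    cat > "$dir/main.tex" <<'EOF'
% draft: overfull boxes are marked, and graphicx draws empty frames
% instead of loading the image files
\documentclass[draft]{report}
\usepackage{graphicx}
% typeset only the chapter being edited; the other chapters keep the
% page numbers and labels recorded in their .aux files
\includeonly{chapters/methods}
\begin{document}
\include{chapters/intro}
\include{chapters/methods}
\include{chapters/results}
\end{document}
EOF
}

make_skeleton thesis-sketch
```

With a TeX installation available, something like `cd thesis-sketch && latexmk -pdf main.tex` should build it. For the final run you would drop `draft` and the `\includeonly` line so that everything is typeset at full fidelity.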
| jonathanstrange wrote: | I became productive in LaTeX once I stopped doing any | typesetting in it until there was a real need for it due to | publisher requirements. LaTeX looks great out of the box, I | just finished a book that I had to deliver camera-ready and the | publisher (not a LaTeX shop) was very impressed with the | quality. It was the standard Memoir book template with almost | no changes. Ironically, the documentation for many special | typesetting packages in LaTeX looks very bad. Generally, the | less you change, the better. | | LaTeX really fails at "register-true" typesetting, though. You | have to allow it to extend pages here or there by a line or be | willing to fix many orphans and widows by hand. AFAIK, this has | to do with the text flow algorithms which are paragraph-based | and cannot do some global optimizations. (Correct me if I'm | wrong, I'm not an expert.) | | Btw, I cannot confirm the compile-time criticisms. A whole book | takes just a few seconds on my machine for one run. I wonder | what people are doing who get slow compile times. | bee_rider wrote: | There are some packages that slow LaTeX down, I think... TikZ | is one I think. | | My masters thesis was written on an old netbook with an Atom | processor, plenty of graphics, the compile times got pretty | ugly. But I did different files for each section, and set it | up so the latex process would automatically kick off and run | in the background after writing to the file in vim. Working | within constraints like that is sort of fun, it forces you to | get the slow operations off the critical path. | | Currently I use a script like:
    # $1 is the .tex file to watch in the current directory
    inotifywait -e close_write,moved_to,create -m . |
    while read -r directory events filename; do
        if [ "$filename" = "$1" ]; then
            # rebuild, keeping latexmk's error output next to the source
            latexmk -interaction=nonstopmode -f -pdf "$1" 2> "$1.errlog"
        fi
    done
| | to just re-compile the .tex whenever it changes.
I'm not | really a bash programmer though so I guess this will probably | be ripped apart by somebody here, haha (the top couple lines | were probably taken from some post on the internet | somewhere). | farhaven wrote: | FWIW, `latexmk` has a "Watch the sources for this document | and rebuild if they change" mode builtin. It gets activated | if you pass it the command line flag `-pvc`. | bee_rider wrote: | Suggesting things on this site is such an easy way to get | better solutions. Thanks! | andrepd wrote: | Memoir has options like \sloppybottom. But honestly, the | reason it doesn't is that it's virtually impossible to have | an algorithm that gives you the best layout 100% of the time. | Sometimes it's physically impossible not to have orphans or | awkward spacing with the text you've given. You can never | remove the human from the equation. | jonathanstrange wrote: | Yes, that's my point. _\sloppybottom_ can look fine but it | violates the requirement of real register-true typesetting | where each typeblock has the same size on every page and | the lines on a double page match exactly. Some publishers | have this requirement and it's hard to work around it. | This could in principle be improved by some subtle | tradeoffs between line breaks in previous paragraphs, | paragraph breaks, and page breaks. It's a kind of global | optimization that is not possible in TeX due to principal | limitations of the engine. See Section 4 of [1]. | | [1] https://tug.org/TUGboat/tb11-3/tb29mitt.pdf | pja wrote: | Simon Cozens spent some time writing a new typesetter | called SILE: https://sile-typesetter.org/ | | One of the design goals was to be able to achieve exactly | that kind of line matching. IIRC it can ensure that lines | on the front & back of a page line up exactly too; | apparently this is important for bibles? | | Worth taking a look at. It recently acquired TeX style | mathematics typesetting ability & has a small but active | developer group.
| svat wrote: | * As a rough rule of thumb, TeX can do about 1000-3000 pages a | second on today's computers.[1] This is for a (presumably | typical) book that was written in plain TeX. | | * So if your LaTeX document is taking orders of magnitude more | than about a millisecond a page, then clearly the slowdown must | be from additional (macro) code you've actually inserted into | your document. | | * TeX is already heavily optimized, so the best way to make the | compilation faster is to not run code you don't need. | | * Helping users do that would be best served IMO not by writing | a new typesetting engine, but by improving the debugging and | profiling so that users understand what is actually going on: | what's making it slow, and what they actually need to happen on | every compile. | | To put it another way: users include macros and packages | because they really want the corresponding functionality (and | everyone wants a different 10% of what's available in the | (La)TeX ecosystem). It's easy to make a system that runs fast | by not doing most of the things that users actually want[2], | but if you want a system that gives users what they'd get from | their humongous LaTeX macro packages and yet is fast, it would | be most useful to help _them_ cut down the fluff from their | document-compilation IMO. | | --- | | [1] Details: Try it out yourself: Take the file gentle.tex, the | source code to the book "A Gentle Introduction to TeX" | (https://ctan.org/pkg/gentle), and time how long it takes to | typeset 8 copies of the file (with the `\bye` only at the end): | on my laptop, the resulting 776 pages are typeset in: 0.3s by | `tex`, 0.6s by `pdftex` and `xetex`, and 0.8s by `luatex`. | | [2] For that matter, plain TeX is already such a system; Knuth | knew a thing or two about programming and optimization! | nerdponx wrote: | How does LaTeX compare to XeTeX, ConTeXt, and LuaTeX in | compilation speed? 
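A rough sketch of the experiment in svat's footnote [1] above, for anyone who wants to repeat it. It assumes `gentle.tex` has been downloaded from CTAN and that its closing `\bye` sits on a line of its own (not checked here); the helper names `repeat_tex` and `pages_per_second` are made up for this example. The figures fed to `pages_per_second` are the ones quoted in the footnote, not new measurements.

```sh
#!/bin/sh
# repeat_tex SRC N OUT: write N copies of SRC to OUT with a single \bye
# at the very end. Assumes \bye appears on a line by itself in SRC.
repeat_tex() {
    src=$1; n=$2; out=$3
    : > "$out"
    i=1
    while [ "$i" -le "$n" ]; do
        # copy the source, dropping any line that is just \bye
        grep -v '^\\bye[[:space:]]*$' "$src" >> "$out"
        i=$((i + 1))
    done
    printf '%s\n' '\bye' >> "$out"
}

# pages_per_second PAGES SECONDS: throughput rounded to a whole number
pages_per_second() {
    awk -v p="$1" -v s="$2" 'BEGIN { printf "%.0f\n", p / s }'
}

# Throughput implied by the timings quoted in the footnote (776 pages)
echo "tex:    $(pages_per_second 776 0.3) pages/s"
echo "pdftex: $(pages_per_second 776 0.6) pages/s"
echo "luatex: $(pages_per_second 776 0.8) pages/s"

# To repeat the measurement yourself (needs a TeX installation):
#   repeat_tex gentle.tex 8 gentle8.tex
#   time tex gentle8.tex
```

The three printed values land between roughly 1000 and 2600 pages per second, which is where the "1000-3000 pages a second" rule of thumb comes from.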
| reknih wrote: | This is not really a meaningful comparison: You have your | TeX compiler (pdfTeX, XeTeX, LuaTeX, and the older e-TeX) | that takes a document in some format and produces a PDF or | DVI. In my tests (that did not include e-TeX) pdfTeX tends | to be the fastest here, but sometimes you need modern | fonts, so you have no choice but to use the other two. | | The TeX compiler then loads a format like plain TeX (which | the above commenter uses), LaTeX, or ConTeXt. The format | defines what macros are available. LaTeX adds a package | system, as does ConTeXt (modules) so you can import even | more macros on-demand. These TeX formats differ in scope | and thus speed, LaTeX tends to be a bit heavier but what | really weighs it down are the myriad of packages it is | usually used with. | | Many TeX distributions will define aliases like pdflatex in | your path such that you can preload pdfTeX with the LaTeX | format, but they are not really separate compilers. | pdw wrote: | I never got deep into TeX, but I browsed the code at one time | and some of what I found seemed utterly insane to me. For | example, it includes an IEEE floating point implementation, | based entirely on TeX string expansion [1]. I don't know if | it is widely used, but I'm not surprised by slow LaTeX | compiles anymore. | | You say "TeX is already heavily optimized", but that's only | true for the layout engine. The input language is entirely | based on macros and string expansion. That's fine if you're | only going to use it for a bit of text substitution. But as a | programming language it's inherently slow. (To be fair, I | believe Knuth expected that large extensions, such as LaTeX, | would be implemented in WEB.) | | [1] https://github.com/latex3/latex3/blob/main/l3kernel/l3fp- | bas... | svat wrote: | That's precisely my point: the slowness comes not from TeX | but from the LaTeX macro packages that the user has | included, on top of TeX. 
And if you want to make things | faster for the user, you don't have to replace TeX (which | is already plenty fast); you have to replace the macro | packages or provide faster alternatives. | | (And even there, LaTeX--including the l3fp package you | mentioned--is likely not even the worst offender. To be | sure, better profiling would help.) | ta988 wrote: | For a dissertation you often have diagrams and a bibliography, | and this takes a while. Even my resume takes more than 1s a | build because of that. | nextos wrote: | This is a topic I have been interested in for a while. Is it | viable to compose fancy large documents in plain TeX without | a lot of effort replicating functionality provided by LaTeX | (if your requirements stay constant)? | | I am a heavy user of the memoir class, and I have always | suspected moving to plain TeX would not be that hard. | However, the fraction of users doing this seems pretty slim | so modern TeX workflows do not seem really well documented. | grayclhn wrote: | I've tried. Coauthors and journal requirements were my | limiting factors, not anything inherent to the typesetting | engines... | | These days (in industry) I manage to use pandoc markdown to | word for everything (for similar reasons), which is even | more limiting than plain TeX. You learn to write around the | limitations pretty quickly. :) | AB1908 wrote: | Does pandoc help at all? | reknih wrote: | In use cases like the Markdown to PDF pipeline described in | the article, sure! Documents there are also simple enough so | that compile times aren't too much of a problem. | | However, many of the documents we like to set in TeX are more | complex than that: bibliographies, figure placement, special | typographical flourishes.... And here is where the complexity | of LaTeX and its macros adds to the inherent complexity of | what we are trying to accomplish (and compile times quickly | balloon again). | | So, sometimes it helps...?
| reknih wrote: | I figure I should also mention that my LaTeX alternative is | called Typst. We do not have much public detail yet but there | is a landing page [1] to sign up for more info and beta access | as soon as it becomes available. | | [1]: https://typst.app/ | teakettle42 wrote: | My main concern with rented cloud software for this space is | that it seems like a great way to lose access to editing your | own past academic/technical work. | reknih wrote: | Very valid concern. We aim for an open core model so you | can always take your projects and compile them locally | chaoxu wrote: | It looks really nice! I've sent this to a few friends to | check it out. | | To give some context, I'm a professor in theoretical computer | science, so I write a lot of LaTeX documents and notes. | | Some observations of my workflow. - Writing: | I'm writing the source, and occasionally look at the output. | So as long as the output time is reasonable, then it is | sufficient. - Editing: I'm reading the output, and | then edit the source. So going from the output to the part of | the source I have to edit should be as smooth as possible. | - Typesetting is the least of my concern. I only check if | there are any glaring typesetting problems right before we | publish. This takes at most 1% of the total time in preparing | a document. - Live editing almost never happens. But I | see why it might be useful to incorporate it into the | workflow. (A cursor on the rendering of the live editing would be | very nice) | | There are some choices on how to present source and the | rendering. | | Typst went with the 2 panel design, with one side source, one | side rendering. So I found something close to WYSIWYG is | better for editing. However, full WYSIWYG is hard to get | right and comes with its own problems. Currently I found there | are a few common things people do with respect to | source/rendering. - WYSIWYG editors, which | render everything (Word, TeXmacs, LyX).
Editing is done in | the rendering. It is smooth, but takes a long time to get | used to. - The app Typora that renders everything | except the part where you are editing (which shows as the | source). This can be generalized to render all except the | current line, or something similar. Editing is done in the | source, but feels like I'm editing in the rendering. This is | extremely smooth for my editing work, and is my preferred | way. - An app like Compositor | https://compositorapp.com/ that renders everything, but can | call out the selected part of the source. - The source | and render are in two different panels. Editing is done in | the source. So usually one can click part of the rendering, | and cursor jumps to the corresponding part of the source. | This introduces some friction, as the eyes have to do a jump, | and also a quick context switch. | svat wrote: | The screencast looks good! For parallel/prior work in this | sort of "live update" of the typeset document (and to learn | from their experiences), you may also want to look at: | | * SwiftLaTeX (https://github.com/SwiftLaTeX/SwiftLaTeX / | https://www.swiftlatex.com / | https://doi.org/10.1145/3209280.3209522 -- the cool demo that | used to be on their site seems to be gone, but see HN | discussion: https://news.ycombinator.com/item?id=21710105) | | * Texpad https://www.texpad.com/ | | * BaKoMa TeX (http://www.bakoma-tex.com/) -- its eponymous | author Basil K. Malyshev passed away recently, but the | product and page still exist for now | | * VorTeX (see Pehong Chen's PhD thesis from 1988 https://www2 | .eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-436... -- it | actually discusses the issues of quiescence, etc).
| GiovanniP wrote: | If you are working on a LaTeX successor you could be interested | in TeXmacs, which is a LaTeX successor which works very nicely | in many ways, except apparently selling itself well :-) You | could see there how it was designed and how the author answered | the questions you are asking. | pingiun wrote: | If you want something LaTeX-like, but with a wider happy path | you should try SILE | toastal wrote: | Speaking of typesetting... | | This article is incorrectly scaled for mobile. There's no padding | around the text so it butts up against the edge of my display. | The line widths are way too long for comfortable reading. The | blog entry also starts off with an unsemantic blockquote element | that quotes nothing from a source. | | But yes, Pandoc is a cool piece of software. | zichy wrote: | This together with some padding could help: | <meta name="viewport" content="width=device-width, initial-scale=1"> | yockyrr wrote: | OP here: thanks for the feedback, added padding and correct | scale. Should look better now! | toastal wrote: | +1 for using `ch` as your max-width unit and `rem` for your | font size. Pixels are definitely the wrong unit. I have my | user agent set up with slightly larger default font size to | be easier for me to read. I appreciate that you respect my | settings using these relative values. | mrweasel wrote: | Also weirdly enough, browsers also can't seem to go into reader | mode, to compensate. I've seen this before, but in this case it | seems a little weird that reader mode wouldn't work. ___________________________________________________________________ (page generated 2022-07-23 23:00 UTC)