-------------------------------------------------
Title: Saturday Hacking
Date: 2022-02-19
Device: Laptop
Mood: Relaxed
-------------------------------------------------

I don't know where the day has gone. I had a little lie-in today; I
woke up at about 0800, which was nice. The weather has been wild here
today: when I woke up it was incredibly windy, then we cycled through
hail and about two hours of snow, and now it's sunny and back to
7 Celsius.

I've spent most of the day working on some BBC News scraper code.
I've previously tackled this with various libraries that do article
identification (similar to readability.js), but this time I decided
to take a straight scrape-and-parse-the-HTML approach, and it's been
a lot more successful. Previously I would have problems with things
like subheadline identification, or telling when some text is an
image caption or a pull-quote rather than body text. Using CSS
selectors seems to give me the control I wanted, and the code is
actually vanishingly small: only about 12 lines of BeautifulSoup code
among the rest of the boilerplate.

My original intention was to generate some bare HTML so I could read
the news distraction-free on my laptop, but then I realised it would
be easier to just publish some plain text and read it using w3m, with
a Gophermap as the index:

gopher://sdf.org/1/users/cside/bbc/

I'm still a complete Gopher noob, so I'm sure there are improvements
to be made here; I'd love suggestions if anyone finds this useful.
Email me at cside@sdf.org if you have some.

Question: How do I use hyphens as section breaks or heading
underlines in a gophermap? Like this:

Heading
-------

I've read the spec and I can't figure out what I'm doing wrong.

--C
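
For anyone curious how a Gophermap index like the one mentioned above might be put together, here is a minimal sketch. It is not the actual scraper code. It assumes the common gophermap convention where an entry is an item type character followed by the display text, then the selector, host and port separated by tabs, and where a line without tabs is shown as plain info text. The host and base selector are taken from the URL above; the helper names (menu_line, build_gophermap), the file names and the titles are made up for illustration.

```python
from typing import Iterable, Tuple

HOST = "sdf.org"                      # taken from the gopher:// URL above
PORT = 70                             # the default Gopher port
BASE_SELECTOR = "/users/cside/bbc/"   # taken from the gopher:// URL above


def menu_line(item_type: str, display: str, selector: str,
              host: str = HOST, port: int = PORT) -> str:
    """Format one entry as: type+display TAB selector TAB host TAB port."""
    # Tabs separate the fields, so they must not appear in the display text.
    display = display.replace("\t", " ")
    return f"{item_type}{display}\t{selector}\t{host}\t{port}"


def build_gophermap(articles: Iterable[Tuple[str, str]]) -> str:
    """Build gophermap text from (title, filename) pairs.

    Assumption: lines without a tab are rendered as info text by the server.
    """
    lines = ["BBC News, plain text", ""]
    for title, filename in articles:
        # Item type "0" marks a plain-text file.
        lines.append(menu_line("0", title, BASE_SELECTOR + filename))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical articles, purely for illustration.
    sample = [
        ("Example headline one", "article-001.txt"),
        ("Example headline two", "article-002.txt"),
    ]
    print(build_gophermap(sample), end="")
```

Writing the result to a file named gophermap in the directory served at that selector would, under the assumptions above, give one link per article. Servers differ in how they parse this file, so it is worth checking against whatever daemon is actually in use.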