Title: Port of the week: pup
       Author: Solène
       Date: 22 April 2021
       Tags: internet
       Description: 
       
       # Introduction
       
       Today I will introduce you to the utility "pup", which filters HTML
       documents using CSS selectors.  It is a perfect companion to curl for
       extracting only specific data from an HTML page.
       
       On OpenBSD you can install it with `pkg_add pup` and check its
       documentation at /usr/local/share/doc/pup/README.md
       
 (HTM) pup official project
       
       # Examples
       
       pup is quite easy to use once you understand the filters.  Let's see a
       few examples to illustrate practical uses.
       
       ## Fetch my blog titles list in JSON format
       
       The following command returns a JSON array describing every "a" tag
       found within an "h4" tag.
       
       ```command line example
       curl https://dataswamp.org/~solene/index.html | pup "h4 a json{}"
       ```
       
       The output (only an extract here) looks like this:
       
       ```output truncated
       [
        {
         "href": "2021-04-18-ipfs-bandwidth-mgmt.html",
         "tag": "a",
         "text": "Bandwidth management in go-IPFS"
        },
        {
         "href": "2021-04-17-ipfs-openbsd.html",
         "tag": "a",
         "text": "Introduction to IPFS"
        },
        [truncated]
        {
         "href": "2016-05-02-3.html",
         "tag": "a",
         "text": "How to add a route through a specific interface on FreeBSD 10"
        }
       ]
       ```
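When only one field is needed, pup's `text{}` and `attr{...}` display functions can replace `json{}` and print one value per line. Here is a minimal sketch, assuming pup is installed; the helper names `link_titles` and `link_targets` are hypothetical and only wrap the filters.

```sh
# Sketch: the function names are hypothetical, only the filters belong to pup.

# Read HTML on stdin and print the text of each link inside an h4.
link_titles() {
    pup 'h4 a text{}'
}

# Read HTML on stdin and print the href attribute of the same links.
link_targets() {
    pup 'h4 a attr{href}'
}
```

For example, `curl -s https://dataswamp.org/~solene/index.html | link_titles` should list the article titles without the surrounding JSON.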
       
       ## Fetch OpenBSD -current specific changes
       
       The page https://www.openbsd.org/faq/current.html contains specific
       instructions that are required for people using OpenBSD -current and
       you may want to be notified of changes.  Using pup, it's easy to write
       a script that compares the current headings with the ones you saved
       last time to see what has been added.
       
       ```command line
       curl https://www.openbsd.org/faq/current.html | pup "h3 json{}"
       ```
       
       Output sample as JSON, perfect for further processing with a scripting
       language.
       
       ```JSON output sample
       [
        {
         "id": "r20201107",
         "tag": "h3",
         "text": "2020/11/07 - iked.conf \u0026#34;to dynamic\u0026#34;"
        },
        {
         "id": "r20210312",
         "tag": "h3",
         "text": "2021/03/12 - IPv6 privacy addresses renamed to temporary addresses"
        },
        {
         "id": "r20210329",
         "tag": "h3",
         "text": "2021/03/29 - [packages] yubiserve replaced with yubikeyedup"
        }
       ]
       ```
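The comparison idea could be sketched as follows. This is only one possible approach, not necessarily how the author does it: it stores the headings as plain text with pup's `text{}` display function and uses grep to print the lines that were not in the previous run. The `STATE` path and the function names are assumptions made for the example.

```sh
#!/bin/sh
# Sketch only: STATE, fetch_headings, new_entries and check_current are
# hypothetical names chosen for this example.

STATE="${STATE:-$HOME/.openbsd-current-headings}"

# Print the text of every h3 heading of the page, one per line.
fetch_headings() {
    curl -s https://www.openbsd.org/faq/current.html | pup 'h3 text{}'
}

# Print the lines of file $2 that do not appear verbatim in file $1.
new_entries() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # grep exits non-zero when nothing is new; "|| true" hides that status.
    grep -Fxv -f "$1" "$2" || true
}

# Fetch the headings, show what was added since the last run, save the state.
check_current() {
    latest=$(mktemp) || return 1
    fetch_headings > "$latest"
    # Keep the previous state if the download or the filter produced nothing.
    [ -s "$latest" ] || { rm -f "$latest"; return 1; }
    if [ -s "$STATE" ]; then
        new_entries "$STATE" "$latest"
    else
        # First run: every heading is new.
        cat "$latest"
    fi
    mv "$latest" "$STATE"
}
```

Calling `check_current` from cron would then print only the headings added since the previous run, which cron can mail to you.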
       
 (HTM) I provide a RSS feed for that
       
       # Conclusion
       
       There are many possibilities with pup and I won't list them all.  I
       highly recommend reading the README.md file from the project, because
       it is the tool's documentation and explains the filtering syntax.