dataswamp.org

       Title: Toward an automated tracking of OpenBSD ports contributions
       Author: Solène
       Date: 15 November 2020
       Tags: openbsd automation
       Description: 
       
       Since my previous article about a continuous integration service to
       track OpenBSD ports contributions, I made a simple proof of concept
       that allowed me to find out what works and what doesn't.
       
       ## The continuous integration goal
       
       A first step for the CI service would be to create a database of diffs
       sent to ports. This would allow people to track what has been sent but
       not yet committed, and what the state of each contribution is (builds
       or doesn't build, applies or doesn't apply). I would proceed following
       this logic:
       
       * a mail arrives and is sent to the pipeline
       * it's possible to find a pkgpath out of the file
       * the diff applies
       * distfiles can be fetched
       * portcheck is happy
       
       Step 1 is easy: mails could be dumped into a directory that gets
       scanned every X minutes.
       
       Step 2 is already done in my POC using a shell script. It's quite hard
       and required tuning, because submitted diffs are made with diff(1),
       cvs diff or git diff. The important part is to retrieve the pkgpath,
       like "lang/php/7.4". This allows testing that the port exists.
       
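       As an illustration only (this is not the POC script), the pkgpath
       extraction could be sketched as below. The function name
       extract_pkgpath, the "/cvs/ports/" repository prefix and the list of
       port subdirectories are assumptions. A diff made from inside the port
       directory carries no usable path, so the function fails in that case.

       ```shell
       #!/bin/sh
       # Hypothetical helper: read a diff on stdin, print a candidate pkgpath.
       # 1. collect file paths from the three header styles:
       #      cvs diff: RCS file: /cvs/ports/<path>,v
       #      git diff: +++ b/<path>
       #      diff(1):  +++ <path>, optionally followed by a date
       # 2. drop the file name, and patches/, pkg/ or files/ when present
       # 3. keep the first relative path that still contains a category
       extract_pkgpath() {
           pkgpath=$(
               sed -n \
                   -e 's|^RCS file: /cvs/ports/\(.*\),v$|\1|p' \
                   -e 's|^+++ b/\([^[:space:]]*\).*|\1|p' \
                   -e 's|^+++ \([^[:space:]]*\).*|\1|p' |
               sed -E 's#/((patches|pkg|files)/)?[^/]*$##' |
               grep '^[^/][^/]*/' |
               head -n 1
           )
           # Fail when nothing usable was found.
           [ -n "$pkgpath" ] && printf '%s\n' "$pkgpath"
       }
       ```

       The result can then be checked against the ports tree, for example
       with "test -d /usr/ports/$pkgpath".
       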
       Step 3 is important. I found three cases so far when applying a diff:
       
       * it works, so we can register in the database that it can be used to
       build
       * it doesn't work, human investigation is required
       * the diff is already applied and patch thinks you want to reverse it:
       it's already committed!
       
       Being able to check if a diff is already applied is really useful.
       Once the contributions database is built, a daily check of the patches
       that are known to apply can be done. If a reversed patch is detected,
       this means it's committed and the entry can be deleted from the
       database. This would keep the database clean automatically over time.
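       
       The three cases can be told apart with dry runs only, so the tree is
       never modified. The sketch below is one possible way to do it, not the
       POC code: the function name patch_state and the default strip level of
       0 are assumptions, and it relies on patch accepting --dry-run, -f and
       -R, which both OpenBSD patch(1) and GNU patch should do.

       ```shell
       #!/bin/sh
       # Hypothetical helper: classify a diff against the tree in the current
       # directory. Prints "applies", "committed" or "broken".
       patch_state() {
           diff_file=$1
           strip=${2:-0}   # -p level, depends on how the diff was made

           # -f prevents patch from asking whether to assume -R
           if patch -p"$strip" -f -s --dry-run < "$diff_file" >/dev/null 2>&1
           then
               echo applies
           # If only the reversed diff applies, the change is already there
           elif patch -p"$strip" -f -s -R --dry-run < "$diff_file" >/dev/null 2>&1
           then
               echo committed
           else
               echo broken
           fi
       }
       ```

       A daily job could run this on every stored diff and drop the entries
       reported as "committed". Diffs that only apply with fuzz may be
       misclassified, so a human look at the "broken" ones is still needed.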
       
       Step 4 is an inexpensive extra check to be sure the distfiles can be
       downloaded over the internet.
       
       Step 5 is also an inexpensive check: running portcheck can report
       easy-to-fix mistakes.
       
       All the steps only require a ports tree. Only step 4 could be abused
       by someone malicious, using a patch to make the system download huge
       files or files with legal concerns, but that message would also appear
       on the mailing list, so the risk is quite limited.
       
       To go further in the automation, building the port is required, but
       it must be done in a clean virtual machine. We could then report into
       the database whether the diff produced a package correctly and, if
       not, provide the compilation log.
       
       ## Automatic VM creation
       
       Automatically creating an OpenBSD-current virtual machine was tricky
       but I've been able to sort this out using vmm, rsync and upobsd.
       
       The script downloads the latest sets using rsync; that directory is
       served by a web server. I use upobsd to create an automatic
       installation with a bsd.rd including my autoinstall file. Then it
       gets tricky :)
       
       The vmm VM must be started with both its storage disk AND the bsd.rd.
       As it's an automatic install, it reboots once the install finishes,
       boots bsd.rd again, and then installs again and again.
       
       I found that using the parameter "-B disk" makes the VM shut down
       after the installation instead of rebooting, for some reason. I can
       then wait for the VM to stop and start it again without bsd.rd.
       
       My vmm VM creation sequence:
       
       ```shell commands to generate an OpenBSD virtual machine
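       # fetch bsd.rd from the local mirror and embed the autoinstall file
       upobsd -i autoinstall-vmm-openbsd -m http://localhost:8080/pub/OpenBSD/
       # make sure no previous instance of the VM is still running
       vmctl stop -f -w integration
       # first boot: run the automatic installation from bsd.rd
       vmctl start -B disk -m 1G -L -i 1 -d main.qcow2 -b autobuild_vm/bsd.rd integration
       # block until the VM has stopped after the installation
       vmctl wait integration
       # second boot: start the installed system from its disk
       vmctl start -m 1G -L -i 1 -d main.qcow2 integration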
       ```
       
       The whole process is long though. A derived qcow2 image could be used
       after creation to try each port faster, until we want to update the VM
       again.
       Multiple VMs could be used at once to test in parallel and make good
       use of host resources.
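       
       A possible sketch of the derived image idea, assuming "vmctl create
       -b" and the "qcow2:" format prefix behave as described in vmctl(8).
       The helper name start_derived_vm, the VMCTL variable and the VM names
       are made up for the example; the start flags are the ones used above.

       ```shell
       #!/bin/sh
       # VMCTL can be overridden (VMCTL=echo) to preview the commands.
       VMCTL=${VMCTL:-vmctl}

       # Hypothetical helper: boot a throwaway VM on a disk derived from a
       # base image. The base image itself is not modified.
       start_derived_vm() {
           name=$1
           base=$2
           disk="$name.qcow2"
           # The derived image only records changes relative to $base
           $VMCTL create -b "$base" "qcow2:$disk" || return 1
           $VMCTL start -m 1G -L -i 1 -d "$disk" "$name"
       }

       # Example: three test VMs sharing the same installed system
       # for i in 1 2 3; do start_derived_vm "test$i" main.qcow2; done
       ```

       Deleting the derived image after a test discards every change made
       by the build, and the base image must stay untouched for as long as
       derived images exist.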
       
       
       ## What's done so far
       
       I'm currently able to deposit emails as files in a directory and run
       a script that will extract the pkgpath, try to apply the patch,
       download the distfiles, run portcheck and run the build on the host
       using PORTS_PRIVSEP. If the port compiles fine, the email file is
       deleted and a proper diff is made from the port and moved into a
       staging directory where I'll review the diffs known to work.
       The script stops on a blocking error and writes a short text report
       for each port. I intended to send this as a reply to the mailing list
       at first, but maintaining a parallel website for people working on
       ports seems a better idea.
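       
       The "stop on blocking error and write a short report" behaviour could
       be sketched like this. It is a simplified stand-in, not the actual
       script: the function name run_steps, the "label=command" argument
       format and the report layout are assumptions.

       ```shell
       #!/bin/sh
       # Hypothetical driver: run steps in order, stop at the first failure
       # and write one report line per step that was attempted.
       # Usage: run_steps report_file "label=command" ...
       run_steps() {
           report=$1
           shift
           : > "$report"
           for step in "$@"; do
               label=${step%%=*}   # text before the first "="
               cmd=${step#*=}      # everything after the first "="
               if sh -c "$cmd" >/dev/null 2>&1; then
                   echo "$label: ok" >> "$report"
               else
                   echo "$label: FAILED" >> "$report"
                   return 1
               fi
           done
       }

       # Example with the checks from this article (paths are assumptions):
       # port=/usr/ports/lang/php/7.4
       # run_steps report.txt \
       #     "apply=cd /usr/ports && patch -p0 -f -s < /tmp/mail.diff" \
       #     "fetch=cd $port && make fetch" \
       #     "portcheck=cd $port && /usr/ports/infrastructure/bin/portcheck" \
       #     "build=cd $port && make package"
       ```

       The report file lists every step that passed and ends with the one
       that failed, which is enough to publish on a status page.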