Title: Vger security analysis
       Author: Solène
       Date: 14 January 2021
       Tags: vger gemini security
       Description: 
       
       I would like to share some Vger internals, and in particular how its
       security was designed to protect vger users and host systems.
       
 (HTM) Vger code repository
       
       # Thinking about security first
       
       I claim security is the main feature of Vger; I even wrote Vger in
       order to have a secure gemini server that I can trust. Why so? It's
       written in C and I'm a beginner developer in this language, so this
       looks like a scam.
       I chose to follow the best practices I'm aware of from the very first
       line. My goal is to be sure Vger can't be used to exfiltrate data from
       the host on which it runs, or to run arbitrary commands. While I may
       have missed corner cases in which it could crash, I think a crash is
       the worst that can happen with Vger.
       
       ## Smallest code possible
       
       Vger doesn't have to manage connections or TLS; this design choice
       alone removed a lot of code. There are better tools made exactly for
       this purpose, so it's time to reuse other people's good work.
       
       ## Inetd and user
       
       Vger is run by the inetd daemon, which allows choosing the user
       running vger. Using a dedicated user is always a good idea to limit
       the harm in case of an issue, but it's really not sufficient to
       prevent vger from behaving badly.
       Another security benefit is that the vger runtime isn't looping like
       a daemon awaiting new connections. Vger accepts a request, reads a
       file if it exists, gives its result and terminates. This is less
       error prone because no variable can be reused or tricked after a
       loop that could leave the code in an inconsistent or vulnerable
       state.
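
       To illustrate this run-once model, here is a minimal sketch of a
       function that answers a single request and returns. It is not Vger's
       actual code: the name `serve_file` and its parameters are assumptions
       made for the example. The status codes 20 (success, followed by the
       MIME type) and 51 (not found) come from the Gemini specification.

       ```c
       #include <assert.h>
       #include <stdio.h>

       /*
        * Hypothetical helper: answer one request by writing one file to
        * `out`, then return so the process can terminate.
        * Returns 0 if the file was served, -1 if it could not be opened.
        */
       int
       serve_file(const char *path, const char *mime, FILE *out)
       {
               char    buf[4096];
               size_t  n;
               FILE   *fp;

               assert(path != NULL && mime != NULL && out != NULL);

               if ((fp = fopen(path, "r")) == NULL) {
                       /* Gemini status 51: not found */
                       fprintf(out, "51 file not found\r\n");
                       return -1;
               }

               /* Gemini status 20: success, followed by the MIME type */
               fprintf(out, "20 %s\r\n", mime);
               while ((n = fread(buf, 1, sizeof(buf), fp)) > 0)
                       fwrite(buf, 1, n, out);

               fclose(fp);
               return 0;
       }
       ```

       A program built this way would call such a function once with stdout
       as `out` and then exit, so no state survives from one request to the
       next.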
       
       ## Chroot
       
       A critical vger feature is the ability to chroot into a directory,
       meaning the directory is now seen as the root of the file system
       (/var/gemini would be seen as /), which prevents vger from escaping
       it. In addition to the chroot itself, this feature allows vger to
       drop to an unprivileged user.
       
       ```C code showing the chroot feature
            /*
             * use chroot() if a user is specified; this requires the program
             * to be run as root to call chroot() and then drop privileges
             */
            if (strlen(user) > 0) {
       
                    /* is root? */
                    if (getuid() != 0) {
                            syslog(LOG_DAEMON, "chroot requires program to be run as root");
                            errx(1, "chroot requires root user");
                    }
                    /* search user uid from name */
                    if ((pw = getpwnam(user)) == NULL) {
                            syslog(LOG_DAEMON, "the user %s can't be found on the system", user);
                            err(1, "finding user");
                    }
                    /* chroot worked? */
                    if (chroot(path) != 0) {
                            syslog(LOG_DAEMON, "the chroot_dir %s can't be used for chroot", path);
                            err(1, "chroot");
                    }
                    chrooted = 1;
                    if (chdir("/") == -1) {
                            syslog(LOG_DAEMON, "failed to chdir(\"/\")");
                            err(1, "chdir");
                    }
                    /* drop privileges */
                    if (setgroups(1, &pw->pw_gid) ||
                        setresgid(pw->pw_gid, pw->pw_gid, pw->pw_gid) ||
                        setresuid(pw->pw_uid, pw->pw_uid, pw->pw_uid)) {
                            syslog(LOG_DAEMON, "dropping privileges to user %s (uid=%i) failed",
                                   user, pw->pw_uid);
                            err(1, "Can't drop privileges");
                    }
            }
       ```
       
       ## No use of third party libs
       
       Vger only requires standard C includes, which avoids placing trust in
       dozens of developers and in fragile or barely tested code.
       
       ## OpenBSD specific code
       
       In addition to all the previous security practices, OpenBSD offers a
       few functions that help restrict a lot what Vger can do.
       
       The first function is pledge, which restricts the system calls the
       program is allowed to make. The syscalls currently allowed in vger
       belong to the categories "rpath" and "stdio", basically standard
       input/output and reading files/directories only. This means that
       after pledge() is called, if any syscall outside those two categories
       is used, vger will be killed and a pledge error will be reported in
       the logs.
       
       The second function is unveil, which restricts access to the
       filesystem to only the paths you list, with the permissions you give.
       Currently, vger only allows read-only file access in the base
       directory used to serve files.
       
       Here is an extract of the OpenBSD specific code. If unveil were
       available everywhere, chroot wouldn't be required.
       
       ```C code with OpenBSD specific code
        #ifdef __OpenBSD__
                /* 
                 * prevent access to files other than the one in path 
                 */
                if (chrooted) {
                        eunveil("/", "r");
                } else {
                        eunveil(path, "r");
                }
                /*
                 * prevent system calls other than those needed to parse the
                 * query, read files and write to stdio
                 */
                if (pledge("stdio rpath", NULL) == -1) {
                        syslog(LOG_DAEMON, "pledge call failed");
                        err(1, "pledge");
                }
        #endif
       ```
       
       # The least code before dropping privileges
       
       I did my best to run the least code possible before reducing Vger
       capabilities. Only the code handling the parameters runs before
       activating chroot and/or unveil/pledge.
       
       ```C code showing the parameters parsing
       int
       main(int argc, char **argv)
       {
            char            request  [GEMINI_REQUEST_MAX] = {'\0'};
            char            hostname [GEMINI_REQUEST_MAX] = {'\0'};
            char            uri      [PATH_MAX]           = {'\0'};
            char            user     [LOGIN_NAME_MAX]     = "";
            int             virtualhost = 0;
            int             option = 0;
            char           *pos = NULL;
       
            while ((option = getopt(argc, argv, ":d:l:m:u:vi")) != -1) {
                    switch (option) {
                    case 'd':
                            estrlcpy(chroot_dir, optarg, sizeof(chroot_dir));
                            break;
                    case 'l':
                            estrlcpy(lang, "lang=", sizeof(lang));
                            estrlcat(lang, optarg, sizeof(lang));
                            break;
                    case 'm':
                            estrlcpy(default_mime, optarg, sizeof(default_mime));
                            break;
                    case 'u':
                            estrlcpy(user, optarg, sizeof(user));
                            break;
                    case 'v':
                            virtualhost = 1;
                            break;
                    case 'i':
                            doautoidx = 1;
                            break;
                    }
            }
       
            /*
             * do chroot if a user is supplied, run pledge/unveil if OpenBSD
             */
            drop_privileges(user, chroot_dir); 
       ```
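
       The estrlcpy() and estrlcat() helpers used in this extract are not
       shown here. Assuming they are wrappers that treat truncation as an
       error rather than silently cutting the string, the idea can be
       sketched as below. `copy_checked` is a hypothetical name, and it
       returns an error code where a real wrapper might exit the program
       instead.

       ```c
       #include <assert.h>
       #include <string.h>

       /*
        * Hypothetical stand-in for a checked string copy:
        * copy src into dst (dstlen bytes) only if it fits entirely,
        * terminating NUL included.
        * Returns 0 on success, -1 if src does not fit (dst is untouched).
        */
       int
       copy_checked(char *dst, const char *src, size_t dstlen)
       {
               size_t  len;

               assert(dst != NULL && src != NULL);

               len = strlen(src);
               if (len >= dstlen)
                       return -1;      /* would be truncated: refuse */

               memcpy(dst, src, len + 1);
               return 0;
       }
       ```

       With such a helper, a command line argument longer than its
       destination buffer is rejected before any privileged operation
       happens, instead of being shortened into a different path or user
       name.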
       
       # The Unix way
       
       Unix is made of small components that work together like small
       bricks to build something more complex. Vger is based on this idea:
       it delegates the listening daemon handling incoming requests to
       another piece of software (let's say relayd or haproxy). What's left
       of the gemini specification once you delegate TLS is to read a
       request and return some content, which is well suited to a program
       accepting a request on its standard input and giving the result on
       its standard output.
       Inetd is the key to making such a program work with a daemon like
       relayd or haproxy. When a connection is made to the TLS listening
       daemon, it is forwarded to a local port on which inetd listens;
       inetd then runs the command, passing the network content to the
       binary on its stdin.
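
       As an illustration of how little is left to do once the network and
       TLS are handled elsewhere, here is a minimal sketch of reading a
       request from a stream such as stdin and splitting it into a host and
       a path. It is not Vger's parser: `read_request` and `split_request`
       are hypothetical names, and the sketch ignores ports and query
       strings.

       ```c
       #include <assert.h>
       #include <stdio.h>
       #include <string.h>

       /*
        * Read one request line from `in` and strip the trailing CRLF.
        * Returns 0 on success, -1 on EOF or if no CRLF was found.
        */
       int
       read_request(FILE *in, char *request, int len)
       {
               char   *end;

               assert(in != NULL && request != NULL);

               if (fgets(request, len, in) == NULL)
                       return -1;
               /* a valid request ends with CRLF */
               if ((end = strstr(request, "\r\n")) == NULL)
                       return -1;
               *end = '\0';
               return 0;
       }

       /*
        * Split "gemini://host/path" into host and path.
        * Returns 0 on success, -1 on a bad scheme or a too small buffer.
        */
       int
       split_request(const char *request, char *host, size_t hostlen,
           char *path, size_t pathlen)
       {
               static const char scheme[] = "gemini://";
               const char     *start, *slash;
               size_t          n;

               if (strncmp(request, scheme, sizeof(scheme) - 1) != 0)
                       return -1;
               start = request + sizeof(scheme) - 1;

               /* the host ends at the first '/' or at the end of the string */
               slash = strchr(start, '/');
               n = (slash != NULL) ? (size_t)(slash - start) : strlen(start);
               if (n == 0 || n >= hostlen)
                       return -1;
               memcpy(host, start, n);
               host[n] = '\0';

               /* no path at all means the root of the site */
               if (slash == NULL)
                       slash = "/";
               if (strlen(slash) >= pathlen)
                       return -1;
               memcpy(path, slash, strlen(slash) + 1);
               return 0;
       }
       ```

       Because the functions only see a FILE pointer and strings, the same
       code works whether the request comes from inetd or from a pipe in a
       terminal.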
       
       # Fine grained CGI
       
       CGI support was added to allow Vger to produce dynamic content
       instead of serving only static files. It has fine grained control:
       you can allow only one file to be executed as a CGI, or a whole
       directory of files. When serving a CGI, vger forks, a pipe is opened
       between the two processes, and one of them uses execlp to run the
       CGI and transmit its output to vger.
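
       The fork, pipe and execlp sequence described above can be sketched
       as follows. This is a simplified illustration under my own
       assumptions, not Vger's implementation: `run_cgi` is a hypothetical
       name, the program gets no arguments or environment, and its output
       is collected in a fixed-size buffer.

       ```c
       #include <assert.h>
       #include <sys/types.h>
       #include <sys/wait.h>
       #include <unistd.h>

       /*
        * Hypothetical helper: run `prog` and copy what it prints on stdout
        * into `out`. Returns the number of bytes stored, or -1 on failure
        * (pipe/fork error, or the program not exiting with status 0).
        */
       ssize_t
       run_cgi(const char *prog, char *out, size_t outlen)
       {
               int     fds[2];
               int     status;
               pid_t   pid;
               ssize_t n;
               size_t  total = 0;

               assert(prog != NULL && out != NULL);

               if (pipe(fds) == -1)
                       return -1;

               if ((pid = fork()) == -1) {
                       close(fds[0]);
                       close(fds[1]);
                       return -1;
               }

               if (pid == 0) {
                       /* child: send stdout into the pipe, then run the CGI */
                       close(fds[0]);
                       if (dup2(fds[1], STDOUT_FILENO) == -1)
                               _exit(126);
                       close(fds[1]);
                       execlp(prog, prog, (char *)NULL);
                       _exit(127);     /* only reached if execlp() failed */
               }

               /* parent: read until the child closes its end of the pipe */
               close(fds[1]);
               while (total < outlen &&
                   (n = read(fds[0], out + total, outlen - total)) > 0)
                       total += (size_t)n;
               close(fds[0]);

               if (waitpid(pid, &status, 0) == -1)
                       return -1;
               if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                       return -1;

               return (ssize_t)total;
       }
       ```

       The parent only ever reads from the pipe, so the CGI program runs in
       its own process and a crash there does not take the server process
       down with it.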
       
       # Using tests
       
       From the beginning, I wrote a set of tests to be sure that once a
       kind of request or a use case works, I can easily check I didn't
       break it. This isn't about security but about reliability. When I
       push a new version to the git repository, I am absolutely confident
       it will work for the users. The tests were also an invaluable help
       while writing Vger.
       As vger is a simple binary that accepts data on stdin and outputs
       data on stdout, it is simple to write such tests. The following
       example runs vger with a request; as the content is local and within
       the git repository, the output is predictable and known.
       
       ```Shell command to run vger for testing purpose using a pipe
       printf "gemini://host.name/autoidx/\r\n" | vger -d var/gemini/
       ```
       
       From here, it's possible to build an automatic test by comparing the
       checksum of the output to the checksum of the known correct output.
       Of course, when you add a new use case, you have to generate the
       checksum manually once to use it as a reference later.
       
       ```Shell command comparing vger output to a checksum
       OUT=$(printf "gemini://host.name/autoidx/\r\n" | ../vger -d var/gemini/ -i | md5)
       if ! [ "$OUT" = "770a987b8f5cf7169e6bc3c6563e1570" ]
       then
               echo "error"
               exit 1
       fi
       ```
       
       At this time, vger has 19 use cases in its test suite.
       
       By using the program `entr` and a Makefile to manage the build
       process, it was very easy to trigger the tests while working on the
       source code, allowing me to run the test suite just by saving my
       current changes. Anytime a .c file is modified, entr triggers a
       make test command whose output is displayed in a dedicated terminal.
       
       ```shell command using the command "entr" to auto rebuild the project
       ls *.c | entr make test
       ```
       
       Realtime integration tests? :)
       
       # Conclusion
       
       By using best practices, reducing the amount of code and using only
       system libraries, I am quite confident about Vger's good security.
       The only real issue could be too many connections leading to a high
       load, due to inetd spawning new processes, and thus to a denial of
       service. This could be avoided by throttling simultaneous
       connections in the TLS daemon.
       
       If you want to contribute, please do. And if you find a security
       issue, please contact me; I'll be glad to examine it.