Title: Host your own wikipedia backup
       Author: Solène
       Date: 13 November 2019
       Tags: openbsd wikipedia life
       Description: 
       
       ## Wikipedia and openzim
       
       If you ever wanted to host your own Wikipedia replica, here is the
       simplest way.
       
       As Wikipedia is REALLY huge, you don't really want to host the PHP
       MediaWiki software and load the huge database. Instead, the project
       made the *openzim* format to compress the huge database that Wikipedia
       became, while still allowing fast searches in it.
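
To make the format a bit more concrete, here is a minimal sketch that reads the first 32 bytes of a zim file and checks its magic number. The field layout is the one I understand from the openzim file format description, and `read_zim_header` is a hypothetical helper name, not something provided by openzim or any package used in this article.

```python
import struct

# "ZIM\x04" read as a little-endian 32-bit integer (0x044D495A)
ZIM_MAGIC = 72173914

# Assumed layout of the start of the header, little-endian:
# magic (uint32), major and minor version (uint16 each),
# UUID (16 raw bytes), entry count and cluster count (uint32 each).
HEADER_PREFIX = struct.Struct("<IHH16sII")


def read_zim_header(path):
    """Return a few header fields of a zim file, or raise ValueError."""
    with open(path, "rb") as f:
        raw = f.read(HEADER_PREFIX.size)
    if len(raw) < HEADER_PREFIX.size:
        raise ValueError("file too short to be a zim archive")
    magic, major, minor, uuid, entries, clusters = HEADER_PREFIX.unpack(raw)
    if magic != ZIM_MAGIC:
        raise ValueError("bad magic number, not a zim archive")
    return {
        "version": (major, minor),
        "uuid": uuid.hex(),
        "entries": entries,
        "clusters": clusters,
    }
```

Calling `read_zim_header("/path/to/file.zim")` on a multi-gigabyte download is a cheap way to notice a truncated or wrong file before going any further.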
       
       Sadly, on OpenBSD, we have no software reading zim files, and most
       software requires the openzim library, which needs extra work to be
       packaged on OpenBSD.
       
       Fortunately, there is a Python package implementing all you need in
       pure Python to serve zim files over HTTP, and it's easy to install.
       
       This tutorial should work on all other Unix-like systems, but package
       or binary names may differ.
       
       
       ## Downloading wikipedia
       
       The Kiwix project is responsible for the Wikipedia files. They
       regularly create files from various projects (including Stack
       Exchange, Gutenberg, Wikibooks, etc.), but for this tutorial we want
       Wikipedia:
       [https://wiki.kiwix.org/wiki/Content_in_all_languages](https://wiki.kiwix.org/wiki/Content_in_all_languages)
       
       You will find a lot of files; the language is part of the filename.
       Some filenames also tell whether they contain everything or only some
       categories, and whether they include pictures or not.
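
As an illustration of how much the filename tells you, here is a small sketch that splits a name such as `wikipedia_fr_all_maxi_2019-08.zim` into its parts. The `project_language_selection_flavour_date` layout and the flavour names (`maxi`, `mini`, `nopic`) are assumptions based on the files I have seen, not a guaranteed convention, and `describe_zim_name` is a hypothetical helper.

```python
from pathlib import Path

# Assumed flavour names; other files may use different ones or none at all.
FLAVOURS = {"maxi", "mini", "nopic"}


def describe_zim_name(filename):
    """Split a Kiwix-style zim filename into its assumed components."""
    parts = Path(filename).stem.split("_")
    if len(parts) < 4:
        raise ValueError("unexpected file name: " + str(filename))
    project, language, *middle, date = parts
    flavour = None
    # The flavour, when present, is assumed to sit right before the date.
    if len(middle) > 1 and middle[-1] in FLAVOURS:
        flavour = middle.pop()
    return {
        "project": project,
        "language": language,
        "selection": "_".join(middle),
        "flavour": flavour,
        "date": date,
    }


print(describe_zim_name("wikipedia_fr_all_maxi_2019-08.zim"))
```

Names that don't follow this layout will either raise `ValueError` or end up with everything between the language and the date in `selection`, so treat the result as a hint only.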
       
       The full French file weighs 31.4 GB.
       
       
       ## Running the server
       
       For the next steps, I recommend setting up a new user dedicated to
       this.
       
       On OpenBSD, we will require python3 and pip:
       
           $ doas pkg_add py3-pip--
       
       Then we can use pip to fetch and install zimply and its dependencies.
       The `--user` flag is rather important, as it lets any user download
       and install Python libraries in their own home folder instead of
       polluting the whole system as root.
       
           $ pip3.7 install --user --upgrade zimply 
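
If you are curious where `--user` puts things, the standard `site` module can tell you. This is only a quick sketch to look around; the exact paths depend on your system and Python version, and `user_install_paths` is a hypothetical helper name.

```python
import site


def user_install_paths():
    """Return the per-user base directory and site-packages directory."""
    return {
        "base": site.getuserbase(),
        "site_packages": site.getusersitepackages(),
    }


for name, path in user_install_paths().items():
    print(name + ": " + path)
```

On Unix-like systems the base is usually `~/.local`, so nothing outside the dedicated user's home directory gets modified.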
       
       I wrote a small script to start the server using the zim file as a
       parameter. I rarely write Python, so the script may not be up to high
       standards.
       
       File **server.py**:
       
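           from zimply import ZIMServer
           import sys
           import os.path
       
           # the zim file must be given as the first argument
           if len(sys.argv) == 1:
               print("usage: " + sys.argv[0] + " file")
               sys.exit(1)
       
           # only start the server if the file exists
           if os.path.exists(sys.argv[1]):
               ZIMServer(sys.argv[1])
           else:
               print("Can't find file " + sys.argv[1])
               sys.exit(1)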
       
       And then you can start the server using the command:
       
           $ python3.7 server.py /path/to/wikipedia_fr_all_maxi_2019-08.zim
       
       You will be able to access Wikipedia at the URL http://localhost:9454/
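
To check from a script that the server is answering (for example after starting it at boot), something like the sketch below should be enough. It only uses the standard library; `server_answers` is a hypothetical helper, and I am assuming the server replies with a plain 200 status on its root URL, possibly after a redirect.

```python
import urllib.error
import urllib.request


def server_answers(url="http://localhost:9454/", timeout=5):
    """Return True if an HTTP server replies with status 200 at url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # connection refused, timeout, HTTP error status, etc.
        return False
```

Calling `server_answers()` with no argument targets the URL given above; pass another URL if you run the server elsewhere.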
       
       Note that this is not a "wiki", as you can't see the history or
       edit/create pages.
       
       This kind of backup is used in places like Cuba or parts of Africa
       where people don't have unlimited internet access. The project led by
       Kiwix allows more people to access knowledge.