dataswamp.org

       Title: Introduction to IPFS
       Author: Solène
       Date: 17 April 2021
       Tags: openbsd ipfs
       Description: 
       
       # Introduction to IPFS
       
       IPFS is a distributed storage network protocol that comes with a public
       network.  Anyone can run a peer, access content from IPFS, and relay
       that content for as long as it stays in the peer's cache.
       
       Gateways are websites that give access to IPFS content over HTTP.
       Several public gateways let you fetch data from IPFS without running
       a peer yourself.
       
       Every published content has a unique CID that identifies it.  We
       usually prefix it with /ipfs/, as in
       /ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1.  The CID is
       derived from the content itself, so if someone adds the same file from
       another peer, they will get the same hash as you.
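
       The "same content, same identifier" property can be illustrated
       without a node.  The sketch below is only an analogy: it uses the
       POSIX cksum utility as a stand-in, whereas a real CID is built from a
       cryptographic hash of the content.  The helper name content_id and the
       file names are made up for this example.

       ```shell
       # Analogy only: cksum (a CRC) stands in for the hash behind a CID.
       # content_id is a hypothetical helper, not an IPFS command.
       content_id() {
           # Read the file on stdin so its name is not part of the output.
           cksum < "$1" | cut -d ' ' -f 1
       }

       tmpdir=$(mktemp -d)
       printf 'hello ipfs\n'  > "$tmpdir/mine.txt"
       printf 'hello ipfs\n'  > "$tmpdir/copy-from-another-peer.txt"
       printf 'hello ipfs!\n' > "$tmpdir/changed.txt"

       id_mine=$(content_id "$tmpdir/mine.txt")
       id_copy=$(content_id "$tmpdir/copy-from-another-peer.txt")
       id_changed=$(content_id "$tmpdir/changed.txt")
       rm -rf "$tmpdir"

       # Same bytes give the same identifier, whatever the file name.
       [ "$id_mine" = "$id_copy" ] && echo "same content, same id"
       # Changing the content changes the identifier.
       [ "$id_mine" != "$id_changed" ] && echo "different content, different id"
       ```

       On a real node, "ipfs add --only-hash somefile" should print the CID
       a file would get without storing it.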
       
       If you add a whole directory to IPFS, the hash of the top directory
       depends on the hashes of its content.  This means that if you want to
       share a directory such as a blog, you need to publish a new CID every
       time you change the content.  As this is not practical at all, there
       is an alternative that makes the process more dynamic.
       
       A peer can publish data under a long name called an IPNS.  The IPNS
       string never changes (it's tied to a private key), but you can
       associate a CID with it, update that value whenever you want, and then
       tell other peers that the value changed (this is called publishing).
       The IPNS notation looks like
       /ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z.
       You can also access IPNS content through public gateways, using a
       different notation.
       
       - IPNS gateway use example:
       https://k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z.ipns.dweb.link/
       - IPFS gateway use example:
       https://ipfs.io/ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1/
       
       The IPFS link will ALWAYS return the same content because its hash
       identifies one specific resource.  The IPNS link can be updated to
       point to a newer CID over time, allowing people to bookmark the
       location and come back to it for updates later.
       
       # Using a public gateway
       
       There are many public gateways you can use to fetch content.
       
 (HTM) Health check of public gateways, useful to pick one
       
       You will find two kinds of gateway URLs, one like "https://$domain/"
       and the other like "https://$something_very_long.ipfs.$domain/".  For
       the first kind, you append your /ipfs/something or /ipns/something
       request as in the previous examples.  The second kind only works with
       IPNS in a web browser, because the browser treats the CID as a domain
       name and changes the case of its letters, so it's no longer valid.
       When using an IPNS name this way, be careful to replace .ipfs. with
       .ipns. in the URL to tell the gateway what kind of request you are
       making.
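
       The two URL styles can be sketched with a pair of small shell
       helpers.  The function names are made up for this example; the
       domains and identifiers are the ones used earlier in this article.

       ```shell
       # Hypothetical helpers building the two gateway URL styles.
       # $1 = gateway domain, $2 = namespace (ipfs or ipns), $3 = CID or IPNS name

       # Path style: https://$domain/ipfs/<cid>/ or https://$domain/ipns/<name>/
       path_gateway_url() {
           printf 'https://%s/%s/%s/\n' "$1" "$2" "$3"
       }

       # Subdomain style: https://<name>.ipns.$domain/
       subdomain_gateway_url() {
           printf 'https://%s.%s.%s/\n' "$3" "$2" "$1"
       }

       path_gateway_url ipfs.io ipfs QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1
       subdomain_gateway_url dweb.link ipns k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z
       ```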
       
       # Using your own node
       
       First, be aware that there is no real bandwidth control mechanism and
       that IPFS is known to create more connections than small routers can
       handle.  On OpenBSD it's possible to mitigate this behavior using
       queuing.  It's also possible to use a "lowpower" profile that is less
       demanding on network and resources, but be aware that it degrades IPFS
       performance.  I found that after a few hours of bootstrapping and
       reaching many peers, the bandwidth usage becomes less significant, but
       it may still be an issue for DSL connections like mine.
       
       When you run your own node, you can use its own gateway or the
       command line client.  When you request data that your node doesn't
       have, it is downloaded from known peers able to distribute the blocks
       and kept in your cache until the cache reaches its defined limit and
       the garbage collector comes to make some room.  This means that once
       you fetch some content, you start distributing it, but nobody will
       use your node for content you never fetched first.
       
       When you have data, you can "pin" it so it will never be removed from
       cache, and if you pin a directory CID, the content will be downloaded
       so you have a whole mirror of it.  When you add data to your node, it's
       automatically pinned by default.
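
       Pinning from the command line can be sketched as below.  This is a
       minimal example, not a complete tool: the helper names and the
       IPFS_BIN variable are assumptions made for illustration, wrapped
       around the "ipfs pin" subcommands.

       ```shell
       # Sketch only.  IPFS_BIN is an assumption of this example: it lets you
       # point the helpers at another binary, or at "echo" for a dry run.
       IPFS_BIN="${IPFS_BIN:-ipfs}"

       # Pin a CID so the garbage collector keeps it.  Pins are recursive by
       # default, so a directory CID gets all its content fetched.
       mirror() {
           "$IPFS_BIN" pin add "$1"
       }

       # Remove the pin; the blocks can be garbage collected again.
       unmirror() {
           "$IPFS_BIN" pin rm "$1"
       }

       # List what is pinned recursively on this node.
       list_mirrors() {
           "$IPFS_BIN" pin ls --type=recursive
       }
       ```

       For example, "mirror /ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1"
       would keep a full copy of that content on your node.  On OpenBSD,
       remember to set IPFS_PATH=/var/go-ipfs so the command talks to the
       daemon's repository.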
       
       The default swarm port is 4001; it's the one you need to expose to the
       internet, and possibly forward if you are behind a NAT.  The Web GUI
       is available at http://localhost:5001/ and the gateway is available
       at http://localhost:8080/
       
       ## Installing the node on OpenBSD
       
       To make it brief, there are instructions in the provided pkg-readme,
       but I will give a few pieces of advice (that I may add to the
       pkg-readme later).
       
       ```OpenBSD installation instructions
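       # install the package and create the repository as the daemon user
       pkg_add go-ipfs
       su -l -s /bin/sh _go-ipfs -c "IPFS_PATH=/var/go-ipfs /usr/local/bin/ipfs init"
       rcctl enable go_ipfs
       
       # recommended settings
       rcctl set go_ipfs flags --routing=dhtclient --enable-namesys-pubsub
       
       # raise the open files limit for the daemon login class
       # (EOF is quoted so the trailing backslashes are written as-is)
       cat <<'EOF' >> /etc/login.conf
       go_ipfs:\
               :openfiles=2048:\
               :tc=daemon:
       EOF
       rcctl start go_ipfs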
       ```
       
       You can change the profile to lowpower with "env
       IPFS_PATH=/var/go-ipfs/ ipfs config profile apply lowpower".  You can
       also list profiles with the ipfs command.
       
       I recommend using queues in PF to limit the bandwidth usage, for my DSL
       connection I've set a maximum of 450K and it doesn't disrupt my network
       anymore.  I explained how to proceed with queuing and bandwidth
       limitations in a previous article.
       
       ## Installing the node on NixOS
       
       Installing IPFS is easy on NixOS thanks to its declarative
       configuration.  In this example, the system has a local IPv4 address
       of 192.168.1.150 and a public IP of 136.214.64.44 (fake IP here).  The
       node is started with a 50GB maximum for the cache.  The gateway will
       be available on the local network at http://192.168.1.150:8080/.
       
       ```configuration code for NixOS to deploy IPFS
       services.ipfs.enable = true;
       services.ipfs.enableGC = true;
       services.ipfs.gatewayAddress = "/ip4/192.168.1.150/tcp/8080";
       services.ipfs.extraFlags = [ "--enable-namesys-pubsub" ];
       services.ipfs.extraConfig = {
           Datastore = { StorageMax = "50GB"; };
           Routing = { Type = "dhtclient"; };
       };
       services.ipfs.swarmAddress = [
               "/ip4/0.0.0.0/tcp/4001"
               "/ip4/136.214.64.44/tcp/4001"
               "/ip4/136.214.64.44/udp/4001/quic"
               "/ip4/0.0.0.0/udp/4001/quic"
       ];
       ```
       
       ## Testing your gateway
       
       Let's say your gateway is http://localhost:8080/ to keep the following
       examples simple.  If you want to request the data
       /ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1, you just have to
       append it to your gateway address, like this:
       http://localhost:8080/ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1
       and you will get access to your file.
       
       With IPNS it's much the same: for /ipns/blog.perso.pw/ you can
       request http://localhost:8080/ipns/blog.perso.pw/ and then browse my
       blog.
       
       # OpenBSD experiment
       
       To make all of this really useful, I started an experiment:
       distributing OpenBSD amd64 -current and 6.9, both with sets and
       packages, over IPFS.  Basically, I have a server that rsyncs both
       sets once a day, adds them to the local IPFS node, gets the CID of
       the top directory and then publishes that CID under an IPNS.  Note
       that I have to create an index.html file in the packages sets because
       IPFS doesn't handle directory listing very well.
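
       The add-then-publish part of that routine could look roughly like the
       sketch below.  This is not my actual script: the function name, the
       IPFS_BIN variable, the key name and the directory are assumptions
       made for illustration, and the rsync step is left out.

       ```shell
       # Sketch only: publish_mirror and IPFS_BIN are hypothetical names.
       IPFS_BIN="${IPFS_BIN:-ipfs}"

       # $1 = directory to publish
       # $2 = name of an existing IPNS key (created once with: ipfs key gen <name>)
       publish_mirror() {
           dir=$1
           key=$2
           # -r adds the directory recursively, -Q prints only the final CID,
           # which is the CID of the top directory.
           cid=$("$IPFS_BIN" add -r -Q "$dir") || return 1
           [ -n "$cid" ] || return 1
           # Associate the IPNS name with the new CID and announce it.
           "$IPFS_BIN" name publish --key="$key" "/ipfs/$cid"
       }
       ```

       A daily cron job could then run something like
       "publish_mirror /path/to/mirror mykey" once the rsync has finished.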
       
       The following examples have to be changed if you don't use a local
       gateway: replace localhost:8080 with your favorite IPFS gateway.
       
       You can upgrade your packages with this command:
       
       ```upgrading OpenBSD packages by fetching data over IPFS
       env PKG_PATH=http://localhost:8080/ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z/pub/OpenBSD/snapshots/packages/amd64/ pkg_add -Dsnap -u
       ```
       
       You can switch to latest snapshot:
       
       ```upgrading to latest snapshot by fetching data over IPFS
       sysupgrade -s http://localhost:8080/ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z/pub/OpenBSD/
       ```
       
       While it may be slow to update at first, if you have many systems,
       running a local gateway shared by all your computers gives you a
       cache of downloaded packages, making the whole process faster.
       
       I made a "versions.txt" file in the top directory of the repository.
       It contains the date and CID of every publication, which can be used
       to fetch a package from an older set if it's still available on the
       network (I don't plan to keep all sets because I have limited disk
       space).
       
       You can simply put the URL
       http://localhost:8080/ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z/pub/OpenBSD/
       in the file /etc/installurl to globally use IPFS for pkg_add or
       sysupgrade without specifying the URL every time.
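
       As a small sketch, the file can be written with a helper like the one
       below.  The function name and the INSTALLURL_FILE variable are made
       up for this example; the variable only exists so you can try it on
       another file before touching /etc/installurl as root.

       ```shell
       # Sketch only: set_installurl and INSTALLURL_FILE are hypothetical names.
       # $1 = gateway host and port, $2 = IPNS name of the mirror
       set_installurl() {
           gateway=$1
           ipns_name=$2
           # Default to the real file; override INSTALLURL_FILE for a trial run.
           target=${INSTALLURL_FILE:-/etc/installurl}
           printf 'http://%s/ipns/%s/pub/OpenBSD/\n' "$gateway" "$ipns_name" > "$target"
       }
       ```

       Running "set_installurl localhost:8080
       k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z" as
       root writes the URL shown above.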
       
       # Using DNS
       
       It's possible to use a DNS entry to associate an IPFS resource to a
       domain name by using dnslink. The entry would look like:
       
       ```dns entry for a dnslink TXT record
       _dnslink.blog        IN        TXT        "dnslink=/ipfs/somehashhere"
       ```
       
       Using an /ipfs/ syntax will be faster to resolve for IPFS nodes but you
       will need to update your DNS every time you update your content over
       IPFS.
       
       To avoid manipulating your DNS every so often (you could use an API to
       automate this by the way), you can use an /ipns/ record.
       
       ```dns entry for a dnslink TXT record using IPNS
       _dnslink.blog        IN        TXT        "dnslink=/ipns/something"
       ```
       
       This way, I made my blog available under the hostname blog.perso.pw,
       but it has no A or CNAME record so it only works in an IPFS context
       (like a web browser with the IPFS companion extension).  Using a
       public gateway, the URL becomes https://ipfs.io/ipns/blog.perso.pw/
       and it will download the latest CID associated with blog.perso.pw.
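
       The shape of a dnslink record can be sketched with two small helpers,
       one that prints the zone line and one that extracts the path from a
       TXT value.  Both function names are made up for this example, and the
       check is deliberately simple (it only accepts /ipfs/ and /ipns/
       paths).

       ```shell
       # Hypothetical helpers for dnslink TXT records.

       # Print a zone-file line for a label ($1) pointing at a path ($2).
       dnslink_record() {
           printf '_dnslink.%s\tIN\tTXT\t"dnslink=%s"\n' "$1" "$2"
       }

       # Extract the path from a TXT value such as "dnslink=/ipns/something".
       dnslink_path() {
           value=$1
           # Strip the surrounding quotes that DNS tools usually print.
           value=${value#\"}
           value=${value%\"}
           case "$value" in
               dnslink=/ipfs/*|dnslink=/ipns/*) printf '%s\n' "${value#dnslink=}" ;;
               *) return 1 ;;
           esac
       }

       dnslink_record blog /ipns/something
       dnslink_path '"dnslink=/ipfs/somehashhere"'
       ```

       On a live domain, the TXT value could come from a lookup such as
       "dig +short TXT _dnslink.blog.perso.pw" before being handed to
       dnslink_path.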
       
       # Conclusion
       
       IPFS is a wonderful piece of technology but in practice it's quite slow
       for DSL users and may not work well if you don't need a local cache.  I
       do really love it though so I will continue running the OpenBSD
       experiment.
       
       Please write to me if you have any feedback or if you use my OpenBSD
       IPFS repository.  I would be interested to know about people's
       experiences.
       
       # Interesting IPFS resources
       
 (HTM) dweb-primer tutorials for IPFS (very well written)
 (HTM) Official IPFS documentation
  (HTM) IPFS companion for Firefox and Chromium/Chrome
 (HTM) Pinata.cloud is offering IPFS hosting (up to 1 GB for free) for pinned content
 (HTM) Wikipedia over IPFS
 (HTM) OpenBSD website/faq over IPFS (maintained by solene@)