Title: Long Wireless links and monitoring.
Author: paco
Date: 2019-07-31
Type: article

_Update 2021-05-29: This setup is quite outdated now.  One of the endpoints
does not exist anymore.  Also, I replaced Zerotier by Wireguard, and the
monitoring part changed quite a bit too.  All the research, materials, and
build sections may still be useful, so I left it all here.  I may revisit this
document in the future and update it._

## Intro

Some time ago I built 2 [P-t-P][1] links between some family members' buildings.

The thing is that my brother and my sister live in an area with no coverage
from traditional ISPs, but quite close (5.5km in a straight line, with no
obstacles) to my parents', who have good coverage (even FTTH) and plenty of
providers to choose from.

This project has grown _organically_ so to speak, and the requirements kept
changing.

That, and my lack of experience on the subject, makes all this far from an
optimal solution.

In the end it has been working for almost 3 years now.  This is an attempt to
document all the infrastructure and the bits and pieces used so I do not forget
about them and maybe it can be of use to somebody else.

## First steps and research

As I said, I knew nothing about this before tackling the project.  I have some
solid knowledge about networking, but I knew little about long (for me)
wireless links, antennas, propagation and a bunch of other stuff I had never
heard of.  So I had to do some research.

If you want to do something like this, it is better to plan ahead.  See what
the requirements are and start digging.

Some things to take into consideration are:

* Budget.  This is an important one in this scenario, as this is for personal
    use only.
* Distance between the endpoints of the link.  Modern hardware (more on my
    choice later) can easily cover 10km or maybe more, but read the
    manufacturer's datasheet and look for output power, antenna gain and
    sensitivity.  And always take their numbers with a grain of salt, as they
    are usually measured under ideal conditions you won't encounter.  Later
    on you'll find a way to calculate the ideal numbers to have an estimate.
* Obstacles.  There has to be a perfectly clear line of sight between
    endpoints.  Wireless communications, especially WiFi on either 2.4GHz or
    5GHz, are very sensitive to obstacles.  Even partial cover can have a big
    impact on link quality.  And a clear line of sight does not mean _"I can
    see a single point in the distance"_.  There's this thing called the
    [Fresnel zone][2]: if it is partially obstructed, it will give you a lot
    of trouble under some atmospheric conditions or spectrum saturation.
* Materials.  Don't be cheap.  This will have to resist outdoor conditions
    for as long as possible.
* Neighbours and regulations.  There's the legal part (RF regulations in your
    country and things like that) and the _"social"_ part.  In this case my
    family does not live in detached houses but in apartments, so any building
    rules about this have to be taken into consideration.
* Infrastructure.  By that I mean everything necessary to install the
    antennas, route the cables, install connectors, etc.  I'm not only talking
    about tools, but also about access to the best spots to put the antennas.
* Antenna location.  As a rule of thumb, the higher the better.  But this
    depends a lot on your particular situation.  It deserves some thought.
* Spectrum saturation.  WiFi is ubiquitous now.  That may be a challenge for
    any installation, especially in urban areas.  Ideally, you should check
    how _crowded_ the spectrum is, but this is usually pretty difficult for
    amateurs without special equipment.  Some antennas have a built-in
    spectrum analyser, but it may perform badly.
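
To put a number on the Fresnel zone point above: the radius of the first
Fresnel zone at the midpoint of a link is `0.5 * sqrt(λ * d)`.  This is a small
sketch with `awk`, using 5.5km and the `5685 MHz` channel as example figures.
The function name is made up for illustration.

```sh
#!/bin/sh
# Radius (in metres) of the first Fresnel zone at the midpoint of a link.
# $1: distance between endpoints in metres, $2: frequency in MHz.
fresnel_mid_radius() {
    awk -v d="$1" -v f="$2" 'BEGIN {
        lambda = 299792458 / (f * 1000000)      # wavelength in metres
        printf "%.1f\n", 0.5 * sqrt(lambda * d)
    }'
}

fresnel_mid_radius 5500 5685
```

For those figures it comes out at roughly 8.5 meters.  A commonly quoted rule
of thumb is to keep at least 60% of that radius free of obstacles.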

## Materials

This is a list of the materials I chose and why I chose them.  It is short, as
it is really an easy installation.

### Antennas

I ended up using [Ubiquiti PowerBeams][3] to create the 2 links.  Four in
total, 2 for each link.

I was looking for a reputable manufacturer to avoid problems in the future.
Also, I wanted something as simple as possible.  This kind of antenna has the
_"emitter/receiver"_ and the antenna all in the same device.  So no special
connectors to be crimped, virtually no losses on cables, just an easy
[PoE][4] setup from the house to the rooftop.

Also, this antenna has a web interface that is easy to set up _and_ an SSH
server that drops you into a busybox shell with some proprietary commands that
are pretty handy for automation and data collection.

There are newer models now and other manufacturers.  Do your research, read on
forums and all the usual stuff.  I can say those work for this setup with minor
issues.

If you know something about this subject you may be wondering why I did not use
something with a wider angle on the _"access point"_ side and just 3 antennas
instead of 4.  Truth is, I tried, but the 2nd link gave poor performance.  Not
being an expert on this, I can only guess that the partial obstruction of the
LOS (line of sight) path for the second link was the cause, especially on bad
weather days (WiFi is pretty sensitive to heavy rain) and during episodes of
spectrum saturation.

Creating a separate link with a dedicated pair of antennas improved the
situation a lot.

### Cables

As the antennas only need a network connection, we only need Ethernet cable.
Be sure that it is CAT5e or better.

Always use cable rated for outdoor use.  Regular network cable will not last
long exposed to rain and the sun's UV.  I went for [this one][5] because it was
available at the time on Amazon.

### Connectors

Don't go extra cheap on this, but anything with reasonable quality will do
here.  The antennas are built in a way that the connectors are never exposed,
so this part is not that critical.

### Antenna pole and other hardware

I cannot say much about this.  What to buy here depends a lot on your
particular setup.  Remember that the higher the better for the antennas, and
remember wind is a thing ... you do not want it to fly away like a plastic bag.

## Build steps

This is a list of the build steps I took.  I started by checking the list
from the [First steps](#first-steps-and-research) section, specifically the
location of the antennas and the clear line of sight.

I have to admit that I did a sloppy job on the second link, because I did not
know about the [Fresnel zone][2] back then, but there are some things you can
do to mitigate its effects.

### Calculate signal strength

There's a simple way to calculate the signal strength you should see on the
other side of the link (under ideal conditions).  This can be taken as
a reference to see if the setup is viable and what conditions and speed
negotiation you can expect between the 2 endpoints of the link.

The simplified formula to calculate the signal is:

```
emitterPower + emitterGain - signalLoss + receiverGain
```

I say this is the simplified formula because it does not take into account
losses on cables and connectors.  I chose an _"all in one package"_ type of
antenna, so those do not apply in this case.  This is a huge advantage for
a beginner.  It is also simplified because I only take into account the free
space loss and not any other kind of loss, which would be a lot more difficult
to calculate.  That was sufficient for me anyway, as the line of sight
conditions are pretty good.

To calculate signal loss, this is the formula:

```
loss = 20*log((4*π*d)/λ)
```

Here `log` is the base 10 logarithm, `d` is the distance between the
2 endpoints in meters and `λ` is the wavelength, also in meters.  If you do
not remember how to calculate the wavelength from the frequency, it is just:

```
λ = C/f
```

Where `C` is the speed of light in meters per second and `f` is the frequency
in Hertz.

So, as an example, let's say I choose channel `137` which is `5685 MHz`, and
the 2 endpoints are 5.2km apart.  That gives us a signal loss of `121.86 dB`.

According to the antenna datasheet the transmission power is `5 dBm` and the
gain of the antenna is `25 dBi` (that's an average across the whole range of
channels, I guess).  Putting all that together, `5 + 25 - 121.86 + 25`,
I should get `-66.86 dBm` on the other end.  This works both ways in this
case, so now we have to check sensitivity.  Again according to the datasheet,
there's no problem in any modulation negotiation with this kind of signal
strength (in theory, so to be on the safe side subtract at least `3 dB` from
your results).
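
The whole calculation fits in a few lines of `awk`, which is handy for trying
other distances or channels.  This is only a sketch of the simplified formula
above (free space loss, nothing else), and the function names are made up for
illustration.

```sh
#!/bin/sh
# Free space path loss in dB.  $1: distance in metres, $2: frequency in MHz.
fspl_db() {
    awk -v d="$1" -v f="$2" 'BEGIN {
        pi = atan2(0, -1)
        lambda = 299792458 / (f * 1000000)      # wavelength in metres
        # awk only has the natural logarithm, so convert to base 10
        printf "%.2f\n", 20 * log(4 * pi * d / lambda) / log(10)
    }'
}

# Expected received signal in dBm.
# $1: tx power (dBm), $2: tx gain (dBi), $3: rx gain (dBi), $4: path loss (dB)
rx_dbm() {
    awk -v p="$1" -v gt="$2" -v gr="$3" -v l="$4" 'BEGIN {
        printf "%.2f\n", p + gt - l + gr
    }'
}

loss=$(fspl_db 5200 5685)
echo "path loss: $loss dB"
echo "received:  $(rx_dbm 5 25 25 "$loss") dBm"
```

With the example figures (5.2km, `5685 MHz`, `5 dBm`, `25 dBi` on each side)
the output should land within rounding of the numbers quoted above.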

### Physical setup and alignment

With the theory out of the way, and knowing that the link was possible, the
fun part started: I had to get on the roof and install the antennas.

Of course I won't be saying much about this, as this is different for every
single installation.  Suffice to say, I had a _"pretty fun time"_ up on ladders
and climbing to places not meant to be climbed ...

Before securing the antenna to the pole in its final position it has to be
aligned.  I did this the best I could given the lack of specialised equipment.

On the datasheet there are radiation plots for the chosen model.  The principle
is simple: those are 2D representations of the radiation lobes of the antenna,
and of the loss relative to the maximum gain.  So basically you want to point
them at one another as precisely as possible, especially for parabolic
antennas, which have a very narrow beam.

Those radiation plots confused me at first because, in the case of the
PowerBeam, there are 4 of them: "Vertical Azimuth", "Vertical Elevation",
"Horizontal Azimuth" and "Horizontal Elevation".  This did not make any sense
to me in the beginning, as the azimuth is a horizontal angle and elevation is
a vertical one.  It drove me nuts.  It turns out it refers to both
polarisations of the signal that those devices create ...  Once you understand
that, it is easy: they are just the same measurements times 2, one for each
polarisation.

Once I knew how much of an angle I had before starting to lose signal, and
with a bit of good old trigonometry, I knew my margin of error when pointing
the antennas at each other.

I did this standing behind the antenna and looking as if my line of sight was
the beam.  With some fiddling, that should be enough for the horizontal
alignment.  For the vertical one, it was easier, because the error margin is
pretty big compared to the distance to the ground, even if you're on a tall
building (again, trigonometry, that angle at 5km is some meters ...).  Anyway
with the help of some online tool I could calculate that easily to make it as
precise as possible (search for "antenna downtilt calculator" on your favourite
search engine).
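
The trigonometry involved is just `width = 2 * d * tan(beamwidth / 2)`.  This
is a rough sketch of it with `awk`.  The 7 degree beamwidth is a made-up
figure, not the datasheet value, so take the real one from the radiation
plots.  The function name is also just for illustration.

```sh
#!/bin/sh
# Width (in metres) covered by a beam at a given distance.
# $1: distance in metres, $2: beamwidth in degrees.
beam_width_at() {
    awk -v d="$1" -v bw="$2" 'BEGIN {
        pi = atan2(0, -1)
        half = (bw / 2) * pi / 180              # half the beamwidth, in radians
        # awk has no tan(), so use sin/cos
        printf "%.0f\n", 2 * d * sin(half) / cos(half)
    }'
}

beam_width_at 5200 7
```

The same function works for both the horizontal and the vertical margin, you
only have to feed it the matching angle from the plots.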

### Network diagram and configuration

With the antennas installed, it was time for some configuration.

This is a basic diagram of the network setup I came up with:

```
                                                               192.168.1.6/24
                                                                +--------+
                                                                | Bro.   |
                            192.168.1.2/24      192.168.1.4/24  | Router |
                            +---------+         +----------+    +--------+
                            | Antenna |         | Antenna  |   / 192.168.10.1/24
                        ----| AP1     |+++++++++| ST1      |---           
     192.168.1.1/24 ---/    +---------+         +----------+              
              +---------+                                                 
+---------+  -|  ISP    |                                                 
|Internet |-/ |  Router |                                                 
+---------+   +---------+                                                 
                 |  --\     +---------+         +-----------+             
                  \    --\  | Antenna |         |  Antenna  |             
                   \      --| AP2     |+++++++++|  ST2      |-\           
                   |        +---------+         +-----------+  -\ 192.168.1.7/24
                    \        192.168.1.3/24      192.168.1.5/24 +---------+
             +------------+                                     | Sis.    |
             | Rpi        |                                     | Router  |
             | Monitoring |                                     +---------+
             +------------+                                    192.168.10.1/24
             192.168.1.10/24
```

All connections are cabled except the `++++` ones, which are the 5km links.

On the routers/APs at the end of the chain I used the same network segment for
both, as they will be isolated and do NAT.  I did this because I have little
control over the ISP router.  It is _"reset to defaults"_ from time to time and
that caused me problems before, so setting static routes would be a pain to
maintain.  That produces double NAT at my siblings', but that's a small price
to pay for having a stable setup.

Yes, I know that's a shitty thing for an ISP to do (they break your DHCP
reservations and port forwarding too ...), but most of the ISPs where I live
are the biggest idiots and do the dumbest stuff you can imagine, so for them
this is not even worth a thought.

The PowerBeams are configurable via a web interface that is pretty intuitive.
They can also be configured over SSH by editing a text file and running some
commands.

Some things I did:

* Enable WDS (transparent bridge mode), so I could see the MAC addresses of all
  the chain from my monitoring station.  That helps with debugging if something
  network related goes wrong.
* I enabled SNMP for monitoring, SSH server for access (with public keys) and
  NTP so the antennas have the right time (good for logs).
* All 4 antennas were set up in bridge mode.
* The ones connected to the ISP router were set up as "Access Point" and the
  other 2 as "Stations".
* The antenna startup wizard asks you for country location.  That's because
  they apply the necessary regulation restrictions automatically.  Do not cheat
  here, you can have problems with your local authorities.  Besides, if you do
  not have good signal within the power output regulations chances are you're
  doing something wrong or the conditions of line of sight, etc. are not really
  good, so it won't matter and you'll be breaking the law for nothing (and
  probably causing problems to other antennas and installations).

If you prefer the command line to configure the antennas, log into them via SSH
and edit the file `/tmp/system.cfg`.  Then save to `NVRAM` with the command
`cfgmtd -w`.  Then reset with `/usr/etc/rc.d/rc.softrestart force`.

I do not recommend that method at the beginning, until you get familiar with
all the options and configurations possible.  You can make a pretty big mess.

As I said earlier, those antennas have a sort of spectrum analyser you can use
to determine which channel is less busy.  It uses some Java applet (yes, I know
...) and it has been broken on 2 occasions by firmware updates.  But it can be
of assistance if your spectrum is really busy.


### Performance tests

There are 2 ways to easily test the throughput of the links.  The web interface
has a "speed test" built in.  You have to enter the credentials of the other
end and it can test TX, RX or both.

The other way (the one I like the most) is `iperf(1)`.  The antennas come with
a basic implementation of that tool installed, so log into the antenna on the
other end and run `iperf(1)` there, either as server or as client, to test both
directions of the communication.

Play a bit with the channel width.  A wider channel allows for faster transfer
rates, but a narrower one increases stability.

I ended up using `20 MHz` for one of the links and `10 MHz` for the other.
That last one is the one with the less than ideal LOS situation.  In the end,
reducing the channel width and choosing the least busy channel did the trick
and I could get a stable link.

On the first link I get around `32Mbps` symmetrical.  The second link is a lot
more variable depending on the conditions and the interference from other
stations.  I get up to `17Mbps` symmetrical, and it is usually more than
`12Mbps`, but in the worst case scenario it can get as low as `6Mbps`.  That
is still enough to watch online videos at `1080p` with today's compression,
and more than enough for any kind of browsing, email and whatever ...  so
I guess it is enough.

### Monitoring and management

For various reasons I wanted to monitor the whole thing.  My brother had some
network outages and I did not know why (I'm pretty sure they are related to
some firmware bug introduced in a recent update, but I have no proof).

My idea for this was to put a Raspberry PI on my parents' network that I could
connect to and install all the necessary software for monitoring.

As I said earlier, I have little control over the ISP router.  Also, I did not
want to set up a VPN at my house or something similar on a VPS ...  So I ended
up using [Zerotier][6] to create a _"local network"_ between one of my hosts at
my home office and the PI at my parents'.  This software creates an interface
on the device with a private range, just like a VPN.  The main difference in
this case is that the _server_ part is managed (you can host it yourself too)
and it uses some clever tricks to find the best path between two endpoints so
latency is always as low as possible.  It falls back to relay servers if none
of the direct strategies work.  Besides, it is quite easy to add or remove
devices to/from a given virtual network.

They have some [documentation][10] to make this process easy.

Having the monitoring PI on a local network segment, I could now use it as
a jump box to ssh into the antennas and routers (using `ProxyJump`), making
management easier.
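
As an illustration, an `~/.ssh/config` on the home office machine could look
something like this.  The host aliases and the Zerotier address of the PI are
made up.  The antenna address comes from the diagram above and the `ubnt` user
from the script further down.

```
# The monitoring PI, reached over the Zerotier interface.
# 10.0.0.10 is a hypothetical address, use the one Zerotier assigned.
Host monitor
    HostName 10.0.0.10
    User pi

# AP1 antenna, reached through the PI.
Host ap1
    HostName 192.168.1.2
    User ubnt
    ProxyJump monitor
```

With that in place a plain `ssh ap1` hops through the PI without any extra
flags.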

In the end I decided to have some data collection and graphing and, after some
consideration, I chose [influxdb][7] + [telegraf][8] + [grafana][9].  That
also gives me alerts (more on that later).

InfluxDB for the database backend, telegraf as the _"agent collector"_ and
grafana as the graphing tool.

I chose influxdb because it is really [easy to set up][11] on the PI.  Check
that a retention policy is enabled so you do not fill up the little SD card on
the PI.  It is also quite easy to [set up telegraf][12] and [grafana][13].

With that running I set up the InfluxDB data source on Grafana.  I used the
database named _"telegraf"_, which was automatically created by the telegraf
process as soon as it started collecting data.

Then I configured telegraf to get snmp data from the "Access point" antennas
and also from the routers at my siblings'.

To do this I had to add a file to the configuration folder
(something like `/etc/telegraf/telegraf.d/snmp.conf`) with the snmp config
parameters:

```
[[inputs.snmp]]
  agents = [ "192.168.1.2", "192.168.1.3", "192.168.1.6", "192.168.1.7" ]
  version = 1
  community = "mycommunity"
  interval = "60s"
  timeout = "10s"
  retries = 3

  [[inputs.snmp.field]]
    name = "hostname"
    oid = "RFC1213-MIB::sysName.0"
    is_tag = true

  [[inputs.snmp.field]]
    name = "uptime"
    oid = "DISMAN-EXPRESSION-MIB::sysUpTimeInstance"

  # IF-MIB::ifTable contains counters on input and output traffic as well as errors and discards.
  [[inputs.snmp.table]]
    name = "interface"
    inherit_tags = [ "hostname" ]
    oid = "IF-MIB::ifTable"

    # Interface tag - used to identify interface in metrics database
    [[inputs.snmp.table.field]]
      name = "ifDescr"
      oid = "IF-MIB::ifDescr"
      is_tag = true
```

The info that comes from this is basically network traffic for all interfaces
and uptime.

I also set up telegraf to collect pings to the remote routers.  That gives me
info about the health of the link, and I based some alerts on that.

The needed config was:

```
[[inputs.ping]]
  ## List of urls to ping
  urls = ["192.168.1.6", "192.168.1.7"]

  ## Number of pings to send per collection (ping -c <COUNT>)
  count = 3
  ## Per-ping timeout, in s. 0 == no timeout (ping -W <TIMEOUT>)
  timeout = 1.0
```

And finally, I wanted to have some info the devices provide, but only through
some internal commands.  For instance, the number of connected devices.

There are 2 commands that run on those devices that provide some internal
information (like signal strength, connected devices, and much more).  They are
`mca-status` and `wstalist`.

It turns out telegraf can execute commands and store that as metrics data, no
problem.  The configuration looks like this:

```
[[inputs.exec]]
  ## Commands array
  commands = [ "/usr/local/bin/get_connected_devices.sh router1" ]
  interval = "300s"

  name_override = "conn_devices"
  tag_keys = [ "hostname" ]
  timeout = "5s"
  data_format = "json"
```

The script is this:

```
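#!/bin/sh
# Print the number of clients connected to a device as JSON for telegraf.

set -eu

# Device to query over ssh; defaults to router1.
device=${1:-router1}
# Strip carriage returns from the mca-status output.
device_info=$(ssh "ubnt@$device" mca-status | tr -d "\r")
# Keep only the value of the wlanConnections line.
connected_devices=$(echo "$device_info" | grep wlanConnections | cut -d'=' -f 2)

printf '{"hostname": "%s", "devices": %d }' "$device" "$connected_devices"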
```

It outputs some JSON that telegraf understands.
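
The parsing part can be tried without a device at hand by feeding it canned
input.  This is only a sketch: the `key=value` line format is inferred from the
`grep` and `cut` in the script above, the sample lines are made up, and
`parse_connections` is a hypothetical helper that is not part of the original
script.

```sh
#!/bin/sh
# Extract the wlanConnections value from "key=value" lines read on stdin.
# Prints 0 when the key is not present.
parse_connections() {
    tr -d '\r' | awk -F= '
        $1 == "wlanConnections" { print $2; found = 1; exit }
        END { if (!found) print 0 }
    '
}

# Made-up sample input, only to exercise the parser.
printf 'someOtherKey=1\r\nwlanConnections=2\r\n' | parse_connections
```

Unlike the `grep` version this matches the key exactly, so a line that merely
contains the word would be ignored.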

After this it was just a matter of setting up some grafana dashboards to see
what I wanted to see.  I think there is enough information on the internet on
how to do that, so I won't be explaining it here.

As I mentioned, my brother was having some outages that I still cannot explain.
They are fixed by rebooting the "access point" part of the link (I'm pretty
sure they would go away simply by kicking out the client, but I could not be
bothered to look up how to do that programmatically ...).

So I thought about automating the reboot process as a mitigation for the
inconvenience it causes.  I set up an alert on grafana for the ping metric
that, when it triggers, calls a webhook.

I did it that way because I wanted to be notified and also automatically take
action based on those alerts.  The setup I came up with may seem a bit
complicated, but it works with simple tools and it has been in service for some
months now.

For the webhook, I found [this][14], which is meant to be a sort of gateway
from webhook to XMPP.  It only accepts grafana calls but it can be adapted
pretty easily.

I made [some modifications][15] to not only send an xmpp message, but also to
write a flag file on disk in a specified folder if it gets an alert with
a specific string in it.  Then, there's a cron job running that checks for
those flags and, if it finds any, executes the script of the same name and
deletes the flag on success.  All pretty simple to do with a shell script.
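
The cron side of that can be sketched in a few lines.  This is not the exact
script I run: the directory paths, the environment variable names and the
function name are made up for illustration, and the handlers are assumed to be
plain shell scripts.

```sh
#!/bin/sh
# Run the handler named after each flag file; remove the flag only on success.
# $1: directory with flag files, $2: directory with handler scripts.
process_flags() {
    flag_dir=$1
    handler_dir=$2
    for flag in "$flag_dir"/*; do
        [ -f "$flag" ] || continue              # also skips the unexpanded glob
        handler="$handler_dir/$(basename "$flag")"
        # A failing or missing handler leaves the flag for the next run.
        if [ -r "$handler" ] && sh "$handler"; then
            rm -f "$flag"
        fi
    done
}

# Hypothetical default paths; run this from cron, for example once a minute.
process_flags "${FLAG_DIR:-/var/spool/alert-flags}" \
    "${HANDLER_DIR:-/usr/local/lib/alert-handlers}"
```

Leaving the flag in place when the handler fails means the action is simply
retried on the next cron run.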

In the ping alert case, the shell script just connects to the "access point"
antenna and performs a `reboot(8)`.

With that done, outages do not last more than 5 minutes, and they are pretty
rare anyway.  So I think it is a good solution until the day I take the time to
dig into it (if I ever do ...).

I also created a custom handler with a super simple payload, so I could use it
from other scripts (not necessarily from this project) to just be notified via
xmpp.

## Conclusion

And that's the whole setup.  Without using anything too complicated or
expensive I could connect those isolated flats, have some insight into what
happens on the network, have alerts on the most interesting metrics and even
automate responses if I need to.

I hope this may serve as a source of ideas for similar projects.

[1]: https://en.wikipedia.org/wiki/Point-to-point_(telecommunications)
[2]: https://en.wikipedia.org/wiki/Fresnel_zone
[3]: https://www.ui.com/airmax/powerbeam/
[4]: https://en.wikipedia.org/wiki/Power_over_Ethernet
[5]: https://www.konigelectronic.com/computer/networking/network-cable-reel-cat5e-futp-100-m-black-solid-55896639
[6]: https://www.zerotier.com/
[7]: https://www.influxdata.com/time-series-platform/
[8]: https://www.influxdata.com/time-series-platform/telegraf/
[9]: https://grafana.com/
[10]: https://zerotier.atlassian.net/wiki/spaces/SD/pages/8454145/Getting+Started+with+ZeroTier
[11]: https://docs.influxdata.com/influxdb/v1.7/introduction/installation/
[12]: https://docs.influxdata.com/telegraf/v1.11/introduction/installation/
[13]: https://grafana.com/docs/installation/debian/
[14]: https://github.com/opthomas-prime/xmpp-webhook/
[15]: https://git.e1e0.net/xmpp-webhook/log.html