Rethinking Moderation: Interview with Alexander Cobleigh
=====================

As we all know, moderation is one of the internet's most difficult issues. We've all seen the pattern: toxic behaviours manifest on our favourite social media platforms & metastasize through them. We complain, moderators struggle to keep up, & bad actors evade protective measures. Folks then leave the platform or strike an uneasy peace as the toxicity becomes endemic.

Alexander Cobleigh thought there had to be a better way. So he created a new kind of moderation system: TrustNet [https://cblgh.org/articles/trustnet.html]. Rusty chats with him about this fascinating development. Alex can be found on Mastodon: @cblgh@merveilles.town

---

QUESTION: Thanks for agreeing to do this email interview bit. You're doing some fascinating work & I wanted a chance to pick your brain a bit more.

ANSWER: Of course! I'm glad that you think so & honoured that you asked.

Q: What got you interested in developing TrustNet as a project? What motivated you to take on the work yourself?

A: It all started with Cabal, an open source peer-to-peer chat platform I have been developing with some friends, needing moderation capabilities. We reached a point where we couldn't in good faith keep developing features without the ability to remove potentially malicious actors from a chat. At the same time, I had also reached the point in my university studies where my Master's thesis was coming up. So, really, the way it got started was me thinking for a few months about how Cabal could have a moderation system better than the naive, individualistic solution of "well, every single person needs to individually block trolls", as well as wanting to work on something worthwhile for my Master's thesis.

Q: The system seems like it may require a complex infrastructure. What have been some of the challenges in trying to implement such a system?

A: It doesn't require any complex infrastructure, really. What you need is the following: some way for people to assign trust to others (i.e. an interface), a way to store those trust statements, & a way to transmit stored trust statements between participants.

It would, for example, be possible to use TrustNet in a fork of Mastodon, where a modified UI could let people of the Mastodon instance assign each other as subjective moderators. The server hosting the instance would receive and keep track of the trust issued by each person. A given participant could then have a moderated experience through the moderators they choose and trust, which could be different for different people (depending on who they trust).
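To make those three requirements a little more concrete, here is a minimal sketch of what a stored trust statement and its lifecycle might look like. The field names and shapes are illustrative assumptions, not the actual TrustNet implementation or wire format.

```typescript
// A minimal sketch of a trust statement, with hypothetical field names
// (not the actual TrustNet data format).
interface TrustStatement {
  src: string;     // who is assigning trust (e.g. a public key)
  dst: string;     // who is being trusted
  weight: number;  // how much, e.g. 0.0 - 1.0
  area: string;    // the trust area, e.g. "moderation"
}

// 1. An interface produces statements...
const statement: TrustStatement = {
  src: "alice", dst: "bob", weight: 0.75, area: "moderation",
};

// 2. ...some store keeps them...
const store: TrustStatement[] = [];
store.push(statement);

// 3. ...and a transport (a p2p log, or a server) replicates the store to
// other participants, who feed it into their own trust computation.
```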

Of course, the current implementation is built with peer-to-peer systems like Cabal or Secure Scuttlebutt in mind, but server-based paradigms can just as well make use of TrustNet.

The difficulties in developing TrustNet were in trying to represent behaviour that made sense socially, while also making use of the system's transitivity. The trickiest bit was coming up with a good solution for how to partition the calculated subjective trust ranks, where basically each person in your trust graph is ordered according to their calculated trust rank. The problem with the rankings is where to make a cut such that everybody above the cut is regarded as trusted, and everyone below it as not trusted (e.g. their moderation actions won't be automatically applied, due to being too far away from you).

Q: In our era of tech dystopia, any kind of algorithmic ranking is frightening to a lot of folks. What distinguishes a trust-based moderation system from systems that assign a kind of "social credit score"?

A: The way I interpret algorithmic ranking, given your mention of a social credit score, is from the following point of view: if everybody assigns everyone else a trust score, then you have a popularity contest where the people who manage to get the most trust, well, win (and people without trust fall outside of society).

What this describes, however, is a *reputation* system. Reputation is an aggregate, derived from the crowd. It can be used to inform trust, but it is not trust. Reputation is "objective"; the reputation score for one person looks the same no matter whose perspective you view it from. Trust, on the other hand, is subjective. My trusted peers are different from your trusted peers, which are different from a third person's trusted peers.

Algorithmic ranking typically builds on machine learning, where you increasingly dig yourself into a you-shaped hole that is impossible to get out of from the perspective of the ranking algorithm. The trust-based approach I present in TrustNet is kind of a parallel route one can go down to tackle similar problems, but where the end user is in control instead.

Q: I think one of the most fascinating aspects of this system is the notion of the Trust Area, which as you state in your blog post, "captures the context that the trust is extended within; for outside of the realm of computers, we trust each other varying amounts depending on a given domain." This makes total sense, but it's something I rarely see considered in online platforms. What inspired that idea for you?

A: I wanted to avoid conflating trust within different areas, so that TrustNet could be used for different purposes within the same chat system. You might have one trust area, let's call it 'general purpose', that controls whether people can send DMs to you, whether their profile images should be visible, and whether the images they post should be automatically shown or not. In the same system, you might want another trust area to control who can hide or remove users and posts on your behalf. If we consider these two trust areas, we can kind of get a feel for the 'general purpose' trust area being less restrictive than the 'moderation' trust area.
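As a rough illustration of how separate trust areas could gate separate capabilities in a client, here is a small sketch. The area names, thresholds, and functions are hypothetical assumptions for illustration, not the actual Cabal or TrustNet API.

```typescript
// Hypothetical sketch: separate trust areas gate separate capabilities.
type Area = "general" | "moderation";

// Per-area thresholds: "general" is deliberately less restrictive
// than "moderation".
const thresholds: Record<Area, number> = {
  general: 0.25,     // show avatars, allow DMs, auto-load images
  moderation: 0.75,  // apply this person's hides/blocks on my behalf
};

// trustRank would come from a TrustNet-style computation over the
// trust statements issued in that particular area.
function isTrusted(trustRank: number, area: Area): boolean {
  return trustRank >= thresholds[area];
}

// The same peer can clear one bar but not the other.
console.log(isTrusted(0.4, "general"));    // true  -> auto-show their images
console.log(isTrusted(0.4, "moderation")); // false -> ignore their blocks
```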

After reading more of the computational trust literature, my hunch about the notion of a trust area was confirmed: it has appeared in various papers and research before, albeit under different names.
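Circling back to the ranking-and-cut problem mentioned a couple of answers above: one naive way to place such a cut is to sort the computed trust ranks and split at the largest gap between adjacent scores. The sketch below is only meant to make the problem concrete; it is not the partitioning strategy TrustNet itself uses.

```typescript
// Toy illustration of the "where to cut the ranking" problem.
// The largest-gap heuristic here is NOT TrustNet's actual strategy.
function cutByLargestGap(ranks: Map<string, number>): Set<string> {
  const sorted = [...ranks.entries()].sort((a, b) => b[1] - a[1]);
  if (sorted.length < 2) return new Set(sorted.map(([peer]) => peer));

  // Find the biggest drop between adjacent ranks...
  let cutIndex = 1;
  let biggestGap = -Infinity;
  for (let i = 1; i < sorted.length; i++) {
    const gap = sorted[i - 1][1] - sorted[i][1];
    if (gap > biggestGap) {
      biggestGap = gap;
      cutIndex = i;
    }
  }
  // ...and regard everyone above the drop as trusted.
  return new Set(sorted.slice(0, cutIndex).map(([peer]) => peer));
}

// e.g. ranks of 0.9, 0.85, 0.2, 0.1 would cut after the second peer.
```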

Q: I was wondering if you could talk a bit more about how you imagine an average user interacting with this kind of system? What would that actually look like for them?

A: If we consider the moderation trust area, I basically think that friends in a chat system like Secure Scuttlebutt would assign trust to each other to delegate blocking responsibility. It would look something like going to one of your friends' profiles, clicking a dropdown titled "Moderation Similarity" and picking one of the options: None, Some overlap, Similar, Identical. Each option would represent an increasing trust weight, which essentially controls the impact of a trusted person's recommendations (that is, people whom *they* trust for moderation).

You don't need to do this for that many people for it to start having an effect, maybe like 4-5 friends and that's all you'll ever need. On SSB, I have a feel for which of my friends have a blocking policy similar to my own (some may be too eager to block, for example).

Q: What will you be working on next with TrustNet?

A: Ah, that's a great question. The first thing that comes to mind is to integrate it into Secure Scuttlebutt, where there is currently a lack of any kind of delegated moderation system. The community has been very receptive to and encouraging of my thesis in general, and of integrating it with SSB in particular.

I would also like to experiment with it further in Cabal, playing around with a kind of mechanism for allowing greater privileges to people who are trusted by my friends (or their friends). What I mean by that is, for example, using TrustNet to limit which peers I will allow image posts or avatars from. So, if someone is trusted at all from my perspective, my cabal client would download and show their avatars, whereas for untrusted peers a placeholder would be shown instead. This limits the attack surface of malicious actors like trolls or brigades.

Finally, it would also be fun to experiment more playfully with TrustNet and see what comes out of that :)
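To sketch how a client might act on the ideas in those last two answers: below, the "Moderation Similarity" options map to example trust weights, and avatars are only fetched for peers with any computed trust at all. The option weights, field names, and functions are hypothetical illustrations, not actual Cabal or TrustNet code.

```typescript
// Hypothetical illustration only; not Cabal's or TrustNet's real API.

// The dropdown options from the answer above, mapped to example weights.
const moderationSimilarity = {
  "None": 0.0,
  "Some overlap": 0.25,
  "Similar": 0.75,
  "Identical": 1.0,
} as const;

// Picking an option issues a trust statement in the "moderation" area.
function assignModerationTrust(
  friend: string,
  option: keyof typeof moderationSimilarity,
) {
  return { dst: friend, weight: moderationSimilarity[option], area: "moderation" };
}

// Avatar gating: only fetch avatars for peers with *any* computed trust.
// `trustedPeers` stands in for the output of a TrustNet-style computation.
function avatarFor(peer: string, trustedPeers: Set<string>): string {
  return trustedPeers.has(peer)
    ? `avatar-blob-for-${peer}`  // download & show their actual avatar
    : "placeholder.png";         // untrusted peers get a placeholder
}
```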