[HN Gopher] I Wrote an Activitypub Server in OCaml: Lessons Lear...
       ___________________________________________________________________
        
       I Wrote an Activitypub Server in OCaml: Lessons Learnt, Weekends
       Lost
        
       Author : gopiandcode
       Score  : 92 points
       Date   : 2023-04-23 10:56 UTC (1 days ago)
        
 (HTM) web link (gopiandcode.uk)
 (TXT) w3m dump (gopiandcode.uk)
        
       | throwaway290 wrote:
       | There's also LitePub, though development seems stalled (?)
        
         | yawaramin wrote:
         | Was it developed at all? I'm not seeing any business logic in
         | the repo: https://hacktivis.me/git/litepub.social/files.html
        
         | riffic wrote:
         | link for reference: https://litepub.social/
        
       | SideburnsOfDoom wrote:
       | My question is this: if I was to try to hack up an ActivityPub
       | server in my platform of choice, how would I know how compliant
       | it is? Is there any compliance test suite to verify this?
       | 
       | "Try and load it up in a client app" seems suboptimal.
       | 
       | The "load it up and see" attitude is part of what made parsing
       | and rendering HTML so hairy, and compliance test suites helped.
        
         | mariusor wrote:
         | There was a suite of tests that sadly fell to bitrot. One of
         | the developers in the community created a parallel application
         | that could test implementations, but then this too ended up
         | unmaintained[1].
         | 
         | [1] https://github.com/go-fed/testsuite
        
       | nologic01 wrote:
       | I found the post well written and informative. Though I am
       | clueless about OCaml, it feels as if this would be useful for
       | anybody working on a new server implementation in any language
       | ecosystem, as it highlights what needs to be done and potential
       | bottlenecks.
       | 
       | As for the ActivityPub spec and the currently popular
       | implementations, it doesn't take long exposure to the fediverse
       | to realise there are some rough edges and historical accidents
       | (e.g. Mastodon being the de facto interpretation of the
       | standard). Imho, now that there is substantially more mindshare
       | devoted to decentralized social, it would be opportune to revisit
       | these things and, if needed, revise them before they get baked
       | in.
        
       | mikece wrote:
       | I'm looking forward to a solid ActivityPub server written in Go or
       | Rust that can run on modest hardware/small resource Docker hosts.
        
         | knjllppppp wrote:
         | I've had a go at doing it in Go and the ActivityPub spec is so
         | loosely defined that it's just a real challenge if you intend
         | to actually unmarshal the JSON you receive
         | 
         | It's not completely impossible but you have to be okay with
         | discarding a lot of unknown options or essentially reverse
         | engineering the objects used by the servers you are federated
         | with
         | 
         | That's not to say it's impossible, I was able to crawl the
         | network successfully, but it hints at the reason that Mastodon
         | and Pleroma use dynamic languages
         | 
         | I'd be very interested to see a flexible/complete AP
         | implementation in any statically typed language
         | 
         | Fwiw WriteFreely is implemented in Go with go-fed but --
         | correct me if I'm wrong -- that library seemed more limited to
         | me than what Pleroma and Mastodon support
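The difficulty described above comes largely from ActivityStreams letting the same property appear as a bare IRI string, an embedded object, or an array of either. A minimal sketch of one way to cope, assuming you only care about a handful of fields and are willing to carry everything else along untouched. The helper names (`as_list`, `ref_id`, `parse_activity`) and the sample payload are illustrative, not taken from any implementation mentioned in the thread, and an `actor` given as an array is deliberately not handled here.

```python
import json
from typing import Any, Optional


def as_list(value: Any) -> list:
    """Wrap a scalar in a list; treat a missing value as an empty list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ref_id(value: Any) -> Optional[str]:
    """Return the id of a reference that is either an IRI string or an embedded object."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("id")
    return None


def parse_activity(raw: str) -> dict:
    """Decode the fields we understand and keep everything else under 'extra'."""
    doc = json.loads(raw)
    known = {"@context", "id", "type", "actor", "object", "to", "cc"}
    audience = as_list(doc.get("to")) + as_list(doc.get("cc"))
    return {
        "id": doc.get("id"),
        "types": as_list(doc.get("type")),
        "actor": ref_id(doc.get("actor")),
        "object": ref_id(doc.get("object")),
        "recipients": [r for r in map(ref_id, audience) if r],
        # Unknown properties are preserved rather than rejected.
        "extra": {k: v for k, v in doc.items() if k not in known},
    }


# Hypothetical payload mixing the three shapes: string, object, and array.
sample = json.dumps({
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.org/activities/1",
    "type": "Create",
    "actor": {"id": "https://example.org/users/alice", "type": "Person"},
    "object": "https://example.org/notes/1",
    "to": "https://www.w3.org/ns/activitystreams#Public",
    "cc": ["https://example.org/users/alice/followers"],
    "sensitive": False,
})
parsed = parse_activity(sample)
print(parsed["actor"], parsed["recipients"], parsed["extra"])
```

In a statically typed language the same idea usually becomes a custom decoder per field plus a catch-all map for the leftovers, which is roughly the "discard or reverse engineer" trade-off the comment mentions.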
        
           | zimpenfish wrote:
           | > I'd be very interested to see a flexible/complete AP
           | implementation in any statically typed language
           | 
           | Try Honk[1] or GotoSocial[2]?
           | 
           | [1] https://humungus.tedunangst.com/r/honk
           | [2] https://github.com/superseriousbusiness/gotosocial
        
             | mariusor wrote:
             | Neither is flexible, nor strives for completion. They are
             | both implementations that try to map the ActivityPub
             | vocabulary on an existing web-application domain.
             | 
             | They are not ActivityPub servers, but web-apps that use the
             | ActivityPub vocabulary to federate, which is what I meant
             | in the grandparent post when I mentioned the classic
             | mistake of ActivityPub implementers. :D
        
           | mariusor wrote:
           | I'm surprised you didn't find my library because I managed to
           | create a statically typed vocabulary library for Go that maps
           | the specification verbatim:
           | https://pkg.go.dev/github.com/go-ap/activitypub#Object
           | 
           | It wasn't easy indeed, and it locked me out of some options
           | to support execution time vocabulary extensions, but hey, it
           | works and it's relatively easy to use.
        
             | knjllppppp wrote:
             | I'm surprised I didn't find it, too! My google-fu must be
             | getting rusty. Thanks for the link, I'll have to have a
             | deep dive when I get the time :)
        
               | mariusor wrote:
               | I hope you get to it. The library itself contains more
               | than just the vocabulary part and I would be glad of more
               | eyeballs on the problem. :D
        
           | yawaramin wrote:
           | Here's the implementation described in OP:
           | https://github.com/Gopiandcode/ocamlot
           | 
           | OCaml is a statically-typed language. It falls somewhere
           | between Go and Haskell on the spectrum of type 'strength'.
        
         | SideburnsOfDoom wrote:
         | > I'm looking forward to a solid ActivityPub server written in
         | Go or Rust that can run on modest hardware/small resource
         | 
         | The "Lightweight" GoLang ActivityPub server is GoToSocial
         | https://github.com/superseriousbusiness/gotosocial
         | 
         | The better-known lightweight servers are Pleroma and fork
         | Akkoma, written in Elixir https://akkoma.dev/AkkomaGang/akkoma/
         | 
         | Some of this info I got via:
         | https://social.treehouse.systems/@ariadne/110226729543740723
        
           | zimpenfish wrote:
           | There's also Honk[1] which is written in Go but has slightly
           | wacky source and doesn't support the Mastodon API (but does
           | provide an inbuilt web UI.)
           | 
           | [1] https://humungus.tedunangst.com/r/honk
        
         | bgorman wrote:
         | Ocaml code compiles to native binaries, just like Go/Rust.
        
         | yawaramin wrote:
         | Why specifically those languages? Others can also target modest
         | hardware/small resource Docker hosts.
        
         | mariusor wrote:
         | Well, there is one already as the reference implementation for
         | a suite of libraries I wrote. You can find it at
         | https://github.com/go-ap/fedbox. (Contributions welcome)
        
           | zimpenfish wrote:
           | Does it only support C2S as the API? Are there any clients
           | which actually support C2S rather than the Mastodon API?
        
             | mariusor wrote:
             | It does support server to server, but currently it does not
             | play well with Mastodon due to its limited support of HTTP
             | Signatures algorithms. I didn't get bothered enough by this
             | yet to actually fix it on my side.
             | 
             | And there are a number of clients that work with this
             | specific brand of client to server ActivityPub but I wrote
             | all of them. The one that can be seen on the internet is a
             | link aggregator similar to HN and (old) reddit, you can
             | find a demo instance at https://brutalinks.tech.
        
               | zimpenfish wrote:
               | > It does support server to server
               | 
               | Ah, sorry, I should have said "client API" rather than
               | just API there.
        
               | mariusor wrote:
               | ActivityPub has a section which deals with how clients
               | and servers should communicate with each other (called
               | Client to Server - C2S - in the spec). So it's the same
               | vocabulary and operations with slightly different side
               | effects, but most servers don't implement it because it's
               | not "specified enough". That's why developers generally
               | just use the Mastodon API.
        
         | jeroenhd wrote:
         | I think there is (was?) an attempt to rewrite Mastodon into
         | Rust but I haven't heard much about it.
         | 
         | A single user Mastodon instance takes an unreasonable amount of
         | resources. I don't know if it's just because of Ruby (Gitlab
         | has the same problem, so it might just be) or because everyone
         | is wasting money on expensive servers, but an RSS feed on
         | steroids shouldn't take this much RAM.
        
           | [deleted]
        
           | WorldMaker wrote:
           | Mastodon itself is designed for "flagship scale" (given lead
           | developers run mastodon.social and mastodon.online, two of
           | the biggest instances and the most "dogfooding" two
           | instances) so it bundles an entire cluster of services:
           | background processors (sidekiq), caches (redis, I think?),
           | database server (postgres), optional Elasticsearch, and more. I
           | don't know how much Ruby itself accounts for expensive
           | overhead, but just running all of those other things on a
           | single server vertically for a single user instance is a
           | massive, expensive overhead. (It's clearly built for
           | horizontal scale where your background services and caches
           | and database servers may all be different clusters of
           | VMs/servers over vertical stack efficiency when "scaled down"
           | from the "natural" "mastodon.social scale" that Mastodon is
           | most optimized for.)
           | 
           | It's an interesting optimization problem reminder that
           | scaling factors are different for different needs and not
           | everything scales cleanly to every use case. A single user
           | instance _should_ be able to use a much smaller vertical
           | stack, but scaling down from a wide horizontal stack is not
           | necessarily the best or cheapest place to start when building
           | something like that.
           | 
           | (There are some interesting projects I've seen to build
           | single user instances with much less overhead, shorter
           | vertical stacks. I'm curious to see where those efforts go.
           | In my own usage of Mastodon my "single user" instance gets
           | the benefits of the horizontal scaling Mastodon was built for
           | because my hosting provider does a bunch of work to make sure
           | that they take advantage of that economy of scale to host
           | many small instances for cheaper than trying to run small
           | instances in one-off VMs.)
        
         | mxuribe wrote:
         | There are several websites out there which hope to list many
         | ActivityPub servers (and clients) in many (programming)
         | languages, and other implementation aspects...Like, here's an
         | oldie but goodie website:
         | https://fediverse.party/en/miscellaneous/ ...There are other
         | websites of course.
         | 
         | Just select your desired lang. and review! Now, of course, it
         | might be early days for some languages (e.g. for Rust,
         | etc.)...But, one reason why some languages are used over
         | others...is due to ease of deploying on VPCs and VPC-like hosts
         | (...historically the land that php ruled ;-)
         | 
         | Enjoy, and I hope you find what you're looking for!
        
       | erwinh wrote:
       | A bit off-topic but the post title will probably attract relevant
       | people.
       | 
       | What are the thoughts on OCaml on HN?
        
         | WorldMaker wrote:
         | I haven't used OCaml much directly, but F# is a common enough
         | tool in my toolbelt at this point. My experience of F# is that
         | overall it's a good language family. The access to .NET's
         | standard library (the BCL) and easy interop with C# are the
         | biggest reasons F# is the tool I more often reach for as it
         | already fits the ecosystem most of my other development is in,
         | but I'd love to work more directly with OCaml should the need
         | arise.
        
         | cccbbbaaa wrote:
         | It replaced Python for everything longer than a couple hundred
         | lines for me. Fast language, fast compile times,
         | clean(-ish) syntax, strong typing system, good ecosystem, and
         | now multicore support? Yes please!
         | 
         | I must be more nuanced, though: existing libraries in opam are
         | generally very, very good (I really like cmdliner), but many
         | things may be missing. There is no alternative to Django, for
         | instance. No serious IDE, except emacs. The standard library
         | was so lacking that there is at least an alternative. The
         | situation improved, but there's still missing stuff compared to
         | Python.
        
           | mattpallissard wrote:
           | > There is no alternative to Django, for instance.
           | 
           | https://aantron.github.io/dream/, which is new and used by
           | ocaml.org as well as OP
           | 
           | > No serious IDE, except emacs
           | 
           | and vim, and visual studio, and whatever else supports the
           | LSP protocol via https://github.com/ocaml/ocaml-lsp
           | 
           | > The standard library was so lacking that there is at least
           | an alternative.
           | 
           | While janestreet does have and publish their own stdlib, I
           | personally try to stick to the stdlib whenever possible. Not
           | to knock janestreet. I'm glad they're around and have
           | contributed a bunch.
           | 
           | But overall I agree with you. It's been my favorite language
           | to write in for years now. You can't just reach for off-the-
           | shelf libraries for every little thing. Although the ones
           | that do exist tend to be written halfway decently.
        
           | amelius wrote:
           | Do you make GUIs in OCaml, and which libraries do or would
           | you use?
           | 
           | And how about scientific computing (SciPy), deep learning
           | (PyTorch etc.), or computational geometry (Shapely etc.)?
        
             | yw3410 wrote:
             | GUIs are a PITA like in most languages.
             | 
             | I think most people use something which binds to gtk (such
             | as lablgtk) or Qt.
             | 
             | For scientific computing there is Owl, but I haven't used
             | it personally.
        
               | amelius wrote:
               | Hmm, I'll stick with Python for now.
               | 
               | The ecosystem of libraries is just too good.
               | 
               | Perhaps if OCaml made it very easy to interoperate with
               | Python, I could give it a chance.
        
               | yw3410 wrote:
               | There is pyml for interop but I've never used it.
        
           | still_grokking wrote:
           | I've heard good things about OCaml in general.
           | 
           | But "no serious IDE, except emacs" is a non-starter imho, if
           | it's true.
           | 
           | They should really invest in this. Otherwise the language
           | won't attract any professional developers in the large.
        
             | yw3410 wrote:
             | Vscode works since there is an LSP server.
        
         | zem wrote:
         | one of my favourite languages! not so much for its (excellent)
         | technical qualities, but just as a matter of personal taste -
         | it joined ruby and racket in a short list of languages that
         | just feel nice to program in. (i suspect D would join that list
         | too but despite being interested in it for a while i haven't
         | yet had a compelling project to use it for.)
        
       | dahwolf wrote:
       | Saw some comments on the protocol being fluffy and typical
       | implementations resource hungry. This is an interesting guy to
       | follow:
       | 
       | https://universeodon.com/@supernovae
       | 
       | He's the admin of universeodon, a mastodon instance with 13K MAU.
       | He recently shared that in a month's time, 3TB of text was
       | transferred just in ActivityPub events. Images a multiple of it.
       | I don't know what the bill is, but I was pretty shocked by the
       | stats...for "just" 13K users.
       | 
       | And the cruel thing is that it still doesn't work properly.
       | Likes/boosts and replies do not properly synchronize.
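To put the quoted figures in per-user terms, here is a back-of-the-envelope calculation. It assumes decimal terabytes and a 30-day month, neither of which is stated in the comment, and it ignores images entirely.

```python
# Figures quoted in the comment above.
TRANSFERRED_BYTES = 3 * 10**12   # 3 TB, read as decimal terabytes (assumption)
MONTHLY_ACTIVE_USERS = 13_000
DAYS_IN_MONTH = 30               # assumption

per_user_month_mb = TRANSFERRED_BYTES / MONTHLY_ACTIVE_USERS / 10**6
per_user_day_mb = per_user_month_mb / DAYS_IN_MONTH

print(f"{per_user_month_mb:.0f} MB of ActivityPub text per active user per month")
print(f"{per_user_day_mb:.1f} MB per active user per day")
```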
        
       | mariusor wrote:
       | The author makes the basic mistake of most of the people
       | implementing ActivityPub services: they want to map the logic of
       | an existing type of web application and contort existing domain
       | objects into encoding/decoding to an "impractically large number"
       | of options. That happens because they want two things in one: a
       | server and a client.
       | 
       | The ActivityPub specification needs to be read with a goal
       | similar to an email server in mind. It should do one thing:
       | receive JSON-LD objects in inbox, process them according to the
       | specification, and (maybe) store them on disk.
       | 
       | The idea of "users", "friends", "posts", "feeds" etc, are
       | concepts that belong to the clients on top of this server, not in
       | the server itself.
       | 
       | This separation between clients and server will also allow better
       | interop/graceful degradation of object types that the
       | client/server don't specifically understand.
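A minimal sketch of the "email server" reading described above: an inbox that accepts any identifiable JSON object and stores it without knowing about users, posts, or feeds. The `Inbox` class and the in-memory dict standing in for disk are assumptions made for illustration. Signature verification and the side effects the specification requires for particular activity types are left out.

```python
import json


class Inbox:
    """Accepts ActivityStreams documents and stores them verbatim, keyed by id."""

    def __init__(self) -> None:
        self.store = {}  # id -> raw JSON text; a stand-in for files on disk

    def receive(self, raw: str) -> bool:
        """Store a document; return False if this id was already delivered."""
        doc = json.loads(raw)
        if not isinstance(doc, dict):
            raise ValueError("expected a JSON object")
        if not isinstance(doc.get("id"), str) or "type" not in doc:
            raise ValueError("object needs a string 'id' and a 'type'")
        if doc["id"] in self.store:
            return False  # duplicate delivery, nothing to do
        # The raw text is kept as-is, so properties this server does not
        # understand survive for clients that do.
        self.store[doc["id"]] = raw
        return True


inbox = Inbox()
note = json.dumps({
    "id": "https://example.org/notes/1",
    "type": "Note",
    "content": "hello",
    "someExtension": {"anything": True},  # hypothetical unknown property
})
print(inbox.receive(note))  # first delivery
print(inbox.receive(note))  # repeat delivery
```

Anything resembling a timeline or a friend list would then be built by a client reading from this store, which is the separation the comment argues for.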
        
         | JustSomeNobody wrote:
         | Do you know of a small sample project that does this as an
         | example?
        
           | mariusor wrote:
           | There are no "small sample" projects as far as I know. But if
           | you look in my profile (or other comments in this thread) I
           | did develop a server which only does ActivityPub, client to
           | server and server to server.
        
         | cratermoon wrote:
         | OK, but for someone who wants to build a useful tool that does
         | what the author wants, "interacting with the Fediverse", such
         | as federating with Mastodon, how useful is doing that one
         | thing?
        
           | mariusor wrote:
           | If you want to create one just for yourself, sure. If you
           | want to create something for the rest of the world, probably
           | not very much.
           | 
           | I get the "scratch your own itch" mentality, but not if you
           | kneecap all efforts that try to build on top of it. :D
        
           | jeroenhd wrote:
           | It depends on your goal. If your server is just a tool you
           | use, you can ignore a lot of concepts. There is no local
           | timeline, there are no users, all follows belong to a single
           | user, etc.
           | 
           | I can't find the link but a while back there was a post on
           | the front page about how to get a findable, read only
           | ActivityPub profile by just uploading some static JSON files.
           | Not exactly a Twitter competitor, but you don't need much to
           | start exchanging messages.
        
             | mdasen wrote:
             | I believe you're looking for this:
             | https://blog.joinmastodon.org/2018/06/how-to-implement-a-
             | bas...
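For readers who have not seen what such static files contain, here is a rough sketch that generates the two documents a read-only profile needs: a WebFinger response and an actor object. The field layout follows the commonly used shape (a `self` link of type `application/activity+json`, and a `publicKey` block on the actor); the domain, username, URL paths, and key text below are placeholders, not values from the linked post.

```python
import json


def actor_document(domain: str, username: str, public_key_pem: str) -> dict:
    """Build a minimal actor object; the /actor and /inbox paths are arbitrary choices."""
    actor_id = f"https://{domain}/actor"
    return {
        "@context": [
            "https://www.w3.org/ns/activitystreams",
            "https://w3id.org/security/v1",
        ],
        "id": actor_id,
        "type": "Person",
        "preferredUsername": username,
        "inbox": f"https://{domain}/inbox",
        "publicKey": {
            "id": f"{actor_id}#main-key",
            "owner": actor_id,
            "publicKeyPem": public_key_pem,
        },
    }


def webfinger_document(domain: str, username: str) -> dict:
    """Build the WebFinger response that points the account name at the actor URL."""
    return {
        "subject": f"acct:{username}@{domain}",
        "links": [
            {
                "rel": "self",
                "type": "application/activity+json",
                "href": f"https://{domain}/actor",
            }
        ],
    }


# Placeholder values; a real deployment would use its own domain and RSA public key.
actor = actor_document("my-example.com", "alice", "-----BEGIN PUBLIC KEY-----...")
finger = webfinger_document("my-example.com", "alice")
print(json.dumps(finger, indent=2))
print(json.dumps(actor, indent=2))
```

Serving the WebFinger document from a static file only works if the host ignores the `resource` query parameter, so this suits a single-account site at best.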
        
               | cratermoon wrote:
               | I did that myself. It's quite a distance from passively
               | accepting requests to interacting with the Fediverse.
        
         | still_grokking wrote:
         | This comment raised a whole bunch of red flags for me.
         | 
         | First and foremost: Saying that something is like an email
         | server translates for me into "this is an under- and over-
         | specified swamp at the same time, full of quirks, and actually
         | not implementable in any reasonable way". Because that's what
         | email is. I almost can't think of a greater horror than writing
         | an email server from scratch...
         | 
         | I don't know enough about ActivityPub to judge whether it's
         | really like email. I would strongly hope it isn't, as otherwise
         | it would be a tech you should probably better never touch as a
         | developer.
         | 
         | The next thing is: If an ActivityPub server only receives and
         | sends some opaque BLOBs what's the whole point of it?
         | 
         | But when it's not about opaque BLOBs you need to map the
         | structures in the spec to proper types in a statically typed
         | language, as you can't manipulate them otherwise in any
         | meaningful way. If it's not possible to do that because the
         | spec is vague and/or there is no coherent data model behind it
         | that would be just another reason to not touch this tech.
         | Nobody needs the next underspecified, stringly-typed "email".
         | 
         | I really hope I'm reading this wrong!
        
           | WorldMaker wrote:
           | > If an ActivityPub server only receives and sends some
           | opaque BLOBs what's the whole point of it?
           | 
           | There's still a difference between "try to black-box the
           | incoming data as much as possible" and "treat the incoming
           | data as opaque BLOBs and assume". The data is mostly JSON-LD
           | which is a far cry from "binary large objects". It is always
           | going to be "semi-transparent" as it will always be JSON.
           | Whether or not you like the "-LD" extensions to JSON (they
           | are heavy, they do have a lot of RDF baggage you may not
           | desire), they give you a bunch of guaranteed "baseline
           | schema" for the JSON objects that you can use for static
           | typing that might be "good enough" for a lot of "meaningful
           | manipulations" (such as following links to pick up related
           | objects; LD => linking data) and that is all easily
           | transparent.
           | 
           | A lot of the schemas beyond "LD" in ActivityPub are
           | client/application-specific beyond most of the JSON-LD basics
           | and should be easy to treat as a black box unless doing
           | client/application-specific tasks. That's not necessarily
           | "stringly typed", it's kind of a classic "serialization
           | onion": The server at best needs to know that it is JSON and
           | it may have JSON-LD metadata for relevant related linked
           | objects (and a few other metadata fields common to
           | "introspection", similar to "headers"). The client can dig
           | deeper and know it is not just "any" JSON object but a more
           | specific schema for a given class of thing the client cares
           | about.
        
             | still_grokking wrote:
             | To be honest, this sounds indeed quite like the mess that
             | email is.
             | 
             | If the server isn't just a "dumb 'BLOB' storage" it will
             | need to handle application logic (sooner or later, as this
             | is actually what servers are for)...
             | 
             | But given that the application logic seems to be mostly
             | unspecified, kind of wild west, where every client
             | application can do whatever it thinks its users like, this
             | will unavoidably end in all the problems you have with
             | email, where the server needs to know about all the
             | specific details, quirks, and idiosyncrasies of every
             | client ever built.
             | 
             | The whole concept reads like an implementation of
             | "'Postel's Law' fallacy".
        
               | mariusor wrote:
               | It sounds like you made your mind up. I hope that you'll
               | decide to stop wasting your time by contributing to this
               | thread.
        
               | still_grokking wrote:
               | I'm just reflecting on what I've heard here so far.
               | 
               | I haven't made up my mind, as for that I would need to
               | study the _primary sources_ myself. Talk is cheap. Even
               | here on HN.
               | 
               | But I start to get a kind of picture. And it doesn't look
               | pretty to be honest. That's kind of discouraging and sad.
               | 
               | That's not my fault. I'm just trying to understand what
               | people here are saying.
        
             | mariusor wrote:
             | Thank you for articulating this very well, I was getting a
             | bit frustrated at OP's contrarianism. :)
        
           | mariusor wrote:
           | The email comparison helps people to understand the
           | directional way ActivityPub works, I don't know enough about
           | email (whichever of SMTP or IMAP/POP3/samd you consider that
           | to be) to make a comparison at protocol level.
           | 
           | > If [...]receives and sends some opaque BLOBs what's the
           | whole point of it?
           | 
           | There are some rules about how to have side effects for said
           | blobs. Some of the blobs themselves have side effects. That's
           | mostly what ActivityPub is: rules about how to distribute the
           | blobs in the federated context, rules to what to do with the
           | blobs when they reach your servers (when coming from other
           | servers, or directly from clients).
           | 
           | The vocabulary that ActivityPub is based upon, is another
           | whole specification, called ActivityStreams, and which didn't
           | originate in the W3C group. This vocabulary has three (*main)
           | types of objects: Activities - which provide the backbone of
           | ActivityPub (Like, Follow, Create, Update), Actors -
           | basically different types of users (these are the entities
           | that operate the activities) and, Objects - whatever the
           | Activities operate on.
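The three-way split described above can be made concrete with a small classifier over the `type` property. The sets below are only a subset of the ActivityStreams vocabulary, chosen for illustration, and the `classify` helper is hypothetical. Extension types fall through to "unknown" rather than being rejected.

```python
# A subset of the ActivityStreams vocabulary, grouped as in the comment above.
ACTIVITY_TYPES = {"Accept", "Announce", "Create", "Delete", "Follow",
                  "Like", "Reject", "Undo", "Update"}
ACTOR_TYPES = {"Application", "Group", "Organization", "Person", "Service"}
OBJECT_TYPES = {"Article", "Audio", "Document", "Event", "Image",
                "Note", "Page", "Place", "Tombstone", "Video"}


def classify(doc: dict) -> str:
    """Return 'activity', 'actor', 'object', or 'unknown' for a decoded document."""
    types = doc.get("type")
    # 'type' may be a single string or a list of strings.
    types = types if isinstance(types, list) else [types]
    for t in types:
        if t in ACTIVITY_TYPES:
            return "activity"
        if t in ACTOR_TYPES:
            return "actor"
        if t in OBJECT_TYPES:
            return "object"
    return "unknown"


print(classify({"type": "Follow"}))
print(classify({"type": ["Person"]}))
print(classify({"type": "Note"}))
print(classify({"type": "SomeExtensionType"}))
```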
        
         | MuffinFlavored wrote:
         | > JSON-LD
         | 
         | https://json-ld.org/ for anybody else not super familiar
        
         | iudqnolq wrote:
         | (My only knowledge of activitypub comes from reading this
         | article.)
         | 
         | To receive JSON-LD messages don't you need to send follow
         | requests? And to do that don't you need to deal with the fact
         | the spec is too complicated and most servers implement
         | inconsistent parts of it?
        
           | vidarh wrote:
           | To receive JSON-LD messages, someone needs to send them to
           | you. Sending follow requests is perhaps the easiest way to do
           | that, but those follow requests do not need to be initiated
           | by the same code that hosts the inbox.
           | 
           | The point is there are several potentially independent layers
           | and modules there: The message pump itself at least can be
           | implemented separately from the decoding of individual
           | message types, and separate from managing followers and
           | following, the same way e.g. a mail server knows nothing
           | about how to follow mailing lists, or decoding email messages
           | past the header.
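One way to picture the layering described above is a message pump that knows nothing about individual message types and hands documents to whatever handlers have been registered. This is a sketch under that assumption: `MessagePump`, its method names, and the example payloads are made up for illustration, and follower management lives entirely in a handler outside the pump.

```python
import json
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MessagePump:
    """Routes decoded documents to handlers by type; holds no domain logic itself."""

    def __init__(self) -> None:
        self.handlers: Dict[str, List[Handler]] = {}
        self.unhandled: List[dict] = []

    def on(self, activity_type: str, handler: Handler) -> None:
        self.handlers.setdefault(activity_type, []).append(handler)

    def deliver(self, raw: str) -> None:
        doc = json.loads(raw)
        types = doc.get("type")
        types = types if isinstance(types, list) else [types]
        matched = False
        for t in types:
            if not isinstance(t, str):
                continue
            for handler in self.handlers.get(t, []):
                handler(doc)
                matched = True
        if not matched:
            # Types nobody registered for are kept rather than dropped.
            self.unhandled.append(doc)


# Follower bookkeeping is a separate module that merely subscribes to the pump.
followers = set()
pump = MessagePump()
pump.on("Follow", lambda doc: followers.add(doc["actor"]))

pump.deliver(json.dumps({"type": "Follow", "actor": "https://example.org/users/bob"}))
pump.deliver(json.dumps({"type": "Frobnicate", "actor": "https://example.org/users/eve"}))
print(followers, len(pump.unhandled))
```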
        
             | still_grokking wrote:
             | That sounds like a mess.
             | 
             | Reading through the other comments here it seems that the
             | spec is in fact a mess...
        
               | [deleted]
        
       ___________________________________________________________________
       (page generated 2023-04-24 23:00 UTC)