[HN Gopher] I Wrote an Activitypub Server in OCaml: Lessons Lear... ___________________________________________________________________ I Wrote an Activitypub Server in OCaml: Lessons Learnt, Weekends Lost Author : gopiandcode Score : 92 points Date : 2023-04-23 10:56 UTC (1 day ago) (HTM) web link (gopiandcode.uk) (TXT) w3m dump (gopiandcode.uk) | throwaway290 wrote: | There's also LitePub, though development seems stalled (?) | yawaramin wrote: | Was it developed at all? I'm not seeing any business logic in | the repo: https://hacktivis.me/git/litepub.social/files.html | riffic wrote: | link for reference: https://litepub.social/ | SideburnsOfDoom wrote: | My question is this: if I were to try to hack up an ActivityPub | server in my platform of choice, how would I know how compliant | it is? Is there any compliance test suite to verify this? | | "Try and load it up in a client app" seems suboptimal. | | The "load it up and see" attitude is part of what made parsing and | rendering HTML so hairy, and compliance test suites helped. | mariusor wrote: | There was a suite of tests that sadly fell to bitrot. One of | the developers in the community created a parallel application | that could test implementations, but then this too ended up | unmaintained[1]. | | [1] https://github.com/go-fed/testsuite | nologic01 wrote: | I found the post well written and informative. Though I am | clueless about OCaml it feels as though this would be useful for anybody | working on a new server implementation in any language ecosystem | as it highlights what needs to be done and potential bottlenecks. | | As for the activitypub spec and the currently popular | implementations it doesn't take long exposure to the fediverse to | realise there are some rough edges and historical accidents (e.g. | Mastodon actually being the de facto interpretation of the | standard). 
Imho now that there is substantially more mindshare | devoted to decentralized social it would be opportune to revisit | these things and if needed revise before they get baked in. | mikece wrote: | I'm looking forward to a solid ActivityPub server written in Go or | Rust that can run on modest hardware/small resource Docker hosts. | knjllppppp wrote: | I've had a go at doing it in Go and the ActivityPub spec is so | loosely defined that it's just a real challenge if you intend | to actually unmarshal the JSON you receive | | It's not completely impossible but you have to be okay with | discarding a lot of unknown options or essentially reverse | engineering the objects used by the servers you are federated | with | | That's not to say it's impossible, I was able to crawl the | network successfully, but it hints at the reason that Mastodon | and Pleroma use dynamic languages | | I'd be very interested to see a flexible/complete AP | implementation in any statically typed language | | Fwiw WriteFreely is implemented in Go with go-fed but -- | correct me if I'm wrong -- that library seemed more limited to | me than what Pleroma and Mastodon support | zimpenfish wrote: | > I'd be very interested to see a flexible/complete AP | implementation in any statically typed language | | Try Honk[1] or GoToSocial[2]? | | [1] https://humungus.tedunangst.com/r/honk [2] | https://github.com/superseriousbusiness/gotosocial | mariusor wrote: | Neither is flexible, nor strives for completion. They are | both implementations that try to map the ActivityPub | vocabulary on an existing web-application domain. | | They are not ActivityPub servers, but web-apps that use the | ActivityPub vocabulary to federate, which is what I meant | in the grandparent post when I mentioned the classic | mistake of ActivityPub implementers. 
:D | mariusor wrote: | I'm surprised you didn't find my library because I managed to | create a statically typed vocabulary library for Go that maps | the specification verbatim: https://pkg.go.dev/github.com/go- | ap/activitypub#Object | | It wasn't easy indeed, and it locked me out of some options | to support execution time vocabulary extensions, but hey, it | works and it's relatively easy to use. | knjllppppp wrote: | I'm surprised I didn't find it, too! My google-fu must be | getting rusty. Thanks for the link, I'll have to have a | deep dive when I get the time :) | mariusor wrote: | I hope you get to it. The library itself contains more | than just the vocabulary part and I would be glad of more | eyeballs on the problem. :D | yawaramin wrote: | Here's the implementation described in OP: | https://github.com/Gopiandcode/ocamlot | | OCaml is a statically-typed language. It falls somewhere | between Go and Haskell on the spectrum of type 'strength'. | SideburnsOfDoom wrote: | > Im looking forward to a solid ActivityPub server written in | Go or Rust that can run on modest hardware/small resource | | The "Lightweight" GoLang ActivityPub server is GoToSocial | https://github.com/superseriousbusiness/gotosocial | | The better-known lightweight servers are Pleroma and fork | Akkoma, written in Elixir https://akkoma.dev/AkkomaGang/akkoma/ | | Some of this info I got via: | https://social.treehouse.systems/@ariadne/110226729543740723 | zimpenfish wrote: | There's also Honk[1] which is written in Go but has slightly | wacky source and doesn't support the Mastodon API (but does | provide an inbuilt web UI.) | | [1] https://humungus.tedunangst.com/r/honk | bgorman wrote: | Ocaml code compiles to native binaries, just like Go/Rust. | yawaramin wrote: | Why specifically those languages? Others can also target modest | hardware/small resource Docker hosts. | mariusor wrote: | Well, there is one already as the reference implementation for | a suite of libraries I wrote. 
You can find it at | https://github.com/go-ap/fedbox. (Contributions welcome) | zimpenfish wrote: | Does it only support C2S as the API? Are there any clients | which actually support C2S rather than the Mastodon API? | mariusor wrote: | It does support server to server, but currently it does not | play well with Mastodon due to its limited support of HTTP | Signature algorithms. I haven't been bothered enough by this | yet to actually fix it on my side. | | And there are a number of clients that work with this | specific brand of client to server ActivityPub but I wrote | all of them. The one that can be seen on the internet is a | link aggregator similar to HN and (old) reddit; you can | find a demo instance at https://brutalinks.tech. | zimpenfish wrote: | > It does support server to server | | Ah, sorry, I should have said "client API" rather than | just API there. | mariusor wrote: | ActivityPub has a section which deals with how clients | and servers should communicate with each other (called | Client to Server - C2S - in the spec). So it's the same | vocabulary and operations with slightly different side | effects, but most servers don't implement it because it's | not "specified enough". That's why developers generally | just use the Mastodon API. | jeroenhd wrote: | I think there is (was?) an attempt to rewrite Mastodon in | Rust but I haven't heard much about it. | | A single-user Mastodon instance takes an unreasonable amount of | resources. I don't know if it's just because of Ruby (GitLab | has the same problem, so it might just be) or because everyone | is wasting money on expensive servers, but an RSS feed on | steroids shouldn't take this much RAM. 
| [deleted] | WorldMaker wrote: | Mastodon itself is designed for "flagship scale" (given lead | developers run mastodon.social and mastodon.online, two of | the biggest instances and the most "dogfooding" two | instances) so it bundles an entire cluster of services: | background processors (sidekiq), caches (redis, I think?), | database server (postgres), optional ElastiCache, and more. I | don't know how much Ruby itself accounts for expensive | overhead, but just running all of those other things on a | single server vertically for a single user instance is a | massive, expensive overhead. (It's clearly built for | horizontal scale where your background services and caches | and database servers may all be different clusters of | VMs/servers over vertical stack efficiency when "scaled down" | from the "natural" "mastodon.social scale" that Mastodon is | most optimized for.) | | It's an interesting optimization problem reminder that | scaling factors are different for different needs and not | everything scales cleanly to every use case. A single user | instance _should_ be able to use a much smaller vertical | stack, but scaling down from a wide horizontal stack is not | necessarily the best or cheapest place to start when building | something like that. | | (There are some interesting projects I've seen to build | single user instances with much less overhead, shorter | vertical stacks. I'm curious to see where those efforts go. | In my own usage of Mastodon my "single user" instance gets | the benefits of the horizontal scaling Mastodon was built for | because my hosting provider does a bunch of work to make sure | that they take advantage of that economy of scale to host | many small instances for cheaper than trying to run small | instances in one-off VMs.) 
| mxuribe wrote: | There are several websites out there which hope to list many | ActivityPub servers (and clients) in many (programming) | languages, and other implementation aspects...Like, here's an | oldie but goodie website: | https://fediverse.party/en/miscellaneous/ ...There are other | websites of course. | | Just select your desired lang. and review! Now, of course, it | might be early days for some languages (e.g. for Rust, | etc.)...But, one reason why some languages are used over | others...is due to ease of deploying on VPCs and VPC-like hosts | (...historically the land that PHP ruled ;-) | | Enjoy, and I hope you find what you're looking for! | erwinh wrote: | A bit off-topic but the post title will probably attract relevant | people. | | What are the thoughts on OCaml on HN? | WorldMaker wrote: | I haven't used OCaml much directly, but F# is a common enough | tool in my toolbelt at this point. My experience of F# is that | overall it's a good language family. The access to .NET's | standard library (the BCL) and easy interop with C# are the | biggest reasons F# is the tool I more often reach for as it | already fits the ecosystem most of my other development is in, | but I'd love to work more directly with OCaml should the need | arise. | cccbbbaaa wrote: | It replaced Python for everything longer than a couple hundred | lines for me. Fast language, fast compile times, | clean(-ish) syntax, strong typing system, good ecosystem, and | now multicore support? Yes please! | | I must be more nuanced, though: existing libraries in opam are | generally very, very good (I really like cmdliner), but many | things may be missing. There is no alternative to Django, for | instance. No serious IDE, except emacs. The standard library | was so lacking that there is at least an alternative. The | situation has improved, but there's still missing stuff compared to | Python. | mattpallissard wrote: | > There is no alternative to Django, for instance. 
| | https://aantron.github.io/dream/, which is new and used by | ocaml.org as well as OP | | > No serious IDE, except emacs | | and vim, and visual studio, and whatever else supports | LSP via https://github.com/ocaml/ocaml-lsp | | > The standard library was so lacking that there is at least | an alternative. | | While Jane Street does have and publish their own stdlib, I | personally try to stick to the stdlib whenever possible. Not | to knock Jane Street. I'm glad they're around and have | contributed a bunch. | | But overall I agree with you. It's been my favorite language | to write in for years now. You can't just reach for off-the- | shelf libraries for every little thing. Although the ones | that do exist tend to be written halfway decently. | amelius wrote: | Do you make GUIs in OCaml, and which libraries do or would | you use? | | And how about scientific computing (SciPy), deep learning | (PyTorch etc.), or computational geometry (Shapely etc.)? | yw3410 wrote: | GUIs are a PITA like in most languages. | | I think most people use something which binds to gtk (such | as lablgtk) or Qt. | | For scientific computing there is Owl, but I haven't used | it personally. | amelius wrote: | Hmm, I'll stick with Python for now. | | The ecosystem of libraries is just too good. | | Perhaps if OCaml made it very easy to interoperate with | Python, I could give it a chance. | yw3410 wrote: | There is pyml for interop but I've never used it. | still_grokking wrote: | I've heard good things about OCaml in general. | | But "no serious IDE, except emacs" is a non-starter imho, if | it's true. | | They should really invest in this. Otherwise the language | won't attract any professional developers at large. | yw3410 wrote: | VS Code works since there is an LSP server. | zem wrote: | one of my favourite languages! 
not so much for its (excellent) | technical qualities, but just as a matter of personal taste - | it joined ruby and racket in a short list of languages that | just feel nice to program in. (i suspect D would join that list | too but despite being interested in it for a while i haven't | yet had a compelling project to use it for.) | dahwolf wrote: | Saw some comments on the protocol being fluffy and typical | implementations resource hungry. This is an interesting guy to | follow: | | https://universeodon.com/@supernovae | | He's the admin of universeodon, a mastodon instance with 13K MAU. | He recently shared that in a month's time, 3TB of text was | transferred just in ActivityPub events. Images a multiple of it. | I don't know what the bill is, but I was pretty shocked by the | stats...for "just" 13K users. | | And the cruel thing is that it still doesn't work properly. | Likes/boosts and replies do not properly synchronize. | mariusor wrote: | The author makes the basic mistake of most of the people | implementing ActivityPub services: they want to map the logic of | an existing type of web application and contort existing domain | objects into encoding/decoding to an "impractically large number" | of options. That happens because they want two things in one: a | server and a client. | | The ActivityPub specification needs to be read with a goal | similar to an email server in mind. It should do one thing: | receive JSON-LD objects in inbox, process them according to the | specification, and(maybe) store them on disk. | | The idea of "users", "friends", "posts", "feeds" etc, are | concepts that belong to the clients on top of this server, not in | the server itself. | | This separation between clients and server will also allow better | interop/graceful degradation of object types that the | client/server don't specifically understand. | JustSomeNobody wrote: | Do you know of a small sample project that does this as an | example? 
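For readers with the same question, here is a minimal sketch of the "receive, process, store" reading of the spec described above: accept a JSON document, check only the envelope, store it unmodified, and leave interpretation to clients. It is written in Python for brevity and is not taken from any project mentioned in this thread. The `Inbox` and `Envelope` names, the in-memory lists, and the short `KNOWN_TYPES` set are all hypothetical. It also assumes `type` is a single string and skips HTTP signature checks, delivery, and JSON-LD context processing, all of which a real server would need.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List

# Activity types this sketch claims to understand. The names come from the
# ActivityStreams vocabulary; the choice of this subset is an assumption.
KNOWN_TYPES = {"Create", "Follow", "Like", "Update"}


@dataclass
class Envelope:
    """The few fields the server inspects; everything else stays opaque."""
    type: str
    actor: str
    raw: Dict[str, Any]  # the full, unmodified JSON object


class Inbox:
    """Hypothetical in-memory inbox: validate the envelope, store, defer."""

    def __init__(self) -> None:
        self.stored: List[Envelope] = []
        self.unhandled: List[Envelope] = []

    def receive(self, body: str) -> Envelope:
        data = json.loads(body)
        if not isinstance(data, dict):
            raise ValueError("activity must be a JSON object")
        activity_type = data.get("type")
        actor = data.get("actor")
        # "actor" may be a bare IRI or an embedded object carrying an "id".
        if isinstance(actor, dict):
            actor = actor.get("id")
        if not isinstance(activity_type, str) or not isinstance(actor, str):
            raise ValueError("activity needs a string 'type' and an actor id")
        envelope = Envelope(type=activity_type, actor=actor, raw=data)
        self.stored.append(envelope)  # always keep the original document
        if activity_type not in KNOWN_TYPES:
            # Graceful degradation: unknown types are kept, not rejected.
            self.unhandled.append(envelope)
        return envelope
```

Calling `Inbox().receive(body)` with a `Follow` activity returns an `Envelope` whose `raw` field still holds every property the sender included, so a client that understands more of the vocabulary than the server can interpret it later. Unknown activity types land in `unhandled` instead of raising, which is one way to get the graceful degradation mentioned above.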
| mariusor wrote: | There are no "small sample" projects as far as I know. But if | you look in my profile (or other comments in this thread) I | did develop a server which only does ActivityPub, client to | server and server to server. | cratermoon wrote: | OK, but for someone who wants to build a useful tool that does | what the author wants, "interacting with the Fediverse", such | as federating with Mastodon, how useful is doing that one | thing? | mariusor wrote: | If you want to create one just for yourself, sure. If you | want to create something for the rest of the world, probably | not very much. | | I get the "scratch your own itch" mentality, but not if you | kneecap all efforts that try to build on top of it. :D | jeroenhd wrote: | It depends on your goal. If your server is just a tool you | use, you can ignore a lot of concepts. There is no local | timeline, there are no users, all follows belong to a single | user, etc. | | I can't find the link but a while back there was a post on | the front page about how to get a findable, read-only | ActivityPub profile by just uploading some static JSON files. | Not exactly a Twitter competitor, but you don't need much to | start exchanging messages. | mdasen wrote: | I believe you're looking for this: | https://blog.joinmastodon.org/2018/06/how-to-implement-a- | bas... | cratermoon wrote: | I did that myself. It's quite a distance from passively | accepting requests to interacting with the Fediverse. | still_grokking wrote: | This comment raised a whole bunch of red flags for me. | | First and foremost: Saying that something is like an email | server translates for me into "this is an under- and over- | specified swamp at the same time, full of quirks, and actually | not implementable in any reasonable way". Because that's what | email is. I almost can't think of a greater horror than writing | an email server from scratch... | | I don't know enough about ActivityPub to judge whether it's | really like email. 
I would strongly hope it isn't, as otherwise | it would be a tech you should probably never touch as a | developer. | | The next thing is: If an ActivityPub server only receives and | sends some opaque BLOBs what's the whole point of it? | | But when it's not about opaque BLOBs you need to map the | structures in the spec to proper types in a statically typed | language as you can't manipulate them otherwise in any | meaningful way. If it's not possible to do that because the | spec is vague and/or there is no coherent data model behind it | that would be just another reason to not touch this tech. | Nobody needs the next underspecified, stringly-typed "email". | | I really hope I'm reading this wrong! | WorldMaker wrote: | > If an ActivityPub server only receives and sends some | opaque BLOBs what's the whole point of it? | | There's still a difference between "try to black-box the | incoming data as much as possible" and "treat the incoming | data as opaque BLOBs and assume". The data is mostly JSON-LD | which is a far cry from "binary large objects". It is always | going to be "semi-transparent" as it will always be JSON. | Whether or not you like the "-LD" extensions to JSON (they | are heavy, they do have a lot of RDF baggage you may not | desire), they give you a bunch of guaranteed "baseline | schema" for the JSON objects that you can use for static | typing that might be "good enough" for a lot of "meaningful | manipulations" (such as following links to pick up related | objects; LD => linked data) and that is all easily | transparent. | | A lot of the schemas beyond "LD" in ActivityPub are | client/application-specific beyond most of the JSON-LD basics | and should be easy to treat as a black box unless doing | client/application-specific tasks. 
That's not necessarily | "stringly typed", it's kind of a classic "serialization | onion": The server at most needs to know that it is JSON and | it may have JSON-LD metadata for relevant related linked | objects (and a few other metadata fields common to | "introspection", similar to "headers"). The client can dig | deeper and know it is not just "any" JSON object but a more | specific schema for a given class of thing the client cares | about. | still_grokking wrote: | To be honest, this does indeed sound quite like the mess that | email is. | | If the server isn't just a "dumb 'BLOB' storage" it will | need to handle application logic (sooner or later, as this | is actually what servers are for)... | | But given that the application logic seems to be mostly | unspecified, kind of wild west, where every client | application can do whatever it thinks its users like, this | will unavoidably end in all the problems you have with | email, where the server needs to know about all the | specific details, quirks, and idiosyncrasies of every | client ever built. | | The whole concept reads like an implementation of | the "'Postel's Law' fallacy". | mariusor wrote: | It sounds like you made your mind up. I hope that you'll | decide to stop wasting your time by contributing to this | thread. | still_grokking wrote: | I'm just reflecting on what I've heard here so far. | | I haven't made up my mind, as for that I would need to | study the _primary sources_ myself. Talk is cheap. Even | here on HN. | | But I'm starting to get a kind of picture. And it doesn't look | pretty to be honest. That's kind of discouraging and sad. | | That's not my fault. I'm just trying to understand what | people here are saying. | mariusor wrote: | Thank you for articulating this very well; I was getting a | bit frustrated at OP's contrarianism. 
:) | mariusor wrote: | The email comparison helps people to understand the | directional way ActivityPub works, I don't know enough about | email (whichever of SMTP or IMAP/POP3/samd you consider that | to be) to make a comparison at protocol level. | | > If [...]receives and sends some opaque BLOBs what's the | whole point of it? | | There are some rules about how to have side effects for said | blobs. Some of the blobs themselves have side effects. That's | mostly what ActivityPub is: rules about how to distribute the | blobs in the federated context, rules to what to do with the | blobs when they reach your servers (when coming from other | servers, or directly from clients). | | The vocabulary that ActivityPub is based upon, is another | whole specification, called ActivityStreams, and which didn't | originate in the W3C group. This vocabulary has three (*main) | types of objects: Activities - which provide the backbone of | ActivityPub (Like, Follow, Create, Update), Actors - | basically different types of users (these are the entities | that operate the activities) and, Objects - whatever the | Activities operate on. | MuffinFlavored wrote: | > JSON-LD | | https://json-ld.org/ for anybody else not super familiar | iudqnolq wrote: | (My only knowledge of activitypub comes from reading this | article.) | | To receive JSON-LD messages don't you need to send follow | requests? And to do that don't you need to deal with the fact | the spec is too complicated and most servers implement | inconsistent parts of it? | vidarh wrote: | To receive JSON-LD messages, someone needs to send them to | you. Sending follow requests is perhaps the easiest way to do | that, but those follow requests do not need to be initiated | by the same code that hosts the inbox. 
| | The point is there are several potentially independent layers | and modules there: The message pump itself at least can be | implemented separately from the decoding of individual | message types, and separate from managing followers and | following, the same way e.g. a mail server knows nothing | about how to follow mailing lists, or decoding email messages | past the header. | still_grokking wrote: | That sounds like a mess. | | Reading through the other comments here it seems that the | spec is in fact a mess... | [deleted] ___________________________________________________________________ (page generated 2023-04-24 23:00 UTC)