Get started with ActivityPub
This is the written version of the talk Nilesh Trivedi presented at Git Commit Show 2020. In it, he talks about ActivityPub, a decentralized social networking protocol.
About the speaker
Nilesh Trivedi is an open-source enthusiast. He enjoys programming, Ruby in particular. He also likes to learn new skills, and that has been true for more than two decades. In this talk, he focuses on how to get started with ActivityPub, a decentralized social networking protocol.
What is ActivityPub?
ActivityPub is a relatively unknown protocol, but it may well become as important as something like HTTP. It is a W3C (World Wide Web Consortium) Recommendation for decentralized social networking. Last year, Twitter ran into controversy over censoring, or refusing to censor, certain accounts. This angered a lot of Indian users, many of whom then discovered an app called Mastodon. Mastodon is meant to be an alternative to Twitter, but with a difference. Twitter is a centralized system, meaning there is a central entity where all the accounts and tweets are maintained. Governments can step in and force it to delete accounts, and it alone decides its policies. Sometimes there are disagreements on what the moderation policy should be. Mastodon got coverage on the BBC and other news portals, which boosted its popularity. Thousands of people discovered it and started switching to it.
So, what is this Mastodon?
Mastodon is an app that implements the ActivityPub protocol.
ActivityPub brings to social networking the same paradigm that email brought to messaging. Email was invented in the 1970s and has stood the test of time for about 50 years. If a protocol has been in place for 50 years, the architecture behind it must be really strong, and the reason is that email is a federated protocol.
A federated protocol is an open protocol that allows lots of small networks to talk to one another without giving up control completely.
For example, you can have your own email server and your friend can have her own email server. You can create accounts on your server, and she can create accounts on hers. Both of you will still be able to communicate with each other. This has not been the case with Facebook, YouTube, Twitter, etc. Centralization leads to censorship, filter bubbles, moderation issues, disputes, abuse, and a lot of other things. You will not always agree with Twitter’s policies. Therefore, you should be able to choose your server and your instance, and this is already working for email.
So, ActivityPub is an initiative that tries to bring the same paradigm to social networking. The protocol does not give privileged status to any server. ActivityPub is a simple protocol, essentially HTTP plus JSON. In 2018, it reached final W3C Recommendation status, and it has multiple implementations. The schema and the objects it defines are quite flexible, so you can build the familiar broadcast style of social networking with a follow model, where users follow each other. If a user broadcasts something, it gets delivered to all of their followers. People can discover each other’s content. The same paradigm is at work on Twitter, YouTube, and Facebook. Multiple apps implement ActivityPub, such as Mastodon and PeerTube (an alternative to YouTube).
So, let’s go into the concrete details of why it matters, not just as a user but as a developer.
Let’s make the case. It might be worthwhile to implement ActivityPub in your projects to become part of this larger ecosystem, called the Fediverse.
The Fediverse is a network of applications that all talk to each other via this shared protocol. For example, there is an activity that creates a Note, which is like posting a tweet. There are also objects defined for follow, unfollow, and groups, and each instance can set its own rules. But the underlying schema remains shared.
There’s a standard called Activity Streams 2.0. It is JSON with linked data, served over HTTPS, that defines an extensible vocabulary of activities. Activity Streams and the ActivityPub protocol define several models.
At the core level, you have objects. Just as a tweet is an object, a video is an object, and schemas are defined for these. For example, one object type is called a “Note,” and it can be anything between a tweet and a blog post because there is no character limit. It can have attachments, such as an image or other media, and it can have a URL to go with it. Then there are actors, which perform activities on objects. Actors are defined to be very flexible. An actor doesn’t have to be a human. It can be a company, an app, or a service. When an actor operates on an object, it produces an activity. For example, when I create a tweet, I am the actor, creating is the activity, and the note is the object. In addition to this, there are collections, where you can group things.
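To make the actor, activity, and object relationship concrete, here is a minimal sketch of a Create activity wrapping a Note, using Activity Streams 2.0 property names. The function name and the example.social URLs are hypothetical and only serve as an illustration.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
# Special collection that addresses an activity to everyone
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def create_note_activity(actor_url, note_id, content, published):
    """Build a Create activity (the verb) around a Note (the object)."""
    note = {
        "id": note_id,
        "type": "Note",
        "attributedTo": actor_url,  # the actor who wrote the note
        "content": content,
        "published": published,
        "to": [PUBLIC],
    }
    return {
        "@context": AS_CONTEXT,
        "id": note_id + "/activity",  # hypothetical ID scheme
        "type": "Create",
        "actor": actor_url,
        "object": note,
        "to": note["to"],
    }


activity = create_note_activity(
    "https://example.social/users/alice",  # hypothetical actor URL
    "https://example.social/notes/1",      # hypothetical note URL
    "Hello, Fediverse!",
    "2020-01-01T00:00:00Z",
)
print(json.dumps(activity, indent=2))
```

Here the actor is alice, the activity is Create, and the object is the Note, which mirrors the tweet example above.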
In the case of email, your inbox is held on a server in the cloud. The rest of the world is allowed to put things in your inbox. You can retrieve messages from that inbox using any client app, such as Gmail’s front end or Outlook.
However, email gives you no outbox that lives in the cloud in the same way. ActivityPub simply adds that capability. So, along with your inbox, your outbox also lives on a server of your choice. You can place items in your outbox, and the rest of the world can read them. Adding this one missing piece opens up a lot of capability.
You can implement things like broadcast. You can put a video in your outbox, and that becomes your YouTube channel. You can put an audio file in your outbox, and it becomes your podcast. By allowing this sort of two-way operation, by you and by the rest of the world, on a server of your choice, we can implement many applications on top of a shared identity, shared schema, and shared protocol of communication. Think of ActivityPub as email, except that your outbox is also in the cloud and you can choose your own server.
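The inbox and outbox idea can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the ActorBoxes class and its method names are hypothetical, and a real server would persist these collections and expose them over HTTP.

```python
from dataclasses import dataclass, field


@dataclass
class ActorBoxes:
    """Hypothetical in-memory inbox and outbox for one actor."""
    actor_url: str
    inbox: list = field(default_factory=list)
    outbox: list = field(default_factory=list)

    def receive(self, activity):
        # The rest of the world writes to the inbox
        self.inbox.append(activity)

    def publish(self, activity):
        # The owner writes to the outbox
        self.outbox.append(activity)

    def outbox_collection(self):
        # The rest of the world reads the outbox, newest item first
        return {
            "@context": "https://www.w3.org/ns/activitystreams",
            "id": self.actor_url + "/outbox",
            "type": "OrderedCollection",
            "totalItems": len(self.outbox),
            "orderedItems": list(reversed(self.outbox)),
        }


alice = ActorBoxes("https://example.social/users/alice")
alice.publish({"type": "Create", "id": "1"})
alice.publish({"type": "Create", "id": "2"})
alice.receive({"type": "Follow", "actor": "https://remote.example/users/bob"})
collection = alice.outbox_collection()
```

The two directions are deliberately asymmetric: anyone may append to the inbox, only the owner appends to the outbox, and anyone may read the outbox.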
How do you implement ActivityPub in your application?
If you want to implement this in your own application, there are primarily four things you need to implement.
As a reference example, we will use LearnAwesome.
LearnAwesome is an open-source project and an alternative to Goodreads that implements the ActivityPub protocol. No content is posted on LearnAwesome itself; it is just a repository of links. But on top of that collection of links and its high-quality metadata, there is a social network. Every user on LearnAwesome has an ActivityPub profile, which can be followed by anybody else. When a person follows someone on LearnAwesome, it records that in a list of the user’s ActivityPub followers. Beyond that, it just needs a background job to integrate seamlessly.
Given a name and a domain, how do you find out where the rest of the API calls live? That is handled by something called WebFinger. You implement this path in your app so that anyone can query a specific user name (name@domain) on your app. WebFinger is a very extensible protocol. You can return a phone number or an email address. In our case, we will return the URLs of the user’s actor, inbox, and outbox. Any ActivityPub client will make this request first and use it to figure out the actor’s URL, inbox, and outbox.
For every user, I map my internal ID to the ActivityPub user name by replacing “-” with “_”.
The JSON response then gives the user’s actor URL.
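As a hedged sketch of the WebFinger step, the function below builds the JSON for a query such as `/.well-known/webfinger?resource=acct:alice@example.social`. The function name, the `actor_url_for` callback, and the example.social domain are hypothetical. This sketch returns only the actor link, in the form Mastodon-compatible clients look for, on the assumption that the inbox and outbox URLs are then read from the actor document.

```python
def webfinger_response(resource, domain, actor_url_for):
    """Return a WebFinger JSON document for `resource`, or None if unknown.

    `resource` looks like "acct:alice@example.social"; `actor_url_for`
    is a hypothetical callback mapping a user name to its actor URL.
    """
    if not resource.startswith("acct:"):
        return None
    name, sep, host = resource[len("acct:"):].partition("@")
    if not sep or not name or host != domain:
        return None  # not an account on this server
    return {
        "subject": resource,
        "links": [
            {
                "rel": "self",
                "type": "application/activity+json",
                "href": actor_url_for(name),
            }
        ],
    }


def example_actor_url(name):
    # Hypothetical URL layout for this illustration
    return f"https://example.social/users/{name}"


found = webfinger_response("acct:alice@example.social", "example.social", example_actor_url)
missing = webfinger_response("acct:alice@other.example", "example.social", example_actor_url)
```

A real handler would also check that the user exists before answering, and return a 404 when the function yields None.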
If you follow this actor URL (the href), it returns the details of the actor.
The most important things are the inbox URL, the outbox URL, and the public key. Other servers use the public key to verify the signatures on messages sent by this actor.
In the talk, these were shown as the JSON returned for the WebFinger query and for the actor object.
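A minimal sketch of an actor document follows, in the shape commonly served by Mastodon-compatible servers. The URL layout (`/users/<name>`, `/inbox`, `/outbox`, `#main-key`) and the function name are assumptions for illustration, not LearnAwesome's actual routes.

```python
def actor_document(base_url, username, public_key_pem):
    """Build the JSON served at the actor URL (hypothetical URL layout)."""
    actor_url = f"{base_url}/users/{username}"
    return {
        "@context": [
            "https://www.w3.org/ns/activitystreams",
            "https://w3id.org/security/v1",  # vocabulary for publicKey
        ],
        "id": actor_url,
        "type": "Person",
        "preferredUsername": username,
        "inbox": actor_url + "/inbox",
        "outbox": actor_url + "/outbox",
        "publicKey": {
            "id": actor_url + "#main-key",
            "owner": actor_url,
            "publicKeyPem": public_key_pem,
        },
    }


# The PEM string here is a placeholder, not a real key
actor = actor_document(
    "https://example.social",
    "alice",
    "-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----",
)
```

The `publicKey.id` value is what a sender later places in the keyId part of its signature header, so receivers know which key to fetch.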
When it comes to the inbox, the rest of the world will be posting to a user’s inbox. When the server receives such a request, it first checks that it is a verified call. In this case, I’m responding to a Follow request, which is what arrives when someone wants to follow me. I create an ActivityPub follower record in my database, and then I invoke a background job to send an Accept response.
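The Follow handling described above can be sketched as follows. This assumes the request signature has already been verified, stands in a set for the followers table and a list for the background job queue, and handles only the case where the Follow's object is a plain URL string. All names are hypothetical.

```python
import uuid


def handle_inbox_activity(activity, local_actor_url, followers, accept_queue):
    """Record a follower and queue an Accept; ignore everything else.

    `followers` (a set) stands in for the database table and
    `accept_queue` (a list) stands in for the background job queue.
    """
    if activity.get("type") != "Follow":
        return None
    if activity.get("object") != local_actor_url:
        return None  # the Follow targets some other actor
    follower = activity["actor"]
    followers.add(follower)
    accept = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": f"{local_actor_url}#accepts/{uuid.uuid4()}",  # hypothetical ID scheme
        "type": "Accept",
        "actor": local_actor_url,
        "object": activity,  # echo the Follow being accepted
    }
    accept_queue.append((follower, accept))
    return accept


followers, queue = set(), []
me = "https://example.social/users/alice"
follow = {
    "id": "https://remote.example/follows/1",
    "type": "Follow",
    "actor": "https://remote.example/users/bob",
    "object": me,
}
accepted = handle_inbox_activity(follow, me, followers, queue)
```

The queued Accept would then be delivered to the follower's inbox by the background job.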
For verification, there is a header called “Signature” (from the HTTP Signatures scheme), which carries a signature computed over certain parts of the request. The header has parts such as keyId, headers, and signature. Using those parts, if the signature matches, the receiver can be sure that the request actually comes from the server it claims to come from.
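Here is a rough sketch of the verification flow, following the draft HTTP Signatures scheme that Mastodon-compatible servers use. The cryptographic check itself is left to a caller-supplied `verify` callable, and key lookup to `fetch_public_key`; both are hypothetical stand-ins for a crypto library and an HTTP fetch of the sender's actor document. Request headers are assumed to be keyed by lower-case names.

```python
import base64
import re


def parse_signature_header(value):
    """Split 'keyId="...",headers="...",signature="..."' into a dict."""
    return dict(re.findall(r'(\w+)="([^"]*)"', value))


def build_signing_string(method, path, request_headers, signed_header_names):
    """Rebuild the exact string the sender signed, one line per header."""
    lines = []
    for name in signed_header_names:
        if name == "(request-target)":
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name}: {request_headers[name]}")
    return "\n".join(lines)


def verify_request(method, path, request_headers, fetch_public_key, verify):
    """Return True if the Signature header matches the request.

    `fetch_public_key(key_id)` and `verify(key, signature, data)` are
    hypothetical callables supplied by the application.
    """
    params = parse_signature_header(request_headers["signature"])
    names = params.get("headers", "date").split()
    signing_string = build_signing_string(method, path, request_headers, names)
    public_key = fetch_public_key(params["keyId"])
    signature = base64.b64decode(params["signature"])
    return verify(public_key, signature, signing_string.encode())
```

In practice `verify` would wrap an RSA-SHA256 check from a cryptography library, and the receiver would also confirm that the Digest header matches the request body.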
Then there is the method that posts to ActivityPub, which notifies the servers of every follower in the database when something happens on my end. It takes a document, creates a signature header, adds the headers to the request, and does an HTTP POST. This happens asynchronously, in a background job.
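The delivery step might look like the sketch below, which would run inside the background job. The `sign` and `post` callables are hypothetical: `sign` would wrap an RSA-SHA256 signature made with the actor's private key, and `post` would perform the actual HTTP POST. The choice of signed headers follows common Mastodon-compatible practice and is an assumption here.

```python
import base64
import hashlib
import json
from email.utils import formatdate
from urllib.parse import urlparse


def signed_headers(inbox_url, body, key_id, sign):
    """Build request headers, including Digest and Signature, for one POST."""
    url = urlparse(inbox_url)
    digest = "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode()
    headers = {
        "Host": url.netloc,
        "Date": formatdate(usegmt=True),
        "Digest": digest,
        "Content-Type": "application/activity+json",
    }
    signing_string = "\n".join([
        f"(request-target): post {url.path}",
        f"host: {headers['Host']}",
        f"date: {headers['Date']}",
        f"digest: {headers['Digest']}",
    ])
    signature = base64.b64encode(sign(signing_string.encode())).decode()
    headers["Signature"] = (
        f'keyId="{key_id}",algorithm="rsa-sha256",'
        f'headers="(request-target) host date digest",signature="{signature}"'
    )
    return headers


def deliver_to_followers(activity, follower_inboxes, key_id, sign, post):
    """Send one activity to every distinct follower inbox."""
    body = json.dumps(activity).encode()
    for inbox_url in sorted(set(follower_inboxes)):
        post(inbox_url, body, signed_headers(inbox_url, body, key_id, sign))
```

Deduplicating inbox URLs avoids sending the same activity twice when several followers share an inbox on one server.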
So, the main point is that ActivityPub began as a foundation for building a microblogging network, but it is now a distributed pub/sub protocol for the entire Web. Apps can subscribe to each other across domains, across owners, and across servers, and interesting things can happen. An app could be a public broadcaster for all of its subscribers. Apps can follow each other, and systems can follow each other. AWS could become a publisher of ActivityPub events. All your monitoring dashboards could become ActivityPub clients. So it opens up a number of possibilities.
As developers, we should look at ActivityPub as a generic, powerful glue. It allows us to build far more interesting applications. And in doing so, we are also solving problems like centralization, censorship, filter bubbles, and more.