this post was submitted on 29 Dec 2024
76 points (95.2% liked)

Fediverse


A community to talk about the Fediverse and all its related services using ActivityPub (Mastodon, Lemmy, KBin, etc).


Let's say we have lemmy instances A, B, C.

alice from A makes a post "Hello, world" to B. What happens? How is it processed on servers A, B, C and how do users from A, B, C receive her post?

all 46 comments
[–] [email protected] 54 points 1 month ago* (last edited 1 month ago) (3 children)

alice from A makes a post “Hello, world” to B

Alice can't make a post to B, but I assume you mean a community on B, let's call it foo. When Alice makes a post it first goes through A's local API and creates the local (and canonical) version of Alice's post. Once A has finished processing Alice's post, it will create an ActivityPub representation of Alice's post to send to B.

ActivityPub is basically a bunch of assumptions laid on top of JSON. An ActivityPub 'file' broadly falls into one of three types: Object, Activity and Actor.[^note1] These types then have subtypes; for example, both Alice and foo are actors, but Alice is a Person while foo is a Group.

A second important assumption of ActivityPub is the concept of inboxes and outboxes, but, for Lemmy, only inboxes matter. An inbox is just a URL where Lemmy can send activities, and it's something all actors have.

So when instance A is finished processing Alice's post, it will turn it into a Page object, wrap that in a Create activity and send it to foo's inbox.

Roughly what the JSON would look like:

{
  "@context": [
    "https://join-lemmy.org/context.json",
    "https://www.w3.org/ns/activitystreams"
  ],
  "actor": "https://a/u/alice",
  "type": "Create",
  "to": ["https://www.w3.org/ns/activitystreams#Public"],
  "cc": ["https://b/c/foo"],
  "id": "https://a/activities/create/19199919009100",
  "object": {
    "type": "Page",
    "id": "https://a/post/1",
    "attributedTo": "https://a/u/alice",
    "to": [
      "https://b/c/foo",
      "https://www.w3.org/ns/activitystreams#Public"
    ],
    "audience": "https://b/c/foo",
    "name": "Hello world",
    "attachment": [],
    "sensitive": false,
    "language": {
      "identifier": "en",
      "name": "English"
    },
    "published": "2024-12-29T15:10:51.557399Z"
  }
}
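
To make the wrapping step a bit more concrete, here is a minimal Python sketch that builds a Create activity around a Page object. The function name `build_create_activity` and its parameters are made up for this example, and only a subset of the fields from the JSON above is included. Lemmy itself is written in Rust, so treat this as an illustration of the shape of the data, not as Lemmy's code.

```python
from datetime import datetime, timezone

PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_create_activity(actor, community, post_id, title, activity_id):
    """Wrap a Page object in a Create activity addressed to a community.

    All arguments are plain strings; the names are illustrative only.
    """
    page = {
        "type": "Page",
        "id": post_id,
        "attributedTo": actor,
        "to": [community, PUBLIC],
        "audience": community,
        "name": title,
        "published": datetime.now(timezone.utc).isoformat(),
    }
    return {
        "@context": [
            "https://join-lemmy.org/context.json",
            "https://www.w3.org/ns/activitystreams",
        ],
        "actor": actor,
        "type": "Create",
        "to": [PUBLIC],
        "cc": [community],
        "id": activity_id,
        "object": page,
    }


create = build_create_activity(
    actor="https://a/u/alice",
    community="https://b/c/foo",
    post_id="https://a/post/1",
    title="Hello world",
    activity_id="https://a/activities/create/1",
)
```

Instance A would then POST the resulting JSON to the inbox URL of the community actor.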



Instance B will then receive this and do the same kind of processing A did when Alice created the post via the API. Once it has finished, it will turn the post back into a Page, but this time wrap it in an Announce activity. B will then look at all the actors that follow foo (i.e. are subscribed to it) and send this Announce to all of their inboxes. Assuming a user on instance C follows foo, C will receive this Announce and process it like A and B before it, creating the local version of Alice's post.

Edit: I made a small mistake, I said that foo wrapped the Page in an Announce, when it actually wraps the Create in an Announce.
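
A rough Python sketch of that fan-out step, with the Create wrapped in an Announce as per the edit. The helper names `wrap_in_announce` and `delivery_targets` are invented for this example, and it assumes as a simplification that one delivery per remote instance is enough; how Lemmy actually picks inboxes may differ.

```python
from urllib.parse import urlparse

PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def wrap_in_announce(community, create, announce_id):
    """Wrap an incoming Create activity in an Announce sent by the community."""
    return {
        "actor": community,
        "type": "Announce",
        "id": announce_id,
        "to": [PUBLIC],
        "object": create,
    }


def delivery_targets(follower_inboxes):
    """Keep the first inbox seen for each host so every instance is contacted once."""
    targets = {}
    for inbox in follower_inboxes:
        host = urlparse(inbox).netloc
        targets.setdefault(host, inbox)
    return sorted(targets.values())


# Hypothetical data: a trimmed Create from A and three follower inboxes.
create = {
    "type": "Create",
    "id": "https://a/activities/create/1",
    "actor": "https://a/u/alice",
}
announce = wrap_in_announce(
    "https://b/c/foo", create, "https://b/activities/announce/1"
)
targets = delivery_targets(
    ["https://a/inbox", "https://c/inbox", "https://c/u/carol/inbox"]
)
```

Here `targets` ends up with one inbox for A and one for C, even though two followers live on C.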

[^note1]: Technically, Activity and Actor are themselves objects, but they're treated differently. There are also Collections, which are their own type, but Lemmy doesn't really utilise them.

[–] [email protected] 5 points 1 month ago (2 children)

Thank you, very clear.

So B will list all users subscribed to foo, look at their instances, and send the update to them.

I assume that if someone from a new instance (D) subscribes to foo, then D will need to request all the old posts from foo, since they weren't pushed to D?

[–] [email protected] 10 points 1 month ago

I assume that if someone from a new instance (D) subscribes to foo, then D will need to request all the old posts from foo, since they weren’t pushed to D?

Lemmy is pretty bad about backfilling content. Communities do have outboxes, but these only list the last 50 posts and you can't get the votes or comments on any of them. See GitHub issues #5283, #3448 and #2004.

[–] [email protected] 3 points 1 month ago

ActivityPub works like a magazine subscription. They don't send you back issues for subscribing.

[–] [email protected] 1 points 1 month ago (1 children)

Does ActivityPub really send copies of all activities to www.w3.org?

[–] [email protected] 6 points 1 month ago (1 children)

No, the https://www.w3.org/ns/activitystreams#Public is just there to indicate that it's ok for receiving instances to display this publicly, nothing actually gets sent to it. See the spec for more details.
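
A small Python sketch of how a receiver might treat that value: as a marker to check for, not as an address to deliver to. The function names are hypothetical, and the sketch assumes `to` and `cc` are always lists, which real-world payloads don't guarantee.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def is_public(activity):
    """Return True if the Public marker appears in `to` or `cc`."""
    return PUBLIC in activity.get("to", []) or PUBLIC in activity.get("cc", [])


def real_recipients(activity):
    """Return the addressed actors, skipping the Public marker itself."""
    addressed = activity.get("to", []) + activity.get("cc", [])
    return [uri for uri in addressed if uri != PUBLIC]


activity = {"to": [PUBLIC], "cc": ["https://b/c/foo"]}
public = is_public(activity)
recipients = real_recipients(activity)
```

With the example activity, `public` is true and the only real recipient is the foo community.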

[–] [email protected] 3 points 1 month ago (2 children)

Why not a binary flag or something? Is it just to avoid making it a formal part of the protocol?

[–] [email protected] 4 points 1 month ago (2 children)

I actually don't know, you'd need to ask someone privy to design decisions made with ActivityPub, like Prodromou or Lemmer-Webber. It's definitely not to avoid making it part of the protocol, because it already is (see the link in the last comment).

[–] [email protected] 2 points 1 month ago* (last edited 1 month ago)

Thanks—I meant “formal” as in “formal grammar”, not that it wasn’t described in the published protocol. As in, there’s nothing in the protocol’s explicit form that distinguishes between this implied meaning and a real extra recipient—so it simplifies the parsing but adds an extra post-parsing step.

[–] [email protected] 0 points 1 month ago (1 children)

It's because it's JSON-LD.

[–] [email protected] 1 points 1 month ago

What about JSON-LD makes it so they have to include the "this is public" declaration in the to field instead of having an as:public property on the object? (I don't know a whole lot about JSON-LD or RDF more broadly)

[–] [email protected] 1 points 1 month ago* (last edited 1 month ago) (1 children)

Because it is JSON-LD and that's how JSON-LD works. It's an extensible format. Similar to XML namespaces.

[–] [email protected] 1 points 1 month ago (1 children)

Why does a mastodon user get completely different profiles and history when viewed from different lemmy instances? They look like 2 completely different users when compared except for having the same @address. In fact this makes them immune from moderation if they comment from a different instance than the mod is on.

[–] [email protected] 4 points 1 month ago (1 children)

Mastodon doesn't have Group support (fep-1b12), so when they reply to a post, they don't send it to the community's inbox (only to the inbox of the Person they're replying to), thus breaking Lemmy's model of federation.

[–] [email protected] 2 points 1 month ago

Okay, thanks.

[–] [email protected] 10 points 1 month ago (4 children)

The easiest way to explain it is that the instances have no native ability to crawl other instances for communities or content. For all intents and purposes, a fresh Lemmy server is on an island and all other instances are their own island until someone builds a bridge to them.

The ability of an instance to receive content is dependent on the subscriptions users add to the database. Once the instance is aware of these other places it will begin checking them for updates and you'll see them regularly whether you interact with them or not.

This goes completely against what the average person is expecting and causes a lot of confusion.

[–] [email protected] 3 points 1 month ago* (last edited 1 month ago) (1 children)

Piefed instances now do have a form of this for instance admins to populate new instances.

Admins can:
- pull the lemmyverse data and subscribe to a bunch of communities at once, or
- target a single Lemmy or Mbin instance, get the list of communities that instance hosts, and subscribe to a bunch of communities on that instance.

Both have some tunable settings to allow admins control over how many communities are followed.

It's not an end-user thing, but it should help with setting up new instances so they aren't so 'empty'.

edit: typo

[–] [email protected] 2 points 1 month ago

That sounds like a much better implementation of community discovery.

[–] [email protected] 2 points 1 month ago (2 children)

Does that mean that an "all" view is only all of the subscriptions/places people from my server have?

That's quite interesting.

And thanks!

[–] [email protected] 8 points 1 month ago

Does that mean that an "all" view is only all of the subscriptions/places people from my server have?

Correct.
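
As a toy model in Python (names and data are made up): the "All" view is whatever sits in communities that at least one local user subscribes to. This is a simplification. It ignores communities hosted on the instance itself, and in practice the instance never receives the other posts in the first place, it doesn't filter them out afterwards.

```python
def all_feed(posts, local_subscriptions):
    """Return posts from communities with at least one local subscriber."""
    known = set()
    for communities in local_subscriptions.values():
        known.update(communities)
    return [post for post in posts if post["community"] in known]


# Hypothetical posts that exist somewhere on the fediverse.
posts = [
    {"id": "https://a/post/1", "community": "https://b/c/foo"},
    {"id": "https://d/post/7", "community": "https://d/c/sportball"},
]
# Only one local user, subscribed to a single remote community.
local_subscriptions = {"bob": {"https://b/c/foo"}}

feed = all_feed(posts, local_subscriptions)
```

Here `feed` contains only the post in foo; the sportball post is invisible until someone local subscribes to that community.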

[–] [email protected] 5 points 1 month ago (1 children)

Note that many instances either have a bot subscribed to other communities to force federation, or use something like https://lemmy-federate.com/

[–] [email protected] 1 points 1 month ago (1 children)

Note that many instances either have a bot subscribed to other communities to force federation, or use something like https://lemmy-federate.com/

FWIW this approach can be helpful but is flawed in its own ways.

Firstly, since not all instances participate you still aren't getting the "complete" fediverse so to speak. This becomes less of an issue as more instances join the bot program, but it's another step that roadblocks what should be an easy and organic process.

Secondly, the bot can pose a potential security risk depending on how it's configured. If you use it to federate in both directions, you're exposed to malicious actors spinning up tons of new communities on instances that don't restrict user registration. This will in turn hammer the database an instance uses for EVERYTHING and eventually cause slowdowns, crashes, etc. The solution to this is to only seed your communities outwardly, but if everyone only does that, the bot is rather useless...

I don't have a solution for any of this, I'm just pointing out some rather frustrating problems this platform has in its current state.

[–] [email protected] 1 points 1 month ago (1 children)

Well, you can always defederate if an instance starts abusing it. Not that much different to the normal flow, really.

[–] [email protected] 1 points 1 month ago (1 children)

you can always defederate if an instance starts abusing it

Sure, but potentially after at least one of the instances subscribed to the bot goes down and someone realizes what's happening. It's incredibly easy to overwhelm a small server's database just by subscribing to a lot of communities the normal way. The difference here is potentially any instance federating the bot in both directions is susceptible to this.

Not that much different to the normal flow, really.

The impact across the fediverse vs just one instance would be the main difference. Plenty of people are using that bot having no real idea of what it's doing.

[–] [email protected] 1 points 1 month ago (1 children)

That's just a part of the learning process, IMO. My instance crashed many times, I've fixed it every time and now it's better than before. And I don't think I've had my last fuck up with the instance.

[–] [email protected] 1 points 1 month ago

And that's fine for you, I'm not knocking the experimenting and learning process. That was the whole reason I spun up an instance myself.

What I'm saying is that to the other users that would be impacted by these things, it sucks. People are patient to a point but the fediverse has a lot of odd quirks that make it more difficult than it should be to use for a lot of people. Things have gotten better in the last year or so but it still feels like we're asking people to know more than they should have to just to figure out that Lemmy isn't empty. Many people will get frustrated and leave long before they start making excuses for a site they don't know anything about.

It's easy to sit around proclaiming that reddit sucks but the fact of the matter is that it's easy to use and everything they have to offer is covered under one domain. Again, I don't have the solution to these things for Lemmy, but we can't deny that this platform is harder to use than most and a lot of people aren't going to handle that well.

[–] [email protected] 2 points 1 month ago

instances have no native ability to crawl other instances for communities or content

That's not quite true. They don't do it automatically or routinely, but a user can cause a server to read a post from another server by putting its URL into the search box. This can be useful for an end user to manually address a federation glitch.

Here's a concrete example. I was trying to post a comment via lemmy.world, but lemmy.world sits behind Cloudflare, and Cloudflare flagged its content as potentially malicious. I then posted that comment via my own Mastodon server, but push federation to lemmy.world also failed, for the same reason. I could, however, cause lemmy.world to pull the comment using the search.
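
For anyone curious what such a pull looks like on the wire, here is a minimal Python sketch of fetching an object by URL while asking for its ActivityPub representation. The function names are invented for this example, and whether Lemmy's search does exactly this is an assumption on my part; real servers often also sign these requests, which is left out here.

```python
import json
import urllib.request

# Media type commonly used to request the ActivityPub form of a URL.
ACCEPT = "application/activity+json"


def build_fetch_request(object_url):
    """Build a GET request that asks for the ActivityPub JSON of a URL."""
    return urllib.request.Request(object_url, headers={"Accept": ACCEPT})


def fetch_object(object_url):
    """Fetch and decode the object (this performs a real network call)."""
    request = build_fetch_request(object_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Building the request is side-effect free; nothing is sent until fetch_object runs.
request = build_fetch_request("https://a/post/1")
```

Once fetched, the object would go through the same processing as one that arrived via the inbox.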

[–] [email protected] 2 points 1 month ago (1 children)

This goes completely against what the average person is expecting and causes a lot of confusion.

But this is only true if the user looks at the All feed, correct?

[–] [email protected] 1 points 1 month ago (2 children)

But this is only true if the user looks at the All feed

It impacts what content is available to users at all. The All feed is just the visual representation of what's actively federating.

Let's say you join a new instance for whatever reason with no outside awareness of how the fediverse works. If you try to search the instance for "sportball" and get zero results the natural assumption is going to be that there are no communities and no interest in that topic. The user has no idea that lemmyserver5000.com has a sportball community with thousands of users because no one with those interests ever did the work to get the content flowing in a way that they could access it intuitively. It's a poor design IMO.

The reason I brought it up has more to do with starting a new instance or using a smaller instance. Communities that the instance isn't aware of (via someone previously subscribing) won't show up at all which causes places to appear non-existent or dead by default. Someone trying a federating website for the first time isn't going to know this, so to them, that's all the fediverse has to offer.

[–] [email protected] 2 points 1 month ago

OK, I see that problem. In fact I remember having the same issue myself. (Presumably this will create a secondary confusion problem for "All" subscribers, who will see the content of their feed gradually expand without explanation as other users subscribe to other foreign servers, correct? Whatever, I don't care much about them, someone who subscribes to "All" apparently doesn't know what they want anyway!)

So the optimal solution here would be for each instance to preemptively connect to a whitelist of known foreign communities, perhaps? Or maybe each instance could regularly ping other servers in order to update its search database with popular communities.

[–] [email protected] 1 points 1 month ago (1 children)

It's a poor design if what you want to do is emulate a centralized social media service.

But maybe we should stop trying to do that.

[–] [email protected] 1 points 1 month ago

Maybe.

But I'd counter that it's prohibitive to growth. People aren't used to turning up at a domain name only to find out 90% of the content can't be accessed without jumping through a bunch of hoops.

[–] [email protected] 8 points 1 month ago (1 children)
  • A makes a post to B
  • B federates that post to all instances that have at least 1 user subbed to the community of the post

All users from all instances get the post from their home instance.

[–] [email protected] 4 points 1 month ago* (last edited 1 month ago) (2 children)

Thanks but this is quite high-level.

Okay, so Alice makes a request to A. A makes a request to B. B makes requests to all other instances.

If you get posts from your home instance, does it mean that all instances duplicate the same database?

[–] [email protected] 3 points 1 month ago (1 children)

They don't duplicate the database in a technical sense, but when things go right, they each have a copy of the same post and comment text, and the same votes.

[–] [email protected] 2 points 1 month ago* (last edited 1 month ago) (1 children)

Do you mean that the database is not identical, but still duplicates all data, basically? (you said "they each have a copy", I assume it's persistent on disk). So if we have 100 lemmy instances, they all save the same post.

[–] [email protected] 3 points 1 month ago

Correct. Each server that shows the post to its users stores a copy of the post. It does not necessarily store attached media (IIRC Mastodon usually does and Lemmy usually hotlinks media).

[–] [email protected] 1 points 1 month ago

If you get posts from your home instance, does it mean that all instances duplicate the same database?

Your home instance only has a database of posts that are in a community that at least 1 local user has subscribed to.

[–] [email protected] 7 points 1 month ago
[–] [email protected] 4 points 1 month ago

Think of it this way: when you make a post, it is automatically distributed to every subscriber. Depending on the type of platform, that could mean subscribers to the community, or followers of your user account in the case of things like Mastodon. When the post is received, it is copied and re-hosted on all the servers which have subscribers.

The exceptions are when a user is banned or a server is defederated. In that case the request is denied, and the post isn't re-hosted by the instance that has the ban or defederation in place against the user or server that made the post. Bans and defederation typically only happen in extreme cases, such as defending against spam, hate speech, or abusive users.
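
A very rough Python sketch of that check on the receiving side. The function name and the two block lists are made up for illustration; real instances have more moderation layers than this.

```python
from urllib.parse import urlparse


def should_accept(activity, blocked_instances, banned_actors):
    """Reject activities from banned actors or defederated instances."""
    actor = activity["actor"]
    if actor in banned_actors:
        return False
    return urlparse(actor).netloc not in blocked_instances


# Hypothetical moderation state on the receiving instance.
blocked_instances = {"spam.example"}
banned_actors = {"https://a/u/mallory"}

ok = should_accept({"actor": "https://a/u/alice"}, blocked_instances, banned_actors)
banned = should_accept({"actor": "https://a/u/mallory"}, blocked_instances, banned_actors)
defederated = should_accept({"actor": "https://spam.example/u/bot"}, blocked_instances, banned_actors)
```

Alice's post is accepted, while the other two are dropped and never re-hosted locally.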

This might be a simpler explanation than the others, but I'm deliberately keeping it simple since that helps people understand the process.

[–] [email protected] 2 points 1 month ago

It helps when you understand that you only ever directly interact with your instance.

  • Alice posts to A (in some community hosted on B)
  • B is federated with A so will eventually receive the post
  • C is federated with B so will eventually get the post