• @[email protected]
    17 hours ago

    we’re currently aware of delayed federation from lemmy.ml towards lemmy.world and are still working on identifying the root cause - see https://lemmy.world/post/22196027 (that post still needs an update to say it’s happening again).

    aussie.zone has been about 6 weeks behind lemmy.world for a few weeks i think at this point, which at least means they’re no longer losing activities, but it’s still taking ages to reduce the lag.

    i don’t know what issue there might be with discuss.online right now, but for startrek.website the explanation is rather simple. as you can see in the sidebar, there are 0 local subscribers for the community. when an instance has no subscribers to a community, it will not receive any updates for that community. this includes posts, comments, and votes.
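
    a rough sketch of that rule, not lemmy's actual code: the `Community` class and its method names are made up for illustration, and the only thing it models is "no local subscribers means no deliveries".

```python
from collections import defaultdict


class Community:
    """Hypothetical model of a community and its followers per instance."""

    def __init__(self, name):
        self.name = name
        # instance domain -> number of subscribers on that instance
        self.subscribers = defaultdict(int)

    def subscribe(self, instance):
        self.subscribers[instance] += 1

    def unsubscribe(self, instance):
        if self.subscribers[instance] > 0:
            self.subscribers[instance] -= 1

    def delivery_targets(self):
        # only instances with at least one subscriber are sent
        # posts, comments and votes for this community
        return sorted(d for d, n in self.subscribers.items() if n > 0)
```

    in this model, once the last subscriber on an instance unsubscribes, that instance simply drops out of `delivery_targets()` and sees nothing new until someone subscribes again.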

    startrek.website also had federation issues over the last few weeks due to accidentally blocking lemmy instances in some situations.

    lemdro.id has recently had some db performance issues that caused it to get around 3 days behind lemmy.world. they’ve been slowly catching up again over the last few days.

    • OpenStars
      16 hours ago

      Thanks for the info! If discuss.online was out for a few hours, would that explain the missing content? Does anything sent during the outage just get lost forever, or will it catch up given enough time?

      Do you know when Lemmy.World plans to update to 0.19.6 or 0.19.7? I really hope that helps bring stability! Although I can understand not wanting to do it at the same time as the sync issue with lemmy.ml still happening.

      • @[email protected]
        26 hours ago

        downtime should not result in missing content when the sending instance is lemmy 0.19.0 or newer. 0.19.0 introduced a persistent federation queue in lemmy, which means it will retry sending the same stuff until the instance is available. depending on the type of downtime, it is also possible that a misconfiguration (e.g. a “wrong” http status code on a maintenance page) makes the sending instance think the activity was sent successfully. if the receiving instance was unreachable (timeout) or throwing http 5xx errors, everything should be preserved.
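
        to make the status code point concrete, here is a minimal sketch. it is not lemmy's implementation: `should_retry` and `deliver_all` are hypothetical names, and treating every response other than a timeout or 5xx as "done" is a simplifying assumption.

```python
from collections import deque


def should_retry(status_code=None, timed_out=False):
    """Return True if the activity should stay in the queue."""
    if timed_out or status_code is None:
        return True  # instance unreachable: keep the activity
    if 500 <= status_code <= 599:
        return True  # server error: keep the activity
    # assumption: anything else counts as handled, including a
    # 200 served by a misconfigured maintenance page
    return False


def deliver_all(queue, send):
    """Send activities in order; stop at the first retryable failure."""
    while queue:
        status_code, timed_out = send(queue[0])
        if should_retry(status_code, timed_out):
            break  # leave this and all later activities queued
        queue.popleft()
    return queue
```

        with a `send` that always returns `(503, False)` the queue stays intact, while one that returns `(200, False)` from a maintenance page drains it even though nothing was actually processed.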

        we are planning to post an announcement about the current situation with lemmy updates and our future plans in the coming days, stay tuned for that. you can find some info in my comment history already if you are curious.

        • OpenStars
          15 hours ago

          Ah, then something is indeed very wrong if discuss.online is missing so much content from a week ago (I thought after something like 7 days it will simply give up and stop trying), and startrek.website is doing far worse than that even.

          Though sh.itjust.works caught up even as we were talking about it so… there’s some hope I suppose. And either way, thanks for any efforts you are doing to help with it - well, on the LW side at least:-).

          • @[email protected]
            15 hours ago

            there is indeed a cutoff. the delay between retries grows exponentially, and at some point lemmy will stop trying until it sees the instance as active again.
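
            roughly, the retry schedule looks like the sketch below. the numbers are placeholders, not lemmy's real values: `base_seconds`, `give_up_after` and the function name are all assumptions for illustration.

```python
def next_retry_delay(fail_count, instance_active,
                     base_seconds=10.0, give_up_after=15):
    """Seconds to wait before the next attempt, or None to pause.

    base_seconds and give_up_after are hypothetical values.
    """
    if fail_count >= give_up_after and not instance_active:
        # cutoff reached: stop until the instance is seen as active
        return None
    # delay doubles with every consecutive failure
    return base_seconds * (2 ** fail_count)
```

            the point is only the shape: short waits at first, rapidly growing waits after repeated failures, and a pause once the cutoff is hit.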

            there is also a scheduled task running once a week that will delete local activities older than a week. downtimes of a day or two can generally be easily recovered from, depending on latency it can take a lot more time though. if an instance is down for an extended time it shouldn’t expect to still get activities from the entire time it was offline.
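
            a small sketch of why long downtime loses content, assuming activities are plain dicts with a `published` timestamp (that shape and the `prune` name are made up here, only the one week retention comes from the description above):

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # "older than a week"


def prune(activities, now):
    """Keep only activities published within the retention window."""
    cutoff = now - RETENTION
    return [a for a in activities if a["published"] >= cutoff]
```

            anything an offline instance missed that is older than the cutoff when the cleanup runs is no longer there to be resent, which is why only shorter outages can be fully recovered.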