The problem:

The web has obviously reached a high level of #enshittification. Paywalls, exclusive walled gardens, #Cloudflare, popups, CAPTCHAs, Tor blockades, dark patterns (esp. w/cookies), JavaScript that makes the website an app (not a doc), etc.

Status quo solution (failure):

#Lemmy & the #threadiverse were designed to inherently trust humans to only post links to non-enshittified websites, and to only upvote content that has no links or links to non-enshittified venues.

It’s not working. The social approach is a systemic failure.

The fix:

  • stage 1 (metrics collection): There need to be enshittification metrics for every link. Readers should be able to click a “this link is enshittified” button on a per-link basis & there should be tick boxes to indicate the particular variety of enshittification involved.

  • stage 2 (metrics usage): If many links with the same hostname show a pattern of matching enshittification factors, the Lemmy server should automatically tag all those links with a warning of some kind (e.g. ⚠, 💩, 🌩).

  • stage 3 (inclusive alternative): A replacement link to a mirror is offered. E.g. youtube → (non-CF’d invidious instance), cloudflare → archive.org, medium.com → (random scribe.rip instance), etc.

  • stage 4 (onsite archive): good samaritans and over-achievers should have the option to provide the full text for a given link so others can read the article without even fighting the site.

  • stage 5 (search reranking): whenever a human posts a link and talks about it, search crawlers notice and give that site a high ranking. This is why search results have gotten lousy – because the social approach has failed. Humans will post bad links. So links with a high enshittification score need to be obfuscated in some way (e.g. dots become asterisks) so search crawlers don’t overrate them going forward.
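
Stages 1 and 2 can be sketched roughly as follows. This is only an illustration, not Lemmy code: the factor names, the `LinkReports` class, and both thresholds are assumptions made up for the example, and it does not yet stop one user from reporting the same link repeatedly.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit

# Hypothetical tick-box categories; the post gives examples but no fixed set.
FACTORS = {"paywall", "cloudflare", "captcha", "popup",
           "tor_block", "cookie_wall", "js_required"}

# Assumed thresholds, chosen only for illustration.
MIN_REPORTS = 3  # total reports of a factor needed for a hostname
MIN_LINKS = 2    # distinct links on that hostname that must show the factor


class LinkReports:
    """Collects per-link reports (stage 1), derives hostname warnings (stage 2)."""

    def __init__(self):
        # hostname -> factor -> set of reported URLs
        self._urls = defaultdict(lambda: defaultdict(set))
        # hostname -> factor -> total number of reports
        self._counts = defaultdict(Counter)

    def report(self, url, factors):
        """Record one reader's report that `url` shows the given factors."""
        host = urlsplit(url).hostname
        if host is None:
            raise ValueError(f"no hostname in {url!r}")
        unknown = set(factors) - FACTORS
        if unknown:
            raise ValueError(f"unknown factors: {sorted(unknown)}")
        for factor in factors:
            self._urls[host][factor].add(url)
            self._counts[host][factor] += 1

    def warnings(self, url):
        """Return the sorted factors that any link to this hostname gets tagged with."""
        host = urlsplit(url).hostname
        if host is None or host not in self._counts:
            return []
        return sorted(
            factor for factor, n in self._counts[host].items()
            if n >= MIN_REPORTS and len(self._urls[host][factor]) >= MIN_LINKS
        )


reports = LinkReports()
reports.report("https://example.com/a", {"paywall"})
reports.report("https://example.com/a", {"paywall", "popup"})
reports.report("https://example.com/b", {"paywall"})
print(reports.warnings("https://example.com/c"))
```

Requiring the same factor on several distinct links is what turns individual reports into a hostname-level pattern, so a never-reported link on that host still gets the warning.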

This needs to be recognized as a #LemmyBug.
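
The link rewriting in stages 3 and 5 could look something like the sketch below. Everything here is an assumption for illustration: the `MIRRORS` table, the placeholder hostname `invidious.example`, and the function names are invented, and a real server would need a vetted, configurable mirror list.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mirror table; "invidious.example" is a placeholder hostname.
MIRRORS = {
    "www.youtube.com": "invidious.example",
    "youtube.com": "invidious.example",
    "medium.com": "scribe.rip",
}


def mirror_link(url):
    """Stage 3: return the same path on a mirror host, or None if none is known."""
    parts = urlsplit(url)
    mirror = MIRRORS.get(parts.hostname or "")
    if mirror is None:
        return None
    return urlunsplit(("https", mirror, parts.path, parts.query, parts.fragment))


def archive_link(url):
    """Stage 3 fallback: point at the Wayback Machine's lookup for the URL."""
    return "https://web.archive.org/web/" + url


def obfuscate_host(url):
    """Stage 5: drop the scheme and turn hostname dots into asterisks."""
    parts = urlsplit(url)
    if not parts.hostname:
        return url
    host = parts.hostname.replace(".", "*")
    query = "?" + parts.query if parts.query else ""
    return host + parts.path + query


print(mirror_link("https://www.youtube.com/watch?v=abc"))
print(obfuscate_host("https://archive.ph/abc"))
```

The obfuscated form is still readable by a human but is no longer a valid link for a crawler to follow or count.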

  • Dandroid@dandroid.app
    arrow-up
    30
    ·
    1 year ago

    As a developer, if a tester posted something like this as a bug instead of a change request, it would get thrown right into the trash bin. This isn’t a bug. You are asking for an enhancement.

    Side note, do the hash tags do anything on lemmy, or are they just posted here for emphasis?

    • activistPnk@slrpnk.netOP
      arrow-up
      1
      arrow-down
      24
      ·
      edit-2
      1 year ago

      One man’s bug is another man’s feature. The hair-splitting you attempt here really serves no useful purpose. I’m calling it a bug because input data is overly trusted and inadequately processed. It could be framed as a bug or an enhancement and either way shouldn’t impact the treatment (beyond triage/priority).

      if a tester posted something like this as a bug instead of a change request, it would get thrown right into the trash bin

      Yikes. Your suggestion that it should impact whether it’s treated at all is absurd. Bug reports and enhancements are generally filed in the same place regardless. If you’re tossing out bugs/enhancements because you think they are mis-marked, instead of fixing the marking, I wouldn’t want you working on any project that affects me or that I work on. That’s terrible. Shame on you.

      Side note, do the hash tags do anything on lemmy, or are they just posted here for emphasis?

      They have search index relevance in the fediverse. People outside of Lemmy will find the Lemmy post if they search those hashtags (which are ignored by Lemmy itself).

      • Scrubbles@poptalk.scrubbles.tech
        English
        arrow-up
        5
        ·
        edit-2
        1 year ago

        It could be framed as a bug or an enhancement and either way shouldn’t impact the treatment (beyond triage/priority).

        It absolutely changes priority and how it’s treated. Bugs are things that are actively broken, meaning specifically that the functionality already exists, but it is non-functional/broken. Bugs are prioritized obviously because something that should be working is not.

        You are asking for an enhancement, which should be prioritized by dozens of factors, namely who wants this, how does it stack up against other things that other people want, how much effort will it take, and how much of a change would it be, just to name a few. We are not a part of that process, but you can submit a request on GitHub. Doing it here means pretty much nothing, unless you link the GitHub task and ask people to vote for it.

        Or, you can write it yourself and submit a PR, with a write-up on why you did it, why you think it’s useful, and why it should be accepted upstream, and then the maintainers can choose to include it or not, again based on their own and the community’s feedback.

        We developers aren’t “splitting hairs”, we’ve seen this trick from crappy PMs dozens of times. Half-baked feature requests disguised as bugs. We all see right through it. You want a feature, then get the buy-in and go through the necessary steps like everyone else, but don’t treat us as morons who will fall for your obvious “well it should work the way I want it to, thus it’s a bug” b.s.

  • mateomaui@reddthat.com
    English
    arrow-up
    17
    arrow-down
    1
    ·
    1 year ago

    The inherent fallacy in your argument is that a link is a “bad” link simply because it goes to an original source instead of always being redirected to you via a third party that circumvents what you don’t like.

    If someone posts a link to an original non-misinformation news article and it gets marked as a “bad” link, that’s actually a bug.

    • activistPnk@slrpnk.netOP
      arrow-up
      1
      arrow-down
      14
      ·
      edit-2
      1 year ago

      A link is not a bad link for going to the source. You’ve misunderstood the post and also failed to identify a logical fallacy (even had your understanding been correct).

      Whether the link goes to the source or not is irrelevant. I’m calling it a bad link if it goes to a place that’s enshittified and/or where the content is unreachable (source or not). This is more elaborate than what you’re used to. There are more than a dozen variables that can make a link bad. Sometimes the mirror is worse than the source (e.g. archive*ph, which is a Cloudflared mirror site).

      • mateomaui@reddthat.com
        English
        arrow-up
        10
        arrow-down
        1
        ·
        edit-2
        1 year ago

        You just identified the fallacy yourself.

        Sometimes a paywalled source is the first to report on something. Calling that link a bad link is nonsense.

        90+% of the time, using reader mode will bypass paywalls anyway.

        Many people don’t know all the websites to redirect things through without that, so calling their contribution “bad” just because they posted that link isn’t the greatest.

        It’s not even like it’s that big an issue, because usually someone else comes along that provides an alt link in the replies, so saying that this is a social failure is also ridiculous, because both were provided between two people.

        Also, the notion that you or anyone else is socially filtering non-misinformation news sources from the rest of us, because you don’t see the value in it, or cannot figure out how to bypass the paywall yourself, isn’t all that great either.

        edit: it’s also worth pointing out that if some people contributing links happen to be subscribers to a news source, as a subscriber they won’t necessarily know that a certain article is paywalled for everyone else, until they share it and someone who isn’t a subscriber gets the notice.

        • activistPnk@slrpnk.netOP
          arrow-up
          1
          arrow-down
          7
          ·
          edit-2
          1 year ago

          You just identified the fallacy yourself.

          You’re going to have to name this fallacy you keep talking about because so far you’re not making sense.

          Sometimes a paywalled source is the first to report on something. Calling that link a bad link is nonsense.

          One man’s bad link is another man’s good link. It’s nonsense to prescribe for everyone one definition of “bad”. What’s bad weather? Rain? I love rain. Stop trying to speak for everyone and impose your idea of “bad” on people.

          Many people don’t know all the websites to redirect things through without that, so calling their contribution “bad” just because they posted that link isn’t the greatest.

          So because someone might not know their link is bad, it ceases to be bad? Nonsense.

          It’s not even like it’s that big an issue, because usually someone else comes along that provides an alt link in the replies,

          (emphasis mine) Usually that does not happen.

          so saying that this is a social failure is also ridiculous, because both were provided between two people.

          This is based on the false premise that bad links are usually supplemented by an alternate from someone else.

          Also, the notion that you or anyone else is socially filtering non-misinformation news sources from the rest of us, because you don’t see the value in it, or cannot figure out how to bypass the paywall yourself, isn’t all that great either.

          (emphasis mine) Every user can define an enshittified site how they want. If you like paywalls, why not have your user-side config give you a personalized favorable presentation of such links?

          • mateomaui@reddthat.com
            English
            arrow-up
            5
            arrow-down
            1
            ·
            1 year ago

            fallacy: all paywalled links are bad

            I’ll let someone else continue this, I’ve made my argument well enough already.

            • activistPnk@slrpnk.netOP
              arrow-up
              1
              arrow-down
              8
              ·
              edit-2
              1 year ago

              You don’t know what a logical fallacy is. Bob and Alice can disagree about whether the pizza tastes good or bad. There’s no fallacy there, just subjective disagreement.

          • mateomaui@reddthat.com
            English
            arrow-up
            3
            ·
            1 year ago

            At the time I couldn’t be bothered to respond to most of this reply of yours, because your responses were too ignorant to take seriously, but since you’re still arguing about this, and that other moronic post where you complain about devs, someone should tell you that this line you replied with here

            Stop trying to speak for everyone and impose your idea of “bad” on people.

            is a hilarious example of a total lack of self awareness, as this entire post of yours is trying to speak for others and impose your definition of what a “bad” link is on everyone else.

            But keep on being an idiot. You apparently cannot code anything you want done, but feel like your contribution of providing criticism is somehow equal to the work of the devs who actually built the software before you came along. It’s just entitled stupidity to think they work for you or that you’re equal to them in any way.

            Not to mention that your arguments regarding fair use and letting archive.org lead the way should flag you as a potential very expensive liability for every instance admin who cannot afford a copyright battle. It’s easy to thumb your nose at potential problems when you’re not actually in charge of or responsible for anything.

  • TootSweet@lemmy.world
    English
    arrow-up
    5
    arrow-down
    1
    ·
    1 year ago

    So, first off, I love everything you have here.

    The only thing. Onsite archive. I’d love it, but I wouldn’t want copyright law used to punish the Lemmy community. I don’t think I’m quite qualified to answer this question, so I’ll ask it here: how worried should we be about that?

    • activistPnk@slrpnk.netOP
      arrow-up
      1
      arrow-down
      7
      ·
      1 year ago

      It would need some analysis by legal experts. But consider that archive.org gets away with it. Although archive.org has an opt-out mechanism. So perhaps each Lemmy instance should have an opt-out mechanism, which should push a CAPTCHA in perhaps one of the few good uses for CAPTCHAs. Then if Quora wants to opt out, they have to visit every Lemmy instance, complete the opt-out form, and solve the CAPTCHA. Muahaha!

      Note as well how 12ft.io works: it serves you Google’s cache of a site (which is actually what the search index uses). How did Google get a right to keep those caches?

      There’s also the #fairUse doctrine. You can quote a work if you’re commenting on it. Which is what we do in the threadiverse. Though not always – so perhaps the caching should be restricted to threads that have comments.

      • Emma_Gold_Man@lemmy.dbzer0.com
        arrow-up
        9
        ·
        1 year ago

        Archive.org doesn’t really “get away with it.” They face frequent lawsuits and have a steady stream of donations to fight them, along with enough staff to handle responding to takedown demands etc. That isn’t true of most Lemmy instances.

        • activistPnk@slrpnk.netOP
          arrow-up
          1
          arrow-down
          6
          ·
          edit-2
          1 year ago

          Just like Greenpeace paves the way for smaller activist groups that can’t stand up to challenges, archive.org would serve in the same way. When archive.org (with ALA backing) wins a case, that’s a win for everyone who would do the same. Lemmy would obviously stay behind on the path archive.org paves and not try to lead.

      • TootSweet@lemmy.world
        English
        arrow-up
        4
        ·
        1 year ago

        I mean, does archive.org get away with it, though?

        They have legal troubles not infrequently and they’ve lost at least one copyright case that I know of recently.

        I doubt that all the Lemmy instances, even pooling their resources, could afford to fight a copyright case.

        And do I really have to spell out how Google gets away with caching stuff?

        Finally, “fair use” isn’t a magic phrase that absolves you of liability in all copyright claims. I’m extremely skeptical fair use could be twisted to our defense in this particular case.

        • activistPnk@slrpnk.netOP
          arrow-up
          1
          arrow-down
          6
          ·
          1 year ago

          I mean, does archive.org get away with it, though?

          They get blocked by some sites, and some sites have proactively opted out. archive.org respects the opt-outs. AFAICT, archive.org gets away w/archiving non-opt-out cases where their bot was permitted.

          And do I really have to spell out how Google gets away with caching stuff?

          You might need to explain why 12ft.io gets away with sharing google’s cache, as Lemmy could theoretically operate the same way.

          I’m extremely skeptical fair use could be twisted to our defense in this particular case.

          When you say “twisted”, do you mean commentary is not a standard accepted and well-known fair use scenario?

          • TootSweet@lemmy.world
            English
            arrow-up
            2
            ·
            1 year ago

            They get blocked by some sites, and some sites have proactively opted out. archive.org respects the opt-outs. AFAICT, archive.org gets away w/archiving non-opt-out cases where their bot was permitted.

            Archive.org is more than The Wayback Machine. You’re just talking about The Wayback Machine, not archive.org as a whole. Nothing I’ve said in this thread is about The Wayback Machine specifically.

            My point is that archive.org does things that bend, skirt, and run afoul of copyright law (and good on them because removed the system) and they spend more money, time, and resources fighting copyright suits than I’d imagine all Lemmy instance owners pooling their resources could afford. And that’s if they even cared enough to risk dying on that hill.

            You might need to explain why 12ft.io gets away with sharing google’s cache, as Lemmy could theoretically operate the same way.

            Not sure how this bit is relevant. I was speaking only about your “stage 4 (onsite archive)” item. (I thought that was pretty clear, but apparently not?) I don’t know if 12ft.io is playing with (legal) fire or not, but I’m not sure why it matters to the conversation. Nothing 12ft.io does is comparable to Lemmy users copying articles into comments.

            When you say “twisted”, do you mean commentary is not a standard accepted and well-known fair use scenario?

            So, I’m only going to be talking about U.S. “fair use” here because as little as I know about that, I know far far less about copyright law in other countries. That said:

            First, whether fair use applies is a fairly complex matter which depends among other things on how much of the original work is copied. While maybe not technically determinative of the validity of a fair use defense, “the whole damn article” definitely won’t help your case when you’re trying to argue a fair use defense in federal court.

            Second, I think for a fair use argument to work the way you seem to be suggesting, the quoted portions of(!) the article would have to appear in the same “work” as the commentary, but I’d imagine typically all comments in a Lemmy thread would be distinct “works.” Particularly given that each comment is independently authored and mostly by distinct authors. (Copying an entire article into a comment and following it with some perfunctory “commentary” would be a pretty transparent ham-fisted attempt at a loophole. Again, a very bad look when you’re arguing your defense in federal court.) I don’t know about your Lemmy instance, but mine doesn’t seem to say anything in the legal page that could provide any argument that a thread is a single “work.” (It does say “no illegal content, including sharing copyrighted material without the explicit permission of the owner(s).”)

  • jet@hackertalks.com
    English
    arrow-up
    3
    ·
    1 year ago

    You’re trying to solve a social problem with technology. That’s going to be very difficult.

  • rglullis@communick.news
    English
    arrow-up
    3
    ·
    1 year ago

    The solution for that can be a whole lot simpler: add these features to the browser so that it works in favor of the users. I have extensions to redirect from YouTube/Medium/Twitter, so these issues do not affect me regardless of which website I am visiting.

    • activistPnk@slrpnk.netOP
      arrow-up
      1
      arrow-down
      6
      ·
      edit-2
      1 year ago

      The browser (more appropriately named: client) indeed needs some of the logic here, but it cannot do the full job I’ve outlined. The metrics need to be centralized. And specifically when you say browser, this imposes an inefficient amount of effort & expertise on the end-user. A dedicated client can make it easy on the user. But it’s an incomplete solution nonetheless.

      • rglullis@communick.news
        English
        arrow-up
        5
        ·
        1 year ago

        The metrics need to be centralized.

        Why? And how would you guarantee the integrity of the ones holding the metrics?

        this imposes an inefficient amount of effort & expertise on the end-user.

        A lot less effort than having to deal with the different “features” that each website admin decides to run on their own.

        • activistPnk@slrpnk.netOP
          arrow-up
          1
          arrow-down
          5
          ·
          edit-2
          1 year ago

          Why?

          1. It’s a big database. It would be a poor design to replicate a db of all links in every single client.
          2. Synchronization of the db would not be cheap. When Bob says link X has anti-feature Y, that information must then be shared with tens of thousands of other users.

          Perhaps you have a more absolute idea of centralized. With Mastodon votes, they are centralized on each node but of course overall that’s actually decentralized. My bad. I probably shouldn’t have said centralized. I meant more centralized than a client-by-client basis. It’d be early to pin those details down at this point other than to say it’s crazy for each client to maintain a separate copy of that DB.

          And how would you guarantee the integrity of the ones holding the metrics?

          The server is much better equipped than the user for that. The guarantee would be the same guarantee that you have with Mastodon votes. Good enough to be fit for purpose. For any given Mastodon poll everyone sees a subset of votes. But that’s fine. Perfection is not critical here. You wouldn’t want it to decide a general election, but you don’t need that level of integrity.

          A lot less effort than having to deal with the different “features” that each website admin decides to run on their own.

          That doesn’t make sense. Either one person upgrades their Lemmy server, or thousands of people have to install, configure, and maintain a dozen different browser plugins ported to a variety of different browsers (nearly impossible enough to call impossible). Then every Lemmy client also has to replicate that complexity.

  • Spzi@lemm.ee
    English
    arrow-up
    1
    ·
    1 year ago

    Not sure if social media in general has failed. That particular point can be solved at the community level.

    Create or join a community whose guidelines restrict posting paywalled or otherwise bad content and explicitly encourage posting “liberated” content. Have moderation. Problem solved. Moderators will remove everything you dislike. All that remains is the solution you want.

  • antlion@lemmy.dbzer0.com
    arrow-up
    2
    arrow-down
    1
    ·
    1 year ago

    This, but I think equally important is de-duplication of links. Ideally these alternative links to the same content could also be de-duped. All comments should be in one thread. I know what I’m describing is complicated due to communities across servers, but it would really improve lemmy for me.
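
One possible way to approach the de-duplication described above is to reduce every posted URL to a canonical key and group posts by that key. This is a rough sketch under stated assumptions: the `HOST_ALIASES` table, the tracking-parameter list, and the function names are all hypothetical, and it ignores the cross-server part of the problem entirely.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical alias table mapping mirrors and alternate hosts to one host.
HOST_ALIASES = {
    "www.youtube.com": "youtube.com",
    "invidious.example": "youtube.com",  # placeholder mirror hostname
    "scribe.rip": "medium.com",
}

# Query parameters treated as tracking noise (an assumed, incomplete list).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}


def canonical_key(url):
    """Reduce a URL to a key that is equal for links to the same content."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    host = HOST_ALIASES.get(host, host)
    # Drop tracking parameters and sort the rest so their order is irrelevant.
    query = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return host + path + ("?" + urlencode(query) if query else "")


def group_posts(urls):
    """Group posted URLs by canonical key; each group could share one thread."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_key(url), []).append(url)
    return groups


posted = [
    "https://www.youtube.com/watch?v=abc&utm_source=share",
    "https://invidious.example/watch?v=abc",
    "https://example.org/article/",
]
for key, urls in group_posts(posted).items():
    print(key, len(urls))
```

Because mirror hosts map back to the original host, a source link and its alternative link end up in the same group.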