• maniacalmanicmania@aussie.zone
    English
    arrow-up
    22
    arrow-down
    3
    ·
    edit-2
    1 day ago

    Bypass Paywalls Clean is still around.

    Bypass Paywalls Clean for Firefox

    Extension: https://gitflic.ru/project/magnolia1234/bypass-paywalls-firefox-clean

    Support only: https://github.com/bpc-clone/bpc_firefox_support/issues

    Bypass Paywalls Clean for Chrome

    Extension: https://gitflic.ru/project/magnolia1234/bypass-paywalls-chrome-clean

    Support only: https://github.com/bpc-clone/bpc_chrome_support/issues

    Updating

    For Firefox at least, if you pin the extension to the browser toolbar (or whatever the space next to the address bar is called) you will see a little yellow triangle badge whenever there is an update. Click the extension icon to update.

    For Firefox mobile and forks, you may get a notification that there is an update, but I haven’t found a one-click solution, so I just go to the repo, download the .xpi and install it. To install from a file on mobile, go to Settings > About Firefox > tap the logo several times until you see Debug enabled > go back to the main Settings > under Advanced, look for Install extension from file.

  • Knock_Knock_Lemmy_In@lemmy.world
    English
    arrow-up
    66
    ·
    1 day ago

    The archive runs Apache Hadoop and Apache Accumulo. All data is stored on HDFS, textual content is duplicated 3 times among servers in 2 datacenters and images are duplicated 2 times. Both datacenters are in Europe, with OVH hosting at least one of them.

    To avoid detection, archive.today runs via a botnet that cycles through countless IP addresses, making it quite difficult for grumpy webmasters to stop their sites getting scraped. Access to paywalled sites is through logins secured via unclear means, which need to be replenished constantly: here’s the creator asking for Instagram credentials. Finally, the serving of the website is also subject to a perpetual game of cat and mouse: “I can only predict that there will be approximately one trouble with domains per year and each fifth trouble will result in domain loss.” As of today, archive.today still works, but users are redirected to archive.md.

    • Daemon Silverstein@calckey.world
      arrow-up
      9
      ·
      1 day ago

      Same when I tried to access the archived version of the article linked in this thread. I was faced with a TLS error I had never seen before (SSL_ERROR_INTERNAL_ERROR_ALERT), so I thought Archive Today was having server-side issues, until I tried accessing it from my smartphone, where no error occurred.

      I only managed to access Archive Today from my computer after disabling several security settings, which seems quite suspicious, as if Archive Today were being hijacked by a MitM (possibly the FBI themselves? They’re famous for setting up honeypots) trying to push malicious code or tracking to whoever accesses it.

      I would be more worried if I were USian or a citizen of the Global North (as I’m Brazilian and from the Global South, I can tell the FBI to go pound sand, lol).

      To USians, my suggestion is to be cautious when accessing Archive Today (at least at the IP address currently returned by mainstream DNS resolvers) for a while, as the server, while seemingly Archive Today, may actually be some kind of FBI honeypot in disguise. It goes without saying that ICANN and IANA are US entities, prone to interference from three-letter US agencies. There are alternatives to Archive Today, such as Ghost Archive and 12ft.

      • punkibas@lemmy.zip
        English
        arrow-up
        3
        ·
        16 hours ago

        Interesting, all 3 domains are blocked on the ProtonVPN DNS server; I can only access them if I turn off my VPN.

  • Balldowern@lemmy.zip
    English
    arrow-up
    127
    ·
    1 day ago

    Why isn’t the FBI doing anything about the Epstein island list? That’s more important than some archive website.

  • NGC2346@sh.itjust.works
    English
    arrow-up
    14
    ·
    1 day ago

    If it’s someone operating from Russia, they can beat it and get lost, because it won’t disappear.

    • silence7@slrpnk.netOP
      English
      arrow-up
      30
      ·
      1 day ago

      They don’t let sites opt out, and they do a much more seamless job of enabling people to archive paywalled content.

    • kazerniel@lemmy.world
      English
      arrow-up
      2
      ·
      edit-2
      11 hours ago

      Wayback Machine lets you select snapshots in a calendar without thumbnails, which is better for navigating among a large number of snapshots, while Archive.today shows a chronological dump of thumbnails, which is better for noticing visible changes.

      Archive.today is better at getting through paywalls; the Wayback Machine doesn’t really do this.

      And one more difference that is not functional but imho quite important: the Wayback Machine is run by a 100+ employee non-profit registered in the USA, which lends it quite a bit of legal and financial stability but also subjects it to official oversight/censorship, while Archive.today is run by a single mysterious dude who carefully hides his identity, and we don’t know where most of the site’s finances come from. (Edit: In one of the posts copied below he mentioned that he has some donations and ad revenue, but as of 2021 this covered less than 1/3 of the running costs.)

      Both financial security and resistance to censorship can be useful attributes to an online archive, but I have more trust in the Wayback Machine being online in 10 or 20 years, than Archive.today.


      Edit: The archive.today owner has a few blog posts mentioning these kinds of things:

      July 27, 2021:

      anonymous:
      Not respecting people’s privacy, copyright laws, or the veracity of content on your website… Please tell us more about how this archive isn’t being well managed and is doomed to die at any moment!

      archive-is:
      Of course, it is doomed to die at any moment (you should not have any illusions, as well as about the “veracity of content” on the Internet). The only idea is to hold back a little something that is doomed to die a little earlier. I hope that it is obvious after all the deplatforming dramas of the last months (disappearance of @realDonaldTrump, etc)

      August 13, 2021:

      anonymous:
      You said that before you die of old age you would implement a download zip of your whole site. That’s fine but links to archived pages will still be broken if you die if you don’t have someone to follow in your footsteps to maintain the site because the site will go offline or somebody will buy your expired domain name using it for another purpose. Do you have plans for someone to take over your site? I have thousands of archived pages, don’t want that work to go to waste.

      archive-is:
      I do not think there are many people willing to maintain such a project, which is also unprofitable. All 4½ projects over there - (IA, Archive.today, Megalodon.jp, half-suspensed WebCite, and paid Pinboard.in) look running on energy and money of a single person each and likely will be greatly changed or shutdown by the heirs.

      I could only advise to save everything locally to sync your documents with your own lifespan. Do not rely on clouds.

      daveymames:
      You don’t need many people mate, just a small amount of people is all that’s required. I for example would be willing to accept a passing of the torch. I would fund it with my own money and allow people to donate. I’m planning a site similar to Archive.org of my own that allows uploading via torrents so you can upload big files which is hard to do on archive.org and it bans people who don’t keep 1TB of stuff permanently seeded. This way I don’t need to waste money on storage.

      How much does hosting cost you per month at the moment?

      archive-is:
      about ~$2600/mo of pure expenses on servers/domains, not counting “work time”, “buying laptop/furniture”, etc. ($100…300/mo covered by donations + $300…500 by ads)

      I’d suggest starting with pdf/djvu archive:

      • It is of demand: people here often ask about archiving pdf/djvu and are particularly interested in archiving from another website rather than uploading (for some vague legal reasons).

      • Unlike archive.is, it is more a blob storage and fit to “store me a terabyte” model: there is no need to develop and support own file formats and its renderers.

      • There is a ready-made dataset to rescue and get some press attention on: Sci-Hub.

      • The mission is more about “save forever“ than our “keep a page online after the original took down or altered“.

      archive-is:
      Also, https://docs.softwareheritage.org/ is a storage-heavy initiative which would need extra mirrors and crawlers. I need it as a user, especially immutable weekly snapshots of whole language repositories (such as maven.apache.org, npmjs.com, crates.io, …)

      January 28, 2022:

      anonymous:
      Do you have anything prepared for the fate of the archive in the event of your death?

      archive-is:
      It is an overly optimistic assumption that there will be no risks before I die. Many projects (including at least two in this area: peeep.us and webcitation.org) stopped working long before the death of the people behind them. Many projects pivoted following the money. In addition, there are many critical points (e.g., domains) that I have no control over.
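
As a rough sanity check on the "less than 1/3" remark, here is a minimal sketch that just adds up the figures quoted in the August 13, 2021 post (~$2600/mo of expenses, $100…300 of donations, $300…500 of ads). The function name `coverage_range` is made up for illustration, and the numbers are only the ones quoted above, not current data.

```python
def coverage_range(monthly_cost, donations, ads):
    """Return the (lowest, highest) share of costs covered by income.

    donations and ads are (low, high) tuples in dollars per month,
    taken from the ranges quoted in the blog post.
    """
    low_income = donations[0] + ads[0]
    high_income = donations[1] + ads[1]
    return low_income / monthly_cost, high_income / monthly_cost


# Figures quoted by archive-is in 2021; nothing here is newer than that.
low_share, high_share = coverage_range(2600, (100, 300), (300, 500))
print(f"{low_share:.0%} to {high_share:.0%} of running costs covered")
```

Even the optimistic end ($800 out of $2600) stays under a third, so the rest would have to come out of pocket or from sources not mentioned in the posts.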

      • NateNate60@lemmy.world
        English
        arrow-up
        1
        ·
        edit-2
        9 hours ago

        If I had to guess, this guy (or girl) is a Bitcoin millionaire or something. But that’s just based on the vibes of his speech, with no concrete basis.

  • PKscope@lemmy.world
    English
    arrow-up
    265
    arrow-down
    2
    ·
    1 day ago

    Tackling the problems that really matter. Good job, FBI.

    Fucking clowns.

  • dan1101@lemmy.world
    English
    arrow-up
    16
    ·
    13 hours ago

    The news sites are trying to have it both ways: they serve the news articles to visitors and then cover them up with a paywall using browser tricks.

      • ITGuyLevi@programming.dev
        English
        arrow-up
        5
        ·
        9 hours ago

        I would put that more on the ad networks. If the ads were related to the article, they might generate a few more clicks. Instead, the ads are completely random and built off a profile they assume contains relevant info about me… but it doesn’t really seem to be accurate (this is kind of by my own choosing, though).

        Articles about rebuilding cars should have ads related to, say, rebuilding cars, and not some fucking nutritional supplement or some other unrelated thing.

        • silence7@slrpnk.netOP
          English
          arrow-up
          1
          ·
          edit-2
          4 hours ago

          Better ad targeting does make ads more valuable… but because only Google and Facebook have the visibility and ML to do it effectively, they wound up with all the ad revenue. Everybody else ended up with a few pennies.

    • punkibas@lemmy.zip
      English
      arrow-up
      6
      ·
      16 hours ago

      I have JavaScript disabled by default on all pages and only enable it when I need to, as per the Privacy Guides recommendations, but on this site at least, the article still won’t load. If I want to read it, I have to either register or use the archive.

  • Broadfern@lemmy.world
    English
    arrow-up
    29
    ·
    1 day ago

    That would explain why AdGuard’s public DNS started blocking it (labeled vaguely as “legal request”).
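
If anyone wants to compare how different public resolvers answer for the domain, here is a minimal sketch using only the Python standard library. It sends a plain UDP query for an A record and reads the response code and answer count from the header. The resolver addresses (8.8.8.8 for Google, 1.1.1.1 for Cloudflare, 94.140.14.14 for AdGuard DNS) and the helper names are my own assumptions, not something from the thread. How a resolver signals a block is also an assumption: it might return REFUSED, NXDOMAIN, or a placeholder address, so a nonzero answer count alone does not prove the domain is unblocked.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def parse_header(response: bytes) -> tuple:
    """Return (rcode, answer_count) from a DNS response header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount


def ask(resolver_ip: str, name: str, timeout: float = 3.0) -> tuple:
    """Send one UDP query to resolver_ip and return (rcode, answer_count)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver_ip, 53))
        response, _ = sock.recvfrom(512)
    return parse_header(response)


def compare_resolvers(name: str) -> None:
    """Print the response code and answer count from a few public resolvers."""
    # Assumed resolver addresses; swap in whichever ones you care about.
    resolvers = {
        "Google": "8.8.8.8",
        "Cloudflare": "1.1.1.1",
        "AdGuard": "94.140.14.14",
    }
    for label, ip in resolvers.items():
        try:
            rcode, answers = ask(ip, name)
            print(f"{label}: rcode={rcode}, answers={answers}")
        except OSError as exc:  # covers timeouts and unreachable networks
            print(f"{label}: query failed ({exc})")
```

Calling `compare_resolvers("archive.today")` prints one line per resolver; an rcode of 0 means NOERROR, 3 means NXDOMAIN and 5 means REFUSED. It does not print the returned addresses, so checking for a placeholder like 0.0.0.0 would need the answer section parsed as well.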

  • snoons@lemmy.ca
    English
    arrow-up
    80
    arrow-down
    2
    ·
    1 day ago

    Friends of tech Bros Incorporated.

    Regulatory capture is complete in the states.