this post was submitted on 24 Jul 2024
9 points (100.0% liked)

Technology



This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

top 19 comments
[–] [email protected] 6 points 1 month ago (1 children)

"We believe in an open internet... as long as you use these specific services."

This really sucks. So we're looking at a future where search engines are like streaming services. "Hmmm, now which search engine was that site on?"

[–] [email protected] 1 points 1 month ago

That's why I use a SearXNG instance. Why bother searching for something on 1 engine when you could search for it on 5 and then correlate the results?

[–] [email protected] 3 points 1 month ago (1 children)

Does this mean the Internet Archive will no longer be archiving reddit posts? That's how I've tried viewing most since I deleted my accounts.

[–] [email protected] 1 points 1 month ago (1 children)

I honestly do not think the Internet Archive should even be archiving behemoths like Reddit or Twitter. The only thing it should keep is sites that are currently dead.

It's even worse when people access these posts through the Archive while a live copy exists. That's a lot of wasted storage and bandwidth.

[–] [email protected] 3 points 1 month ago* (last edited 1 month ago)

Counterpoint: Scumbag companies ninja-editing their timestamped warranty page such that the only way you know they edited it after you bought the product is because it was archived previously.

Archives are ideal for identifying sneaky behavior like that. You never know when an admin might have the ability to delete or edit something without anyone noticing.

[–] [email protected] 2 points 1 month ago (3 children)

I don’t have a ton of knowledge in this area, but this seems like it should run afoul of antitrust regulations?

[–] [email protected] 1 points 1 month ago

Given lawmakers who understand how the internet works, I think it would. To me this isn't any different from a handful of years back, when ISPs were throttling websites to give an advantage to certain ones that paid them for faster delivery.

[–] [email protected] 1 points 1 month ago (1 children)

That was my first thought too. Yet another reason to vote for Dems this November - only one party actually gives a shit about enforcing antitrust regulations!

[–] [email protected] 3 points 1 month ago

Are you absolutely positive the Democrats give a shit about antitrust regulations? Biden did actively break a strike.

[–] [email protected] 1 points 1 month ago

Who should be regulated, Google or Reddit? Reddit updated their robots.txt to disallow everything. As it's their site, I guess it's also their right to determine that. They then made a deal with Google, which I guess also isn't Google abusing a dominant position, as Reddit could have made a deal with anyone.
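
For anyone wondering what "disallow everything" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a hypothetical blanket-disallow file written for illustration, not a copy of Reddit's actual file, and `can_crawl` is just a helper name I made up. It only shows how a well-behaved crawler would read such a file; robots.txt is advisory, and a file like this says nothing about private arrangements made with any particular crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical blanket-disallow file (an assumption, not Reddit's real robots.txt).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# For contrast: an empty Disallow value permits everything.
OPEN_ROBOTS_TXT = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.reddit.com/r/technology/"
    for agent in ("Googlebot", "bingbot", "DuckDuckBot"):
        print(agent, "blocked file:", can_crawl(ROBOTS_TXT, agent, url))
        print(agent, "open file:", can_crawl(OPEN_ROBOTS_TXT, agent, url))
```

With the blanket rule, every user agent gets the same answer from the file itself, so whatever lets one crawler through has to happen outside robots.txt.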

[–] [email protected] 2 points 1 month ago

If you use Bing, DuckDuckGo, Mojeek, Qwant or any other alternative search engine that doesn’t rely on Google’s indexing and search Reddit by using “site:reddit.com,” you will not see any results from the last week. DuckDuckGo is currently turning up seven links when searching Reddit, but provides no data on where the links go or why, instead only saying that “We would like to show you a description here but the site won't allow us.” Older results will still show up, but these search engines are no longer able to “crawl” Reddit, meaning that Google is the only search engine that will turn up results from Reddit going forward.

Can anyone confirm this? I typically use DDG, and I tried verifying this, but I'm not sure what to search for on Reddit that would exclusively bring up results from the past week. Seems like most of the time I'm reading posts from a year ago or more anyway, so it's hard to see the effect immediately.

[–] [email protected] 1 points 1 month ago (1 children)

Are we looking at a future where we need a search engine to tell us which search engine to use for our queries?

[–] [email protected] 1 points 1 month ago

I think we’re looking at a future where Google ensures we don’t ever have to worry about making such a choice.

[–] [email protected] 0 points 1 month ago* (last edited 1 month ago) (1 children)

It's a bit of a dilemma reading their policy:

We believe in the open internet and in keeping Reddit publicly accessible to foster human learning (...) Unfortunately, we see more and more entities using unauthorized access (...) especially with the rise of use cases like generative AI. This sort of misuse of public data has become more prominent as more and more platforms close themselves off from the open internet.
We still believe in an open internet, but we do not believe that third parties have a right to misuse public content just because it’s public.

Being an open/public platform but still wanting to protect users' content from being used for AI could be a good thing, and I guess it's also what many fediverse users would want for this platform. Making a distinction between AI and search indexing could indeed be difficult. But then making content deals with Google for search indexing and AI training is a bit hypocritical.

[–] [email protected] 2 points 1 month ago

We still believe in an open internet, but we do not believe that third parties have a right to misuse public content just because it’s public.

You need to pay us for the right to misuse our site's data!

[–] [email protected] 0 points 1 month ago (2 children)

It's not a big deal... for now, because most of the time when I limit DDG results I ask for 1 year back (for solutions that are sort of recent but not ancient).

I would never limit results to just the last week, and typically posts that are that fresh won't have enough accumulated knowledge so even if they pop up on the results they're not really useful.

Again, that's just my experience. I'm curious if others have similar ones.

[–] [email protected] 2 points 1 month ago

In 51 weeks, the decreasing usefulness of that search drops to zero. This is not about now; it's about the future.

[–] [email protected] 1 points 1 month ago (1 children)

True, but it will become a bigger deal with every passing day; that is the problem.

I also wonder if search engines will delist results after a period of time if the site is blocking them. After all, you don't want your top results to just be 404s all of the time.

[–] [email protected] 1 points 1 month ago

Unless it's 404 Media!