this post was submitted on 24 Jul 2024
7 points (100.0% liked)

Research Findings:

  • reCAPTCHA v2 is not effective in preventing bots and fraud, despite its intended purpose
  • reCAPTCHA v2 can be defeated by bots 70-100% of the time
  • reCAPTCHA v3, the latest version, is also vulnerable to attacks and has been beaten 97% of the time
  • reCAPTCHA interactions impose a significant cost on users, with an estimated 819 million hours of human time spent on reCAPTCHA over 13 years, which corresponds to at least $6.1 billion USD in wages
  • Google has potentially profited $888 billion from cookies [created by reCAPTCHA sessions] and $8.75–32.3 billion for each sale of its total labeled data set
  • Google should bear the cost of detecting bots, rather than shifting it to users
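
As a rough sanity check on the cost bullet above, the two quoted totals can be divided out to see what hourly wage and yearly volume they imply. This is only arithmetic on the figures in the summary; the variable names are mine, and the paper's own wage assumption is not stated here.

```python
# Figures quoted in the summary above.
HOURS_SPENT = 819_000_000        # human hours spent on reCAPTCHA
WAGE_COST_USD = 6_100_000_000    # "at least $6.1 billion USD in wages"
YEARS = 13

# Hourly wage implied by dividing the cost by the hours.
implied_hourly_wage = WAGE_COST_USD / HOURS_SPENT
# Average number of hours per year over the 13-year span.
hours_per_year = HOURS_SPENT / YEARS

print(f"Implied wage: ${implied_hourly_wage:.2f}/hour")
print(f"Hours per year: {hours_per_year / 1e6:.0f} million")
```

So the headline number works out to roughly $7.45 per hour of solving, spread over about 63 million hours a year.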

"The conclusion can be extended that the true purpose of reCAPTCHA v2 is a free image-labeling labor and tracking cookie farm for advertising and data profit masquerading as a security service," the paper declares.

In a statement provided to The Register after this story was filed, a Google spokesperson said: "reCAPTCHA user data is not used for any other purpose than to improve the reCAPTCHA service, which the terms of service make clear. Further, a majority of our user base have moved to reCAPTCHA v3, which improves fraud detection with invisible scoring. Even if a site were still on the previous generation of the product, reCAPTCHA v2 visual challenge images are all pre-labeled and user input plays no role in image labeling."

top 31 comments
[–] [email protected] 3 points 3 months ago* (last edited 3 months ago) (3 children)

I kinda figured. It was annoying to do one, but then they wanted you to do two or three and that's absurd. Whenever it comes up now, I usually just close out.

[–] [email protected] 2 points 3 months ago (2 children)

they wanted you to do two or three and that's absurd

Yea how about 20

[–] [email protected] 1 points 3 months ago

VPN? Google will just go in a loop with these things, so I just stopped using Google completely.

[–] [email protected] 0 points 3 months ago (1 children)

if you have to do that many, you either have some privacy setting on or are on a flagged IP handed out by a VPN

[–] [email protected] 0 points 3 months ago (1 children)

Well yah of course I do. Why the hell is that 'abnormal'?

[–] [email protected] 0 points 3 months ago (1 children)

it's abnormal to them because VPNs are often also used by bad actors. your use is not abnormal, but there are other people misusing VPNs, which makes it worse for everyone else.

[–] [email protected] -1 points 3 months ago

Wow, way to blame individuals who take basic precautions instead of the corporations who are blatantly invading your privacy. Good job making the world a better place, bud.

[–] [email protected] 1 points 3 months ago

I'm surprised that this is in the news right now. This has been acknowledged as fact for a decade or so.

[–] [email protected] 0 points 3 months ago (1 children)

Some captchas have also just become obvious AI training. "Click on the living being in this image", "Select every image of the same object as in this example image". And the images you have to select look obviously AI generated.

[–] [email protected] 0 points 3 months ago (1 children)

Heh, I got one just the other day "Select the images containing structures built by people" lmao

[–] [email protected] 1 points 3 months ago

"click on all people not helping with the robot uprising"

[–] [email protected] 1 points 3 months ago (1 children)

Getting served a captcha often results in me closing the tab. I'm not doing stupid puzzles for you.

[–] [email protected] 0 points 3 months ago (1 children)

Do them wrong and then close out

[–] [email protected] 0 points 3 months ago (1 children)

I do it right and it says I’m wrong =\

[–] [email protected] 1 points 3 months ago

I have bad news for you friend...

You might be a robot

[–] [email protected] 0 points 3 months ago (1 children)

I bypassed 35,000 Google reCAPTCHA v2 challenges using bots. Don't ever rely on this for security.

[–] [email protected] 0 points 3 months ago (1 children)

Where can I learn this power?

[–] [email protected] 0 points 3 months ago (1 children)

I just spent $3 worth of bitcoin on NoCaptchaAI. I ran their web extension on a server with an open browser controlled by a custom WebExtension I made, so that a solved challenge would be returned to a swarm of clients upon request.

[–] [email protected] 1 points 2 months ago (1 children)

Your extension is archived, I'd rather not use it.

[–] [email protected] 2 points 2 months ago

It's a custom extension solving my very specific problem on a specific internal website. It was never meant for you to use; it's just there to serve as inspiration to others.

[–] [email protected] 0 points 3 months ago (1 children)

I honestly thought it was common knowledge that these things were essentially free labor for training AI.

[–] [email protected] 1 points 3 months ago

The original reCAPTCHA from Carnegie Mellon University was helping to digitize books. It showed one known word and one unknown word, and if enough people answered the second word with the same answer, that'd be marked as the correct value.
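
For anyone curious, the two-word scheme described above can be sketched in a few lines. This is only an illustration of the idea in this comment, not the actual reCAPTCHA code; the function names and the `min_votes` / `min_share` thresholds are made up for the example.

```python
from collections import Counter

def grade_response(control_answer, control_truth, unknown_answer, votes):
    """Pass the user if the known word is right; if so, record their guess
    for the unknown word as one vote."""
    passed = control_answer.strip().lower() == control_truth.lower()
    if passed:
        votes[unknown_answer.strip().lower()] += 1
    return passed

def consensus(votes, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if there is no consensus yet."""
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None

# Hypothetical session: the known word is "morning", the unknown scan reads "upon".
votes = Counter()
grade_response("morning", "morning", "upon", votes)
grade_response("morning", "morning", "upon", votes)
grade_response("morning", "morning", "up0n", votes)
grade_response("mourning", "morning", "spam", votes)  # fails the control word, vote ignored
grade_response("morning", "morning", "upon", votes)

print(consensus(votes))
```

Only people who get the known word right get a say on the unknown one, which is what keeps random typing from polluting the transcription.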

[–] [email protected] 0 points 3 months ago (2 children)

The objective of reCAPTCHA (or any captcha) isn't to detect bots. It is more of stopping automated requests and rate limiting. The captcha is 'defeated' if the time it takes to solve, whether by a human or a bot, is less than expected. Humans are very slow, so they can't beat it anyway.

[–] [email protected] 0 points 3 months ago (1 children)

[…] reCAPTCHA […] isn’t to detect bots. It is more of stopping automated requests […]

which is bots. bots do automated requests, and every automated request doer can also be called a bot (e.g. web crawlers are called bots too and, if kind, also respect robots.txt, which has "bots" in its name for this very reason; "bots" is short for "robots"). use of different words does not change the reality behind it, but may add a fact of someone trying something on the other.

[–] [email protected] 0 points 3 months ago (1 children)

There isn't a good way to tell human users from scripts without adding too much friction to normal use. Also, bots are sometimes welcome and useful; it's only a problem when someone tries to mine data in large volume or effectively DoS the server.

Forget bots, there exist centers in India and other countries where you can employ humans to do 'automated things' (YouTube like counts and watch hours, for example) at the same expense of bots. There are similar CAPTCHA services too. Good luck with those :)

Only rate limiting is the effective option.

[–] [email protected] 0 points 3 months ago (1 children)

Only rate limiting is the effective option.

i doubt that. you could maybe ratelimit per IP, and the abusers will change their IP whenever needed. if you ratelimit the whole service over all users in the world, then your service becomes useless exactly as fast as your ratelimiter is effective. if you ratelimit actions of logged-in users, then your ratelimiting is limited by your ability to identify fake or duplicate accounts, where captchas are not helpful at all.
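
to make the per-IP case concrete, here is a minimal token-bucket sketch. it is an assumption about how such a limiter might look, not what any real service runs; `PerIpRateLimiter`, its parameters, and the example IPs are all made up for illustration.

```python
import time

class PerIpRateLimiter:
    """Token bucket per client IP: each request costs one token,
    tokens refill at a fixed rate up to `capacity`."""

    def __init__(self, capacity=5, refill_per_sec=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.buckets = {}  # ip -> (tokens, time of last request)

    def allow(self, ip):
        now = self.clock()
        # An unseen IP starts with a full bucket.
        tokens, last = self.buckets.get(ip, (self.capacity, now))
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1:
            self.buckets[ip] = (tokens - 1, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False

# Frozen clock so no tokens refill during the demo.
limiter = PerIpRateLimiter(capacity=3, refill_per_sec=1.0, clock=lambda: 0.0)
same_ip = [limiter.allow("203.0.113.7") for _ in range(5)]
new_ip = limiter.allow("203.0.113.8")

print(same_ip)  # [True, True, True, False, False]
print(new_ip)   # True
```

the second address gets a fresh bucket, which is exactly the weakness: whoever can rotate IPs never sees the limit.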

"at the same expense of bots": they might be cheap, but i doubt that anyway; bots don't need sleep.

i was answering about that wording (that captchas were "not" about bots but about "stopping automated requests") and that automated requests "are" bots instead.

call centers are neither bots nor automated requests (the opposite IS their advantage) and thus have no relation to what i was specifically saying in reply to that post that suggested automated requests and bots would be different things in this context.

i wasn't talking about the effectiveness of captchas either, or whether bots should be banned or not, only about bots being automated requests (and vice versa) from the perspective of the platform stopping bots, and that trying to use different words for things (claiming something like "X isn't X, it is really U!"* or "automated requests aren't bots") does not change the reality of the thing itself.

*) unrelated to any (a-)social media platform

[–] [email protected] 0 points 3 months ago (1 children)

stopping automated requests

yeah my bad. I meant too many automated requests. Both humans and bots generate spam, and the issue is a high influx of it. Legitimate users also use bots, and that is by no means harmful. That's why you don't encounter a captcha every time you visit a Google page, and why a couple of scraping scripts don't run into problems. reCAPTCHA (or hCaptcha, say) triggers when there is a high volume of requests coming from the same IP. Instead of blocking everyone out to protect their servers, they can allow slower requests so legitimate users face minimal hindrance.

Most Google services nowadays require accounts with stronger (like cell phone) verification, so automated spam isn't a big deal.

[–] [email protected] 1 points 3 months ago (1 children)

since bots are better at solving captchas and humanoid services exist that solve them, the only ones negatively affected by captchas are regular legitimate users. the bad guys use bots or services and are done. regular users have to endure while no security is added, and for the influx i guess it is much more like the better lock on the front door: if your lock is a bit better than your neighbour's, theirs is more likely to be forced open than yours. it might help you, but it's not real security, only a relative and also very subjective feeling of 'security'.

being slower than the wolves also isn't as bad as long as you are not the slowest in your group (some people say)... so doing a bit more than others is always a good choice (just better not to put that bar too low, like using crowdsnakeoil for anything)

[–] [email protected] 1 points 3 months ago (1 children)

the bad guys use bots or services and are done. regular users have to endure while no security is added

put in other words, common users can't easily become 'bad guys', i.e. the cost of attack is higher, hence a lower number of script kiddies and automated attacks. You want to reduce the number. These protections are nothing for botnet owners or other high-profile bad actors.

ps: recaptcha (or captcha in general) isn't a security feature. At most it can be a safety feature.

[–] [email protected] 1 points 2 months ago

isn’t a security feature. At most it can be a safety feature.

o,,O

[–] [email protected] -1 points 3 months ago

I thought captchas worked in a way where they provided some known good examples, some known bad examples, and a few examples which aren't certain yet. Then the model is trained depending on whether the user selects the uncertain examples.
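
Roughly what I have in mind, as a sketch. To be clear, this is my guess at the scheme, not how reCAPTCHA is documented to work, and every name here is made up: the user is graded only on the tiles with known labels, and their picks on the uncertain tiles are tallied as label votes.

```python
def grade_challenge(selected, known_good, known_bad):
    """Pass if every known-good tile is selected and no known-bad tile is."""
    return known_good <= selected and not (known_bad & selected)

def record_uncertain(selected, uncertain, tallies):
    """For each uncertain tile, count how often it was picked vs. shown."""
    for tile in uncertain:
        picked, shown = tallies.get(tile, (0, 0))
        tallies[tile] = (picked + int(tile in selected), shown + 1)

# Hypothetical challenge: tiles are just string ids here.
known_good = {"a", "b"}   # definitely contain the object
known_bad = {"c"}         # definitely do not
uncertain = {"d", "e"}    # label not known yet
selected = {"a", "b", "d"}

tallies = {}
passed = grade_challenge(selected, known_good, known_bad)
if passed:
    # Only trust label votes from users who got the known tiles right.
    record_uncertain(selected, uncertain, tallies)

print(passed, tallies)
```

Once an uncertain tile has been picked (or skipped) by enough passing users, it could be promoted to the known-good or known-bad pool.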

Also it's very evident what's being trained. First it was obscured words for OCR, then Google Maps screenshots for detecting things, now you see them with clearly machine-generated images.