Cloudflare offers to make AI pay to crawl websites (www.computerworld.com)
from Zerush@lemmy.ml to privacy@lemmy.ml on 01 Jul 17:53
https://lemmy.ml/post/32546497

#privacy

Ulrich@feddit.org on 01 Jul 18:33

I don’t believe they have that capability

Zerush@lemmy.ml on 01 Jul 18:48

Oh yes, they do, and more. They are one of the most powerful security and AI companies, with a ton of services, and are perfectly capable of pulling the plug on any service or website. Sadly, they come with privacy concerns similar to Google’s.

www.cloudflare.com

Ulrich@feddit.org on 01 Jul 18:54

Okay great. Go ahead and explain to me how they plan to fight an army of bots doing everything they can to be invisible?

HelloRoot@lemy.lol on 01 Jul 19:00

  1. they can already block VPN traffic (unless you use their VPN)

  2. their whole business model is based on them being a man in the middle that decrypts SSL and analyses the packets in plaintext

  3. about a third of the world’s websites use Cloudflare, so they have a pretty good bird’s-eye view of the behaviour of any machine, datacenter or IP range that visits a lot of websites, which in turn trivially reveals whether it is normal user behaviour or a crawler.
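A minimal sketch of what point 3 could look like in practice, assuming a shared request log of `(client_ip, site, timestamp)` tuples. The function name, the log format and both thresholds are hypothetical illustrations, not anything Cloudflare has documented; it only shows why visiting many unrelated sites in a short time stands out.

```python
from collections import defaultdict


def flag_crawler_candidates(requests, window_seconds=60, max_sites=20):
    """Return the IPs that touched more than max_sites distinct sites
    within any sliding window of window_seconds.

    requests: iterable of (client_ip, site, unix_timestamp) tuples.
    Both thresholds are made-up illustration values.
    """
    by_ip = defaultdict(list)
    for ip, site, ts in requests:
        by_ip[ip].append((ts, site))

    flagged = set()
    for ip, events in by_ip.items():
        events.sort()                 # oldest request first
        start = 0                     # index of the oldest event still in the window
        counts = defaultdict(int)     # site -> requests inside the window
        for ts, site in events:
            counts[site] += 1
            # Evict events that fell out of the window.
            while ts - events[start][0] > window_seconds:
                old_site = events[start][1]
                counts[old_site] -= 1
                if counts[old_site] == 0:
                    del counts[old_site]
                start += 1
            if len(counts) > max_sites:
                flagged.add(ip)
                break
    return flagged
```

An IP that hits 30 different sites in 30 seconds is flagged, while one that spaces the same 30 sites two minutes apart is not. That is the trade-off argued over further down the thread: a crawler can stay under such a threshold, but only by slowing down or spreading over many more addresses.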

Ulrich@feddit.org on 01 Jul 19:23

they can already block VPN traffic unless it goes through their VPN

Yeah that’s how most VPNs work.

their whole business model is based on them being a man in the middle that decrypts ssl and analyses the requests plainly

Okay? Analyze all you want. They can’t stop bots on any of the other sites they regulate either.

about a third of the world’s websites use Cloudflare, so they have a pretty good bird’s-eye view of the behaviour of any machine that visits a lot of websites

Great. Bots intentionally change up their behavior and identifying information so as to go undetected.

HelloRoot@lemy.lol on 01 Jul 20:08

They can’t stop bots on any of the other sites they regulate either.

They can and do. What gets blocked depends on the settings the website owner chooses in Cloudflare.

Bots intentionally change up their behavior and identifying information so as to go undetected.

If they have to crawl the web while behaving like a normal human, it will be orders of magnitude slower and more costly.

Ulrich@feddit.org on 01 Jul 21:14

What gets blocked depends on the settings the website owner chooses in Cloudflare.

And how does the owner know which connections are bots?

If they have to crawl the web while behaving like a normal human, it will be orders of magnitude slower and more costly.

They don’t care, they have trillions of dollars of VC money to power through.

HelloRoot@lemy.lol on 01 Jul 21:25

The owner sets the level. If they set a strict level, all bots are blocked.

They do care. VC funding happens because the result is profitable. If it is less profitable, there will be less funding because of higher investment risk.

Ulrich@feddit.org on 01 Jul 21:42

If they set a strict level, all bots are blocked.

I don’t know what you don’t understand. These bots are not labeling themselves as bots. They are camouflaging themselves to look like any other type of traffic.

VC funding happens because the result is profitable.

No, VC funding happens because investors are duped into thinking the result is profitable.

[deleted] on 02 Jul 04:47

.

Ulrich@feddit.org on 02 Jul 04:50

I’m not and your tone is completely unnecessary so maybe dial it back a bit.

Yes, VC funding can be profitable. It’s also often not. Like any other investment. Corporations will absolutely lie and blow smoke up their ass if they think it can get them more money.

[deleted] on 02 Jul 04:57

.

PowerCrazy@lemmy.ml on 01 Jul 20:59

They can’t stop bots on any of the other sites they regulate either.

Why not? They are doing edge caching, so they can literally just block the connection from reaching the site, just like they do with their DDoS mitigation.

Ulrich@feddit.org on 01 Jul 21:12

they can literally just block the connection

Block which connection? Again, these AI companies know people don’t want them crawling their sites, and they do everything they can to be invisible. This has been an issue for years at this point.

just like they do with their DDoS mitigation

Blocking DDoS attacks is trivial by comparison.

Zerush@lemmy.ml on 01 Jul 19:30

It’s not the first time that, with all my privacy measures on, instead of the page I wanted I get Cloudflare’s page checking whether I am a bot before it lets me through. Being invisible on the web is just a bad joke. Anybody is visible the moment they go online, regardless of whether they use a VPN, Tor or whatever; those times have passed. Believing otherwise is as hilarious as the scene in the movie Independence Day where an alien mothership gets infected with a virus from a crappy laptop (I laughed a lot at that scene).

slackness@lemmy.ml on 02 Jul 04:33

Anybody is visible in the moment he goes online, irrelevant if he uses …, TOR

No

Zerush@lemmy.ml on 02 Jul 08:16

Yes. Tor was never secure against secret services and governments, and even less so nowadays with the AI and massive server power they have. Don’t forget who developed the Tor network and who runs the servers it uses. Drug barons long ago went back to pen and paper for their orders and communication, because the web, and even the Dark Web, isn’t really private anymore (traffic analysis, exploited software vulnerabilities, monitored exit nodes, honeypot nodes…)

slackness@lemmy.ml on 02 Jul 11:22

Just calling it “Dark Web” gives away you have no idea what you’re talking about.

Zerush@lemmy.ml on 02 Jul 12:25

Naturally, the Darkweb has nothing to do with the Tor network. But even if you access it through the Tor network, they will find you if they want to, as you can see.

Steve@communick.news on 01 Jul 19:52

It’s literally what their entire business is based on. Filtering good and bad traffic.

Ulrich@feddit.org on 01 Jul 21:14

Their business is largely based on security and preventing specific types of actions on a site. Not just the mere act of visiting it.

surjomukhi@lemmygrad.ml on 01 Jul 23:00

It is. And they have been blocking AI crawlers for a while now. wired.com/…/cloudflare-tools-detect-block-ai-bots…

webghost0101@sopuli.xyz on 02 Jul 10:19

Based on the headline, this is not about blocking AI scrapers but about making them pay to do it.

Based on the discussion below, which moved that goalpost, the most likely answer is making it cheaper to scrape “legally” than it costs to mimic millions of individual residential browsers with human users.

I don’t know how many aces Cloudflare has up its sleeve for detecting covert AI crawlers, but they definitely have the tools to make it pretty costly and difficult. There is also a difference in bandwidth impact between a few capitalist megapigs scraping secretly and loads of global basement dwellers and smaller companies scraping worry-free.

sudo@programming.dev on 02 Jul 22:50

How about you just read up on Cloudflare Turnstile instead of acting like you know anything? Here are some notable methods:

  • Residential IP requirements
  • TLS Fingerprinting
  • Canvas Fingerprinting

It’s still possible to get around these, but it’s not easy. You must either have network engineers on staff as good as Cloudflare’s or pay some third-party service to unlock it for you. All Cloudflare needs to do is keep their prices lower than the third-party services.
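For anyone unfamiliar with TLS fingerprinting, here is a rough sketch in the style of the publicly documented JA3 method: the fields a client offers in its TLS ClientHello are joined into a string and hashed. This is an assumption for illustration only; the thread does not say which fingerprinting scheme Turnstile uses, and the function name and sample values are hypothetical. Real JA3 also filters out GREASE values, which this simplified version skips.

```python
import hashlib


def ja3_style_fingerprint(tls_version, ciphers, extensions, curves, point_formats):
    """Build a JA3-style string and its MD5 hash from ClientHello fields.

    All arguments are plain integers or lists of integers, in the order the
    client sent them. Parsing a real ClientHello is out of scope here.
    """
    fields = [
        str(tls_version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    ja3_string = ",".join(fields)
    digest = hashlib.md5(ja3_string.encode("ascii")).hexdigest()
    return ja3_string, digest


# Two clients can claim the same User-Agent, but if their TLS libraries offer
# ciphers in a different order, the fingerprints differ.
browser = ja3_style_fingerprint(771, [4865, 4866], [0, 11, 10], [29, 23], [0])
scraper = ja3_style_fingerprint(771, [4866, 4865], [0, 11, 10], [29, 23], [0])
print(browser[1] != scraper[1])
```

The point is that a scraper has to match the whole handshake of a real browser, not just its headers, which is part of why getting around this takes real engineering effort.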

SheeEttin@lemmy.zip on 01 Jul 22:10

Believe all you want, reality doesn’t care.

Ulrich@feddit.org on 01 Jul 23:21

I agree!

dis_honestfamiliar@lemmy.sdf.org on 02 Jul 05:59

They probably can. Link

AceFuzzLord@lemmy.zip on 02 Jul 07:03

This idea that you could have Cloudflare help by telling off AI crawlers sounds nice, but how long until it becomes a premium feature that requires loads of money to operate because AI companies lobby them to make it inaccessible to the masses? Or until something equally bad happens?

Hirom@beehaw.org on 02 Jul 11:29

This could further accelerate the arms race between malicious scrapers and websites.

My fear is that this would create collateral damage: blocking legitimate scrapers and visitors, and hassling people with an increasing number of CAPTCHAs.

TerHu@lemmy.dbzer0.com on 02 Jul 13:47

yeah i think there’s a good chance that vpn users will get harassed by anti-ai measures