Things that have been happening to me too often lately
from tux0r@feddit.de to security@lemmy.ml on 23 Feb 2024 21:04
https://feddit.de/post/9261519

Guys, a VPN is self-defence. A website banning VPNs is like a brothel banning condoms. I mean, of course the house rules apply, but I’d like to see a bit more judgement. What’s happening right now is ridiculous and hardly does justice to the security aspect of these “tests”. If you find yourself as a contributor to this list, I urge you to stop. I am not a bad guy. All I do is use a VPN.

Thank you.

#security


doublejay1999@lemmy.world on 23 Feb 2024 21:33

Websites have no interest in banning VPNs and excluding visitors. The fact is that VPNs are a conduit for spam, bots and, more rarely, hacking, so hosts will protect themselves. Self-defence.

tux0r@feddit.de on 23 Feb 2024 21:46

How does denying read access to static content defend a website?

Rossphorus@lemmy.world on 23 Feb 2024 21:57

Topical answer: Bots going around scraping content to feed into some LLM dataset without consent. If the website is anything like Reddit they’ll be trying to monetise bot access to their content without affecting regular users.

tux0r@feddit.de on 23 Feb 2024 22:09

It should be easy to distinguish a bot from a real user though, shouldn’t it?

Rossphorus@lemmy.world on 23 Feb 2024 22:29

Unfortunately not. The major difference between an honest bot and a regular user is a single text string (the user agent). There’s no reason that bots have to be honest though, and anyone can modify their user agent. You can go further and use something like Selenium to make your bot appear even more like a regular user, including random human-like mouse movements. There is also a plethora of tools to fool captchas now. It’s getting harder by the day to differentiate.
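
To illustrate how little it takes to change that string, here is a minimal sketch using only Python's standard library. The URL, the `build_request` helper and both user agent values are made up for illustration, and the request is only built, never sent.

```python
import urllib.request

# Hypothetical target; nothing is fetched in this sketch.
URL = "https://example.com/article"

# A browser-style string copied by hand (illustrative value).
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:123.0) "
    "Gecko/20100101 Firefox/123.0"
)


def build_request(url, user_agent):
    """Build (but do not send) a request carrying the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# An honest bot announces itself; a dishonest one just copies a browser.
honest = build_request(URL, "my-scraper/0.1")
disguised = build_request(URL, BROWSER_UA)

# urllib stores header names capitalised as "User-agent".
print(honest.get_header("User-agent"))
print(disguised.get_header("User-agent"))
```

From the server's point of view, the second request's header is indistinguishable from one sent by a real browser.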

damnthefilibuster@lemmy.world on 23 Feb 2024 22:31

Nope. It gets more difficult every single day. It used to be easy: just check the user agent string. Real users will have a long one that says which browser they’re using. Bots won’t have one, or will have one that mentions the underlying scraping library they’re using.

But then bot makers wised up. Now they just copy the latest browser user agent string.
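
A rough sketch of that old-style check, and of why a copied string defeats it. The `looks_like_bot` function and the token list are assumptions for illustration, not any particular site's filter.

```python
# Substrings that honest scraping tools commonly put in their user agent.
# Illustrative list only, far from exhaustive.
KNOWN_BOT_TOKENS = ("python-requests", "curl", "wget", "scrapy", "bot")


def looks_like_bot(user_agent):
    """Naive check: flag a missing user agent or one naming a known tool."""
    if not user_agent:
        return True
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)


# A browser-style string that a bot could simply copy (illustrative value).
copied_ua = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:123.0) "
    "Gecko/20100101 Firefox/123.0"
)

print(looks_like_bot(""))                        # missing user agent
print(looks_like_bot("python-requests/2.31.0"))  # library default
print(looks_like_bot(copied_ua))                 # copied browser string
```

The first two calls are flagged and the third is not, which is the whole problem: the check only catches bots that are honest or careless.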

It used to be that you could use mouse cursor movement to create heat maps and figure out whether it’s a real user. Then some smart aleck went and created a basic script to copy his cursor movement and broke that.

Oh, and then someone created a machine learning model to learn that behavior too and broke that even more.

tux0r@feddit.de on 23 Feb 2024 22:45

Good point, thank you. Uh… beep!