Large-scale online deanonymization with LLMs (arxiv.org)
from FineCoatMummy@sh.itjust.works to privacy@lemmy.ml on 28 Mar 21:02
https://sh.itjust.works/post/57571819

Paper by:

Simon Lermen, Daniel Paleka, Joshua Swanson, Michael Aerni, Nicholas Carlini, Florian Tramèr

It’s about deanonymizing people who write under a pseudonym, which covers sites like Reddit and Lemmy.

From the paper:

Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to: (1) extract identity-relevant features, (2) search for candidate matches via semantic embeddings, and (3) reason over top candidates to verify matches and reduce false positives.

Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered.

They can match writing style, interests, details that hint at a job or city, and other unstructured information. That makes it possible to link seemingly unrelated pseudonyms to the same person. For example, FooFighterGroupie and Yolanda43905 turn out to be the same human, even though they never said so. It can also match a pseudonym to a real identity across sites, for example when someone has posted on LinkedIn under their real name. It takes less info than most people expect to figure out that Julia Greenberg of Cedarville, NH is FooFighterGroupie.
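For a rough feel of step (2) in the quoted pipeline (searching for candidate matches), here is a toy sketch in Python. It is not the paper's code. Plain word counts stand in for semantic embeddings, the LLM steps (1) and (3) are left out entirely, and the profiles and helper names (`vectorize`, `cosine`, `top_candidates`) are made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_candidates(query_text, database, k=2):
    """Rank the accounts in `database` by similarity to `query_text`."""
    query_vec = vectorize(query_text)
    scored = [(cosine(query_vec, vectorize(text)), name)
              for name, text in database.items()]
    scored.sort(reverse=True)
    return [(name, round(score, 3)) for score, name in scored[:k]]


# Hypothetical profiles, invented for this example.
site_a = {
    "FooFighterGroupie": "saw the foo fighters again, drove down from "
                         "cedarville after my nursing shift",
}
site_b = {
    "Yolanda43905": "nursing shift ran late in cedarville so I missed "
                    "the foo fighters presale",
    "GardenGnome77": "my tomatoes got blight again, any tips for raised "
                     "beds in clay soil",
    "RustyCompiler": "borrow checker errors are driving me up the wall today",
}

for name, text in site_a.items():
    print(name, "->", top_candidates(text, site_b))
```

Even with something this crude, the account that mentions the same band, town and job floats to the top. Per the abstract, the real pipeline then has an LLM reason over those top candidates to weed out false positives.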

You can protect yourself by never giving away much info. But ofc sometimes that’s the whole point! Talking about specific hobbies or w/e gives away info. Also change up your writing style + vocab, b/c it is a unique fingerprint.
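To make the "fingerprint" point concrete, here is a minimal sketch using character trigram counts, an old-school stylometry feature. This is an assumption-laden toy, not the LLM-based method from the paper, and the three sample texts are invented. Real stylometry uses far more text and many more features.

```python
import math
from collections import Counter


def trigram_profile(text):
    """Count overlapping 3-character sequences, keeping case and punctuation."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def similarity(a, b):
    """Cosine similarity between two trigram profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented samples: `unknown` and `casual` share habits ("ofc", "b/c", "+", "w/e"),
# while `formal` says the same thing as `casual` in a different style.
unknown = "ofc it leaks b/c ppl reuse words + w/e"
casual = "ofc i forgot b/c i was busy + w/e"
formal = "Naturally, I forgot because I was busy."

score_casual = similarity(trigram_profile(unknown), trigram_profile(casual))
score_formal = similarity(trigram_profile(unknown), trigram_profile(formal))
print("unknown vs casual:", round(score_casual, 3))
print("unknown vs formal:", round(score_formal, 3))
```

The abbreviations and punctuation habits carry over even though the topic changed, which is why the casual sample scores closer to the unknown one. Topic words also leave trigrams behind, so on real data style and content get tangled together.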

I doubt this technique is used in a dragnet way… YET! But there’s no reason it can’t scale if the cost of resources gets low enough. We could eventually see it become standard analysis for linking people across sites and identities.

#privacy

PolarKraken@lemmy.dbzer0.com on 28 Mar 21:33

I’ve been expecting to hear of something like this. It’s a natural evolution of LLM use cases, and grimly inevitable.

astraeus@lemmy.ml on 28 Mar 21:46

It’s a damn good thing I’m a gun toting Ohio libertarian that never lies online at all

grey_maniac@lemmy.ca on 28 Mar 21:49

Definitely! I recall seeing you at the Lodge meetings.

astraeus@lemmy.ml on 28 Mar 21:50

We should go to the range sometime to get away from those dang liberals😎

corvus@lemmy.ml on 28 Mar 22:09

So it seems that letting LLMs write sloppy posts for us can be useful after all. Maybe c/privacy should implement automatic AI reformatting XD

FineCoatMummy@sh.itjust.works on 28 Mar 22:17

Yah, there might be something to that. For protection against style + vocab matching.

It sucks though. I recently read that the more people use LLM assistants when they write, the blander the whole virtual commons gets. It feeds back on itself.

Sigh. I just want a world where we can have nice things. And assholes don’t try to ruin the nice things we could have.

astraeus@lemmy.ml on 28 Mar 22:31

You’re absolutely right! It’s not just subterfuge—it’s praxis.

Nils@lemmy.ca on 29 Mar 09:01

Previously, the advice was to translate your posts into one or two languages before posting. It seems that even rough content generated by large language models (LLMs) can help people fit in more easily.

I like how slop became “rough content” after translation.

Zacryon@feddit.org on 29 Mar 03:47

As if we needed more lessons in how cautious we should be with what we put on the internet. What was true 20 years ago hasn’t changed.

Bloefz@lemmy.world on 29 Mar 12:28

Yes I’ve been worried about exactly this. I’m sure it’s very much within the realm of possibility these days.

racoon@lemmy.ml on 29 Mar 15:33

Use throwaway accounts, and retire each one after no more than 6 months or 100 comments.

irmadlad@lemmy.world on 31 Mar 21:12

“You can protect yourself by never giving away much info.”

Which is why you should silo all online accounts and avoid linking them together.

The1TruePower@lemmy.myserv.one on 03 Apr 17:09

The corruption and rape of our digital freedom only sucks the life out of the internet. Eventually it will die. At some point the only way to avoid AI, algorithmic manipulation, and the like will be to live a life completely offline. That’s what the powers that be will push people to. I have hope that these shitty evil practices aren’t sustainable and the whole thing will collapse.