
Small-Scale Question Sunday for February 26, 2023

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


This comment I made about how hard or easy it is to find pseudonymous users' online identities got me thinking about infosec.

So I spent the rest of the evening researching OSINT tools and other methods for doing bad things (this has the added benefit of implicitly teaching you how to keep bad things from being done to you, though knowing how to be safe won't necessarily teach you how to be dangerous).

I am not sure how feasible this is, but I checked some emails of people I know on https://haveibeenpwned.com/, and it tells you which data breach the email/password combination was found in. So isn't all that remains to acquire the breached data and hope they don't use 2FA? Or am I missing something?
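(For anyone who wants to script the password half of that check: HIBP exposes a keyless k-anonymity range endpoint for passwords, while the per-email breach lookup at /api/v3/breachedaccount requires a paid API key. A minimal sketch against the range endpoint only:)

```python
import hashlib
import urllib.request

def pwned_count(password: str) -> int:
    """Return how many times a password appears in HIBP's breach corpus.

    Uses the k-anonymity range API: only the first five hex characters of
    the SHA-1 hash are sent, so the full password never leaves your machine.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    with urllib.request.urlopen(f"https://api.pwnedpasswords.com/range/{prefix}") as resp:
        body = resp.read().decode("utf-8")
    # Each response line looks like "<remaining 35 hash chars>:<count>"
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate == suffix:
            return int(count)
    return 0

print(pwned_count("hunter2"))  # a classic, so the count should be large
```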

Anyway, back on the topic of infosec/OSINT: what are your favorite tools that you totally use for security reasons? I am also interested in any clever techniques you have heard of or used yourself, for or against all things infosec.

I have been tracked down a few times, actually, though all instances but the one mentioned in the last thread took place before I started to pay any real attention to privacy. Virtually all of the people who succeeded became my friends, at least for a while. Guess I was more likable back then. One of them was a fledgling teenage hacker who later started working for the FSB in basically the same capacity, and he knows a lot about this stuff, but we haven't been in touch in a while.

I don't believe there are very clever things one can do to ensure anonymity. (Maybe LLM instances to populate correlated but misleading online identities? Style transfer? I'll use this as soon as possible though my style is... subjectively not really a writing style in the sense of some superficial gimmicks, more like the natural shape of my thought, and I can only reliably alter it by reducing complexity and quality, as opposed to any lateral change). @gattsuru gives decent advice, but aside from those technological attack surfaces, you should just understand the threat model and not share data that meaningfully narrows down your identity. Speaking of elder gods, Terry Tao has written on this topic:

Anonymity on the internet is a very fragile thing; every anonymous online identity on this planet is only about 31 bits of information away from being completely exposed. This is because the total number of internet users on this planet is about 2 billion, or approximately 2^{31}. Initially, all one knows about an anonymous internet user is that he or she is a member of this large population, which has a Shannon entropy of about 31 bits. But each piece of new information about this identity will reduce this entropy. For instance, knowing the gender of the user will cut down the size of the population of possible candidates for the user’s identity by a factor of approximately two, thus stripping away one bit of entropy. (Actually, one loses a little less than a whole bit here, because the gender distribution of internet users is not perfectly balanced.) Similarly, any tidbit of information about the nationality, profession, marital status, location (e.g. timezone or IP address), hobbies, age, ethnicity, education level, socio-economic status, languages known, birthplace, appearance, political leaning, etc. of the user will reduce the entropy further.

One can reveal quite a few bits of information about oneself without any serious loss to one’s anonymity; for instance, if one has revealed a net of 20 independent bits of information over the lifetime of one’s online identity, this still leaves one in a crowd of about 2^{11} \sim 2000 other people, enough to still enjoy some reasonable level of anonymity. But as one approaches the threshold of 31 bits, the level of anonymity drops exponentially fast. Once one has revealed more than 31 bits, it becomes theoretically possible to deduce one’s identity, given a sufficiently comprehensive set of databases about the population of internet users and their characteristics.

Thus, in today’s online world, a crowd of billions of other people is considerably less protection for one’s anonymity than one may initially think, and just because the first 20 or 30 bits of information you reveal about yourself leads to no apparent loss of anonymity, this does not mean that the next 20 or 30 bits revealed will do so also.

Restricting access to online databases may recover a handful of bits of anonymity, but one will not return to anything close to pre-internet levels of anonymity without extremely draconian information controls. Completely discarding a previous online identity and starting afresh can reset one’s level of anonymity to near-maximum levels, but one has to be careful never to link the new identity to the old one, or else the protection gained by switching will be lost, and the information revealed by the two online identities, when combined together, may cumulatively be enough to destroy the anonymity of both.

...one additional way to gain more anonymity is through deliberate disinformation. For instance, suppose that one reveals 100 independent bits of information about oneself. Ordinarily, this would cost 100 bits of anonymity (assuming that each bit was a priori equally likely to be true or false), by cutting the number of possibilities down by a factor of 2^{100}; but if 5 of these 100 bits (chosen randomly and not revealed in advance) are deliberately falsified, then the number of possibilities increases again by a factor of \binom{100}{5} \approx 2^{26}, recovering about 26 bits of anonymity. In practice one gains even more anonymity than this, because to dispel the disinformation one needs to solve a satisfiability problem, which can be notoriously intractable computationally, although this additional protection may dissipate with time as algorithms improve (e.g. by incorporating ideas from compressed sensing).

We've moved past compressed sensing, of course.

I wonder if this was a purely theoretical musing, good-faith advice, or a hint that Fields laureates, too, have opinions to hide and shitposts to send. «On the internet, nobody knows you're a Tao».
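His arithmetic is easy to sanity-check, by the way. A quick sketch (the 53/47 gender split below is my own illustrative guess, not a number from his post):

```python
import math

population = 2_000_000_000                 # Tao's rough count of internet users
print(math.log2(population))               # ~30.9, i.e. "about 31 bits"

# An attribute shared by a fraction p of the population costs log2(1/p) bits
# when revealed; averaged over a slightly unbalanced 53/47 split (illustrative
# guess), that is a little under one bit, as the quote notes.
p = 0.53
print(-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))  # ~0.997 bits

print(population / 2**20)                  # ~1907 people left after 20 bits, "about 2^11"

# Randomly falsifying 5 of 100 revealed bits multiplies the candidate set by C(100, 5)
print(math.log2(math.comb(100, 5)))        # ~26.2, the "about 26 bits" recovered
```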

I don't believe there are very clever things one can do to ensure anonymity. (Maybe LLM instances to populate correlated but misleading online identities? Style transfer? I'll use this as soon as possible though my style is... subjectively not really a writing style in the sense of some superficial gimmicks, more like the natural shape of my thought, and I can only reliably alter it by reducing complexity and quality, as opposed to any lateral change).

Reminds me of that joke about a janitor who looked exactly like Vladimir Lenin. When someone from the Competent Organs suggested that this was kind of untoward and maybe he should at least shave off the beard, the guy responded that of course he could shave the beard, but what was he to do with the towering intellect?

This is precisely why one should regularly nuke one's internet identity. Online handles are like underwear in this regard. Have multiple, switch them often, and don't use them past their expiration date.

This goes very much against common sentiment on basically all internet communities except the chans, mainly because the people with the largest sway over a community are those who have built a reputation for themselves, and a reputation can only be built on a persistent identity. We are thus amplifying the voices of the least OPSEC-conscious.