Small-Scale Question Sunday for August 3, 2025

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

How much detail do you think is in the data that governments and tech companies are keeping about us?

  1. Are they keeping a log of every website you visit?

  2. Are they keeping a log of your phone's 24/7 location data?

  3. Are they keeping transcripts of all of your phone calls?

  4. Are they keeping transcripts of every word you say in the vicinity of a smart device?

  5. etc.?

  1. Depends on what you mean by "tech companies". Technically, unless you run a full-time VPN, at least your ISP has the full list of all websites you visit. Given that we have confirmed reports of dragnet surveillance installed at least at some major ISPs, you can assume the NSA (and whatever TLAs they share with) has the full log of these (storage paid for with our tax money, thank you very much!), though they probably don't check it unless you become a focus of their attention somehow.

  2. Google/Apple most definitely have this data, and likely they sell some of it and hand some of it over in response to a search warrant. The government can request it; the legality is kinda debated, but it's legal at least in some cases, so you can assume that if the government wants it, it will have it. I don't think we have any info about the Feds keeping independent logs, but they wouldn't need to.

  3. Not likely, as it would be a direct violation of wiretapping laws AFAIK. Unless, of course, you got into enough trouble for The Law to be able to get a wiretapping warrant on you. Though really, with all the rest of the NSA shenanigans, I wouldn't be totally surprised if they started doing it, but I haven't heard any indications of that happening yet.

  4. Not likely, since the traffic needed to record it all would be large enough for people to notice and start talking about it. It is plausible that there could be "keyword triggers" that record specific conversations and clandestinely ship them back to the phone/OS company (where the previous items apply), but full transcripts of every word would be hard to do without people noticing, and since AFAIK we don't have any good evidence of this right now, I'll tend to say no, at least in the form presented. They definitely could listen and update e.g. your advertisement profile. That'd be very hard to catch without having enough access, though the longer we go without somebody Snowden-ing it out, the lower the probability that it is actually happening. If the NSA couldn't keep their secrets secret, why would Google or Apple be able to?

  5. In general, it all depends on a) what your threat model is and b) how interested the government is in you. For most normal people, the government is not interested in them unless they become a target of an investigation, which means they did something to trigger it, or somebody else pointed at them as somebody to watch. If that happened, pretty much any contact with modern society means you're screwed. Bank account? Forget about it. Driving? You better wear a mask and steal a car from somebody that doesn't mind their car being stolen. Communication? Burner phones probably would get you somewhere, but don't stay in the same place too long or use the same burner for too long. It's possible to live under the radar, but it's not convenient, and usually people that do that have their own infrastructure (like drug traffickers); if you're going into it alone, it will be tough for you. OTOH, if you're just a normie feeling icky about your data being stored in vast data silos, you can use some tools (VPNs, privacy OS phones, etc.) with relatively minor inconvenience and avoid being data-harvested. But it wouldn't protect you if The Law becomes seriously interested in you.
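
The traffic argument in point 4 can be put into rough numbers. This is a back-of-envelope sketch only: the 16 kbit/s speech bitrate and the 16 hours of listening per day are assumptions picked for illustration, not figures from any vendor, and the function name is hypothetical.

```python
# All constants below are illustrative assumptions, not measured values.
SPEECH_CODEC_BITS_PER_SECOND = 16_000  # assumed low-bitrate speech codec
LISTENING_HOURS_PER_DAY = 16           # assumed waking hours near the device


def audio_megabytes_per_day(bits_per_second: int, hours: float) -> float:
    """Return the daily upload volume in megabytes (10**6 bytes)."""
    seconds = hours * 3600
    return bits_per_second * seconds / 8 / 1_000_000


daily_mb = audio_megabytes_per_day(SPEECH_CODEC_BITS_PER_SECOND, LISTENING_HOURS_PER_DAY)
print(f"{daily_mb:.1f} MB/day, {daily_mb * 30 / 1000:.2f} GB/month")
```

Under those assumptions, always-on audio upload comes to a bit over 100 MB per day per device, which is the kind of steady background traffic someone watching their network would plausibly notice. It says nothing about the keyword-trigger scenario, which would send far less.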

I think with 4 and 5 it’s much more likely that they have various companies do that for them, and have arrangements to let them ask to see it. There are a lot of ways that this could be happening, and since your ISP/phone company/social media isn’t literally the government, it’s not really illegal. The arrangement would be something like what happens with pictures. Apple can search your photos (or at least the ones on their cloud) for child porn. They are also obligated to report any such images they find. But I absolutely believe that if I said something that the government really really doesn’t like, it would be reported to the government fairly quickly. And it’s mostly down to liability laws: if I have a social media account where I talk about doing something illegal and I actually do it, my victims can absolutely go after those media outlets for knowing that I said that and not warning people to stop me.

I actually opt into a service with Google where they track where I am at pretty much all times through my phone. I can go to a dashboard and follow myself through the past going back to when I first opted in. I assume they do this for everyone and I'm only opting into the tools to see the data myself. My wife can also see where I am at any given time, which is also intentional. I have issues with my health and get holes in my memory; I've needed others to be able to locate me before when I'm not well.

#1, #2 and #3 are technically feasible and only require coordinating a small number of companies with server-side mechanisms, but such a scheme is vulnerable to whistleblowing.

At least until the 2020s the employees at Google would have revolted if they found insidious spyware like that. Not sure how it would go in 2025...

#4 would be challenging to do en masse without the infosec community noticing.

If you're willing to go full tinfoil hat you can greatly increase inconvenience to yourself to mitigate the first 3. Naomi Brockwell's YouTube channel is a fairly high quality resource.

https://youtube.com/@NaomiBrockwellTV

This reminded me of something that is not quite on topic, but close enough.

Towards the end of summer 2018, I broke up with my girlfriend. Given my age and maturity at the time, I, of course, took care of the most important things first; I hid all of our photos that we had posted together on Facebook.

Well, not actually all of them. All of the ones that popped up on my "wall" at the time (I actually deleted Facebook for good about a year later, so my appreciation for both the terminology and function of the site is now out of date. Apologies if what I detail here isn't how it works anymore). These were typical couples photos; lots of couple-selfies of us eating things or being in places or even eating things in places.

As with any millennial breakup, however, I didn't actually block or unfriend my ex. No, no. You see, there is etiquette to the Facebook break-up. Although there can be a period of mutual blocking, you never hard delete one another. But you also never interact with one another. You simply cyber-stalk one another to see who rebounds first.

Being a career technology dude, however, I noticed something interesting. Within just one or two days of my totally-not-crying deletion of the various wall photos, I became aware that my ex and her friends were no longer getting prioritized in my newsfeed. This was a stark contrast to just a month before, when every damn day my newsfeed was filled with whatever new photos she had posted that day along, often, with the goings-on of her friends (whom I had friended on Facebook when we began dating). Quite the abrupt shift! I double-checked to see if anyone had blocked anyone else. Nope. Should I navigate to any of these profiles directly, I could still click on stuff without any new limitations (pro tip: don't get caught liking a photo from six years ago).

The realization didn't take long to formulate in my head. It seems to me that Facebook detected the pattern of "relationship status change followed by rapid hiding / deletion of photos only featuring two people ... those previously in a relationship" and then quickly, and easily, followed the random forest down to "breakup protocol." To help spare my feelings, it began to algorithmically shadow-censor the new things my ex and her friends were doing (why the friends? Probably just in case my ex popped up in their photos. A likely outcome).

But then I realized something else that really gave me an "oh shit" moment (and, happily enough, made me forget about my ex). Facebook must have hundreds of these kinds of behavioral decision trees. Breakups, divorces, graduations, new births, deaths in the family ... deaths in the family ... wait, what kind of deaths? Old age, cancer, car accident ... suicide.

It then became apparent to me that Facebook likely has a fairly reliable (though probabilistic) means of identifying social media posts that evidence suicidal ideation. Then, thinking back on my own situation, I wondered if there was some sort of correlation between breakups and suicidal ideation (it's my understanding that, yes, there generally is. I think job loss is the other big one.)

So, in 2018, instead of doing normal breakup-related stuff, I'm trying to piece together how accurately Facebook can predict suicide, or drug overdose, or alcoholism, or intent to harm others (I stumbled across a bunch of articles about how cops would try to find ways to infiltrate private Instagram feeds because, apparently, gangs would literally announce their intended targets that way).

And this is the bigger conundrum to me than just the collection of data. If the data available to a company could be used to make these reliable behavioral profiles (and, in fact, it probably is), then to what extent do we want them to take preventative measures for all of these potentially horrible outcomes? But think about what that is: it's corporate-sponsored Minority Report. Hell-the-fuck-no! The level of dystopia that comes with "Hi, we're the cops, Facebook told us to visit you" is off the charts.

There are basically 3-5 types of players of note. The government, large stack tech providers, and data brokers are the most distinct and relevant ones.

The government is extremely capable but also doesn’t usually bother to assemble its data into a full-you, longitudinal picture unless it’s motivated to do so. Theoretically that requires a warrant or a high degree of suspicion but in practice it just requires a casual interest. I think regular citizens worry far too much about this and powerful citizens worry far too little about it.

There are only about 3 players in tech with large “stacks”: Google, Meta, and Amazon/AWS. Second-tier players in terms of exposure or will to track include Apple, ByteDance, and Microsoft. Any other tech company relevant for an American only matters insofar as they integrate their stuff with the final group…

The “data brokers”. These guys assemble pictures of you based on what dregs they can buy from bigger players and on smaller but more comprehensive deals with single or more focused services, and occasionally supplement that with data leaks. Even if the latter is technically illegal, I’m pretty sure they still do it.

It’s important to keep these 4 groups distinguished (there’s a major gap between the top tier of tech and the second tier). The answers and usage of the data differ a lot. To some extent the top tier hold back from their full theoretical power.

I will add that there is probably a fifth group of relevance: ISPs and cell providers. These groups are theoretically high exposure but held back due to regulation or fear of lawsuits. The government teams up with them again in cases of suspicion but otherwise doesn’t usually bother. (Banks might count as a sixth group but AFAIK they are super regulated about what they do so don’t matter)

I’d say that the exact words and recordings usually aren’t a major worry. They receive too much attention if you ask me. Your location is far from granular, but the big picture is likely very knowable even by smaller players. The data brokers are a bit inconsistent but potentially the biggest store of info and also the least regulated. However, that inconsistency also works somewhat in your “favor”, as the knowledge they get is by nature very patchy. You’d be surprised at how much some companies hoard their own data and how reluctant the biggest players are to share the Crown Jewels even if they only sorta use them themselves. Your web activity is pretty patchy because the tech evolves so fast and there’s a major waxing and waning of exposure. Sometimes they can track a ton, and sometimes the noise is strong and it’s hard to assemble patches of data with reliability.

And again, each of the 4 nongovernmental groups gets a different slice of the data, so unless you’re asking specifically about the top tier it really depends.

To take this on a slight tangent - at least for phone/tech companies, they're not keeping nearly enough data about me.

I bought a new flagship Samsung phone this year, billed as having all the AI bells and whistles. It was supposed to work magic with its cloud access, integration with all the built-in apps, on-device processing, and smart assist / suggestion features.

What Samsung AI actually does is sit around offering an inferior version of my SOTA subs (Claude, Gemini, ChatGPT), and I basically never touch any of its features. It's the brand-new-but-already-outdated-car-touchscreen of AI tech. Also, a few times a day it annoys me with an unnecessary pop-up saying "Good afternoon! Here's a random news article based on your location. The current weather is overcast. Have a great day!". I hope to god no inference cycles were wasted generating these turds that wouldn't have passed muster as a feature in 2015, let alone 2025.

I want to be able to sell my soul to the machine. I want it to spy on me every second I use it. I want it to already know that I've been pulling up my topo map every time I have a spare minute, see that I've been looking at such and such an area, know that I usually do hikes of this distance and that elevation gain, and go have a think about that in the background and come back to me with something useful that I would actually want to know, and haven't seen yet - that "there's low cloud forecast for that area on Saturday, just FYI", and "trip report from 2 days ago mentioned an active bear in the area".

And to head off objections: yes, I want it reading my texts. Yes, I want it looking at my photos. Yes, I want it to be my Whispering Earring. "Better for you if you don't hit send on that reply. She'll likely think you're being flippant even though you're being sincere"... and so on.

Obviously not with that kind of sharing enabled by default, but it should be available!

Have to agree with your conclusions, although I come at it from nearly the opposite valence. I want to own my phone on the hardware level, and NOT have it spying on me unless I choose to transmit certain info out.

I have all the extra Samsung AI features disabled on mine, and have yet to hear a single reason to turn them on.

If it is going to be spying on everything, it damn well better be able to figure out how to be a good little servant and satisfy my actual preferences.

This has been my ongoing annoyance with targeted advertising. I should never, ever be exposed to a digital ad that isn't at least somewhat enticing to me, or at least feels relevant to my interests. Yet 99% of the time, I'm simply unimpressed by the offerings that actually get served. Oh, I can see that they're taking educated guesses, they're not completely winging it, but whatever 'consumer profile' or equivalent they've got of me is laughably off base. I could see that a me who was shorter on willpower and maybe 15-20 IQ points lower might be engaged by it.

Full disclosure though, I've also used the Firefox browser the entire time I've been on the internet, and I adblock every website by default, so it is just possible they can't get a good read on me.

After decades of data gathering, they aren't any better at predicting my preferences DESPITE ME BEING VERY CONSCIENTIOUS when feeding my preferences to them!

My end thought is "Look guys, if you want my hard-earned money you have to at least display things that are genuinely appealing at a price point I would be willing to consider. Otherwise, maybe leave me be." I can figure out what I want and how to buy it just fine on my own!

And that's kind of the meta-issue with AI products and their integration. "If you want me to opt-in to your digital surveillance panopticon, SHOW ME HOW IT WILL IMPROVE MY LIFE FROM BASELINE, I don't want parlor tricks and corporate marketing jargon, I want tangible improvements in the metrics that I care about with regards to my life quality. If you can't figure out how to do that, I literally do not trust you to run this system wisely."

EDIT: Although, I am waiting in trepidation/excitement for the day I log into one of my accounts and have a conversation with the AI and it becomes clear that the robot has me 100% pegged, it knows precisely what I want and it can offer a plausible plan on how to get those things/give them to me, and demonstrated capacity to assist in that goal. Then, I like to think that I'll have the willpower to put it down and think things over, and try to maintain enough sense of self that I do not just immediately empty my wallet and tell it to do whatever it takes to make my dreams come true.

Google shows you its location tracking in the Google Maps timeline.

Apps that respond to voice commands ("Hey Alexa", "OK Google", etc.) must record all sounds in order to pick out the command. That doesn't mean they archive it all, but it's all getting recorded and processed. They certainly have all your phone call metadata (who you called and for how long). Your browser has site history, which is generally sold widely.

I would operate under the expectation that all of those except phone transcripts are available to anyone who wants to buy them.

How difficult would it be to set up a company that actually buys the data on either a major country, anonymized, or specific smaller groups or individuals, not anonymized? What are the rules for who they are allowed to sell to, and in what form?

If you’re in California or a small handful of other states there are a few rules, but otherwise it’s the Wild West. GDPR is the only real game in town in the EU. Your biggest challenges are cost and access (getting someone to agree to sell to you is harder than closing the sale). It also really depends on what you mean by major country.

If you mean like, could China buy data on Americans from sketchy brokers and assemble it themselves? Yes almost certainly and they probably have. There is essentially no mechanism preventing them from doing so either. However, as I noted in my comment above, real-time and granular data from the biggest primary players is usually kept strictly in-house. They are also almost certainly keeping their capabilities in their pocket in case of major conflict.

In fact the CFPB was thinking about putting in nominal sale restrictions (really basic stuff) but the Trump admin tanked those plans. It’s my understanding that there are a few Executive Orders that attempt to fill the gap (e.g. prevent sale to China or Iran or Russia etc.) but it’s unknown how much teeth or enforcement consistency they have (my guess: very little).

I have no idea why I added the word "major" to country. I was just trying to ask about the differences between anonymized large data sets, and smaller but far more specific data sets.

I guess I'm trying to find out... how open to abuse is this system? Is it simple for a somewhat resourced person or group to just set up a company and acquire sensitive data on business rivals or personal enemies or whatever else?