[HN Gopher] Billion-record stolen Chinese database for sale on b...
       ___________________________________________________________________
        
       Billion-record stolen Chinese database for sale on breach forum
        
       Author : ellen364
       Score  : 258 points
       Date   : 2022-07-05 10:21 UTC (12 hours ago)
        
 (HTM) web link (www.theregister.com)
 (TXT) w3m dump (www.theregister.com)
        
       | bell-cot wrote:
       | Kinda interesting that _The Register_ does not even speculate
       | about steps which China's higher-level security services might
       | take in response, to "memorably demonstrate their displeasure" at
       | the theft. (A certain cynical attitude is usually part of _The
       | Register_'s stock-in-trade.)
        
       | spoonfeeder006 wrote:
       | This makes me really sad for all those people, especially the
       | people advertised on the sample
        
       | FollowingTheDao wrote:
        
         | pedro2 wrote:
         | And not receive those sweet dollars?
         | 
         | I am sorry sir, I will not.
        
         | noirbot wrote:
         | Governments have been collecting (and poorly securing) this
         | sort of information and more for most of recorded history. It's
         | not to say that I like it, or would work for somewhere like
         | Meta or the like, but plenty of these major data leaks have
         | been from places that used to collect and store physical
         | databases of this stuff since before most of us were alive.
         | 
         | I'm talking calmly about this because people have been
         | screaming in my ear about it for 20 years, and I listened. And
         | then I lived my life around the fact that this was going to be
         | happening whether you scream yourself hoarse or not, at least
         | for now.
        
           | FollowingTheDao wrote:
        
         | Agamus wrote:
         | Five years! I've been screaming that for at least 15 years, and
         | I'm pretty sure I'm a noob to the discussion.
        
           | FollowingTheDao wrote:
           | I am with you, I was just minimizing.
        
       | r721 wrote:
       | Karen Hao (WSJ): "I downloaded the sample the hacker provided and
       | called dozens of people listed. Nine picked up & confirmed
       | exactly what the data said."
       | 
       | https://twitter.com/_KarenHao/status/1543949945614393344 (thread)
        
         | guywithahat wrote:
         | That WSJ article is so much better than the posted one, I mean
         | what even is "the register"
        
           | imron wrote:
           | The home of snarky IT journalism since the first dotcom boom.
        
         | neonate wrote:
         | The WSJ article: https://www.wsj.com/articles/vast-cache-of-
         | chinese-police-fi...
         | 
         | https://archive.ph/02v3p
        
         | twicetwice wrote:
         | nitter link, since Twitter put up what seems to be a timed
         | login gate when I was halfway through reading the thread:
         | https://nitter.net/_KarenHao/status/1543949945614393344
        
           | hackernewds wrote:
           | The app download nags on mobile web are so unbearable I
           | stopped using Twitter entirely
        
             | l33tman wrote:
             | Same with reddit on a mobile browser... it actually shuts
             | you out, and says (after a couple of clicks) that they have
             | locked you out "for your protection" as the content is
             | "unverified", and that you need to use their app..
        
             | BbzzbB wrote:
             | I made a webapp home icon from my Firefox and picked out
             | the app-bait popover with uBlock.
             | 
             | Basically just about every app (YouTube, Reddit, Facebook,
             | ...) is better this way. I.e., no ads, erase-able elements,
             | less spyware, defaults to no notification and sometimes
             | even gets better functionality. For instance, it (browsers)
             | gets rid of "hearts" in Duolingo for whatever damn reason,
             | so you can practice however much you'd like in a day.
             | 
             | The downsides I've found are that you seemingly can't
             | Chrome-cast from it, and it often creates new tabs instead
             | of reusing existing ones or making its own app-instance,
             | so you gotta close all tabs every so often.
        
           | black_puppydog wrote:
           | Nitter is the only sane way to read twitter nowadays. Even if
           | I still had an account it would be better for reading.
        
             | moneywoes wrote:
             | I keep getting timeouts from them interestingly
        
       | dQw4w9WgXcQ wrote:
       | Excellent, a fair trade for all the TikTok data Hoover-ing
       | they've been doing on US citizens.
        
       | neallindsay wrote:
       | This has to be the largest leak of personal information yet,
       | right?
        
         | nicce wrote:
         | Facebook leaked much more a couple of years ago. Somehow
         | everyone has forgotten that.
         | 
         | Some example news: https://www.privacyaffairs.com/facebook-
         | data-sold-on-hacker-...
        
           | jsnell wrote:
           | That was not a data leak. It was a compilation of scraped,
           | publicly available data.
        
             | hansel_der wrote:
             | private data was publicized without consent, a leak indeed.
        
         | O__________O wrote:
         | A lot of the press is saying it is, but unclear since "entries"
         | is as vague as the "records" in this 1.2 billion leak:
         | 
         | https://www.wired.com/story/billion-records-exposed-online/
         | 
         | Appears this leak is a single dataset -- one I linked to is
         | multiple datasets.
        
       | mvdwoord wrote:
       | What do we do now?
       | 
       | It seems the majority of people on the planet now have had some
       | of their data leaked. Or are becoming ever more entangled with
       | government and corporate systems which control and peddle their
       | information as they see fit.
       | 
       | Is it ultimately a big nothing burger, or is this some
       | singularity we are passing through?
        
         | gonzo41 wrote:
         | Covid is a good excuse to wear a mask, and pair it with a set
         | of mirror sun glasses in public. Maybe that's how we live now.
        
           | thomassmith65 wrote:
           | We should probably consider a person's voice-print, too. To
           | be safe, you need a mask with a real-time voice changer.
        
         | nonrandomstring wrote:
         | > What do we do now?
         | 
         | I was thinking - if I had this, what could I do with the
         | personal records of a billion Chinese people?
         | 
         | And I must conclude - absolutely nothing. It's of no interest
         | to me.
         | 
         | Now, I probably lack sufficient criminal imagination, but the
         | point is stuff like this is hard to fence because there's a
         | very small market of buyers. In an article I wrote for
         | Routledge about the markets for stolen digital data
         | (specifically movie and album releases) I suggested that the
         | underlying problem is there's symbiosis between leakers and
         | buyers.
         | 
         | If you want to do anything, _target the buyers_. There are
         | fewer of them. Don't try to secure inherently insecure massively
         | centralised systems (Blotto + Dolev-Yao problem). Or chase
         | leakers. Or blame users. Or fire the CIO. Find out who _wants_
         | this stuff and take down the show from the demand-side.
         | 
         | But hold on! Guess who the buyers are. And guess what sincere
         | will exists within "law enforcement" to tackle this sort of
         | "cybercrime".
        
           | vbezhenar wrote:
           | After my data was leaked, now scammers periodically call my
           | phone to let me know that "I'm from bank security and
           | someone's recently tried to change phone number for your bank
           | account" or "I'm from police and we're opening a criminal
           | case against you". It was fun the first few times, but now I'm
           | considering changing my phone number because I could miss an
           | actual bank security call.
           | 
           | And I'm sure that plenty of gullible people were scammed and
           | lost their money because of those leaks. When someone calls
           | you, knows your full name and talks with enough confidence,
           | it causes some trust.
        
           | rz2k wrote:
           | I suppose you could go the other direction. You could be an
           | international human rights organization, and treat the
           | database like a billion claim checks.
           | 
           | Having a definitive record of people's existence would make
           | it more difficult for the authorities to skimp on natural
           | disaster rescue efforts then lie about casualty numbers,
           | treat citizens as cannon fodder for military purposes, or
           | simply wipe out individuals who have grievances with the
           | government or powerful functionaries.
        
           | dc-programmer wrote:
           | This type of information is used all of the time to discover
           | and compromise web accounts of the victims in bulk. There are
           | scripts that take in this data as input and will do a lot of
           | the work for you to take over their accounts (or at least
           | find their active accounts across web). Any additional data
           | you are able to trawl can be sold itself, leaving the next
           | steps to more advanced or motivated threat actors.
           | 
           | It's also useful for more targeted social engineering
           | attacks.
        
         | gfd wrote:
         | The previous big case I remember was the LinkedIn leak with 700M
         | users: https://news.ycombinator.com/item?id=27674393
         | 
         | At this point I've basically accepted that all my info will be
         | found on sites like fastpeoplesearch.com and that anything I
         | tell any company (or I guess in this case, govt too) will
         | eventually be leaked, correlated, and used against me.
        
           | ge96 wrote:
           | Wow that's bigger than Equifax
        
             | AnimalMuppet wrote:
             | LinkedIn doesn't have my Social Security number. It doesn't
             | have a list of my bank accounts and credit cards. So, more
             | people, but less damaging information.
        
             | hackernewds wrote:
             | Another nothingburger since these companies still exist.
             | and profitably
        
             | scandinavian wrote:
             | The linkedin "leak" was just a scrape of public data.
        
               | moneywoes wrote:
               | Is there any word on how they managed to avoid LinkedIn's
               | relentless rate limiting? For example, my account gets
               | rate limited for normal browsing.
        
               | nikcub wrote:
               | Likely hacked/purchased browser extensions
        
           | the_biot wrote:
           | What's fastpeoplesearch.com? Some search engine for leaked
           | credentials? (it appears to be geoblocked in Europe)
        
             | baud147258 wrote:
             | I was able to connect from France. It's for people living
             | in the US; it looks like you can search for people and get
             | aggregated information scraped from god knows where. I
             | checked a few (not really famous) people I knew of and it
             | seems they have some accurate information.
        
         | pyinstallwoes wrote:
         | In history what have databases of people and state actor
         | interests usually led to if any events are similar?
        
           | MadsRC wrote:
           | IIRC when Nazi Germany invaded Denmark in 1940, one of the
           | first things the SS did was to send representatives to the
           | local churches.
           | 
           | In Denmark, every child was (I'm not sure if they still are
           | actually?) registered at birth by the local parish in so
           | called "church books".
           | 
           | With these "databases" in hand, the SS had a neat list of all
           | names, and the approximate location of people's homes.
           | 
           | Those lists were used to identify and persecute Jews.
        
             | ricochet11 wrote:
             | and ibm made machines to help do this as quickly as
             | possible.
        
             | black_puppydog wrote:
             | There were also the "pink lists" tracking gay men [1] (link
             | to German Wiki sorry) and which the nazis also greatly
             | appreciated. Although to be fair^blunt they were collected
             | exactly for reasons of prosecution, so not that far off
             | from their use by the nazis.
             | 
             | [1] https://de.m.wikipedia.org/wiki/Rosa_Liste
        
             | Natfan wrote:
             | "Fun" fact: It was IBM who helped tabulate data from the
             | 1933 national census, which was then used to identify
             | hundreds of thousands more Jews than would have been found
             | by the Nazi party without their efforts.
             | 
             | "Machine-tabulated census data greatly expanded the
             | estimated number of Jews in Germany by identifying
             | individuals with only one or a few Jewish ancestors.
             | Previous estimates of 400,000 to 600,000 were abandoned for
             | a new estimate of 2 million Jews."
             | 
             | [0]: https://en.wikipedia.org/wiki/IBM_and_the_Holocaust
             | 
             | [1]: https://en.wikipedia.org/wiki/History_of_IBM
             | 
             | [2]: https://en.wikipedia.org/wiki/IBM_and_World_War_II
        
               | chasd00 wrote:
               | Did working with IBM contribute to Hitler's spiral into
               | insanity? 4/5 joking
        
               | jsiaajdsdaa wrote:
               | Hey Siri, select * from all_humans where
               | atLeastOneOverlap(schools_attended, art_schools) = true
               | and atLeastOneOverlap(employers, list.of(ibm)) = true;
        
               | mvdwoord wrote:
               | And to add insult to injury, the IBM office in Munich
               | (birthplace of national socialism), is located on 1
               | Hollerithstrasse (Hollerith street).
               | 
               | The IBM subsidiary in Nazi Germany selling and
               | maintaining the tabulating machines was DeHoMag, Deutsche
               | Hollerith Maschinen AG.
               | 
               | ...
        
               | daniel-cussen wrote:
               | That's just the name of the founder, Herman Hollerith. He
               | had nothing to do with any of that.
        
               | TedDoesntTalk wrote:
               | nit: the founder of IBM was Tom Watson Senior, not Herman
               | Hollerith. But your point stands -- Hollerith had nothing
               | to do with this.
        
             | t_mann wrote:
             | _Church_ books were used to find Jews? Do you have a source
             | for that?
        
               | samus wrote:
               | Antisemitism was not really about religion. Many Jews had
               | actually converted to Christianity for generations. The
               | Nazis still considered them to be Jews.
        
               | daniel-cussen wrote:
               | Ahh...well there is the famous saying, "I decide who is a
               | Jew." It was used on the head of the German Manhattan
               | Project and a Jewish head (like a headmaster some shit)
               | of a concentration camp, forget which one. And that's why
               | we say "German Manhattan Project" stedda "Americaner
               | Atomwaffenunternehmen" (I made that word up, it is
               | correct in German to make words up, that means atom
               | weapon undertaking), because German antisemitism amounted
               | to forfeiting the bomb.
               | 
               | That was the price, the defeat of their last hope against
               | the Allies. All of the Great Jews that slapped those
               | firecrackers together were exiled due to antisemitism:
               | Fermi, Szilard, Einstein (to get the president to read
               | the letter to get the Los Alamos show on the road in the
               | first place, get Roosevelt to read top to bottom left to
               | right, no easy task), von Neumann (spesh because of his
               | schizophrenia, no concentration camp for him, he would
               | have been experimented on to then do that same sin to
               | everybody in the camps, Schizophrenic Jews were at the
               | absolute bottom of the Nazi world order).
               | 
               | I just posted about this.
               | https://news.ycombinator.com/item?id=31990431
               | 
               | Fermi was originally a fascist, it basically made sense
               | to him as a way of organizing a country.
               | 
               | Only non-Jew in the top desks of Los Alamos. Why? Only
               | when the racial laws targeted his Jewish wife and children
               | did he pack his shit and leave for America.
               | 
               | And Fermi was packing heat.
        
               | TedDoesntTalk wrote:
               | You forgot some other Jewish scientists who emigrated to
               | America because of Nazism, some of whom earned the Nobel
               | and many of whom worked on the Manhattan Project
               | 
               | Hans Bethe James Franck Edward Teller Rudolf Peierls
               | Klaus Fuchs Otto Loewi Max Bergmann Dieter Gruen Lilli
               | Hornig
               | 
               | I also forgot many in this list.
        
               | rejectfinite wrote:
               | They were like the tax office before the tax office.
               | 
               | Same in Sweden.
        
               | meepmorp wrote:
               | > Church books were used to find Jews?
               | 
               | If you know who to rule out, you have a smaller pool of
               | people to go after.
        
               | TazeTSchnitzel wrote:
               | It's not a religious thing: in Denmark, the church is the
               | arm of the state tasked with civil registration. Until
               | 1991 it was the same in Sweden.
        
             | mgdlbp wrote:
             | IIRC there was a central registry of religion in the
             | Netherlands that had the same effect. Can't find anything
             | on that now, though (it's mentioned in Wikipedia in an
             | unsourced paragraph; I think I first read about it on HN,
             | actually).
             | 
             | -----
             | 
             | Tangent: the info pages on the Anne Frank House site have
             | sections cycling through different pastel background
             | colours.[0] I've wondered before whether something like
             | that would help the brain acquire context in a long page, making
             | comprehension more like that of a physical book. Seeing it
             | implemented, it doesn't seem to help. I think being able to
             | easily flip to a previous page and back was one of the
             | advantages of printed paper, so maybe a sticky TOC with the
             | same colours or a minimap scrollbar would allow that?
             | Actually, why not have that standard in browsers?
             | 
             | Hmm, the concept of coloured sections was known in 2013
             | already.[1]
             | 
             | [0] https://www.annefrank.org/en/anne-frank/go-in-
             | depth/netherla...
             | 
             | [1] https://ux.stackexchange.com/questions/62808/website-
             | layout-...
        
               | jacquesm wrote:
               | > IIRC there was a central registry of religion in the
               | Netherlands that had the same effect.
               | 
               | > I think I first read about it on HN, actually
               | 
               | That may have been my article:
               | 
               | https://jacquesmattheij.com/if-you-have-nothing-to-hide/
        
             | pessimizer wrote:
             | These days you'd just go to a data broker, who would also
             | tell you what toothpaste they preferred and whether they
             | managed to finish bingewatching The Sopranos.
        
           | shapefrog wrote:
           | Spam and phishing calls.
        
           | googlryas wrote:
           | Not quite the same, but the US used census records that were
           | supposed to be protected to round up the west coast Japanese
           | for their internment during WWII.
        
             | AnimalMuppet wrote:
             | They were "protected". That is, they didn't leak out of the
             | government into private hands. But that still turned out
             | pretty badly.
             | 
             | In fact, information in the government's hands is the most
             | dangerous, because they have more power than anyone else to
             | use it against you.
             | 
             | (On the other hand, as others have said about Denmark and
             | Netherlands, data that was not in government hands _became_
             | in government hands, and was used against people. So it's
             | not "safer" if it's in private hands, except to the degree
             | that the government has to go through the extra step of
             | getting it.)
        
           | mvdwoord wrote:
           | I would say, impossible to compare. Digital changes the cost
           | of acting upon this information, for good or bad purposes.
           | 
           | Obvious comparisons to e.g. the Netherlands' famous over-
           | registering of religion and how the Nazis abused that. But I
           | feel this is long term potentially worse than that. Not in
           | the level of horribleness, but in the effect on society
           | moving forward.
        
             | pyinstallwoes wrote:
             | Can you extrapolate that on what the effect on society
             | looks like in your assessment?
        
         | boomskats wrote:
         | It is both. It is huge; I'd say it's absolutely the latter. But
         | I can't think of a single thing anyone can do about any of it
         | at this point, which also makes it the former.
        
           | derwiki wrote:
           | One thing I've thought about doing is using CCPA to have
           | companies delete all my data, hopefully before it leaks.
        
             | ev1 wrote:
             | At several places I've seen they keep certain data such as
             | phone, address, etc as a bullshit "business need" to
             | "prevent abuse" and "prevent promo reuse" and keep forever
             | even through CCPA.
             | 
             | Also they keep the record of the delete request, which
             | contains the PII you ask to remove.
        
         | swader999 wrote:
         | I just change my name every few years. Makes the job hunt
         | difficult but I like a challenge.
        
         | thriftwy wrote:
         | A lot of data may be made public to equalize, similarly to how
         | real estate property rights or car registries may be public.
        
           | mvdwoord wrote:
           | I would counter that, although it could, some groups will be
           | able to evade it, effectively maintaining their
           | advantage/power. Effectively averaging out the position of
           | middle and lower classes, and lowering their chances of
           | moving up the social ladder?
        
             | thriftwy wrote:
             | I'm not sure it would give such a large advantage compared
             | to the cost of hiding
        
         | stjohnswarts wrote:
         | All you can do (in the USA) is freeze your credit and sign up
         | for one of the free (or paid) credit monitoring services. That
         | only protects you from financial ruin though. Not sure about
         | people using your credentials to commit fraud, fake birth
         | certificates, etc.
        
         | carapace wrote:
         | > What do we do now?
         | 
         | Well, if you look at (global) society as a dynamical system it
         | seems to me that there are two stable basins or attractors,
         | call them "Star Trek" and "North Korea".
         | 
         | In the "Star Trek" future the people in charge are themselves
         | also subject to the panopticon, and the world is ruled fairly
         | and humanely. (The other name I use for this is the "Tyranny of
         | Mrs. Grundy".)
         | 
         | In the "North Korea" future there are (human or AI or hybrid)
         | masters and brain-chipped cyborg slaves, and rule is absolute
         | and enforced with digital precision.
         | 
         | (Of course, this is all predicated on the idea that we can't
         | put the genie back in the bottle in re: ubiquitous
         | surveillance. I think that's likely the case (although I do not
         | like it) but I'm not going to make the argument here unless
         | someone asks.)
         | 
         | Given the above the thing to do is work to make politicians
         | subject to 24/7 total surveillance (ASAP, before everybody
         | else) so we can keep an eye on them. This policy would also
         | presumably weed out the crazies and corrupt, eh?
        
           | swader999 wrote:
           | And CEO's - everyone!
        
           | lagrange77 wrote:
           | > Well, if you look at (global) society as a dynamical system
           | it seems to me that there are two stable basins or
           | attractors, call them "Star Trek" and "North Korea".
           | 
           | Nice analogy. Do you really believe that our being on a
           | utopian trajectory is realistic?
        
         | cm2012 wrote:
         | Well, leak can mean a lot of things.
         | 
         | The standard "leak" of names and addresses of people is totally
         | meaningless, though HN "privacy" obsessives blow it out of
         | proportion all the time. It's basically public information; we
         | used to have everyone in phone books in the US and almost no one
         | cared.
         | 
         | Cell phone number is a riskier one because of the opportunity
         | for 2FA hacks. It's not hard to get people's cell phone numbers
         | as it is (you can buy direct marketing lists for pennies per
         | person in the US) but it's not good to make it easy for hackers.
         | 
         | However this leak in particular appears to go much deeper so it
         | is insidious. Police records are named and who knows what else.
         | That is a genuine privacy issue and sucks for those involved.
        
           | maxbond wrote:
           | Names and addresses can absolutely be used to stalk and
           | harass people, and there are password reset flows that
           | involve physically mailing secrets to people. Perhaps almost
           | no one cared about phone books, but if you thought about the
           | differences between phone books and a website for a moment,
           | you'd see that these are different technologies that have
           | different implications, and that it is entirely reasonable
           | for people to have a different reaction.
           | 
           | You've chosen some arbitrary amount of information where you
           | begin to care and become interested, and decided everyone
           | with a different cutoff is an absolutist you don't need to
           | listen to. But it's really just that your situation permits
           | you to leak that information without fear, and you haven't
           | deigned to imagine that other people are in a different
           | situation.
           | 
           | I'd encourage you to rethink this perspective.
        
             | charcircuit wrote:
             | Names and addresses are already public information in the
             | US. It's not that big of a deal.
        
       | himinlomax wrote:
       | This is interesting, this could be a major blow to the Chinese
       | dictatorship.
        
         | upupandup wrote:
         | I don't think so. Chinese citizens seem unable to fight back
         | against the military. They have no access to guns, or to mass
         | riots that would break the CCP's will.
         | 
         | just look at north korea and cuba if you want to get a sense
         | for how long these regimes last. USSR was an exception.
        
         | hansel_der wrote:
         | why?
        
           | nonethewiser wrote:
           | I am guessing he means that it highlights the incompetence or
           | even just the consequences of centralizing power.
           | 
           | Personally I don't expect this to bear true. Historically in
           | China, government failures have been cited as evidence for
           | further centralizing the power of the federal government. And
           | this argument is bought hook-line-and-sinker by the people. I
           | don't think that will change until there is serious economic
           | hardship.
        
       | throwaway4good wrote:
       | Who would buy this?
       | 
       | How could anyone possibly make money off this data set?
       | 
       | I could understand if the Chinese government would pay for it to
       | avoid embarrassment but making the sale public kinda voids that.
        
         | pessimizer wrote:
         | The US government might buy it to help them find good
         | candidates to recruit as spies and saboteurs, or to note if
         | current spies and saboteurs are under suspicion or have been
         | discovered.
        
           | AustinDev wrote:
           | If the records are digital and non-air-gapped in any system
           | of any country, you can assume that the US government has
           | access to those records already. The exceptions to this
           | assumption are exceedingly rare.
        
             | alchemist1e9 wrote:
             | As a US citizen I want to believe bravado like this but I'm
             | guessing this is just your fantasy world talking not actual
             | knowledge of the government being competent, which in my
             | personal experience seems extremely unlikely.
        
         | upupandup wrote:
         | making money is not the motive for some. this database will be
         | very useful going forward. imagine the leverage you could have
         | over business dealings.
         | 
         | some guys at the top of the game are probably already doing
         | this and have figured out how to both insulate themselves and
         | launder/hide data they hoard.
        
         | [deleted]
        
         | SoylentYellow wrote:
         | China has foreign call scams just like the US.
        
         | hutzlibu wrote:
         | "Who would buy this?"
         | 
         | Foreign intelligence agencies for classic espionage. If you
         | want to do blackmailing in china, such a DB would be a good
         | start.
         | 
         | Otherwise, data brokers. Advertisement, financial credibility,
         | trustworthiness of business partners etc.
        
           | hansel_der wrote:
           | rest assured that intelligence agencies have means of
           | accessing police records in other nations.
           | 
           | this data is only interesting to the low end of data brokers,
           | advertisers and other scammers, hence the rather low price.
        
           | throwaway4good wrote:
           | I don't know how it works in China but where I am a person's
           | criminal record is not public but not exactly private either.
           | In the sense that an employer can ask for your criminal
           | record and you have the choice giving a printout of it or not
           | having your job. Making it kind of hard to see how the
           | knowledge of a criminal record could be used to blackmail
           | someone.
           | 
           | As for "data brokers. Advertisement, financial credibility,
           | trustworthiness of business partners etc.". Maybe. But these
           | companies would turn themselves into criminals by using or
           | purchasing this information.
        
             | hutzlibu wrote:
             | It is likely, that this DB contains more information, than
             | what a formal printout gives.
             | 
             | "But these companies would turn themselves into criminals
             | by using or purchasing this information."
             | 
             | Which is why they probably would not deal with the
             | information gathering directly, but use a service of a data
             | analyst company. When they do something illegal, nobody who
             | contracted them ever knew anything. I think this game
             | is played in china as well.
        
       | tpaksoy wrote:
       | Apparently there was a "blogpost" of a developer showing off their
       | code, where they accidentally leaked access tokens in a piece of
       | commented code: https://archive.ph/mP3bh
       | 
       | This is completely unverified though, so take it with a grain of
       | salt.
        
         | thrdbndndn wrote:
         | The consensus in the Chinese community is that while this is
         | likely how the token got leaked, this alone isn't enough. To
         | access a private Alibaba Cloud instance you can't just use some
         | random IP. It's isolated from the Internet in a certain way.
        
         | bilekas wrote:
         | It's incredibly disappointing actually how often this happens.
         | 
         | I can't count the amount of SO questions I've had to edit from
         | others posting live API Keys for everything from custom
         | services to AWS.
        
           | TecoAndJix wrote:
           | I wonder if you could make a luhn-like check that would
           | require an additional approval step to post if it comes back
           | positive. Something like "It looks like you may be posting a
           | secret *****. Do you wish to continue?"
        
             | jewel wrote:
             | If vendors agreed to a common prefix on all secret key
             | values then it'd be easy for everyone to add checks, to
             | everything. Something like "_SECRET88_".
             | 
             | Of course, then your secret key checker would need to build
             | that string by concatenating so that it wouldn't set off
             | itself.
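
A minimal sketch of the shared-prefix idea above, in Python. The prefix `_SECRET88_` is the hypothetical value from the comment (no vendor standard like this exists), and `find_prefixed_secrets` is an illustrative name. The prefix is built by concatenation so the checker's own source does not trigger the check.

```python
import re

# Hypothetical shared prefix taken from the comment; not a real vendor standard.
# Assembled from two pieces so this file never contains the literal prefix.
SECRET_PREFIX = "_SECRET" + "88_"

# Match the prefix followed by a run of typical token characters.
_PATTERN = re.compile(re.escape(SECRET_PREFIX) + r"[A-Za-z0-9_\-]+")


def find_prefixed_secrets(text: str) -> list[str]:
    """Return every token in `text` that starts with the shared prefix."""
    return _PATTERN.findall(text)
```

A posting form could call `find_prefixed_secrets` on the draft and ask for confirmation whenever the list is non-empty.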
        
               | pitched wrote:
               | How about scanning for any string with high entropy?
               | Might be easier to get buy-in if we don't all have to
               | bike-shed over what the prefix is.
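
One way the entropy scan could look, as a rough Python sketch. It uses Shannon entropy over character frequencies. The length and entropy thresholds are assumptions picked for illustration, not tuned values, and a filter like this will also flag harmless strings such as base64 blobs.

```python
import math
import re
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, estimated from character frequencies."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Assumed thresholds for illustration only.
MIN_LENGTH = 20
MIN_ENTROPY = 4.0

# Candidate tokens: long runs of characters commonly found in keys.
_CANDIDATE = re.compile(r"[A-Za-z0-9+/=_\-]{%d,}" % MIN_LENGTH)


def high_entropy_strings(text: str) -> list[str]:
    """Return candidate tokens whose per-character entropy meets the threshold."""
    return [t for t in _CANDIDATE.findall(text) if shannon_entropy(t) >= MIN_ENTROPY]
```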
        
               | CoffeeOnWrite wrote:
               | That's helpful but the token prefixes are also helpful.
               | You might be interested in GitHub's reasoning at
               | https://github.blog/2021-04-05-behind-githubs-new-
               | authentica...
        
               | zricethezav wrote:
               | More and more providers have been adding unique prefixes
               | to their tokens and access keys which makes detection
               | much easier. Ex, GitLab adds `glpat-` to their PAT.
               | 
               | A project I maintain, Gitleaks, can easily detect
               | "unique" secrets and does a pretty good job at detecting
               | "generic" secrets too. In this case, the generic gitleaks
               | rule would have caught the secrets [1]. You can see the
               | full rule definition here [2] and how the rule is
               | constructed here [3].
               | 
               | [1] https://regex101.com/r/CLg9TK/1
               | 
               | [2] https://github.com/zricethezav/gitleaks/blob/master/c
               | onfig/g...
               | 
               | [3] https://github.com/zricethezav/gitleaks/blob/master/c
               | md/gene...
        
             | bilekas wrote:
             | I was thinking about that too, but it's actually tricky,
             | even the example given, they use the var `accessId` but you
             | could filter for all that, even the standard ones, but you
             | couldn't have enough confidence in it so that if someone
             | did post with a typo or even a random var name, they would
             | think "Okay, no warning so must be okay".
             | 
             | Something like giving false confidence to the user. Not the
             | best idea.
        
           | swimfar wrote:
           | When you do this is there a way to completely get rid of the
           | information? Usually you can go back and look at the edit
           | history to see the original post.
        
             | aembleton wrote:
             | Change the keys.
        
             | capableweb wrote:
             | Wouldn't matter. Tons of bots are scraping every inch of
             | the internet all the time, and if something has been online for
             | five seconds, it has been cached/stored somewhere. Always
             | assume that anything you've put up on the internet, can
             | forever be accessed _by someone_.
             | 
             | The only thing you can do is rotate the token/secret.
        
               | teddyh wrote:
               | http://www.threepanelsoul.com/comic/on-that-guy
        
             | bilekas wrote:
             | Yeah mods can clear the review history - for this very
             | reason!
             | 
             | But as mentioned below - Still advised to change your keys
             | for obvious reasons
        
         | truthwhisperer wrote:
         | poor developer. He may spend this life at a "re-education camp"
        
         | haasted wrote:
         | Binance CEO confirmed this version:
         | https://twitter.com/cz_binance/status/1543905416748359680
        
           | throwaway787544 wrote:
           | Starting today, this will be known as "Shanghai'd
           | credentials" and be reason #1 why we use ephemeral
           | credentials (e.g. AWS STS/SSO) rather than static credentials
           | (e.g. IAM Users)
        
             | throwaway2037 wrote:
             | I never heard about "ephemeral credentials" before your
             | post. I have some Googling to do!
        
               | krageon wrote:
               | It's essentially an access token with a very short expiry
               | time.
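
To make "an access token with a very short expiry time" concrete, here is a generic Python sketch using only the standard library. It is not how AWS STS or Vault work internally; `issue_token`, `verify_token`, and the token layout are made up for illustration. Note that the signing key is itself a long-lived secret that has to live somewhere safe.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(key: bytes, subject: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Create a signed token that stops verifying after ttl_seconds."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": subject, "exp": now + ttl_seconds}).encode()
    sig = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(sig).decode())


def verify_token(key: bytes, token: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the subject if the signature is valid and the token is unexpired."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong number of parts or bad base64
        return None
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(payload)
    if now >= claims["exp"]:
        return None  # expired: a leaked copy is useless after this point
    return claims["sub"]
```

A token leaked in a blog post is only dangerous for the remaining lifetime, which is the point of keeping `ttl_seconds` small.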
        
               | toomuchtodo wrote:
               | The other term of art is "dynamic secrets."
               | 
               | https://www.vaultproject.io/use-cases/dynamic-secrets
        
               | 0des wrote:
               | Good lookin out, thanks for the link
        
             | stefan_ wrote:
             | This is not at all the takeaway from this. It's "this
             | shitty developer should not have had access to this data in
             | the first place". With a nuance of "this database probably
             | shouldn't exist in this form in one place to begin with".
        
             | babelfish wrote:
             | Let's not. After the whole "China Virus" shit propagated by
             | the right, I'd prefer if we tried not to associate
             | vulnerabilities with specific people.
        
               | xfitm3 wrote:
               | I don't believe this comment is made in good faith, there
               | is nothing wrong with the "right" and it's senselessly
               | adding fuel to our political division.
        
               | malcolmgreaves wrote:
               | There is something deeply wrong with the authoritarian
               | politics of the right and its casual use of racism to
               | further political control.
               | 
               | > it's senselessly adding fuel to our political division.
               | 
               | This comment, whether you realize it or not, is coming
               | from a place of extreme social privilege.
               | 
               | Remember that for the majority of people, politics is not
               | a game. It is serious. People lose their rights to live
               | the life they want all the time. Sometimes those politics
               | turn violent and people lose everything.
        
               | markdown wrote:
               | It's not a new word.
               | 
               | https://dictionary.cambridge.org/dictionary/english/shang
               | hai...
               | 
               | https://www.urbandictionary.com/define.php?term=Shanghaie
               | d
        
               | malcolmgreaves wrote:
               | That's not an argument for continuing to use a word.
        
               | markdown wrote:
               | It is if the argument to stop using it is some irrelevant
               | point about some other location-based word that was used
               | negatively only recently.
               | 
               | Something got shanghaied isn't a pejorative in the way
               | that Trump acolytes use "China virus".
        
               | malcolmgreaves wrote:
               | > irrelevant point about some other location-based word
               | that was used negatively only recently.
               | 
               | Are you unaware of the Chinese Exclusion Act of 1882 --
               | which is exactly around the time that this term was
               | popular and in common use?
        
               | markdown wrote:
               | The correlation is coincidental. It has nothing to do
               | with that. https://en.wikipedia.org/wiki/Shanghaiing
        
               | bequanna wrote:
        
             | compumike wrote:
             | Doesn't the client still need to know a long-lived secret
             | (or a long-lived refresh token) in order to generate the
             | ephemeral credentials?
        
               | toomuchtodo wrote:
               | It can either use a secret injected into an env var to
               | bootstrap rotating ephemeral/refresh tokens or use a role
               | provided by the environment (which can also provide short
               | lived tokens), depending on your runtime environment and
               | use case (on prem, cloud, k8s, etc).
               | 
               | Static, long lived secrets with limited governance that
               | have no conditional access guards are weapons of mass
               | self destruction.
        
               | robonerd wrote:
               | Keeping secrets in environmental variables has always
               | seemed dodgy to me. Unless specifically cleared, they get
               | inherited by all child processes. Maybe there are never
               | any child processes in your application, or that could be
               | desired behavior in some circumstances, but generally it
               | seems like asking for trouble.
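
The inheritance concern above can be shown with a short Python sketch. `DEMO_API_TOKEN` is a hypothetical variable name and `child_sees` is an illustrative helper. By default a child process gets a copy of the parent's environment, and passing an explicit `env` without the secret is one way to avoid that.

```python
import os
import subprocess
import sys

# Hypothetical variable name, used only for this demonstration.
SECRET_VAR = "DEMO_API_TOKEN"


def child_sees(env=None) -> str:
    """Run a child Python process and return the value it sees for SECRET_VAR."""
    code = f"import os; print(os.environ.get({SECRET_VAR!r}, '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,  # None means: inherit the parent's environment
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


os.environ[SECRET_VAR] = "not-a-real-token"

# Default behaviour: the child inherits the secret.
inherited = child_sees()

# Explicitly scrubbed environment: the child no longer sees it.
scrubbed_env = {k: v for k, v in os.environ.items() if k != SECRET_VAR}
scrubbed = child_sees(env=scrubbed_env)
```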
        
               | RajT88 wrote:
               | There's also the reverse issue - if they change after
               | your process is started.
               | 
               | Refreshing an environment variable that has changed is
               | (for me) a line I won't cross. Time to write the app a
               | different way, once that becomes a concern.
        
               | toomuchtodo wrote:
               | Its safety is proportional to your isolation model. Never
               | use env vars for secrets when you're executing arbitrary
               | code, for example.
        
               | steelaz wrote:
               | We got rid of all IAM users used by applications and
               | moved to role-based access. Nowhere in the application do
               | you need to enter AWS credentials. AWS SDK will attempt
               | to discover short-lived credentials for you and will
               | assume the role specified at the infrastructure layer,
               | e.g. in a task definition.
        
               | kbenson wrote:
               | One of the major benefits of ephemeral tokens is that
               | they become less attractive to put into the code, and
               | more attractive to put in a config file/vault that's
               | easier to update and keep secret. This in itself is
               | useful because it makes it less likely that it will be in
               | some source file someone shows, or pushed to some remote
               | repo that at some point has permissions allowed so people
               | can see it.
        
               | FujiApple wrote:
               | Yes, but credentials should either be long lived with
               | (very) limited scope _or_ short lived with required
               | scope.
               | 
               | For example, for AWS you can create long lived
               | credentials for users which are scoped to only allow one
               | operation, namely obtaining a short lived token (with the
               | aid of a hardware token such as a Yubikey) with scope to
               | perform other operations.
               | 
               | AWS guide here:
               | https://aws.amazon.com/blogs/security/enhance-
               | programmatic-a...
        
               | thedougd wrote:
               | You may also set up federated (trusted) relationships. For
               | example, a GitHub Workflow can be trusted to assume an
               | IAM role. In that scenario, there's no long lived secret
               | in scope.
               | 
               | The oidc subject includes the GitHub org, repo, branch,
               | and environment for the IAM assume role policy to match
               | or filter.
        
               | jffry wrote:
               | For my dev machine's interactions with AWS, I use
               | https://github.com/99designs/aws-vault
               | 
               | You add the long lived IAM user API key/secret to it and
               | it stores it in a password protected storage (MacOS
               | keychain or similar).
               | 
               | Then you invoke aws-vault with an IAM role and command,
               | and it will handle obtaining short-lived credentials
               | scoped to that role (including TOTP 2-factor code auth),
               | and then run the command with those temporary credentials
               | as env vars.
               | 
               | With the right AWS permissions on your user, it can also
               | automatically rotate the IAM user API keys for you.
        
               | rad_gruchalski wrote:
               | I like your approach. So far I used profiles extensively.
               | AWS_PROFILE is your friend. No idea why AWS doesn't
               | heavily promote this everywhere they can.
        
           | 72736379 wrote:
           | This is less a confirmation but more of a "piggybacking".
        
         | zricethezav wrote:
         | Assuming this unverified version of the story is true, the
         | danger of accidentally leaking credentials in code is enormous
         | and one of the reasons I continue to maintain and develop
         | gitleaks. Those credentials[1] would have been caught by the
         | gitleaks' generic rule [2]
         | 
         | [1] https://regex101.com/r/CLg9TK/1
         | 
         | [2]
         | https://github.com/zricethezav/gitleaks/blob/master/config/g...
        
           | alias_neo wrote:
           | How were the words selected for the regex? It's interesting
           | that "pass" is not there and breaks detection in your first
           | link, but I assume they were chosen based on the statistics?
           | 
           | Is it covered by a different rule perhaps?
        
       | cm2187 wrote:
       | > _This database contains many TB of data and information on
       | Billions of Chinese citizens_
       | 
       | how many billions?
        
         | _Algernon_ wrote:
         | I'd assume between 1 and 1.402
        
       | dang wrote:
       | Related:
       | 
       |  _Hacker claims they stole police data on a billion Chinese
       | citizens_ - https://news.ycombinator.com/item?id=31984663 - July
       | 2022 (1 comment)
       | 
       |  _Hacker claims to have obtained data on 1B Chinese citizens_ -
       | https://news.ycombinator.com/item?id=31980101 - July 2022 (1
       | comment)
       | 
       |  _Hacker claims to have stolen 1 bln records of Chinese citizens
       | from police_ - https://news.ycombinator.com/item?id=31977354 -
       | July 2022 (1 comment)
       | 
       |  _Police data of 1B Chinese people leaked_ -
       | https://news.ycombinator.com/item?id=31969617 - July 2022 (4
       | comments)
       | 
       |  _Shanghai Police leaking 20TB Chinese citizens data?_ -
       | https://news.ycombinator.com/item?id=31962526 - July 2022 (3
       | comments)
        
         | freewizard wrote:
         | Thanks for reposting this. The last link submitted by me only
         | got 3 upvotes. Guess it sounded just too crazy to be true 2
         | days ago!
        
           | dang wrote:
           | There's just a lot of randomness in what gets
           | attention/traction off /newest. That's why HN doesn't try to
           | prevent reposts of stories that haven't had significant
           | attention yet.
           | 
           | It sucks when you're earlier and don't 'win', but it evens
           | out in the long run if you post lots of good stories, since
           | sometimes the lottery works in your favor. One of these years
           | we'll get around to implementing karma-sharing to spread
           | credit across multiple submitters.
        
             | silentsea90 wrote:
             | What's the point of "winning" if everything is made up and
             | the points don't matter?
        
       | hrgiger wrote:
       | Well I imagine cloud sales teams reaching out to haveibeenpwned with
       | attractive storage offers
        
       | nonethewiser wrote:
       | Ultimately the fault lies in the police and government for having
       | this data.
        
       | markus_zhang wrote:
       | "Looks genuine" from my Chinese friends. Also this might be
       | leaked through a hardcoded token in some code posted on CSDN
       | (sort of blog for programmers).
        
       | luke-stanley wrote:
       | In 2018 I saw a local branch office was using Windows XP and an
       | old Internet Explorer. You cannot expect that to be secure. This
       | does not surprise me at all.
        
         | JamesSwift wrote:
         | A lot of those are actually pirated/modified installs of
         | Windows. I think it's called Tomato Windows or something like
         | that? I forget, but it's incredibly prevalent in China.
        
         | baybal2 wrote:
         | Surprise, it's 2022, and XP is still a de-facto standard
         | Windows version, with hacked Win7 slowly gaining.
         | 
         | Why? Tons of software was written for XP, and then abandoned
         | without any support. Much of that stuff is in the government
         | sector. A lot of online banking clients outright say "only
         | works on XP," and the copyright year reads 2006.
         | 
         | This is similar to how Android 7+ support was almost nuked in
         | China for nearly a year because Tencent didn't want to port
         | Wechat to newer APIs cuz "nobody uses Android newer than 4.X in
         | China"
        
           | ceeplusplus wrote:
           | That was not why they refused to port it to newer APIs
           | though. It was because Google changed the permissions API to
           | be more granular and request permissions at runtime, which
           | would have meant Tencent would have to request tons of
           | permissions to gather user data (presumably users would not
           | be inclined to grant so many permissions).
        
         | Haemm0r wrote:
         | XP is very common on airports in China too.
        
           | dontbenebby wrote:
           | it's in US ones too, it's an industry wide issue in the
           | aviation sector, don't hack the airport, people will come for
           | you and if you are lucky they will be carrying badges
        
             | anewpersonality wrote:
             | Whatever happened with the Gatwick drone?
        
       | contingencies wrote:
       | Anyone care to compose a classical Chinese poem featuring Yun
       | (cloud)?
        
       | freewizard wrote:
       | - 10 BTC sounds a lot but it's peanuts for such large data sets.
       | 
       | - 750k rows of sample data is large enough for a leak by itself,
       | many on reddit/twitter/fediverse have already started to explore
       | the data set for gender ratio, age composition and frequency of
       | rape cases, etc.
        
         | rejectfinite wrote:
         | >many on reddit/twitter/fediverse have already started to
         | explore the data set for gender ratio, age composition and
         | frequency of rape cases, etc.
         | 
         | Any links?
        
       | keewee7 wrote:
       | The Shanghai police has a unique role in China and abroad. For
       | example the Shanghai police is tasked with spreading pro-CCP
       | propaganda globally on platforms like twitter and Facebook.
       | 
       | There was an HN post about this a few months ago:
       | 
       | https://news.ycombinator.com/item?id=29654137
       | 
       | Someone posted a comment explaining a little more about
       | Shanghai's special relationship with the CCP/PLA:
       | 
       | >Shanghai is a city with a unique role in the progression of the
       | CCP and its global efforts. Also PLA Unit 61398 is in Pudong, the
       | shanghai district mentioned in the article. Overall there's a lot
       | of CCP/PLA-adjacent tech talent in the area, and of course the
       | local police still ultimately report to the CCP.
       | 
       | https://news.ycombinator.com/item?id=29656017
        
         | WilTimSon wrote:
         | So I'm guessing that database would have quite a few activists
         | listed in it and other anti-government people. Might even give
         | someone a much-needed warning if they find themselves there.
        
           | dontbenebby wrote:
        
           | drexlspivey wrote:
           | It's obviously a database of all Chinese citizens so yes
           | those people are included alongside everyone else
        
           | stjohnswarts wrote:
           | I was having this exact conversation with a friend last
           | night. Give them warning, especially people in Hong Kong.
        
         | nonethewiser wrote:
         | People didn't think Shanghai was open so that the world could
         | come IN to China, did they? It's about the opposite direction.
        
       | hintymad wrote:
       | The leaked screenshot of the data's metadata looks like the
       | output of Elasticsearch's /_cat command. Someone probably left
       | the port 9200 open to the public, or stored the index on a public
       | cloud but somehow leaked its keys either on github-like service
       | or in some discussion forum -- a typical mistake that engineers
       | make.
        
         | flatiron wrote:
         | https://www.alibabacloud.com/product/datahub is what they were
         | using, and yeah their keys were in a commented out psvm tester
         | method. pretty awful
        
       | pedro2 wrote:
       | Is it 1 billion in long scale or short scale?
        
         | sgjohnson wrote:
         | Why would it be in long scale? Is long scale even used in
         | english at all?
        
           | pedro2 wrote:
           | It was a joke. But it made me realize, thanks to the comment
           | above, that Earth's population is around 8 thousand millions,
           | and not 8 billion as I'd come to believe.
        
         | bitdivision wrote:
         | For anyone wondering what that is, English uses short-scale,
         | i.e. 1 billion = 1000 million, some other languages / countries
         | use long-scale i.e. 1 billion = 1 million million.
         | 
         | https://en.wikipedia.org/wiki/Long_and_short_scales
        
         | [deleted]
        
         | ginko wrote:
         | Last I checked there weren't 10^12 people living on earth just
         | yet.
        
           | pedro2 wrote:
           | I honestly didn't know that.
           | 
           | One gets used to short scale on the Internet.
        
             | hansel_der wrote:
             | it's about 1MMM
        
       ___________________________________________________________________
       (page generated 2022-07-05 23:00 UTC)