[HN Gopher] Dorking: the use of search engines to find very spec...
       ___________________________________________________________________
        
       Dorking: the use of search engines to find very specific data
        
       Author : abarrettwilsdon
       Score  : 253 points
       Date   : 2020-08-09 18:52 UTC (4 hours ago)
        
 (HTM) web link (www.alec.fyi)
 (TXT) w3m dump (www.alec.fyi)
        
       | bmay wrote:
       | the "link:" operator doesn't work for me--it just seems to
       | include the URL's tokens in the search
        
         | snowwrestler wrote:
         | Pretty sure that one is deprecated. It was very useful for SEO
         | research, which is probably why it doesn't work anymore.
        
       | harimau777 wrote:
       | Is there any way to search the actual page text? I find that
       | often I remember some unique turn of phrase from the page that I'm
       | looking for and it would be extremely helpful to be able to
       | simply search for that.
        
         | abarrettwilsdon wrote:
         | `intext:phrase` and `allintext:multi part phrase`
         | 
         | generally "phrase" works well too
        
           | harimau777 wrote:
           | Thank you!
        
       | ricardo81 wrote:
       | Worth pointing out if you do some of these crafted operator
       | searches quite quickly, you'll end up getting blocked or having
       | to complete a captcha. I haven't done so in a while so I'm not
       | sure what their current behaviour is.
       | 
       | Main reason being there's plenty of data mining, e.g. looking for
       | "powered by wordpress" and vulnerable versions, and generally all
       | kinds of data mining that involve very specific requests for
       | information, likely queries that aren't creating revenue, either.
        
       | iandanforth wrote:
       | The email specific queries don't appear to work. The "@" is
       | ignored by google so you just get results for the domain string.
        
         | abarrettwilsdon wrote:
         | The first two appear to still work, but the third does not.
         | 
         | The permutation searches are tricky because you don't know if a
         | lack of results means the email does not exist, or just hasn't
         | been posted anywhere indexed
         | 
         | Will update and credit
        
       | voldacar wrote:
       | Why doesn't google.com have a comprehensive list of these? I'm
       | constantly seeing new ones that I didn't know about, but google
       | never teaches you about them so you have to find them in obscure
       | blog posts
        
         | beefield wrote:
         | > Why doesn't google.com have a comprehensive list of these?
         | 
         | It is quite obvious that google does not give a s&it whether I
         | find what I think I want to find. Google is much more
         | interested in 1) serving me ads they think are most profitable
         | and 2) giving me results _they_ think I want.
        
         | KorfmannArno wrote:
         | My guess would be because Google eventually wants users to find
         | everything via natural language queries.
        
           | dragonwriter wrote:
           | Actually, Google _eventually_ wants users to find everything
           | with predictive AI giving it to them before they search.
           | That's not really a secret, they've announced more than once in
           | the past that that is what they are increasingly working
           | toward.
        
             | souprock wrote:
             | That would be great for malware researchers. Google can
             | give them malware before they even search for it!
             | 
             | The reality is that all sorts of things are blocked now,
             | including things that are perfectly legal.
        
           | lizardmancan wrote:
           | the goal is to make you look at advertisement
        
         | vezycash wrote:
         | Google randomly ignores "search term in quotes".
         | 
         | Related:examplesite.com used to work well. Now, it's better to
         | use sites like alternativeto.net.
         | 
         | ~phrase is unnecessary because google searches for synonyms
         | by default
         | 
         | phrase1 + phrase2 - Google randomly ignores it. I use it this
         | way +compulsoryTerm
         | 
         | Although rare, there are things I simply can't find using
         | Google. But Bing would. If Google keeps it up, other search
         | engines would benefit.
        
           | stanislavb wrote:
           | Ah, I never knew "related:" existed. Also, saashub.com could
           | be used as an alternative to alternativeto.net :)
        
           | EE84M3i wrote:
           | I would be interested to see an example of it ignoring quotes
           | silently because I've heard a lot about it. I use search
           | terms in quotes relatively often and have never noticed that,
           | although it does the 'did you mean without quotes' thing all
           | the time.
           | 
           | In the past for very long tail content, I've found Bing and
           | Yandex to be useful. Yandex image search in particular is
           | often better than Google or Bing, particularly if you are
           | searching for people because it does some facial recognition.
        
           | ewired wrote:
           | Doing some "related:" queries returns some interesting
           | results that look human-curated and out-of-date.
           | related:google.com shows results for Yahoo, Bing, AOL Search,
           | and HotBot (which used to be a search engine, but the brand
           | is now for a VPN provider).
        
         | EE84M3i wrote:
         | One reason they might not have a comprehensive list is because
         | some might be relatively expensive to execute, but they
         | can't/won't disable them for legacy reasons.
        
           | JadeNB wrote:
           | > One reason they might not have a comprehensive list is
           | because some might be relatively expensive to execute, but
           | they can't/won't disable them for legacy reasons.
           | 
           | Ah, Google, always so reluctant to get rid of anything legacy
           | because of their fanatical devotion to their existing user
           | base.
        
         | [deleted]
        
         | mrnuclear wrote:
         | At least now we are somewhat more empowered to find obscure
         | blog posts. Which raises the suggestion that hackers are
         | advantaged towards finding information. Which raises the
         | suggestion that we should take the independent initiative of
         | using SEO to inform more people about how to become search
         | super-users.
        
         | abraae wrote:
         | Having a reliable search syntax would commoditise Google as
         | other search engines could offer the same options. Having just
         | a search box, instead of lots of options was how they moved
         | ahead of e.g altavista in the first place.
         | 
         | Google would rather people are trained to just type human speak
         | into the search box.
        
         | lstamour wrote:
         | https://support.google.com/websearch/answer/2466433?hl=en but
         | it's not complete. My favourite is actually the "range"
         | operator. I don't need it often, but when combined with the
         | exact match quotation marks, it's great. For example, here's a
         | search for Sony bluetooth headphones available on Amazon.ca for
         | between CA$100 and $150:
         | https://www.google.com/search?rls=en&q=site%3Aamazon.ca+%22C...
         | 
         | The range operator also works great with years, dates, though
         | the Tools menu with shortcuts for before: and after: operators
         | can help there too.
         | 
         | One I haven't seen mentioned yet but used to be documented is
         | that you can leave out words in a phrase by replacing them with
         | an asterisk. I'm having trouble not italicizing text in this
         | comment box, so pretend \\* means a single asterisk: "Stocks
         | rose today by \\* percent" as a search matches the phrase
         | "stocks rose today, led by a 4.4 percent". (Which until this
         | post, had only one result on Google.)
         | 
         | Note that it's not 100% exact matching, because for actually
         | exact matches you have to select "Verbatim" under Tools > All
         | Results in the menu below the search box on the results page.
         | 
         | The only downside to using all these operators is that you'll
         | get very familiar and frustrated with the Google reCAPTCHA
         | prompts as your search is "too precise to be human". Even when
         | signed in to Google, especially often in Safari on an iPhone.
         | Sigh.
        
           | BeeOnRope wrote:
           | You can use three asterisks in a row, surrounded by
           | whitespace, to get a single asterisk like: "Stocks rose today
           | by __* percent ".
           | 
           | Oddly, this results in a non-italicized asterisk in the
           | output, contrary to reports in earlier comments that the
           | resulting asterisk would be in italics. There is, however, a
           | zero-length italicized string right before the asterisk in
           | the HTML:
           | 
           |     "Stocks rose today by <i></i>* percent".
        
             | JadeNB wrote:
             | > Oddly, this results in a non-italicized asterisk in the
             | output, contrary to reports in earlier comments that the
             | resulting asterisk would be in italics. There is, however,
             | a zero-length italicized string right before the asterisk
             | in the HTML:
             | 
             | > "Stocks rose today by <i></i>* percent".
             | 
             | Sounds like the matching is something like
             | 
             |     /\<\*.*\*/
             | 
             | or maybe
             | 
             |     /\<\*[^*]*\*/
             | 
             | rather than
             | 
             |     /\<\*.+*/
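             | 
             | A minimal Python sketch of that guess (not HN's actual markup
             | code; the function names are hypothetical, and the \< anchor is
             | dropped because Python's re module has no such word-start
             | token). It assumes the renderer wraps whatever sits between a
             | pair of asterisks in <i> tags:

```python
import re

SAMPLE = "Stocks rose today by *** percent"


def italicize_allow_empty(text: str) -> str:
    # [^*]* may match nothing, so the first two asterisks pair up around an empty span.
    return re.sub(r"\*([^*]*)\*", r"<i>\1</i>", text)


def italicize_require_content(text: str) -> str:
    # .+ needs at least one character, so the middle asterisk becomes the italic body.
    return re.sub(r"\*(.+)\*", r"<i>\1</i>", text)


print(italicize_allow_empty(SAMPLE))
print(italicize_require_content(SAMPLE))
```

             | Under these assumptions only the [^*]* form leaves an empty
             | <i></i> followed by a literal asterisk. In Python a greedy .*
             | behaves like .+ on this particular input, so the second guess
             | is probably the closer one.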
        
           | 1vuio0pswjnm7 wrote:
           | Is there actually a page that says "too precise to be human"
           | or are you just assuming this is what triggered the
           | reCAPTCHA?
           | 
           | If there is such a page, can you give an example query that
           | would trigger it?
        
             | lstamour wrote:
             | It mostly happens using "site:" queries which I use
             | frequently to limit things to local websites (by domain) or
             | for searching sites that have poor search engines (Amazon,
             | for example). It rarely happens the first query, but often
             | by the third or fourth modification or by the third or
             | fourth page of results you visit, it will show a reCAPTCHA
             | if it doesn't have enough "randomness" or doesn't think
             | you're actually browsing Google and third-party sites the
             | way others commonly do. (Robots are more likely to use
             | search operators, for example, and more likely to pretend
             | to be iPhones so they don't have to move the mouse, etc.)
             | 
             | My earlier query triggered it. Without a query, I can make
             | the following text show up by going to
             | https://www.google.com/sorry/index which when a relevant
             | query is attached to the URL, it shows a reCAPTCHA for the
             | search query, and also shows your IP address, etc.
             | 
             | > About this page
             | 
             | > Our systems have detected unusual traffic from your
             | computer network. This page checks to see if it's really
             | you sending the requests, and not a robot. Why did this
             | happen?
             | 
             | If you click the link "Why did this happen?" it says:
             | 
             | > This page appears when Google automatically detects
             | requests coming from your computer network which appear to
             | be in violation of the Terms of Service[1]. The block will
             | expire shortly after those requests stop. In the meantime,
             | solving the above CAPTCHA will let you continue to use our
             | services.
             | 
             | > This traffic may have been sent by malicious software, a
             | browser plug-in, or a script that sends automated requests.
             | If you share your network connection, ask your
             | administrator for help -- a different computer using the
             | same IP address may be responsible. Learn more[2]
             | 
             | > Sometimes you may be asked to solve the CAPTCHA if you
             | are using advanced terms that robots are known to use, or
             | sending requests very quickly.
             | 
             | [1]: https://www.google.com/policies/terms/ [2]:
             | https://support.google.com/websearch/answer/86640
             | 
             | The annoying part is that my account has never been
             | whitelisted based on good behaviour. Instead, I end up
             | seeing such reCAPTCHAs thousands of times a year, to the
             | point where I stop counting them. Roughly half the time
             | I'll answer the reCAPTCHA and the other half of the time,
             | I'll close the tab and go do something else. Cloudflare
             | site loading captchas are even worse, though. They delay
             | the site by 5 seconds while they "check my browser", and
             | then show an hCAPTCHA to solve, even when I'm already
             | signed in with the first-party site. Very annoying, though
             | the captcha is often easier to solve than Google's. The
             | Cloudflare block often appears on streaming media websites.
             | Ironically, Cloudflare's captchas have never prevented me
             | from using commonly available Python scripts to watch
             | streaming flash videos in VLC, they only block my web
             | browsing...
             | 
             | I can only assume that Safari's excellent ad blocking and
             | tracking prevention is causing my browsing traffic to stand
             | out compared to others', enough that it prompts these
             | CAPTCHAs more frequently.
        
       | malwarebytess wrote:
       | NLP and, to a lesser extent, SEO have vastly diminished the value
       | of this type of searching.
        
       | yourad_io wrote:
       | Fun fact: googling for -273.15 without double quotes produces no
       | results.
       | 
       | You need to quote negative arithmetic values when searching, even
       | if there are no other query parameters. It made me wonder if I
       | was misremembering absolute zero.
        
       | Shared404 wrote:
       | Syntax for doing things like this with DDG:
       | 
       | https://help.duckduckgo.com/duckduckgo-help-pages/results/sy...
        
         | kps wrote:
         | I'd switch to DDG in a half second if they supported the full
         | query syntax of altavista.digital.com (see
         | http://jkorpela.fi/altavista/ if you've forgotten). Disclaimer:
         | I work for, um, Google.
        
           | Shared404 wrote:
           | I do wish they supported a larger search syntax. My current
           | workaround is I have a massive bookmark folder of alternative
           | search engines that I try if I'm not having luck narrowing
           | things down enough.
        
             | lsiebert wrote:
             | That might make an interesting blog post
        
               | Shared404 wrote:
               | I'm still mid-setting up a blog, but I'll keep that in
               | mind once I've got it up and running.
               | 
               | I'm afraid it probably wouldn't be that interesting to
               | HN'ers though, because this is where I found most of
               | them.
        
       | marcrosoft wrote:
       | I love the "inject JS into the page to find stuff" hack. The
       | author mentions local "site you are on" but this can be applied
       | with headless chrome to crawl many sites.
        
         | flywheel wrote:
         | That's web scraping 101
        
       | lizardmancan wrote:
       | https://www.google.nl/search?q=site%3A+news.ycombinator.com+...
       | 
       | I used to use these a lot, but now it's just useless
        
         | abarrettwilsdon wrote:
         | Try wrapping lizardmancan in quotation marks - "lizardmancan".
         | That narrows it down to 10 results for me
         | 
         | (also: you'll want to remove the space between site: and
         | news.ycombinator.com)
        
         | mikequinlan wrote:
         | You need to remove the blank after the colon.
         | 
         | https://www.google.nl/search?q=site%3Anews.ycombinator.com+l...
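         | 
         | A rough sketch of building such a URL in Python, assuming you
         | want the site: restriction plus an exactly quoted term. The
         | helper name and the google.com host are illustrative choices,
         | not anything taken from the article:

```python
from urllib.parse import urlencode


def build_site_search_url(domain: str, term: str) -> str:
    # No space after "site:", and the term is wrapped in double quotes for an exact match.
    query = f'site:{domain} "{term}"'
    # urlencode percent-encodes ":" and the quotes, and turns the space into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})


print(build_site_search_url("news.ycombinator.com", "lizardmancan"))
```

         | The encoded query keeps site%3A directly attached to the
         | domain, which is the part the original link got wrong.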
        
       | uniqueid wrote:
       | Last week I blocked every * .google.* domain on my network except
       | "youtube-ui.l.google.com".
       | 
       | Google Search: (1) ask a natural language question (since actual
       | search is hobbled) (2) get unrelated garbage and ads back (3)
       | blame yourself for "not being technical enough" to understand why
       | the results aren't actually garbage.
       | 
       | Google Search has deteriorated to the point that so far I haven't
       | missed it _at all_.
        
         | ip_addr wrote:
         | What search do you prefer and why?
        
           | uniqueid wrote:
           | I prefer Google circa 2005, but DDG and Bing work better for
           | me now.
           | 
           | I've never wanted anything fancy:
           | 
           | - don't show me paid search results - show me a blank page if
           | there are no results - make it easy to 'AND' terms (+include
           | +search +terms) - most importantly: search for my damned
           | search terms! If you want to "did you mean" my spelling,
           | fine. I don't really care. But it's unacceptable to _ever_
           | drop a search term.
           | 
           | I have plenty of other complaints about Google, but in terms
           | of search quality, those are the relevant ones.
        
         | MattGaiser wrote:
         | What is it you are searching for that the results are useless?
        
           | fortyseven wrote:
           | Indeed. This seems like a bit of an overreaction. Google is lots of
           | things, but a shitty search engine to the point of deserving
           | being blocked is not one of them.
        
             | uniqueid wrote:
             | > a shitty search engine to the point of deserving being
             | > blocked is not one of them.
             | 
             | Google's search quality isn't why I blocked Google. I've
             | _wanted_ to block Google for over half a decade, but the
             | excellence of their search stopped me. That stopped being
             | an issue this year.
        
         | darepublic wrote:
         | Google is still good for coding-related searches
        
           | snakeboy wrote:
           | To be fair, I suspect "coding related searches" are easy for
           | any search engine, given
           | 
           | 1. the immense online/open-source nature of the profession:
           | every blog/forum question and answer/documentation since the
           | origin of the profession being in plain-text and mostly
           | publicly accessible by default
           | 
           | 2. and it all revolves around a precise, limited vocabulary.
        
           | JadeNB wrote:
           | Depends on what coding-related search, I think. Searching for
           | C is useless unless you know to search for clang, for
           | example; but then you get results for the compiler. If you're
           | trying to search for lesser known languages with short names
           | or names that overlap with common words, then forget it!
           | (Arguably that's a fault of the language, but arguably
           | arguably you shouldn't have to choose what to name your
           | creation based on Google.)
        
           | IdiocyInAction wrote:
           | You get SEO crap very often though IME.
        
             | spanhandler wrote:
             | Depends on the platform and what you're looking for. Some
             | operating systems and languages/ecosystems are worse than
             | others. Windows stuff is largely _incredibly_ bad (not
             | saying Windows is bad, for this reason anyway, just that
             | search results for anything MS-related tend to be awful).
             | The nerdier the OS and less "corporate" the language, the
             | better the results get.
        
               | reaperducer wrote:
               | _The nerdier the OS and less "corporate" the language,
               | the better the results get._
               | 
               | But don't get _too_ obscure. Otherwise, you'll discover
               | that Google has dropped the information you require from
               | its index because it's not new or trendy enough.
               | 
               | If we can get Taylor Swift interested in the old
               | internet, then Google will suddenly snap back into
               | usefulness.
        
           | uniqueid wrote:
           | Ever spent three minutes opening useless links from Google's
           | Search results, only to realize they dropped the keyword you
           | searched? That seems quite common now, especially with
           | programming keywords, which are often obscure.
           | 
           | Remember Google Code Search, and Google (Usenet) Groups? Back
           | then, Google cared about this stuff. Now they seem only to
           | want to show you furniture ads, or get you to use their Zoom
           | knockoff, etc.
           | 
           | These days Google substitutes the heck out of searches.
           | Perhaps it's better if you've logged in, but I'd rather hack
           | my leg off with a rusty saw than voluntarily log in to an
           | account just to search the web.
        
       | chris_f wrote:
       | A few corrections:
       | 
       | The + (formerly used to force a term to be present in the result)
       | and ~ (also find synonyms) operators have been deprecated.
       | 
       | Google now advises to wrap the word in quotes instead of using
       | the +. Google will also automatically look for synonyms without
       | the use of ~.
       | 
       | I have seen 'AROUND(n)' mentioned in many other places working as
       | a proximity operator in Google, but I don't believe that is true
       | and haven't found it to work in any logical way.
       | 
       | Also the use of parentheses to nest queries is not necessary in
       | Google. It is actually required for Bing on complicated queries
       | though.
        
         | TheSpiceIsLife wrote:
         | No longer have Google Chrome on any devices, switched over to
         | Chromium Edge.
         | 
         | Same browser, different overlords.
         | 
         | Left the default search engine as Bing, but only because Duck
         | Duck Go is useless for geographically local search.
        
         | GordonS wrote:
         | Worth mentioning that even if you put a term in double quotes,
         | Google _still_ tries to be too clever - you are not guaranteed
           | to get results that contain your quoted search term :/
        
           | solarist wrote:
           | As a workaround and under search tools one can enable the
           | "verbatim" option.
        
             | GordonS wrote:
             | AFAICT, the verbatim option gives the same results as if
             | I'd quoted my search term?
        
               | solarist wrote:
               | In my experience it depends on the number of results, and
               | the results are more accurate with verbatim.
        
             | jlokier wrote:
             | I was under the impression "verbatim" is to disable filter-
             | bubble personalisation.
             | 
             | Normal queries are tailored to your personal filter bubble.
             | You can't see what other people see from the same search, and
             | if you're doing SEO or just trying to find who tends to
             | come top in results for something you have a lot of history
             | looking at, you can't tell who comes top for other people.
        
           | devjungle wrote:
           | This must be a recent change? It's been driving me nuts
           | lately. I have to resort to adding a lot of negated search
             | terms to compensate but it's still suboptimal.
        
             | GordonS wrote:
             | No, this has been the case for a long time, years anyway. I
             | don't know if it goes back quite as far as when they
             | removed the '+' operator tho.
             | 
             | But bejesus, this drives me nuts! If I know the double
             | quotes function even exists, then Google should know I
             | actually want to use it as intended - it shouldn't decide
             | "yeah, but _maybe_ you'd like these irrelevant results
             | too!"
        
             | JasonFruit wrote:
             | It seems to me that after a few negated search terms are
             | included, they are taken less strictly; "minus" seems to
             | mean "probably minus".
        
           | nostromo wrote:
           | And even if that exact term is present on popular websites,
           | like Stack Overflow, Google still seems to have trouble
           | finding those exact results regularly.
        
         | abarrettwilsdon wrote:
         | Updated the article to reflect and credited you for the
         | contribution!
        
         | EE84M3i wrote:
         | When you say "deprecated", you mean as in "discontinued" right?
         | Not just like, discouraged?
        
           | flywheel wrote:
           | Whoever the first developer was that used "deprecated" got it
           | kind of wrong, the word should have been "depreciated".
           | 
           | Deprecate: "express disapproval of."
           | 
           | Depreciate: "diminish in value over a period of time."
           | 
           | I kind of cringe when other developers say "deprecated".
           | 
           | Edit: Versioning and not removing APIs is kind of the way to
           | go, so you don't break client apps that possibly can't be
           | updated easily or at all. "Depreciated" is a far better word
           | to use with a far better outcome. AWS versions their APIs,
           | they don't remove old ones. "I disapprove of using this API
           | and we're taking it away at some random date" vs "this isn't
           | the latest API, use the current one for new development"
           | seems like a pretty stark difference in thinking to me. YMMV.
        
             | theodric wrote:
             | You could not be more wrong if you practiced every day
             | https://www.etymonline.com/word/deprecate#etymonline_v_29603
        
             | tines wrote:
             | But "express disapproval of" is exactly the meaning
             | intended when we say that a feature is deprecated. It
             | signifies that it is best practice not to use it.
        
               | harha wrote:
               | If it's given as a warning then yes, e.g. the dplyr
               | package in R sometimes outputs "feature xyz is deprecated
               | and will be removed in version x.x".
               | 
               | Often though it's used when the feature is already
               | removed, i.e., it's not only best practice not to use it,
               | but also impossible with that version.
        
               | efreak wrote:
               | In this case, depreciated is incorrect. Removal has
               | already happened, the "period of time" is already over.
        
               | flywheel wrote:
               | Removing APIs is not a great practice though. Look at
               | AWS, they version their APIs, they don't just remove
               | them, and removing them should be unnecessary if your
               | underlying tech isn't brittle and badly written.
               | "Depreciated" is a far better term to use, with a far
               | better outcome in my opinion. Companies that remove old
               | versions of APIs and break existing client apps (that
               | possibly can't be updated) really suck.
        
               | oceanswave wrote:
               | The "public APIs form an immutable, irrevocable contract"
               | argument means that an api layer with these tenets is
               | always going to be a source of technical debt. Get it
               | right the first time or fight an ever growing
               | compatibility maintenance war - even when your
               | instrumentation is saying that old apis aren't being
               | used, just published, seems like a footgun
        
               | hamburglar wrote:
               | 1) Whether you agree with the practice doesn't affect the
               | terminology used. People remove APIs. Before doing that,
               | they deprecate them for a period to advise people to move
               | off of them.
               | 
               | 2) If you were to always maintain backward compatibility,
               | how is "depreciated" in any way an accurate term? If the
               | old API continues to work indefinitely, its value stays
               | the same.
        
               | smichel17 wrote:
               | I don't think these two are incompatible?
               | 
               | If APIv3 has a `/foo` endpoint that is deprecated,
               | usually I take that to mean that the developers
               | discourage its use, and likely plan to remove it in a
               | future version (say, APIv4 or APIv5). `/foo` will never
               | be removed from APIv3, because that would be a breaking
               | change, and so if I'm willing to stay on v3 forever,
               | that's fine, but in the (likely) event I will want to
               | take advantage of new features at some point in the
               | future, I'm doing myself a disservice by using /foo
               | because it will make the migration harder.
               | 
               | There is at least one case where I think "deprecated" is
               | clearly, inarguably, the right word: when the developer
               | _wants_ to remove a part of an API (say, because it is a
               | large maintenance burden), but it's also committed to
               | stability, so they _won't_ remove that api until some
               | acceptably small number of users are using it.
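               | 
               | As a small illustration of "deprecated but still works",
               | here is a sketch using Python's standard warnings module.
               | The get_foo handler and its return value are made up for
               | the example and do not describe any real API:

```python
import warnings


def get_foo() -> dict:
    """Hypothetical handler for a deprecated /foo endpoint."""
    # Express disapproval without breaking callers: warn, then behave as before.
    warnings.warn(
        "/foo is deprecated and may be removed in a future API version",
        DeprecationWarning,
        stacklevel=2,
    )
    return {"status": "ok"}


if __name__ == "__main__":
    # Make the warning visible even where DeprecationWarning is filtered by default.
    warnings.simplefilter("always")
    print(get_foo())
```

               | The call keeps returning its normal result; the only
               | change is the warning, which is the "discouraged, not yet
               | removed" state described above.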
        
             | eigenvector wrote:
             | Isn't deprecated actually correct here?
             | 
             | It means the feature still works, but will be removed in
             | the future or is no longer supported. There may also be a
             | new implementation of it that the developer would like you
             | to use, hence the warning that it's deprecated.
             | 
             | Depreciation implies a rate of change over time, which
             | isn't the case. Today we deprecate feature X, and in two
             | years we plan to remove it. It never depreciates.
        
             | cjaybo wrote:
             | The first definition is intended and more fitting for the
             | usages of "deprecated" I've encountered.
        
             | hamburglar wrote:
             | This is a jaw-droppingly arrogant attitude. You're trying
             | to justify your own incorrect usage by asserting that the
             | person who coined the term decades ago "got it kind of
             | wrong"? And you cringe when others get it right?
             | 
             | "Depreciated" is absolutely the wrong term, because it
             | implies that the value is less, when the intent is to
             | communicate "this is still fully functional, but you are
             | warned away from it because it is targeted for future
             | removal." Deprecated.
        
             | TallGuyShort wrote:
             | Feels like I often see it used to retire APIs that are now
             | understood to be unsafe, insecure, or otherwise a bad
             | practice for some reason. It gets replaced with an API that
             | does not inherently have that problem, and the old one is
             | deprecated. It feels like "expressing disapproval of" is
             | the right definition in that case. It's only there for a
             | migration period to happen more gracefully, but its
             | continued use is frowned upon, and not just because it will
             | eventually be removed.
        
             | Xophmeister wrote:
             | I used to always use "depreciated" until I was
             | embarrassingly corrected one day :P
        
               | hamburglar wrote:
               | To be frank, grandparent sounds like someone who was
               | corrected one day, and rather than learn something and
               | move on, dug in and developed a detailed justification
               | for why the rest of the world was mistaken so he can
               | cringe about their ignorance.
        
             | jrochkind1 wrote:
             | Nope.
             | 
             | It is deprecated -- its use is disapproved of, you should
             | stop using it. In the future it will go away but for now it
             | works, so you _can_ use it, but its use is discouraged.
             | 
             | Depreciated doesn't make any sense -- the value of the
             | deprecated API does not diminish over time. It works, until
             | it stops working. It's on or off. It doesn't work less and
             | less every month or anything. It currently still works
             | completely, but is deprecated -- that is, discouraged. At
             | some point in the future, it will stop working, completely.
             | 
             | The rest of us don't just kind of but REALLY cringe when
             | people say "depreciate" when they mean "deprecate". They
             | are different words, "deprecated" is the right one, it is
             | intentional, it is the word.
             | 
             | Sorry, you are the one using the wrong word.
        
               | flywheel wrote:
               | Yeah, nope yourself. It seems like a lot of people aren't
               | really thinking this through very much.
               | 
               | And that is absolutely the wrong way to approach API
               | development. An API that is being sun-setted should never
               | be removed, because older clients could still use it but
               | sometimes can't be upgraded to newer clients. Removing a
               | v1 API breaks those clients and it's a shitty thing to do
               | to users. Yeah, people should be building NEW things with
               | it, but there's no reason to look at the v1 API with
               | "disgust" as "deprecated" implies - It's simply an older
               | version that should remain functional, if your system is
               | worth half a shit. AWS doesn't terminate older API
               | versions, they just create new versions. Or you can be
               | like Facebook and "deprecate" stuff and just shut it down
               | before your official shutdown date, or not give any
               | notice at all - that's REALLY a fun culture to work in, I
               | guess, for them. "deprecated" is a really negative word,
               | and doesn't even really translate to anything good in
               | terms of software development. It's my opinion that
               | "depreciated" is a far better word and far better outcome
               | when used in software development instead of
               | "deprecated". YMMV.
        
               | jrochkind1 wrote:
               | OK, I understand you have an opinion that API design
               | should be done in a certain way (by the way, by "API" I
               | meant like method signatures, not network API, but it
               | could be either).
               | 
               | And I understand you disapprove of the word "deprecated"
               | being used to refer to API that is discouraged, usually
               | because it will be no longer supported/going away in the
               | future.
               | 
               | But that doesn't change the history of the word. The word
               | "deprecated" is what engineers have been using,
               | intentionally, for several decades.
               | 
               | "Depreciated" is a mistaken variation. Even if you think
               | "deprecated" has unfortunate connotations, it still
               | doesn't make "depreciated" right. "Depreciated", as you
               | said, means losing value over time. That is, 10% a year
               | or something. Deprecated API does not "lose value over
               | time".
               | 
               | The word "deprecated" has historically been used to mean
               | that certain API (again, likely a method or function, I
               | don't mean network api specifically) is now discouraged,
               | its use is disapproved of. Usually because it will be
               | going away in the future. Arguments about whether this is
               | the right way to do API change are entirely separate to
               | this historical and current usage, where API change often
               | IS done this way, and it's what the word is used for.
               | 
               | You can have opinions of how you'd like people to
               | handle API change over time, but that doesn't change the
               | fact that "deprecated" is the word engineers have meant
               | to use for decades. If you'd like to advocate for a
               | different word and/or different practice you can -- but
               | all "depreciated" has going for it is it sounds
               | confusingly similar to "deprecated", it is not the word
               | you are looking for.
               | 
               | > Not to be confused with Depreciation.
               | 
               | > In several fields, deprecation is the discouragement of
               | use of some terminology, feature, design, or practice,
               | typically because it has been superseded or is no longer
               | considered efficient or safe, without completely removing
               | it or prohibiting its use.
               | 
               | > It can also imply that a feature, design, or practice
               | will be removed or discontinued entirely in the future
               | 
               | https://en.wikipedia.org/wiki/Deprecation
               | 
               | > In accountancy, depreciation refers to two aspects of
               | the same concept: first, the actual decrease of fair
               | value of an asset, such as the decrease in value of
               | factory equipment each year as it is used and wears, and
               | second, the allocation in accounting statements of the
               | original cost of the assets to periods in which the
               | assets are used (depreciation with the matching
               | principle)
               | 
               | https://en.wikipedia.org/wiki/Depreciation
               | 
               | > In economics, depreciation is the gradual decrease in
               | the economic value of the capital stock of a firm, nation
               | or other entity, either through physical depreciation,
               | obsolescence or changes in the demand for the services of
               | the capital in question.
               | 
               | https://en.wikipedia.org/wiki/Depreciation_(economics)
               | 
               | Depreciation has nothing to do with what we're talking
               | about, it's not the right word. Deprecation is the word
               | that has been used for decades for API whose use is
               | discouraged, often because it will not be supported in
               | the future. You can argue that a new term is needed, but
               | that's your argument not a historical usage, and there's
               | no reason you need to limit yourselves to words that
               | sound confusingly similar to "deprecation".
        
               | [deleted]
        
         | mehrdadn wrote:
         | The plus operator in the page appears to be binary rather than
         | unary. I've never used it. Is that affected as well? (Though
         | I'm confused why AND is necessary. Isn't it implied normally?)
        
           | chris_f wrote:
           | That is correct. AND is added by default and is never
           | necessary in Google.
           | 
           | It's a little confusing because of how Google implemented
           | some of the operators. The boolean + operator in many cases
           | is used in the same way as AND, but Google originally used it
           | to let users force a specific word to be present in a
           | search result.
           | 
           | So a search for Fish +Chips was a search for both words, but
           | 'Chips' MUST be present. The equivalent search today is Fish
           | "Chips". It's a little annoying because it requires typing
           | another character, and it is still not always respected.
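
The rewrite described above (old `Fish +Chips` becomes `Fish "Chips"`) can be sketched as a small Python helper. This is only an illustration under stated assumptions: `modernize_query` is a hypothetical name, it handles only the unary `+word` form at the start of a token, and it says nothing about how Google actually parses queries.

```python
import re

# A "+" at the start of a token (start of string or after whitespace),
# immediately followed by a word. "C++" and "a + b" are left alone.
_LEGACY_PLUS = re.compile(r'(?<!\S)\+(\w[\w-]*)')


def modernize_query(query: str) -> str:
    """Rewrite legacy `+word` terms as quoted `"word"` terms."""
    return _LEGACY_PLUS.sub(r'"\1"', query)
```

For example, `modernize_query("Fish +Chips")` produces `Fish "Chips"`, while a query such as `C++ tutorial` passes through unchanged because its plus signs do not start a token.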
        
       | jhbadger wrote:
       | Does filetype: still work? I'm getting zero hits for example
       | filetype:epub
        
         | choo-t wrote:
         | It still works, but some file types never return anything. I
         | have the same problem with epub; pretty sure it's some Google
         | shenanigan about book piracy.
         | 
         | https://support.google.com/webmasters/answer/35287?hl=en
        
           | achairapart wrote:
           | Maybe Google doesn't index epub at all? I think I never saw
           | one in search results.
        
             | choo-t wrote:
             | Well, I may have become crazy but I have a vivid memory of
             | using it in the past, and some websites even refer to this
             | specific query (
             | https://ebookfriendly.com/google-search-tips-books/ )
        
       | chc wrote:
       | I'm kind of surprised to see Google brought back the + operator.
       | I remember they prominently changed its meaning when they made it
       | the @ of Google+, and I never bothered to check again after it
       | died.
        
       | yuvadam wrote:
       | Dorking is not that easy to do: Google is very quick to assume
       | you are being malicious on certain queries. Try one too many and
       | you'll hit their dreaded captcha that is impossible to pass.
        
         | userbinator wrote:
         | That really angers me, and I've tripped it more times than I
         | can count, usually by searching for very specific things.
         | Coworkers have also run into it multiple times (before everyone
         | started working from home, we would exclaim "Fuck you, Google!"
         | and raise a middle finger to the screen, which was a cue to
         | everyone else to help).
         | 
         | The fact that they think you're "not human" when you use a
         | search engine for its intended purpose and show how much you
         | know how to use it is both disturbing and saddening. I wonder
         | if Google's own employees run into it and/or the continuing
         | degradation of results, or if they're somehow given immunity
         | and a much better set of results...
        
           | IggleSniggle wrote:
           | I'm curious about this. Can you give an example of the kind
           | of query you are talking about where Google assumes you are a
           | bot and not a human?
        
           | uj8efdkjfdshf wrote:
           | You could always keep Google Chrome open to rerun these
           | specific queries, as the captchas are less irritating to
           | solve then.
        
       | weisbaum wrote:
       | This is a pretty common practice among SEOs for a variety of
       | different reasons. They are also known as advanced search
       | operators.
       | 
       | Ahrefs has a pretty comprehensive list here:
       | https://ahrefs.com/blog/google-advanced-search-operators/
        
       | 1vuio0pswjnm7 wrote:
       | I have a question for anyone reading this thread:
       | 
       | Do you believe you can get consistent results with _any_ search?
       | 
       | For example, if we pick some _uncommon_ search terms will we get
       | the same results on the first search, the second search, the
       | third, etc. Or will the results change?
       | 
       | I did a search with some terms from one of the comments in this
       | thread, in quotes. The first search returned only one result:
       | this thread.
       | 
       | As I searched the same quoted terms repeatedly along with
       | additional terms, more results were returned that contained the
       | exact string of original terms. Surprised by this, I tried a
       | search with only the original terms, in quotes, once again. This
       | time the search returned more than just the one result.
        
         | abarrettwilsdon wrote:
         | If it's specific enough, the SERP should stay the same until
         | someone else publishes the same thing
         | 
         | e.g. the search of another article "set up Google Sheets APIs
         | (and treat Sheets like a database)"
         | 
         | turns up my site and a couple Twitter threads talking about it
         | (plus a phishing site which has scraped and republished it). I
         | presume that will stay the same b/c it's such a specific title
         | phrase (but not because searches are necessarily deterministic)
        
       | the_jeremy wrote:
       | All I want is the ability to search for symbols. Symbolhound.com
       | is the only site I've heard that will support that, but it leaves
       | a lot to be desired.
        
         | Brakenshire wrote:
         | It's strange to me that more domain-specific search engines
         | haven't been created. There must be value in a programmer-
         | specific search engine for instance. Or why aren't there search
         | engines that specialise in news, social media, Q&A websites or
         | events, to give a few examples.
        
       | surround wrote:
       | Exploit database with more dorks
       | 
       | https://www.exploit-db.com/google-hacking-database
        
       | jrochkind1 wrote:
       | Why is this called "dorking"? "Dorking" is a word that just means
       | using search engines to find very specific data? This seems
       | bizarre to me. Why does this need a special word?
       | 
       | Or it actually means using search operators beyond natural
       | language entry? That's what this page seems to be about? I don't
       | know why that would be called "dorking" either?
        
         | p410n3 wrote:
         | It all started with a def con talk if I remember correctly.
         | 
         | https://youtu.be/N3dzVl40lQA
        
       | sawaruna wrote:
       | Might be my librarian career bias but I'm always surprised at how
       | few people know about query operators. Ironically as Google
       | search seems to be ignoring vital parts of people's queries, they
       | are becoming more needed now, whereas years ago I would have
       | assumed a constantly improving Google search would get better at
       | determining what I was looking for.
        
         | colordrops wrote:
         | The operators don't work as well as they used to, and even when
         | using them lots of results are still left out or are not an
         | exact match. The combination of the SEO arms race and Google's
         | algorithms to filter "bad" information makes it nearly
         | impossible to find some things. Sometimes you are looking for
         | that "bad" piece of info as a counter example rather than a
         | source of truth, and don't need Google's patronizing filtering,
         | so would prefer exact string matches. But apparently they know
         | better than you.
        
         | kebman wrote:
         | You don't even wanna know how many times specialized searches
         | have saved my ass, after multiple years at uni, and working as
         | a writer, journalist, programmer, and even a musician! You can
         | safely say that my entire life revolves around being good at
         | doing various forms of searches.
        
           | sawaruna wrote:
           | No doubt. Enjoying & (feeling like compared to others I was)
           | excelling at finding information was what made me get
           | interested in information science in the first place, but I
           | often felt advances in ML and NLP would allow for anyone to
           | find exactly what they wanted (which would be great) even
           | considering the increasing amount of information to have to
           | search through. Google's 'I'm going to ignore half the words
           | in your search query' seems to be moving away from that for
           | whatever reason.
        
           | Wistar wrote:
           | I have long believed that the art of precision search should
           | be taught at the primary level. It is a necessary skill.
        
       | flywheel wrote:
       | Prediction: Using the methods of "dorking", this is the only page
       | on the internet among 10 million+ results that is calling this
       | "dorking".
        
         | montjoy wrote:
         | I hope it doesn't catch on since it makes me die a little
         | inside. It's a very Reddit-type word though. I can easily
         | imagine it being used by non-technical folk and tech
         | journalists.
        
       | neilduncan wrote:
       | I live two towns over from Dorking.
       | 
       | https://en.wikipedia.org/wiki/Dorking
        
         | tutfbhuf wrote:
         | This is reddit humor, that I sometimes miss here. Thx
         | neilduncan.
        
         | tomalpha wrote:
         | I grew up in Dorking, but this is the first time (that I can
         | remember...) that I actually read its wikipedia article.
         | 
         | TIL: No one knows why 'Dorking' is called 'Dorking', but
         | there's an English Place Names Society which since the 1920's
         | has researched the origins of town names in England, and is
         | considered [0] to be "the established national body on the
         | subject".
         | 
         | [0] https://epns.nottingham.ac.uk/
        
         | aidos wrote:
         | Also weird for me to see the name here (I'm in the next village
         | over), not one you see popping up often. I occasionally wonder
         | how many other HNers there are scattered about in my local area
         | (I suspect not many).
        
         | zeristor wrote:
         | Didn't it feature in "War of the Worlds"?
         | 
         | My Dad worked for Mullard, which was renamed to Philips
         | Electronics and relocated to Dorking.
        
         | chrisb wrote:
         | I live just a few miles to the North. Nice to see a few other
         | ~Dorking locals here :)
        
           | julian_t wrote:
           | Slightly more than a few miles, but still pretty local (KT4)!
        
       | indit wrote:
       | A very comprehensive and frequently updated list is here:
       | https://www.exploit-db.com/google-hacking-database
        
       | harha wrote:
       | I think it would be useful to be able to explicitly search around
       | knowledge graph entities or site topics, e.g. a programming
       | language, a city, a season, without having that single/specific
       | term.
       | 
       | So a search would include all sites related to an entity, say
       | Munich or Python, along with the terms the user is searching for,
       | because a page might not specifically include the entity in its
       | keywords or text, or might use a different language or a
       | synonym.
       | 
       | I'm sure search engines consider this somewhat, but explicitly
       | activating such a feature would be a great improvement for the
       | user.
       | 
       | Stackexchange has this feature with user-curated tags (using
       | []). Would be nice to have in DDG or Google.
        
         | epanchin wrote:
         | Businesses don't game stackexchange.
        
       ___________________________________________________________________
       (page generated 2020-08-09 23:00 UTC)