hngopher.com

       [HN Gopher] Generative AI could make search harder to trust
       ___________________________________________________________________
        
       Author : jedwhite
       Score  : 118 points
       Date   : 2023-10-05 17:13 UTC (5 hours ago)
        
 (HTM) web link (www.wired.com)
 (TXT) w3m dump (www.wired.com)
        
       | anjel wrote:
       | More than Pinterest?
        
       | throwawaaarrgh wrote:
       | Why are people calling them hallucinations and not just errors,
       | flaws or bugs? You can't hallucinate if all of your perception is
       | one internal state. Chatbots don't dream of electric sheep.
        
         | crispycas12 wrote:
         | Personally, I think confabulations would be a better term. To
         | the best of my understanding, these AI rely on a model similar
         | to the reconstructive theory of memory in humans. The
         | connotation of the word confabulation indicates no
         | maliciousness while highlighting the erroneous nature of the
         | action.
        
       | IronWolve wrote:
       | it's almost like AI just repeats data it's fed on, even incorrect
       | data, without any real intelligence to determine if the data is
       | correct.... /s
       | 
       | It's not simply garbage in garbage out. There is no logic to
       | verify and analyze the data. You are simply told what is popular
       | in the data.
        
         | lazide wrote:
         | Unfortunately, that is also a sizable portion of the human
         | population. AI definitely does it cheaper and at larger scale
         | though!
        
           | yetanotherloser wrote:
           | I've definitely met a lot of people who fail the GPT test.
        
         | aconsult1 wrote:
         | All of a sudden the saying "eat your own dog food" takes a
         | twist and is no longer fun.
        
         | smt88 wrote:
         | AI doesn't "just" repeat data. You can feed an LLM 100%
         | fact-checked data and it'll still hallucinate.
         | 
         | It's a core problem with generative AI and it can't be solved
         | with better data.
        
       | zpeti wrote:
       | Here's what people don't understand: this is mostly good for
       | google.
       | 
       | The worse organic results are, the more people will click on paid
       | links. This is WHY everyone on HN is complaining about search
       | results, because google doesn't really have an incentive to give
       | you really good results. They only need to be good enough to keep
       | 95% of the population still using google, but mostly expecting
       | the good results to be ads.
       | 
       | Google ads are the equivalent of verification on FB and X. They
       | just call it something different. The verified, high quality
       | results will be paid.
        
       | tivert wrote:
       | We did it guys! We're definitely heading into a new era, one
       | perfected by software engineers. I can't wait!
        
       | jowea wrote:
       | AI powered citogenesis!
       | 
       | I'm starting to wish articles had inline citations as a standard.
        
         | dredmorbius wrote:
         | Inline as opposed to hyperlinks?
         | 
         | Or would footnotes / sidenotes be acceptable?
        
       | faizshah wrote:
       | I started to go down a line of thinking where I think we might
       | see a return to books in the next 3-5 years. The reason is that
       | with a book it's a big collection of knowledge and people can
       | post reviews about the quality of the book whereas on the web you
       | have no way of knowing what the quality of an article will be
       | anymore.
        
         | klyrs wrote:
         | Only, amazon is now flooded with crapbooks written by
         | artificial psychonauts and also reviews written by artificial
         | psychonauts.
        
       | notamy wrote:
       | https://archive.ph/2023.10.05-165142/https://www.wired.com/s...
        
       | infoseek12 wrote:
       | Leaving aside the article to discuss the source for a moment.
       | When did Wired become so antitech?
       | 
       | There are good critical viewpoints but most of the articles they
       | are putting out at this point read like bitter diatribes. Which
       | is a shame because they used to be an excellent publication.
        
         | cr__ wrote:
         | People are generally more cognizant of the harms caused by the
         | tech industry than they were even a few years ago.
        
           | LikelyABurner wrote:
           | You can find a plethora of critical viewpoints on Hacker News
           | and the various blogs it links to which are well cognizant of
           | the dangers of the tech industry.
           | 
           | The problem isn't that Wired is critical, it's that they've
           | gone weirdly reactionary and their writing has gone so mass
           | market dumbed down that Some Random Guy's Blog is likely to
           | have a better written and researched viewpoint.
        
             | lazide wrote:
             | They probably laid off almost everyone but some burnt out
             | interns.
        
               | robinsonb5 wrote:
               | Plot twist: maybe the article was written by ChatGPT!
        
               | lazide wrote:
               | Better than 50/50 odds I'm guessing
        
           | thejazzman wrote:
           | This.
           | 
           | The academic internet of the 90s is so far gone and while
           | we're seeing a lot of magic lately, it's magic available to
           | literally everybody for any and every purpose.
           | 
           | We're rapidly seeing how boring and disappointing that is :(
        
         | illwrks wrote:
         | Putting journalists out of work I guess?
        
       | salynchnew wrote:
       | Recently an article came out where someone said that the company
       | I work for is a big user of WebAssembly, but the reality is that
       | we don't use it.
       | 
       | After finding the contributed article (on a well-known news site,
       | not Wired though), it looks like a tech founder might've been
       | using ChatGPT to write an article about the uses for WASM. The
       | arguments were generally sound, but I don't think that anyone did
       | the work to manually check any of the facts they presented in it.
        
         | notabee wrote:
         | This is kind of like the advent of spellcheck, where a whole
         | class of errors started to appear regularly in almost every
         | article because publishers stopped paying for the human labor
         | to manually review for things like homonym or word ordering
         | errors. Except much worse, because it could allow spurious or
         | even harmful facts to accrue and spread instead of just
         | grammatical mistakes.
        
       | lykahb wrote:
       | The SEO garbage has been poisoning the search for years. Even
       | before the chatbots it got to the point where most top results are
       | crap. The LLMs can surely make it much worse, though.
        
         | hashtag-til wrote:
         | I think this is a given these days. LLMs likely will become the
         | new single point of failure for search.
         | 
         | This is too much of a temptation for the SEO scum to resist.
        
       | abujazar wrote:
       | «Could»? Google has already been doing this for quite some
       | time, at least in my region (Norway), and I'd say more than half
       | of the suggestions Google provides as top results are false.
        
       | nonrandomstring wrote:
       | More amusing and frightening is when people search about
       | themselves and turn up AI generated crap. Googling yourself was
       | always a lucky grab bag, with the possibility of long-forgotten
       | embarrassments being dragged up. But at least you'd have to face
       | facts.
       | 
       | Now I hear of people discovering they're in prison, married to
       | random people they've never met, or are actually already dead.
       | 
       | What is this going to do to recon on individuals (for example by
       | employers, border agents or potential romantic partners) when
       | there's a good chance the reputation raffle will report you as a
       | serial rapist, kiddy-fiddler or Tory politician?
        
         | vorpalhex wrote:
         | This is a new way to be anonymous too. Someone posts something
         | true but nasty about you? Have LLMs cook up dozens of
         | preposterous stories - you're secretly a rodeo clown, you write
         | children's books, you built a castle in Rome, you once drank a
         | goldfish, etc.
         | 
         | Increase noise to drown signal.
        
           | kr0bat wrote:
           | This is essentially the service Reputation.com claims to
           | provide. Jon Ronson's "So You've Been Publicly Shamed"
           | describes how the site games SEO to flood the search results
           | of controversial figures with banal nothing posts[1]. The
           | difference being that actual humans had to create that
           | content.
           | 
           | In the near future, the web could become opaque with LLM
           | schlock, but at least it may grant people a right to be
           | forgotten.
           | 
           | [1]https://www.businessinsider.com/lindsey-stone--so-youve-
           | been...
        
           | acomjean wrote:
           | I think Boris Johnson tried that by saying out of the blue
           | that he makes model buses. There was some thinking at the time
           | that he didn't want the Brexit bus to show up in searches and
           | was trying to game search results.
           | 
           | I don't think it worked.
        
         | JohnFen wrote:
         | > Now I hear of people discovering they're in prison, married
         | to random people they've never met, or are actually already
         | dead.
         | 
         | My real name is very, very common -- so this has been my
         | reality for my entire life.
         | 
         | These days, I have grown to appreciate it. It's like an
         | invisibility superpower.
        
       | p0w3n3d wrote:
       | And entropy rises... people thought AI will kill us with machine
       | guns. AI will kill us by making us super stupid...
        
         | euroderf wrote:
         | I have already externalized my to-do lists and other reminder
         | lists to teh interwebz. I can't wait to outsource my faculties
         | for reasoning too.
        
           | ChatGTP wrote:
           | And it's only $20 a month and it's useful !! I'm using it
           | eVerYdAy to save hOuRS!!!
        
       | 23B1 wrote:
       | That really sucks for all the people whose job it is to make
       | search impossible to trust already /s
        
       | pseudosavant wrote:
       | I wonder if there will be a human information/knowledge
       | equivalent of low-background steel (pre-WWII/nukes). Data from
       | before a certain point won't be 'contaminated' with LLM stuff,
       | but it'll be everywhere after that.
       | 
       | https://en.wikipedia.org/wiki/Low-background_steel
        
         | thih9 wrote:
         | I wonder how we'd test for AI contamination. And would there be
         | attempts to sell a larger data set, one that pretends to be
         | human generated, but instead is padded with some AI content.
         | 
         | Does this mean we'd end up with a finite set of verified human
         | only data?
         | 
         | Would people start going through all kinds of offline archives
         | via AI-gapped means, trying to uncover and document new sources
         | of human input?
        
         | ryanklee wrote:
         | People are vastly overestimating how unique this problem of
         | hallucinations is.
         | 
         | It seems to me it relies mostly on discounting just how much
         | we've already had to deal with this same problem in humans over
         | the millennia.
         | 
         | The problem of proliferation of bad information might be
         | getting worse, but this isn't native to generative AI. The
         | entire informational ecosystem has to deal with this. GPTs
         | compound the issue, but as far as I can tell, nowhere near
         | what social media has forced us to deal with.
        
           | cscurmudgeon wrote:
           | How do we know you are not hallucinating this comment?
        
           | blibble wrote:
           | humans can only produce semi-convincing bullshit at a limited
           | rate
           | 
           | with AI this limit is all but removed
           | 
           | all the human generated bullshit ever created will soon be
           | dwarfed by what AI can vomit out in an hour
        
             | HappyDaoDude wrote:
             | Like most things in the world. The problem isn't
             | necessarily the technology but the scale at which it is
             | implemented.
        
           | wellthisisgreat wrote:
           | Yeah if you think about it, there is no history for example,
           | as all we have in that domain is just someone's perspective
           | on some events. They may or may not have agenda but that's
           | beside the point.
           | 
           | That soft data could never have been trusted; the information
           | that can be verified (calculations etc.) seems safe from LLMs.
        
           | BobaFloutist wrote:
           | The thing is when you call a human on bullshit, they usually
           | can't back it up well enough to pass the smell test. When you
           | call an AI on bullshit it can instantly fabricate plausible,
           | credible seeming sources/evidence.
           | 
           | A human's lie is different than an AI's hallucination, since
           | it's still based on (distorting) the truth, whereas the
           | hallucination is based on an invented reality (yes I know
           | it's applied statistics and there's no true model of the
           | world in there, but it can report as if there is)
        
             | ryanklee wrote:
             | Intelligent people can fill the void of ignorance with
             | plausible sounding but factually incorrect information.
             | They are apt to engage cognitive biases in such a way that
             | the biases produce assertions that are deeply
             | indistinguishable from factual assertions. They fool
             | themselves in this way and they fool others. This happens
             | all the time.
             | 
             | LLMs are no different in this respect.
        
             | gyudin wrote:
             | It's not a big deal, there are many ways to handle it. It
             | just has some overhead costs. LLMs that are offered to
             | general public are more of a POC and they are making sure
             | to use as little resources as possible.
        
         | Agree2468 wrote:
         | Right now is the best time to buy encyclopedias.
        
         | dotnet00 wrote:
         | In some ways it already is that way. If I come across an artist
         | I suspect is passing off AI generated stuff as their own
         | (without using the tagging features the site has to indicate as
         | much), an easy test is to just check if they've been posting
         | since before ~2020. If they have, and the style has
         | recognizable similarities, it's clear that it's honestly human
         | made or at most blends characteristics of both together.
        
         | BitwiseFool wrote:
         | Those simple web 1.0 sites made by college professors are a
         | gold-standard in my book. I always enjoy finding them in search
         | results. Although they are becoming increasingly rare.
        
           | dredmorbius wrote:
           | Unfortunately, that's a trivial signal to emulate.
           | 
           | At a minimum, you'd have to validate them by confirming
           | existence in the Wayback Machine.
           | 
           | Otherwise agreed that those are indeed high-signal documents.
           | Increasing reliance on integrated educational software means
           | that even such things as online syllabi are increasingly
           | rare.
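
A minimal sketch of the Wayback Machine check suggested above, assuming
the public availability endpoint at archive.org/wayback/available and
its JSON shape (archived_snapshots.closest with available, timestamp and
url fields). The function names, the cutoff and the sample response are
illustrative and do not come from the thread. The endpoint returns the
snapshot closest to the requested timestamp, so a negative result is
inconclusive rather than proof that a page is new.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumed endpoint: the Internet Archive's availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def query_wayback(url: str, timestamp: str) -> dict:
    """Ask for the snapshot of `url` closest to `timestamp` (YYYYMMDD...)."""
    qs = urllib.parse.urlencode({"url": url, "timestamp": timestamp})
    with urllib.request.urlopen(f"{WAYBACK_API}?{qs}", timeout=10) as resp:
        return json.load(resp)


def closest_snapshot(response: dict) -> Optional[dict]:
    """Return the closest snapshot record, or None if nothing is archived."""
    snap = response.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap
    return None


def predates(response: dict, cutoff: str) -> bool:
    """True if the closest snapshot was captured at or before `cutoff`.

    Snapshot timestamps are 14-digit strings (YYYYMMDDhhmmss), so padding
    the cutoff to 14 digits makes plain string comparison chronological.
    """
    snap = closest_snapshot(response)
    if snap is None:
        return False
    return snap["timestamp"] <= cutoff.ljust(14, "0")


# Illustrative response only (hypothetical page, not real data).
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20050101000000/"
                   "http://example.edu/~profname/",
            "timestamp": "20050101000000",
            "status": "200",
        }
    }
}
print(predates(sample, "2020"))
```

In real use one might call predates(query_wayback(page_url, "20200101"),
"2020"); as noted in the thread, this only raises the cost of faking an
old-looking page, it does not prove authorship.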
        
             | LordDragonfang wrote:
             | The type of sites GP is talking about are typically hosted
             | on .edu servers, under faculty webhosting (often featuring
             | a "/~profname/" in the url). That's a non-trivial signal.
        
               | dredmorbius wrote:
               | ~/name at an edu is pretty attainable.
               | 
               | .edu domains can be had for any otherwise eligible
               | "U.S.-based postsecondary institutions" per Educause:
               | <https://net.educause.edu/eligibility.htm>
               | 
               | Pages at _extant_ domains might variously be available to
               | undergraduate or graduate students, faculty, staff, and
               | adjuncts. Those might either directly host emulative
               | material or be convinced or compromised into hosting
               | content.
               | 
               | If there's one thing that the Internet's history to date
               | has proved, it's that perverse incentives lead to perverse
               | consequences.
        
               | l33t7332273 wrote:
               | It is not easy for a regular person to obtain access to a
               | .edu webpage.
        
               | [deleted]
        
           | heavyset_go wrote:
           | Can't prove it, but it seems to me like black text on white
           | background sites from the past are poorly ranked compared to
           | sites with "modern" layouts.
        
             | hashtag-til wrote:
             | Yes. I love black text on white background. A rare find
             | these days.
             | 
             | Browsing today is like: "You ask for a spaghetti recipe and
             | the page tells you the whole history of civilization."
        
               | zeroonetwothree wrote:
               | That's specific to recipes because they can't be
               | copyrighted
        
               | hashtag-til wrote:
               | I had a look and definitely learned something today so
               | #til.
               | 
               | Also, note to self to collect my favourite recipes in
               | markdown files from now on.
        
           | MrVandemar wrote:
           | search.marginalia.nu is a great place to find those sites,
           | and some more interesting stuff besides.
        
         | DayDollar wrote:
         | There will be a web of trust, with a valuation of nodes by
         | trustworthiness. And people will get only one id for this. One's
         | name is one's value and a reputation will be a hard earned thing
         | again.
        
           | ratg13 wrote:
           | This was how the "internet" functions in the book "Ender's
           | Game".
           | 
           | There is a small sub-plot about how he had to give a fake
           | persona credibility on the untrusted network in order to be
           | able to leverage it when creating a fake account on the trusted
           | network.
        
             | dredmorbius wrote:
             | I find the xkcd interpretation more realistic:
             | <https://xkcd.com/635/>
             | 
             | Explained: <https://www.explainxkcd.com/wiki/index.php/635:
             | _Locke_and_De...>
        
               | notahacker wrote:
               | I love that interpretation, but in today's retweet driven
               | world of political commentary, I actually find it quite
               | plausible that pseudonymous kids with no grasp of the
               | real world who _think_ rational political debate is the
               | nonsensical slogans they're spouting on the internet
               | become major Twitter influencers that actual politicians
               | want to court for their "authenticity" and "willingness
               | to say the unsayable", and maybe their dank memes.
        
               | dredmorbius wrote:
               | The conceit of _Ender's Game_ was that _thoughtful_
               | discourse would be influential online.
               | 
               | Reality has largely demonstrated that what prevails is far
               | more thoughtless: propaganda of the Big Lie; the Firehose of
               | Bullshit (or Falsehood), associated with Russia; floods of
               | irrelevance which tend to bury more significant stories,
               | favoured by China; and outrage / hot-button topics, which
               | are common in US-centric media, though a timeless
               | technique.
               | 
               | Memes and simple messages attract attention and spread.
               | Complex narratives and analyses ... not so much.
               | 
               | But yes, voices that deserve no attention whatsoever have
               | dominated the media landscape of the past decade or so.
               | Not that this is _entirely_ novel.
        
           | carlosjobim wrote:
           | Isn't this how it has been since the dawn of time?
        
         | RandomWorker wrote:
         | My sense is that the way to avoid this is to have a personal
         | blog.
         | 
         | That being said, how many people write blogs with Grammarly or
         | ChatGPT these days? The temptation to use these technologies
         | all the time is too strong for even self preservation of your
         | own (writer's) voice.
         | 
         | My sense is that if you use this technology you might be happy
         | with the results at first, but on later review you just notice
         | something off in some sentences and maybe it just doesn't flow
         | right. I'm not convinced that it will replace writers' jobs yet.
         | Especially when you want to create something authentic and
         | unique.
        
           | pseudosavant wrote:
           | Sometimes the value is specifically because my voice won't
           | come through. When I'm stressed and being asked for
           | unreasonable things at work, I know that I tend toward
           | passive aggression. But professionally, that isn't the way I
           | want my message to come across.
           | 
           | I use ChatGPT all the time to suggest how I could make sure
           | something isn't passive aggressive. It'll point out parts
           | that read as aggressive and suggest changes. It can be for a
           | short Slack message, or a many-paragraph message.
        
           | floren wrote:
           | I have definitely read "blogs" written by stitching together
           | LLM outputs. For years people were advised that a technical
           | blog "looks good on a resume" so we saw lots of lightly
           | rewritten Stack Overflow content. Now it's gotten easier.
        
           | tredre3 wrote:
           | > The temptation to use these technologies all the time is
           | too strong for even self preservation of your own (writer's)
           | voice.
           | 
           | I don't know about that. I have played with
           | ChatGPT/Copilot/etc enough to know what they're capable of
           | doing. But the thing is, I enjoy programming. I enjoy
           | breaking down a problem and solving it with code. I enjoy
           | crafting elegant code. So I don't use AI even though I'm
           | fully aware it could save me hours on projects. Why? Because
           | I enjoy those hours very much.
           | 
           | Why am I telling you all this? Because I suspect many writers
           | are the same and personal blogs are their canvas. They enjoy
           | communicating. They enjoy crafting articles. They might have
           | AI proof-read them, but they won't let them write everything.
           | So, to me, there is hope that personal blogs will maintain
           | their human element, as opposed to news websites or tabloids
           | or learning platforms.
        
             | steelframe wrote:
             | > So I don't use AI even though I'm fully aware it could
             | save me hours on projects.
             | 
             | Enjoy this luxury while it lasts. Based on what I have seen
             | in performance review committees for software developers,
             | your peers who drive results faster than you do because
             | they use AI will be rewarded more and will be more likely
             | to survive rounds of layoffs when they inevitably happen.
        
               | JohnFen wrote:
               | That's fine. I genuinely wouldn't want to continue
               | working in an industry that worked like that anyway, so
               | I'd just quit and keep on programming with my own
               | projects. So that luxury will last as long as I want it
               | to.
        
             | SoftTalker wrote:
             | Agree. I've never even looked at any of these AI tools. I
             | enjoy the process and the challenge of programming, and the
             | rewards of doing it well. I have no desire for someone or
             | something else to write code for me.
        
         | robinsonb5 wrote:
         | I suspect in the coming years the Wayback Machine at
         | archive.org will become ever more important - always assuming
         | it's not lost as collateral damage in their copyright battles.
         | Indexing that dataset and making it searchable would massively
         | increase its value.
         | 
         | My inner conspiracy theorist can't help wonder if the continued
         | reduction in search usefulness isn't part of an ongoing
         | deliberate disempowerment of everyday people - but my rational
         | side says it's merely an unfortunate emergent behaviour of the
         | systems we've built.
        
         | carlosjobim wrote:
         | The shadow libraries.
        
         | datadrivenangel wrote:
         | There's the branch of philosophy called epistemology.
        
       | LetsGetTechnicl wrote:
       | Just another reason that I consider generative AI to be a lot
       | like crypto. A lot of talk about it being the future but really
       | only turns out to be dangerous or useless. I find it incredibly
       | irresponsible that companies are shoving their latest AI tech
       | into all their products when it's still unproven.
        
         | stevenwoo wrote:
         | One thing I've noticed about simple one word searches on Bing
         | now - a lot of times it just errors out and closes the Bing app
         | tab you've opened with no explanation to the user. This only
         | started happening after they pushed the AI driven search
         | narrative to make you use it in the app, so apparently single
         | word searches are too much somehow for their version of AI to
         | handle.
        
         | happytiger wrote:
         | AI has so completely disrupted search that it's destroyed
         | leading platforms' effectiveness in a matter of months.
         | 
         | But because of its current lack of optimization for accuracy,
         | we shouldn't consider it disruptive because it's not yet proven
         | technology?
         | 
         | You can call it dangerous but you can't call it useless. It's
         | also only going towards improvement from here, including
         | drastic reductions in hallucinations.
         | 
         | You have to remember too that AI models are generally
         | attempting to interpret the intent behind the prompt, so many
         | of these crazy articles are happening because people aren't yet
         | good at writing clear instructions for AI and AI isn't yet
         | mature enough to disambiguate poor instructions in its output
         | and is trying to deliver on unclear instructional intents.
        
           | [deleted]
        
           | 12_throw_away wrote:
           | > It's also only going towards improvement from here
           | 
           | Why?
        
             | 0xEFF wrote:
             | See for yourself, 4.0 is clearly improved over 3.5.
        
               | ChatGTP wrote:
               | True, 5 is a bigger number than 4 so logically it makes
               | sense.
        
         | pseudosavant wrote:
         | Except, unlike crypto, ChatGPT helps me with real day-to-day
         | things that I easily get at least $20/month of value from.
        
       | figassis wrote:
       | I think we all saw this coming, talked about it, articles were
       | published even...but now it's news
        
       | gumballindie wrote:
       | The correct term is spamming. People are using these text
       | generators to spam everyone and everything under the sun. It will
       | be detrimental to the internet as many people will just give up on
       | this huge pile of ... spam.
        
       | kiernanmcgowan wrote:
       | Without naming the company, I have seen specific examples of blog
       | posts being written by AI, hallucinating a "fact", and then that
       | "fact" re-surfacing inside of Bard.
       | 
       | It's xkcd's Citogenesis automated and at internet scale
       | https://xkcd.com/978/
        
       | mattlondon wrote:
       | Or to use the technical term: "shat the bed". Welcome to the
       | future.
        
       | Condition1952 wrote:
       | Please get your answers from Anna's Library
        
       | abruzzi wrote:
       | I have to say--the opening paragraph doesn't describe a reality
       | I'm familiar with:
       | 
       | >Web search is such a routine part of daily life that it's easy
       | to forget how marvelous it is. Type into a little text box and a
       | complex array of technologies--vast data centers, ravenous web
       | crawlers, and stacks of algorithms that poke and parse a query--
       | spring into action to serve you a simple set of relevant results.
       | 
       | Web search has, for me, become a nasty twisted hall of mirrors
       | well before generative AI. I almost never get fed relevant
       | results, I almost always have to go back and quote all my search
       | terms because the search engine decided it didn't really need to
       | use all of them (usually just one.) The only difference is the
       | poison was human generated. Generative AI will simply erase the
       | 5% of results that might give me an answer quickly.
        
         | meowface wrote:
         | I've had the exact same experience. That said, when I do add
         | all the right quotes and conditions to the query to filter out
         | the blog/newsspam drivel, I still - usually - eventually - get
         | pretty good results. Sometimes I have to switch to Bing or even
         | Yandex, but it's rare.
         | 
         | Adding "reddit" to queries can be pretty useful. You're prone
         | to get terrible, inaccurate information since it's just random
         | people on an internet forum, but at least it's (usually) actual
         | humans and not blogs trying to SEO-game. (Though one big caveat
         | is searching for products/services. Lots of threads full of bot
         | accounts writing "[link] has been the best [thing], in my
         | experience". They're usually easy to spot, but sometimes they
         | do seem pretty natural until you check the post history.)
        
           | ryandrake wrote:
           | > You're prone to get terrible, inaccurate information since
           | it's just random people on an internet forum, but at least
           | it's (usually) actual humans and not blogs trying to SEO-
           | game.
           | 
           | Less and less so. Reddit has always had a bot problem, but it
           | seems to be getting exponentially worse lately. Not just
           | article reposters, but comment reposters, bots that reverse
           | images and videos just to repost, seems like it's at least
           | 75% bot content now.
        
         | bnralt wrote:
         | Not only that, but you're also left with the issue of parsing
         | what someone else has written. Even when using answers I find
         | from web searches, I often drop results into ChatGPT so I can
         | get a rough idea of what the person is trying to say first, or
         | check if it agrees with my understanding of what's being said.
        
         | jfengel wrote:
         | I experience that when I try to google for technical problems
         | I'm having at work, but otherwise searches still go pretty well
         | for me.
         | 
         | I just had to google a bunch of races that I wanted to run. The
         | top result was always the event's own web site.
         | 
         | When I google some news, relevant news articles always come up.
         | 
         | The last search I did was for how to display a ket vector in
         | LaTeX. The top result was the StackExchange article with the
         | right answer.
         | 
         | From what I see, certain domains seem to be targeted for
         | exploitation. Programming questions seem to be high up on the
         | list. I wonder if that skews HN readers' perceptions.
        
           | JSavageOne wrote:
           | Google search to retrieve specific factual information is
           | pretty good.
           | 
           | Google search to retrieve anything opinion related has been
           | horrible and infested with blogspam for years (hence people
           | searching Reddit to get that kind of info).
        
             | jamal-kumar wrote:
             | Really? I've been finding it doesn't even find stuff it
             | used to in certain documentation (I'm talking like things
             | it found maybe a year ago), "searching in quotes for this
             | stuff", things that other search engines (bing, kagi) are
             | indexing just fine - And since I've switched to using these
             | engines more when I'm searching things for programming
             | work, it's definitely been a lot more helpful than google
             | which often just seems to be missing a ton now
        
             | jfengel wrote:
             | I suppose it never occurs to me to search for opinions. I'm
             | not even sure how I'd go about it, even if search weren't
             | broken. Blogspam is what I'd expect to see.
             | 
             | I'm more likely to start at a place that aggregates reviews
             | and try to hallucinate which ones were written by people
             | who know what they're talking about. That usually seems to
             | work.
             | 
             | I imagine that somewhere out there is a person who bought
             | the product and reviewed it on their blog or made some
             | enthusiastic social media post about it, and that's what
             | you'd want to locate were it not for the spam. But I don't
             | expect any search engine to be able to find it for me.
        
             | fnordpiglet wrote:
             | Google search to retrieve product marketing pages is pretty
             | good. Specific factual information searches lead to product
             | marketing pages. Opinion searches lead to product marketing
             | pages.
             | 
             | Google is a giant adware tool that's been taken over by
             | adware SEO sites. The example given - find the product
             | marketing pages for some races - falls directly in its
             | sweet spot. If you venture outside it'll do its best to get
             | you back into the product marketing sweet spot, and the SEO
             | companies of the world take care of the rest.
             | 
             | Search is a lost cause.
        
         | icyberbullyu wrote:
         | As someone who has been using search engines since the 90's,
         | I've found that the "old-school" way of formatting your search
         | almost like a database query has gotten significantly worse. It
         | seems like search engines are geared more towards natural
         | language queries now; probably because the old Google-Fu way of
         | doing things wasn't very friendly for people who didn't use
         | computers regularly.
        
           | klyrs wrote:
           | My understanding is that google went from a more traditional
           | database style which supported such queries, to a newer
           | "n-gram" index with a layer of semantic similarity. Notably,
           | you can no longer put a sentence in quotes to only find pages
           | that contain that exact phrase. Also, the order of words
           | matters more now than it used to (where the old search
           | engines treated a space as AND, so order was irrelevant
           | outside of quotes)
        
             | saalweachter wrote:
             | https://www.google.com/search?q=%22you+can+no+longer+put+a+
             | s...
        
               | klyrs wrote:
               | Hah, perhaps I should edit that to say "reliably."
        
           | interstice wrote:
           | If someone brought back a search engine like this I'd happily
           | use it
        
         | heavyset_go wrote:
         | Sounds like a success if that means people see more ads while
         | trying to find what they actually searched for.
        
           | __loam wrote:
           | Nationalize Google.
           | 
           | Nothing will change as long as search is optimized for
           | revenue over user value.
        
         | loupol wrote:
         | Agreed that web search quality has been deteriorating since
         | much earlier than LLMs gaining popularity.
         | 
         | Interestingly, we are in a spot right now where I feel that for
         | certain types of queries LLMs can outperform search engines.
         | But from what is shown in the article, it seems like that state
         | might only be temporary, and that in the same way that shitty
         | content farms mastered SEO and polluted search results, we
         | might see the same happening with LLMs that have access to the
         | Internet.
        
       ___________________________________________________________________
       (page generated 2023-10-05 23:00 UTC)