hngopher.com

       [HN Gopher] We can't all use AI. Someone has to generate the tra...
       ___________________________________________________________________
        
       We can't all use AI. Someone has to generate the training data
        
       Author : redbell
       Score  : 34 points
       Date   : 2023-03-14 21:53 UTC (1 hour ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | ggm wrote:
       | A reminder that as long as it demands training and reinforcement
       | it's almost certainly low on induction and production of new
       | things.
       | 
       | Very artificial. Not very intelligent.
        
         | s17n wrote:
         | Humans need training and reinforcement.
        
           | ggm wrote:
           | Yes, undeniably true. But what they acquire is inductive
           | reasoning skills, and the production of new things.
        
         | catach wrote:
         | Do you have examples of human-created "new things" that aren't
         | essentially novel combinations of old things? Because I come up
         | blank. And this current crop of AI generators is very good at
         | combining old things in novel ways.
         | 
         | I do agree with your general point that these generators aren't
         | really "intelligent", however. Will have to ponder if I agree
         | about the induction bit.
        
       | aidenn0 wrote:
       | ...Tell that to AlphaZero?
        
       | dogman144 wrote:
       | Live validation of Jaron Lanier's siren servers. All this magic
       | is built on the back of free labor, from captchas to duolingo.
        
       | rebelde wrote:
       | Why write your thoughts on the web when AI/GPT is only going to
       | steal and paraphrase it? Nobody sees what you write and everybody
       | thinks GPT is the genius.
        
         | olalonde wrote:
         | Because you can get points on Hacker News.
        
         | raincole wrote:
         | Why write your thoughts on the web when other humans are going
         | to steal and paraphrase it? I mean... you're on HN. Don't tell
         | me you didn't notice people often regurgitate the thoughts of
         | tech influencers like Paul Graham and Joel Spolsky.
        
         | Swizec wrote:
         | Becoming part of the cultural lexicon is the ultimate goal of
         | thought leadership.
         | 
         | Just look at how many people say stuff like "Two women can't
         | make a baby in 4.5 months". Someone (Brooks) had to invent,
         | write down, and popularize that analogy.
        
         | cableshaft wrote:
         | Just saw something today where the wife of TotalBiscuit, who
         | died of cancer several years ago, is contemplating deleting all
         | of his Youtube videos[1] to prevent people from using A.I. to
         | make him say terrible things.
         | 
         | Did give me a bit of pause about putting stuff out there.
         | Although I think I'd still rather have my data be used for
         | training A.I. than not (and I probably am already in the
         | training data anyway, I believe I saw that one of the datasets
         | it's been trained on was Hacker News comments).
         | 
         | [1]: https://kotaku.com/totalbiscuit-john-bain-youtube-delete-vid...
        
         | pklausler wrote:
         | The general problem of "AI"s being trained on copyrighted
         | content needs to be discussed more thoroughly, I think.
        
           | noogle wrote:
           | The current (legal) answer is "unclear". There are
           | indications that training is fine, but producing and using
           | the generated content is questionable at least. As with many
           | IP issues, it will be solved only when someone tries it in
           | court and goes all the way to a verdict. Some cases are
           | actually being processed but it might take years to get an
           | answer.
        
           | bluefirebrand wrote:
           | Every time I bring this up, people accuse me of resisting
           | progress, "the cat's out of the bag", etc.
           | 
           | It has been frustrating.
        
         | SketchySeaBeast wrote:
         | That's why I keep my content as low quality as possible - keeps
         | the machines humble.
        
       | mo_42 wrote:
       | Or the AI will trigger people to provide necessary training data.
       | If I ran OpenAI, I would provide a free version of ChatGPT
       | that is slightly tuned to extract useful knowledge out of the
       | people who use it. There might be adversarial attacks but overall
       | enough people will use it blindly and provide useful information.
       | People even trusted Eliza. Not to mention what we typed into
       | Google.
        
         | ggm wrote:
         | Are you familiar with what is called "the drunkard's walk"?
         | Because if you think stochastic inputs won't, unfortunately,
         | admit less benign paths being taken inside the dataset... I
         | think you're probably wrong.
         | 
         | I have very little doubt the primary problem in the GPT<x>
         | model is going to remain: it is capable of reproducing highly
         | believable crap. In a world of pizzagate, that has a risk of
         | becoming highly weighted "I told you so" and self-reinforcing.
        
       | thelittleone wrote:
       | Does this assume we are not AI?
        
       | Gigachad wrote:
       | Only until we plug it into the real world with sensors and the
       | ability to conduct new research and observations.
        
       | zone411 wrote:
       | Human curation of AI-generated content is the true future.
        
       | ProAm wrote:
       | PG is back on Twitter? I thought he left a month or two ago?
        
         | coldtea wrote:
         | That might have been to jump on the fashionable virtue
         | signalling wave. No longer needed.
        
         | [deleted]
        
       | StrictDabbler wrote:
       | http://ascii.textfiles.com/
       | 
       | Gosh, why would anybody bother archiving Yahoo answers,
       | Angelfire, Geocities, Tumblr, Myspace, Friendster, old BBSes, old
       | Apple II and C64 and PC floppies, Usenet, forums... what value
       | does any of that have?
        
       | s1k3s wrote:
       | We can't all use AI. Only those of you who can afford to pay our
       | subscription.
        
       | voz_ wrote:
       | The more I see of his writing, the less I think of it. I wonder
       | what Diogenes would think of him...
        
         | satvikpendem wrote:
         | Behold, a man.
        
       | pixl97 wrote:
       | For a time. But as we bring audio/visual AI online, it will
       | have another boom of incorporating humanity's data in that form.
       | Then we'll have another boom of AI robot learning by experiment
       | with reality.
       | 
       | After that point it gets tricky to figure out what, if any, booms
       | will be next. When you get near AGI lots of horizon problems crop
       | up.
        
       | GaggiX wrote:
       | People will generate the dataset using AI tools too. You can
       | create garbage with or without AI, and you can create useful data
       | with or without AI.
        
       | advisedwang wrote:
       | A lot of the ground truth for AIs (and it's not just training
       | data - it's also ongoing validation of quality) is coming from
       | companies like Appen, Sama, DefinedCrowd, Q Analysts and many
       | others. There's a lot of variation, but the trend is moving
       | towards low-wage/gig work/outsourcing.
       | 
       | I think Paul means someone will be writing content, but whatever
       | the form, it's going to be a whole class of low-wage workers
       | enabling tech from here on.
        
       | jgrahamc wrote:
       | This is kind of why I created https://lowbackgroundsteel.ai.
        
         | Mordisquitos wrote:
         | I have to say, I love the analogy you used for the name.
        
       ___________________________________________________________________
       (page generated 2023-03-14 23:00 UTC)