[HN Gopher] Tortured phrases: A dubious writing style emerging i... ___________________________________________________________________ Tortured phrases: A dubious writing style emerging in science Author : DanBC Score : 137 points Date : 2021-08-08 15:54 UTC (7 hours ago) (HTM) web link (www.nature.com) (TXT) w3m dump (www.nature.com) | vmilner wrote: | I've a horrible premonition that the paper describing this | problem (and those that cite it) may eventually end up being | flagged for containing too many tortured phrases... | PhasmaFelis wrote: | I've been seeing this in news articles as well. Swipe someone | else's article, run it through a synonym-replacer algorithm, and | have Reddit bots post it on a bunch of news subs. Presumably the | thesaurus work fools Google's just-a-copy detector. | | It's the next step in clickbait monetization. Why settle for low- | effort content when you can have _no_ -effort content? | lettergram wrote: | This is pretty much how corporate news works imo. I can't tell | you how many times I've seen one article then generate a | million more. | | My favorite example, go to google or DuckDuckGo and type: | | "Xxx number hospitalized" or "yyy new cases" | | You can type almost any number and get a ton of articles. Not | exactly a reprint, but they all seem almost generated | newsclues wrote: | Next will be a hybrid model where no effort content that begins | to trend virally gets a human to tweak it for optimization. | | Rewriting headlines that bots wrote and A B testing humans vs | Software | withinboredom wrote: | Even more entertaining would be all the traffic being from | bots trying to do the same thing. | coldpie wrote: | Thanks to advertising as a business model. | wolverine876 wrote: | Could you share any examples? | im3w1l wrote: | I wish they had kept the method secret. Getting these papers | retracted is less valuable than being able to secretly keep tabs | on them. 
| _Microft wrote: | If they are not retracted, they might get cited by other works | which themselves might get cited. Suddenly this faked, | nonexistent research has been "laundered" into mainstream and | nobody knows anymore that there was a problem in the first | place. | WesolyKubeczek wrote: | I'm wondering what happened to good old reading with | comprehension. Ain't nobody got no time for that? If so, | doesn't it make those papers worthless? | eecc wrote: | Nope. Time is the only non-fungible asset being burned here | and everyone is desperately defending their own allotment. | WesolyKubeczek wrote: | Then can we at least draw a border around such "science" | so that serious people who have work to do know to not | waste time with it? | wmf wrote: | We already know which venues are legit (because we've | heard of them) and which aren't (because we haven't). | twirlock wrote: | >how to excuse an intelligentsia which manages the public by | simply lying its ass off | waterhouse wrote: | What would be cool is if they'd figured out two methods, and | only published one. | | Though if they've published a convenient list of the bad | papers, then, assuming other markers exist, that makes it easy | for others to discover them. | bonniemuffin wrote: | Maybe they did. | maficious wrote: | As much as it is sad that such a thing is happening, this is | hilarious. | ipsum2 wrote: | A high profile case (on the internet) similar to the one | described in the article is when Siraj Raval plagiarized a paper | on quantum ML and made some amusing replacement phrases: | | complex Hilbert space -> Complicated Hilbert space | | Quantum gate -> Quantum door | | https://www.theregister.com/2019/10/14/ravel_ai_youtube/ | varjag wrote: | First thought after the opening paragraph, "these have to be | mainlanders". Scrolling down, yup. 
| FabHK wrote: | Pertinent passage from the preprint: | | > Out of 404 papers accepted in less than 30 days after | submission, 394 papers (97.5%) have authors with affiliations | in (mainland) China. Out of 615 papers of which editorial | processing time exceeded 40 days, 58 papers (9.5%) only have | authors with affiliations in (mainland) China. This tenfold | imbalance suggests a differentiated processing of papers | affiliated to China characterised by shorter peer-review | duration. | mrfusion wrote: | I wonder if any phrases or styles could detect groupthink or | studies following the crowd. | neoCrimeLabs wrote: | I'm very tempted to introduce tortured phrases at work for | occasional humor. For example, who needs "continuous integration" | when you have "ceaseless incorporation"? Sometimes it's nice to | see if anyone reads my notes. | | In all seriousness though, I've experienced something similar | before at a Japanese-run American corporation as far back as the | 90's. The combination of jargon with executives and executive | assistants who didn't know American tech-jargon often resulted in | accepting mangled suggestions by the spell-checker. A notorious | example was the "Data Whorehousing" presentation, which somehow | made it through several reviews and rehearsals before being | presented to the entire American IT department at an all-hands | meeting. | dmos62 wrote: | I feel like only the highest profile journals can be trusted at | this point. How long will it take academia to adapt? | 08-15 wrote: | Why do you feel that, though? | | My favorite counterexample is "A Draft Sequence Of A | Neandertal Genome". The article was accepted by both Nature and | Science _before it was written_. The authors chose to publish | in Science, because Science offered more on the side: the title | page and an unlimited(!) number of "contributed" (this means | unreviewed) companion papers. 
The article itself was about 20 | pages of drivel; all the substantial content was relegated to | the 200(!) pages of "Online Supplemental Material". Nobody ever | read, let alone reviewed, all of that. | | After that, I can't trust either Science or Nature, which | offered pretty much the same crooked deal. If those two aren't | "highest profile", who is? | robwwilliams wrote: | Not even those! The impact factor of a journal is a terrible | guide to quality. It is more appropriately thought of as a | measure of scientific sex appeal. | | You must read each paper to judge its merits. Lots of junk gets | published in top ranked journals. | nick__m wrote: | > Lots of junk gets published in top ranked journals. | | A lot more gets published in vanity journals, so I use the | impact factor as a first pass filter: I avoid papers from | journals not listed in the JCR [1] or those with a factor below | 1.000. | | I assume, maybe naively, that if an important finding were to | be published in such a low-quality journal, it would eventually | get published in a more legit publication. | | [1] https://www.researchgate.net/publication/342623066_Journal | _C... | raincom wrote: | I thought top ranked journals have good reviewers, since the | editorial board consists of researchers/professors from top | notch schools. Can you share your thoughts on why junk gets | published in such journals? Does it have to do with collusion or | reputation-laundering or more? | wmf wrote: | There's an order of magnitude difference between the worst | paper published in a good venue vs. the "tortured" fake | papers in fake journals though. | AlexCoventry wrote: | That's a low bar, though. The point is that it's very | difficult to judge the scientific merits of a paper without | actually reading it. (And even then, it's easy to be | fooled.) | geofft wrote: | This is an Elsevier journal. Due to a mistake by Elsevier, | these papers were published without review. 
| | University libraries who continue to pay Elsevier should know | that they are propping up scammers and grifters. | FabHK wrote: | For the journal in question, its "Journal Impact Factor | increased from 0.471 to 1.161 over 2015-2019, that is a 146% | increase over four years" | | Would that be considered good? | sampo wrote: | > I feel like only the highest profile journals can be trusted | at this point. | | The highest profile journals ( _Nature_ , _Science_ , _The | Lancet_ in medicine, ...) have some tendency to go for | sensationalism. They want to publish radical, ground-breaking | research more than there is actual new ground-breaking results | happening. So they also end up publishing mediocre research | presented as ground-breaking, and some less-than-accurate | research where results are exaggerated to make them look | ground-breaking. | Animats wrote: | Um, yes. "Nature" used to have a great reputation. Supposedly | it still does in bio. But battery articles in Nature are just | awful. They keep blowing up "minor advance in surface | chemistry" into "10x better battery that costs 10x less Real | Soon Now". | | (I'd like to see EV World or something else in that space | reprint old articles as "1, 5, and 10 years ago in battery | hype".) | petschge wrote: | Yeah in my field the general attitude is that Nature isn't | all that great. I have heard the phrase "it was published in | Nature but might still be right" more than once. | gunfighthacksaw wrote: | A colleague in an unnamed field, attending an unnamed Polish | university mentioned that this kind of thing was rife: publishing | Polish papers translated from English texts and occasionally vice | versa. Poland is a country with a strong academic tradition and | similar enough institutions to others in the EU so I can only | imagine this happens in more 'peripheral' countries with even | less globalization. | dang wrote: | The paper is at https://arxiv.org/abs/2107.06751. 
| | (We merged this thread and | https://news.ycombinator.com/item?id=28108111) | doubtfuluser wrote: | Maybe a future direction would be to train new models to identify | plagiarism by training on this information. Use non-matching | back-translations for training classifiers. It's again the typical | cat-and-mouse game, I guess. | tarboreus wrote: | Or someone could...read the papers. | wereHamster wrote: | It's the classical problem of people trying to find | technological solutions to social problems. If plagiarism and | fake research are still a problem after we've applied | technology to fight them, clearly we haven't applied enough of | it. | waterhouse wrote: | Sometimes technological solutions work really well to solve | social problems. For example, at one point, one person | using the internet would tie up the phone line for everyone | else in the house, and vice versa. Negotiating this shared | resource could be considered a household social problem. | But now there's no such interference, and most people have | their own cell phones. | tnzm wrote: | This is a social problem around the shared use of a | technological resource. I'm reminded of the old saying, | "computers can only solve problems that are created with | computers". | | But then again you can view _all_ solutions to social | problems as inherently technological in the broader | sense; I adhere to that paradigm. | robertlagrant wrote: | That saying seems silly. Computers (e.g. Zoom) help with | the problem of needing socially distanced education | during Covid lockdowns. | pas wrote: | Referees have no real incentive to keep quality high. They | already don't get anything in return for doing it. (At best | they do it for reciprocity/goodwill.) Papers are usually hard | to follow, replication rate is abysmal, etc. The incentives | are all set for publishing, not for making real progress. | PragmaticPulp wrote: | The number of papers being published is growing at a | staggering rate. 
This requires proportional growth in the | number of people reading these papers, which inevitably means | the plagiarists and cheaters themselves are being pulled into | the review system as well. They don't care about letting | fraudulent papers slip through because they never really | cared about the science in the first place. | | They see it as a game that they're playing and they're doing | their best to put as little effort as possible into the game | while extracting as much reputation upside as they can. | | We really need to make publishing fraudulent papers a career- | ending move across academia and even the industry. The only | reason this continues to happen is because it has a lot of | upside but very little downside. Caught publishing fraudulent | papers? Oh well, just leave them off your resume and apply | somewhere else. | rusk wrote: | In telecoms they call all the backend infrastructure "back haul" | and I have never read a satisfactory explanation. I'm convinced | that somebody once coined "back hall" with the intention of | invoking the image of service passages like those you see in the | mall; it was misheard (as is often the case in telecoms given | its global nature) and the metaphor of the bulldozer tail stuck | for ever after. | robertlagrant wrote: | IME I think backhaul is just sending data to the main internet, | not all backend. I thought it just meant it hauled the data | back into the core network. | rusk wrote: | Main Internet, across main Internet, between networks, intra- | domain. Intra-station. Anything that joins it all together | that isn't "front facing", i.e. the wireless network towards | handsets. | Animats wrote: | I've heard that term used where an ISP is piggybacking on a | larger service. Sonic.net offers some of their services over | AT&T infrastructure. Data to and from home DSL lines is | "backhauled" to Sonic HQ in Santa Rosa, CA and then goes out | over the bulk Internet backbone from there. 
This is a different | path than the data would take if handled entirely by AT&T. | vericiab wrote: | In freight "backhaul" typically refers to transporting goods | during the return journey. During the principal (non-return) | journey, often the starting location is more central, like a | distribution center, and the destination is a smaller satellite | location like a store. So when something is backhauled, that | tends to mean it's transported from the smaller satellite | location to the central location. | | Maybe that's where the term came from? | rusk wrote: | Maybe actually, or at the very least it could explain the | confusion. Better than some of the other explanations I've | heard for sure. | FabHK wrote: | And the journal involved, _Microprocessors and Microsystems_, is | an Elsevier journal. Huge surprise. I am glad the publisher earns | their outrageous fees by careful screening, peer-review, and | editing of submitted manuscripts. /s | | Ceterum censeo Elsevier(um) esse delendum. | CRConrad wrote: | Elsevirus? | DanBC wrote: | Full title is: "Tortured phrases: A dubious writing style | emerging in science. Evidence of critical issues affecting | established journals" | zozbot234 wrote: | No mentions of tortured phrases in the humanities and softer | social sciences? For all their supposed appreciation of _les | belles-lettres_ (viz., "fine writing") those researchers sure | seem to like their tortured phrasings. | wmf wrote: | That's a completely separate issue that shouldn't be conflated. | zozbot234 wrote: | > That's a completely separate issue | | How so? It seems quite related to me. Anecdotally, one would | expect a pretty clear negative correlation between | torturedness in the sense of this article and indicators of | research quality. | robertlagrant wrote: | This is a category error. The article is only about the | subset of low quality papers generated by automatic | translation. 
| atrettel wrote: | I have encountered something similar to this for a submission | that I reviewed for a scientific journal. I will not list any | names or give much detail past those generalities, but I pointed | out that the authors were misusing a particular technical term. | In my review I defined the term and explained it briefly. I asked | the authors to revise their submission accordingly. The paper was | not bad but the authors did not know English very well, so it was | quite difficult to read. That was its main problem. However, when | I received the revised submission, I noticed that the authors | had plagiarized my definition and explanation almost word for word | (from my _confidential_ review). I pointed this out to the | editors and they said to just reject the paper with the stated | reason being plagiarism, which I did. The journal ended up | rejecting the article, but I discovered it a few years later in a | different journal. The plagiarized section remained, but the | authors had swapped out a lot of my phrases for these kinds of "tortured | phrases". | | That said, the authors did not fabricate their research (as far | as I can tell). They just did not know English well, so it was | easier to just copy things that you know are phrased well than to | learn to write English well. As the saying goes, do not attribute | to malice what can be explained by ignorance or laziness. That | does not excuse it but it makes it more understandable. | | I agree with the article that this is probably just the tip of | the iceberg. There are likely many more lesser evils being | committed with similar tools that are just much more difficult to | spot. I would not have noticed my particular example if I had | not been a reviewer for the paper, for example. It makes me wonder how | big the problem really is. | hdjjhhvvhga wrote: | > The paper was not bad but the authors did not know English | very well, so it was quite difficult to read. That was its main | problem. 
| | This seems to confirm my suspicion that these cases are not so | much about AI-generated content but rather a result of machine | translation. | aliswe wrote: | It's a common technique/first layer of plagiarizing a text to | translate it from English to e.g. Spanish and then from | Spanish to English, to get rid of the unique words the author | used. | craftinator wrote: | > It's a common technique/first layer of plagiarizing a | text to translate it from English to e.g. Spanish and then | from Spanish to English | | It's also a common technique for people who don't speak | English to translate it... In fact, quite a bit more | common. | turnersr wrote: | In your review, did you suggest the definition and explanation | that they used? In this situation, would an acknowledgment | at the end have been enough? In my mind, it seems like you all | had a conversation and the authors took up your suggestions as | the reviewer. | atrettel wrote: | No, I did not suggest the definition and explanation as | content for them to use. I was trying to explain a concept | that they discussed incorrectly multiple times in the paper. | It is an advanced concept that might not even appear in | graduate-level courses on the subject, so I can understand | why they did not understand it fully. That said, I did not | give them permission to copy my words there. If there are any | particular changes I want the authors to make I put them in | quotes. This wasn't in quotes. It was an explanation for | their own benefit so that they could correct the mistakes in | the paper (by re-writing it). | | Once I re-read the submission I wanted to reject it | immediately, but I realized that I should get a second | opinion first. So I contacted the editors, who agreed that it | was blatant plagiarism. Hence, they rejected the paper once I | recommended rejection in my second review. So this wasn't | just a conversation where I made some suggestions and the | authors used them. 
Even the editors thought it was plagiarism | once they looked at it. | | An acknowledgment would be impossible because the review was | single-blind. The reviewers knew the identities of the | authors but not the other way around. What the authors should | have done was just re-phrase where they used the term in the | paper. They didn't even need to copy my explanation, to be | frank. The paper would have worked fine without the paragraph they | copied. If they had just re-phrased the relevant parts no other | changes would have been needed and this whole thing could | have been avoided. | adaml_623 wrote: | Not 100% sure but I believe the word confidential implies | that the review should only have been read by the editor(s) | and not passed on to the authors. | pottertheotter wrote: | A review is the written feedback authors receive from the | journal reviewer. The reviewer can recommend that the | authors revise and resubmit, based on the review comments. | Usually the review is not published with the final piece, | which is what was meant by "confidential review". | MikeUt wrote: | > the authors plagiarized my definition and explanation almost | word for word (from my confidential review). | | Is there any way the authors could have kept your definition, | and somehow credited you, even anonymously? Because rephrasing | definitions is the pinnacle of wasted effort, and leads to | confusion - you are asking them to say what you said, but | without using your words. | atrettel wrote: | That is a good question that I do not have a good answer for, | unfortunately. The review process for this journal is | supposed to be blind, so crediting me would only reveal me as | a reviewer. An anonymous acknowledgment is better than | nothing, if the authors only copied a short definition | without my permission, but they copied _an entire paragraph_ | from my review without my permission. That's just | inexcusable. 
I can understand to some degree why they did not | understand the concept well, since you may not encounter it | even in a graduate-level course on the subject, but what they | did was just inexcusable and really poor judgment. | sunshineforever wrote: | Yeah. How can you plagiarize a definition? | da39a3ee wrote: | I agree with this. You sound like an expert and provided a | definition. I don't think we should expect serious | professionals to mess around altering the words to make it | look like it didn't come from the source that it did come | from. In fact wouldn't that itself be plagiarism? The usual | approach here is to use a phrase like "as suggested by one of | our reviewers". | pottertheotter wrote: | I think this is the cause of some of these weird terms that | this HN post is discussing. I have a PhD and found it | incredibly frustrating to write research papers because | there was an expectation in my field to add a ton of | background. That meant I had to spend a lot of time | rephrasing bits and pieces of other papers where the authors | had worked hard to word something very well. The professors | didn't like me quoting from other papers. I had to come up | with my own way to say something very specific. | petschge wrote: | I write papers too and hate finding a new way to say "my | X is a Y that does Z". Especially if it is your tenth | paper on the topic and you should even sound like the | previous nine times. | | But about three sentences into the introduction (where | you explain all the background) you start going into | "there are also Y's that do Z backwards". Which Y's you | compare and connect with is important and says a lot about | how you think about your X. It might even be a new way of | looking at it. So telling other people how you think of | it can be important. | | And another 5 sentences in you start referencing previous | work on the topic. 
At this point you are crediting others | and you get to choose whom to credit how much, with the | benefit of hindsight. You refer to papers that are useful | to people new in the study of capital letters. What you | write here helps them much more than a mere list of | papers or a Google (well, Google Scholar or ADS or PubMed | or whatever) result list, because you can provide a good | order to read them or which aspects of X's are best | explained where. You also name papers that might be | useful to practitioners in the field because they have a | particular technique or a good explanation of it. | | So it is very much worthwhile to provide the | background that others expect at the beginning of your | paper. Even if it requires rewriting that first paragraph | several times. | mdp2021 wrote: | I cannot understand: those articles should have been carefully | examined before publishing - I understand they are in the set of | those "Under the Warranty of the Publications' Authority". But if | anyone read them, the rubbish involved would have emerged. | | What am I missing? | NotSwift wrote: | Some people don't have English as their native language. When | such people want to write a scientific article in English they | will have to use someone who can write English but probably does | not know much about the research. So of course there will be | articles with "Tortured Phrases". | alexmcc81 wrote: | If you read the article the authors directly address this and | use datasets of machine translated articles as controls. | wolverine876 wrote: | People who don't speak English natively could use machine | translation, and people plagiarizing could use machine | translation. How do they distinguish (if you don't mind | saving me digging into the research)? | arthur2e5 wrote: | Did you even read the abstract of the TFA? This should not even | be a cultural background issue. 
As an L2 English speaker myself | I have never ever thought about throwing a thesaurus onto some | established phrase so I can turn "artificial intelligence" into | "counterfeit consciousness", or "deep neural network" into | "profound neural organization". These are deliberate use of | fancy words without trying to make sense. | | Heck, we got a word for this sort of rampant plagiarism masking | on Chinese internet -- Xi Gao (manuscript (or blog | post)-laundry). | | OT: I do appreciate the funny phrase "elite figuring" for HPC. | It's kind of like how they translate things to Anglish. | Zababa wrote: | No one would write "colossal information" instead of "big data" | because English isn't their first language. | twirlock wrote: | So the way we can tell a computer has generated a scientific | paper is... because the computer probably failed to use idiomatic | terminology when it referred to concepts. | guyromm wrote: | Back in 2004 or so, I was building a distributed CMS with the | goal of creating artificial "link pyramids" with the purpose of | SEO, which was a rather new thing at the time. | | Content generation was one of our bottlenecks, and as Google was | already rather successful at detecting duplicate content, we were | looking for a way to "uniqify" posts that would be used to stuff | sites intended for googlebot, but not humans. | | One of the methods that worked was taking source English content, | running it through Babelfish, the Altavista translator to French, | Spanish or German, and then using the same method to translate it | back to English. | | This resulted in texts that did not make much sense to humans, | were full of precisely such "tortured phrases" but which were | considered unique by Google. | etempleton wrote: | I often wonder while reading an academic paper how the writing | could be as hopelessly bad as it is. 
| | This type of manipulation and plagiarism may be partially to | blame, but the academic writing style has also gone completely | off the rails to the point that half the journal articles being | published today read as if written by some kind of paper writing | AI robot even when I am quite certain that that isn't the case. | And no, I am not talking about cases where the author is writing | in a non-native language. | | I have a theory that it may have to do with imposter syndrome and | a need to sound smart. The author, fearing that they don't really | belong and at any moment will be found out, therefore never | making tenure, starts jamming academic sounding words where they | don't belong and stretching sentences with commas and semi colons | until the whole thing is just as insufferable to read as it was | to write. | | There is also the possibility that there are just a lot of | terrible writers out there. | zwaps wrote: | I am sure this was not your intention or meaning, but please be | aware that it is virtually impossible for a non-native speaker | to write perfect English. English is a language you have to | intuit. In contrast to other languages, it has very few fixed | rules. Writing elegantly in English is most certainly an art | form. | | Of course, writing good science is hard enough for native | speakers. It is very difficult for the vast majority of people | on the planet - no matter how good their research. | | And just so we are clear: Not everyone can afford professional | editing services at every point in their career. | | We meet in English under the premise that it allows for | universal communication. In this, we accept that English | natives are almost infinitely more privileged in writing, | speaking, conferencing and networking. We also have to accept | that the level of English proficiency varies, and - especially | English - is easy to learn and so difficult to master. 
| LargoLasskhyfv wrote: | I think at least skimming some edition of the | | [1] https://en.wikipedia.org/wiki/The_Chicago_Manual_of_Style | | and some of what is available under | | [2] https://duckduckgo.com/?q=military+writing+guide | | would be useful for American English and technical writing. | endtime wrote: | I think you missed this part of the comment to which you were | replying: | | > And no, I am not talking about cases where the author is | writing in a non-native language. | raincom wrote: | A friend submitted a paper to a journal in the humanities. The | reviewer said "his English is informal". In other words, these | reviewers are asking for stilted English. | LargoLasskhyfv wrote: | This makes me think of people smelling bad, in dark robes, | wearing white powdered wigs, frantically using their | https://en.wikipedia.org/wiki/Hand_fan | Strilanc wrote: | I also get this feedback on my papers. E.g. saying that it's | written "more like a blog post". | | Of course, they're not wrong. It _is_ written more like a | blog post. Because the writing style used in blog posts is | hands down better than the writing style used in scientific | papers. Blogs talk about the real reasons you worked on | something, they go through simple examples, and they mention | where you struggled and what you found confusing and what you | tried that didn't work. All of these things are very useful | for understanding, and in my experience almost entirely | lacking from papers. Or at least, in my experience they're | lacking from modern papers. I think in papers from 100 years | ago the authors tended to talk more about their worries and | their excitement e.g. [1]. | | [1]: https://youtu.be/RZfCqWZ8EAY?t=630 | yissp wrote: | Good essay by Orwell that touches on this sort of thing | https://www.orwellfoundation.com/the-orwell-foundation/orwel... | I used to be guilty of writing this way and one of my high | school English teachers recommended I read it. 
I've tried to | take the message to heart ever since. | hutzlibu wrote: | "There is also the possibility that there are just a lot of | terrible writers out there. " | | Surely they are and writing in a way that is easy to read and | understand is an art in itself. | | But I would agree, that the main reason is probably the | intention to sound smarter, than they are. Whole scientific | disciplines seem to live by that standard. | | This is not limited to science though, I recall a german poet | (I think Heinrich Heine) said about his fellow poets: | | You only fly so high like the swallow, that no one can actually | hear your singing. | FabHK wrote: | Some of these tortured phrases are great. My favourites: | | "flag to clamor" for signal to noise | | "individual computerized collaborator" for PDA (personal digital | assistant) | | "haze figuring" for cloud computing | | "information stockroom" for data warehouse | | "focal preparing unit" for CPU | | "discourse acknowledgement" for voice recognition | | "mean square blunder" for MSE (mean square error) | | "arbitrary right of passage" for random access | | "arbitrary timberland" for random forest | | "irregular esteem" for random value | | ETA: | | "notoriety examination" for sentiment analysis | abecedarius wrote: | Reminiscent of | https://en.wikipedia.org/wiki/Uncleftish_Beholding | netr0ute wrote: | Reminds me of https://www.youtube.com/watch?v=GyV_UG60dD4 | aaron-santos wrote: | I enjoyed finding "counterfeit consciousness" for artificial | intelligence. To me it evokes a kind of science fiction that's | shown up occasionally on HN[1]. | | [1] https://qntm.org/mmacevedo | Freak_NL wrote: | Also "haze figuring" for cloud computing. | | It sounds like something you'd find in 30s, 40s, 50s sci-fi | for sure! Like "visiplate" (E.E. "Doc" Smith, Heinlein) for a | computer display screen. (Along with ticker tape printouts | and tape reels in the far future of course.) 
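The phrase pairs quoted in this thread lend themselves to a very simple screen. What follows is a minimal sketch under stated assumptions, not the method used by the preprint's authors: the dictionary holds only pairs mentioned in the comments here, and the names `TORTURED_PHRASES` and `find_tortured_phrases` are made up for illustration. A real screener would presumably need a much larger, curated list.

```python
import re

# Pairs quoted in this thread: tortured phrase -> established term.
# This short list is illustrative only, not a curated fingerprint set.
TORTURED_PHRASES = {
    "flag to clamor": "signal to noise",
    "haze figuring": "cloud computing",
    "information stockroom": "data warehouse",
    "focal preparing unit": "CPU",
    "discourse acknowledgement": "voice recognition",
    "mean square blunder": "mean square error",
    "arbitrary timberland": "random forest",
    "irregular esteem": "random value",
    "notoriety examination": "sentiment analysis",
    "counterfeit consciousness": "artificial intelligence",
    "colossal information": "big data",
    "profound neural organization": "deep neural network",
}


def find_tortured_phrases(text):
    """Return (tortured phrase, established term) pairs that occur in text.

    Matching is case-insensitive, tolerates line breaks and repeated
    spaces inside a phrase, and only matches on word boundaries.
    Results follow the order of TORTURED_PHRASES, not the order in text.
    """
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = []
    for phrase, expected in TORTURED_PHRASES.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, normalized):
            hits.append((phrase, expected))
    return hits


if __name__ == "__main__":
    sample = "We train an arbitrary  timberland and report the Mean Square Blunder."
    for phrase, expected in find_tortured_phrases(sample):
        print(f'"{phrase}" probably stands for "{expected}"')
```

A lookup like this can only flag phrases somebody has already spotted, which fits the cat-and-mouse concern raised elsewhere in the thread: a different synonym-replacer would produce different substitutions.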
| slowmovintarget wrote: | Makes me want to put smog-hosting in my CV. | synquid wrote: | The smog is just the Chinese cloud. | laurent92 wrote: | Ah, vapordecisionware. But that might be confused with | regular management. | seoaeu wrote: | Really highlights that the actual phrases don't make any | more sense than the tortured versions, other than the fact | that we've been hearing all of them for years so they now | sound normal | rhino369 wrote: | I happen to be reading Dune today, and AI is referred to as | counterfeiting the human mind | golemotron wrote: | There might be a common concept between this, chaff[1] and | Steven Pinker's Euphemism Treadmill. | | [1] https://en.wikipedia.org/wiki/Chaff_(countermeasure) | nick__m wrote: | If I was in a situation where I had to write on occupational | health and safety in forestry I would shamelessly appropriate | "mean square blunder" and "arbitrary timberland", those are | superbly above the mean square! | arkitaip wrote: | Any HNers who want to join me in creating the dad-punk band | Leftover Vitality? | jszymborski wrote: | As someone who has had to write technically in a second-language | (French, funding agencies in Quebec), this rings particularly | true. | | Luckily, I'm fluent enough to recognise the particularly | egregious examples, but finding good translations for technical | words is hard! | | One example that comes to mind is when trying to translate the | phrase "data feed" which came back as "alimentation donnees" | which ostensibly means "animal feed data". | | If you're looking for a lot of English-to-French translations of | technical terms, check out the theses of any English university in | Quebec (McGill, Concordia, etc.). They're made public online | [0]. Can't vouch for the quality as I'm sure there are plenty | that just use Google Translate, but everyone I know has their | abstract edited by a francophone in their field. 
| | A good way to validate translated technical terms is to just give | them a quick internet search on e.g. DuckDuckGo or | Semantic Scholar. | | [0] McGill's is https://escholarship.mcgill.ca/ | dghughes wrote: | It reminds me of an article about Canadian scientific journal | publishers being bought by a shady company (OMICS Group Inc.) | so they can seemingly publish whatever they want to. | | https://www.ctvnews.ca/health/offshore-firm-accused-of-publi... | jokoon wrote: | What's the point of this? To waste the time of foreign | scientists? Would we call this science warfare? | riedel wrote: | It is the result of a misguided science system that relies | mostly on external quality checks (peer reviewed publication) | and flooding the world with so much "novelty" that there is no | way to digest it. At least you can use the output to train | language models up to now: will machines now have to train | themselves... | wmf wrote: | It's resume inflation. | FabHK wrote: | Got to agree with the conclusion of the paper: | | > In our strong opinion, the root of the problems discussed in | this work is the notorious publish or perish atmosphere | (Garfield, 1996) affecting both authors and publishers. This | leads to blind counting and fuels production of uninteresting | (and even nonsensical) publications. | dash2 wrote: | I get this with students a lot. Papers which have been copied | from some website, but then they've gone through and altered a | few bits of vocabulary to disguise it. | gzer0 wrote: | This is anecdotal evidence at best, but it is worth considering. | I know of several individuals who were able to complete their | entire Master's thesis utilizing a combination of AI generated | content (GPT-3) and a paraphrasing tool. | | The generated text was well over 50 pages, completely bypassed | all known content/plagiarism checks and was even included in the | University's "exemplary examples". To this day, it is still | there. 
| | This is of significant concern as some of these GPT-3 based tools | are now integrated within MS Word itself. Word 2021 allows for | "add-ons", out of which I have noticed several third party | content generation and paraphrasing tools. | 13415 wrote: | Please include a link to these theses, because as it stands | this anecdote sounds extremely implausible. I don't know what | university you were at, but I've been at a few in Europe and at | every one of them Master's theses were evaluated from the start | to the end by several humans. GPT-3 is unable to produce even | two pages of coherent text, let alone 50 pages good enough to | be accepted as a Master's thesis in _any_ discipline at any | university I could think of (even the worst ones). | | I can imagine that plagiarists use paraphrasing software quite | extensively, though, and that it is a problem. | gzer0 wrote: | Let me clarify: | | It was not all automated, there was a fair bit of manual | intervention needed. I understand your concerns and they are | valid and this is why I preface my statement with "anecdotal | evidence". What I write is most certainly not the entire | story and a fair bit of detail is left out. | | It should be known that this is widespread across multiple | industries and this will only become more of an issue in the | future. | | This is a US-based institution, fully accredited. | throwawaygh wrote: | _> Master's thesis_ | | I don't doubt this at all, and I have no doubt that GPT-3 with | a bit of human editing can spit out something better than the | lower third of masters students at corn row colleges. | | Masters degrees are cash cows, which is why no one in | unregulated industries cares about them. People in | regulated/unionized industries also don't _actually_ care; even | educators, who at least nominally see intrinsic value in | education, go to borderline diploma mills to get that union- | mandated raise at minimal effort. | bjt wrote: | > ... 
masters students at corn row colleges. | | First time I've heard the term "corn row colleges". Google's | not bringing up anything that looks relevant. | | I suggest picking something else. Given that "cornrows" are a | predominantly black hairstyle, the term reads like a racial | slur. | CRConrad wrote: | I (non-American, not a native English speaker) thought it | was a pejorative reference to rural universities; ("hick" / | "rube") state universities of Midwestern states etc. | throwawaygh wrote: | The name originates as a pejorative for small tuition- | dependent non-research teaching colleges. Those colleges | mostly catered to pastors, teachers, etc. and were located | in small towns. The historical reasons that these | institutions are now "in the corn fields" provides an | interesting topic for historical inquiry. Perhaps many are | in old railroad or factory towns that have since | languished, but schools that were similar at time of | founding and didn't die are in industrial and post- | industrial hubs where they attract the attention needed | to thrive. Who knows. The point is that they are small, | inconsequential institutions that are predominately located | in rural and semi-rural towns. | | The name now includes small state schools -- usually branch | campuses with lower enrollment and no major (R1) research | output. | | (NB: corn row colleges are also by definition non-elite, so | small liberal arts colleges with billion dollar endowments | which might otherwise count, don't). | | Many such institutions have since started offering graduate | (or at least non-bachelors) degrees and certificates that | are somehow even more worthless than their undergraduate | programs. | | Apparently the name has a lot of different meanings these | days -- see sibling comments -- but it has DEFINITELY never | been meant as a racial pejorative. 
If anything, exactly the | opposite, since most of those "crap-tier | midwestern/southern colleges" cater to 99.99% WASP social | networks (the P is even explicit). | selimthegrim wrote: | No, meaning in the cornfields, i.e. second tier state | universities. | aliswe wrote: | nono, it means the long line of colleges that are virtually | indistinguishable from each other. | CRConrad wrote: | > Masters degrees are cash cows, which is why no one in | unregulated industries cares about them. | | So, uh, is Business Administration a regulated industry? | throwawaygh wrote: | No one cares about MBAs. The networks can be helpful, but, | unlike JDs/PharmDs/etc., an MBA from a no-name college & | weak alumni network isn't worth the paper it's printed on. | OminousWeapons wrote: | > Masters degrees are cash cows, which is why no one in | unregulated industries cares about them. People in | regulated/unionized industries also don't actually care; even | educators, who at least nominally see intrinsic value in | education, go to borderline diploma mills to get that union- | mandated raise at minimal effort. | | I don't mean this rudely, but it is attitudes like this which | cause the CS interviewing process to be 100X more painful | than the interviewing process in any other field: "I don't | trust your credential so I demand you prove your competence | to me on the spot and let's do 5 rounds of interviews just to | be sure." | derefr wrote: | How do you feel about doctorates? | throwawaygh wrote: | Depends. University of Phoenix awards doctorates that take | 3-4 years (HUGE red flag -- the best and brightest phd | students _might_ get out in 4 years if everything goes | perfectly; an "expected time to graduation" of anything | less than 5 years is almost certainly a worthless degree). | | Those doctorates don't require much more than taking some | coursework and paying a boatload in tuition. Basically an | expensive and lengthy online masters program. 
Not worth the | paper they're printed on, unless you're employed by the | government or in a union job that mandates raises for | education attainment. | | As a general rule of thumb, PhDs from R1 universities that | are paid for by the university through research | assistantships or teaching assistantships are generally a | good signal of at least minimal training in research | skills. | | Another good general rule of thumb is that paying for a PhD | -- beyond perhaps some MD/PhDs or maybe nursing phds, stuff | like that -- is always a good sign of someone who has both | a meaningless degree and also poor reasoning/research | skills. | | But anyways, real doctorates outside of a few fields (e.g., | pure math) usually come with a non-trivial publication | record that speaks for itself. You don't even need to know | that the person has a doctorate; you can just read their | papers and a rec letter from an advisor describing the | student's role in each paper. | | (I'm excluding discussion of professional degrees like JDs, | PharmDs, etc. which are technically doctorates but sort of | their own class.) | lyaa wrote: | Length of PhD programs is an indicator that should be | considered in context. UK Universities, for example, | often have research PhD programs that take 3 years to | complete and they are legitimate. | throwawaygh wrote: | Yes, my comment is specific to US (where, additionally, | it's somewhat uncommon to have a masters degree prior to | starting the phd). | thebooktocome wrote: | NB: I'm only speaking about the math doctorate as it | currently stands in the United States. | | Due to the current market saturation of math doctorates, | any pure mathematics PhD worth the paper it's printed on | will also probably come with a non-trivial publication | record. 
The exceptions I can think of are high-risk high- | reward areas like cutting-edge number theory (I had a | friend go eight years without publishing, which, yikes, | but his thesis was semi-revolutionary (or so I'm told)) | or, I guess, suitably abstract category theory (though | the people I follow in this area seem to publish lots of | interesting papers, like the Baez school or the homotopy | type theory people; your mileage may vary). | | It's really too bad. One wonders why we can't simply ax | the entire advisor-candidate system (with all its myriad | opportunities for physical, emotional, and even sexual | abuse) and certify new candidates by saying: "You're a | doctor of mathematics when you get five professors to | sign off on 3-5 papers you've had published." | derefr wrote: | > certify new candidates by saying: "You're a doctor of | mathematics when you get five professors to sign off on | 3-5 papers you've had published." | | Or one big one. | | Basically, take the "honorary doctorates" some | Universities give out retrospectively to people | who have made major contributions to their fields; do it | more often; and then make it the _only_ path to getting a | doctorate, such that they're no longer "honorary" at | all. | wolverine876 wrote: | > Masters degrees are cash cows, which is why no one in | unregulated industries cares about them. People in | regulated/unionized industries also don't actually care... | | People don't care about masters degrees in engineering, law, | business, art, etc. etc.? Try applying for many jobs without | one, or with one from lower-ranking colleges. | | The Chronicle of Higher Education article recently on the HN | front page said that masters in some fields, they give the | example of 'positive psychology', are indeed cash cows. But | in the example, that degree was not part of the actual | Department of Psychology, which is taken very seriously. 
| throwawaygh wrote: | Engineering and CS masters are 100% cash cows that no one | cares about. I promise you. | | I haven't even heard of a Masters in Law (law degrees are | doctorates), but I can't imagine it's worth the paper it's | printed on. | | MBAs are worthless unless they're from a few good places, | and even then the brand and networking does a lot of the | lifting. | wolverine876 wrote: | > law degrees are doctorates | | Law degrees are called _Juris Doctor_ but are | professional degrees, like MBAs. You aren't required to | publish original research (afaik) and in the US they were | formerly Bachelor of Laws (LL.B.) and then renamed (as I | understand it). | | The doctorate is Doctor of Juridical Science (J.S.D.). | You can also get a Master of Law (LL.M.). | fighterpilot wrote: | In engineering and financial services, my experience is | that they don't care. Some weight is given to a PhD but not | a Masters. | [deleted] | MengerSponge wrote: | Does Poe's Law cover parody becoming real? Because BBSpot | called this nearly 18 years ago: "Word 2004 to Pioneer | AutoUnsummarize Feature" | https://www.bbspot.com/News/2003/12/autounsummarize.html | bjourne wrote: | I really doubt you can computer generate a Master's thesis. | Completing a Master's thesis at an accredited institution is a | heck of a lot of work and even a cursory reading of a thesis by | an examiner, supervisor, opponent, or other interested party | would give the generated content away. Maybe if you get your | degree from a diploma mill you could get away with it, but then | your degree wouldn't be worth toilet paper anyway. | | I've heard similar stories about generated phd theses and it is | even more implausible. The reason is that writing a thesis is | much more than just producing a hundred pages or so of prose. | Any university student can poop that out in a few weeks. 
The | main job of a thesis is coming up with a research question, | conducting an experiment or a study, and describe the results | and how it fits in whatever niche of the scientific world you | are working in. | hdjjhhvvhga wrote: | I agree that in most cases it would be very difficult to do. | But I can imagine some specific circumstances where it could | be pulled off, possibly with some manual modifications: soft | sciences like sociology (you can't imagine the amount of bs | I've read during my college years), the subject matter being | very different from the area your supervising prof | specializes in, the topic that allows for arbitrary | speculation, an underfunded university branch with profs | having a more lax attitude. | BoxOfRain wrote: | There is apparently nothing new under the sun, in 1996 a | physics professor fed up with people from a social sciences | background publishing insufficiently rigorous papers on the | subject of physics decided to submit a nonsensical paper | liberally sprinkled with buzzwords to a journal of cultural | studies [0]. While this was simply someone making up | nonsense that "sounded right" as these AI language models | obviously weren't around then and aimed at addressing a | different issue, I definitely think it's relevant to this | discussion because it shows that academia (or at least | parts of academia) can be a bit flakey with what they | accept. | | [0] https://en.m.wikipedia.org/wiki/Sokal_affair | andai wrote: | https://xkcd.com/451/ | dash2 wrote: | Oh my sweet summer child. | | I regularly get dissertations with any or all of: barely | readable English, useless empirics, half-baked research | questions. | pottertheotter wrote: | How are dissertations getting to you like that? When I did | my PhD, no one would have allowed a PhD student to start | writing a dissertation without first having sufficient | research questions and then completing appropriate | statistical analyses. 
| laurent92 wrote: | > I know of several individuals who were able to complete their | Master's thesis utilizing... | | Doesn't it stay published forever? Might be a shame for | someone during their career. | | On the other hand, even a chapter of Mein Kampf was accepted in | 20 journals, after replacing the old word with newer versions. | Human reviews are hard. Maybe we should put computers in charge | of reviewing papers, they'd recognize the work of AI quicker? | | https://www.foxnews.com/us/academic-journal-accepts-feminist... | phkahler wrote: | Sounds like automated review of automatically generated papers. | And people pay money for that... | lostlogin wrote: | > third party content generation and paraphrasing tools. | | Presumably this is an arms race against things like | https://www.turnitin.com/ | | Empower students 'to do their best, original work' and this is | what you get. Though what the alternative is, I have no idea. | tasty_freeze wrote: | I ran into something like this in an Amazon review once. I was | looking for a book of transcriptions for the instrument I play, | and two of the handful of reviews used the same awkward phrase: | "music goals". I scratched my head and then realized what | probably happened. They weren't native English speakers and they | were being paid to write reviews and they had gotten the wrong | synonym. "music goals" was supposed to be "music scores". | itronitron wrote: | I like that "citation of non-existent literature" is also a | feature of these articles, although I wonder if the non-existent | literature was previously cited in other papers. | (https://irthoughts.wordpress.com/2009/07/15/the-most-influen...) | Animats wrote: | This is a major failure of Elsevier. | | Here's "Microprocessors and Microsystems."[1] This is supposed to | be about embedded systems, which is generally a no-bullshit | field. I'd never heard of this journal. 
People read Electronic | Design, EE Times, "Embedded.com", maybe Control Systems Journal, | etc. Those have either articles about how to do something, or | "why what we're selling is great" articles. | | Now look at the article titles in Microprocessors and | Microsystems.[2] Here are the first three. | | - COPS: A complete oblivious processing system | | - A perceptron-based replication scheme for managing the shared | last level cache | | - Efficient underdetermined speech signal separation using | encompassed Hammersley-Clifford algorithm and hardware | implementation | | Now those might be legitimate, although what they're doing in an | embedded systems journal isn't clear. They're all behind a | paywall, so it's hard to tell if they're any good. | | "Oblivious processing" is a security concept. That belongs in a | journal on security and encryption, where the crypto people will | know what holes to look for. (Microsoft was doing work in this | area in 2013, but I don't think a product emerged. If you can | make it work, some cloud computing company can use it.) | | Cache management belongs in a journal on CPU design, where people | who have struggled to make caches work will take a look. There | are people using perceptrons for this, which makes sense; a cache | has to guess which things will be reused. (If this works well, | someone should be trying it in web caches such as NGINX to | improve cache hit rates.) | | Signal separation is an active field, but this isn't a journal | where you'd expect to find articles on it. Wikipedia has a good | article on signal separation. The history of that article | indicates attempts to sneak in citations to sketchy articles. No | idea if the Hammersley-Clifford algorithm is even relevant. (If | it's a significant advance, there's commercial value in this in | improving audio quality for conferencing systems.) 
| | So these papers were all sent to a journal where the odds of | getting published are good, and the odds that the editors have no | idea about the subject matter are high. | | Why is Elsevier even publishing this journal? | | [1] https://www.sciencedirect.com/journal/microprocessors-and- | mi... | | [2] https://www.sciencedirect.com/journal/microprocessors-and- | mi... | | [3] | https://en.wikipedia.org/w/index.php?title=Signal_separation... | ksaj wrote: | I noticed this happening in other areas a few years ago, but with | faked blogs. The titles and subjects would sound interesting, but | then when you tried to read them, you'd need a specialized | decoder to get through the utterly baffling word replacements. | But they already got their ad revenue by the time you notice the | article is complete gibberish. | | The first one I found was about dog illnesses. They kept | referring to dogs with phrases like "Your domesticated canine," | and it was quite a chore trying to figure out most of the | symptoms that they were listing. "Heart worms" was translated to | "love snakes," which I thought was delightful. | armchairhacker wrote: | Nowadays too many real blogs are padded with weird phrasing and | sentences which don't really mean anything. | | In this case, sometimes you get lucky and can actually find | meaningful information between the padding. But sometimes you | just read an article that takes 5 paragraphs and 500 words to | say "we don't know". | dimatura wrote: | Yes, this may be a specific example of a more widespread | phenomenon. There are certain websites out there that republish | articles from well-established publications (e.g., New York | Times) almost word for word, except that they are rife with | synonym swaps that may or may not make sense in context, | presumably to escape some kind of automated copy detection. | Results can be amusing. For example, the copied article said | ""Drukqs" acquired a blended essential response..." 
where the | original said ""Drukqs" received a mixed critical response...". | Pixelbrick wrote: | This will come as no surprise to FE/HE lecturers the world | over... | cpach wrote: | What is FE/HE? | cbarrick wrote: | Seems like the UK equivalent of community colleges in the US. | cperciva wrote: | I'm guessing Further Education / Higher Education. | dash2 wrote: | Further Education = 16-18 years old; Higher Education = | 18-21 years old, i.e. universities. | mrfusion wrote: | I wonder if there are any phrases to detect industry funded | papers? I know they tell you their funding sources but it doesn't | seem to always help. ___________________________________________________________________ (page generated 2021-08-08 23:00 UTC)