[HN Gopher] In Defense of Inclusionism (2009-18) ___________________________________________________________________ In Defense of Inclusionism (2009-18) Author : luu Score : 71 points Date : 2020-05-05 10:07 UTC (12 hours ago) (HTM) web link (www.gwern.net) (TXT) w3m dump (www.gwern.net) | andrewla wrote: | As an inclusionist who has basically given up on editing | wikipedia because of deletionist forces, I read this article and | see nothing but points I agree with. | | What I would like to see, however, is a good defense of | deletionism. I have a lot of trouble even understanding the point | of view. | | If there's some higher-level motivation about storage costs or | something, then I can buy it, but I don't think anyone is making | that argument. Does it hurt wikipedia to have low-quality | articles? I think only insofar as the articles are about subjects | that are important, so the effect seems to be self-balancing. Can | wikipedia be used by people to glorify themselves with personal | wikipedia pages? Sure, but who cares -- unless someone does care | in a particular instance (i.e. a person named Albert Einstein | tries to take over the main article -- people will fight it | because they want the article to point to the right person). | scott_s wrote: | I see two problems. One, as the qualifications loosen, | Wikipedia will tend to have not just more articles, but more | lesser quality articles. At a certain point, Wikipedia would | become _mostly_ poor quality. | | Two, _managing_ the articles has a cost for the editors | involved. As the number of articles increases, their ability to | do a good job will decrease. | umanwizard wrote: | > Does it hurt wikipedia to have low-quality articles? I think | only insofar as the articles are about subjects that are | important, so the effect seems to be self-balancing. | | Here is where I disagree. I think Wikipedia is hurt by _any_ | low-quality articles. 
If I see some information on Wikipedia, | there is a moderate positive signal that that information is | correct. (Not as strong a signal as I'd like, but oh well). | Reducing the strength of that signal would make Wikipedia less | useful. | astine wrote: | I think that the main argument that isn't simply a matter of | storage costs (which probably does kick in at some point) is | simply a matter of brand-management. If the average Wikipedia | article is an incomplete and poorly sourced | article about Pokemon, then Wikipedia gets a reputation as a poor | source of information. It actually had that reputation pretty | strongly early on, but its reputation for reliability has | actually improved over the years. | | I suppose that there is also the inverse problem where poor | articles can use Wikipedia's relatively good reputation to give | visibility to some pseudoscience or conspiracy theory. If it's | really obscure, the only people who will be writing articles | about it will be proponents who will then link to the article | from their closed Facebook groups. It won't necessarily be | obvious to non-experts what's going on so deleting certain | kinds of articles could be a protection against that as well. | | (I'm not any kind of Wikipedian btw; just speculating off of | the top of my head.) | domador wrote: | If that's the case, I'd prefer a two-tiered Wikipedia over a | deletionist Wikipedia. In such a Wikipedia, each page could | be clearly-labeled as either an encyclopedia-quality entry or | as a "wannabe" encyclopedia entry. (A better term would be | needed for the lower tier.) Such labeling should be very | noticeable and visible when you open an entry, at least for | the lower-tier entries. | astine wrote: | They used to use the term "Stub" for brief pages that | weren't full articles yet and they do put warnings on | articles that don't meet certain quality standards, though | I think that's different than what you're talking about | here. 
| domador wrote: | There's some overlap, though not full equivalency. Stubs | presumably fall in line with Wikipedia's editorial | vision, even if they haven't been fleshed out into full | entries. The entries I'm referring to are mostly ones | which would run afoul of Wikipedia's notability | criterion. | | Going back to the brand management issue, I'd be OK with | deleted articles and their history being moved to another | wiki with a different URL, with a name that is not | "Wikipedia". Cookies or some other mechanism could be | used to manage user/session preferences so that users who | want to be redirected to this other, questionable-page | wiki can be redirected when they try to access pages that | don't fall in line with Wikipedia's editorial vision. I'd | be fine with having huge in-your-face warnings about the | likely lack of quality or veracity of such entries... | just as long as I can access that content. | Permit wrote: | As a purely end-user of Wikipedia I just want to chip in only to | say that I have not seen any observable decline in Wikipedia over | the last 15 years. I am generally very happy with the results I | get when I use it. | | > The fundamental cause of the decline is the English Wikipedia's | increasingly narrow attitude as to what are acceptable topics and | to what depth those topics can be explored, combined with a | narrowed attitude as to what are acceptable sources, where | academic & media coverage trumps any consideration of other | factors. | | I guess it sucks for people who wanted to write articles on each | chapter of Atlas Shrugged or thought Bulbasaur needed his own | page. Presumably, though, you agree there should be SOME criteria | for what is notable enough to deserve a Wikipedia article? For | example, in order of increasing mundanity these topics probably | don't deserve a Wikipedia article: Me, my cat, my cat's water | bowl, the water in the bowl on May 5, 2020 etc. 
| | Like, there must be some line that separates what deserves an | article and what does not. We can argue about where to draw | that line but I'm actually pretty happy with how Wikipedia has | done it. | andrewla wrote: | > these topics probably don't deserve a Wikipedia article: Me, | my cat, my cat's water bowl, the water in the bowl on May 5, | 2020 etc. | | Yes, they probably don't. But what is the cost of having them? | If they confuse an issue, like if your cat shares a name with a | more notable animal, then yours can just be renamed to | "Fluffers (Permit's cat)". Who does it hurt to have that page | on there, even in the reductio ad absurdum? Even, | hypothetically, storage costs -- in the current deletionist | world, the initial article and its deletion will be preserved | in history forever, so we don't save anything. | | If we need to provide the capability to flag pages as | "deletionists don't like this" and present a deletionist view | of wikipedia to those who don't wish to be exposed to that | content, then go for it. | | I don't mean to try to throw rhetorical exclamation points | everywhere; I'm genuinely curious about the cost of having a | page about your cat's water bowl. | scott_s wrote: | > But what is the cost of having them? | | The time and effort of the Wikipedia editors; the reputation | of Wikipedia in general. | _jal wrote: | > I'm genuinely curious about the cost of having a page about | your cat's water bowl. | | Several billion cat-water bowl pages are probably just an | annoyance for someone. | | But what would you think of a page explaining, say, all the | healthy virtues of drinking diluted bleach for fighting C19 | being hosted on wikipedia.org? | | How do you think that's going to work out, in the first | instance, when panicking people read it, and in the second, | when people start treating wikipedia.org as trustworthy as | their spam folder? 
| Barrin92 wrote: | > I'm genuinely curious about the cost of having a page about | your cat's water bowl. | | The fact that it's at some point impossible to disambiguate | information. If you have 50.000 pages of everyone's cat it'd | be borderline impossible to find general information that is | relevant to a public audience. It's the same reason I can't | go to the public library and put random writings of me on a | shelf, there needs to be a level of curation so that the | content isn't being bogged down by what is mostly going to be | noise. | | The article brings up a page for each pokemon say, and if you | have countless of pokemon all with similar names to other | real-world stuff everyone who doesn't care about the pokemon | will have to wade through links and pages of irrelevant | content, it'd quickly turn into a huge digital garbage dump. | | Also not to mention that Wikipedia wants to provide a | reasonable level of accuracy and factfulness, and nobody can | independently verify personal content or topics so niche that | only one person knows what's going on. | | I don't know why someone really would want wikipedia to turn | into a website for in-universe fiction or people's personal | content. That stuff is more suited for a self-hosted wiki. | yorwba wrote: | > If we need to provide the capability to flag pages as | "deletionists don't like this" and present a deletionist view | of wikipedia to those who don't wish to be exposed to that | content, then go for it. | | That flag exists. It's set by deleting the article. As you | note: | | > in the current deletionist world, the initial article and | its deletion will be preserved in history forever | | So if someone wanted to present an inclusionist view of | Wikipedia to those who find that content valuable, they could | do so. | dooglius wrote: | No, you can't edit pages that have been deleted, so that's | more than a flag. 
| a1369209993 wrote: | In the reductio ad absurdum, storage costs do become a | problem, and deletionism discourages many useless pages from | being created in the first place. OTOH, deletionism | discourages many useful pages from being created in the first | place, which is a much more serious problem. | Permit wrote: | > Yes, they probably don't. But what is the cost of having | them? | | I like the idea that the articles on Wikipedia are generally | on "notable" topics or people. When I am reading an article I | know that it has probably been eyed up and down by | deletionists, hunting for any excuse to delete it, but they | walked away unable to do so. That's a very useful signal to | me! | | For example, right now I can look someone up and use "has a | Wikipedia page" as a rough proxy for "is well known". If | everyone had Wikipedia pages, I could no longer do this. | TigeriusKirk wrote: | As gp says, though, a deletionist flag and deletionist | viewport would accomplish that just as well, while | preserving more obscure and niche topics. | teraflop wrote: | Wikipedians have been arguing over these positions for many | years, so if you want to see the arguments in favor of | deletionism, there are plenty of places to look. | | For example, here are IMO the most salient points from | https://meta.wikimedia.org/wiki/Deletionism: | | > Some believe that the presence of uninformative articles | damage the project's usefulness and credibility, particularly | when casual visitors encounter them through Internet search | engines or Wikipedia's "random page" or "recent changes." | | > Articles on obscure topics, even if they are in principle | verifiable, tend to be very difficult to verify. Usually, the | more obscure, the harder to verify. Actually verifying such | articles, or sorting out verifiable facts from exaggeration | and fiction, takes a great deal of time. Not verifying them | opens the door to fiction and advertising. 
This also leads to | a de facto collapse of the "no original research policy", | which is one of the fundamental Wikipedia policies. | Empirically, there have been a number of hoax articles which | were difficult to prove to be hoaxes but which could have | easily been deleted by a sufficiently strict notability | policy. | | > Poorly-sourced articles can result in Citogenesis, as | incorrect or unsourced information on Wiki (e.g., information | that is the product of original research) is then repeated | outside Wiki and eventually works its way into a publication | that is normally regarded as a reliable source. | fnl wrote: | Isn't what you are describing exactly the Internet plus a | search engine you trust will pick up the article type you are | interested in? In fact, you'd still need to trust that open | WPv2 thing just like that search engine. | ardy42 wrote: | >> these topics probably don't deserve a Wikipedia article: | Me, my cat, my cat's water bowl, the water in the bowl on May | 5, 2020 etc. | | > But what is the cost of having them? | | 0. Volunteer time may be cheap, but it's not infinite. | Without standards, the volume of articles could become so | large it will be an impossible task to fact check and edit | them all. | | 1. Malicious, unscrupulous, or misguided actors co-opting | Wikipedia's reputation for their own purposes (e.g. Jack's | snake oil has been scientifically proven to cure all | disease). | | 2. Useless garbage in search results that you have to wade | through. If I put up a page about my vanity music act, | literally no one will want to read about it but me, yet it | will still show up in your search results. | SpicyLemonZest wrote: | It's worth noting that the Bulbasaur fans won and he does have | his own page. | Permit wrote: | That's great! And I think it shows that no matter where you | draw the line you're going to end up with compelling topics | on either side of it. 
I see it as a strength of Wikipedia | that these things are fluid and there is an ongoing tension | between inclusionists and deletionists. I'm very happy where | they've ended up at the moment. | musicale wrote: | > I guess it sucks for people who wanted to write articles on | each chapter of Atlas Shrugged or thought Bulbasaur needed his | own page | | Bulbasaur is a pop culture icon; he may very well be the second | most recognizable Pokemon after Pikachu (though as noted | Charmander and Squirtle are up there too.) In any case, he | definitely earned his own page: | | https://en.wikipedia.org/wiki/Bulbasaur | nitwit005 wrote: | I'm in essentially the same camp. When I've tried to look | things up, it's always already been there. I've made edits to | wikipedia, but it's always been undoing vandalism. | duskwuff wrote: | > Like, there must be some line that separates what deserves an | article and what does not. | | That line is simply: | | > If a topic has received significant coverage in reliable | sources that are independent of the subject, it is presumed to | be suitable for a stand-alone article or list. | | -- https://en.wikipedia.org/wiki/Wikipedia:Notability | | This _instantly_ weeds out a lot of obviously trivial topics. | There are arguments about how exactly "significant coverage", | "reliable sources" and "independent of the subject" should be | defined, and refinements of this policy for specific subject | areas, but that's the core of Wikipedia's notability policies. | I've never heard any solid arguments for including a topic | which doesn't meet this criterion. | markwkw wrote: | Is it easy to tell what technology gwern uses for his website? | I'm not web-savvy, but the HTML looks like it's not | generated, but hand crafted. Is it a static website? 
| juped wrote: | You can find this information on the site | astine wrote: | As Juped said, there is actually an in-depth discussion of the | tool used to generate the website on the about page: | https://www.gwern.net/About#tools It looks like "statically | generated website" is about right. | miblon wrote: | I noticed the decline. About a year ago, during a hackathon, we | tried to get an article online. It took us 3 deletes and retries. | Then I reached out to a national official at Wikipedia and | finally the article got accepted. That's not good... | eitland wrote: | Same goes for stackoverflow. | | Since the stackoverflow database is freely available I cannot see | a single good reason why they haven't been outcompeted years ago. | the_af wrote: | So I have my own beefs with stackoverflow and the stackexchange | network at large, but... | | ...at some point you have to ask yourself, _why_ haven't they | been outcompeted? For every post complaining about | stackoverflow moderation or policies, there are probably | thousands of people using it every day to do their jobs, and | _it hasn't been outcompeted_! | | So if we're honest, we cannot rule out the possibility they are | doing something right. They set out to replace closed sites | like "Expert Sex Change" -- ok, sorry for the joke, | expertsexchange -- and also reduce low-quality noise. They | succeeded and they are now the gold standard. So why hasn't | anyone simply taken their data and forked it? | | A great example: softwareengineering.stackexchange -- formerly | programmers.stackexchange -- has a troubled history. 
At times | it has decided _everything_ was off-topic there (I'm not | joking, there were times where every question on the home page | was closed as off-topic), and some long-time "inclusionist" and | well-intentioned contributors declared they would fork it, and | their fork would include everything even tangentially related | to programmers and people wouldn't be censored for asking | questions about anything. | | Where are those forks now? | Animats wrote: | Most of the important topics were covered years ago. | Encyclopedias are not high-maintenance; the maintenance team for | Britannica wasn't all that big. | | I used to edit Wikipedia quite a bit, but got into other things. | narag wrote: | No comments yet? I guess others are still reading TFA... | | There are a number of issues that seem to be eternal. Even if | there's an obvious right answer, I see year after year, decade | after decade that they keep being discussed, with the same points | being made over and over again. Is there a name for these? | | I guess that for each of them, there is some kind of unspeakable | reason that trumps any sound rationale. | CarVac wrote: | I would attribute it to an inadequate equilibrium, a local | minimum that's simply too hard to escape at this point. | pdonis wrote: | _> I guess that for each of them, there is some kind of | unspeakable reason that trumps any sound rationale._ | | I think the general unspeakable reason is pretty simple: once | you give people power, some of them will misuse it. And since | it takes more effort to correct a misuse of power than to | commit the misuse in the first place, any institution that | gives people power becomes more and more corrupt over time as | misuses of power outweigh valid uses of power. 
| ghaff wrote: | While I'm mostly in Camp Inclusion, I appreciate the issues | that tend to come up as you loosen criteria for articles more | and more--you probably inevitably get articles that are | "notable" to a narrower and narrower set of people and a lot | of verification depends on shakier and harder to access | sources. | | That said, it's pretty clear that there are more than a few | Wikipedia admins who seem to have embraced deletionism for | topics that aren't near and dear to them personally and/or | which poke at whatever their particular political hot buttons | are. | kragen wrote: | The nice thing about Wikis in general is that it's usually | easier to correct a misuse of the power to edit a page than | it is to commit the misuse in the first place. That's why | Wikis work at all. | narag wrote: | That works for the power to edit, not for the power to | delete. So it makes sense, yes. | ooobit2 wrote: | Not just _some_ people, _nearly all_ will abuse it. But we've | seen this more so in people who are less familiar with | these institutions. Baltimore, MD, for example, has had now | _two_ black, female mayors convicted of corruption and | removed from office. We've seen the current mayor of Chicago | attempt authoritarianism to enforce the stay-at-home order. | Michigan's mayor has reached far beyond CDC and NHS | recommendations, and now faces possible corruption charges | over abuse of power. Maine's governor is also in the | spotlight for circumventing checks and balances by the | legislature on the executive office. | | People love AOC, and I too believe she means well, but even | she has warped into an anarcho-syndicalist, going so far as | to make absurd statements that "incremental change isn't | working", when everyone knows governance is a complex, | incremental series of processes, necessary to maintain a free | republic and limit opportunities for tyranny. Omar has had a | sexual affair with a senior member of her staff. 
Tlaib has | been censured for collaborating with anti-semitic | organizations on official business. | | Biden, himself, has now managed over a year of avoiding | investigations into how his son secured a chair on a | Ukrainian energy board and was on that board during | negotiations between the Obama-Biden administration and | Ukraine related to that company's finances. | | It's cute that people like John Oliver continue to say that | the rise of fascism isn't obvious, that it happens slow, | subtle. But abuses of power and position are everywhere right | now, especially among anti-Trump activists in the media and | Congress. You might agree that Trump is a racist, but taking | that very position is complicit with the anarcho-syndicalism | among a growing number of leftist politicians now. | | Keep an air of skepticism about you for your own safety. Just | because 75% of us think a hate speech law is a reasonable | circumvention of free speech guarantees doesn't mean we | aren't also enabling scope creep. As we've seen thus far, | "slippery slope" is a fallacy in everything _but_ governance. | lidHanteyk wrote: | As a former Wikipedian, what's there to say? The evidence is | still there, and was there for years and years. Deletionism | was, and remains, a wrong-headed attitude that does not | understand what makes WP qualitatively different from standard | encyclopedic offerings. | | There is not a motivating need to limit WP's scope, and indeed | that is why WMF forked off projects like Wiktionary and | Wikidata to their own TLDs. The main problem with WP is that it | is far easier to be wrong than to be right, and the effort | required to be right is linear in the number of words written. | In short, the number of editors per page required for | acceptable quality does not roll off with large numbers of | pages, but stays relatively high, at around 1-2 editors/page. | | Worse, the number of moderators per editor does not roll off | either. 
The number of bureaucrats required therefore keeps | growing, logarithmically but steadily, and the demands on | arbitration committees keep growing. The committees themselves | have long ago failed basic principles of legal legibility, | leading to sprawling bureaucracy. | | One possible solution is to fundamentally alter what we store. | Rather than writing thousands of words of prose, we could use | Wikidata to automatically generate articles. We already have | factboxes which could be largely automatically populated, and | many people only care about the factboxes. Prose would be | limited to commentary and explication, but would not be the | main bodies of articles. This is not just a pipe dream; LMFDB | [0] exists and is worth examining as an example of how code and | data can automatically generate the bulk of an encyclopedia. | | But, let's be honest, the writing was on the wall when | Esperanza [1] was dissolved. We are now somewhere between | Bureaucracy and The Aftermath. | | [0] https://www.lmfdb.org/ | | [1] https://en.wikipedia.org/wiki/Wikipedia:Esperanza | zozbot234 wrote: | > One possible solution is to fundamentally alter what we | store. Rather than writing thousands of words of prose, we | could use Wikidata to automatically generate articles. | | This is in fact being proposed at | https://meta.wikimedia.org/wiki/Wikimedia_Forum#Proposal_tow... by a prominent | Wikidatan. (Edit: follow up at | https://meta.wikimedia.org/wiki/Wikilambda and | https://meta.wikimedia.org/wiki/Talk:Wikilambda .) However | generating sensible articles would require expanding the | current Wikidata model, and this is something that should | happen gradually and be managed by the WD community itself, | not at a separate project. The whole | pretty-printing-in-natural-language part is the most speculative by far, and | incubating it separately makes more sense. 
| | It's worth noting that Wikidata itself is not "deletionist" | other than as implied by verifiability- and | sourcing-requirements. Its model is far more general and far more | "inclusionist" than even the most permissive visions for | Wikipedia. | narag wrote: | Wow, the death of hope... | duskwuff wrote: | > There is not a motivating need to limit WP's scope, and | indeed that is why WMF forked off projects like Wiktionary | and Wikidata to their own TLDs. | | This feels like a gross misreading of the facts. | | Wiktionary is separate from Wikipedia because its work | product is fundamentally different -- it's a dictionary, not | an encyclopedia. Wikipedia has a lot of articles about things | that aren't words, and Wiktionary has a lot of pages for | words that wouldn't make sense to write an encyclopedia | article about. | | Wikidata, meanwhile, is _only_ about raw data. Which is a | part of an encyclopedia, but far from all of it. (How would | you write an article about the history of Rome using only | data?) | | > But, let's be honest, the writing was on the wall when | Esperanza [1] was dissolved. | | This, too, is a gross misstatement of the facts. | | Esperanza was dissolved because it was becoming a cabal. It | was becoming its own organization, with its own | decision-making process and elected officials, a significant part of | which happened off-wiki. There was widespread agreement, | _even from Esperanza's founder_, that the organization was | no longer fulfilling its purpose, and an effort to reform it | before it was shut down. | | There's a decent summary at: | https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2... | som33 wrote: | > I see year after year, decade after decade that they keep | being discussed, with the same points being made over and over | again. Is there a name for these? 
| | Because human beings don't see reality accurately, see the | science: | | https://www.youtube.com/watch?v=PYmi0DLzBdQ | netcan wrote: | This sort of problem (or its opposite) is inevitable with our | internet circa 2020. | | Maybe wikipedia _is_ too narrow. Maybe it's too broad. These | things don't have singular, indisputable answers... | | The problem is that we have an internet of bottlenecks. Wikipedia's | choices about what is encyclopedic or notable are the only | definition of encyclopedic that matters. Youtube's interpretation | of fair use, twitter's definition of offensive or facebook's | definition of obscene... they're the working definitions. | | The internet needs to be less centralised... Even wikipedia. | severine wrote: | Does anyone here remember Seth Finkelstein? | | https://www.theguardian.com/technology/2008/jul/31/wikipedia | | I miss his writing! | | edit: His blog is still up: | http://sethf.com/infothought/blog/archives/cat_wikipedia.htm... | but unfortunately, no new entries since 2013 :( | dang wrote: | A thread from 2016: https://news.ycombinator.com/item?id=13152255 | | Thread from 2014 - interesting top comment there: | https://news.ycombinator.com/item?id=8791791 | | (Reposts are ok after about a year: | https://news.ycombinator.com/newsfaq.html) | | I've made the year in the title a gwernian range. | domador wrote: | If Wikipedia is to retain a deletionist editorial culture, I'd | at least like to be able to access the history for deleted | entries and read old versions of such entries. As it stands, | those entries seem to be permanently removed from public access | (and maybe even on the back end.) I don't like that 1984-style | versioning, which gives the message that certain entries never | existed in the first place. | | (I'd understand a small exception for copyright-infringing | content--that such content should remain unavailable when | deleted.) 
| Stierlitz wrote: | "The fundamental cause of the decline is the English Wikipedia's | increasingly narrow attitude as to what are acceptable topics and | .. what are acceptable sources, where academic & media coverage | trumps any consideration of other factors." | | .. as well as self-serving corporate and political interests. As | in they sit on an article 24/7 making sure nothing controversial | gets in. | | "Imagine a world in which every single person on the planet is | given free access to the sum of all human knowledge." | | Except for those that contradict the inner party. Go to the Talk | Page and discuss it they say. Do that and your account gets | disabled for violating some obscure WP rule. | qu4ku wrote: | Gwern's website starts to be next level. ___________________________________________________________________ (page generated 2020-05-05 23:00 UTC)