[HN Gopher] I'm a programmer - how can I help SciHub?
       ___________________________________________________________________
        
       I'm a programmer - how can I help SciHub?
        
       Author : andyxor
       Score  : 141 points
       Date   : 2021-07-11 18:28 UTC (4 hours ago)
        
 (HTM) web link (www.reddit.com)
 (TXT) w3m dump (www.reddit.com)
        
       | sprash wrote:
       | SciHub needs to be decentralized. Zeronet seems to be a good
       | blueprint on how to do it.
        
         | jazzyjackson wrote:
         | I am afraid the basic problems are still not solved with
         | regards to proof of authorship and file integrity - how do I
         | know the pdf I'm downloading is what was published?
         | 
         | I'm just in a "be careful what you wish for" state of mind, if
         | there was no one in charge of sci-hub, the publishers could go
         | on attack and fill the database with noise, copies of papers
         | with numbers and methods altered.
        
           | einpoklum wrote:
           | > I am afraid the basic problems are still not solved with
           | regards to proof of authorship and file integrity
           | 
           | These are not basic problems. We're talking about scientific
           | papers, not deeds to a piece of land or something.
           | 
           | > how do I know the pdf I'm downloading is what was
           | published?
           | 
           | How do you know the paper photo-copy you have of an article
           | is what was published? You don't 100% know, you make a
           | reasonable assumption.
           | 
           | The exceptional case of needing integrity verification can
           | have a niche solution.
        
           | bnj wrote:
           | Sounds like it would be a good idea for authors to start
           | providing checksums for their papers
        
             | user3939382 wrote:
             | Checksums are for error correction, I think you're
             | referring to digital signing, unless I'm misunderstanding
             | the issue being referred to
        
               | marcosdumay wrote:
               | In the context of sharing data between people, people use
               | checksums with the same meaning as cryptographic hashes,
               | as anything else would break for nearly all use-cases.
               | 
               | Digital signatures are actually overkill. The same
               | infrastructure one uses to discover the authorship of a
               | paper can distribute file hashes without any loss of
               | authentication, and I don't think anybody needs the
               | non-repudiation feature they bring. Maybe there is a nice use
               | case for a paper authorship database with a PKI, but to
               | my view the idea goes against an open scientific
               | community that I think is much more valuable. (But again,
               | maybe there is a way to have both.)
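
A minimal sketch of the hash check described in this comment, assuming the author or an index lists a SHA-256 digest next to the paper. The function names and the sample bytes are illustrative and do not come from any existing Sci-Hub tooling.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Check downloaded bytes against a digest obtained from a trusted listing."""
    return sha256_hex(data) == published_hex.strip().lower()


# Stand-ins for a downloaded file and for the digest listed with the paper (assumptions).
paper = b"%PDF-1.7 example paper bytes"
published = sha256_hex(paper)

# A copy with altered content no longer matches the published digest.
tampered = paper.replace(b"example", b"altered")

if __name__ == "__main__":
    print(matches_published_digest(paper, published))
    print(matches_published_digest(tampered, published))
```

The check is only as trustworthy as the channel that delivered the digest, which is the point made above about reusing the infrastructure that already establishes authorship.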
        
             | jazzyjackson wrote:
             | Right it's not insurmountable, authors can publish public
             | keys and sign the files - it's a UI/UX problem like
             | everything else in crypto.
             | 
             | Keybase makes it fairly easy, but people still have to
             | learn what it means and why it's trustworthy - my bet is
             | things don't change, there is a small percentage of people
             | who understand how to verify the source, and the general
             | population who either believes anything or nothing.
             | 
             | In the context of scientists and professionals tho maybe it
             | is achievable to do some outreach and get people on eg
             | keybase, something user friendly
        
               | ithkuil wrote:
               | Most people I've been advocating Keybase to, just assumed
               | it's yet another place where you create an account and
               | hence you must trust them (now owned by zoom, boo-hoo).
               | 
               | I found it well explained etc, but it's hard to reach
               | everybody.
        
             | viraptor wrote:
             | Unfortunately that's not practical. When you download from
             | a uni gateway, pdfs are often auto-watermarked with the
             | source. That way the authors have no idea what checksum you
             | see. And the places which don't do that yet could easily
             | start padding the end of the pdf with a random number of
             | spaces to stop verification efforts.
        
               | sli wrote:
               | It would be kind of neat if that could be layered so that
               | one could not only verify that their source-provided copy
               | is legit, but also that the underlying paper itself has
               | also not been modified.
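
One way the layering could look, as a rough sketch only: one digest over the exact bytes a given source served, and a second digest over the extracted text with watermark lines and spacing stripped. It assumes the text has already been extracted from the PDF by some other tool and that watermark lines can be recognised by a known prefix; `WATERMARK_PREFIX` and the sample strings are hypothetical.

```python
import hashlib
import re

# Hypothetical marker a gateway might stamp on each copy (an assumption).
WATERMARK_PREFIX = "Downloaded via"


def file_digest(raw: bytes) -> str:
    """Layer 1: digest of the exact bytes a particular source served."""
    return hashlib.sha256(raw).hexdigest()


def content_digest(extracted_text: str) -> str:
    """Layer 2: digest of the paper text, ignoring watermark lines and spacing."""
    kept = [
        line
        for line in extracted_text.splitlines()
        if not line.strip().startswith(WATERMARK_PREFIX)
    ]
    normalized = re.sub(r"\s+", " ", " ".join(kept)).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


original = "Title\nWe measured 42 samples.\n"
gateway_copy = "Downloaded via Example University\nTitle\nWe  measured 42 samples.\n\n   "
altered = "Title\nWe measured 24 samples.\n"
```

Here the gateway copy and the original differ at the file layer but agree at the content layer, while the altered text disagrees at both. Real PDFs would need a much more careful normalization step than this.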
        
             | vbezhenar wrote:
             | Torrent file is a checksum.
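
For context, a BitTorrent v1 metainfo file stores a SHA-1 hash for each fixed-size piece of the payload, which is what lets a client verify pieces as they arrive. The sketch below mimics only that piece-hashing idea; it does not read or write real .torrent files, and the function names and sample data are illustrative.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    ]


def verify_piece(piece: bytes, index: int, hashes: list[bytes]) -> bool:
    """Check one received piece against the hash recorded for that index."""
    return hashlib.sha1(piece).digest() == hashes[index]


# Sample payload (an assumption): 1000 bytes, hashed in 256-byte pieces.
data = b"".join(bytes([value]) * 100 for value in range(10))
hashes = piece_hashes(data, 256)
```

Because each piece is checked on its own, a client can fetch and verify only the pieces covering one paper inside a large archive.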
        
               | 6510 wrote:
               | Yes, the price we pay for the exploitation of stupid
               | movies? Torrents should be the standard way to distribute
               | anything.
        
           | [deleted]
        
         | captn3m0 wrote:
         | A DOI to IPFS directory would be cool. Is it still a copyright
         | violation if you write a hash, but don't put a IPFS link?
        
           | logifail wrote:
           | > A DOI to IPFS directory would be cool
           | 
           | Q: How open is the DOI system itself?
        
             | remram wrote:
             | It's mostly freely readable. It's closed to publishing
             | (gotta pay CrossRef or other licensees) and otherwise a
             | pain to interface with.
        
           | cassonmars wrote:
           | If the argument holds that a hash is equivalent to
           | holding/sharing the file, there's some big problems for
           | PhotoDNA given strict liability laws.
        
         | generalizations wrote:
         | All you really need is a searchable database that points you to
         | the torrent containing the paper you want. Then just download
         | the part of the torrent that you need.
         | 
         | Long as the seeders stick around, this is decentralized,
         | simple, and very hard to block.
         | 
         | Pretty sure this already exists, too, though it's extremely
         | unpolished.
        
         | bluedays wrote:
         | I think utilizing Chia coin would actually be a better idea.
         | Incentivize people to donate their hardware space with real
         | money.
        
         | andai wrote:
         | Couldn't we just use something like LimeWire? (Does anything
         | like that exist now?)
        
           | jazzyjackson wrote:
           | See libgen and r/DataHoarder; there are torrents available
           | split into chunks, I think it was 77 terabytes last I checked
           | 
           | An interesting question I think, is what value add does Sci-
           | Hub provide, because obviously it wasn't happening before
           | Alexandra made it happen, does it outgrow her or is she
           | holding it together?
        
             | devoutsalsa wrote:
             | How about creating SciPubCoin, a blockchain where you get
             | one coin from each published paper you submit!
        
               | beckman466 wrote:
               | > SciPubCoin, a blockchain where you get one coin from
               | each published paper you submit!
               | 
               | Probably better to create a system that includes a
               | reputation currency, since you can't 'spend' your
               | reputation: https://medium.com/metacurrency-
               | project/reputation-is-orthog...
        
               | 6510 wrote:
               | First the interface problem needs to be solved, after
               | that we get a good picture of adoption.
               | 
               | That there are gigantic torrents available is almost
               | useless. A popcorn time type of GUI client is needed that
               | allows search, can dl the right chunks reasonably fast,
               | seeds the rare chunks, has tit for tat implemented for a
               | group of torrents. Could even do full text search by
               | downloading all possible candidates after applying some
               | bloomfilter.
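
A toy sketch of the Bloom filter pre-selection mentioned in this comment, assuming one small filter of words per paper so that a client can pick download candidates before fetching anything. The class, its default sizes, and the sample DOIs are made up for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical DOIs and texts, one filter per paper.
papers = {
    "10.0000/example.1": "graphene conductivity measurement",
    "10.0000/example.2": "protein folding simulation",
}
filters = {}
for doi, text in papers.items():
    bloom = BloomFilter()
    for word in text.split():
        bloom.add(word)
    filters[doi] = bloom


def candidates(query: str) -> list[str]:
    """Return DOIs whose filter might contain every query word."""
    words = query.split()
    return [
        doi
        for doi, bloom in filters.items()
        if all(bloom.might_contain(word) for word in words)
    ]
```

A paper that contains all query words is always returned; a false positive only costs an unnecessary download, which the full-text check afterwards would discard.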
        
       | sega_sai wrote:
       | I don't think scihub is the future. OpenAccess is the future. I
       | know that in the UK you basically _have_ to post your accepted
       | papers in publicly accessible repositories if you want your
       | papers to count for the Research Excellence Framework exercise
       | (which basically compares Universities every 5 years). I know
       | many grants now have openaccess requirements. Plus in the field
       | like physics, basically everyone posts the papers to arxiv
       | anyway.
        
         | einpoklum wrote:
         | > I don't think scihub is the future. OpenAccess is the future.
         | 
         | It seems you are claiming that the future isn't making all
         | scientific content available, period - but rather only that
         | content whose copyright holders have decided to make available.
         | 
         | I whole-heartedly disagree. We must not submit to arbitrary
         | restrictions on the copying of information; and we certainly
         | cannot and should not wait for Elsevier, Springer, IEEE et alia
         | to grace us with access to articles.
         | 
         | Also - if "Open Access" means authors have to pay a large wad
         | of money to have their papers published - that's not tolerable
         | either.
        
         | crazygringo wrote:
         | OpenAccess can be _part_ of the future, but it's certainly not
         | doing anything about all the papers published in the past. How
         | are you going to address that?
        
         | orzig wrote:
         | "In the long run, we're all dead" - John Maynard Keynes
         | 
         | You might be right about the far future, but there's still a
         | lot of human flourishing that fails to happen every day until
         | then. Or you could be wrong, and I'd hate to /start/ having
         | this conversation once we realize that.
        
         | teddyh wrote:
         | > _OpenAccess is the future_
         | 
         | Yes, but the perfect can be the enemy of the good.
        
         | jmcgough wrote:
         | I just don't see things changing in bio sciences because the
         | people who benefit from it don't want it to change, and the
         | people who want it to change (labs without a lot of money, grad
         | students, post-docs) have the least power to change it.
         | 
         | Scihub is at least levelling the playing field and forcing the
         | conversation to happen.
        
         | esalman wrote:
         | One problem with open access is that the cost is prohibitively
         | high. It can range from $2500 in decent peer-reviewed journals
         | to $10,000+ in Nature. Recently we decided not to pay for open-
         | access for one of our articles (as we already had the preprint
         | out). One solution could be that the funding agencies take care
         | of the fees, because I can't see publishers charging less for
         | it.
        
           | ineedasername wrote:
           | Yes, it all has to be paid for somehow. Proofreading,
           | copyediting, typesetting, handling the logistics of physical
           | printing and distribution, usually an honorarium for the
           | journal editor, and probably other costs... It all takes
           | resources that have to be paid for somehow.
           | 
           | Personally, I think part of the solution would be to have
           | grants that are in some way publicly funded (taxes) to have a
           | portion set aside to pay publishing costs, and require
           | publishing in some way. This would both make open access with
           | well-polished articles more accessible, it would also help
           | solve the issue of negative results rarely being published.
           | 
           | Not perfect, not a silver bullet, but at least an incremental
           | improvement.
        
             | [deleted]
        
             | Dayshine wrote:
             | Perhaps they should just stop wasting money on physical
             | journals and stick to the web format that 99% of people
             | read?
             | 
             | Claiming $10k for a bunch of unnecessary work is
             | outrageous.
             | 
             | The cost of prestigious journals are the curation and high
             | standard of peer review. Except they don't actually pay
             | their staff for those things...
        
             | IshKebab wrote:
             | > Proofreading, copyediting, typesetting, ... physical
             | printing...
             | 
             | Uhm, I don't know if you've published a paper since the 90s
             | but none of those are costs that modern journals incur. Or
             | if they do, nobody asked them to.
             | 
             | The main thing journals do is peer review and that is all
             | done for free by other academics. Authors do basically all
             | of the typesetting, and although journals still insist on
             | printing issues there's really no need for them to do so.
             | 
             | The only really important thing that journals do is finding
             | and hassling reviewers.
        
         | threatofrain wrote:
         | But then SciHub can be the makeshift bridge between now and the
         | open access future.
        
         | amelius wrote:
         | You are forgetting that a significant chunk of science is still
         | locked up behind paywalls.
        
         | cblconfederate wrote:
         | Open access should be the yesterday. I don't mind if they have
         | to charge $2000 for a paper (but not $5000 to open the paper
         | one year later), but publishers should be _forced_ to open up
         | all existing papers to open access as well . But it is not
         | happening largely because academia is stuck in the chicken-and-
         | egg situation where open access cannot become prestigious while
         | people keep publishing in prestigious journals, and academics
         | have not been able to replace journal prestige with something
         | better
        
         | logifail wrote:
         | > I know that in the UK you basically have to post your
         | accepted papers in publicly accessible repositories (..)
         | 
         | How does that _actually_ work, though? Can anyone download the
         | final, published PDF from "publicly accessible repositories"?
         | 
         | (Full disclosure: ex scientist with published papers, still no
         | idea how I can legally share _my_ work with anyone who might be
         | interested...)
        
           | crimsoneer wrote:
           | Have a look at this
           | 
           | http://lesscrime.info/post/how-to-stop-hiding-your-research/
        
       | akvadrako wrote:
       | You should check out libgen-seedtools
       | 
       | https://github.com/subdavis/libgen-seedtools
       | 
       | The sci-hub archive is partly supported by libgen.rs. To ensure
       | that their content remains accessible, they have thousands of
       | very large torrents, many of which are not well seeded. If you
       | have a few TB of disk space and bandwidth to spare, it's a good
       | way to help out.
        
         | jszymborski wrote:
         | It's worth disclaiming that you may or may not be incurring
         | legal risk depending on your jurisdiction.
        
       | alfiedotwtf wrote:
       | For all the cryptocurrency, distributed, anti-censorship,
       | anonymous filestorage projects from the past few years, where the
       | hell are they all?
       | 
       | Cryptocoin community: hosting SciHub should be your platform's
       | Litmus Test. If you can't do this one thing, your anonymous,
       | decentralised, anti-censorship platform is a scam, so GTFO.
        
       | 533474 wrote:
       | Authors, publish your papers in your personal webpage. Do not
       | promote paywalls. To all others, donate and support
       | decentralization efforts. To those that are particularly wealthy,
       | please think about supporting financially too
        
         | logifail wrote:
         | > publish your papers in your personal webpage
         | 
         | Q: Do authors actually _have_ the right to republish the final
         | published version from the journal they submitted their work
         | to?
        
           | goerz wrote:
           | All journals I have ever published to (in theoretical
           | physics) explicitly allow the authors to put the journal PDF
           | of their article on their personal or institutional websites.
           | They also allow to have a "reprint" (identical content to
           | published version, but not the exact same PDF) to be on the
           | arXiv. I'm not aware of any journals in the field that don't
           | allow this. I'm sure they exist, but I would not consider
           | them for publication.
        
             | logifail wrote:
             | > explicitly allow the authors to put the journal PDF of
             | their article on their personal or institutional websites
             | 
             | Q: Where is an author supposed to _obtain_ the final,
             | "official" journal-approved PDFs in order to republish
             | them?
             | 
             | Unless I head for sci-hub, I don't have any of mine :(
        
               | goerz wrote:
               | You usually have institutional access. If not, I suppose
               | you can ask the publisher to email you the pdf. Or you
               | might be able to download it from the submission
               | website. I don't understand: you really don't have PDFs
               | of your own papers? Did you leave academia and delete
               | your data?
        
           | einpoklum wrote:
           | They can easily obtain this right using the "Standard Trick":
           | 
           | https://academia.stackexchange.com/a/119002/7319
        
         | lalaland1125 wrote:
         | There is no need for publishing papers on personal websites
         | when arxiv exists and is better.
        
           | kensai wrote:
           | Arxiv is preprint. Not peer-reviewed. Most times reviewed
           | articles have critical changes before they reach their final,
           | journal version.
        
             | petschge wrote:
             | Most if not all journals in my field allow you to upload
             | the final accepted PDF that you have typeset in latex
             | yourself to arxiv. We even put "accepted in $journal" and
             | the DOI in the comment field. This includes all the changes
             | you have made in response to referee comments. What you can
             | not upload is the final language-edited and nicely
             | laid-out version that the journal has built from your
             | submission.
        
               | [deleted]
        
             | remram wrote:
             | arXiv is not only for preprints. Whatever paper you
             | legally put on your website can and should go there (or
             | Zenodo, Figshare, OSF, etc).
        
           | 6510 wrote:
           | Do both.
        
         | jasode wrote:
         | _> Authors, publish your papers in your personal webpage. Do
         | not promote paywalls. _
         | 
         | This plea to authors to change their behavior _ignores_ why
         | they submit papers to paywall publishers: The prestigious
         | journal's acceptance of their paper helps _promote their
         | career_.
         | 
         | Academic publishing is not a _web host server for pdf files_
         | type of problem. Therefore, suggesting authors to upload their
         | pdf to a public Dropbox url, GoogleDrive, Github repo, or their
         | university faculty homepage doesn't solve the real problem. So
         | even if Scihub had a "direct upload pdf" option, that still
         | doesn't solve the underlying problem for getting their paper
         | _recognized_ for good work which spurs citations.
         | 
         | Scihub is a _distribution mechanism for pdfs_ but not a
         | _recognition and impact filter for which papers are
         | _important__. This is why scientists keep doing contradictory
         | behaviors: On the one hand, they praise Scihub because it gives
         | them access to papers -- but on the other hand, they keep
         | submitting to paywalled journals to help their career.
         | 
         | Think of _game theory incentives_ instead of hosting pdf files.
         | Journals have the _respected editorial staffs_ to look at their
         | submitted paper and _forward it to other peers for review_.
         | OpenAccess is a possible option but most OA journals don't
         | have the same prestige as the paywalled journals. That may change
         | but it will take a long time.
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2021-07-11 23:00 UTC)