[HN Gopher] Inciteful: Using Citations to Explore Academic Liter...
       ___________________________________________________________________
        
       Inciteful: Using Citations to Explore Academic Literature
        
       Author : adamnemecek
       Score  : 107 points
       Date   : 2020-12-19 18:21 UTC (4 hours ago)
        
 (HTM) web link (inciteful.xyz)
 (TXT) w3m dump (inciteful.xyz)
        
       | weishuhn wrote:
       | Creator here! Wasn't expecting this to happen :) The site is
       | definitely still in Beta so I appreciate any and all feedback. I
       | just launched it a few days ago. It's been my COVID project and I
       | finally got to the point where I felt comfortable having others
       | use it.
       | 
       | The biggest hurdle was the speed of the graph creation. Basically
       | taking a 250,000,000 paper/2,500,000 citation db and creating
       | graphs that could be up to 200k papers and 3-4mm citations. For
       | that I ended up learning/using Rust (which was a great
       | experience).
       | 
       | The plan is to keep it totally free and hopefully get some
       | institutional support once I get a better handle on demand and
       | costs.
       | 
       | Ask me anything!
       | 
       | EDIT: As you are going through the site, be sure to use the
       | purple "+" buttons to create your own graphs centered on the
       | topic of your choice. That combined with the in-graph keyword
       | filters are probably the most powerful ways to quickly zero in on
       | the most relevant literature.
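A minimal sketch of the kind of graph creation described above, not the site's actual Rust implementation: starting from one seed paper, a breadth-first search follows citation links in both directions and stops at a distance limit or a size cap. The function name `build_subgraph`, the `max_distance` and `max_papers` parameters, and the in-memory list of `(citing, cited)` pairs are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def build_subgraph(citations, seed, max_distance=2, max_papers=200_000):
    """Collect the papers within max_distance citation hops of seed.

    citations: iterable of (citing_id, cited_id) pairs (hypothetical input format).
    Returns (distance, edges): hop count per included paper, and the
    citation pairs whose two endpoints were both included.
    """
    citations = list(citations)

    # Treat citations as undirected so the walk goes both to the papers
    # a paper cites and to the papers that cite it.
    neighbors = defaultdict(set)
    for citing, cited in citations:
        neighbors[citing].add(cited)
        neighbors[cited].add(citing)

    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        if distance[paper] >= max_distance:
            continue  # do not expand past the distance limit
        for nxt in neighbors.get(paper, ()):
            if nxt in distance:
                continue
            if len(distance) >= max_papers:
                queue.clear()  # size cap reached: stop the whole search
                break
            distance[nxt] = distance[paper] + 1
            queue.append(nxt)

    edges = [(a, b) for a, b in citations if a in distance and b in distance]
    return distance, edges
```

At the scale mentioned in the comment, the adjacency data would have to be stored far more compactly than Python sets allow; the sketch only shows the traversal logic.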
        
         | jonmoore wrote:
         | Very nice work. I especially liked the ability to build up a
         | collection of papers, that the response time was good, and that
         | the SQL could be edited directly.
         | 
         | Do you have any plans to add a graphical visualization of
         | top/central papers?
        
           | weishuhn wrote:
           | That is the most requested feature and something I'm working
           | on. It's a fun (and hard) design/data problem. Which of the
           | 5k-150k papers do you show in the graph? And then how do you
           | render them in a way that is both visually appealing but also
           | conveys the most important information?
        
         | feanaro wrote:
         | Which crate are you using for graph manipulation?
        
           | weishuhn wrote:
           | I go through a lot of the details in my post on the Rust
           | subreddit:
           | https://www.reddit.com/r/rust/comments/kfiaqn/just_wanted_to...
           | 
           | But long story short, I end up doing most of the graph
           | analysis by passing in the citations, using PyO3, to
           | graph-tool in Python then returning the data I need about each
           | paper. I am planning on moving that over to Rust. But not
           | being an academic I wanted to get feedback on the quality of
           | the results before making it difficult to quickly test
           | different types of algorithms.
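To illustrate the shape of that hand-off, here is a pure-Python stand-in for the Python side: it takes the citation pairs and returns a few numbers per paper. It does not use PyO3 or graph-tool, and the comment does not say which metrics are actually computed; PageRank and a citation count are assumptions chosen as plausible examples, and `paper_metrics` is a hypothetical name.

```python
from collections import defaultdict


def paper_metrics(citations, damping=0.85, iterations=50):
    """Return {paper_id: {"cited_by": int, "pagerank": float}}.

    citations: iterable of (citing_id, cited_id) pairs (hypothetical format).
    """
    out_links = defaultdict(set)  # paper -> papers it cites
    papers = set()
    for citing, cited in citations:
        out_links[citing].add(cited)
        papers.update((citing, cited))

    n = len(papers)
    if n == 0:
        return {}

    cited_by = defaultdict(int)
    for refs in out_links.values():
        for cited in refs:
            cited_by[cited] += 1

    # Plain power-iteration PageRank over the citation links.
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Rank held by papers with no references is spread evenly.
        dangling = sum(rank[p] for p in papers if not out_links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in papers}
        for citing, refs in out_links.items():
            share = damping * rank[citing] / len(refs)
            for cited in refs:
                new_rank[cited] += share
        rank = new_rank

    return {p: {"cited_by": cited_by[p], "pagerank": rank[p]} for p in papers}
```

Keeping this step behind a single function that accepts pairs and returns a dictionary is one way to make it easy to swap algorithms while testing, which is the flexibility the comment says it wants to preserve.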
        
             | adamnemecek wrote:
             | Are you planning on open sourcing parts of it?
        
               | weishuhn wrote:
               | Eventually I'd like to move the site to open source, but
               | right now the repo isn't in a place where I can do that.
               | As for specific parts, it's pretty purpose built and this
               | is my first Rust project and so I'm not sure which parts
               | would be helpful to the community. And I doubt they would
               | meet the community's standards just yet :)
        
               | adamnemecek wrote:
               | It's better to open source sooner rather than later even
               | if it's not in a place you'd want it to be. Like some of
               | the work you have to do might be done by the community.
        
         | [deleted]
        
       | joshgev wrote:
       | First impression is positive; relevant results and a reasonably
       | straightforward UI. I appreciate the warning about the
       | slow-loading graph (which did load after a minute or two).
       | 
       | Two things that might be tweaked:
       | 
       | * The search didn't behave in an ergonomic way: I typed a query
       | ("graph neural networks") and great relevant stuff came up
       | immediately in the dropdown. When I hit enter, however, I got an
       | error that read "Invalid search: Check your spelling, enter a
       | DOI, or another paper identifier or." I would have expected my
       | action to take me to a search results page that listed what I saw
       | in the dropdown (which I regard as a preview of the top hits) so
       | that I could peruse the selection carefully.
       | 
       | * I wanted to load a paper to take a look at it and it took me a
       | while to realize that I could click the "Yes" above "Open Access"
       | to download it. Since one of the big use cases for a site like
       | this is the eventual consumption of these papers, I suggest
       | making a "read/download paper" call to action more explicit.
        
         | weishuhn wrote:
         | Thanks for the feedback. #1 might be some sort of bug. #2 Good
         | call!
        
       | comex wrote:
       | In case the creators of this site are reading, there are some
       | grammatical issues with the front page copy:
       | 
       | "on it's head" should be "on its head".
       | 
       | "not only with" should be "with not only".
       | 
       | "analysis'" should not have an apostrophe and should possibly be
       | "analyses".
        
         | weishuhn wrote:
         | Much appreciated! I'll update it. I really hadn't anticipated
         | much traffic. I was planning on doing an open beta but things
         | have kind of taken on a life of their own (in a good way).
        
       | adamnemecek wrote:
       | Here's the original announcement
       | https://www.reddit.com/r/rust/comments/kfiaqn/just_wanted_to...
        
       | nt2h9uh238h wrote:
       | Cool idea, but I think I broke the search with quantum computing.
       | No papers to download, no links, and no related pages. Just 404s
       | 
       | https://inciteful.xyz/p/186039733?&keywords=hello&maxDistanc...
        
         | whatshisface wrote:
         | You had "hello" as a keyword. I removed it and saw a bunch of
         | papers pop up.
        
       | tmabraham wrote:
       | Is this any better than Connected Papers?
       | https://www.connectedpapers.com/
        
         | weishuhn wrote:
         | Creator here. It's similar in that it uses citations to make
         | paper recommendations. But different in that I give you access
         | to the entire paper graph rather than trying to distill it down to just
         | a few. You can even write your own queries by clicking on the
         | "SQL" button at the bottom of each table. I kind of view it as
         | a Connected Papers for power users.
        
         | anigbrowl wrote:
         | Relevant to my interests, thanks.
        
       | currymj wrote:
       | this seems very nice -- very polished and gives good results.
       | 
       | I would like it if the bibtex entries had meaningful cite keys as
       | opposed to long numbers. as is, it would be pretty difficult to
       | actually write a paper using these bibtex files.
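One common convention for readable cite keys is the first author's surname, the year, and the first significant title word. The sketch below is an illustration of that convention, not something the site does; `make_cite_key` and the stop-word list are hypothetical.

```python
import re
import unicodedata

# Small illustrative stop-word list; a real tool would likely use a longer one.
STOP_WORDS = {"a", "an", "the", "on", "of", "in", "for", "and", "to", "with"}


def _to_ascii(text):
    """Lowercase the text and drop accents and other non-ASCII characters."""
    normalized = unicodedata.normalize("NFKD", text)
    return normalized.encode("ascii", "ignore").decode("ascii").lower()


def make_cite_key(surname, year, title):
    """Build a key such as 'doe2020study' from basic paper metadata."""
    author = re.sub(r"[^a-z0-9]", "", _to_ascii(surname))
    words = re.findall(r"[a-z0-9]+", _to_ascii(title))
    # Use the first title word that is not a stop word, if there is one.
    first_word = next((w for w in words if w not in STOP_WORDS), "")
    return f"{author}{year}{first_word}"


# Example with made-up metadata:
# make_cite_key("Doe", 2020, "A Study of Citation Graphs")  -> "doe2020study"
```

Keys built this way can collide when one author publishes two similar titles in a year, so a real exporter would also need a disambiguating suffix.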
        
         | weishuhn wrote:
         | I am planning on making the info in the BibTeX files a bit more
         | robust. Right now I'm just adding what I have readily
         | available. But in general, if you are using Zotero or Mendeley,
         | the functionality to enrich the metadata on the entries does a
         | good job filling in the missing info.
        
       | anigbrowl wrote:
       | 8 mentions of the 'graph' on the page. Zero renderings of the
       | graph :(
       | 
       | It _is_ a nice user interface and the reference material is
       | useful and well presented. But when you get down to it, it's
       | linear lists about the characteristics of a semantic graph. As
       | I've said many times, this is like describing a tree with a
       | tabular catalog of its leaves. Graph navigation needs a graphical
       | representation, because things like branchiness (node out-degree)
       | and other factors are more easily shown than described.
       | 
       | The basic problem with graph representation/navigation/traversal
       | is that there are many valid ways of looking at the
       | graph and it's hard to render them all. Maybe try using a gutter
       | to allow users to temporarily pin certain graph characteristics
       | and render accordingly. In this context, sometimes I might be
       | interested in the latest research that cites a paper, other times
       | I might be looking to see who picked it up first, or to apply
       | some sort of windowing function to a large spectrum of citations.
       | 
       | But I want to _see_ the graph, even if I am looking for a
       | particular leaf.
        
       | bachmeier wrote:
       | Curious about HN's rules on titles. How is the language relevant
       | in any way to this particular post? That sure seems to be
       | editorializing in an attempt to promote something other than the
       | project itself (and obviously to get upvotes). Given the explicit
       | rules against both editorializing and soliciting upvotes, I don't
       | see why this title is allowed.
       | 
       | > Otherwise please use the original title, unless it is
       | misleading or linkbait; don't editorialize.
       | 
       | > Don't solicit upvotes, comments, or submissions.
        
         | dang wrote:
         | From an HN point of view Rust is so frequently discussed that
         | it's probably better not to orient a submission like this in
         | that direction--it will probably get sucked in the generic
         | direction, and the diff here (new academic search engine) is
         | more interesting.
         | 
         | https://hn.algolia.com/?query=generic%20discussion%20by:dang...
         | 
         | https://hn.algolia.com/?query=curiosity%20repetition%20by:da...
         | 
         | https://hn.algolia.com/?query=diffs%20by:dang&dateRange=all&...
        
         | adamnemecek wrote:
         | Below, I link to the original announcement which was posted to
         | /r/rust. The author talks about the choice of Rust a bunch.
        
           | bachmeier wrote:
           | Okay, but my point still stands - that has nothing to do with
           | what was posted to HN. That part was included in the title to
           | get upvotes.
        
             | adamnemecek wrote:
             | It was included in the title since the author thought that
             | the language choice mattered.
        
               | libria wrote:
               | It does matter in the context of the author addressing
               | the /r/rust community. If you had posted the reddit link
               | instead this would make sense.
               | 
               | But I don't think the language is relevant for the direct
               | inciteful.xyz site itself. Better to submit both links
               | separately than trying to combine them as they have
               | different audiences.
        
               | adamnemecek wrote:
               | Posting it twice is a terrible suggestion.
        
               | dang wrote:
               | Posting one and linking to the other in the comments, as
               | you did, is the best way.
        
         | tedunangst wrote:
         | My first question was how much academic literature written in
         | rust is there.
        
         | chapium wrote:
         | There may or may not be some interesting reasons for choosing
         | rust, but also the original article does not even include Rust
         | in the title. It seems like Rust is mentioned solely for
         | attention in this case. Would "A better way to search through
         | academic literature" have been popular? Probably. How about "A
         | better way to search through academic literature written in
         | Java"? I assume not.
         | 
         | Regarding moderation, it's a thankless task which I don't envy
         | and it's hard to draw a line over nitpicky article titles when
         | one has been voted in already.
        
       | jszymborski wrote:
       | I'm a big fan of trying new search engines for academic research;
       | I hopefully am more likely to break out of whatever search bubble
       | I'm unaware I'm in.
       | 
       | This one in particular had some very nice features, some of which
       | are present in Semantic Scholar (my current favourite) but some
       | which are certainly not.
       | 
       | Recommending papers based on citation graphs is a good way to
       | very quickly get up to speed with fields I'm not too familiar
       | with, but I'm always wary that I'll end up back in the feedback
       | loop of very few popular papers rising to the top while perfectly
       | good papers go unseen because they weren't well cited in the year
       | they were written.
       | 
       | So I'll certainly keep an eye on this and give it a try, but I'm
       | certainly still in the market for a "serendipity" slider on such
       | recommendation engines.
        
         | weishuhn wrote:
         | If you have the chance, try the following:
         | 
         | 1. Find a paper you like in a field you want to learn.
         | 
         | 2. Use the keyword filters to filter down to papers that match
         | your criteria.
         | 
         | 3. Add a bunch of the interesting ones to a new graph using the
         | purple "+" buttons.
         | 
         | 4. On the "new" graph page, check out the similar papers
         | section. If any of them are interesting, add those to the
         | graph.
         | 
         | 5. Repeat until you don't find anything else that is
         | interesting.
         | 
         | The similar papers section uses a link prediction algorithm
         | that basically says: if two papers cite a bunch of the same
         | papers, rank them higher, BUT if a paper they both cite is itself
         | cited a bunch of times, don't give that connection much weight. The net
         | effect of this is that it doesn't really matter if the paper
         | was highly cited, only that it cites the same niche of papers
         | as the ones you just chose. Also, because of the temporal
         | nature of academic literature, the papers it brings up tend to
         | be the newer and harder to discover papers.
         | 
         | The results are pretty great and it's as close to the
         | "serendipity" slider that you'll get right now.
         | 
         | EDIT: Formatting
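The weighting described in the comment above can be sketched as follows. The comment does not name the algorithm; this version uses an Adamic-Adar-style weight (one over the log of how often the shared reference is cited), which is an assumption, as are the function name `similar_papers` and the `(citing, cited)` input format.

```python
import math
from collections import defaultdict


def similar_papers(citations, chosen, top_n=10):
    """Rank papers that share references with the chosen papers.

    Each shared reference adds 1 / log(1 + times_cited) to the score, so a
    reference that nearly everyone cites contributes little, while a niche
    reference contributes a lot.
    """
    references = defaultdict(set)  # paper -> papers it cites
    cited_by = defaultdict(set)    # paper -> papers that cite it
    for citing, cited in citations:
        references[citing].add(cited)
        cited_by[cited].add(citing)

    chosen = set(chosen)
    scores = defaultdict(float)
    for seed in chosen:
        for ref in references.get(seed, ()):
            # ref is cited by at least the seed, so the log is positive.
            weight = 1.0 / math.log(1 + len(cited_by[ref]))
            for other in cited_by[ref]:
                if other not in chosen:
                    scores[other] += weight

    # Highest score first; ties broken by paper id for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Note that a candidate's own citation count never enters the score, which matches the stated effect that it does not matter whether the candidate itself was highly cited.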
        
       | raister wrote:
       | How do you make it work with arXiv documents - I'm
       | copying/pasting the ID and it's not finding it. Bug or feature?
       | PS: very good idea, cheers.
        
         | weishuhn wrote:
         | The database is about a month out of date right now. I am going
         | to be updating it soon. You should be able to either put in the
         | url or do arxiv:XXXX.XXXXX
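A small sketch of how the two accepted input forms might be normalized into one identifier. This is a guess at the behavior, not the site's code: `normalize_arxiv` is a hypothetical name, dropping the version suffix is an assumption, and only new-style arXiv IDs (four digits, a dot, then four or five digits) are handled. A bare ID without the `arxiv:` prefix is deliberately not recognized, mirroring what the question above reports.

```python
import re

# New-style arXiv identifier, optionally followed by a version such as v2.
ARXIV_ID = r"(\d{4}\.\d{4,5})(?:v\d+)?"

_PATTERNS = [
    re.compile(r"^arxiv:" + ARXIV_ID + r"$", re.IGNORECASE),
    re.compile(
        r"^https?://(?:www\.)?arxiv\.org/(?:abs|pdf)/" + ARXIV_ID + r"(?:\.pdf)?$",
        re.IGNORECASE,
    ),
]


def normalize_arxiv(query):
    """Return 'arxiv:<id>' for a recognized input, otherwise None."""
    query = query.strip()
    for pattern in _PATTERNS:
        match = pattern.match(query)
        if match:
            return "arxiv:" + match.group(1)
    return None
```

Old-style identifiers that carry a subject prefix and a slash would need an additional pattern.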
        
       | petschge wrote:
       | I played a bit with the site and liked it. The one killer feature
       | I might pay a few dollars per year for is to create a profile,
       | with a few graphs attached and get a daily or weekly email when a
       | new paper is published that fits well into one of my graphs. Feel
       | free to add a monthly "the most important old paper that you have
       | not read yet (or not read in the last 5 years)" email too.
        
       ___________________________________________________________________
       (page generated 2020-12-19 23:00 UTC)