[HN Gopher] Show HN: Promnesia - an attempt to fix broken web hi...
       ___________________________________________________________________
        
       Show HN: Promnesia - an attempt to fix broken web history
        
       Author : karlicoss
       Score  : 164 points
       Date   : 2020-06-28 13:18 UTC (9 hours ago)
        
 (HTM) web link (beepb00p.xyz)
 (TXT) w3m dump (beepb00p.xyz)
        
       | dpacmittal wrote:
       | This is awesome! I've definitely wanted this for as long as you
       | have. I have this idea noted down exactly as you have described
       | somewhere in evernote. Well done! Looking forward to contributing
       | to it.
        
       | CWuestefeld wrote:
       | I've been frustrated by many of the same things, and have
       | recently been playing with Memex.
       | 
       | The solution outlined here leaves me with a couple of questions,
       | though.
       | 
       | 1. Since there's a local app acting as a service, it's not clear
       | to me how this would run on a mobile device.
       | 
       | 2. Once it is running on my mobile device (and home computer, and
       | work computer, and chromebook, and various other machines I use),
       | how do I aggregate all of the data? I'd like to be doing work-
       | related research at home in the evening, and be able to see the
       | fruits of it from the office.
       | 
       | I suspect that the answer to this is the same thing: that rather
       | than a locally-running server, I could put something on my home
       | server or on a cloud-based server, and direct all my various
       | devices to communicate with that rather than localhost?
        
         | karlicoss wrote:
         | Someone was asking that before, perhaps I should add to FAQ!
         | https://github.com/karlicoss/promnesia/issues/114
         | 
         | Yep, you could use a VPS or something and host it behind a
         | reverse proxy, that's what I've been doing so far.
         | 
         | Also for mobile specifically, on Android it works under Termux
         | (haven't personally tried yet, but can't see why not, and the
         | person in the issue I linked claims it works).
         | 
         | For data aggregation: it depends on the data source, but the
         | easiest way seems to be making sure your data ends up on a
         | single computer and indexing it there. After that you get an
         | sqlite database which you can simply sync with
         | Dropbox/Syncthing or anything else you prefer.
        
       | m0zg wrote:
       | Also, since the various web archives are getting shut down soon,
       | it'd be great if such extensions could locally and securely
       | preserve pages much like an archival crawler does it, or better
       | yet create a distributed archive that's impossible to shut down
       | or censor. Better yet still if there's local, language-aware
       | index over such pages so that I could search them easily, without
       | Google deciding what I should and should not see.
        
       | indentit wrote:
       | Nice description of what you're trying to solve - it certainly
       | resonates with me so I plan to try it out!
       | 
       | I've recently started trying Shiori[1] to manage my "bookmarks"
       | and preserve offline copies locally without relying on The
       | Internet Archive, however it still doesn't really help with
       | private content (i.e. pages only accessible as an authenticated
       | and authorized user) so it'd be great if Promnesia caters for
       | that. Plus the whole data silo thing...
       | 
       | I was a little surprised to see no mention of the "tree style
       | tabs" extension which can help with "where did I get to this link
       | from?" style questions
       | 
       | [1]: https://github.com/go-shiori/shiori
        
       | zingermc wrote:
       | Does promnesia server run a local HTTP server? How do you prevent
       | a website from slurping up the entire database?
        
         | karlicoss wrote:
         | Yep, it's a local HTTP server by default. It's also possible to
         | expose it via reverse proxy, and you can set basic auth
         | password in the extension's settings.
         | 
         | What do you mean by slurping here? Security-wise, a random
         | website shouldn't be able to query a localhost because of CORS
         | policies.
        
           | zingermc wrote:
           | Unfortunately, CORS isn't a magic bullet. Suppose a site
           | named evil.example adds a script tag pointing to
           | http://localhost:1234/promnesia.js and a victim loads
           | evil.example. If your JS updates a DOM element with info from
           | the database, evil.example's JS can read that DOM element and
           | report it back to the server, without violating CORS.
        
             | zingermc wrote:
             | To follow up: the solution is that the localhost server
             | needs to make sure each API call is authorized (if you
             | aren't already). This means there must be a login/setup
             | step.
             | 
             | An API call can't be considered authorized just because it
             | came from localhost :)
        
               | karlicoss wrote:
               | Thanks! Created an issue
               | https://github.com/karlicoss/promnesia/issues/115
        
             | karlicoss wrote:
             | Ah I see, thanks! Good point, and I guess basic auth would
             | protect against this sort of attack. So it seems it makes
             | sense to use a token even if it's running as localhost, I
             | could add an option, so it doesn't require setting up a
             | separate proxy.
             | 
             | Either way, I hope I've been fairly reasonable about
             | security so far, but I've mostly been concentrating on the
             | 'plugging in the data' bit, so it's possible I've
             | overlooked something (also I'm not a security specialist!).
             | There is an open issue in case people have any specific
             | concerns or spot something, happy to receive feedback!
             | https://github.com/karlicoss/promnesia/issues/14
        
               | zingermc wrote:
               | Awesome! Unguessable auth is the answer. You could even
               | have the server generate a uuid token and have the user
               | paste it into the browser extension.
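
A minimal sketch of the token idea discussed above, not Promnesia's actual implementation. The function names `generate_token` and `is_authorized` and the `Authorization: Bearer` header convention are assumptions made for illustration. It uses Python's `secrets` module rather than a UUID, which serves the same purpose of producing an unguessable value.

```python
import hmac
import secrets
from typing import Dict


def generate_token() -> str:
    """Create an unguessable token to show the user once at setup time.

    Hypothetical helper: the user would paste this value into the
    browser extension's settings.
    """
    return secrets.token_urlsafe(32)


def is_authorized(headers: Dict[str, str], expected_token: str) -> bool:
    """Check an 'Authorization: Bearer <token>' header.

    The comparison is constant-time so that response timing does not
    leak how many leading characters of a guess were correct.
    """
    value = headers.get("Authorization", "")
    scheme, _, presented = value.partition(" ")
    if scheme != "Bearer" or not presented:
        return False
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

The server would call `is_authorized` on every API request, including those arriving from localhost, and reject the request when it returns `False`.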
        
       | infogulch wrote:
       | The motivations and analysis of current problems resonate with
       | me deeply, thank you for the writeup!
       | 
       | Perkeep is another project that might be interesting to analyze
       | in this context. https://perkeep.org/
        
       | owenshen24 wrote:
       | Justifications are very well-reasoned; a good read in and of
       | itself.
        
       | mongojunction wrote:
       | Well done. The write up really brings together some concepts and
       | creates some clarity on things I've been feeling for a while.
       | 
       | Is the author aware of my history-based, fully interactive offline
       | archiver? https://github.com/dosyago/22120
        
         | gpm wrote:
         | Do you know of a reason that this can't work with firefox, or
         | is it probably just a matter of someone putting in the work?
        
         | karlicoss wrote:
         | Author here, thanks!
         | 
         | Haven't seen your tool in particular, thanks for the link, I'll
         | check it out. I only used https://github.com/pirate/ArchiveBox
         | before, but haven't set up an automatic archival pipeline
         | (yet)!
         | 
         | Also, integrating with local web archives is on my Promnesia
         | todolist! I expect them to be very useful for indirect history
         | retrieval, e.g. "I haven't visited that page, but it's within
         | one link". Having local web archives makes it possible to
         | implement such functionality in an efficient way.
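
The "within one link" idea can be sketched as follows, assuming a local archive has already been parsed into a mapping from each page to its outgoing links. The function name `one_link_away` and the shape of the `links` mapping are hypothetical and not taken from Promnesia.

```python
from typing import Dict, List, Set


def one_link_away(visited: Set[str], links: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each unvisited URL to the visited pages that link to it.

    `links` is assumed to come from locally archived pages: for every
    archived URL it lists the URLs that page links to.
    """
    result: Dict[str, List[str]] = {}
    # Iterate in sorted order so the output is deterministic.
    for source in sorted(visited):
        for target in links.get(source, []):
            if target not in visited:
                result.setdefault(target, []).append(source)
    return result
```

Because the link graph is local, a lookup like this needs no network requests, which is presumably what makes the feature cheap once archives exist.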
        
       | newman314 wrote:
       | This sounds like it might be close to meeting my use case.
       | 
       | I have bad memory and hence try to write down everything I can.
       | But often throughout a single day/week, I do research on a topic
       | and have a bunch of tabs open that I intend to come back to. Or I
       | read an article and several days later cannot recall where I
       | read it (HN, Twitter, etc.). This usually leads to a frantic
       | search until I can find what I'm looking for, as well as having
       | a ton of tabs open.
       | 
       | Manually grouping topics together is too hard. What would be
       | great is a tool that knows where I've been, discards bad
       | information (google search result, followed by near immediate
       | close) and some sort of an attempt at topic autoclassification
       | (SAP, storage, backup etc.) that gives me the confidence to close
       | tabs knowing that I can get back to a particular topic at a later
       | date.
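
One way to read "discards bad information" is a dwell-time filter. The sketch below assumes each visit records both an open and a close time, which ordinary browser history usually does not provide. The `Visit` type, the `keep_meaningful` name, and the 10-second threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Visit:
    url: str
    opened_at: float  # seconds since epoch (assumed to be available)
    closed_at: float  # seconds since epoch (assumed to be available)


def keep_meaningful(visits: List[Visit], min_seconds: float = 10.0) -> List[Visit]:
    """Drop visits whose tab was closed almost immediately after opening."""
    return [v for v in visits if v.closed_at - v.opened_at >= min_seconds]
```

Topic classification is a separate and harder problem that this sketch does not attempt.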
        
         | soulofmischief wrote:
         | Bruh I've got tabs open from years ago. Hundreds across
         | multiple VMs. I have tabs open that I migrated from my last
         | computer. Someone please help.
        
           | karlicoss wrote:
           | I've struggled for a while with this kind of overload and
           | ended up with a system that makes it manageable:
           | 
           | 1. make it as easy to 'bookmark' stuff as possible -- with a
           | single hotkey
           | 
           | 2. make it as easy to search over bookmarks as possible --
           | also ideally with a single hotkey or as quick as you could do
           | google search
           | 
           | My way of achieving this is using org-mode files for
           | 'bookmarks' [0] and using emacs/ripgrep to search over it
           | [1]. Additional benefit of org-mode is that it's very easy to
           | add notes, priorities, refile bookmarks, so the most
           | interesting stuff propagates through my notes, and I don't
           | feel bad about missing out on information that I don't have
           | time to process because I can always quickly find it when I
           | need.
           | 
           | [0] https://github.com/karlicoss/grasp#readme
           | 
           | [1] https://beepb00p.xyz/pkm-search.html#personal_information
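
For readers without emacs or ripgrep, the search half of this workflow can be approximated with a few lines of Python. This is a rough sketch, not the author's setup: the function name `search_notes` is hypothetical, and it assumes bookmarks live in `.org` files somewhere under a single directory.

```python
from pathlib import Path
from typing import Iterator, Tuple


def search_notes(root: Path, query: str) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, line) for case-insensitive matches in .org files."""
    needle = query.lower()
    for path in sorted(root.rglob("*.org")):
        # errors="replace" keeps the search going past badly encoded bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if needle in line.lower():
                    yield path, lineno, line.rstrip("\n")
```

A dedicated tool such as ripgrep will be much faster on large note collections; the sketch only shows how little is needed for plain-text bookmarks to be searchable.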
        
       | rosstex wrote:
       | As a PhD student, your post reads like a beautiful research
       | paper. Motivation, prior work, contributions, technical details,
       | example use cases, self-references, future work, even a system
       | design chart. You've certainly sold me on the extension, great
       | work!
        
       | j88439h84 wrote:
       | Have you thought about using SingleFile/SingleFileZ [1] to
       | download archives of the pages instead of using links to wayback?
       | 
       | [1]
       | https://chrome.google.com/webstore/detail/singlefilez/offkdf...
        
       | idm wrote:
       | You've convinced me to try it out.
       | 
       | My personal knowledge management project, Gthnk (gthnk.com),
       | would appear to plug in easily as a Source - without any special
       | plugin necessary. I really like what you've made!
        
       | StavrosK wrote:
       | I've been thinking about this problem a lot myself too, and I'm
       | currently rewriting www.historio.us to attack the problem more
       | efficiently. I've been considering various new features, and this
       | writeup is very useful, thank you.
        
       | an4rchy wrote:
       | This is awesome! I just started using the WorldBrain Memex and
       | was trying to solve the issue of accessing other data sources, so
       | perfect timing -- thanks!
       | 
       | Looking forward to trying it out.
        
       | spurgu wrote:
       | This might be more suitable as a Github issue but since you're
       | here, I'm simply getting an error using Brave: "ERROR: Failed to
       | fetch" (shown in the extension popup when clicking the eye, which
       | is always red)
       | 
       | Another thing: Have you considered adding annotation capability
       | directly into the extension? This is something I've thought about
       | creating an extension for, since I don't use anything like
       | Instapaper.
        
         | karlicoss wrote:
         | Very unlikely I'll be adding support for annotations -- the
         | idea is using the existing tools and integrating with their
         | data. Otherwise I end up reimplementing yet another annotation
         | tool :).
         | 
         | If you're looking for something similar to Instapaper, but
         | local only, your best bet is probably Worldbrain Memex. And as
         | I mentioned in the post, I was thinking of potentially
         | integrating with them tighter anyway.
        
         | rosstex wrote:
         | You have to run the local Python server by following the next
         | instructions.
        
           | karlicoss wrote:
           | Yep! I guess I should make the error more clear in the
           | extension and point to the readme.
           | 
           | In theory, I could make it defensive too and allow using
           | without the local backend (only with local browser history),
           | but not sure if there is much value in this.
        
             | rosstex wrote:
             | I think the aspect of knowing where you browsed to a page
             | from, and visualizing a hierarchy of pages within a site
             | that I've visited, are the most interesting parts for me,
             | and those certainly apply to the browser history alone.
        
               | karlicoss wrote:
               | Fair enough! Created an issue
               | https://github.com/karlicoss/promnesia/issues/120
        
       | m-localhost wrote:
       | Great write up for a problem I'm thinking about myself a lot
       | (https://marcus-obst.de/wiki/Notetaking)
       | 
       | Thanks also for using the term Yak Shaving - for one, I got
       | curious which came first, the term or the Ren & Stimpy episode
       | illustrating it, and second, I found a description of most of
       | my modus operandi.
        
       | ybbond wrote:
       | I have been following this post too, I mean since you first
       | published it. I am using Worldbrain's Memex 2, and when I saw
       | this post reposted, I checked the "Memex 2" section.
       | 
       | There is an update! Maybe I will look into Promnesia and StorexHub
       | integration next weekend. Thank you for your effort with
       | Promnesia!
        
       | kemonocode wrote:
       | This sounds like something that would be close to meeting my
       | needs. I, too, end up leaving far too many tabs open and I feel
       | the need to have something in between a bookmark I'll never look
       | at again and may have little context as to why I may have created
       | it to begin with, and a tab just eternally polluting my browser
       | and that might just end up getting sent to OneTab and thus as a
       | "lesser" bookmark. I know Firefox (and probably Chrome as well)
       | lets you leave tags on bookmarks, but these always seem like
       | they're hardly enough. And that's without even mentioning all the
       | different pseudo-bookmarks scattered over many different
       | services!
        
       ___________________________________________________________________
       (page generated 2020-06-28 23:00 UTC)