[HN Gopher] Show HN: Self-hosted offline Internet from your brow... ___________________________________________________________________ Show HN: Self-hosted offline Internet from your browsing history Author : graderjs Score : 297 points Date : 2020-11-11 16:15 UTC (6 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | segmondy wrote: | I just wanna cache all my bookmarks, I rarely look at them, but | when I do go look at them, a good chunk tend to have rotted. It | would be awesome to cache all my bookmarks and then have the | option to recursively cache the path I'm in. I don't want to | cache every page I visit, 90% is junk. | knyazhefilms wrote: | >have the option to recursively cache the path I'm in | | It's interesting, what do you mean by that? | jb775 wrote: | Would be cool if you could create a whitelist of websites, then | have a feature to check if any other users have more recent | versions of those sites (if they happen to be online). This way | you get decentralized site updates without actually going to each | site itself. | cutemonster wrote: | Yes. But also keep one's own originally downloaded version, in | case the newer version is messed up | | And even more cool: If one could browse one's friends' sites, | while everyone was offline (if their privacy / sharing settings | allowed), just a local net in maybe a rural village | | Edit: roadmap: "Distributed p2p web browser on IPFS" -- is that | it? :-) | severine wrote: | Given the hard 'no' in the FAQ, does anyone know about a similar | project for Firefox? | rzzzt wrote: | Could the "HAR" file I can save from Firefox's Network tab | somehow be used for this? That looks to be a recording of the | entire timeline, including payloads.
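For reference, a HAR capture is just JSON, so its response bodies can be pulled out programmatically. A minimal sketch in Python; the file layout and output naming here are our own assumptions, but the `log.entries[].response.content` structure (with the `encoding: "base64"` flag for binary bodies) is part of the HAR format:

```python
import base64
import json
from pathlib import Path

def extract_har_bodies(har_path, out_dir):
    """Write each captured response body in a HAR file to disk.

    Entries whose body was not captured (no "text" field) are skipped.
    Returns the total number of entries seen.
    """
    entries = json.loads(Path(har_path).read_text(encoding="utf-8"))["log"]["entries"]
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, entry in enumerate(entries):
        content = entry["response"].get("content", {})
        text = content.get("text")
        if text is None:
            continue  # body not captured for this entry
        # HAR stores binary bodies base64-encoded, flagged via "encoding"
        if content.get("encoding") == "base64":
            body = base64.b64decode(text)
        else:
            body = text.encode("utf-8")
        (out / f"{i:05d}.bin").write_bytes(body)
    return len(entries)
```

This only addresses the extraction half; getting HAR files out of the browser automatically (rather than via manual export from the Network tab) is the part the comments below note is harder.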
| franga2000 wrote: | I have used HAR files for archiving purposes in the past and | it did work fairly well, but I'm not sure if there's a way of | getting them programmatically | BlackLotus89 wrote: | As the title of the GitHub repo suggests, ArchiveBox can be | used. You have to manually import your browsing history | though. | | In theory you could also use yacy... But that is intended as a | search engine and not as an archive. | | Edit: while looking into it I found alternatives [2] and Memex | [3] seems to be interesting. | | Edit2: I remember 2 Show HNs. One recorded your entire desktop | and made it searchable. Can't remember what that was called, | but AllSeeingEye I did find [4] | | [0] https://archivebox.io/ | | [1] https://yacy.net/ | | [2] https://docs.archivebox.io/en/latest/Web-Archiving- | Community... | | [3] https://getmemex.com/ | | [4] https://news.ycombinator.com/item?id=7886270 | severine wrote: | Thanks for your answer, I had missed ArchiveBox completely! | | I currently use Memex, but this is a different approach, and I | keep looking for a polished experience that can get more | mainstream users into archiving/offline browsing. | vezycash wrote: | Add Webrecorder to the list | BlackLotus89 wrote: | It's now called Conifer and listed under my [2] link :) but | thank you for mentioning it by name so I could look it up | again, seems interesting. | | Looks like I've got some research for this week | tiborsaas wrote: | I'd love to see an entry in the FAQ explaining the weird name. | guavaNinja wrote: | Just the port they used by default, to help remember it | deelawn wrote: | Seems cool, but can someone explain to me the need for all of the | obfuscated code in the files with "22120" in the name? | totony wrote: | I think those are the build artifacts | alliao wrote: | I miss RSS primarily because I was able to search for stuff | I've either read or care about...
| | I am embarrassed to admit that a disproportionate amount of | my time is spent looking for a sentence or god forbid a tweet | I vaguely remember reading last week. | | So yes, consider this a vote for that sexy full text search | please. | [deleted] | avmich wrote: | Awesome thing, but - | | > Can I use this with a browser that's not Chrome-based? > No. | | Note that a (rather similar) thing I participated in in 2002 | was browser-neutral. | nosmokewhereiam wrote: | This is really cool. Thank you for being open source. | xiphias2 wrote: | I'd love to use this on my mobile, as that's where I mostly have | problems with connecting to the internet, but it still looks pretty | interesting | peterburkimsher wrote: | It looks like something I'd appreciate! I make a significant | effort to archive things that I think I'll need. | | Unfortunately it didn't work when I just tried installing it now | (macOS 10.13.6, node v14.8.0). | MacBook-Pro:Desktop peter$ npx archivist1 npx: installed 79 in | 8.282s Preferences file does not exist. Creating one... | Args usage: <server_port> <save|serve> <chrome_port> | <library_path> Updating base path from undefined to | /Users/peter... Archive directory | (/Users/peter/22120-arc/public/library) does not exist, | creating... Created. Cache file does not exist, | creating... Created! Index file does not exist, | creating... Created! Base path updated to: | /Users/peter. Saving to preferences... Saved! Running | in node... Importing dependencies... Attempting to | shut running chrome... There was no running chrome. | Removing 22120's existing temporary browser cache if it exists... | Launching library server... Library server started. | Waiting 1 second... | {"server_up":{"upAt":"2020-11-11T21:48:25.324Z","port":22120}} | Launching chrome...
(node:33988) | UnhandledPromiseRejectionWarning: Error: connect ECONNREFUSED | 127.0.0.1:9222 at TCPConnectWrap.afterConnect [as | oncomplete] (net.js:1144:16) (Use `node --trace-warnings | ...` to show where the warning was created) (node:33988) | UnhandledPromiseRejectionWarning: Unhandled promise rejection. | This error originated either by throwing inside of an async | function without a catch block, or by rejecting a promise which | was not handled with .catch(). To terminate the node process on | unhandled promise rejection, use the CLI flag | `--unhandled-rejections=strict` (see | https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). | (rejection id: 1) (node:33988) [DEP0018] | DeprecationWarning: Unhandled promise rejections are deprecated. | In the future, promise rejections that are not handled will | terminate the Node.js process with a non-zero exit code. | (node:33988) UnhandledPromiseRejectionWarning: TypeError: Cannot | read property 'writeFileSync' of undefined | at ae (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:321:14209) | at Object.changeMode (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:321:8088) | at /Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:321:16174 | at s.handle_request (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:128:783) | at s (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:121:879) | at p.dispatch (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:121:901) | at s.handle_request (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:128:783) | at /Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:114:2533 | at Function.v.process_params (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:114:3436) | at b (/Users/peter/.npm/_npx/33988/lib/node_modules/archivist1/22120.js:114:2476) | (node:33988) UnhandledPromiseRejectionWarning: Unhandled
promise | rejection. This error originated either by throwing inside of an | async function without a catch block, or by rejecting a promise | which was not handled with .catch(). To terminate the node | process on unhandled promise rejection, use the CLI flag | `--unhandled-rejections=strict` (see | https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). | (rejection id: 3) ^CCleanup called on reason: SIGINT | MacBook-Pro:Desktop peter$ | halukakin wrote: | This was a nice IE feature 20 years ago. | | https://support.microsoft.com/en-us/help/196646/how-to-make-... | atum47 wrote: | Very interesting indeed. I remember having to paste a script in | the console in order to be able to view my cached files. | abnry wrote: | I am coding my own hacked-together bookmarks manager. I can save | any page with the click of a button using SingleFile (a fantastic | Chrome extension, by the way!). | | Then a cronjob runs and puts it into a folder to be processed | into a database, which generates a static HTML index and puts it | in my Google Drive. | | Then it syncs offline on my Chromebook. Which means that without | internet, I can put my Chromebook in tablet mode and do some nice | reading. I've been very pleased so far. | johnmaguire2013 wrote: | Any chance you have this open-sourced or described in more | detail somewhere? | abnry wrote: | It's too hackish, system-dependent, and not feature-complete | yet. I plan to run it as a flask app on the local network as | a more intuitive way of tagging and managing bookmarks... a | lot more to do. | kilroy_jones wrote: | I had started working on something similar to this, but without | the Google Drive component. I wanted something where I could | right click and "snag" a file, link or document and have it | saved to a server I controlled. | | It's not complete, mostly because the frontend is a mess, but | the backend is able to save files, pages and links | (https://gitlab.com/thebird/snag).
I used a Rust backend, with Svelte | and JS for the extension (of course). | rezeroed wrote: | Pocket? | throwii wrote: | I currently extract bookmarks from Firefox and Safari and store | them inside a local database. Then a cronjob saves them to the | Wayback Machine if a prior check reveals that they aren't already | there. I donate regularly for that. Mine just makes | sure that the pages are not lost, but yours enables offline | reading. | | I'm uncertain what the best mechanism is; there are so many | ways to solve it. From filtering to recrawling for new content | to enabling more advanced features, there are so many | possibilities. | agumonkey wrote: | one day this will be as famous as youtube-dl | jbc1 wrote: | https://github.com/pirate/ArchiveBox | darepublic wrote: | Interesting stuff, I will look into this more | wooptoo wrote: | Isn't this how the internet was supposed to work in the first | place? I remember Netscape Navigator having a 'go offline' icon | in the corner. | rusk wrote: | Kind of. HTTP was designed with caching in mind, so the idea | was that if you GET a page it should more or less not change, | and you could add headers and stuff to instruct proxy servers | about whether to cache or not and for how long. I think you | could use HEAD then to check if a page had changed ... | | The browser cache used to actually be quite dependable as an | offline way to view pages but this seems to have fallen out of | favour in the mid-noughties. I remember how disgusted I was | when I realised Safari was no longer letting me see a page | unless it could contact the server and download the latest | version. | | I used to have a caching proxy server that would basically MITM | my browsing and be more vigilant than even the cache and it | really worked quite well. This was back in the 90s when every | bit of your max 56kbps counted, or when you wanted to read | something while your Dad or sister wanted to also use the | phone.
| | Anyway, you can no longer take this approach because bad people | broke the Internet and now you have to have a great honking | opaque TLS layer between you and the caching servers, so there's | no way for this optimisation to work any more. | | Of course it isn't really as important these days because we've | got faster connections and interactions with the server are far | less transactional and richer. But I still would like to have a | way of tracking my own web usage and being able to go back in | time without having to actually revisit each and every site. | | These days you have to hack the browser because that's where | your TLS endpoint emerges. Kaspersky tried this for their HTTP | firewall application and there were ructions over that. | | I'll defo take a look at this. Sounds just like what I've been | looking for. | | > Isn't this how the internet was supposed to work in the first | place? I remember Netscape navigator having a 'go offline' icon | in the corner. | | Thinking back, if you forget about "the web"/HTTP - | then yes - this is exactly how Usenet worked, and now | I'm remembering that the "go offline" button used to download | all your newsgroups along with your email and stuff so you | could look at it all offline :-) | | If you want something that's like Usenet these days, check out | Scuttlebutt. | romanoderoma wrote: | Don't know if it has changed recently, but that's how the | internet in Cuba worked (at least as of 2014) | | https://www.google.com/amp/s/amp.theguardian.com/world/2014/... | lights0123 wrote: | Firefox still has it under File. | runxel wrote: | Sure, but does it do anything? | teddyh wrote: | In the modern hamburger menu, it's under "More". | anonymfus wrote: | Even if you have the menu disabled, you can still open it by | pressing Alt; no need to suffer the hamburger. | reaperducer wrote: | I remember that button, too, but I think it had more to do with | connection charges than caching.
| | In Netscape days, many people would have to pay by the minute | to be connected to the internet. In those days, web pages | generally contained far more information than they do now, and | were less interactive. So you'd connect, load the content you | wanted to see, disconnect, and then just sit there and read it | for free, instead of bleeding cash. | asdff wrote: | Or even dialup. Expecting a phone call? Go offline. | rusk wrote: | Back in the nineties the web was fairly new and people still | used a thing called Usenet quite a bit. You interacted with | it kind of like email (Google Groups is actually the final | vestige of it) - and the go offline button would just | download all your emails and newsgroups and you could peruse | them offline at your pleasure. It might seem strange also | that back in those days you downloaded your emails from a | server using POP3 rather than looking at them remotely (e.g. | Web or IMAP), and you viewed them offline. | derefr wrote: | I'm not sure that was the use-case. With pre-DHTML HTML4, | there really just wasn't anything on a page that could | continue to interact with the server after the page finished | loading. So, presuming the button was for your described | use-case, what would the difference be between "going offline" | and just... not clicking any more links? (It's not like | Netscape could or _should_ signal your modem to hang up -- | Netscape doesn't know what else in your OS might also be | using the modem.) | Donald wrote: | You must be young :) | | These browsers were born in the era of dialup Internet that | had per-minute charges and/or long distance charges. At the | very least you were tying up your family's phone line. | | Basically it's like paying for every minute your cable | modem is plugged in. | | For the feature itself: Netscape had integration with the | modem connectivity for the OS and would initiate a | connection when you tried to visit a remote page.
Offline | mode let you disable automatic dialing of the modem. | derefr wrote: | I ran a BBS, my friend :) I'm quite familiar with modems. | I just never used Windows (or the web!) until well past | the Netscape era, so I'm not too familiar with the | intersection of modems and early web browsers. | | > Netscape had integration with the modem connectivity | for the OS and would initiate a connection when you tried | to visit a remote page. | | That's not "integration with modem connectivity", that's | just going through the OS's socket API (or userland | socket stack, e.g. Trumpet Winsock), where the socket | library dials the modem to serve the first bind(2). Sort | of like auto-mounting a network share to serve a VFS | open(2). | | Try it yourself: boot up a Windows 95 OSR2 machine with a | (configured) modem, and try e.g. loading your Outlook | Express email. The modem will dial. It's a feature of the | socket stack. | | These socket stacks would also automatically hang up the | modem if the stack was idle (= no open sockets) for long | enough. | | My point was that a quiescent HTML4 browser _has_ no open | sockets, whether or not it's intentionally "offline." If | you do as you say -- load up a bunch of pages, and then | sit there reading them -- your modem _will_ hang up, | whether or not you play with Netscape's toggles. | | (On single-tasking OSes like DOS -- where a TCP/IP socket | stack would be a part of a program, rather than a part of | the OS -- there was software that would eagerly hang up | the modem whenever its internal socket refcount dropped | to zero. But this isn't really a useful strategy for a | multitasking OS, since a lot of things -- e.g. AOL's | chatroom software presaging AIM -- would love to poll | _just_ often enough to cause the line that had just | disconnected to reconnect. Since calls were charged | per-minute rather than per-second, these reconnects had | overhead costs!)
| | > [Netscape's] offline | mode let you disable automatic | dialing of the modem. | | When you do... what? | | When you first open the browser, to avoid loading your | home page? (I guess that's sensible, especially if you're | using Netscape in its capacity as an email client to read | your already-synced email; or using it to author and test | HTML; or using it to read local HTML documentation. And | yet, not _too_ sensible, since you need to _open_ the | browser to _get_ to that toggle... is this a thing you | had to think about in advance, like turning off your AC | before shutting off your car?) | | But I think you're implying that it's for when you try to | navigate to a URL in the address bar, or click a link. | | In which case, would the page, in fact, be served from | the client-side cache, or would you just get nothing? | (Was HTTP client-side caching even a _thing_ in the early | 90s? Did disks have the room to _hold_ client-side | caches? Did web servers by-and-large _bother_ to send | HTTP/1.0 Expires and Last-Modified headers? Etc.) | fiddlerwoaroof wrote: | I used to go into offline mode so the browser would | access pages from the cache when I went to their URLs. It | wasn't a ton, but it was enough that you could queue up a | handful of sites, go offline and then, if you | accidentally closed the tab, re-open it and see the | cached version. | rzzzt wrote: | Server-Sent Events? How old is that mechanism? | [deleted] | yamrzou wrote: | This uses the Chrome DevTools Protocol in a pretty clever way. I used | it to archive a highly interactive website and it worked like a | charm. | | The README states: "It runs connected to a browser, and so is | able to access the full-scope of resources (with, currently, the | exception of video, audio and websockets, for now)" | | I wonder what kind of limitations make it hard to intercept | those resources like the rest of the content.
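The interception described above rides on Chrome DevTools Protocol messages: an archiver listens for `Network.responseReceived` events, then asks for each body with `Network.getResponseBody`. Below is a rough, transport-agnostic sketch of that message flow; only the CDP method and event names are real, while the class and its bookkeeping are our own illustration, and you would wire it to a websocket such as `ws://localhost:9222/devtools/page/<id>` yourself. Notably, WebSocket traffic arrives only as `Network.webSocketFrameReceived` events with no `getResponseBody` equivalent, which hints at why two-way channels are harder to capture:

```python
import itertools
import json

class CDPRecorder:
    """Minimal model of archiving over the Chrome DevTools Protocol.

    Feed it decoded CDP frames; it returns the follow-up commands an
    archiver would send back, and collects response bodies by URL.
    """

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}   # CDP command id -> requestId awaiting a body
        self.urls = {}       # requestId -> url
        self.bodies = {}     # url -> response body (str)

    def start_commands(self):
        """Commands to send once the websocket is connected."""
        return [json.dumps({"id": next(self._ids), "method": "Network.enable"})]

    def on_message(self, raw):
        """Handle one incoming CDP frame; return commands to send back."""
        msg = json.loads(raw)
        if msg.get("method") == "Network.responseReceived":
            req_id = msg["params"]["requestId"]
            self.urls[req_id] = msg["params"]["response"]["url"]
            cmd_id = next(self._ids)
            self._pending[cmd_id] = req_id
            return [json.dumps({"id": cmd_id,
                                "method": "Network.getResponseBody",
                                "params": {"requestId": req_id}})]
        if "id" in msg and msg["id"] in self._pending:
            # Reply to one of our getResponseBody commands
            req_id = self._pending.pop(msg["id"])
            self.bodies[self.urls[req_id]] = msg.get("result", {}).get("body", "")
        return []
```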
| jmaygarden wrote: | Video and audio are probably just a matter of not having gotten | around to it yet. WebSockets are another matter. I'm not sure | what one would do with a two-way channel in a general sense. | It's often not an idempotent operation. | nexuist wrote: | Video files are massive, so it may just be the case that | archiving videos takes so long they didn't want to support it. | harlanji wrote: | I could smash in the audio visual, as my platform is all about | archiving and being the origin for a CDN-fronted open | offline-first platform with minimal resources that can go into a | boat etc. tinydatacenter.com, Github / harlanji / | ispooge,tinydatacenter, biz@harlanji.com - need $375/wk. | a254613e wrote: | I remember seeing this on reddit; the license changed quite a lot | over the past month, with some very weird custom licenses asking | you not to be a fake victim, lie, etc. in the process - | https://github.com/c9fe/22120/commits/master/LICENSE - how safe | is it to assume that the current license will stay? | ciarannolan wrote: | Assuming good faith in the creator of this, it looks like they | tried to type up something that they thought would cover their | bases, then realized that wasn't right and copy/pasted in a | real license. | yuskii wrote: | This post has made me angor; | caymanjim wrote: | This is a neat idea, but I wish it would respect some basic Unix | standards by default. Two big annoyances jump out at first | glance: it assumes you want to use port 22120, and it puts its | config in ~/22120-arc. Maybe both of these are configurable, but | the directory is a terrible default. Use XDG (~/.config/22120) or | _at least_ use a hidden directory in the home dir. And the port | it operates on should be completely configurable. Naming the | project 22120 is a terrible idea, and assuming that port won't | need to change is bad practice. | | I'm not making any value judgment about the actual tool. It | sounds interesting enough.
But it should behave better. | fizixer wrote: | I agree with what you say, while also pointing out that the unix | home directory has become a complete mess. Anyone (any installed | software) can do whatever they like, there is no mechanism of | enforcement, and advice in the form of constructive critique or | comment is not even a drop in the bucket towards fixing the | problem. | lxgr wrote: | Very interesting project. | | I wish this was actually (optional) built-in behavior for | browsers when bookmarking pages, or at least when adding to a | "read later" list like Pocket/Instapaper etc. | | Pocket seems to offer something like this, but only in the | premium version, so the "permanent archive" ironically seems to | go away when unsubscribing. | | As a workaround, what if bookmarking a (public) page could | actually ping it to archive.org for archival? | sbeckeriv wrote: | selfplug: https://sbeckeriv.github.io/personal_search/ | | I am working on a personal project like this. It is in its | early stages. I am creating a local search based on my browser | history, so it doesn't crawl pages. Also the fetch happens | outside the browser, so authed URLs are not supported out of | the box. | | I currently have a bookmarklet that lets me "pin" a page; my | pinned pages are my new home page. It's how I keep my tabs | closed. | | I do not do a full archive (but I could). Instead you get | an offline view that is stripped of most things. Example: | https://raw.githubusercontent.com/sbeckeriv/personal_search/... | | Demo of the pin: | | https://www.youtube.com/watch?v=5g_mXXFwQlg | | A self-hosted version is on the roadmap. | asaddhamani wrote: | I have a project, https://www.github.com/dhamaniasad/crestify, | that does archival to archive.org and archive.today; you | might find it useful | shrike wrote: | Pinboard.in (not affiliated, just a happy customer) offers an | archiving service for saved bookmarks.
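The "ping it to archive.org" idea above maps onto the Wayback Machine's public Save Page Now endpoint, reached by fetching `https://web.archive.org/save/<url>`. A hedged sketch; the User-Agent string and the return-value handling are assumptions, and real usage should respect the service's rate limits:

```python
import urllib.request

SAVE_ENDPOINT = "https://web.archive.org/save/"

def save_page_now_url(url):
    """Build the Wayback Machine 'Save Page Now' URL for a page."""
    return SAVE_ENDPOINT + url

def archive_bookmark(url, timeout=30):
    """Ask the Wayback Machine to snapshot `url` (network access required).

    Returns the snapshot location if the service reports one, else the
    final URL of the response.
    """
    req = urllib.request.Request(
        save_page_now_url(url),
        headers={"User-Agent": "bookmark-archiver/0.1"},  # identify the client
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # A successful save typically echoes the snapshot path in
        # the Content-Location header
        return resp.headers.get("Content-Location") or resp.geturl()
```

A browser extension's bookmark-created hook could call `archive_bookmark` (or its equivalent) fire-and-forget, which is roughly the workaround being suggested.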
| tokamak-teapot wrote: | What it doesn't offer is an integration with the browser to | make it seamless to work with those bookmarks. There are | various extensions for Firefox which will save to Pinboard | (one of them is mine!) but to work with them you have the | option of going to the website or using the mobile site in a | sidebar (I do this, with some custom CSS to make it more | readable for me). | | There's a nice macOS application (sorry, can't remember the | name right now) which gives you a better interface, but... | they're bookmarks. I would like them to be integrated with | regular browser bookmarks. And to be usable when the site is | down or I'm offline. And to appear when I search... lots of | possibilities there. | severine wrote: | _I wish this was actually (optional) built-in behavior for | browsers when bookmarking pages, or at least when adding to a | "read later" list like Pocket/Instapaper etc._ | | Sideshow Ask HN: Didn't Firefox mobile work like this? I could | read the reader view items offline... | | Anyone know what's happening with the whole | bookmarks/collections situation? | gildas wrote: | I implemented some options for that purpose in SingleFile [1]. | They allow you to save the page when you bookmark it and | optionally replace the URL of the page with the file URI on | your disk. | | [1] https://github.com/gildas-lormeau/SingleFile | johnchristopher wrote: | Cool, I was so into MAFF back in the day. I'll give it a | try. | | (I even wrote this before checking out your link: Have you | heard of https://en.wikipedia.org/wiki/Mozilla_Archive_Format | from two or three Internets ago? If so, what are your thoughts | on it?) | gildas wrote: | I'd recommend taking a look at SingleFileZ [1]; it | should remind you of something ;) | | [1] https://github.com/gildas-lormeau/SingleFileZ | toomuchtodo wrote: | Why a zip file instead of a WARC file?
| | https://en.wikipedia.org/wiki/Web_ARChive | walski wrote: | see: https://github.com/c9fe/22120#why-not-warc-or-another-format... | | > Both WARC and MHTML require mutilatious modifications of | the resources so that the resources can be "forced to fit" | the format. At 22120, we believe this is not required | gildas wrote: | Because it's easier to produce and extract. The zip format | also allows creating self-extracting files (I'm referring | to SingleFileZ). I'm not sure this is possible with the | WARC format. | toomuchtodo wrote: | I see you answered this in a thread a year ago [1] (it came | up in a Google search), my apologies. | | [1] https://news.ycombinator.com/item?id=21426056 | kall wrote: | This is a feature of iOS Safari with the read later list. It's | not been particularly reliable for me though. | lxgr wrote: | iOS's implementation is definitely useful, but I was thinking | more along the lines of a permanent archive persistently | stored. | | iOS seems to optimize for temporary offline scenarios; saved | pages do not seem to be backed up or synced to iCloud. | kall wrote: | Yeah. I also assume it deletes the pages after they are | "read", but who knows, there's no insight into the feature. | | The best bookmarking option for archival seems to be the | pinboard.in archive plan. | xtiansimon wrote: | I've been storing my research as text files (manual copy and | paste of web page content) for years. | | And, I've wanted a _search history first_ plugin for web search | to find pages I missed saving, but recall reading. | | Since the former takes time and the latter doesn't exist, I | gather I could buy storage and save my browsing using this tool. | | It would be interesting to see how it works in practice -- saving | so much data. | | Also, for work I'd be interested to know how it works for | password-protected sites like banking, social media, etc. | ryanfox wrote: | I've been working on an app that's pretty much exactly "the | latter"!
[0] | | The amount of disk space it takes up isn't crazy. It has been | _very_ useful for me. | | [0] https://apse.io | hiisukun wrote: | Just chiming in to say that the Firefox location bar has some | great filters [1] that might help you search history first (and | other things). It doesn't do a full text search, but often helps | me in a way I think you're after. | | If you type "^ worms" in the search bar it will search your | history for 'worms' and show the results in the dropdown. | Typing "* worms" will search your bookmarks instead. The rest | of the shortcut symbols are listed on the linked page. Hope | that helps! | | [1] http://kb.mozillazine.org/Location_Bar_search | dksidana wrote: | Reminds me of the days of Webaroo [1] and Google Gears [2] | | [1] https://en.m.wikipedia.org/wiki/Webaroo [2] | https://en.m.wikipedia.org/wiki/Gears_(software) | lxgr wrote: | Ah, that brings back memories. Didn't Palm OS have something | similar? I think it was Plucker [1], but I'm not too sure. | | [1] | reaperducer wrote: | Yep. With Plucker, I could download the New York Times web | site (I think via RSS) before I went to work, sync it to my | Palm Pilot, and then read it on my lunch. | ghostbrainalpha wrote: | Very cool idea. I always bring my laptop with me camping in case | I get the urge to write something. | | Having the ability to see the last week or so of my browsing | history would have come in handy on more than one occasion. | jsilence wrote: | Awesome! I always wanted this and at one point tried to achieve | it with WWWOFFLE, but the welcome proliferation of https thwarted | that attempt. | | Gonna check it out. | | Unfortunately it's only for Chrome. I am very much used to having | my favourite set of Firefox plugins. Will have to check whether I | can replicate that with Chrome. | wolco2 wrote: | Unfortunate state of affairs with Firefox extensions. Niche | extensions do not exist anymore. | | I had to switch to Chrome for extensions.
Finding a Chrome | extension that provides similar functionality to your Firefox | ones should be easy. | phkahler wrote: | >> Unfortunate state of affairs with Firefox extensions. | Niche extensions do not exist anymore. | | This seems like something that could be done in a proxy and | be browser-independent. | codetrotter wrote: | But then your proxy would need to do the TLS termination. | Which is both kinda cumbersome to set up, probably, and also | means you can no longer look at the certificates for | your connections. | silon42 wrote: | I've used http://www.gedanken.org.uk/software/wwwoffle/ a long | time ago (when on modem). | | What I'd like is to cache the history for each page too | (important for news pages). | ppezaris wrote: | Cool concept. In a world that's getting increasingly connected, | what are the main use-cases? | | I ask because the dev tool that our company creates occasionally | (okay, very rarely) gets a question about offline mode, and when | I prod, it's usually just out of curiosity, not because they | actually need it in real life. | lxgr wrote: | This seems to be geared towards "content goes down" scenarios, | rather than "reader is temporarily offline". | | It's a concern I have every time I find a particularly | interesting independently hosted blog post or article. | | The Internet Archive goes a long way towards making me worry | about this less, though. (Let's just hope they don't go away!) | reaperducer wrote: | _In a world that's getting increasingly connected, what are the | main use-cases?_ | | Increasingly -> totally. | | Even though I'm a developer, pre-pandemic I would have to spend | a day or three offline several times a year while working. This | would be useful for that. | | I know an IT guy who works in mines. He loves anything that | works offline.
| jaggirs wrote: | The coolest part, I think, is that you have a copy of all these | websites on disk, which means you can run a full text search on | all the websites you visited (or on their HTML, technically). | | Browsers' history sucks. I don't know if this project does this, | but I would absolutely love to be able to do SQL queries on my | browsing history. | | I have 'lost' many websites I remember visiting, but for which | I didn't remember anything in the title. | | Also, obviously, websites change sometimes, and the web archive | might not have cached the website you visited. Although from | what I can tell, this project doesn't version websites; it just | caches the latest, so you would probably overwrite the | previous version accidentally. | hnguy321 wrote: | Anyone know of something like this that can sit on a network, | possibly as a web proxy? | erulabs wrote: | Squid (http://www.squid-cache.org/) is fairly close to what | you're looking for. ___________________________________________________________________ (page generated 2020-11-11 23:00 UTC)