[HN Gopher] Woob: Web Outside of Browsers
___________________________________________________________________
Woob: Web Outside of Browsers

Author : pcr910303
Score  : 216 points
Date   : 2022-01-14 15:36 UTC (7 hours ago)

(HTM) web link (woob.tech)
(TXT) w3m dump (woob.tech)

| thih9 wrote:
| I've been playing with it, but I keep running into errors. E.g.:
|
| in woob-weather, with the weather.com backend, I've been getting
| "Error(weather): 401 Client Error: Unauthorized";
|
| in woob-gallery, with the imgur backend, when I attempt to download
| an image the module crashes with "FileNotFoundError: [Errno 2] No
| such file or directory: ''"
|
| I like the idea though and I'll keep trying further.
|
| ---
|
| Update: I resolved the image-gallery problem by specifying the
| foldername (so: using "download ID 1 foldername" instead of
| "download ID"). BUT: it looks like I'm unable to download text
| descriptions that sometimes accompany the images.
| vmception wrote:
| Reading the linked site and some of the discussion, I highly
| recommend finding your nearest Chinese friend or person and
| getting authorized on WeChat. It's a whole other parallel
| internet! Kinda similar to how a private set of Facebook pages
| are otherwise inaccessible without an account, except in a
| parallel reality where people use them for all business and have
| no other internet presence.
|
| Yes, as a user another government gets to read your posts, but I
| mean yet another.
|
| To get on, I literally just knocked on a few doors in San
| Francisco and got authorized, so many people here can too. You
| could probably do it at a park.
|
| Note: Hong Kong citizens cannot do it for US citizens, even
| though I tried. Has to be a mainland Chinese person.
| vorpalhex wrote:
| Weibo is a heavily censored and manipulated platform. The CCP
| uses it to track dissidents abroad.
|
| Do not install or operate on any trusted device. Do not connect
| to your home network.
| Do not store personal details on your
| Weibo device. Do not ever send sensitive information or talk to
| real contacts on Weibo.
| vmception wrote:
| As you know, it is not possible to separate corporate China
| from government China due to the structure of that system so
| no analogy exactly works, but our day to day experience is
| exceedingly similar as we rely on heavily censored and
| manipulated private platforms. Although we retain options to
| express ideas, there is not enough saturation of other people
| to view those ideas except to participate on heavily censored
| platforms and risk complete deplatforming. It's an almost
| daily topic here, for example.
|
| So the user experience on a Chinese service is simply not
| different _enough_ for me to treat it differently.
| vorpalhex wrote:
| I'm someone who is heavily critical of Facebook and
| Twitter.
|
| However there is a world of difference between sloppily
| shutting down vaccine or election misinformation, and
| actively censoring a tennis star reporting a sexual assault
| by a politburo member.
|
| There is a world of difference between taking down shit
| talking politicians' Twitter profiles, and actively
| censoring the genocide of an ethnic minority.
|
| Ads that slurp up personal data are bad. Threatening
| political dissidents abroad directly is incomparable.
|
| To say "well they are all the same bad" is to be willfully
| blind to the basic facts.
|
| [1] - https://www.theatlantic.com/technology/archive/2019/03/what-...
|
| [2] - https://www.thecut.com/2021/12/the-disappearance-of-peng-shu...
| vmception wrote:
| > To say "well they are all the same bad" is to be
| willfully blind to the basic facts.
|
| Haha I'm not saying that, I'm saying it has nothing to do
| with my participation in those platforms because I know
| what to expect and my lack of participation changes
| nothing.
|
| My words are that the user experience is not different enough.
|
| For everyone else, check out that great robust example of
| a web outside of browsers.
| nope96 wrote:
| Wouldn't a person need to understand Chinese? Or is WeChat
| English?
| vmception wrote:
| The settings will be in English and there are pages and chat
| rooms you can find. It is mostly Mandarin though. If you are
| in any particular niche you might be able to follow since the
| memes and reactions are familiar. You can also chat with
| other people in English if they know it, some of your friends
| probably already uploaded your contact info there so when you
| make an account you'll be connected with them even if you
| don't share access to your contacts - just like how Facebook
| and most other social apps work.
| brutal_chaos_ wrote:
| This is clever and fantastic. I have been pondering a similar
| concept recently and I think I would like to contribute. I'm
| curious as to why LGPL-3 was chosen as the license, though, not
| that the license is a show stopper.
| Zababa wrote:
| One explanation might be that the project is mostly French, and
| the GPL/LGPL seems more popular in France than in the USA.
| Ajedi32 wrote:
| Wow! I actually love the idea of being able to interact with
| websites via a standard API rather than being forced to use the
| web-based UI they provide. It opens up a whole lot of
| possibilities for things like alternate clients, standard UIs for
| interacting across multiple sites, etc. Also eliminates the
| possibility of sites engaging in annoying or abusive behavior by
| putting users in full control of the client rather than the site
| operator. Obviously it can't work for _every_ site, but it's
| quite the interesting concept.
| rpastuszak wrote:
| In a way you're describing how browsers should work.
|
| > Also eliminates the possibility of sites engaging in annoying
| or abusive behavior by putting users in full control of the
| client rather than the site operator.
| Obviously it can't work
| for every site, but it's quite the interesting concept.
|
| That's the job of the User Agent after all, acting on behalf of
| the user.
| sys_64738 wrote:
| Isn't this just what Electron provides?
| clone1018 wrote:
| I've had two ideas related to this in the past that I've always
| wanted to prototype:
|
| - A social media website without a frontend. We just provide a
| fully exposed API and OAuth, and devs can create their own
| client to interact with the social network. This would give
| devs the freedom to create their own experiences without
| locking users into one specific way of using the social
| network.
|
| - "Cloud" content hosting as a service. You'd be able to build
| your own frontend for interacting with a website / blog, and
| then include our JS code and your site's content will
| automatically be populated in. This would keep the frontend
| clean, simple, and cheap, while offloading posts, comments, and
| other advanced functionality to the service.
|
| Of course both are purely experimental ideas, with no potential
| real world meaning :D
| andrewfromx wrote:
| this sounds perfect for https://www.deso.org/ social network
| code
| rglullis wrote:
| > A social media website without a frontend. We just provide
| a fully exposed API and OAuth,
|
| Take CouchDB and store all activities as ActivityStreams
| documents.
|
| > "Cloud" content hosting as a service.
|
| "Headless CMS" is the term you are looking for, and it is
| already a big industry https://jamstack.org/headless-cms/
| mxuribe wrote:
| For your first idea around social media, while not 100%
| exactly what you cited, the fediverse sort of already
| provides for that...Well, specifically the ActivityPub
| protocol (and a couple of other protocols) enables such
| functionality...and frankly there are numerous (yes, not just
| 1 or 2, but numerous) server implementations which further
| enable numerous desktop and mobile clients to interact with
| content...all federated/sort of decentralized. If you've
| heard of Mastodon, then they tend to capture most of the
| mindshare, but there are many other servers and clients...and
| there are reportedly millions of people on the fediverse
| around the world...so we're sort of already where you would
| like to be. ;-)
|
| I'm sure there are many sites which help provide better
| context for the fediverse, but here, check this one out:
| https://fediverse.party/en/fediverse
|
| Cheers!
| thecakefive wrote:
| For the social media part that existed; see
| https://socialize.dmonn.ch/
|
| It has fallen out of fashion tho.
| smt88 wrote:
| > _I actually love the idea of being able to interact with
| websites via a standard API rather than being forced to use the
| web-based UI they provide._
|
| For a while, people were pitching this as Web 2.0. It's also
| what RSS and podcasts still are.
|
| Unfortunately, most of the websites we visit are
| revenue-generating and want to control their presentation.
| vannevar wrote:
| Right. There are all kinds of great services that could be
| built on top of other people's web sites, but most sites want
| to own the relationship with their customer directly. And
| most of the great ideas for mashup services are predicated on
| the idea that the mashup will get most of the revenue, which
| is not going to fly with the underlying value providers.
| In
| an ad-supported web world, those who own the eyeballs call
| the shots.
| anderspitman wrote:
| It's a tricky problem. I just have to believe there's a way
| we can make the world work without ads at all. They're just
| gross. But I have no idea how that could happen.
| Ajedi32 wrote:
| > most of the websites we visit are revenue-generating and
| want to control their presentation
|
| This project seems designed to work even _without_ the
| cooperation of the sites you're interacting with; all the
| site-specific modules are maintained by the community.
| Adversarial interoperability at its finest.
|
| In extreme cases, one could imagine a module running a full
| headless browser on the back-end, pretending to be a user
| scrolling around and clicking stuff, while presenting the
| actual user with a clean front-end.
| onion2k wrote:
| _I actually love the idea of being able to interact with
| websites via a standard API rather than being forced to use the
| web-based UI they provide._
|
| That's what HTTP is. You're free to write a client that isn't a
| browser that sends and receives the same API messages as any
| HTTP client app does. Most people use browsers, but there are
| also things like iOS and Android apps that consume the same
| APIs as browsers, or Postman that directly communicates with
| the APIs, etc.
|
| The APIs that sit on top of HTTP are even sort of standardized
| in the sense that HTTP verbs mean the same everywhere (in
| theory, but some devs get it wrong.)
|
| The only hard bit that no one has really solved in a nice way
| is how you discover the APIs in the first place. There are things
| like WSDL but it's horrible.
| 0x445442 wrote:
| Believe it or not, around the turn of the century there were many
| thick client apps. But back then it was a challenge to ship and
| update these applications.
| This pain, along with the continued
| rollout of broadband, led many to advocate for creating
| applications that would run in a web browser while being
| controlled on centralized servers. In practice, this meant
| turning the platform that was designed to render markup text into
| an application host. This would allow applications to be shipped
| and updated with little interaction from the user.
|
| However, right about the same time web apps were taking over the
| world there were thick client apps that were solving the problems
| of installation and updates. Two of the prominent thick
| client applications doing this were iTunes and the browsers
| themselves.
|
| Now fast forward a decade to the early teens and the ubiquitous
| use of smart phones. What is the single largest determining
| factor of platform success? Is it the ability for web apps to
| render on your platform's web browser or is it the breadth and
| depth of your platform's app store?
|
| My rant is over, I wish web apps would die. I've wished that for
| most of the 21st century.
| mahastore wrote:
| Good example of another solution without a problem.
| mkdirp wrote:
| Really? You ever tried automating your own data from e.g. your
| bank? Cos I have, and it's a lot more annoying than it should
| be.
|
| Woob does a lot more than just banks. It allows you to get any
| of your data. Adding additional providers is piss easy too.
| matheusmoreira wrote:
| This is so cool. A custom client for websites. Essentially a web
| scraper with a GUI on top. You can define your own user
| experience instead of accepting what they designed for you.
| 0xbadcafebee wrote:
| This is... bizarre. And I like it?
|
| At first I thought this was like an API to integrate web content
| into your own apps. But now it looks more like Groupware, in the
| sense that Woob is actually your user interface and there are
| just modules to consume content from random websites.
|
| It goes back to the old idea where you would have one dedicated
| desktop application for each thing you wanted to do on the
| internet, like read news, send mail, listen to music, view a
| calendar... turning your computer into a utilitarian appliance.
| Rather than a portal for businesses to spend a lot of time and
| money building their own dedicated user interfaces to lock you
| in. The latter has made life more difficult, where we have to
| constantly learn every business's new interface, there's always
| competition between missing features, and the dedicated UI (or
| platform) becomes a way for the business to squeeze more out of
| the user.
|
| And there are no ads. I just realized there's an entire
| generation who have never seen technology without advertisements.
| I wonder what they'd make of this.
| NavinF wrote:
| Re: that last part. We're probably thinking of different
| generations, but I agree. Demographics from a random unreliable
| source:
|
| Age vs ad blocker usage (female, male)
| 16-24    43.2%, 49.2%
| 25-34    43.0%, 47.6%
| 35-44    38.4%, 44.8%
| 45-54    33.5%, 39.1%
| 55-65    32.1%, 37.3%
|
| Those poor boomers. They grew up watching ads on every cable
| television channel and now they watch ads on every YouTube
| video.
| jessaustin wrote:
| Let's not waste sympathy on the boomers...
| CodeGlitch wrote:
| Ageism at its finest.
| kodablah wrote:
| I think a version of this is what the internet needs but using
| headless browsers from the client and with a somewhat-centrally
| curated set of scraper "recipes" if you will. Basically a
| community curated/updated set of scraper logic per site (yes some
| trust is required) that essentially provides JSON data and/or
| APIs based on the site. Even just a neutered HTML equivalent of
| sites (e.g. AMP without the Google and ads stuff) would be good.
|
| Since it is all client side, it can be dubbed a "browser" not a
| "scraper" and one might hope popularity is high enough that
| active blocking of it is blatantly user hostile. Granted one
| hopes that, like EasyList and uBO and others have shown, the
| community can outpace site owners. Not appearing headless
| (tunneling captchas, literal mousemove events in pseudo-random
| human-like ways, etc) should be doable.
|
| It's something I have thought about and once dubbed "recapitate"
| (https://github.com/cretz/software-ideas/issues/82) and plan to
| revisit. I have seen many versions of this attempted. We need to
| encourage shared data extraction tools.
| pjerem wrote:
| Interesting to see Woob here. Most of the modules are for the
| French environment (banks, dating websites, job boards ...). I
| always liked the irreverence of the modules' names and logos
| (which are authentic MS Paint pieces of work).
| reaperducer wrote:
| Unless my brain isn't parsing it right, that dating icon is
| both funny and NSFW.
| soheil wrote:
| aum or happn?
| zeeZ wrote:
| My first guess was this had something to do with legal
| nonsense, and I guess I was right:
|
| > If provided, icons are preferred to be parodic or humorous in
| nature for legal reasons, however there are no restrictions on
| the quality or style of humor.
|
| https://gitlab.com/woob/woob/-/blob/master/CONTRIBUTING.md
| pjerem wrote:
| Yes, but I think they are also just for making fun of some
| brands.
|
| Also, a fun fact: this project changed its name
| recently, it was called Weboob before: https://weboob.org
| userbinator wrote:
| I remember weboob, and all the amusing names of the various
| pieces. Unfortunate that the (French) humour was killed by
| political correctness.
| awrmc wrote:
| Are you suggesting "political correctness" is the only
| reason someone might be discouraged from installing a
| program named after toilet humor and juvenile references
| to specific parts of the anatomy?
| I've let my kids watch
| plenty of children's movies with fart jokes in them, but
| I still don't want to hear more of that when I'm trying
| to use a tool to access a banking service. It seems
| there's even still a banking module with a crude poop
| icon: https://woob.tech/applications/bank.html
|
| Doesn't really inspire confidence in their
| professionalism or trustworthiness with handling
| financial transactions, if you ask me.
| Vosporos wrote:
| I'm French and that humor was really crappy
| littlestymaar wrote:
| It's not "political correctness" which killed the humor,
| in fact the humor was there by accident and at the
| beginning the creator of the project found that funny, so
| it stuck for a while.
|
| > When weboob was started in 2010, 11 years ago, the name
| was chosen, without a hidden agenda, since as a French
| speaker, "boob" wasn't part of my vocabulary.
|
| > Following its release and the ensuing reactions, during
| its first years, the project was complemented with
| various provocative elements (icons, application names,
| English slurs in the code). This was done with the sole
| motive that at that time, it was seen as "fun".
|
| But when the project gained traction he realized that the
| name was probably not appropriate for people building
| business apps with it, which he wanted to support.
|
| > But in practice, it's been years the project isn't
| following this approach anymore, it's used as an
| essential building block of professional companies, the
| provocative elements are progressively removed, and the
| professionnalisation[sic] question is being raised.
|
| Source: Weboob will become woob -
| https://lists.symlink.me/pipermail/weboob/2021-February/0016...
| throwaway744678 wrote:
| The 6th contributor's (nick)name can be translated to
| something like "Fuckthewhores", with a play on words with
| Belzebuth
| Hackbraten wrote:
| Good on them. I really disliked the old name.
| smm11 wrote:
| France just can't shake the Minitel, that's for sure.
| beders wrote:
| was thinking the same thing ;)
| res0nat0r wrote:
| Also: woob - 1994 is one of the best ambient albums ever made.
|
| https://www.youtube.com/watch?v=0S3owK3pN64
| jokethrowaway wrote:
| This is great. In my ideal world all the web should be like this.
| togaen wrote:
| but... why
| btrettel wrote:
| A while back I heard about Z39.50 [0], a protocol that libraries
| use for their catalogs. In the 90s it seems there were native
| clients for the protocol so that one could interact with the
| library catalog without using a web interface. A lot of the
| current web interfaces are terribly slow JS monstrosities now so
| I'd like to try something faster.
|
| I never did figure out if any of the GUI clients [1] are still
| actively developed and I'd appreciate it if anyone who knows about
| this could point me towards a good client.
|
| [0] https://en.wikipedia.org/wiki/Z39.50
|
| [1] Some software listed here:
| http://www.loc.gov/z3950/agency/resources/software.html
| librarianscott wrote:
| The most well-known Z39.50-using software that still works is a
| paid citation manager called EndNote (the desktop client not
| the web client). You can ingest PDFs with that software too. I
| don't think the open-source Zotero has added that feature yet.
| Z39.50 is clunky, and our old LMS (library management software,
| used behind the scenes) can pull in MARC records for our staff
| when we order books, but I don't even want to think about the
| security.
| kadomony wrote:
| Woobs out
| anderspitman wrote:
| Something I've been thinking about lately is how browsers have
| essentially become a dependency for any sort of auth on the
| internet. Pretty much everything uses OAuth2, which requires you
| to be able to render HTML and CSS, and in many implementations
| JavaScript.
|
| That's ~20M (Firefox) to ~30M (Chromium) lines of code as a
| dependency for your application, just for auth.
| This applies even
| if you have a slick CLI app like rclone. If you want to connect
| it to Google Drive you still need a browser to do the OAuth2
| flow. All of this just so we have a safe, known location to stash
| auth cookies.
|
| It would be sweet if there was a lightweight protocol where you
| could lay out a basic consent UI (maybe with a simple JSON
| format) that can be rendered outside the browser. Then you need a
| way to connect to a central trusted cookie store. You could still
| redirect to a separate app, but it wouldn't need to be nearly as
| complicated as a browser.
| notorandit wrote:
| I do exactly the other way around on my smartphone: if there is a
| web app I won't install the app.
| pavlov wrote:
| It's like a peek at an alternative Internet where Gopher won
| instead of WWW.
| mxuribe wrote:
| There has been a desire to go back to older days where text was
| more prevalent, and so Gemini has begun and is gaining
| popularity - though I'm sure very slowly. See:
| https://gemini.circumlunar.space/
|
| Also, separately (though I would not be surprised if it is
| frequented by the same/similar folks who have interests in
| Gemini), there is also the tildeverse...again, more text-heavy
| environments. See: https://tildeverse.org/
|
| And, as I have stated in another comment there is the fediverse
| (e.g. Mastodon, Pleroma, etc.), so the ability to leverage APIs
| to interact with other folks and their content without
| explicitly needing a typical web browser exists, and
| flourishes.
|
| I'll end by stating that there are a few exciting things - like
| the above items I mention as well as this neat Woob platform -
| which to me seem very fun, a little new, and yet at the same
| time in some ways nostalgic...maybe they won't make the morning
| news, and likely only attract geeks, but it is all still
| exciting - at least for me!
| Qub3d wrote:
| The tildeverse looks pretty cool, but why does it redirect me
| to a Rick Roll on Firefox?
|
| I can see the normal site just fine on Lynx browser (maybe
| that's the point?)
|
| Edit: ah, I see, they're doing a JWZ and redirecting based on
| referral, but going a step further and setting a cookie.
| Cute, but also terribly immature.
| mxuribe wrote:
| Oops, sorry about that. I never had an issue before...but
| funny, after visiting from HN, I see exactly what you mean.
| (Clearing the cookies avoids the classic rick roll video).
| Anyway, yeah I guess the tilde folks are "characters". But,
| separate from that, the community I've interacted with is
| quite fun, respectful, and good-natured. I also failed to
| mention a sort of equivalent to HN, which I frequent (with
| similar topics to HN but often a nicer crowd):
| https://tildes.net/
|
| ...Tilde.net used to be invite-only, not open to
| anyone creating an account...so if you're interested - and
| it's still not open to the public - I can arrange to send you
| an invite.
| louissan wrote:
| woob woob woob! Battletech pulse lasers anyone?
| diogenesjunior wrote:
| Small discussion earlier:
| https://news.ycombinator.com/item?id=29935634
| zepto wrote:
| Smaller as in essentially zero, and with a link back to this
| one.
|
| An amusing detour.
| dylan-m wrote:
| I like the idea of this. There's so much _information_ on the
| web, but we still need a way to bring that information to other
| applications, without being tied to a particular source. That was
| really the dream of the semantic web, after all.
|
| This kind of idea would be really nicely paired with good
| Microformats[1] support, which continues to be a very good idea.
| That way we can find, say, a recipe or an address on a web page
| in a reusable way and without needing magical heuristics.
|
| (Of course, "reusable" in theory, with the caveat that everybody
| forgot about microformats around when Google decided they could
| machine learn their way out of everything).
|
| [1] http://microformats.org
| MaxBarraclough wrote:
| > I like the idea of this.
| There's so much information on the
| web, but we still need a way to bring that information to other
| applications, without being tied to a particular source.
|
| I'm not sure I'm interpreting you correctly here, but I think
| I'm on the other side of this. The problem is that many modern
| websites are godawful. I think the story pretty much ends
| there. If websites were not awful, we wouldn't find ourselves
| appalled by the idea of just embedding a browser.
|
| Modern web browsers feature a 'reader mode' as a countermeasure
| to the fact that much modern web design is significantly worse
| than having no web design at all.
|
| If you're serious about a 'lightweight' alternative to the
| lumbering horror-show of the modern web, the way forward is
| either Gemini [0], or a formalised simple subset of HTML. [1]
|
| > That way we can find, say, a recipe or an address on a web
| page in a reusable way and without needing magical heuristics.
|
| I think the _find_ and _reusable_ aspects here are really two
| very different problems.
|
| The _reusable_ part is easy. HTML is already reusable. A
| standardised simple subset would be even more so. [1]
|
| The _find_ part is trickier. Discovering decent content is
| harder, as there's an arms race of ad-funded spammers trying
| to out-compete legitimate recipe sites in search-engine
| rankings. (There's also the possibility of search engines not
| being motivated to work on delivering good search results. [2])
|
| The idea of having a choice between native GUI applications and
| web apps has been with us for some time. Email is probably the
| best example, we've long had the choice between webmail and
| native email clients. Beyond webmail, these days even Microsoft
| Word has a web-based version. There are of course both
| advantages and disadvantages to web-based applications.
|
| [0] https://news.ycombinator.com/item?id=23730408
|
| [1] https://news.ycombinator.com/item?id=29291392
|
| [2] https://news.ycombinator.com/item?id=29772136
| Karrot_Kream wrote:
| How do you know someone uses Gemini? They'll tell you the
| moment they can! Like the vegans of the web...
| jka wrote:
| Also entirely possible that I'm misunderstanding both of you,
| but I think what the parent comment was imagining was
| something like common schemas (like schema.org[1]?) for
| content that is currently around the web and encased in the
| challenging web design you mention.
|
| With common (and evolving) formats -- and incentives for
| publishers to provide their information within those formats
| -- we could then have much simpler, more streamlined tools to
| use and remix that data in application-specific ways.
|
| [1] - https://schema.org/docs/full.html
| anderspitman wrote:
| > Discovering decent content is harder, as there's an arms
| race of ad-funded spammers trying to out-compete legitimate
| recipe sites in search-engine rankings
|
| I wonder if we even need the search engines? I think a lot of
| the things we've come to rely on them for could easily be
| handled in other ways. Recipes for example. You don't really
| want the best recipe _page_ for a given dish. You want a good
| quality recipe _site_ that has a recipe for that dish. The
| quality of recipe sites ebbs and flows as they sell out and
| incentives change, but generally you would probably only need
| to be aware of the top 2-3 sites. This is exactly the type of
| information that is easily stored as "tribal knowledge" on a
| subreddit, forum sticky, community wiki, or even blasting out
| to your Facebook friends "hey what's everyone's favorite
| recipe site?"
| dillondoyle wrote:
| Funnily, Google has pushed websites to add more structured data
| into their HTML for crawling...
|
| Seems it's used for SEO hacking.
| For instance, on recipes, why
| wouldn't a site give their recipe a super high rating? Those
| sites are awful SEO spam adservers basically.
|
| But business info that can be used to map seems pretty valuable
| to Google.
|
| https://developers.google.com/search/docs/advanced/structure...
| hombre_fatal wrote:
| One problem is that it takes a lot of work and effort to build
| any of the valuable hubs where people post information.
|
| Ever try to start a forum? It's a monumental task with no
| guarantee of success. You may even need to employ people to
| grow and maintain one.
|
| And once you've finally grown one of these hubs that
| accumulates recipes, lyrics, real estate listings, classifieds,
| etc. (whatever you had in mind) there's no incentive to make it
| as easy as possible to share it with the world. Once you get
| over "ugh, everyone just wants to make a buck", there's the
| fact that it wasn't free to build and maintain the platform to
| begin with. And perhaps the only incentive to build the
| platform was the idea that people would pay for the value.
|
| Or, who is supposed to do the work of curating and organizing
| all of this information and then producing an API so that
| others can build on it, and why haven't they started? There are
| probably some inconvenient truths in the answer beyond
| cynicism.
| anderspitman wrote:
| Ha, I'm trying to start a data ownership forum now. My
| approach has been to have it be a central place for support
| for all my open source projects. We'll see how that works
| out.
| matthewaveryusa wrote:
| This has Bloomberg terminal / Minitel vibes. I think there's
| definitely a space for an alternative browser that can render
| GUIs with visually consistent widgets.
___________________________________________________________________
(page generated 2022-01-14 23:00 UTC)