[HN Gopher] The semantic web is dead - Long live the semantic web
       ___________________________________________________________________
        
       The semantic web is dead - Long live the semantic web
        
       Author : LukeEF
       Score  : 159 points
       Date   : 2022-08-10 14:42 UTC (8 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | bawolff wrote:
       | Funnily enough, the "why semantic web is good" section is the
       | section that actually identifies why it failed.
       | 
       | We are going to have an ultra flexible data model that everyone
       | can just participate in?
       | 
       | That never works. Protocols work by restricting possibilities not
       | allowing everything. The more possibilities you allow, the more
       | room for subtle incompatibilities and the more effort you have to
       | spend massaging everything into compatibility.
        
         | ggleason wrote:
         | That's discussed in the article though. The open world
         | assumption is untenable. Having shareable interoperable
         | schemata that can refer to each-other safely would be a god
         | send however. And that's what is currently very hard but
         | needn't be.
        
           | leoxv wrote:
           | What is "unsafe, untenable or hard" about embedding some
           | JSON-LD (which is just some JSON metadata, transformed using
           | a small JS library), like I did here:
           | https://twitter.com/conzept__/status/1552719001826074625
           | 
           | Whether you trust the URIs or the data that was placed there
           | is not a problem for the semantic web. The fact that you
           | _can_ state these things and relate to other resources and
           | concepts on the web is already wonderful and useful in
           | itself. Google is reading this metadata and relating it to
           | their trust/ranking-graph. The semantic web 'community' could
           | do the same later also, in a more decentralized way
           | (blockchain web IDs perhaps?). For now it all works fine.
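
Editor's sketch of the embedding described above, in Python rather than the JS library the comment mentions. It only shows the shape of the technique: a schema.org description serialized as JSON-LD and wrapped in a script element. The function name, the example page URL and the choice of the `Thing` type are assumptions made for illustration, not taken from the linked tweet.

```python
import json


def jsonld_script_tag(name: str, url: str, same_as: list) -> str:
    """Return a <script> element carrying schema.org metadata as JSON-LD."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Thing",
        "name": name,
        "url": url,
        "sameAs": same_as,
    }
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


tag = jsonld_script_tag(
    name="Semantic Web",
    url="https://example.org/semantic-web",  # hypothetical page URL
    same_as=["https://en.wikipedia.org/wiki/Semantic_Web"],
)
print(tag)
```

Whether a consumer trusts the `sameAs` link is, as the comment says, a separate question from being able to state it.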
        
             | convolvatron wrote:
             | people should use something like json-schema to publish
             | their structure. this doesn't solve the root denotation
             | problem, but it would help a lot.
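
Editor's sketch of what publishing structure could look like. The schema below is a hypothetical JSON Schema document, and `check` is a hand-rolled validator that understands only `type`, `required` and `properties`. It is not the `jsonschema` library and covers a small fraction of the specification; it is here only to show that a published schema lets a consumer reject malformed data mechanically.

```python
# Hypothetical schema a publisher might serve alongside their data.
RECIPE_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "required": ["name", "servings"],
    "properties": {
        "name": {"type": "string"},
        "servings": {"type": "number"},
    },
}

_PY_TYPES = {"object": dict, "string": str, "number": (int, float)}


def _has_type(instance, type_name: str) -> bool:
    if type_name == "number" and isinstance(instance, bool):
        return False  # bool subclasses int in Python but is not a JSON number
    return isinstance(instance, _PY_TYPES[type_name])


def check(instance, schema: dict) -> list:
    """Check only `type`, `required` and `properties`; return error messages."""
    expected = schema.get("type")
    if expected is not None and not _has_type(instance, expected):
        return [f"expected {expected}, got {type(instance).__name__}"]
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required property {key!r}")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance:
                errors += [f"{key}: {e}" for e in check(instance[key], sub_schema)]
    return errors


print(check({"name": "Soda bread"}, RECIPE_SCHEMA))
```

As the comment notes, this says nothing about what `name` denotes; it only pins down the shape.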
        
       | ramoz wrote:
       | The future of web standards will be structured in neural network
       | high dimensional spaces. Accessibility to that future web will be
       | built in models that exist across a decentralized environment
       | similar to blockchain/smart-contract architectures.
        
       | rch wrote:
       | JSON-LD has some traction, but the author seems to prefer a
       | slightly different syntax.
       | 
       | I don't see a material difference, but I'm curious to know what
       | others think.
       | 
       | -- https://w3c.github.io/json-ld-bp/#contexts
       | 
       | -- https://w3c.github.io/json-ld-bp/#example-example-typed-
       | rela...
       | 
       | -- https://terminusdb.com/docs/index/terminusx-db/reference-
       | gui...
        
         | ggleason wrote:
         | Well, in one sense they are directly interconvertible. The
         | documents in TerminusDB are elaborated to JSON-LD internally
         | during type-checking and inference.
         | 
         | However, it's not just a question of whether one can be made
         | into another. The use of contexts is very cumbersome, since you
         | need to specify different contexts at different properties for
         | different types. It makes far more sense to simply have a
         | schema and perform the elaboration from there. Plus without an
         | infrastructure for keys, IDs become extremely cumbersome. So
         | beyond just type decorations on the leaves, it's the difference
         | between:
         | 
         |     {
         |       "general_variables": {
         |         "alternative_name": ["Sadozai Kingdom", "Last Afghan Empire" ],
         |         "language":"latin"
         |       },
         |       "name":"AfDurrn",
         |       "social_complexity_variables": {
         |         "hierarchical_complexity": {"admin_levels":"five"},
         |         "information": {"articles":"present"}
         |       },
         |       "warfare_variables": {
         |         "military_technologies": {
         |           "atlatl":"present",
         |           "battle_axes":"present",
         |           "breastplates":"present"
         |         }
         |       }
         |     }
         | 
         | And
         | 
         |     {
         |       "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11",
         |       "@type":"Polity",
         |       "general_variables": {
         |         "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/general_variables/GeneralVariables/e4360ee3766c2863f06a34ffcdd9869d41b03d04c6f6af5f94b0a14a47e8e704",
         |         "@type":"GeneralVariables",
         |         "alternative_name": ["Last Afghan Empire", "Sadozai Kingdom" ],
         |         "language":"latin"
         |       },
         |       "name":"AfDurrn",
         |       "social_complexity_variables": {
         |         "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/SocialComplexityVariables/191353c4b7138842ec4029dd07fbd63c9dda752f0cd72b1584f046a274cf024c",
         |         "@type":"SocialComplexityVariables",
         |         "hierarchical_complexity": {
         |           "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/SocialComplexityVariables/191353c4b7138842ec4029dd07fbd63c9dda752f0cd72b1584f046a274cf024c/hierarchical_complexity/HierarchicalComplexity/d6a772c5c6919cc511a24ab89f908032aa32b1e3e939d2e0c32044b3a5d9151d",
         |           "@type":"HierarchicalComplexity",
         |           "admin_levels":"five"
         |         },
         |         "information": {
         |           "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/social_complexity_variables/SocialComplexityVariables/191353c4b7138842ec4029dd07fbd63c9dda752f0cd72b1584f046a274cf024c/information/Information/2f557c1016552f30b8d8bb1bdd9a8584791dd06d32f25bded86a7eb59788ea7f",
         |           "@type":"Information",
         |           "articles":"present"
         |         }
         |       },
         |       "warfare_variables": {
         |         "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/warfare_variables/WarfareVariables/704a2c1854a2fe80616fbea0ef0dcd6ce47f5174529ca191617e42397108c437",
         |         "@type":"WarfareVariables",
         |         "military_technologies": {
         |           "@id":"Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/warfare_variables/Polity/7286b191f5f62a05290b8961fd8836a26ddc8399611b216fae4aaacc58ba6c11/warfare_variables/WarfareVariables/704a2c1854a2fe80616fbea0ef0dcd6ce47f5174529ca191617e42397108c437/military_technologies/MilitaryTechnologies/80a91b3e5381154387bde4afc66fdd38834de16c671c49c769f5244475cbbb1b",
         |           "@type":"MilitaryTechnologies",
         |           "atlatl":"present",
         |           "battle_axes":"present",
         |           "breastplates":"present"
         |         }
         |       }
         |     }
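
Editor's sketch of the "have a schema and perform the elaboration from there" idea: a plain document is decorated with `@id` and `@type` by walking a schema instead of attaching contexts by hand. The `SCHEMA` layout, the `elaborate` function and the SHA-256 content hash used for ids are all assumptions made for illustration. This is not TerminusDB's actual schema language or key scheme.

```python
import hashlib
import json

# Hypothetical schema: for each class, which properties hold subdocuments
# and which class those subdocuments belong to.
SCHEMA = {
    "Polity": {
        "general_variables": "GeneralVariables",
        "warfare_variables": "WarfareVariables",
    },
    "GeneralVariables": {},
    "WarfareVariables": {"military_technologies": "MilitaryTechnologies"},
    "MilitaryTechnologies": {},
}


def elaborate(doc: dict, cls: str, prefix: str = "") -> dict:
    """Return a copy of `doc` decorated with @id and @type, driven by SCHEMA."""
    # Assumption: an id is the SHA-256 of the canonical JSON of the plain value.
    digest = hashlib.sha256(
        json.dumps(doc, sort_keys=True).encode("utf-8")
    ).hexdigest()
    doc_id = f"{prefix}{cls}/{digest}"
    out = {"@id": doc_id, "@type": cls}
    for key, value in doc.items():
        sub_cls = SCHEMA[cls].get(key)
        if sub_cls is not None and isinstance(value, dict):
            # Subdocument ids are nested under the parent id and property name.
            out[key] = elaborate(value, sub_cls, prefix=f"{doc_id}/{key}/")
        else:
            out[key] = value
    return out


plain = {
    "name": "AfDurrn",
    "general_variables": {"language": "latin"},
    "warfare_variables": {"military_technologies": {"atlatl": "present"}},
}
elaborated = elaborate(plain, "Polity")
print(json.dumps(elaborated, indent=2))
```

The author of the data writes only `plain`; everything prefixed with `@` is derived.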
        
           | rch wrote:
           | > contexts at different properties for different types
           | 
           | It seems like I could use syntax from HOCON to achieve this
           | in a less verbose way, perhaps with minor changes to the
           | parser.
           | 
           | > have a schema and perform the elaboration from there
           | 
           | I like your schema approach. I'll have to experiment a bit.
        
       | fleddr wrote:
       | You can debate syntax forever but the semantic web will never
       | rise without the proper incentives. Not only is there no
       | incentive for industry to participate in it, there's in fact an
       | anti-incentive to do so.
       | 
       | Say you've built a weather app/website. Being a good citizen, you
       | publish "weatherevent" objects. Now anybody can consume this
       | feed, remix it, aggregate, run some AI on it, new visualizations,
       | whichever. A great thing for the world.
       | 
       | That's not how the world works. Your app is now obsolete.
       | Anybody, typically somebody with more resources than you, will
       | simply take that data and out-compete you, in ways fair or unfair
       | (gaming ranking). You may conclude that this is good at the macro
       | level, but surely the app owner disagrees on the micro level.
       | 
       | Say you're one of those foodies, writing recipes online with the
       | typical irrelevant life story attached. The reason they do this
       | is to gain relevance in Google (which is easily misled by lots of
       | fluffy text), which creates traffic, which monetizes the ads.
       | 
       | Asking these foodies instead to write semantic recipe objects
       | destroys the entire model. Somebody will build an app to scrape
       | the recipes and that seals the fate of the foodie. No
       | monetization therefore they'll stop producing the data.
       | 
       | In commercial settings, the idea that data has zero value and is
       | therefore to be freely and openly shared is incredibly naive. You
       | can't expect any entity to actively work against their own self-
       | interest, even less so when it's existential.
       | 
       | As the author describes, even in the academic world, supposedly
       | free of commercial pressure, there's no incentive or even an
       | anti-incentive. People would rather publish lots of papers. Doing
       | things properly means fewer papers, which is effectively a punishment.
       | 
       | Like I said, incentives. The incentive for contributing to the
       | semantic web is far below zero.
        
         | marviel wrote:
         | As my Reinforcement Learning professor said: "It's all about
         | incentives, people"
         | 
         | This is the kind of idea that begs me to reconsider crypto as a
         | possible real-world-problem-solving-tool. But I've yet to see
         | an example of crypto working in a way that feels like it'll
         | take off for anything other than (1) another form of "stock" at
         | best, or (2) a grift at worst. I suppose we're in the market
         | for another solution.
         | 
         | To use a Machine Learning analogy, there's the "Credit
         | Assignment Problem," which is basically the same thing:
         | https://www.lesswrong.com/posts/Ajcq9xWi2fmgn8RBJ/the-credit...
        
           | fleddr wrote:
           | I think the fundamental issue in the digital world is that
           | you compete with the entire damn world.
           | 
           | When I open a bakery, competition is limited to just a few
           | miles of space. Provided I provide a decent product, I can
           | exist. This idea allows for millions of independent bakeries
           | to exist around the world, which is awesome. It provides
           | great diversity in products, genuine creativity, cultural
           | differentiation, meaningful local employment.
           | 
           | When you need to compete with the entire world, it's a
           | different game altogether. Everything you do digitally can
           | fairly easily be replicated at low cost. This creates an
           | unstoppable force of centralization fueled by capital but
           | also consumer preference: they would rather have one service that
           | has it all.
           | 
           | So even if you found a way to pay for data use (via crypto or
           | not) all power will continue to flow to a dominant party.
        
         | [deleted]
        
       | wyc wrote:
       | We're trying to make semantic web models easier to use with a
       | project called TreeLDR...I think usability has been one of the
       | biggest issues of this ecosystem and OSS in general. Think
       | programmer-friendly data structure definitions that compile to
       | JSON-LD contexts, jsonschemas, and beyond.
       | 
       | https://github.com/spruceid/treeldr
       | 
       | Shameless plug: we're hiring if you like this kind of stuff and
       | Rust.
        
       | pphysch wrote:
       | I think there is a lot of fussing about technical solutions to
       | what is ultimately a cultural problem.
       | 
       | Suppose we had the perfect technology to define ontologies over
       | real data.
       | 
       | This doesn't address the fact that Anglo-American culture is
       | hostile to alternative ontologies. The idea of "one Truth" is
       | baked into the national consciousness, from classical Western
       | religion+philosophy to the liberal-democratic Constitution to
       | Wikipedia and the current Fact-Checking(tm) Brought To You By
       | Lockheed-Martin(tm) news-media regime.
       | 
       | With this worldview, there is no reason to invest in designing or
       | implementing Semantic Web technologies. It's like building a
       | monument to a god that you don't believe exists. Waste of time.
       | 
       | To be clear, I spend a lot of time thinking about the technical
       | side too and implementing enterprise solutions. I just think it's
       | naive to frame it as primarily a technical problem when it comes
       | to wider public deployment.
        
       | lmeyerov wrote:
       | Very cool topic... and not the article I was expecting!
       | 
       | I actively work with teams making sense of their massive global
       | supply chains, manufacturing process, sprawling IT/IOT infra
       | behavior, etc., and I personally bailed from RDF to bayesian
       | models ~15 years ago... so I'm coming from a pretty different
       | perspective:
       | 
       | * The killer apps for the semantic web were historically
       | paired with painfully manual taxonomization efforts. In industry,
       | that's made RDF and friends useful... but mostly in specific
       | niches like the above, and coming alongside pricey ontology
       | experts. That's why I initially bailed years ago: outside of
       | these important but niche domains, google search is way more
       | automatic, general, and easy to use!
       | 
       | * Except now the tables have turned: Knowledge graphs for
       | grounding AI. We're seeing a lot of projects where the idea is
       | transformer/gnn/... <> knowledge graph. The publicly visible camp
       | is folks sitting on curated systems like wikidata and osm, which
       | have a nice back-and-forth. IMO the bigger iceberg is from AI
       | tools getting easier colliding with companies having massive
       | internal curated knowledge bases. I've been seeing them go the
       | knowledge graph <> AI route for areas like chemicals,
       | people/companies/locations, equipment, ... . It's not easy to get
       | teams to talk about it, but this stuff is going on all the way
       | from big tech co's (Google, Uber, ...) to otherwise stodgy
       | megacorps (chemicals, manufacturing, ..).
       | 
       | We're more on the viz (JS, GPU) + ai (GNN) side of these
       | projects, and for use cases like the above + cyber/fraud/misinfo.
       | If into it, definitely hiring, it's an important time for these
       | problems.
        
         | strangattractor wrote:
         | Generally agree. There is a lot of discussion concerning the
         | technical difficulties, RDF flaws and road blocks, but little
         | acknowledgement of other non-technical impracticalities. Making
         | something technically feasible does not ensure adoption. Changing a
         | bunch of code over time will always be preferable to redefining
         | ontologies and reprocessing the data.
        
       | gibsonf1 wrote:
       | The semantic web has been reintroduced as part of "Solid" by Tim
       | Berners-Lee (and Inrupt) and is growing very fast:
       | https://solidproject.org/
       | 
       | The opposite of dead in fact.
        
       | boilerupnc wrote:
       | For a year and a half, I worked on a project called OSLC: Open
       | Services for Lifecycle Collaboration [0] which became an Oasis
       | Open Project. It's an open community building practical
       | specifications for integrating software. For software tools that
       | adopt and provide OSLC enabled APIs, data integration and
       | supported use cases become really easy.
       | 
       | As an example, if your department prefers Tool A for defining
       | requirements (Aha, etc ...), Tool B for change management
       | (bugzilla, etc ...) and Tool C for test management and they
       | aren't already a unified platform, it can be hard to gain
       | semantic context across them. I've seen many situations where dev
       | teams prefer a specific FOSS/vendor change management tracking
       | tool while testers prefer a different thing and are unwilling to
       | change because of historical test automation investment. To
       | illustrate, imagine I run a test and it fails. I want to open a
       | bug and have it linked to this failing test and also associate it
       | with an existing requirement. If all 3 tools are OSLC API enabled
       | consumers/producers, then their data can be integrated together
       | trivially and experiences can be far more seamless and pleasant
       | to all involved (e.g. testers can have popups to query
       | (find/select reqmnts) or delegated creates (open new bug))
       | without leaving their own familiar test tool's UI. Nice. Anything
       | can have an OSLC enabled API adapter from existing servers to
       | spreadsheets (with an associated proxy server). It has great
       | promise in bringing FOSS/vendor tooling together.
       | 
       | In a nutshell, it's a set of standards around building a digital
       | thread for tools to integrate together. Workstreams are focused
       | per domain (quality management, change management, requirements
       | management, etc ...) [1]. Linked Data and RDF are its core tech
       | underpinning [2]
       | 
       | [0] https://open-services.net/
       | 
       | [1] https://open-services.net/specifications/#active-
       | publication...
       | 
       | [2] https://oslc.github.io/developing-oslc-
       | applications/technica...
        
       | iamwil wrote:
       | On our podcast, The Technium, we covered Semantic Web as a retro-
       | future episode [0]. It was a neat trip back to the early 2000s.
       | It wasn't a bad idea, per se, but it depended on humans
       | doing-the-right-thing for markup and the assumption that classifying
       | things is easy. Turns out neither is true. In addition, the
       | complexity of the spec really didn't help those that wanted to
       | adopt its practices. However, there are bits and pieces of good
       | ideas in there, and some of it lives on in the web today. Just
       | have to dig a little to see them. Metadata on websites for
       | fb/twitter/google cards, RDF triples for database storage in
       | Datomic, and knowledge base powered searches all come to mind.
       | 
       | [0] https://youtu.be/bjn5jSemPws
        
         | lolive wrote:
         | I was hired by a BIG company to help their data governance, and
         | a pragmatic semantic web is giving pretty interesting results.
         | Just to add some hotness/trollness to the discussion, Neo4J was
         | a mind opener for many people [both technical and non-
         | technical]
        
       | low_tech_punk wrote:
       | The entire movement felt like a massive tragedy of the commons.
       | There is just no incentive for any single player to push the
       | standard forward and the commercial players are already reaping
       | enough benefits from Web 2.0 that putting more money in Semantic
       | Web makes no sense.
       | 
       | Semantic Web was supposed to be the Web 3.0. It's so dead now
       | that even its name is stolen by the blockchain. RIP.
        
         | [deleted]
        
         | [deleted]
        
       | mxmilkiib wrote:
       | LV2 audio plugins use RDF/Turtle;
       | 
       | https://github.com/lv2/lv2
       | 
       |     curl -H "Accept: text/turtle,application/rdf+xml" http://lv2plug.in/ns/ext/lv2core
       |     curl -H "Accept: text/turtle,application/rdf+xml" http://lv2plug.in/ns/ext/atom
       | 
       | Some hosts also use it for saving audio graphs;
       | 
       | https://drobilla.net/software/ingen.html
       | http://drobilla.net/ns/ingen.html
       | 
       | https://github.com/moddevices/mod-factory-user-data/tree/mas...
       | https://pedalboards.moddevices.com/
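
Editor's sketch of the content negotiation the curl commands above rely on, using only the Python standard library. The helper names `turtle_request` and `fetch` are made up for illustration. `fetch` is defined but not called here, since what the server returns depends on the server, and nothing below assumes a particular response.

```python
import urllib.request


def turtle_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for Turtle first, then RDF/XML."""
    return urllib.request.Request(
        url, headers={"Accept": "text/turtle,application/rdf+xml"}
    )


def fetch(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request and return (content type, raw body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", ""), response.read()


# Same namespace URL as in the first curl command above.
req = turtle_request("http://lv2plug.in/ns/ext/lv2core")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

The same URL identifies the vocabulary in plugin data and, with the right `Accept` header, is meant to resolve to its machine-readable definition.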
        
       | de6u99er wrote:
       | While I love the semantic web I see two major issues with it:
       | 
       | 1. Standardization with regard to (globally) unique identifiers
       | and ontologies. Most things in the semantic web have multiple
       | identifiers and, based on personal preferences, attributes linked
       | to different ontologies. There are several projects that try to
       | gather data for the same thing from various ontologies, but
       | sometimes the same attributes have differing values because of
       | conversions or simply extracting data points from different
       | publications where different methods have been used to measure
       | stuff.
       | 
       | 2. Performance of large datasets gets really bad since
       | distributing graphs is still a problem that lacks good solutions.
       | One of the solutions is to store data in distributed column
       | stores. But there's still a ton of unsolved graph traversal
       | performance issues.
       | 
       | I strongly believe that the technological barriers need to be
       | solved first. Until then there will always be the person in
       | meetings, asking why not use relational or NoSQL tech because of
       | performance...
        
         | leoxv wrote:
         | Many of the biggest companies in the world are using semweb tech:
         | http://sparql.club
         | 
         | Open linked-data has been growing very fast over the last few
         | years. Many governments are now demanding LD from their
         | executive/subsidized organizations. These data stores are then
         | made accessible using REST and/or SPARQL.
        
       | terminatornet wrote:
       | blank is dead, long live blank
        
       | jxramos wrote:
       | That github was created 2 days ago, wasn't this article discussed
       | elsewhere someplace? It looks very recognizable. Was it on a blog
       | or something and just made a new home in github or was it some
       | other similar article I may be thinking about.
        
         | ggleason wrote:
         | I wrote it from scratch 2 days ago.
        
       | hosh wrote:
       | This is a really fascinating analysis. I have wondered why the
       | semantic web never took off, and I am finding myself interested
       | in being able to create data sources in a federated way. The
       | author's mention of Data Mesh and his own project, TerminusDB
       | looks like what I had been looking for, for a side project.
       | 
       | One adjacent project I did not see mentioned is XMPP. The
       | extensibility of XMPP comes from being able to refer to schemas
       | within stanzas of the payload. It's also an interesting case
       | study on an ecosystem built from a decentralized, extensible
       | protocol. One of the burdens plaguing the XMPP ecosystem is spam,
       | and I wonder to what extent we might see that if the semantic web
       | revives again.
        
       | Krisjohn wrote:
       | Sigh
       | 
       | When the phrase "The King is dead, long live the King" is used,
       | the two kings are different people; the one that just passed and
       | the one that replaced him. If the King is replaced by a Queen
       | then the phrase is "The King is dead, long live the Queen". This
       | is not some life after death thing. You aren't saying the King
       | will live on in the hearts and minds of the people, you're
       | stating your support for the successor.
        
         | lolive wrote:
         | The new king of the Semantic Web is obviously Neo4J.
        
       | leoxv wrote:
       | I'm building a front end app for Wikipedia & Wikidata called
       | Conzept encyclopedia (https://conze.pt) based on semantic web
       | pillars (SPARQL, URIs, various ontologies, etc.) and loving it so
       | far.
       | 
       | The semantic web is not dead, it's just slowly evolving and
       | growing. Last week I implemented JSON-LD (RDF embedded in HTML
       | with a schema.org ontology), super easy and now any HTTP client
       | can comprehend what any page is about automatically.
       | 
       | See https://twitter.com/conzept__ for many examples what Conzept
       | can already do. You won't see many other apps do these things,
       | and certainly not in a non-semantic-web way!
       | 
       | The future of the semantic web is in: much more open data, good
       | schemas and ontologies for various domains, better web extensions
       | understanding JSON-LD, more SPARQL-enabled tools, better and more
       | lightweight/accessible NLP/AI/vector compute (preferably embedded
       | in the client also), dynamic computing using category theory
       | foundations (highly interactive and dynamic code paths, let the
       | computer write logic for you), ...
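
Editor's sketch of the consuming side of JSON-LD embedded in HTML: pulling the metadata back out of a page with Python's standard `html.parser`. The sample page and the `JsonLdExtractor` class are hypothetical. A real client would also need to handle malformed JSON and resolve the `@context`.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.documents = []
        self._buffer = None  # text chunks while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.documents.append(json.loads("".join(self._buffer)))
            self._buffer = None


# Hypothetical page with one embedded schema.org description.
html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "name": "Example page"}
</script>
</head><body><p>Human-readable text.</p></body></html>"""

parser = JsonLdExtractor()
parser.feed(html)
print(parser.documents)
```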
        
         | lolive wrote:
         | The future of the semantic web is in big companies. Where
         | handling data exchanges at scale is becoming a massive waste of
         | time, resources and sanity.
        
       | lancesells wrote:
       | > Because distributed, interoperable, well defined data is
       | literally the most central problem for the current and near
       | future human economy.
       | 
       | I'm having a really hard time seeing this at least in the terms
       | of the web and the majority of web content.
        
       | lolive wrote:
       | Whoever dismisses the semantic web and prefers CSV for data
       | exchange can burn in HELL!!!
        
       | jansc wrote:
       | The semantic web is dead. Long live Topic maps [1] ;-)
       | 
       | [1] https://en.wikipedia.org/wiki/Topic_map
        
       | tconfrey wrote:
       | I think the general message here is that complex and complete
       | architectures tend to fail in favor of simpler solutions that
       | people can understand and use to get things done in the here and
       | now.
       | 
       | It's interesting to me that the recent uptick in the personal
       | knowledge management space (aka tools for thought)[0] is all
       | around the bi-directional graph which is basically a 2-tuple
       | simplified version of the RDF 3-tuple. You lose the semantics of
       | a labelled edge, but it's easier for people to understand.
       | 
       | [0] See Roam Research, Obsidian, LogSeq, Dendron et al.
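
Editor's sketch of the 3-tuple versus 2-tuple trade-off described above. The note names and predicates are invented for illustration. Dropping the predicate makes backlink queries trivial, but two different statements about the same pair of notes collapse into one link.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Link(NamedTuple):
    source: str
    target: str


# Hypothetical notes; the predicate carries the meaning of each connection.
triples = [
    Triple("owl-notes", "extends", "rdf-notes"),
    Triple("critique", "supports", "rdf-notes"),
    Triple("critique", "refutes", "rdf-notes"),
]


def to_links(ts):
    """Drop the predicate: every triple becomes an unlabelled link."""
    return [Link(t.subject, t.obj) for t in ts]


def backlinks(links, page):
    """Pages that link to `page`, the core query of a bi-directional graph."""
    return sorted({link.source for link in links if link.target == page})


links = to_links(triples)
print(backlinks(links, "rdf-notes"))
print(len(set(triples)), "distinct triples become", len(set(links)), "distinct links")
```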
        
       | openfuture wrote:
       | Lots of good points raised, necessary discussion.
       | 
       | My take is that we know a lot of this already but refuse to
       | accept the solutions. The way to exchange data and the way to
       | relate and query data are both known to a large extent; canonical
       | S-expressions and datalog-ish expressivity. I just can't
       | understand why no one thinks datalisp.is a persuasive foundation.
        
       | strangattractor wrote:
       | Having worked for an Academic Publisher that had intense interest
       | in this I finally came to the following conclusions as to why this
       | is DOA.
       | 
       | 1. Producers of content are unwilling to pay for it (and neither
       | are consumers BTW).
       | 
       | 2. It is impossible to predict how the ontology will change over
       | time so going back and reclassifying documents to make them
       | useful is expensive.
       | 
       | 3. Most pieces of info have a shelf life so it is not worth the
       | expense of doing it.
       | 
       | 4. Search is good enough and much easier.
       | 
       | 5. Much of what is published is incorrect or partial so.
       | 
       | In the end I decided this is akin to discussing why everybody
       | should use Lisp to program but the world has a different opinion.
        
         | ternaryoperator wrote:
         | Not sure I understand the comparison with Lisp. You list five
         | reasons for the semantic web that mostly involve cost.
        
       | pornel wrote:
       | Semantic Web lost itself in fine details of machine-readable
       | formats, but never solved the problem of getting correctly marked
       | up data from humans.
       | 
       | In the current web and apps people mostly produce information for
       | other people, and this can work even with plain text. Documents
       | may lack semantic markup, or may even have invalid markup, and
       | have totally incorrect invisible metadata, and still be perfectly
       | usable for humans reading them. This is a systemic problem, and
       | won't get better by inventing a nicer RDF syntax.
       | 
       | In language translation, attempts of building rigid formal
       | grammar-based models have failed, and throwing lots of text at
       | machine learning has succeeded. Semantic Web is most likely
       | doomed in the same way. GPT-3 already seems to have more
       | awareness of the world than anything you can scrape from any
       | semantic database.
        
         | pphysch wrote:
         | Sure, but there are still a lot of decisions being made behind
         | the curtain, when it comes to producing a model like GPT-3. How
         | was the training data ontologized? Where did it come from? To
         | some extent, these are the same problems facing manual
         | curation.
        
           | pornel wrote:
           | GPT may have had some manual curation to avoid making it too
           | horny and racist, but on a technical level for such models
           | you can just throw anything at it. The more the better, shove
           | it all in.
        
       | cyocum wrote:
       | The author of this post mentions the Humanities at the end of
       | their post and TerminusDB. I work on a Humanities based project
       | which uses the Semantic Web (https://github.com/cyocum/irish-gen)
       | and I have looked at TerminusDB a couple of times.
       | 
       | The main factor in my choice of technologies for my project was
       | the ability to reason data from other data. OWL was the defining
       | solution for my project. This is mainly because I am only one
       | person so I needed the computer to extrapolate data that was
       | logically implied but I would be forced to encode by hand
       | otherwise. OWL actually allowed my project to be tractable for a
       | single person (or a couple of people) to work on.
       | 
       | The author brings up several points that I have also run into
       | myself. The Open World Assumption makes things difficult to
       | reason about and makes understanding what is meant by a URL hard.
       | Another problem that I have run into is that debugging OWL is a
       | nightmare. I have no way to hold the reasoner to account so I
       | have no way when I run a SPARQL query to be able to know if what
       | is presented is sane. I cannot ask the reasoner "how did you come
       | up with this inference?" and have it tell me. That means if I run
       | a query, I must go back to the MS sources to double check that
       | something has not gone wrong and fix the database if it has.
       | 
       | Another problem that the author discusses is what I call
       | "Academic Abandonware". There are things out there but only the
       | academic who worked on it knows how to make it work. The
       | documentation is usually non-extant and trying to figure things
       | out can take a lot of precious time.
       | 
       | I will probably have another look at TerminusDB in due course but
       | it will need to have a reasoner as powerful as the OWL ones and
       | an ease of use factor to entice me to shift my entire project at
       | this point.
        
         | closewith wrote:
         | > I work on a Humanities based project which uses the Semantic
         | Web (https://github.com/cyocum/irish-gen) and I have looked at
         | TerminusDB a couple of times.
         | 
         | I had never come across anything like this before, but this is
         | a wonderful project.
        
         | zozbot234 wrote:
         | "Reasoning" capability can be added to any conventional
         | database via the use of views, and sometimes custom indexes.
         | The real problem is that it's computationally expensive for
         | non-trivial cases.
        
           | lolive wrote:
           | I hardly see how you can define in an RDBMS that a resource
           | that has both an engine and four wheels should be seen as a
           | car. Without going into a nightmare of unbearable SQL...
        
             | zozbot234 wrote:
             | The SQL for describing "resources that contain other
             | resources" gets a bit unidiomatic, but defining a query for
             | those that have e.g. an engine and four wheels is quite
             | easy. Then you can add that as a custom view, so that your
             | inferred data is in turn available and queryable on an
             | equal basis with raw input to the knowledge base.
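             | 
             | A minimal sketch of the view-based approach described above,
             | using Python's built-in sqlite3. The `part` table, the `car`
             | view and the sample rows are assumptions made up for
             | illustration, not anything from the thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical raw data: one row per part that a resource contains.
CREATE TABLE part (resource TEXT, kind TEXT);
INSERT INTO part VALUES
  ('r1', 'engine'), ('r1', 'wheel'), ('r1', 'wheel'),
  ('r1', 'wheel'), ('r1', 'wheel'),
  ('r2', 'wheel'), ('r2', 'wheel'),
  ('r3', 'engine'), ('r3', 'wheel'), ('r3', 'wheel');

-- The "inference": anything with an engine and exactly four wheels
-- is exposed as a car. In SQLite a comparison evaluates to 0 or 1,
-- so SUM() counts the matching rows.
CREATE VIEW car AS
SELECT resource
FROM part
GROUP BY resource
HAVING SUM(kind = 'engine') >= 1 AND SUM(kind = 'wheel') = 4;
""")

# The inferred data is queried like any other table.
cars = [row[0] for row in conn.execute("SELECT resource FROM car ORDER BY resource")]
print(cars)
```

             | The view is recomputed on every query, which is one place
             | where the cost mentioned upthread would show up on larger
             | data.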
        
               | lolive wrote:
               | Sure. But maintaining the coherence between your business
               | data model definitions and their implementation in the
               | RDBMS can quickly become a massive headache, don't you
               | think?
        
       | asplake wrote:
       | Seems to miss the obvious double whammy:
       | 
       | 1) Because it burdens producers to no obvious benefit, a problem
       | forever
       | 
       | 2) Because progress over time in language processing makes it
       | less and less necessary
        
         | jll29 wrote:
         | Natural language processing (NLP) may indeed understand the
         | unstructured text, then according to (2), the "Semantic Web" is
         | not needed, except for perhaps caching NLP outputs in machine-
         | readable form.
         | 
         | (1) is more fundamental: a lot of value-add annotation (in RDF
         | or other forms) would be valuable, but because there is work
         | involved those that have it don't give it away for free. This
         | part was not sufficiently addressed in the OP: the Incentive
         | Problem. Either there needs to be a way for people to pay for the
         | value-add metadata, or there has to be another benefit for the
         | provider why they would give it away. Most technical articles
         | focus on the format, or on some specific ontologies (typically
         | without an application).
         | 
         | A third issue is trust. In Berners-Lee's original paper, trust
         | is shown as an extra box, suggesting it is a component. That's
         | a grave misunderstanding: trust is a property of the whole
         | system/ecosystem; you can't just take a prototype and say "now
         | let's add a trust module to it!" In the absence of trust
         | guarantees, who ensures that the metadata that does exist is
         | correct? It may just be spam (annotation spam may be the
         | counterpart of Web spam in the unstructured world).
         | 
         | No Semantic Web until the Incentive Problem and the Trust
         | Problem are solved.
        
           | leoxv wrote:
           | "No Semantic Web until the Incentive Problem and the Trust
           | Problem are solved."
           | 
           | No. The semweb is already functional as is (see my other
           | comments here). Trust is orthogonal and can/is being solved
           | in different ways (centralized/decentralized as in
           | Wikidata/ORCIDs/org-ID-URIs).
        
             | oofbey wrote:
             | Talking about "the incentive problem" as if it's some minor
             | fixable issue ignores all of human psychology and
             | economics.
             | 
             | The climate crisis is a somewhat comparable example - it
             | requires changing behavior on a massive scale for abstract
             | benefit. In the climate case the benefit is much more
             | fundamental than what semweb promises. And despite massive
             | pain and effort we are very very far from addressing it.
             | Thinking semweb would happen just cuz it sounds cool is
             | super naive.
        
         | leoxv wrote:
         | 1)
         | 
         | - SPARQL is _a lot better_ than the many different forms of
         | SQL.
         | 
         | - Adding some JSON-LD can be done through simple JSON metadata.
         | Something people using Wordpress are already able to do. All
         | this will be more and more automated.
         | 
         | - The benefit is ontological cohesion across the whole web.
         | Please take a look at the https://conze.pt project and see what
         | this can bring you. The benefit is huge. Simple integration
         | with many different stores of information in a semantically
         | precise way.
         | 
         | 2) AI/NLP is never completely precise and requires huge
         | resources (which require centralization). The basics of the
         | semantic web will be based on RDF (whether created through some
         | AI or not), SPARQL, ontologies and extended/improved by AI/NLP.
         | It's a combination of the two that is already being used for
         | Wikipedia and Wikidata search results.
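         | 
         | A rough sketch of the "simple JSON metadata" point in 1),
         | built with Python's json module. The product name and the
         | `sameAs` URL are placeholders invented for illustration; only
         | the schema.org context and the `application/ld+json` script
         | type are standard:

```python
import json

# Hypothetical product description using the schema.org vocabulary.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    # Placeholder link to an external identifier for the same thing.
    "sameAs": "https://example.org/entity/placeholder",
}

# JSON-LD is usually embedded in a page inside a script element.
snippet = (
    '<script type="application/ld+json">'
    + json.dumps(product)
    + "</script>"
)
print(snippet)
```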
        
           | azinman2 wrote:
           | > The benefit is ontological cohesion across the whole web
           | 
           | This has no benefit for the person who has to pay to do the
           | work. Why would I pay someone to mark up all my data, just
           | for the greater good? When humans are looking/using my
           | products, none of this is visible. It's not built into any
           | tools, it doesn't get me more SEO, and it doesn't get me any
           | more sales.
        
             | leoxv wrote:
             | Why are people editing Wikipedia and Wikidata? What would
             | it bring you if your products were globally linked to that
             | knowledge graph and Google's machines would understand that
             | metadata from the tiny JSON-LD snippet on each page? The
             | tools are here already, the tech is evolving still, but the
             | knowledge graph concept is going to affect web shop owners
             | too soon enough.
        
               | azinman2 wrote:
               | It's unclear to me at this point why people are
               | contributing to Wikipedia and certainly wikidata, but
               | they're getting something out of it (perhaps notoriety),
               | and a lot probably has to do with contributing to the
               | greater good. It's all non profit. The rest of the web is
               | unlike these stand out projects.
               | 
               | Meanwhile, why would say Mouser or Airbnb pay someone to
               | markup their docs? WebMD? Clearly nothing has been
               | compelling them to do so thus far, and when you're
               | talking about harvesting data and using it elsewhere,
               | it's a difficult argument to make. Google already gets
               | them plenty of traffic without these efforts.
        
               | leoxv wrote:
               | They do it because it benefits them too. OpenStreetMap
               | links with WD, GLAMs link with WD, journals/ORCIDs link
               | with WD, all sorts of other data archives link with WD.
               | Whoever is not linking with it may see a crawler pass by
               | to collect license-free facts.
               | 
               | Also, I just checked: WebMD is using a ton of embedded
               | RDF on each page. They understand SEO well as you said :)
        
         | oofbey wrote:
         | Exactly.
         | 
         | A refinement on your second point is that the groups who would
         | have benefited the most from semantic web were the googles of
         | the world, but they were also the ones who needed it the least.
         | Because they were well ahead of everybody else at building the
         | NLP to extract structure from the existing www. In fact the
         | existence of semantic web would have eroded their key
         | advantage. So the ones in a position to encourage this and make
         | it happen didn't want it at all. So it was always DOA.
        
       | Arrgh wrote:
       | Building a trust relationship between commercial entities isn't
       | automatable; it nearly always requires a contract to be carefully
       | hand-written and argued over by high-priced lawyers before any
       | meaningful exchange of value can take place.
       | 
       | Sure, this is an unfortunate level of friction, and overkill in
       | many cases, but think about it from a cost/benefit perspective: I
       | can spend $10k on legal fees and not only avoid a lot of
       | uncertainty; very infrequently, the contract also protects me
       | from losses that can be orders of magnitude larger than what it
       | cost me to negotiate.
        
       | staplung wrote:
       | Clay Shirky nailed it in 2003:
       | 
       | https://deathray.us/no_crawl/others/semantic-web.html
       | 
       | I'll just excerpt the conclusion:
       | 
       | ``` The systems that have succeeded at scale have made simple
       | implementation the core virtue, up the stack from Ethernet over
       | Token Ring to the web over gopher and WAIS. The most widely
       | adopted digital descriptor in history, the URL, regards semantics
       | as a side conversation between consenting adults, and makes no
       | requirements in this regard whatsoever: sports.yahoo.com/nfl/ is
       | a valid URL, but so is 12.0.0.1/ftrjjk.ppq. The fact that a URL
       | itself doesn't have to mean anything is essential - the Web
       | succeeded in part because it does not try to make any assertions
       | about the meaning of the documents it contained, only about their
       | location.
       | 
       | There is a list of technologies that are actually political
       | philosophy masquerading as code, a list that includes Xanadu,
       | Freenet, and now the Semantic Web. The Semantic Web's
       | philosophical argument - the world should make more sense than it
       | does - is hard to argue with. The Semantic Web, with its neat
       | ontologies and its syllogistic logic, is a nice vision. However,
       | like many visions that project future benefits but ignore present
       | costs, it requires too much coordination and too much energy to
       | effect in the real world, where deductive logic is less effective
       | and shared worldview is harder to create than we often want to
       | admit.
       | 
       | Much of the proposed value of the Semantic Web is coming, but it
       | is not coming because of the Semantic Web. The amount of meta-
       | data we generate is increasing dramatically, and it is being
       | exposed for consumption by machines as well as, or instead of,
       | people. But it is being designed a bit at a time, out of self-
       | interest and without regard for global ontology. It is also being
       | adopted piecemeal, and it will bring with it all the
       | incompatibilities and complexities that implies. There are
       | significant disadvantages to this process relative to the shining
       | vision of the Semantic Web, but the big advantage of this bottom-
       | up design and adoption is that it is actually working now. ```
        
         | leoxv wrote:
         | "However, like many visions that project future benefits but
         | ignore present costs, it requires too much coordination and too
         | much energy to effect in the real world" ... Wikipedia,
         | Wikidata, OpenStreetMap, Archive.org, ORCID science-journal
         | stores, and the thousands of other open linked-data platforms
         | are proving Clay wrong each day. He has not been relevant for
         | a long time IMHO. Semweb > tag-taxonomies.
        
       | asiachick wrote:
       | I only skimmed the article so maybe I missed it, but at a glance
       | it seemed to completely miss the biggest issue. People will
       | intentionally mislabel things. If chocolate is trending people
       | will add "chocolate" to their tags for bitcoin.
       | 
       | You can see this all over the net. One example is the tags on
       | SoundCloud.
       | 
       | Another issue is agreeing on categories. Say women vs men or male
       | vs female. For the purpose of ID the fluidity makes sense but
       | less so for search. To put it another way, if I search for
       | brunettes I'd better not see any blondes. If I search for dogs
       | I'd better not see any cats. And what to do about ambiguous
       | stuff. What's a sandwich? A hamburger? A hotdog? A gyro? A taco?
        
       | PaulHoule wrote:
       | Semweb people got burned out by the stress of making new
       | standards which means that standards haven't been updated. We've
       | needed a SPARQL 2 for a long time but we're never going to get
       | it.
       | 
       | One thing I find interesting is that description logics (OWL)
       | seem to have stayed a backwater in a time when progress in SAT
       | and SMT solvers has been explosive.
        
         | ggleason wrote:
         | That's a very good point re SAT/SMT. F*
         | (https://www.fstar-lang.org/) has done truly amazing things by
         | making use of them, and it's great to be able to get
         | sophisticated correctness checks while doing basically none of
         | the work.
         | 
         | I'm going to have to go away and think about how one could
         | effectively leverage this in a data setting, but I'd love to
         | hear ideas.
        
           | PaulHoule wrote:
           | It doesn't have anything directly to do with SAT but I'd say
           | the #1 deficiency in RDFS and OWL is this.
           | 
           | Somebody might write
           | 
           |     :Today :tempF 32.0 .
           | 
           | or
           | 
           |     :Today :tempC 0.0 .
           | 
           | The point of RDFS and OWL is _not_ to force people into a
           | straightjacket the way people think it is but rather make it
           | possible to write a rulebox after the fact that merges data
           | together. You might wish you could write
           | 
           |     :tempC rdfs:subPropertyOf :tempF .
           | 
           | but you can't, what you really want is to write a rule like
           | 
           |     ?x :tempC ?y -> ?x :tempF ?y*1.8 + 32.0
           | 
           | but OWL doesn't let you do that. You can do it with SPIN but
           | SPIN never got ratified and so far all the SPIN
           | implementations are simple fixed point iterators and don't
           | take advantage of the large advances that have happened with
           | production rules systems since they fell out of fashion (e.g.
           | systems in the 1980s broke down with 10,000 rules, in 2022
           | 1,000,000 rules is often no problem.)
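           | 
           | To make the tempC -> tempF rule above concrete, here is a
           | rough sketch as a naive fixed-point iteration in plain
           | Python. It is not SPIN or any real rule engine; the tuple
           | representation of triples and the function name are
           | assumptions for illustration only:

```python
def apply_temp_rule(triples):
    """Add (x, ':tempF', y*1.8 + 32.0) for every (x, ':tempC', y).

    Repeats until a pass adds nothing new (a fixed point), which is
    the simplest possible way to evaluate a forward-chaining rule.
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for subject, predicate, value in list(facts):
            if predicate == ":tempC":
                derived = (subject, ":tempF", value * 1.8 + 32.0)
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


facts = apply_temp_rule({(":Today", ":tempC", 0.0)})
print(sorted(facts))
```

           | After the rule runs, data recorded in either unit can be
           | queried through :tempF, which is the after-the-fact merging
           | described above.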
        
         | zozbot234 wrote:
         | A recent paper connects SHACL (mentioned in OP) to description
         | logic and OWL: https://arxiv.org/abs/2108.06096 . This is a
         | surprising link which seems to have been missed by SemWeb
         | practitioners when SHACL was proposed.
        
         | blablabla123 wrote:
         | Wikidata is quite usable though with SPARQL through REST. To me
         | the biggest problem seems lack of documentation but for small
         | scale experiments interesting stuff can be done with it (with
         | enough caching, probably with SQL). Running my own triple store
         | seems like a lot of work though, starting with choosing which
         | one to use.
        
         | jrochkind1 wrote:
         | > Semweb people got burned out by the stress of making new
         | standards which means that standards haven't been updated.
         | 
         | True. But and also, web standards seem to have mostly been
         | abandoned/died beyond just semantic web. I am not sure how to
         | explain it, but there was a golden age of making inter-operable
         | higher-level data and protocol standards, and... it's over.
         | There's much less standards-making going on. It's not just SPARQL
         | that could use a new version, but has no standards-making
         | activity going on.
         | 
         | I can't totally explain it, and would love to read someone who
         | thinks they can.
        
       | Yahivin wrote:
       | Clearly the writings of a brilliant and disturbed mind.
        
       | throwaway0asd wrote:
       | Semantic web is data science for the browser. Most people can't
       | even figure out how to architect HTML/JS without a colossal tool
       | to do it for them, so figuring out data science architecture in
       | the browser is a huge ask.
        
         | z3t4 wrote:
         | There are two camps,
         | 
         | one that thinks you should use tools to generate HTML/JS and
         | those tools should generate strict XML and any extra semantic
         | data. The problem is that the actual users of these tools
         | either don't care about, or don't know about, semantic HTML or
         | semantic data.
         | 
         | Then the other camp that thinks HTML should be written by hand
         | which makes it small, simple and semantic (layout and design
         | separated into CSS) without any div elements. Hand-writing the
         | semantic data in addition to the semantic HTML becomes too
         | burdensome.
        
       | jerf wrote:
       | The reason why the semantic web failed is even more fundamental:
       | You can't get everyone to agree on one schema. Period. Even if
       | everyone is motivated to, they can't agree, and if there is even
       | a hint of a reason to try to distinguish oneself or strategically
       | fail to label data or label it incorrectly, it becomes even more
       | impossible.
       | 
       | (I mean, the "semantic web" has foundered so completely and
       | utterly on the problem of even barely working at all that it
       | has hardly had to face up to the simplest spam attacks of the
       | early 2000s, and it's not even remotely capable of playing in the
       | 2022 space.)
       | 
       | Agreement here includes not just abstract agreement in a meeting
       | about what a schema is, but complete agreement when the rubber
       | hits the road such that one can rely on the data coming from
       | multiple providers as if they all came from one.
       | 
       | Nothing else matters. It doesn't matter what the serialization of
       | the schema that can't exist is. It doesn't matter what inference
       | you can do on the data that doesn't exist. It doesn't matter what
       | constraints the schema that can't exist specifies. None of that
       | matters.
       | 
       | Next in line would be the economic impracticality of expecting
       | everyone to label their data out of the goodness of their hearts
       | with this perfectly-agreed-upon schema, but the Semantic Web
       | can't even get far enough for this to be its biggest problem!
       | 
       | Semantic web is a whole bunch of clouds and wishes and dreams
       | built on a foundation that not only _does_ not exist, but _can_
       | not exist. If you want to rehabilitate it, go get people to agree
       | (even in principle!) on a single schema. You won't rehabilitate
       | it. But you'll understand what I'm saying a lot more. And you'll
       | get to save all the time you were planning on spending building
       | up the higher levels.
        
         | lyxsus wrote:
         | There're a lot of wrong perspectives on the topic in this
         | thread, but this one I like the most. When someone starts to
         | talk about "agreeing on a single schema/ontology" it's a solid
         | indicator that that someone needs to get back to rtfm (which I
         | agree is a bit too cryptic).
         | 
         | The point here is that in semantic web there're supposed to be
         | lots and lots of different ontologies/schemas by design, often
         | describing the same data. SW spec stack has many well-separated
         | layers. OWL/RDFS was created to address that problem.
        
           | wrnr wrote:
           | I've been part of 4 commercial projects that used the
           | semantic web in one way or another. All of these projects, or
           | at least their semantic web parts, were failures. I think
           | that I have a good idea of where the misunderstandings about
           | the semantic web originate. The author does seem to have a
           | good understanding and is right about the semantic web
           | forcing everything into a single schema. Academia sells the
           | straitjacket of the semantic web as a lifelong free lunch at
           | an all-you-can-eat buffet, but instead you are sentenced to
           | life in prison. Adopting RDF is just too costly because it is
           | never the way computers or humans structure data in order to
           | work with it. Of course everything can be organised in a
           | hypergraph; there is a reason why Stephen Wolfram also uses
           | this structure: they are just so flexible. At the end of the
           | day I don't agree with the author's opinion of the semantic
           | web having much of a future. I did my best but it didn't work
           | out; time for other things.
        
             | lyxsus wrote:
             | > semantic web forcing everything into a single schema
             | 
             | I don't think "forcing" is the right word here, I think the
             | right one would be "expects it to converge under practical
             | incentives". That's a more gentle statement that reflects
             | the fact, that it doesn't have to for SW tech to work.
             | 
             | Also, the term "schema" is a bit off, bc there's really no
             | such thing in there. You can have the same graph described
             | differently using different ontologies at the same moment
             | without changing underlying data model, accessible via the
             | same interface. It's a very different approach.
             | 
             | > never the way computers or humans structure data in order
             | to work with it
             | 
             | If you hadn't mentioned that you had experience, I
             | would say you confuse different layers of technology,
             | because graph data model is a natural representation of
             | many complex problems. But because you have, can I ask you
             | to clarify what you mean here?
             | 
             | > Academia sells the straitjacket of the semantic web as a
             | lifelong free lunch at an all-you-can-eat buffet
             | 
             | I disagree, bc I in fact think that academia doesn't sell
             | shit, and that's the problem. There's no clear marketing
             | proposal and I don't think they really bother or are
             | equipped to make it. There's a lack of human-readable specs
             | and docs, it's insane how much time you need to invest in
             | this topic even just to be able to estimate whether it's
             | reasonable to consider using SW in the first place. Also,
             | the lack of a conceptual framework, "walkthroughs", and
             | tools, plus outdated and incorrect information, drops the
             | survival chance of a SW-based project by at least x100. But
             | it can really shine in some use-cases, that unfortunately
             | have little to do with the "web" itself.
        
             | zozbot234 wrote:
             | RDF is just an interoperability format. You aren't supposed
             | to use it as part of your own technology stack, it just
             | allows multiple systems to communicate seamlessly.
        
           | jerf wrote:
           | "The point here is that in semantic web there're supposed to
           | be lots and lots of different ontologies/schemas by design,
           | often describing the same data."
           | 
           | Then that is just another reason it will fail. We already
           | have islands of data. The problem with those islands of data
           | is not that we don't have a unified expression of the data,
           | the problem is the _meaning_ is isolated. The lack of a
           | single input format is little more than annoyance and the
           | sort of thing that tends to resolve itself over time even
           | without a centralized consortium, because that's the _easy_
           | part.
           | 
           | Without agreement, there is no there there, and none of the
           | promised virtues can manifest. If what you say is the
           | semantic web is the semantic web (which certainly doesn't
           | match what everyone else says it is), then it is failing
           | because it doesn't solve the right problem, though that isn't
           | surprising because it's not solvable.
           | 
           | If what you describe is the semantic web, the Semantic Web is
           | "JSON", and as solved as it ever will be.
           | 
           | A "knowing wizard correcting the foolish mortals" pose would
           | be a lot more plausible if the "semantic web" had more to
           | show for its decades, actual accomplishments even remotely in
           | line with the promises constantly being made.
        
             | lyxsus wrote:
             | so if it tries to have a unified ontology that's why it's
             | destined to fail, but if it's designed to work with many
             | small ontologies... that's why it will fail! lol, but you
             | can't have it both ways.
             | 
             | In SW, the "semantic" part is subjective to an interpreter.
             | You can have different data sources, partially mapped using
             | OWL to the ontology that an interpreter (your program)
             | understands. That allows you to integrate new data sources
             | independently from the program, seamlessly if they use a
             | known ontology, or by creating a mapping of a set of
             | concepts into a known ontology (which you would have to do
             | anyway in any other approach). So in theory, data
             | consumption capabilities (and reasoning) grow as your data
             | sources evolve.
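             | 
             | A toy sketch of that mapping idea in plain Python. A
             | dictionary lookup stands in for OWL equivalence axioms
             | here, so this is not a reasoner; every prefix and name
             | (shopA:, shopB:, app:) is made up for illustration:

```python
# Hypothetical predicate mapping: source vocabularies on the left,
# the one vocabulary the consuming program understands on the right.
PREDICATE_MAP = {
    "shopA:title": "app:name",
    "shopB:label": "app:name",
}


def to_known_ontology(triples, predicate_map):
    """Rewrite predicates the program does not know into ones it does.

    Predicates without a mapping are passed through unchanged.
    """
    return {(s, predicate_map.get(p, p), o) for s, p, o in triples}


# Two independent sources describing their data differently.
source_a = {("item:1", "shopA:title", "Kettle")}
source_b = {("item:2", "shopB:label", "Toaster")}

# The program itself only ever asks for app:name.
merged = to_known_ontology(source_a | source_b, PREDICATE_MAP)
names = sorted(o for _, p, o in merged if p == "app:name")
print(names)
```

             | Adding a third source would mean adding entries to the
             | mapping, not changing the code that consumes app:name.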
             | 
             | > If what you describe is the semantic web, the Semantic
             | Web is "JSON", and solved.
             | 
             | It has nothing to do with JSON, JSON-LD, XML, Turtle, N3,
             | RDFa, microdata, etc. RDF is a data model, but those
             | are serialisation formats. That's another interesting
             | point, because half of the people talk only about formats
             | and not the full stack. That's not a reasonable discussion.
             | 
             | > which certainly doesn't match what everyone else says it
             | is
             | 
             | oh, I know it and it's upsetting.
        
               | pessimizer wrote:
               | > if it tries to have a unified ontology that's why it's
               | destined to fail, but if it's designed to work with
               | many small ontologies... that's why it will fail! lol,
               | but you can't have it both ways.
               | 
               | You're only supposed to say "you can't have it both ways"
               | about contradictory things. It can both be a hopeless
               | endeavor because it is impossible to agree on ontologies
               | and a useless endeavor if you don't agree on ontologies.
        
               | lyxsus wrote:
               | Oh, I would like to see the look on your face when, in
               | just about 100-200 years from now, it is mature enough
               | for "web scale".
        
               | pessimizer wrote:
               | Just 200 years around the corner.
        
               | jrochkind1 wrote:
               | Maybe 300. But no longer, I'm confident! Do you want to
               | be left out in a couple centuries? You better get on the
               | train now.
        
           | kortex wrote:
           | > The point here is that in semantic web there're supposed to
           | be lots and lots of different ontologies/schemas by design,
           | often describing the same data.
           | 
           | This is incredibly problematic for many reasons. Not the
           | least of which is the inevitable promulgation of bad
           | data/schemas. I remember one ontology for scientific
           | instruments and I, a former chemist, identified multiple
           | catastrophically incorrect classifications (I forget the
           | details, but something like classifying NMR as a kind of
           | chromatography. Clear indicators the owl author didn't know
           | the domain).
           | 
           | The only thing worse than a bad schema is multiple bad
           | schemas of varying badness, and not knowing which to pick.
           | Especially if there are disjoint aspects of each which are
           | (in)correct.
           | 
           | There may have been advancements in the few years since I was
           | in the space, but as of then, any kind of
           | probabilistic/doxastic ontology was unviable.
        
             | lyxsus wrote:
             | That's a valid point, but I'm not sure the following
             | problem has a technical solution:
             | 
             | > Clear indicators the owl author didn't know the domain
        
               | kortex wrote:
               | It doesn't, which is exactly the problem. Ontologies
               | inevitably have mistakes. When your reasoning is based on
               | these "strong" graph links, even small mistakes can
               | cascade into absolute garbage. Plus manual taxonomic
               | classification is super time consuming (ergo expensive).
               | Additionally, that assumes that there is very little in
               | the way of nebulosity, which means you don't even have a
               | solid grasp of correct/incorrect. Then you have
               | perspectives - there is no monopoly on truth.
               | 
               | It's just not a good model of the world. Soft features
               | and belief-based links are a far better way to describe
               | observations.
               | 
               | Basically, every edge needs a weight, ideally a log-
               | likelihood ratio. 0 means "I have no idea whether this
               | relation is true or false", positive indicates truthiness
               | and negative means the edge is more likely to be false
               | than true.
               | 
               | Really, the whole graph needs to be learnable. It doesn't
               | really matter if NMR is a chromatographic method. _Why_
               | do you care what kind of instrument it is? Then apply
               | attributes based on behaviors ("it analyses chemicals",
               | "it generates n-dim frequency-domain data")
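
A minimal sketch of the edge-weight idea above, not anything kortex posted. It assumes the weight is read as log-odds against an even prior, so 0 means "no idea", and that independent pieces of evidence combine by addition. The edge names, the probabilities, and the helpers `llr`, `probability` and `update` are all hypothetical.

```python
import math


def llr(p: float) -> float:
    """Log-odds weight for a belief p in (0, 1); 0.5 maps to 0."""
    return math.log(p / (1.0 - p))


def probability(weight: float) -> float:
    """Inverse of llr: map an edge weight back to a probability."""
    return 1.0 / (1.0 + math.exp(-weight))


# Hypothetical weighted edges: (subject, relation, object) -> weight.
edges = {
    ("NMR", "is_a", "chromatography"): llr(0.05),  # probably false
    ("NMR", "analyses", "chemicals"): llr(0.99),   # almost certainly true
    ("NMR", "is_a", "spectroscopy"): 0.0,          # no idea either way
}


def update(graph: dict, edge: tuple, evidence_weight: float) -> None:
    """Fold in one independent piece of evidence by adding its weight."""
    graph[edge] = graph.get(edge, 0.0) + evidence_weight


# A new observation that supports the spectroscopy edge.
update(edges, ("NMR", "is_a", "spectroscopy"), llr(0.9))
```

Because weights add, a bad classification is a negative number that later evidence can outweigh, rather than a hard link that every downstream inference inherits.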
        
               | lyxsus wrote:
               | Understood, thank you.
               | 
               | Yes, that's not solvable with just OWL (though it might
               | help a little) or any other popular reasoners I know.
               | There're papers, proposals and experimental
               | implementations for generating probability-based
               | inferences, but nothing one can just take and use. Still,
               | there're tons of interesting ideas on how to represent
               | that kind of data in RDF or reason about it.
               | 
               | I think the correct solution in SW context would be to
               | add a custom reasoner to the stack.
        
         | leoxv wrote:
         | Wikidata is already providing a nearly globally accepted store
         | of concept IDs. Wikipedia adds a lot of depth to this knowledge
         | graph too.
         | 
         | Schema.org has become very popular and Google is backing this
         | project. Wordpress and others are already using it.
         | 
         | Governments are requiring not just "open data", but also "open
         | linked-data" (which can then be ingested into a SPARQL engine),
         | because they want this data to be usable across organizations.
         | 
         | The financial industry are moving to the FIBO ontology, and on
         | and on...
        
       | lysergia wrote:
       | Long live the dream of the semantic web. For visual learners
       | there's a great YouTube video explaining the semantic web here:
       | 
       | https://youtu.be/6gmP4nk0EOE
        
         | mxmilkiib wrote:
         | For a longer in-depth video playlist,
         | https://youtube.com/playlist?list=PLoOmvuyo5UAeihlKcWpzVzB51...
        
       | thirdtrigger wrote:
       | Interesting writeup. I'm of the opinion that the problem of the
       | naming issue (how to call "things"?) sits in the idea that going
       | from structured documents to structured data is one abstraction
       | level too deep (i.e., people don't agree on how to call
       | "things"). I believe this can be solved by similarity search; if
       | we can approximate the data and represent the structure in
       | embeddings. Hopefully, this might be a step in the 2nd try, as
       | mentioned in the MD :)
       | 
       | > It would be like wikipedia, but even more all encompassing, and
       | far more transformational.
       | 
       | You might like to see this
       | (https://weaviate.io/developers/weaviate/current/tutorials/se...)
       | as a step in this direction because it contains the structured
       | Wikipedia data and the embeddings to target individual nodes in
       | the graph.
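
A toy sketch of the similarity-search idea above: instead of requiring everyone to agree on one name, look up the closest known name by cosine similarity of embeddings. The three vectors are hand-made for illustration and do not come from any real model or from Weaviate; `cosine` and `nearest` are hypothetical helpers.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hand-made toy embeddings (assumption: a real system would get
# these from a trained model).
embeddings = {
    "author": [0.9, 0.1, 0.0],
    "creator": [0.8, 0.2, 0.1],
    "price": [0.0, 0.1, 0.9],
}


def nearest(term, table):
    """Return the other property name whose vector is closest to term's."""
    query = table[term]
    scored = [
        (cosine(query, vec), name)
        for name, vec in table.items()
        if name != term
    ]
    return max(scored)[1]


match = nearest("author", embeddings)
```

With these made-up vectors, "author" and "creator" land next to each other even though the two schemas never agreed on a name.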
        
       | galaxyLogic wrote:
       | I think the answer is Datalog. It is simple, simpler than SQL but
       | powerful like Prolog. Why hasn't it caught on?
        
         | ggleason wrote:
         | I am of the same opinion. TerminusDB uses a Datalog for query
         | and update. I think it will catch on.
         | 
         | And in the future we will even be able to add constraints -
         | which can be a real superpower in querying graphs.
        
         | SeanLuke wrote:
         | The semantic web is a notion for _defining_ data relationships.
         | Datalog and SQL are languages for _queries_. These have little
         | to do with one another. It's like saying that HTML is failing
         | as a format, so the answer is HTTP.
        
         | tannhaeuser wrote:
         | Nit: Datalog isn't as powerful as Prolog, that's the whole
         | point of it as a decidable fragment of first order logic (and
         | it's seeing increased use in SMT/fixpoint solvers and
         | databases)
         | 
         | But yeah, if getting rid of the whole SemWeb stack, triples,
         | and their many awful serialization formats, design-by-committee
         | query and constraint languages (keeping just the good parts)
         | means we can finally return to focus on Prolog, Datalog, and
         | simple term encodings of logic, I'm all for it.
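
To illustrate the decidability point above, here is a naive bottom-up evaluation of a single Datalog-style rule, written in plain Python rather than in any real Datalog engine. With a finite set of constants and no function symbols, only finitely many tuples can ever be derived, so the loop always reaches a fixpoint. The facts and the `transitive_closure` helper are hypothetical.

```python
def transitive_closure(facts):
    """Naive fixpoint evaluation of the rule
        subclass(X, Z) :- subclass(X, Y), subclass(Y, Z).
    over a finite set of (sub, super) pairs.
    """
    derived = set(facts)
    while True:
        # Join the relation with itself on the shared middle term.
        new = {
            (x, z)
            for (x, y1) in derived
            for (y2, z) in derived
            if y1 == y2
        }
        if new <= derived:  # nothing new was derived: fixpoint reached
            return derived
        derived |= new


# Hypothetical facts, loosely following the instrument example upthread.
facts = {
    ("nmr", "spectroscopy"),
    ("spectroscopy", "analytical_method"),
    ("hplc", "chromatography"),
    ("chromatography", "analytical_method"),
}

closure = transitive_closure(facts)
```

The same rule applied to cyclic facts still terminates, which is the property the trade-off against full Prolog buys.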
        
       | travisgriggs wrote:
       | > My experience in engineering is that you almost always get
       | things wrong the first time.
       | 
       | Probably the oldest gem I can remember, harvested from a
       | more senior mentor type, was the quip "It takes 3 times to get it
       | right. And that's an average. Get failing."
       | 
       | Now, I'm that older guy. I still think this holds.
        
       | efitz wrote:
       | The answer to almost any question beginning with "why don't they"
       | (or why didn't they), is almost always "money".
       | 
       | Producing, aggregating, storing, or otherwise adding value to
       | information costs money. Operating the internet costs money.
       | Providing access to data costs money.
       | 
       | People are lazy. Businesses on the internet have learned that
       | they can extract more money from this vast pool of lazy people by
       | presenting information rather than just providing information. By
       | this, I mean that the value-add and/or lock-in of many internet
       | businesses is tied to how the information is presented; adopting
       | a standard format would be effort that would not be financially
       | rewarded.
       | 
       | (by "lazy", I mean "looking for local minima in effort to
       | accomplish whatever task that they're trying to do")
       | 
       | Finally, the web envisioned itself as a hypermedia system that
       | incorporated presentation (and subsequently active content)
       | instead of just semantic content. Since presentation is a
       | property of the web, it was quickly adopted for the reasons
       | described above and evolved into the modern web (which replaced
       | the blink tag with shit tons of javascript, don't get me
       | started).
       | 
       | Therefore the "semantic web" could never exist because
       | "semantics" is fundamentally incompatible with "web". Once you
       | invent the web, you can't have the semantic web anymore because
       | money.
       | 
       | We shoulda stuck with gopher.
        
         | jsight wrote:
         | +1 - The surest path to having someone copy your data and
         | monetize it better than you is to present it in semantically
         | sound ways.
         | 
         | Imagine a stock site that made real time prices readily
         | available in a common format! Oh, it exists, but you have to
         | pay for it...
         | 
         | And you don't need semantics for that, you want something more
         | like Swagger.
        
         | [deleted]
        
       | kukkeliskuu wrote:
       | There are deeper issues with semantic web.
       | 
       | Look at the EDIFACT. Huge standardization effort, but it was
       | still not possible to automate system to system communication,
       | because ultimately you need to rely on some words, and words are
       | flexible. I was working with multiple companies that understood
       | "through-invoicing" in EDIFACT differently, but the differences
       | were so subtle they needed a third party to clarify those
       | differences.
       | 
       | Lately, in various sectors, such as finance, there are
       | commercially available reference data models. These are extremely
       | complex, because they need to cover all the possible alternatives
       | businesses might have, in various countries. Just to gain basic
       | understanding of such a model is a huge effort. To have people
       | label things properly would probably involve learning a similar
       | system.
        
         | TylerE wrote:
         | Sort of reminds me of the original idea behind REST. IMO
         | automated system-to-system is a dead end... you're always going
         | to need humans in the loop for any useful non-trivial data.
        
       | WaitWaitWha wrote:
       | Very interesting. I would like to see pricing, especially for the
       | stringchair. I have a few buddies that could use it.
        
       | boxslof wrote:
       | keeping it short because on phone.
       | 
       | working for a company, 100 % semantic web, integrating many, many
       | parties for many years now, all of it rdf.
       | 
       | - you get used to turtle. one file can describe your db and be
       | ingested as such. handy.
       | - interoperability is really possible. (distributed apps)
       | - hardest part is getting everyone to agree on the model, but
       | often these discussions are more about resolving ambiguities
       | surrounding the business than about translating it to a model.
       | (it gets things sharp)
       | - agree on a minimum model, open world means you can extend in
       | your app
       | - don't overthink your owl descriptions
       | 
       | - no, please no reasoners. data is never perfect.
       | 
       | - tooling is there
       | - triple stores are not the fastest
       | 
       | pls, not another standard to fix the semantic web. Everything is
       | there. More maturity in tooling might be welcome, but this is a
       | function of the number of people using it.
        
       | 0xbadcafebee wrote:
       | Very well written introduction to some of the problems with
       | semantic web dev.
       | 
       | Personally I think the reason it died was there were no obvious
       | commercial applications. There are of course commercial
       | applications, but not in a way that people realize what they're
       | using is semantic web. Of all the 'note keepers' and 'knowledge
       | bases' out there, none of them are semantic web. Thus it has
       | languished in academia and a few niche industries in backend
       | products, or as hidden layers, ex. Wikipedia. Because there
       | wasn't something we could stare at and go "I am using the
       | semantic web right now", there was no hype, and no hype means no
       | development.
        
         | k8si wrote:
         | Very hard to make a business case because of the reasons you
         | mentioned + the costs are very front-loaded because ontologies
         | are so damn hard to build, even for very well-contained
         | problems. Without a clear payoff, why bother
        
           | galaxyLogic wrote:
           | Yes because that is about formalizing all human thought and
           | knowledge. In principle that has nothing to do with computers
           | and is something everybody working in science and humanities
           | has been always trying to do starting with Socrates or was it
           | Pythagoras. It is about "building theories".
           | 
           | Now computers can help in that of course but it doesn't
           | really make it easy to create a consistent stable "theory of
           | everything". As we used to say "garbage in garbage out".
        
       ___________________________________________________________________
       (page generated 2022-08-10 23:00 UTC)