[HN Gopher] Data Mesh Architecture
___________________________________________________________________
Data Mesh Architecture

Author : aiobe
Score : 71 points
Date : 2022-03-18 12:15 UTC (1 day ago)

(HTM) web link (www.datamesh-architecture.com)
(TXT) w3m dump (www.datamesh-architecture.com)

| politelemon wrote:
| Is there an underlying assumption here that all of the datasets'
| domains are perfectly in sync with each other in the context of
| domain metadata?
|
| As an example, a Team1 might define the manufacturer of a
| Sprocket as the company that assembled it, whereas a Team2 might
| define the manufacturer as the company that built the Sprocket's
| engine. Since the purpose of a datamesh is to enable other teams
| to perform cross-domain data analytics, there needs to be
| reconciliation regarding these definitions, or it'll become a
| datamess. Where does that get resolved?
|
| gxt wrote:
| The chief data officer, in close collaboration with the chief
| data engineering officer, must elaborate automated normalization
| guidelines, backed with implementations used across all data
| streams, to ensure any skew in the data model is limited to
| non-production environments and all data entities are
| materialized consistently across the whole data model.
|
| i_like_waiting wrote:
| What type of company are you working for? Usually there is
| not even a CIO; I haven't even heard of a company with both
| a CDO and a CDEO (or even a CDEO at all).
|
| I thought a big part of the need that data mesh fills comes
| from organizations that are missing resources in their core
| BI team.
|
| tremoloqui wrote:
| A data mesh approach probably wouldn't work in the sort of
| organization you describe.
|
| IMO, to make it work you need a consistent taxonomy or a way
| of translating from a particular domain to some sort of
| interchange format.
|
| If you have that, then a set of centralized tools can pull
| from the separate domains using a core set of protocols to
| produce reports etc.
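The translation step tremoloqui describes can be sketched using politelemon's Sprocket example. This is only an illustration under stated assumptions: `SprocketRecord`, `from_team1`, `from_team2`, `merge`, and the row layout are hypothetical names invented here, not part of any data mesh tooling mentioned in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SprocketRecord:
    """Hypothetical interchange record.

    The ambiguous word "manufacturer" is split into two explicit roles,
    so neither team's local meaning leaks into cross-domain reports.
    """
    sprocket_id: str
    assembler: Optional[str]
    engine_builder: Optional[str]


def from_team1(row: dict) -> SprocketRecord:
    # Assumption: Team1's "manufacturer" is the company that assembled it.
    return SprocketRecord(row["id"], assembler=row["manufacturer"],
                          engine_builder=None)


def from_team2(row: dict) -> SprocketRecord:
    # Assumption: Team2's "manufacturer" is the company that built the engine.
    return SprocketRecord(row["id"], assembler=None,
                          engine_builder=row["manufacturer"])


def merge(a: SprocketRecord, b: SprocketRecord) -> SprocketRecord:
    """Combine two partial views of the same sprocket (first non-empty wins)."""
    if a.sprocket_id != b.sprocket_id:
        raise ValueError("records describe different sprockets")
    return SprocketRecord(
        a.sprocket_id,
        a.assembler or b.assembler,
        a.engine_builder or b.engine_builder,
    )


if __name__ == "__main__":
    t1 = from_team1({"id": "s1", "manufacturer": "Acme"})
    t2 = from_team2({"id": "s1", "manufacturer": "Bolt"})
    print(merge(t1, t2))
```

Each domain keeps its own vocabulary and owns one small adapter; the central tooling only ever sees the interchange record.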
|
| gxt wrote:
| There's no magic. You need a core team that pivots from
| writing code at O(n) cost enterprise-wide to more or less
| amortized O(1), where n is the amount of work required to
| process a new data stream, i.e. having to write code once
| per stream vs. once for a standardized stream that gets
| reused. With only datamesh I don't think it's going to work,
| but with standardized tools that allow your teams to write
| transformations and code as data, every team effectively
| gets access to a self-service data warehouse with only
| access to pre-approved happy paths that can be
| automatically monitored for the most part. That's where you
| gain in efficiency and can let your BI teams focus on BI
| and not boilerplate code, infrastructure, conformity, etc.
|
| i_like_waiting wrote:
| Yes, it's a similar path to the one I am taking (while
| leading BI in my org). Having the first signs of
| self-service from an analysis perspective is super easy
| thanks to tools like Metabase.
|
| For bringing data in, that's a completely different story,
| especially in non-tech organizations. The gap between how
| a power user from a specific department and somebody from
| my team brings in and transforms data is still too big,
| and standards are hard to enforce (following naming
| conventions, keeping the same data formats for the same
| columns, lowercasing certain columns so joins are done
| correctly...). They usually have their "playground
| schemas" that they use, but that is very far from saying
| that they "own" data quality there.
|
| LaserToy wrote:
| It looks like a weird attempt to build a consulting business
| around a simple idea.
|
| Treat data assets like microservices and pipelines like the
| network. Period.
|
| Prescribing everything else rubs me the wrong way.
|
| So, data mesh is: an architecture in which the company's data
| is organized into loosely coupled data assets.
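One possible reading of LaserToy's "data assets like microservices" is that each asset publishes a small contract and consumers depend only on that contract. The sketch below assumes this reading; `DataAsset`, `OrdersAsset`, `validate`, and `total_amount` are hypothetical names made up for illustration.

```python
from typing import Dict, Iterable, Iterator, Protocol


class DataAsset(Protocol):
    """Hypothetical contract a domain team publishes for its data asset."""
    name: str
    schema: Dict[str, type]

    def read(self) -> Iterable[dict]: ...


def validate(row: dict, schema: Dict[str, type]) -> None:
    """Reject rows that do not match the published schema."""
    if set(row) != set(schema):
        raise ValueError(f"columns {sorted(row)} do not match {sorted(schema)}")
    for column, expected in schema.items():
        if not isinstance(row[column], expected):
            raise TypeError(f"{column!r} should be {expected.__name__}")


class OrdersAsset:
    """Example asset owned by one domain; storage details stay private."""
    name = "orders"
    schema = {"order_id": str, "amount": float}

    def __init__(self, rows: Iterable[dict]) -> None:
        self._rows = list(rows)

    def read(self) -> Iterator[dict]:
        for row in self._rows:
            validate(row, self.schema)  # the owner enforces its own contract
            yield row


def total_amount(asset: DataAsset) -> float:
    """A consumer that knows only the contract, not the owning team's internals."""
    return sum(row["amount"] for row in asset.read())


if __name__ == "__main__":
    orders = OrdersAsset([
        {"order_id": "a", "amount": 1.5},
        {"order_id": "b", "amount": 2.0},
    ])
    print(total_amount(orders))
```

The loose coupling comes from `total_amount` never touching how `OrdersAsset` stores its rows, much as a service client only sees an API.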
|
| robertlagrant wrote:
| It really feels like data mesh is a fairly half-baked concept
| born out of short-term consulting gigs and a desire to become a
| technical thought leader.
|
| i_like_waiting wrote:
| Reminds me a lot of the first OLAP cubes: something that
| consultants online praise as much as possible, only to be
| contracted by the company 3-4 years later to fix the mess it
| created.
|
| edmundsauto wrote:
| What are the downsides of OLAP cubes, and how were they
| fixed? Curious to level up my understanding.
|
| i_like_waiting wrote:
| I guess they had their place at some point in time, but I
| still vividly remember my old manager talking about
| building an OLAP cube in 2018.
| https://www.holistics.io/blog/the-rise-and-fall-of-the-olap-...
|
| i_like_waiting wrote:
| So if I understand this correctly, data mesh is just a data mart
| that doesn't bring data into a database as a table, but uses S3
| storage instead (I assume because that's cheaper in the cloud?)
|
| skrrr wrote:
| That + a central data platform team that provides infra,
| quality monitors, data lineage and catalogue capabilities + a
| central team that provides guidelines on SLAs, metadata
| standards etc. Sounds good in theory; I am eager to see how it
| fails in practice.
|
| mountainriver wrote:
| This seems like mostly common sense. Infrastructure teams should
| always be building tools that the org consumes (and ideally the
| general public).
|
| In a lot of orgs this goes sideways and the infrastructure teams
| end up owning everything and never have time to do anything else.
| Usually this happens due to upper management putting on the
| squeeze.
|
| In order for teams to actually own their infrastructure and data
| we need better tooling to help them. This is coming along
| nowadays but isn't fully there.
|
| sdze wrote:
| If you need so many "slides" to persuade your clients of
| something, I think you lost already.
|
| rad_gruchalski wrote:
| Considering how many big companies are implementing this
| right now, I don't agree. The C-level likes slides.
|
| MikeDelta wrote:
| Indeed, the Future State Architecture documents from the
| central architects that I have seen were all PowerPoint
| presentations with at least 100 slides.
___________________________________________________________________
(page generated 2022-03-19 23:00 UTC)