[HN Gopher] Maintain a clean architecture in Python with dependency rules
___________________________________________________________________
Author : rekahrv
Score  : 115 points
Date   : 2022-12-15 13:29 UTC (9 hours ago)

web link (sourcery.ai)

| vasili111 wrote:
| Am I the only person that prefers to use raw SQL over SQLAlchemy?
| I do not see any real advantage of using SQLAlchemy over raw SQL
| if I do not plan (which I do not) to switch DB engine for the
| application in the future. Do you see any real advantage of
| using SQLAlchemy over raw SQL queries if you do not plan to
| switch DB engine for your application in the future?
| willseth wrote:
| It's nice to have all of your db operations in Python and
| automatically integrated with existing Python tooling. It also
| makes it easier to refactor, organize, etc. SQLAlchemy comes
| OOTB with a lot of nice convenience tools and functions, and
| there's an ecosystem built around it, e.g. Alembic for schema
| migration. There are some cases like really complex queries
| where it can get in the way, but overall I find the tradeoffs
| are easily worth it for the convenience.
| dontlaugh wrote:
| Alchemy makes it quite easy to compose queries, which isn't
| possible with SQL. That's about it.
| gghhzzgghhzz wrote:
| I use it mostly for reverse engineering a model on top of a
| legacy database when working on projects to clean and migrate
| that data.
|
| I've seen some very legacy database 'designs' and have never
| failed to model them with a combination of sqlalchemy join
| mapping, datatype mapping and some object properties in python
| for cases that are simpler to just express as list
| comprehensions.
|
| You end up with some data quality rules / Transformation logic
| you can reasonably share with business users.
|
| On the Load end I normally do that via sql bulk inserts as
| using an ORM just adds too much overhead and not enough
| control.
| lyu07282 wrote:
| Is there something like this for react/jsx? I always wished I
| could constrain component dependencies across the atom > molecule
| > organism layers.
| thundergolfer wrote:
| This is supported in Bazel with package visibility rules. Once
| you've got that feature as a way to tame a larger and expanding
| codebase, you'll wonder why it isn't a feature in more systems.
|
| https://bazel.build/concepts/visibility
| rekahrv wrote:
| Thanks a lot for sharing this link. I haven't used Bazel, but
| this concept of target and load visibility sounds cool.
|
| "Once you've got that feature as a way to tame a larger and
| expanding codebase, you'll wonder why it isn't a feature in
| more systems." :-)
| shankr wrote:
| This has also been recently integrated in pants.
|
| https://github.com/pantsbuild/pants/issues/13393
| hbrn wrote:
| This (plus Law of Demeter) is the right way to handle
| medium-to-large projects, though I'm not completely sold on the
| tooling. I mostly do it manually (yes, it is still doable with
| dozens of modules since the dependency hierarchy doesn't change
| often).
|
| One recommendation I have is to present the hierarchy as a DAG.
| The existing image
| (https://sourcery.ai/static/05300f06cb847360719e2aa31dc5a31b/...)
| doesn't make it very obvious that api is the highest-level module,
| even though it is clearly stated in the rules.
| AlphaSite wrote:
| Import Linter is probably a better choice for Python since it's
| free https://pypi.org/project/import-linter/
| hbrn wrote:
| I thought about using it at a small scale, but frankly I find
| more value in a visual representation, and once I have that I
| don't want to explicitly blacklist imports: those rules can
| already be derived from the graph (i.e. any import that
| introduces a cycle is a violation).
|
| Doing it manually allows me the following:
|
| 1.
| I get to define which namespaces (domains)
| matter, irrespective of the package structure. E.g. an import
| from stripe.api_resources is still a dependency on stripe,
| not on stripe.api_resources.
|
| 2. Work around a bunch of dependency caveats (frameworks like
| Django do runtime imports and mix high- and low-level concepts
| in settings, db foreign keys might invert your dependencies,
| etc.)
|
| 3. Violations are very easy to see: they are cycles in the
| graph, i.e. arrows pointing upwards. Those are typically
| design flaws. Though I still allow certain violations because
| practicality beats purity.
|
| 4. Since some violations are allowed, I get to decide how to
| arrange the graph so that it is clearer what the flaw is
| and how to address it.
|
| I haven't found a good tool that allows me to get all of
| these. One day I'll have to build it myself.
| rekahrv wrote:
| Yes, Law of Demeter is exactly what these rules are trying to
| achieve. :-) Thanks, a DAG is a great recommendation.
| neves wrote:
| I've already seen tools like this for other languages, but never
| seen someone effectively using them. Does anyone here have good or
| bad experiences with these architecture rule systems?
| hakanderyal wrote:
| After dealing with that problem and enduring the pain of it for
| years, I finally switched to C#/.NET. It has the necessary
| tooling to achieve this and more.
|
| Rewriting a lot of things was time well spent rather than trying
| to tame the dynamic nature of Python and my tendency to overuse
| it.
|
| And I can't believe I'm writing this after all these years
| evangelizing Python and dynamically typed languages.
| elforce002 wrote:
| Interesting. What about using mypy to have some sort of static
| typing a la typescript?
| wiseowise wrote:
| Lipstick on a pig. Unlike TypeScript, mypy feels really
| clunky.
| vyrotek wrote:
| Pleasantly surprised to see .NET shared on HN!
| I've had a lot
| of success with it in my career building several SaaS platforms
| from the ground up. The tooling is great. It's wild how
| productive a small startup team can be on the .NET stack using
| a clean architecture.
| [deleted]
| whiskey14 wrote:
| Can you please give a short summary of why I should give
| C#/.NET a go for my backend services?
|
| I've been fighting battles in Python backend services to get
| nicely decoupled API, logic and DB layers for a while...but
| sqlalchemy, alembic and flask/django/fastAPI are my safety
| blankets
| camdenreslink wrote:
| One reason is that entity framework is the best ORM out
| there. It blows sqlalchemy and alembic out of the water imo
| (I've used both a bunch).
|
| Another reason is that decoupling and adding layers to your
| code is more part of the culture. Look up "domain driven
| design C#" or "onion architecture C#" and there will be a lot
| of resources on how to achieve it. There is stuff out there
| for Python as well (and the concepts translate between
| languages), but not nearly as much.
| megaman821 wrote:
| I haven't played with SQLAlchemy in a while, but I was
| comparing EF core to the Django ORM, and EF core seemed to
| be lacking in features. There were a few things missing but
| the two that come to mind are window functions and CASE
| statements.
| mattgreenrocks wrote:
| The .NET ecosystem is great.
|
| It feels a lot more professional than other ecosystems. For
| example, they actually talk about layering/coupling as
| professionals should! People actually seem to talk about
| architecture as well rather than blindly believing that the
| conventions forced on them by a framework are sufficient
| for all use cases.
|
| I especially like the gradient in the .NET world from micro
| ORMs to full-fledged ORMs. Most ecosystems seem to develop
| a big ORM that constantly accrues features (and bugs) and
| eventually becomes enshrined as a "best practice" because
| it acts as a kitchen sink.
| daxfohl wrote:
| +1 I was much happier using Dapper compared to EF. I
| figure if it's good enough to run stackoverflow, it's
| probably good enough for whatever I happen to be doing.
|
| The amount of open source in dotnet is great. (I think
| more than Java? My impression of that is dominated by
| Apache etc., though my experience in the Java ecosystem
| is limited. Presumably people in Java land would expect
| the same of dotnet being dominated by Microsoft, but
| that's _really_ not the case).
| hakanderyal wrote:
| I was in the same boat. I wanted to switch a few years ago
| actually, but EF core was missing core features I took for
| granted in Sqlalchemy.
|
| As for the reasons:
|
| - Static typing and C# projects make code organization and
| refactoring dead easy.
|
| - Modern C# doesn't require that much boilerplate, and has
| features that allow a developer to speed up development,
| like Python.
|
| - EF Core covers everything I need from SqlAlchemy/Alembic.
|
| - LINQ is an awesome way to work with collections. Type-safe
| DB queries come in handy both when developing and refactoring.
|
| - ASP.NET covers everything I need from flask/Fastapi.
|
| - More speed and lower resource usage is nice.
|
| - Being able to use an IDE with its full power is nice.
|
| My main reason to switch was static typing, and my only
| requirement was a good ORM comparable to SqlAlchemy. The rest
| is just bonus.
| danuker wrote:
| > - More speed and lower resource usage is nice.
|
| > - Being able to use an IDE with its full power is nice.
|
| In my experience, Visual Studio is much slower when
| developing. My Python workflow affords a 200ms
| red-green-refactor loop, while VS is on the order of several
| seconds.
|
| This might not seem like much, but it has a great impact on
| my engagement, flow, and satisfaction.
| hakanderyal wrote:
| Rebuilding the project to run the tests adds a bit of
| time, yes. I see this as a cost of the static typing,
| runtime speed etc.
|
| It's worth it in the end. YMMV.
| danuker wrote:
| I am not so sure it's worth it. In general, developer
| time is much more valuable than machine time.
|
| Maybe if you're building a large high-performance server,
| you should invest in performance. But otherwise, if you
| only look at computational complexity, and batch
| up/avoid I/O when possible, you're fine.
| gmueckl wrote:
| Static typing in .NET saves developer time on a massive
| scale. Sure, the compile times and startup times may be
| longer (and tests may take longer to run for that
| reason), but the languages also allow for editor/IDE
| tooling that boosts developer productivity massively.
| Visual Studio with Resharper or Rider may seem expensive,
| but if you work with these tools full time, they pay for
| their cost multiple times over in almost no time.
| lowbloodsugar wrote:
| I grew up with BASIC and made it to Java by way of
| assembly, C, C++ and C#. This year I put some Rust into
| production tooling. I've used python along the way, but
| usually as a scripting tool. I've never worked at a
| company whose codebase involves a lot of python. So
| beware of confirmation bias in my thinking.
|
| What follows is my opinion, I am aware it is my opinion,
| but in my sphere of influence, it is not up for debate
| when it comes to writing code. It might occasionally be a
| conversation over lunch.
|
| I put python in the same bucket as BASIC. It's not a
| production language. "developer time is much more
| valuable than machine time." Yes. Absolutely. Iteration
| speed is vital. But it is vital in more than just the
| test loop. It is important in the "minor refactoring of
| various classes" up to the "major refactoring of entire
| systems" loop too. And python just doesn't make that
| easy. It actively makes it difficult. It makes
| comprehension difficult. It's difficult to look at python
| code in a code-review and have a good idea of what the
| classes involved are.
| I don't even write scripts in it
| any more. I've found that any script that is worth
| writing is likely to grow and evolve over time, and if it
| is not written in something like C# or Java, then it will
| become an intolerable mess. I've seen entire
| organizations that are basically cargo culting.
|
| I encourage you to learn a statically typed language and
| its tooling.
| yCombLinks wrote:
| All of the benefits of static typing save a ton of
| developer time in other places. No one is talking about
| saving machine time as the primary benefit of static
| typing.
| hakanderyal wrote:
| For me, runtime performance is only a tiny percent of the
| advantages. It could have been slower than Python, I
| would still make the switch.
| kcartlidge wrote:
| > _Rebuilding the project to run the tests adds a bit of
| time, yes._
|
| It costs money, but I've paid for NCrunch for a fair few
| years and find it invaluable for this reason. It doesn't
| even need you to _save_ changes before it spots them and
| runs affected tests in the background.
|
| If cost is an issue you can also start `dotnet watch test`
| going in a terminal/command prompt for non-interactive
| live-reload testing.
| roflyear wrote:
| This is true. Almost all C# projects will take a while
| for you to run. It is unfortunate.
|
| The upside is hopefully you "don't need to run it as many
| times" but ... eh. No thanks.
| kcartlidge wrote:
| There is a delay, true. Inevitable with the compilation
| phase, and I _do_ find it irritating that my Go stuff
| builds so much faster. That said, there's reasonable
| (not perfect) live reloading happening these days which
| helps somewhat.
| dfee wrote:
| My journey went through mypy, then typescript for frontend,
| then typescript on node. The story here was the type system is
| so much better that it allowed better prototyping, larger
| codebases and confidence.
|
| I've done a lot of C# and Java now over the last few years,
| and I don't love their type systems, esp. compared to
| typescript, but they scale much better against large
| codebases - especially with tooling like bazel.
|
| I've been looking at Haskell and Rust a lot to fill this
| intermediary: code that's performant, with a very expressive
| type system.
|
| I maintain(ed) a number of popular python packages, and that
| journey lasted for nearly a decade.
| daxfohl wrote:
| This is my experience, having gone the route you're
| looking at. Haskell (~6 month trial) was unproductive for me.
| Primarily the ecosystem is full of abandonware. Secondarily
| it lures you into spending _way_ too much time refactoring
| stuff into the most concise possible form, which you can no
| longer understand (and frequently needs to be rewritten
| completely because the tiniest change to the most concise
| possible form invariably explodes through several layers
| when you have to make changes later). Rust (~3 month trial)
| may be great for codebases where you'd legitimately
| consider C / C++, but too much work otherwise; I personally
| wasn't doing anything that I'd use C for, so it was not
| worth it.
|
| I ended up being very happy with F# as a middle ground for
| several years, but eventually migrated back to C# as they
| started adding more and more F# features. The primary
| challenge with F# was the parity mismatch with the
| underlying runtime, so you end up having to write a fair
| amount of non-idiomatic F# to interop with common
| libraries. But otherwise it's great. (I also tried Scala
| for a year and hated it: too many ways to do any one
| thing).
| electroly wrote:
| Other commenters have good specific points but I'll add one
| overarching theme: .NET is developed by a well-funded
| corporation that is incentivized to bring all the popular
| innovations from other ecosystems back to the .NET world in a
| cohesive form.
| If something becomes popular in another
| programming ecosystem and people want it, we'll get it in
| .NET and it'll be done in the same style as everything else
| we have. It's pretty refreshing working with a system that
| was designed to work together rather than cobbling bits
| together.
| andrew_eu wrote:
| Before clicking on this, I expected to see import-linter [0]
| which achieves something very similar but with, in my opinion, a
| bit less magic. Another solution in a similar spirit is Pants
| [1], though this is actually a build system which allows you to
| constrain dependencies between different artifacts (e.g. which
| modules are allowed to depend on which modules).
|
| To Sourcery's credit, their product looks much more in the realm
| of "developer experience" -- closer to Copilot (or what I
| understand of it) than to import-linter. Props to them for at
| least having a page about security [2] and building a solution
| which doesn't inherently require all of your source code to be
| shared with a vendor's server.
|
| [0] https://github.com/seddonym/import-linter
|
| [1] https://www.pantsbuild.org/
|
| [2] https://docs.sourcery.ai/Product/Permissions-and-Security/
| memco wrote:
| Thanks for the additional tools to tackle this problem. We
| usually don't have problems with this at work, but I just so
| happened to discover one today and was dreading the work it
| will take to sort out how to fix it.
| revskill wrote:
| In Typescript, I normally just allow interfaces to be dependencies
| between layers. (API, command line programs,..) -> (Services) ->
| (Database).
| Thaxll wrote:
| Your DB / api layer should never touch the same models.
| rekahrv wrote:
| Thanks, that's a good point and perhaps a good topic for a
| future post :-) How to ensure that the API and the db use
| different models even if those models are in the same package?
| dangets wrote:
| I struggle with this also, I assume the answer is to not have
| them in the same package.
| You can also break the application
| into separate 'domain', 'infra', 'application' modules as
| documented in [0] with rules on what dependencies are allowed
| in each module (e.g. domain should not have db or
| serialization implementation). The problem is that this does
| create several adapter layers which adds to the mental
| complexity.
|
| [0] https://learn.microsoft.com/en-us/dotnet/architecture/micros...
| marginalia_nu wrote:
| Why not?
| aobdev wrote:
| I hate that these are called models (probably because they
| extend pydantic's BaseModel), but if they were called Schema or
| Serializers would this still be true? Typically what you see in
| a FastAPI project is a class that parses the request body, and
| the same or slightly modified class that serializes the
| response back out after touching the DB. And this isn't a new
| idea, because Flask+Marshmallow and DRF do the exact same
| thing.
| rekahrv wrote:
| I've used multiple names for similar packages incl. `models`
| and `schemas`. :-) Yes, for this example, I picked `models`
| to follow Pydantic's terminology.
|
| IMO, the FastAPI approach you described makes a lot of sense:
| The "schema" stored in the db and the "schema" returned by
| the API aren't the same, but they are quite similar. They
| have many common properties => They can often have a common
| base class.
| [deleted]
| inwit wrote:
| It's these kinds of rules that mean I'm here wading through 5
| layers of exquisitely decoupled nonsense that could be done in
| a few lines
| Thaxll wrote:
| A convert function between API and DB models does not sound
| complicated.
|
| Storing the API model in your DB is really a bad idea.
| camgunz wrote:
| I just listened to the DHH/Kent Beck/Martin Fowler discussion
| about TDD "damage" and both sides still seemed unconvinced by
| the end of it, but this exact example came up.
| It seems like
| SOA (whether it's DDD or Hexagonal or Clean or w/e) and TDD
| really push you towards this kind of layer bloat for one
| reason or another.
|
| I'm (maybe obviously) on the SOA-skeptic side, my arguments
| generally are:
|
| - Most apps aren't that big and don't need multiple layers of
| abstraction (i.e. the ORM and its models are totally fine).
| If the app starts getting too big for its britches, probably
| the best thing to do is make it 2 apps (too big: 2 apps is a
| good slogan here).
|
| - Dependency injection and mocks are pretty bad ideas that
| are only occasionally useful (DHH uses the example of a
| payments gateway), but mostly push IoC through your whole app
| and make control flow confusingly backwards. Mocks are always
| in disrepair, and almost never accurately reflect what
| they're trying to mock, and thus ironically are big vectors
| for bugs that make it through testing.
|
| - Having tons of unit tests tends to slow eng velocity to a
| crawl, because they test the parts of the application that
| aren't the requirements (were these functions called, what's
| the call signature of this function, was this class
| instantiated, etc.). Unit tests create a super-fine-grained
| shadow spec about the lowest level details of your
| application, and mostly I think they shouldn't ever be
| committed to a repo. They help during individual development,
| but then the whole team is stuck maintaining them forever
| whenever they make changes. They also tend to slow down CI
| because they're slow and always flaky.
|
| - You almost certainly will never need to switch databases,
| let alone abstract across a database, a message queue, and a
| web api. It's not worth doing a "repo" abstraction and
| encapsulating those details.
|
| - There are (now) really good libraries for almost anything
| you want to do. ORMs literally map database entities to
| domain entities--they just abstract the persistence for you.
| Sounds like a repo to me!
| We also have good validation,
| logging, monitoring, auth/auth etc. built into frameworks and
| 3rd party services. A lot of the things you might put into
| other layers or even other services are now neatly packaged
| into libraries/frameworks you can just use and SaaS things
| you can just buy, leaving you free to just implement your
| business logic.
| kortex wrote:
| Agree on the points that you should never need to abstract
| over your database, orm, message queue, etc.
|
| Disagree on dependency injection. I came from the
| globals/patch everything school of python, to the
| Fastapi/Pytest DI flavor, and it's a breath of fresh air.
| It's just so much easier to abstract the IO providers and
| swap them out with objects tailored to the test suite - eg
| for database, I create db objects which roll back any
| transactions between tests.
|
| Hard disagree on unit tests. Maybe in other languages, but
| in Python, trying to develop even a moderately complex app
| without unit tests is a nightmare. I know, I've lived it.
| Even in an app with >85% unit test coverage, there was
| still a ton of friction around development on any of the
| interfaces which had low coverage.
|
| Any gains in velocity of development almost always cost far
| more in debugging down the road.
|
| I love python, but it is really prone to dumb footguns at
| runtime, NoneType errors in particular. You need to impose
| a lot of discipline to make large python apps enjoyable to
| develop on.
| yunohn wrote:
| > leaving you free to just implement your business logic.
|
| Often, engineers (and HN) forget that code is a means to an
| end - not an artistic expression that provides value by its
| pure existence.
| hbrn wrote:
| I mostly agree with you and DHH on that topic, however in
| my experience reasonably applied SOA/DDD actually shields
| me from this layering nonsense.
|
| When your apps live as a service on the network or as a
| nicely isolated module in your repo, you no longer have a
| reason to over-engineer them. You don't need a grandiose
| architecture that solves every problem, instead you can
| make local decisions that are good enough in the specific
| context. Though, admittedly, I found it hard to sell such
| "inconsistencies" to other tech leaders, most folks aspire
| to those grandiosities.
|
| > If the app starts getting too big for its britches,
| probably the best thing to do is make it 2 apps
|
| That's the argument in favor of SOA, isn't it?
| chao- wrote:
| I generally agree with the position that unit tests should
| be used with discretion, and that full coverage via unit
| tests often leads to thousands of low-utility or
| redundant tests, and so on. However I cannot agree with
| this:
|
| > _They also tend to slow down CI because they're slow and
| always flaky._
|
| In my experience, unit tests are the most stable, the least
| flaky, because they touch the least code and often have
| very simple setup. An integration test might rely on four
| database tables being just-so, and go on to connect with
| two external services (and whether mocked, replayed, or
| live, flakiness may arise). That integration test is twenty
| times more valuable, but it is equally more likely to break
| for reasons tangential to its core assertions.
| yuppiepuppie wrote:
| Title is misleading, it should be "Maintain a clean architecture
| in FastAPI with dependency rules"
| rekahrv wrote:
| Thanks, that's a good point. We thought that a small FastAPI
| project shows the general concept as well. Do you have
| suggestions which other examples would be useful?
| lyu07282 wrote:
| It has nothing to do with FastAPI, pretty sure sourcery would
| work with anything.
| w_t_payne wrote:
| I do exactly this in my side project.
| I have a set of rules which
| put restrictions on which packages and modules can be included
| from other packages and modules. For example, a high maturity
| package is not allowed to depend upon a low maturity package.
| Similarly, a core library package is not allowed to rely on a
| package that is specific to a particular product or a particular
| piece of bespoke development. In this way, much of the potential
| for circular dependencies is eliminated, and the purpose and
| intent is clearly communicated.
|
| (I don't do this using sourcery though ... I have my own set of
| rules)
| rekahrv wrote:
| That's very cool. Can you tell a bit more about this set of
| rules?
|
| "For example, a high maturity package is not allowed to depend
| upon a low maturity package. Similarly, a core library package
| is not allowed to rely on a package that is specific to a
| particular product or a particular piece of bespoke
| development." I really like these.
| w_t_payne wrote:
| I have a general system for representing metadata in source
| files. (I use YAML documents embedded in block comments).
|
| Some of this metadata gives traceability information for
| requirements, tests etc., while other metadata enables me to
| associate a maturity level with each file.
|
| My build system understands this metadata and uses it to
| inform e.g. the minimum test coverage that it expects on a
| file-by-file basis.
|
| The same metadata is used to ensure that all of the other
| components that a file references are at the same level of
| maturity or higher.
|
| I also have metadata for each file (partly derived from
| location in the repository) that gives each file a number
| which defines its position in a hierarchy of design
| elements.
|
| The position in the hierarchy helps to indicate what the
| purpose of the file is.
| I use this to make a distinction
| between those core, foundational, stable design elements upon
| which other design elements may build, and those more
| peripheral, ephemeral and 'agile' design elements which can
| be quickly tailored to meet the needs of a client or partner.
|
| This means that a (hopefully stable) core API component can
| be prevented from relying upon a (perhaps less stable)
| bespoke customer-specific component. It also means that
| there's more freedom in changing and adapting peripheral
| designs as you can have confidence that its stability is not
| something that is going to be relied upon.
| rekahrv wrote:
| Thanks for the detailed description. That's a really
| sophisticated system with several cool features.
| * minimum test coverage on a file-by-file basis
| * various levels of maturity
|
| "It also means that there's more freedom in changing and
| adapting peripheral designs as you can have confidence that
| its stability is not something that is going to be relied
| upon." That's a big advantage, indeed.
|
| I also like the concept of storing this metadata next to
| the code in structured comments.
| cjohnson318 wrote:
| I do a similar thing. Here's the style-guide I learned from:
| https://phalt.github.io/django-api-domains/styleguide/
|
| Basically, you have an api class for each Django app, and you
| use this class for all external interactions. The api class
| calls the service class, and the service class deals with the
| Django ORM. I added a view class, which is my
| DjangoRestFramework layer; so when a request comes in, it's
| caught by my view class, and passed on to the api class. I have
| DRF serializers for outgoing data, and pydantic schemas for
| incoming data. I also have a selector class for read-only views
| of my data.
|
| It's a _lot_ of typing, but I know exactly where everything is
| when something goes wrong, or I need to add a small adjustment
| somewhere, also it's easy for new devs to learn and use.
| One downside is that an api change requires you to touch a dozen
| files.
| rekahrv wrote:
| Thanks a lot. The Django API Domains Styleguide looks great.
| Do you perhaps know some open source projects that follow
| this structure?
| mikeholler wrote:
| Is there anything similar to this for Java/Kotlin/Gradle?
| Sankozi wrote:
| There is ArchUnit - https://www.archunit.org/
| mikeholler wrote:
| Thanks, that looks like exactly what I was looking for.
| naizarak wrote:
| JPMS allows for exactly this, but that entire project was so
| poorly implemented that no one uses it.
___________________________________________________________________
(page generated 2022-12-15 23:01 UTC)
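
The layering rule that recurs in this thread (a package may import only from packages at or below its own level) can be sketched with the standard-library `ast` module. This is only an illustration under stated assumptions, not how Sourcery, import-linter, Pants or Bazel work: the layer names `db`, `services` and `api` and the helpers `imported_modules` and `find_violations` are hypothetical, made up for this example.

```python
import ast
from typing import List

# Hypothetical layer names (an assumption), ordered from lowest to highest level.
LAYER_ORDER = ["db", "services", "api"]


def imported_modules(source: str) -> List[str]:
    """Collect the absolute module names imported by a piece of source code."""
    names: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names.append(node.module)
    return names


def find_violations(package: str, source: str) -> List[str]:
    """Return imports in `source` (a file inside `package`) that point at a higher layer."""
    own_rank = LAYER_ORDER.index(package)
    violations = []
    for name in imported_modules(source):
        target = name.split(".")[0]  # top-level package, e.g. 'api' for 'api.routes'
        if target in LAYER_ORDER and LAYER_ORDER.index(target) > own_rank:
            violations.append(name)
    return violations
```

Calling `find_violations("services", source)` for each file of a package, for example in CI, would flag upward imports. Third-party and standard-library imports are ignored because they are not listed in `LAYER_ORDER`, and relative imports are skipped to keep the sketch short.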