[HN Gopher] The return of lazy imports for Python
___________________________________________________________________
The return of lazy imports for Python

Author : mfiguiere
Score  : 49 points
Date   : 2022-12-25 17:18 UTC (5 hours ago)

(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)

| bfung wrote:
| The only argument PEP 690 has is A) [performance on startup] or
| B) [when an imported function is not used] in main body of the code.
|
| For B), easy enough to run one of many linters to detect this
| case and have people write less bad code.
|
| A) is way more subjective and can be fixed in many ways.
|
| With the many more Python coders these days with less coding
| experience, personal feeling is please stop throwing in these
| production-issue-causing features that I have to fix. Glad the
| PEP is rejected.
|
| Old programmer wisdom is to load all your configs and assumptions
| as early as possible to eliminate a whole space of problems with
| your code, making it faster and easier to read/reason about later.
| rrdharan wrote:
| I'm curious about the other ways to fix startup performance.
|
| I've seen a moderately sized (~300k LoC) Python CLI project
| that had a horrendous, anger-inducing startup time until they
| switched to the lazy import approach basically
| described/standardized by PEP 690 and the improvement was
| massive.
| m000 wrote:
| Not a word though about the elephant in the room: circular
| imports.
|
| It is absurdly easy in Python to end up with a circular import
| situation, where no real circular dependency exists. E.g. you
| can't have A.a1 -> B.b1 and B.b2 -> A.a2. So, you are forced to
| lay out your code in some quite awkward ways.
| overgard wrote:
| The only way to really fix that is to disallow code running
| outside of functions, i.e., basically a whole new language.
| capableweb wrote:
| Isn't having circular dependencies more awkward?
| Conceptually, it makes things more intertwined when instead you can
| build a better and more separated architecture.
| marius_k wrote:
| Is there an elegant way to import type hints without circular
| imports?
| [deleted]
| lopz wrote:
| Something like this does the trick:
|
|     if TYPE_CHECKING: import WhateverClass
|
| https://docs.python.org/3/library/typing.html#typing.TYPE_CH...
| nomel wrote:
| I'm a fan of having a single state, used for everything.
| Splitting the code up into two states, one for the linter
| and one for the execution, seems like a recipe for
| incorrectness and confusion. I would hate to refactor
| something like that.
| closed wrote:
| The issue is that sometimes a function can take a type
| that is an optional dependency, so you don't want to
| import it unless you are type checking.
|
| (And some types are defined in the typeshed so only exist
| to be imported during type checking; e.g. the type checker
| lib itself is a dependency in this case)
| [deleted]
| nijave wrote:
| Lazy imports don't really seem that useful. The only time I've
| found them useful (in a Ruby project) was for unit tests/local
| development where only a small subset of the application is
| loaded at a time. For anything long-running you generally want the
| predictability of loading everything up front. For command line
| utilities, it seems like you're going to need to load the module
| at some point or another regardless (if you're actually using it)
| so I'm not sure how you'd see a gain unless there are some
| async/multi-thread hacks.
| [deleted]
| dalke wrote:
| Some package developers don't want their users to have the two
| step process of "import" then "use."
| NumPy imports 137 modules with "import numpy", of which 94 are
| specifically in the NumPy hierarchy:
|
|     >>> import sys
|     >>> len(sys.modules)
|     83
|     >>> import numpy as np
|     >>> len(sys.modules)
|     220
|     >>> sum(1 for k in sys.modules if "numpy" in k)
|     94
|
| so people can write a one-liner like:
|
|     >>> np.polynomial.chebyshev.Chebyshev([0,1,3])(np.linspace(-1.0, 1.0, 5))
|     array([ 2., -2., -3., -1.,  4.])
|
| without having to import np.polynomial.chebyshev.Chebyshev
| first.
|
| This API design requires importing most of NumPy at startup,
| which has a cost they didn't consider so important because
| their users are primarily doing long-term computing and
| notebook-style development, where startup cost is relatively
| small.
|
| I've complained about this because I live in the short-lived
| program world, where it's annoying to have a 0.1 second import
| overhead if I only need one function from NumPy:
|
|     py310% time python -c 'pass'
|     0.025u 0.006s 0:00.03 66.6% 0+0k 0+0io 0pf+0w
|     py310% time python -c 'import numpy'
|     0.142u 0.292s 0:00.14 307.1% 0+0k 0+0io 0pf+0w
|
| As I understand it, SciPy has a similar API design goal, but
| has a lot more packages. They've developed lazy imports to try
| to have the best of both worlds.
|
| > For command line utilities, it seems like you're going to
| need to load the module at some point or another regardless (if
| you're actually using it)
|
| Thing is, you might not actually use it. If the command-line
| tool uses subcommands, each different subcommand might need
| only a subset of the full set of packages.
|
| Perhaps only one of the subcommands uses NumPy, while for 95%
| of uses, NumPy isn't used at all.
|
| As the discussion for this feature points out, this can be
| addressed by only importing when needed. (One of the reasons
| I've started using click over argparse is click does more of
| this separation for me.)
| However, it's somewhat fragile, in that it's easy to add a
| rarely-needed expensive import at top-level without noticing it,
| and requires some non-standard tooling to detect issues, like the
| non-predictability you mentioned.
|
| I personally want something like the lazy-/auto- importer in my
| package, so I can reduce the two step process. My last released
| package used module-level __getattr__ functions, which gets me
| mostly there, except for notebook auto-completion of the lazy
| wrappers. (It works in the command-line shell though.)
|
| I can't import everything on startup because parts of my
| package depend on third-party packages, which might not be
| installed. I instead want to raise an ImportError when those
| lazy objects are accessed. Plus, one of the third-party
| packages is through a Python/Java bridge, which has its own
| startup costs that I want to avoid.
| slaymaker1907 wrote:
| You could do the numpy style API lazily. They would just need
| to expose each API as an object that does the imports dynamically.
| klyrs wrote:
| In really large projects (e.g. SciPy as mentioned in the
| article), lazy imports make sense. Especially with the
| popularity of decorators, importing a file without any apparent
| module-level code will actually need to run a nontrivial amount
| of code. Multiply that by a few thousand files in a library
| with a tree of "from ... import *" and you're looking at
| perhaps seconds of startup time. Lazy importing can
| short-circuit that, but still make symbols available for ease of use.
| kelsolaar wrote:
| Numpy and Matplotlib were quite slow to import also, I
| haven't timed them recently though.
| isitmadeofglass wrote:
| Might not grasp the full context here, but it's trivial to lazily
| import modules in your own code. I know every beginner's guide
| will advise you not to do that, but that's just because it's an
| easy footgun for new programmers.
| If you have some CLI tool that only needs scipy for certain
| subcommands you can just move it to those subcommand calls so
| it's loaded when needed instead of up front.
| Waterluvian wrote:
| As long as you eagerly check if it exists. Don't wait until
| part way through your program to discover dependency issues.
| scott_w wrote:
| There is a downside: manually doing so means your import occurs
| every time you call that function. This would avoid that by
| only importing once lazily.
| ledauphin wrote:
| it's an extra function call, but module imports are cached so
| you're not incurring the actual import cost.
| T-A wrote:
| You can say
|
|     mylib = None
|
| in the global scope and then
|
|     global mylib
|
|     if mylib is None: import mylib
|
| in your function to avoid the extra function call.
| bobbylarrybobby wrote:
| Isn't `is` a function? A cheap function, but still a
| python function
| coredog64 wrote:
| I usually only import argparse inside my 'if __name__ ==
| "__main__"' stanza.
| dalke wrote:
| That's not really the issue with argparse and subcommands.
|
| argparse with subcommands generally requires specifying all
| of the options for all of the subcommands, even if you only
| want one subcommand.
|
| These in turn may require importing subcommand-specific
| modules, to handle things like the right 'type' handler in
| an add_argument() parameter. This callback function might,
| depending on the input value, select one from a dozen
| different additional packages.
|
| It's possible to avoid this, by deferring argument->type
| processing until later, and having a single large module
| containing all of the help strings and epilogs, though this
| will separate your argparse code from your subcommand code,
| and in general make things more complicated. I did this for a
| while.
|
| Alternatively, you can create your own subcommand dispatch
| system using an nargs="?"
| to get the subcommand and an nargs=argparse.REMAINDER to capture
| the rest of the flags, to pass to a new ArgumentParser, and
| develop a top-level --help replacement. I tried this too.
|
| I've since decided to use click, which does a better job at
| compartmentalizing at least this level of subcommand imports.
| Mehdi2277 wrote:
| That trivial way only makes the top-level (shallow) import lazy.
| I don't see a good way to do a lazy deep import. A lot of the
| libraries I import then transitively import hundreds or more
| other files, and I may only need a small subset of those
| transitive imports. The lazy import PEP would have meant that
| whenever the import was finally executed, the imports in that
| file are also lazy and only done if needed.
| overgard wrote:
| Personally, unless it has explicit "lazy" syntax I kind of hate
| the idea. One thing I always liked about Python was how
| predictable and simple module imports are.
| slaymaker1907 wrote:
| I think such a mechanism already exists. You can just use the
| functional import syntax inside of a block. However, I think
| lazy imports could be ok so long as the language can show that
| a particular module is side effect free (i.e. no globals aside
| from something like constexpr).
| masklinn wrote:
| > You can just use the functional import syntax inside of a
| block.
|
| You don't even need functional import syntax, but as TFA
| notes this comes at a cost as it has to invoke the entire
| import machinery, which it can only skip to an extent (once a
| module is loaded and cached) as import hooks can have odd
| behaviours.
| code_runner wrote:
| Doubtful lazy imports would've helped at all... I joined a
| project where almost every import statement has side effects,
| some of which took multiple minutes to read things into memory
| etc.
|
| Tried some poor-man's debugging and never hit a breakpoint on the
| first significant line of code... took a while to figure out as
| it was my first Python project.
|
| It almost feels like Python needs a scripting and non-scripting
| mode, or some kind of warning logging "you did everything wrong"
| deepsun wrote:
| Python is a very good scripting language, so good that people
| sometimes mistake it for an application language.
| rtzuul wrote:
| The import code in Python is a mess, probably made slower since
| the advent of importlib.
|
| You need developers who care about fast, clean code to fix the
| issue. Those kinds of developers usually don't fare well in the
| Python swamp, so it won't happen.
| tersers wrote:
| I get an impression that regardless of topic, it's difficult for
| any decisions to be made for the future of Python? The discussion
| seems to always revolve around "what if?"s in not the most
| collaborative fashion. I wonder if what most languages need are
| fewer experts in computer science or language theory or anything
| technical, and more folks that can do facilitation.
| setr wrote:
| Why do you think managers take over everything? Regardless of
| topic, any sufficiently large problem eventually becomes
| primarily a coordination problem
| epgui wrote:
| Personally, I'd much rather languages be designed from
| mathematical foundations and/or very careful theory.
| CoastalCoder wrote:
| There certainly are languages like that, but I think you'll
| find there are tradeoffs to consider. Especially in
| commercial software.
| dragonwriter wrote:
| Python decision-making is rather conservative, preferring deep
| exploration of implications, because it's a big, established
| language with a lot of existing use to support, and because of
| the 2->3 experience.
|
| I don't think lack of facilitation skill is an issue; it's a
| deliberate policy choice.
| estebank wrote:
| Is there _any_ production ready language that isn't
| conservative with its decision making?
| dragonwriter wrote:
| > Is there any production ready language that isn't
| conservative with its decision making?
|
| There are variations of degree, but probably not. Part of
| being production-ready is stability.
| A4ET8a8uTh0 wrote:
| The what if is a valid question I think. In my little corner of
| the universe, my boss is genuinely (and the more I think about
| it, reasonably) worried about introducing more dependency on
| Python in our daily work.
|
| There are a lot of reasons not to introduce it, but 'what ifs' at
| a company like ours could be devastating. I still think proper
| precautions can be taken, but it is harder for me to say that I
| would just say yes if I was in his shoes.
| wpietri wrote:
| I really appreciate both the good writeup here and the fact that
| so many people are thinking through changes so carefully.
| pard68 wrote:
| I didn't understand this when it first came up and I still don't.
| If you want to defer your imports then wait until it's needed. It
| might be useful if you need to load a behemoth of a module for
| some rarely used part of a CLI tool. Otherwise, an X ms load at
| startup is hardly any different than the same X ms load in the
| middle of the execution. And on a server it's actually worse.
| [deleted]
___________________________________________________________________
(page generated 2022-12-25 23:00 UTC)
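
dalke's comment in the thread describes deferring imports with module-level getattr functions and raising ImportError only when a lazy object is accessed. A minimal sketch of that idea follows, assuming Python 3.7+ (PEP 562). The names lazy_pkg, _LAZY, and _module_getattr are hypothetical and not taken from any commenter's code; the module object is built in-process only so the example runs on its own, whereas a real package would presumably define __getattr__ directly in its __init__.py.

```python
import importlib
import types

# Hypothetical package, built in-process so the sketch is self-contained.
# In a real package, __getattr__ would be defined in __init__.py instead.
lazy_pkg = types.ModuleType("lazy_pkg")

# Hypothetical table: public attribute name -> module to import on demand.
# The second entry stands in for an optional dependency that is not installed.
_LAZY = {"json": "json", "missing": "no_such_module_for_this_sketch"}


def _module_getattr(name):
    """Module-level __getattr__ (PEP 562): runs only when normal lookup fails."""
    if name not in _LAZY:
        raise AttributeError(f"module 'lazy_pkg' has no attribute {name!r}")
    # The import happens here, on first access. A missing optional
    # dependency surfaces as ImportError at this point, not at startup.
    module = importlib.import_module(_LAZY[name])
    setattr(lazy_pkg, name, module)  # cache so later lookups skip __getattr__
    return module


lazy_pkg.__getattr__ = _module_getattr

print("json" in vars(lazy_pkg))     # checked before any access
print(lazy_pkg.json.dumps([1, 2]))  # first access triggers the import
print("json" in vars(lazy_pkg))     # checked again after the access
```

Caching the module on the package after the first lookup means the hook runs once per name. Someone who prefers Waterluvian's advice to check dependencies eagerly could probably pair this with importlib.util.find_spec at startup, which locates a module without executing it.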