[HN Gopher] The return of lazy imports for Python
       ___________________________________________________________________
        
       The return of lazy imports for Python
        
       Author : mfiguiere
       Score  : 49 points
       Date   : 2022-12-25 17:18 UTC (5 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | bfung wrote:
       | The only argument PEP 690 has is A) [performance on startup] or
       | B) [when imported functions are not used] in the main body of
       | the code.
       | 
       | For B), easy enough to run one of many linters to detect this
       | case and have people write less bad code.
       | 
       | A) is way more subjective and can be fixed in many ways.
       | 
       | With the many more Python coders these days with less coding
       | experience, personal feeling is please stop throwing these
       | production issue causing features in that I have to fix. Glad the
       | PEP is rejected.
       | 
       | Old programmer wisdom is to load all your configs and assumptions
       | as early as possible to eliminate a whole space of problems with
       | your code, making it faster and easier to read/reason about later.
        
         | rrdharan wrote:
         | I'm curious about the other ways to fix startup performance.
         | 
         | I've seen a moderately sized (~300k LoC) Python CLI project
         | that had a horrendous, anger-inducing startup time until they
         | switched to the lazy import approach basically
         | described/standardized by PEP 690 and the improvement was
         | massive.
        
       | m000 wrote:
       | Not a word though about the elephant in the room: circular
       | imports.
       | 
       | It is absurdly easy in Python to end up with a circular import
       | situation, where no real circular dependency exists. E.g. you
       | can't have A.a1 -> B.b1 and B.b2 -> A.a2. So, you are forced to
       | layout your code in some quite awkward ways.
        
         | overgard wrote:
         | The only way to really fix that is to disallow code running
         | outside of functions, IE, basically a whole new language.
        
         | capableweb wrote:
         | Isn't having circular dependencies more awkward? Conceptually,
         | it makes things more intertwined when instead you can build a
         | better and more separated architecture.
        
           | marius_k wrote:
           | Is there an elegant way to import type hints without circular
           | imports?
        
             | lopz wrote:
             | Something like this does the trick:
             | 
             | if TYPE_CHECKING: import WhateverClass
             | 
             | https://docs.python.org/3/library/typing.html#typing.TYPE_C
             | H...
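
A minimal sketch of the TYPE_CHECKING pattern mentioned above. `Decimal` is only a stand-in for a class from a module you want to keep out of the runtime import graph, and `double` is a hypothetical function. The annotation is quoted so it is never evaluated at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only a static type checker follows this import. At runtime
    # TYPE_CHECKING is False, so the import is skipped and cannot
    # take part in an import cycle.
    from decimal import Decimal  # stand-in for "WhateverClass"


def double(value: "Decimal") -> "Decimal":
    """Annotated with Decimal although Decimal is not imported at runtime."""
    return value + value
```

The trade-off raised in the replies still applies: the name exists for the type checker but not for the running program, so any runtime use of `Decimal` here would raise a NameError.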
        
               | nomel wrote:
               | I'm a fan of having a single state, used for everything.
               | Splitting the code up into two states, one for the linter
               | and one for the execution, seems like a recipe for
               | incorrectness and confusion. I would hate to refactor
               | something like that.
        
               | closed wrote:
               | The issue is that sometimes a function can take a type
               | that is an optional dependency, so you don't want to
               | import it unless you are type checking.
               | 
               | (And some types are defined in the typeshed so only exist
               | to be imported during type checking; eg the type checker
               | lib itself is a dependency in this case)
        
       | nijave wrote:
       | Lazy imports don't really seem that useful. The only time I've
       | found them useful (in a Ruby project) was for unit tests/local
       | development where only a small subset of the application is
       | loaded at a time. Anything long running you generally want the
       | predictability of loading everything up front. For command line
       | utilities, it seems like you're going to need to load the module
       | at some point or another regardless (if you're actually using it)
       | so I'm not sure how you'd see a gain unless there's some
       | async/multi thread hacks.
        
         | dalke wrote:
         | Some package developers don't want their users to have the two
         | step process of "import" then "use." NumPy imports 137 modules
         | with "import numpy", of which 94 are specifically in the NumPy
         | hierarchy:
         | 
         |     >>> import sys
         |     >>> len(sys.modules)
         |     83
         |     >>> import numpy as np
         |     >>> len(sys.modules)
         |     220
         |     >>> sum(1 for k in sys.modules if "numpy" in k)
         |     94
         | 
         | so people can write a one-liner like:
         | 
         |     >>> np.polynomial.chebyshev.Chebyshev([0,1,3])(np.linspace(-1.0, 1.0, 5))
         |     array([ 2., -2., -3., -1.,  4.])
         | 
         | without having to import np.polynomial.chebyshev.Chebyshev
         | first.
         | 
         | This API design requires importing most of NumPy at startup,
         | which has a cost they didn't consider so important because
         | their users are primarily doing long-term computing and
         | notebook-style development, where startup cost is relatively
         | small.
         | 
         | I've complained about this because I live in the short-lived
         | program world, where it's annoying to have a 0.1 second import
         | overhead if I only need one function from NumPy:
         | 
         |     py310% time python -c 'pass'
         |     0.025u 0.006s 0:00.03 66.6% 0+0k 0+0io 0pf+0w
         |     py310% time python -c 'import numpy'
         |     0.142u 0.292s 0:00.14 307.1% 0+0k 0+0io 0pf+0w
         | 
         | As I understand it, SciPy wants a similar API design goal, but
         | has a lot more packages. They've developed lazy imports to try
         | to have the best of both worlds.
         | 
         | > For command line utilities, it seems like you're going to
         | need to load the module at some point or another regardless (if
         | you're actually using it)
         | 
         | Thing is, you might not actually use it. If the command-line
         | tool uses subcommands, each different subcommand might need
         | only a subset of the full set of packages.
         | 
         | Perhaps only one of the subcommands uses NumPy, while for 95%
         | of uses, NumPy isn't used at all.
         | 
         | As the discussion for this feature points out, this can be
         | addressed by only importing when needed. (One of the reasons
         | I've started using click over argparse is click does more of
         | this separation for me.) However, it's somewhat fragile, in
         | that it's easy to add a rarely-needed expensive import at top-
         | level without noticing it, and requires some non-standard
         | tooling to detect issues, like the non-predictability you
         | mentioned.
         | 
         | I personally want something like the lazy-/auto- importer in my
         | package, so I can reduce the two step process. My last package
         | released used module-level getattr functions, which gets me
         | mostly there, except for notebook auto-completion of the lazy
         | wrappers. (It works in the command-line shell though.)
         | 
         | I can't import everything on startup because parts of my
         | package depend on third-party packages, which might not be
         | installed. I instead want to raise an ImportError when those
         | lazy objects are accessed. Plus, one of the third-party
         | packages is through a Python/Java bridge, which has its own
         | startup costs that I want to avoid.
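
A sketch of the module-level `__getattr__` approach (PEP 562) described above. It assumes a hypothetical package `mypkg`, built in memory here so the example runs as-is; in a real package the function would simply be named `__getattr__` in `mypkg/__init__.py`. The entries in `_LAZY` are illustrative, and `not_installed_dependency_xyz` is assumed to be absent.

```python
import importlib
import types

# Hypothetical package created in memory to keep the sketch self-contained.
pkg = types.ModuleType("mypkg")

# Attribute name on the package -> module to import on first access.
_LAZY = {
    "json_tools": "json",
    "missing_extra": "not_installed_dependency_xyz",  # assumed not installed
}


def _lazy_getattr(name):
    """Called only when normal attribute lookup on the module fails."""
    target = _LAZY.get(name)
    if target is None:
        raise AttributeError(f"module 'mypkg' has no attribute {name!r}")
    module = importlib.import_module(target)  # ImportError surfaces here
    setattr(pkg, name, module)  # cache, so later lookups skip this hook
    return module


# Python looks up "__getattr__" in the module's namespace (PEP 562).
pkg.__getattr__ = _lazy_getattr
```

With this in place, `pkg.json_tools` triggers the import on first use, while `pkg.missing_extra` raises ImportError only when it is actually accessed.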
        
           | slaymaker1907 wrote:
           | You could do the numpy style API lazily. They would just need
           | to expose each API as an object that does the imports
           | dynamically.
        
         | klyrs wrote:
         | In really large projects (e.g. SciPy as mentioned in the
         | article), lazy imports make sense. Especially with the
         | popularity of decorators, importing a file without any apparent
         | module-level code will actually need to run a nontrivial amount
         | of code. Multiply that by a few thousand files in a library
         | with a tree of "import * from ..." and you're looking at
         | perhaps seconds of startup time. Lazy importing can short-
         | circuit that, but still make symbols available for ease of use.
        
           | kelsolaar wrote:
           | Numpy and Matplotlib were quite slow to import also, I
           | haven't timed them recently though.
        
       | isitmadeofglass wrote:
       | Might not grasp the full context here, but it's trivial to lazily
       | import modules in your own code. I know every beginner's guide
       | will advise you not to do that, but that's just because it's an
       | easy footgun for new programmers. If you have some cli tool that
       | only needs scipy for certain sub commands you can just move it to
       | those subcommand calls so it's loaded when needed instead of up
       | front.
        
         | Waterluvian wrote:
         | As long as you eagerly check if it exists. Don't wait until
         | part way through your program to discover dependency issues.
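
One way to combine the two suggestions above, as a rough sketch: defer the import into the function that needs it, but check at startup that the dependency can be found. `require` and `run_stats` are hypothetical names, and `statistics` stands in for a heavy dependency such as scipy.

```python
import importlib.util


def require(module_name):
    """Fail fast if a top-level module cannot be found, without importing it."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(f"missing dependency: {module_name}")


def run_stats(values):
    # Deferred import: the module is only loaded when this subcommand runs.
    import statistics  # stand-in for an expensive import

    return statistics.mean(values)


# At program startup: verify the dependency exists, but do not load it yet.
require("statistics")
```

Note that `find_spec` only confirms the module can be located; an import that is found but fails while executing would still surface later.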
        
         | scott_w wrote:
         | There is a downside: manually doing so means your import occurs
         | every time you call that function. This would avoid that by
         | only importing once lazily.
        
           | ledauphin wrote:
           | it's an extra function call, but module imports are cached so
           | you're not incurring the actual import cost.
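
A small illustration of the caching point, using `fractions` as an arbitrary standard-library module and a hypothetical `use_fractions` function. The import statement runs on every call, but after the first call it resolves to the module object already stored in `sys.modules`.

```python
import sys


def use_fractions():
    # Function-local import: executed on each call, but only the first
    # call loads the module. Later calls find it in sys.modules.
    import fractions

    return fractions.Fraction(1, 3)


use_fractions()
cached = sys.modules["fractions"]
use_fractions()
# The second call did not create a new module object.
same_object = sys.modules["fractions"] is cached
```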
        
             | T-A wrote:
             | You can say
             | 
             | mylib = None
             | 
             | in the global scope and then
             | 
             | global mylib
             | if mylib is None:
             |     import mylib
             | 
             | in your function to avoid the extra function call.
        
               | bobbylarrybobby wrote:
               | Isn't `is` a function? A cheap function, but still a
               | python function
        
         | coredog64 wrote:
         | I usually only import argparse inside my 'if __name__ ==
         | "__main__"' stanza.
        
           | dalke wrote:
           | That's not really the issue with argparse and subcommands.
           | 
           | argparse with subcommands generally requires specifying all
           | of the options for all of the subcommands, even if you only
           | want one subcommand.
           | 
           | These in turn may require importing subcommand-specific
           | modules, to handle things like the right 'type' handler in an
           | add_argument() parameter. This callback function might,
           | depending on the input value, select one from a dozen
           | different additional packages.
           | 
           | It's possible to avoid this, by deferring argument->type
           | processing until later, and having a single large module
           | containing all of the help strings and epilogs, though this
           | will separate your argparse code from your subcommand code,
           | and in general make things more complicated. I did this for a
           | while.
           | 
           | Alternatively, you can create your own subcommand dispatch
           | system using an nargs="?" to get the subcommand and an
           | nargs=argparse.REMAINDER to capture the rest of the flags, to
           | pass to a new ArgumentParser, and develop a top-level --help
           | replacement. I tried this too.
           | 
           | I've since decided to use click, which does a better job at
           | compartmentalizing at least this level of subcommand imports.
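
The hand-rolled dispatch described above might look roughly like this. It is a sketch, not the commenter's actual code: the `tool` program name, the `stats` and `hello` subcommands, and the `COMMANDS` table are all made up for illustration, and `statistics` stands in for an expensive subcommand-specific import.

```python
import argparse


def cmd_stats(argv):
    import statistics  # stand-in for an expensive import used by one subcommand

    parser = argparse.ArgumentParser(prog="tool stats")
    parser.add_argument("values", nargs="+", type=float)
    args = parser.parse_args(argv)
    return statistics.mean(args.values)


def cmd_hello(argv):
    parser = argparse.ArgumentParser(prog="tool hello")
    parser.add_argument("--name", default="world")
    args = parser.parse_args(argv)
    return f"hello {args.name}"


COMMANDS = {"stats": cmd_stats, "hello": cmd_hello}


def main(argv):
    # The top-level parser only picks the subcommand and passes the rest on,
    # so no subcommand's options (or imports) are touched until it is chosen.
    top = argparse.ArgumentParser(prog="tool", add_help=False)
    top.add_argument("command", nargs="?", choices=sorted(COMMANDS))
    top.add_argument("rest", nargs=argparse.REMAINDER)
    args = top.parse_args(argv)
    if args.command is None:
        # Minimal replacement for the top-level --help output.
        return "usage: tool {" + ",".join(sorted(COMMANDS)) + "} ..."
    return COMMANDS[args.command](args.rest)
```

For example, `main(["hello", "--name", "HN"])` builds only the `hello` parser, and `statistics` is never imported on that path.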
        
         | Mehdi2277 wrote:
         | That trivial way only makes the top-level import lazy; it is
         | shallow. I don't see a good way to do a lazy deep import. A lot
         | of libraries I import then transitively import hundreds or more
         | other files, and I may only need a small subset of those
         | transitive imports. The lazy import PEP would have meant that
         | whenever the import was finally executed, the imports in that
         | file are also lazy and only done if needed.
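
For a single module, the standard library's `importlib.util.LazyLoader` can defer execution until the first attribute access. The sketch below follows the recipe in the importlib documentation; `lazy_import` is a helper name, and `colorsys` is an arbitrary small module chosen for the demonstration. It does not address the transitive case raised above: once the module finally executes, its own import statements run eagerly.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code runs only on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]  # already loaded, nothing to defer
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"cannot find module {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness, does not run the module
    return module


colorsys = lazy_import("colorsys")
# The module body executes here, on first attribute access.
hsv = colorsys.rgb_to_hsv(1, 0, 0)
```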
        
       | overgard wrote:
       | Personally, unless it has explicit "lazy" syntax I kind of hate
       | the idea. One thing I always liked about python was how
       | predictable and simple module imports are.
        
         | slaymaker1907 wrote:
         | I think such a mechanism already exists. You can just use the
         | functional import syntax inside of a block. However, I think
         | lazy imports could be ok so long as the language can show that
         | a particular module is side effect free (i.e. no globals aside
         | from something like constexpr).
        
           | masklinn wrote:
           | > You can just use the functional import syntax inside of a
           | block.
           | 
           | You don't even need functional import syntax, but as TFA
           | notes this comes at a cost as it has to invoke the entire
           | import machinery, which it can only skip to an extent (once a
           | module is loaded and cached) as import hooks can have odd
           | behaviours.
        
       | code_runner wrote:
       | Doubtful lazy imports would've helped at all... I joined a
       | project where almost every import statement has side effects,
       | some of which took multiple minutes to read things into memory
       | etc.
       | 
       | Tried some poor-man's debugging and never hit a breakpoint on the
       | first significant line of code... took a while to figure out as
       | it was my first Python project.
       | 
       | It almost feels like Python needs a scripting and non-scripting
       | mode, or some kind of warning logging "you did everything wrong"
        
         | deepsun wrote:
         | Python is a very good scripting language, so good that people
         | sometimes mistake it for an application language.
        
       | rtzuul wrote:
       | The import code in Python is a mess, probably made slower since
       | the advent of importlib.
       | 
       | You need developers who care about fast, clean code to fix the
       | issue. Those kind of developers usually don't fare well in the
       | Python swamp, so it won't happen.
        
       | tersers wrote:
       | I get an impression that regardless of topic, it's difficult for
       | any decisions to be made for the future of Python? The discussion
       | seems to always revolve around "what if?"s in not the most
       | collaborative fashion. I wonder if what most languages need are
       | fewer experts in computer science or language theory or anything
       | technical, and more folks that can do facilitation.
        
         | setr wrote:
         | Why do you think managers take over everything? Regardless of
         | topic, any sufficiently large problem eventually becomes
         | primarily a coordination problem
        
         | epgui wrote:
         | Personally, I'd much rather languages be designed from
         | mathematical foundations and/or very careful theory.
        
           | CoastalCoder wrote:
           | There certainly are languages like that, but I think you'll
           | find there are tradeoffs to consider. Especially in
           | commercial software.
        
         | dragonwriter wrote:
         | Python decision-making is rather conservative, preferring deep
         | exploration of implications, because it's a big, established
         | language with a lot of existing use to support, and because of
         | the 2->3 experience.
         | 
         | I don't think lack of facilitation skill is an issue; it's a
         | deliberate policy choice.
        
           | estebank wrote:
           | Is there _any_ production ready language that isn't
           | conservative with its decision making?
        
             | dragonwriter wrote:
             | > Is there any production ready language that isn't
             | conservative with its decision making?
             | 
             | There's variations of degree, but probably not. Part of
             | being production-ready is stability.
        
         | A4ET8a8uTh0 wrote:
         | The what if is a valid question I think. In my little corner of
         | the universe, my boss is genuinely ( and the more I think about
         | it, reasonably ) worried about introducing more dependency on
         | Python in our daily work.
         | 
         | There are a lot of reasons not to introduce it, but 'what ifs' at
         | a company like ours could be devastating. I still think proper
         | precautions can be taken, but it is harder for me to say that I
         | would just say yes if I was in his shoes.
        
       | wpietri wrote:
       | I really appreciate both the good writeup here and the fact that
       | so many people are thinking through changes so carefully.
        
       | pard68 wrote:
       | I didn't understand this when it first came up and I still don't.
       | If you want to defer your imports then wait until it's needed. It
       | might be useful if you need to load a behemoth of a module for
       | some rarely used part of a CLI tool. Otherwise, an X ms load at
       | startup is hardly any different than the same X ms load in the
       | middle of the execution. And on a server it's actually worse.
        
       ___________________________________________________________________
       (page generated 2022-12-25 23:00 UTC)