[HN Gopher] Modern Python Performance Considerations
       ___________________________________________________________________
        
       Modern Python Performance Considerations
        
       Author : chmaynard
       Score  : 226 points
       Date   : 2022-05-05 12:50 UTC (10 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | kzrdude wrote:
        | Faster-cpython is not the main topic here, but it's certainly
        | welcome, since CPython is the most-used Python implementation.
        | They've done great things so far. Though I remember hearing the
        | promise of a 50% improvement in each of five separate steps :)
        
       | joncatanio wrote:
       | This is a great read, and it's fantastic to see all the work
       | being done to evaluate and improve the language!
       | 
        | The dynamic nature of the language is actually something that I
        | studied a few years back [1], particularly the variable and
        | object attribute lookups! My work was just a master's thesis, so
        | we didn't go too deep into the trickier dynamic aspects of the
        | language (e.g. eval, which we restricted entirely). But we did
        | see performance improvements by restricting the language in
        | certain ways that aid in static analysis, which allowed for more
        | performant runtime code. For those interested, the abstract
        | of my thesis [2] gives more insight into what we were evaluating.
       | 
       | Our results showed that restricting dynamic code (code that is
       | constructed at run time from other source code) and dynamic
       | objects (mutation of the structure of classes and objects at run
       | time) significantly improved the performance of our benchmarks.
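        | 
        | To make those two categories concrete, a minimal illustration
        | (not code from the thesis):
        | 
        |     # dynamic code: source constructed and run at run time
        |     op = "+"
        |     result = eval(f"2 {op} 3")
        | 
        |     # dynamic objects: structure mutated at run time
        |     class Point:
        |         def __init__(self, x, y):
        |             self.x, self.y = x, y
        | 
        |     p = Point(1, 2)
        |     p.z = 3                   # new attribute on one instance
        |     Point.norm = lambda s: (s.x**2 + s.y**2) ** 0.5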
       | 
        | There was also some great discussion on HN when I posted our
        | findings [3].
       | 
       | [1]: https://github.com/joncatanio/cannoli
       | 
       | [2]: https://digitalcommons.calpoly.edu/theses/1886/
       | 
       | [3]: https://news.ycombinator.com/item?id=17093051
        
         | Animats wrote:
         | _But we did see performance improvements by restricting the
         | language in certain ways that aid in static analysis, which
         | allowed for more performant runtime code._
         | 
         | Well, yes. In Python, one thread can monkey-patch the code in
         | another thread while running. That feature is seldom used. In
         | CPython, the data structures are optimized for that.
         | Underneath, everything is a dict. This kills most potential
         | optimizations, or even hard-code generation.
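          | 
          | A minimal illustration of what the interpreter has to allow
          | for at any moment:
          | 
          |     class C:
          |         def f(self):
          |             return 1
          | 
          |     c = C()
          |     C.f = lambda self: 2  # another thread can do this anytime
          |     c.f()                 # now 2; every call has to re-check
          | 
          |     c.__dict__            # instance attributes are a dict
          |     vars(C)               # class namespace is a dict (proxy)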
         | 
         | It's possible to deal with that efficiently. PyPy has a
         | compiler, an interpreter, and something called the "backup
         | interpreter", which apparently kicks in when the program being
         | run starts doing weird stuff that requires doing everything in
         | dynamic mode.
         | 
         | I proposed adding "freezing", immutable creation, to Python in
         | 2010, as a way to make threads work without a global lock.[1]
         | Guido didn't like it. Threads in Python still don't do much for
         | performance.
         | 
         | [1]
         | http://www.animats.com/papers/languages/pythonconcurrency.ht...
        
           | chrisseaton wrote:
           | > This kills most potential optimizations, or even hard-code
           | generation.
           | 
           | It doesn't - this has been a basically solved problem since
           | Self and deoptimisation were invented.
        
             | Animats wrote:
             | In theory, yes. In CPython, apparently not. In PyPy,
             | yes.[1] PyPy has to do a lot of extra work to permit some
             | unlikely events.
             | 
             | [1] https://carolchen.me/blog/jits-impls/
        
               | [deleted]
        
               | chrisseaton wrote:
               | You're trying to correct me by posting my own mentee's
               | blog post at me.
        
       | jerf wrote:
       | "Those techniques are based on the idea that most code "does not
       | use the full dynamic power that it could at any given time" and
       | that Python can quickly check to see if they are using the
       | dynamic features."
       | 
       | If anyone has a burning desire to try to write the next big
       | dynamically-typed scripting language, I've often noodled in my
       | head with the idea of a language that has a dynamically-typed
       | startup phase, but at some point you call "DoneBeingDynamic()" on
       | something (program, module, whatever, some playing would have to
       | be done here) and the dynamic system basically freezes everything
       | into place and becomes a static system. (Or you have an explicit
       | startup phase to your module, or something like that.)
       | 
        | The core observation I'm basing this on is much the same as the
       | quote I give from the article. You generally set up the vast
       | majority of your "dynamicness" once at runtime, e.g., you set up
       | your monkeypatches, you read the tables out of the DB to set up
       | your active classes, you read the config files and munge together
       | the configurations, etc. But then forever after, your dynamic
       | language is constantly reading this stuff, over and over and
       | _over and over_ again, millions, billions, trillions of times,
       | with it never changing. But it has to be read for the language to
       | work.
       | 
       | Combine that with perhaps some work on a system that backs to a
       | struct-like representation of things rather than a hash-like
       | representation, and you might be able to build something that
       | gets, say, 80% of the dynamicness of a 1990s-era dynamic
       | scripting language, while performing at something more like
        | compiled language speeds, albeit with a startup cost. If you
        | could skip over the dozens of operations a dynamically-typed
        | language like Python needs to properly resolve
        x.y.z.q = 25
        | 
        | and get down to a runtime that can do the same thing compiled
        | languages do, pre-computing the offset into a struct and just
        | setting the value, you might get near static-language
        | performance with dynamic-typing affordances.
       | 
       | You can also view this as a Lisp-like thing that has an
       | integrated phase where it has macros, but then at some point puts
       | this capability down.
       | 
        | I tend to think it's just fundamentally flawed to take a language
        | that is intrinsically defined such that "x.y.z.q" requires dozens
        | of runtime operations, versus defining a new one where it is a
        | first-class priority from day one that the system be able to
        | resolve "x.y.z.q" down to some static understanding of what it
        | is. E.g., it's OK if y is a property and z is some fancy override
       | if the runtime can simply hardcode the relevant details instead
       | of having to resolve them every time. You can outrun even JIT-
       | like optimizations if you can get this down to the point where
       | you don't even have to check incoming types, you just know.
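        | 
        | The closest existing analogue in Python is probably __slots__,
        | which fixes the attribute layout up front (though without the
        | dynamic startup phase this idea calls for):
        | 
        |     class PointFrozen:
        |         __slots__ = ("x", "y", "z", "q")  # fixed layout, no __dict__
        | 
        |     p = PointFrozen()
        |     p.x = 25   # stored in a fixed slot, not a dict entry
        |     p.w = 1    # AttributeError: layout fixed at class creation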
        
         | marcosdumay wrote:
          | I disagree. You are just doing those same optimizations by
          | hand, instead of letting a JIT do them. The computer is there
          | to help us, and a lot of the value in a dynamic language comes
          | from being able to override things at any time.
          | 
          | If you just set your structure up and run it statically, you
          | are better off with a static language, which can extract all
          | kinds of value from that fixed structure.
        
         | borodi wrote:
          | This feels like you are describing Julia, startup cost
          | included :).
        
         | coldtea wrote:
          | > _I've often noodled in my head with the idea of a language
         | that has a dynamically-typed startup phase, but at some point
         | you call "DoneBeingDynamic()" on something (program, module,
         | whatever, some playing would have to be done here) and the
         | dynamic system basically freezes everything into place and
         | becomes a static system. (Or you have an explicit startup phase
         | to your module, or something like that.)_
         | 
         | V8 tries to guess that for classes and objects based on runtime
         | information - that's how it gets some of its speed (it still
         | needs checks about whether this is violated at any point, so
         | that it can get rid of the proxy/stub "static" object it
         | guessed).
         | 
          | For a more static guarantee, there are also things like
          | Object.freeze, which does roughly what you describe for
          | dynamic objects in JS (#).
         | 
         | # https://gist.github.com/briancavalier/3772938
        
           | jerf wrote:
           | I'd be curious to see if a language developed with the idea
           | that this is what it's going to do from scratch could do
           | better than trying to bodge it on afterwards. Rather than
           | pecking around what could be done literally decades after the
           | language is specified, what if you started out with this
           | idea?
           | 
           | I dunno. It's possible the real world would stomp all over
           | this idea in practice, or the resulting language would just
           | be too complex to be usable. It does imply a rather weird
           | bifurcation between being in "the init phase" and "the normal
           | runtime phase", and who knows what other "phases" could
            | emerge. Technically, Perl actually already has this split,
            | though it can generally be ignored there; it's of much less
            | consequence in Perl because there mostly isn't much utility
            | in having something done in the earlier phase, unlike in
            | this hypothetical language.
        
             | gpderetta wrote:
              | It seems that Lisp-like macros, or more generally
              | multistage compilation, are close to what you have in mind.
        
               | jerf wrote:
               | Yes, it's not a brand-new dimension of programming
               | languages, merely a refinement of existing ideas. However
               | I'm not aware of anything quite like it out there. Lisp
               | could be used to implement it, but, I mean, that's not a
               | very strong statement now is it? Lisp can be used to
               | implement anything. The question is about whether it
               | exists.
               | 
               | Partially I throw this idea out as a bone to those who
               | like dynamic languages. Personally I don't have this
               | problem anymore because I've basically given them up,
               | except in cases where the problem is too small to matter.
               | And if you already know and like Lisp, you don't really
               | have this problem either.
               | 
               | But if you are a devotee of the 1990s dynamic scripting
               | languages, you're getting really squeezed right now by
               | performance issues. You can run 40-50x slower than C, or
               | you can run circa 10x slower than C with an amazing JIT
               | that requires a ton of effort and will forever be very
               | quirky with performance, and in both cases you'll be
               | doing quite a lot of work to use more than one core at a
               | time. Python is just hanging in there with the amazing
               | amount of work being poured into NumPy, and from what I
               | gather from my limited interactions with data scientists,
               | as data sets get larger and the pipelines more complex,
                | the odds you'll fall out of what NumPy can do and fall
                | back to pure Python go up, and the price of that goes up
                | too.
               | 
               | I think a new dynamic scripting language built from the
               | ground up to be multithreadable and high performance via
               | some techniques like this would have some room to run,
               | and while hordes of people will come out of the woodwork
               | promising that one of the existing ones will get there
               | Real Soon Now, just wait, they've almost got it, the
               | reality is I think that the current languages have pretty
               | much been pushed as far as they can be. Unless someone
               | writes this language, dynamic scripting languages are
               | going to continue slowly, quite slowly, but also quite
               | surely, just getting squeezed out of computing entirely.
               | I mean, I'm looking at the road ahead and I'm not sure
               | how Go or C# is going to navigate a world where even low-
               | end CPUs casually have 128 cores on consumer hardware....
               | Python _qua_ Python is going to face a real uphill battle
               | when the decision to use it entails basically committing
               | to not only using less than 1% of the available cores
               | (without offloading on to the programmer a significant
               | amount of work to get past that), but also using that
               | core ~1.5 orders of magnitude less efficiently than a
                | compiled language. You've always had to pay something to
                | use Python, sure, but that's an awful lot of _orders of
                | magnitude_ for "a nice language". Surely we can have "a
                | nice language" at a lower price than that.
        
         | ufo wrote:
         | This kind of sounds similar to what a JIT compiler does, except
         | that a JIT will silently fall back to slower code if you do
         | those forbidden dynamic things. I think the most appealing
         | thing about what you're suggesting here is less about the peak
         | performance and more about having better guarantees about
         | startup cost and that performance won't be degraded (prefer
         | failing loudly to chugging along unoptimized). These two areas
         | often aren't the strongest point in JIT-ed systems...
        
         | tln wrote:
         | This approach kind of describes Graal.
         | 
         | Interestingly, GraalPython never seems to come up on these
         | speeding-up-Python articles & benchmarks while TruffleRuby is a
         | heavyweight in the speeding-up-Ruby space.
        
           | kmod wrote:
           | I tried to benchmark GraalPython for the talk but the
           | compatibility situation was so poor that I wasn't even close
           | to being able to run any benchmarks.
        
         | w-m wrote:
         | This may be a naive question (I have very little knowledge
         | about building languages and compilers): Would this be possible
          | in Python by introducing a keyword like `final`? Any object,
          | variable, or method that is marked final would only have to be
          | looked up once by the interpreter; the re-fetching the article
          | describes wouldn't have to happen again. Trying to change a
          | final thing would result in an exception.
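          | 
          | Python already has a static-only spelling of this in
          | typing.Final, but it's enforced by type checkers like mypy
          | rather than by the interpreter, so today it can't drive this
          | kind of speedup:
          | 
          |     from typing import Final
          | 
          |     MAX_RETRIES: Final = 5
          |     MAX_RETRIES = 6  # mypy flags this; CPython happily runs it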
        
       | uncomputation wrote:
        | With JavaScript, these kinds of engine optimizations make sense
        | because the web is limited by it, so speed is a huge factor.
        | With Python, however, if a Python web framework is "too" slow, I
        | would honestly say the problem is using Python at all for a web
        | server. Python shines beautifully as a (somewhat) cross-platform
        | scripting language: file reading and writing, environment
        | variables, simple implementations of basic utilities (sort,
        | length, max, etc.) that would be cumbersome in C. The move of
        | Python out of this niche and into practically everything is the
        | issue; it leads into rabbit holes like this one, where we use
        | Python, a dynamic scripting language, for things a second-year
        | computer science student should know are not "the right jobs for
        | the tool."
       | 
        | Instead of performance, I'd like to see more effort put into
        | portability, package management, and stability for Python.
        | Because Python is often enterprise-managed, juggling fifteen
        | versions of it (where 3.8.x supports native collection typing
        | annotations but we use 3.7.x, etc.) is my biggest complaint.
        | Also up there is pip and just the general mess of dependencies
        | and lack of a lock file. Performance doesn't even make the
        | list.
       | 
       | This is not to discredit anyone's work. There is a lot of
       | excellent technical work and research done as discussed in the
       | article. I just think honestly a lot of this effort is wasted on
       | things low on the priority tree of Python.
        
         | Barrin92 wrote:
        
         | waprin wrote:
          | On paper, Python is not the right tool for the job: both
          | because of its bad performance characteristics and because it's
          | so forgiving/flexible/dynamic, it's tough to maintain large
          | Python codebases with many engineers.
         | 
         | At Google there is some essay that Python should be avoided for
         | large projects.
         | 
         | But then there's the reality that YouTube was written in
         | Python. Instagram is a Django app. Pinterest serves 450M
         | monthly users as a Python app. As far as I know Python was a
         | key language for the backend of some other huge web scale
         | products like Lyft, Uber, and Robinhood.
         | 
         | There's this interesting dissonance where all the second year
         | CS students and their professors agree it's the wrong tool for
         | the job yet the most successful products in the world did it
         | anyway.
         | 
         | I guess you could interpret that to mean all these people
         | building these products made a bad choice that succeeded
         | despite using Python but I'd interpret it as another instance
         | of Worse is Better. Just like Linus was told monolithic kernels
         | were the wrong tool for the job but we're all running Linux
         | anyway.
         | 
         | Sometimes all these "best practices" are just not how things
          | work in reality. In reality Python is a mission-critical
          | language in many massively important projects; its performance
          | characteristics matter a ton, and efforts to improve them
          | should be lauded rather than scrutinized.
        
           | arinlen wrote:
           | > But then there's the reality that YouTube was written in
           | Python. Instagram is a Django app. Pinterest serves 450M
           | monthly users as a Python app. As far as I know Python was a
           | key language for the backend of some other huge web scale
           | products like Lyft, Uber, and Robinhood.
           | 
            | All those namedrops mean and matter nothing. Hacking together
            | proofs of concept is a time-honoured tradition, as is pushing
            | hacky code that's badly stitched up to production. Who knows
            | if there was any technical analysis behind picking Python
            | over any alternative? Who knows how much additional
            | engineering work and how many additional resources were
            | required to keep that Python code from breaking apart in
            | production? I mean, Python has always figured very low in
            | webapp framework benchmarks. Did that change just because
            | <trendy company> claims it used Python?
           | 
            | Also, serving a lot of monthly users says nothing about a
            | tech stack. It says a lot about the engineering that went
            | into developing the platform. If a webapp is architected so
            | that it can scale well to meet its real-world demand, even
            | after paying a premium for the poor choice of tech stack
            | some guy who is no longer around made in the past for god
            | knows what reason, what would that say about the tech stack?
        
           | dataflow wrote:
           | I don't think "I could use tool X for job Y" implies "X was
           | the right tool for jon Y". You could commute with a truck to
           | your workplace 300 feet away for 50 years straight and I
           | would still argue you probably used the wrong tool for the
           | job. "Wrong tool" doesn't imply "it is impossible to do
           | this", it just means "there are better options".
        
           | ChrisLomont wrote:
           | >the most successful products in the world did it anyway
           | 
            | A few successful projects in the world did it. There are
            | likely far more successful products that didn't use it.
           | 
           | The key metric along this line is how often each language
           | allows success to some level and how often they fail
           | (especially when due to the choice of language).
           | 
           | >should be lauded rather than scrutinized
           | 
           | One can do both at the same time.
        
             | sjtindell wrote:
             | Instagram has one billion monthly users generating $7
             | billion a year. There are almost zero products on earth as
             | successful.
        
               | arinlen wrote:
               | > Instagram has one billion monthly users generating $7
               | billion a year.
               | 
               | Doesn't Instagram serve mostly static content that's put
               | together in an appealing way by mobile apps? I'd figure
                | Instagram's CDN has far more impact than whatever Python
                | code it's running somewhere in its entrails.
               | 
               | Cargo cult approaches to tech stacks don't define
               | quality.
        
               | xboxnolifes wrote:
               | The point is that it's still _one_ project. You need to
               | count the failures as well to rule out survivorship bias.
        
               | slt2021 wrote:
               | Just compare Instagram written in Python to Google Wave,
                | Google+ or any other of Google's social media, written in
               | C++/Java :))))
        
               | jeremycarter wrote:
               | And you can put 7 billion of effort into tweaking your
               | python application performance?
        
             | fddhjjj wrote:
             | > The key metric along this line is how often each language
             | allows success to some level and how often they fail
             | 
             | How does python score on these key metrics?
        
           | w1nk wrote:
           | > There's this interesting dissonance where all the second
           | year CS students and their professors agree it's the wrong
           | tool for the job yet the most successful products in the
           | world did it anyway.
           | 
           | > I guess you could interpret that to mean all these people
           | building these products made a bad choice that succeeded
           | despite using Python but I'd interpret it as another instance
           | of Worse is Better. Just like Linus was told monolithic
           | kernels were the wrong tool for the job but we're all running
           | Linux anyway.
           | 
            | This isn't the correct perspective or takeaway. The 'tool'
           | for the job when you're talking about building/scaling a
           | website changes over time as the business requirements shift.
           | When you're trying to find market fit, iterating quickly
           | using 'RAD' style tools is what you need to be doing. Once
           | you've found that fit and you need to scale, those tools will
           | need to be replaced by things that are capable of scaling
           | accordingly.
           | 
            | Evaluating this as a binary right choice / wrong choice only
            | makes sense when qualified with a point in time and/or scale.
        
         | digisign wrote:
         | The folks that work on performance are not the folks working on
         | packaging. Shall we stop their work until the packaging team
         | gets in gear?
        
         | rmbyrro wrote:
         | Totally agree that performance is not on my top 10 wish list
         | for Python.
         | 
         | But I disagree on " _not the right jobs for the tool_ ".
         | 
         | Python is extremely versatile and can be used as a valid tool
         | for a lot of different jobs, as long as it fits the _job
          | requirements_, performance included.
         | 
         | It doesn't require a CS degree to know that fitting _job
         | requirements_ and other factors like the team expertise, speed,
         | budget, etc, are more important than fitting a theoretical
         | sense of  "right jobs for the tool".
        
           | blagie wrote:
           | > It doesn't require a CS degree to know that fitting job
           | requirements and other factors like the team expertise,
           | speed, budget, etc, are more important than fitting a
           | theoretical sense of "right jobs for the tool".
           | 
           | It requires experience.
           | 
           | A lot of those lessons only come after you've seen how much
           | more expensive it is to maintain a system than to develop
           | one, and how much harder people issues are than technical
           | issues.
           | 
           | A CS degree, or even a junior developer, won't have that.
        
           | moffkalast wrote:
           | Python can do just about anything... but it will take its
           | time doing it.
        
         | pjmlp wrote:
          | Agreed. My only use for Python since version 1.6 is portable
          | shell scripting, or when sh scripts get too complicated.
         | 
         | Anything beyond that, there are compiled languages with REPL
         | available.
        
           | mrtranscendence wrote:
           | What compiled languages do you have in mind? I suppose
            | technically there are REPLs for C or Rust or Java, but I
            | wouldn't consider them ideal for interactive programming.
            | Functional programming might do a bit better -- Scala and
            | GHCi work fine interactively. Does Go have a REPL?
        
             | eatonphil wrote:
             | > compiled languages
             | 
             | Might be tripping you up. Very few languages require that
             | _implementations_ be compiled or interpreted. For most
             | languages, having a compiler or interpreter is an
             | implementation decision.
             | 
             | I can implement Python as an interpreter (CPython) or as a
             | compiler (mypyc). I can implement Scheme as an interpreter
             | (Chicken Scheme's csi) or as a compiler (Chicken Scheme's
             | csc). The list goes on: Standard ML's Poly/ML
             | implementation ships a compiler and an interpreter; OCaml
             | ships a compiler and an interpreter.
             | 
             | There are interpreted versions of Go like
             | https://github.com/traefik/yaegi. And there are native-,
             | AOT-compiled versions of Java like GraalVM's native-image.
             | 
             | For most languages there need be no relationship at all
             | between compiler vs interpreter, static vs dynamic, strict
             | or no typing.
        
             | pjmlp wrote:
             | Java, C#, F#, Lisp variants, and C++.
             | 
              | Eclipse has had Java scratchpads for ages, Groovy also
              | works for trying out ideas, and nowadays we have jshell.
              | 
              | F# has a REPL in the ML lineage, and nowadays C# also
              | shares a REPL with it in Visual Studio.
             | 
             | Lisp variants, going at it for 60 years.
             | 
             | C++, there are hot reload environments, scripting variants,
             | and even C and C++ debuggers can be quite interactive.
             | 
              | I used GDB in 1996, alongside XEmacs, as a poor man's REPL
             | while creating a B+Tree library in C.
             | 
             | Yes, there are Go interpreters available,
             | 
             | https://github.com/traefik/yaegi
        
         | blagie wrote:
         | I want a common language I can work with. Right now, Python is
         | the only tool which fits the bill.
         | 
          | A critical thing is Python does numerics very, very well. With
          | machine learning, data science, and analytics being what they
          | are, there aren't many alternatives. R, Matlab, and Stata won't
         | do web servers. That's not to mention wonderful integrations
         | with OpenCV, torch, etc.
         | 
         | Python is also competent at dev-ops, with tools like ansible,
         | fabric, and similar.
         | 
         | It does lots of niches well. For example, it talks to hardware.
         | If you've got a quadcopter or some embedded thing, Python is
         | often a go-to.
         | 
         | All of these things need to integrate. A system with
         | Ruby+R+Java will be much worse than one which just uses Python.
         | From there, it's network effects. Python isn't the ideal server
         | language, but it beats a language which _just_ does servers.
         | 
         | As a footnote, Python does package management much better than
         | alternatives.
         | 
         | pip+virtualenv >> npm + (some subset of require.js / rollup.js
         | / ES2015 modules / AMD / CommonJS / etc.)
         | 
         | JavaScript has finally gone from a horrible, no-good, bad
         | language to a somewhat competent one with ES2015, but it has at
         | least another 5-10 years before it can start to compete with
         | Python for numerics or hardware. It's a sane choice if you're
         | front-end heavy, or mobile-heavy. If you're back-end heavy
         | (e.g. an ML system) or hardware-heavy (e.g. something which
         | talks to a dozen cameras), Python often is the only sane
         | choice.
        
           | Denvercoder9 wrote:
           | > As a footnote, Python does package management much better
           | than alternatives.
           | 
           | If you use it as a scripting language, that might very well
           | be the case (it's at least simpler). When you're building
           | libraries or applications, no, definitely not. It's a huge
           | mess, and every 3 years or so we get another new tool that
           | promises to solve it, but just ends up creating a bigger
           | mess.
        
             | whimsicalism wrote:
             | I think poetry actually does solve it
        
               | Thrymr wrote:
               | Oh, there are a half dozen different tools that solve
               | python package management. Unfortunately, they are
               | mutually incompatible and none solve it for all use
               | cases.
        
           | whimsicalism wrote:
           | > it has at least another 5-10 years before it can start to
           | compete with Python for numerics or hardware
           | 
            | More, given that no language outside of Julia competes with
            | Python at high-level numerics, and numerics in general only
            | adds C++ to the list.
        
             | Robotbeat wrote:
             | Fortran >:D
        
               | whimsicalism wrote:
               | For low-level, fair. I only know of people in astronomy
               | academia who actually use it nowadays though.
        
           | DeathArrow wrote:
           | >. R, Matlab, and Stata won't do web servers.
           | 
           | Not unless they're pushed to, like Python was.
           | 
           | >A critical thing is Python does numerics very, very well.
           | 
           | That's not Python doing numerical stuff. That's C code,
           | called from Python.
        
             | fractalb wrote:
             | > Not unless they're pushed to, like Python was.
             | 
             | Readability of code and ease of use is a big thing. It's
             | just not about pushing hard till we make it.
             | 
              | edit: formatting
        
             | jonnycomputer wrote:
             | I wouldn't want to do a web-server in MATLAB. I like
             | MATLAB, but no, not that.
        
             | mrtranscendence wrote:
             | > That's not Python doing numerical stuff. That's C code,
             | called from Python.
             | 
             | That's sort of a distinction without a difference, isn't
             | it? Python can be good for numeric code in many instances
             | because someone has gone through the effort of implementing
             | wrappers atop C and Fortran code. But I'd rather be using
             | the Python wrappers than C or especially Fortran directly,
             | so it makes at least a little sense to say that Python
             | "does numerics [...] well".
             | 
             | > Not unless they're pushed to, like Python was.
             | 
             | R and Matlab, maybe. A web server in Stata would be a
             | horrible beast to behold. I can't imagine what that would
             | look like. Stata is a _terrible_ general purpose language,
             | excelling only at canned econometrics routines and
             | plotting. I had to write nontrivial Stata code in grad
              | school and it was a painful experience I'd just as soon
             | forget.
        
               | disgruntledphd2 wrote:
               | You can do web stuff in R, but it's a lot harder than it
               | needs to be. R sucks for string interpolation, and a lot
               | of web related stuff is string interpolation.
        
               | mrtranscendence wrote:
               | Yeah, I'm not surprised by that. The extent of my web
               | experience in R is calling rcurl occasionally, so I've
               | never tried and failed to do anything complicated.
        
             | blagie wrote:
             | It's not C code. It calls into a mixture of C, CUDA,
             | Fortran, and a slew of other things. Someone did the work
             | of finding the best library for me, and integrating them.
             | 
             | As for me, I write:
             | 
             | A * B
             | 
             | It multiplies two matrices. C can't do that. In C, I'd have
             | some unreadable matrix64_multiply(a, b). Readability is a
             | big deal. Math should look more-or-less like math. I can
             | handle 2^4, or 2**4, but if you have mpow(2, 4) in the
             | middle of a complex equation, the number of bugs goes way
             | up.
             | 
             | I'd also need to allocate and free memory. Data wrangling
             | is also a disaster in C. Format strings were a really good
             | idea in the seventies, and were a huge step up from BASIC
             | or Python. For 2022?
             | 
             | And for that A * B? If I change data types, things just
             | work. This means I can make large algorithmic changes
             | painlessly.
             | 
             | Oh, and I can develop interactively. ipython and jupyter
             | are great calculators. Once the math is right, I can copy
             | it into my program.
             | 
             | I won't even get started on things like help strings and
             | documentation.
             | 
             | Or closures. Closures and modern functional programming are
             | huge. Even in the days of C and C++, I'd rather do math in
             | a Lisp (usually, Scheme).
             | 
             | I used to do numerics in C++, and in C before that. It's at
             | least a 10x difference in programmer productivity stepping
             | up to Python.
             | 
             | Your comment sounds like someone who has never done
             | numerical stuff before, or at least not serious numerical
             | stuff.
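              | 
              | Concretely, with NumPy (where the matrix product is
              | spelled A @ B, while A * B is elementwise):
              | 
              |     import numpy as np
              | 
              |     A = np.random.rand(1000, 1000)
              |     B = np.random.rand(1000, 1000)
              | 
              |     C = A @ B                   # matrix product, BLAS-backed
              | 
              |     A32 = A.astype(np.float32)  # change the dtype...
              |     C32 = A32 @ B.astype(np.float32)  # ...same code works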
        
               | danuker wrote:
               | > the number of bugs goes way up
               | 
               | In case you are forced to use the unreadable long-named
               | unintuitively-syntaxed methods, add unit tests, and check
               | that input-output pairs match with whatever formula you
               | started with.
        
               | tomrod wrote:
               | Yet, Python (and most of her programmers including data
               | scientists, of which I am one) stumble with typing.
                |     if 0.1 + 0.2 == 0.3:
                |         print('Data is handled as expected.')
                |     else:
                |         print('Ruh roh.')
               | 
               | This fails on Python 3.10 because floats are not
               | decimals, even if we really want them to be. So most
               | folks ignore the complexity (due to naivety or
               | convenience) or architect appropriately after seeing
               | weird bugs. But the "Python is easiest and gets it right"
               | notion that I'm often guilty of has some clear edge
               | cases.
        
               | fnord123 wrote:
               | This is an issue for accountancy. Many numerical fields
               | have data coming from noisy instruments so being lossy
               | doesn't matter. In the same vein as why GPUs offer f16
               | typed values.
        
               | dullcrisp wrote:
               | Why would you want decimals for numeric computations
               | though? Rationals might be useful for algebraic
               | computations, but that'd be pretty niche. I'd think
               | decimals would only be useful for presentation and maybe
               | accountancy.
        
               | tomrod wrote:
                | Well, for starters folks tend to write code expecting
                | 0.1 + 0.2 == 0.3, rather than abs(0.3 - 0.2 - 0.1) <
                | tolerance_value.
                | 
                | Raw floats don't get you there, unfortunately.
        
               | gjm11 wrote:
               | They also expect 1/3 + 1/3 + 1/3 == 1. Decimals won't
               | help with that.
        
               | kbenson wrote:
               | That's slightly different in that most programmers won't
               | read 1/3 as "one third" but instead "one divided by
               | three", and interpret that as three divisions added
               | together, and the expectations are different. Seeing a
                | constant written as a decimal invites people to think of
                | it as a decimal, rather than the actual internal
               | representation, which is often "the float that most
               | closely represents or approximates that decimal".
        
               | dekhn wrote:
               | https://docs.python.org/3/library/decimal.html
        
               | tomrod wrote:
                | Correct! Many Python users don't know about this and
                | similar libraries that assist with data types. NumPy has
                | several as well.
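                | 
                | For instance, the stdlib covers both of the usual fixes:
                | 
                |     import math
                |     from decimal import Decimal
                | 
                |     0.1 + 0.2 == 0.3              # False (binary floats)
                |     math.isclose(0.1 + 0.2, 0.3)  # True (tolerance)
                |     Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
                |     # True: decimal arithmetic matches expectations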
        
           | kbenson wrote:
           | > As a footnote, Python does package management much better
           | than alternatives
           | 
           | No offense meant, but that sounds like the assessment of
           | someone that has only experienced really shitty package
           | management systems. PyPI has had their XMLRPC search
           | interface disabled for months (a year?) now, so you can't
           | even easily figure out what to install from the shell and
           | have to use other tools/a browser to figure it out.
           | 
           | Ultimately, I'm moving towards thinking that most scripting
           | languages actually make for fairly poor systems and admin
           | languages. It used to be the ease of development made all the
           | other problems moot, but there's been large advances in
           | compiled language usability.
           | 
            | For scripting languages you're either going to follow the
            | path of Perl or the path of Python, and they both have
            | their problems. For Perl, you get amazing stability at the
            | expense of the language eventually dying out because there
            | aren't enough new features to keep people interested.
           | 
           | For Python, the new features mean that module writers want to
           | use them, and then they do, and you'll find that the system
           | Python you have can't handle what modules need for things you
           | want to install, and so you're forced to not just have a
            | separate module environment, but fully separate Pythons
            | installed on servers so you can make use of the module
           | ecosystem. For a specific app you're shipping around this is
           | fine, but when maintaining a fleet of servers and trying to
           | provide a consistent environment, this is a big PITA that you
           | don't want to deal with when you've already chosen a major
           | LTS distro to avoid problems like this.
           | 
           | Compiling a scripting language usually doesn't help much
           | either, as that usually results in extremely bloated binaries
           | which have their own packaging and consistency problems.
           | 
            | This is a cyclical problem we've had so far. A language is
            | used for admin and system work, the requirements of
            | administrators grate up against the usage needs of people
            | that use the language for other things, and it either fails
            | for non-admin work, loses popularity, and gets replaced by
            | something more popular (Perl -> Python), or fails for admin
            | work because
           | it caters to other uses and eventually gets replaced by
           | something more stable (what I think will happen to Python,
           | what I think somewhat happened to bash earlier for slightly
           | different reasons).
           | 
           | I'm not a huge fan of Go, but I can definitely see why people
           | switch to it for systems work. It alleviates a decent chunk
           | of the consistency problems, so it's at least better in that
           | respect.
        
             | jonnycomputer wrote:
             | >No offense meant, but that sounds like the assessment of
             | someone that has only experienced really shitty package
             | management systems. PyPI has had their XMLRPC search
             | interface disabled for months (a year?) now, so you can't
             | even easily figure out what to install from the shell and
             | have to use other tools/a browser to figure it out.
             | 
             | Yes, this is, frankly, an absurd situation for python.
             | 
             | And then there is the fact that I end up depending on
             | third-party solutions to manage dependencies. Python is
             | big-time now; stop the amateur hour crap.
        
         | the__alchemist wrote:
         | I agree! Here's a related point: Rust seems ideal for web
         | servers, since it's fast, and is almost as ergonomic as Python
         | for things you listed as cumbersome in C. So, why do I use
         | Python for web servers instead of Rust? Because of the robust
          | set of tools Django provides. When evaluating a language,
          | fundamentals like syntax and performance are one part. Given
          | that web server bottlenecks are I/O-bound (mitigating
         | Python's slowness for many web server uses), and that I'd have
         | to reinvent several wheels in Rust, I use Python for current
         | and future web projects.
         | 
         | Another example, with a different take: MicroPython, on
          | embedded. The only good reason I can think of for this is to
         | appeal to people who've learned Python, and don't want to learn
         | another language.
        
         | rootusrootus wrote:
         | > the problem is using Python at all for a web server
         | 
         | I don't agree with this. Maybe for a web server where
         | performance is really going to matter down to the microsecond,
          | and I've got no other way to scale it. I write server code in
          | both JavaScript and Python, and despite all of my efforts I
          | still find that I can spin up a simple site in something like
          | Django and then add features to it much more easily than I can
          | with Node. It just has less overhead, is simpler, and lets me
          | get directly to what I need without having to work too hard.
          | It's not like Express is _hard_ per se, but Python is such an
          | easy language to work with and it stays out of my way as long
          | as I'm not trying to do exotic things.
         | 
         | And then it pays dividends later, as well, because it's really
         | easy for a python developer to pick up code and maintain it,
         | but for JS it's more dependent on how well the original
         | programmer designed it.
        
           | srcreigh wrote:
           | The problem with Django services is the insanely low
           | concurrency level compared to other server frameworks
           | (including node).
           | 
            | Django handles a single request at a time with no async. The
            | standard fix is gunicorn worker processes, but then N
            | requests cost N copies of the entire server's memory instead
            | of N lightweight per-request structs.
           | 
           | I shudder to think that whenever Django server is doing an
           | HTTP request to a different service or running a DB query,
           | it's just doing nothing while other requests are waiting in
           | the gunicorn queue.
           | 
            | The difference is that if you have an endpoint whose queries
            | take 2s+ for one customer, with Django it might cause the
            | entire service to stall for everybody, whereas with a decent
            | async server framework the other, fast endpoints can make
            | progress while the 2s ones are slow.
        
             | pdhborges wrote:
             | You can configure gunicorn to use multiple threads to
             | recover quite a bit of concurrency in those scenarios and
             | that is enough for many applications.
        
               | srcreigh wrote:
               | What threading/workers configuration do you use?
               | 
                | I'm looking at a page now which recommends 9 concurrent
                | requests for a Django server running on a 4-core
                | computer.
               | 
               | Meanwhile node servers can easily handle hundreds of
               | concurrent requests.
        
               | pdhborges wrote:
               | We use the ncpu * 2 + 1 formula for the number of workers
               | that serve API requests.
               | 
                | I don't think in 'handling x concurrent requests' terms
                | because I don't even know what that means. Usually I
                | think in terms of throughput, latency distributions, and
                | the number of connections that can be kept open (for
                | servers that deal with web sockets).
                | 
                | For example, if you have the 4-core computer, you have
                | 4 workers, and your requests take around 50ms each, you
                | can get to a throughput of 80 requests per second. If
                | the fraction of request time spent on IO is 50%, you can
                | bump your thread count to try to reach 160 requests per
                | second. Note that in this case each request consumes
                | 25ms of CPU, so you would never be able to get more than
                | 40 requests per second per CPU whether you are using
                | Node or Python.
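                | 
                | In gunicorn-config terms those dials look roughly like
                | this (the config file is just Python):
                | 
                |     # gunicorn.conf.py
                |     import multiprocessing
                | 
                |     workers = multiprocessing.cpu_count() * 2 + 1
                |     threads = 2  # bump when requests are mostly IO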
        
             | manfre wrote:
              | Django has async support for everything except the ORM.
              | Async DB access is possible without the ORM, or by doing
              | some thread pool / sync-to-async wrapping. A PR for that
              | was under review last I checked.
             | 
             | Either way, high concurrency websites shouldn't have
             | queries that take multiple seconds and it's still possible
             | to block async processes in most languages if you mix in a
             | blocking sync operation.
        
         | dirnctiwnsidj wrote:
         | This sounds like sour grapes. Python is a general-purpose
         | language. Languages like Awk and Perl and Bash are clearly
         | domain-specific, but Python is a pretty normal procedural
         | language (with OO bolted on). The fact that it is dynamic and
         | high-level does not mean it is unsuited for applications or the
         | back-end. People use high-level dynamic languages for servers
         | all the time, like Groovy or Ruby or, hell, even Node.js.
         | 
         | What about Python makes it unsuitable for those purposes other
         | than its performance?
        
         | make3 wrote:
          | I'm not sure it's very relevant, in a discussion of "how do we
          | improve Python", for the answer to be "don't use Python".
          | People have all kinds of valid reasons to use Python. Let's
          | keep this on topic please.
        
         | heavyset_go wrote:
         | > _Also up there is pip and just the general mess of
         | dependencies and lack of a lock file._
         | 
          | You can use pyproject.toml or requirements.txt as lock files;
          | Poetry can use the former, as well as poetry.lock files.
        
         | marius_k wrote:
         | > and lack of a lock file
         | 
         | Is it possible to solve your problem using pip freeze?
        
         | robotsteve2 wrote:
         | The world doesn't revolve around web development. It's not the
         | only use case. Scientific Python is huge and benefits
         | tremendously from the language being faster. If Python can be
         | 1% faster, that's a significant force multiplier for scientific
         | research and engineering analysis/design (in both academia and
         | industry).
        
           | mrtranscendence wrote:
           | Because most of the really huge scientific Python libraries
           | are written as wrappers over lower-level language code, I'd
           | be curious to what extent speeding up Python by, say, 10%
           | would speed up "normal" scientific Python code on average.
           | 1%? 5%?
        
             | animatedb wrote:
             | If you are talking about large sets of numbers, then the
             | speed up will be far below 1%.
        
       | DeathArrow wrote:
       | >The first topic he raised, "why Python is slow", is somewhat
       | divisive
       | 
       | What dynamic, interpreted, single threaded language is fast?
        
         | baisq wrote:
         | Practically every other language that ticks those boxes is
         | faster than Python.
        
         | bsder wrote:
         | > What dynamic, interpreted, single threaded language is fast?
         | 
         | Javascript. End of list.
         | 
          | The problem is that a JavaScript implementation is now _so_
          | complicated that you can't develop a new one without a massive
          | investment of resources.
        
         | brokencode wrote:
         | As far as interpreted languages go, Wren is pretty quick, but
         | still not fast compared to compiled languages.
         | 
         | But for dynamic, single threaded languages, JavaScript is
         | famously fast with a modern JIT compiler like V8.
        
         | Qem wrote:
         | Lua (LuaJIT implementation). Some Smalltalk VMs are also quite
         | fast. For example, see Eliot Miranda work on CogVM.
        
           | astrobe_ wrote:
            | You're pushing it a bit too far if you say that a JIT is
           | interpreted.
           | 
           | To answer OP, if you replace "dynamic" by "untyped", Forth
           | qualifies. And it actually can go where there's no JIT to
           | save your A from the "just throw more hardware (and software)
           | at the problem" mindset.
        
             | Qem wrote:
             | I think someone once said dynamic langs must cheat to be
             | performant. Jitted runtimes are just interpreters cheating.
        
       | DeathArrow wrote:
       | What's wrong with using the right tool for the right job? Python
       | for utility scripts, Javascript for Web frontend, C and C++ for
       | system programming, C# for Web backend, R for statistical stuff
       | and data analysis?
       | 
        | It seems to me some guys learned a language suited to one thing
        | and, instead of learning other languages better suited for other
       | purposes, they push for their one and only language to be used
       | everywhere, resulting in delays and financial losses.
       | 
       | It's not very hard to learn another language. Or, if you are that
       | lazy, you can stay with the language you know and use it for what
       | was intended.
        
         | ReflectedImage wrote:
          | Python dominates the web backend, statistical stuff, and data
          | analysis nowadays.
        
         | supreme_berry wrote:
         | Dumbest comment on the thread?
        
       | dijit wrote:
        | As a very tangential, almost unrelated question: is there any
        | Python module that you use day-to-day that you'd like to see a
        | significant speedup in?
        | 
        | I'm thinking of reimplementing some Python modules in Rust, as
        | that seems like the kind of weird thing I'm into. I've done it
       | with some success (using the excellent work of the pyo3 project)
       | professionally, but I'd be interested in doing more.
        
         | yedpodtrzitko wrote:
          | Pydantic is a quite popular library. Its author is doing
          | exactly this: rewriting its core [0] in Rust. It's still WIP,
          | but the readme mentions that "Pydantic-core is currently
          | around 17x faster than Pydantic Standard."
         | 
         | [0] https://github.com/samuelcolvin/pydantic-core
        
         | tclancy wrote:
         | Not working in Python right now, but I have 15 years of Python
         | + Django on the web and while there are any number of attempts
         | at this (I keep a list at
         | https://pinboard.in/u:tclancy/t:json/t:python/), any
          | improvement in JSON serialization and deserialization speeds is
         | a huge boon to projects. I am trying to think of similar
         | bottlenecks where a drop-in replacement can be a huge
         | performance improvement.
        
           | JackC wrote:
           | The missing thing last time I looked was a fast python json
           | library that's byte-compatible with stdlib -- same inputs,
           | same outputs. There are good fast options but they tend to
           | add some (perfectly reasonable) limitation like fixed
           | indentation size, for the sake of speed, that blocks them
           | from being dropped into an existing public API.
        
         | dotnet00 wrote:
         | Definitely matplotlib. Navigating image plots in interactive
         | mode with even just 10000x10000 pixels is painfully slow. While
         | I've picked up some alternatives, they don't feel as clean as
         | matplotlib.
        
           | wcunning wrote:
           | 10000% -- I use matplotlib for visualizing a lot of different
           | data, but especially for things like high-res images in
           | machine-learning contexts it is incredibly slow, even on good
           | computers. It does fine for small vector stuff and render-once-
           | and-save graphs, but it's bad for what a lot of people
           | use it for.
        
         | mritchie712 wrote:
         | pandas
        
           | curiousgal wrote:
           | I remember, when trying to squeeze some performance out of
           | it, that a lot of the overhead came from it trying to infer
           | types.
        
           | w-m wrote:
           | This is a curious reply for me. I would think that there are
           | very few parts in pandas that could be sped-up by
           | reimplementing them with a compiled language. Pandas is
           | plenty fast for the built-in methods, it only gets slow when
           | you start interfacing with Python, e.g. by doing an `.apply`
           | with your custom Python method. Obviously this interfacing
           | part is impossible to speed up by reimplementing parts of
           | pandas (you'd need a different API instead).
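           | 
           | A rough sketch of the difference I mean (made-up column name and
           | size; timings depend on the machine):
           | 
           |     import numpy as np
           |     import pandas as pd
           | 
           |     df = pd.DataFrame({"x": np.random.rand(1_000_000)})
           | 
           |     # Crosses into Python for every row: slow.
           |     slow = df["x"].apply(lambda v: v * 2 + 1)
           | 
           |     # Stays inside pandas/numpy compiled loops: fast.
           |     fast = df["x"] * 2 + 1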
        
           | mynameis_jeff wrote:
           | I would give https://github.com/modin-project/modin a shot.
        
           | SnooSux wrote:
           | It's been done: https://github.com/pola-rs/polars
           | 
           | But I'm sure there's always room for improvement
        
             | rytill wrote:
             | It's not like polars is a drop-in replacement; it has a
             | totally different API.
        
               | mrtranscendence wrote:
               | You wrote "it has a totally different API", did you mean
               | "it has an actually sane API?" Because that's what I
               | think of when I compare pandas to polars.
        
           | fgh wrote:
           | The answer would then be to have a look at polars.
        
         | zmgsabst wrote:
         | You'd be awesome if you wrote a library for large image
         | processing.
         | 
         | You can make large Numpy arrays fine -- e.g., 20k x 20k or 500k x
         | 500k, but trying to render that to anything but SVG or manual
         | tilings pukes badly.
         | 
         | That's my main blocker on rendering high dimensional shapes:
         | you can do the math, but visualizations immediately fall over
         | (unless you do tiling yourself).
         | 
         | There's probably someone with a more useful idea than
         | "gigapixel rendering" though.
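         | 
         | For illustration, a minimal do-it-yourself tiling/downsampling
         | sketch (array size and block factor are arbitrary; the point is to
         | block-average before handing anything to matplotlib):
         | 
         |     import numpy as np
         |     import matplotlib.pyplot as plt
         | 
         |     def block_mean(img, f):
         |         # Average non-overlapping f x f blocks, cropping any remainder.
         |         h, w = img.shape
         |         img = img[:h - h % f, :w - w % f]
         |         return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))
         | 
         |     rng = np.random.default_rng()
         |     big = rng.random((20_000, 20_000), dtype=np.float32)  # ~1.6 GB
         |     plt.imshow(block_mean(big, 20))   # 1000 x 1000 is cheap to draw
         |     plt.show()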
        
       | DeathArrow wrote:
       | As I see it, Python is good for glue code and small scripts where
       | performance usually doesn't matter. Even if it were more
       | performant, it would be a nightmare for large code bases since
       | it's dynamically typed.
       | 
       | I really enjoy Nim which is "slick as Python, fast as C".
        
         | supreme_berry wrote:
         | You wouldn't believe how many near-FAANGs have hundreds of
         | large backend services in Python without any issues, some dating
         | from times when typing was done in docstrings.
        
           | baisq wrote:
           | Because they have insane amounts of money that they can throw
           | at the machines.
        
             | bjourne wrote:
             | I once had a database-backed website serving 50k unique
             | visitors/day, written in Django and hosted on a low-budget
             | VPS. Worked like a charm with very few hiccups.
        
             | CraigJPerry wrote:
             | I was curious so I had a bash at comparing the cost of just
             | buying another server to throw at the problem vs telling a
             | FAANG dev to optimise the code.
             | 
             | A dedicated 40-core / 6TB server is around $2k but will be
             | amortized over the years of its life. It needs power,
             | cooling, someone to install it in a rack, someone to
             | recycle it afterwards, ..., around $175/yr
             | 
             | A FAANG dev varies wildly but $400k seems fair-ish (given
             | how many have TC > 750k).
             | 
             | So that's about 12 hours of time optimising the code vs
             | throwing another 40-core / 6TB machine at the problem for 365
             | days.
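             | 
             | Roughly, the arithmetic behind the "about 12 hours" (all
             | figures hand-wavy, and I'm counting the $2k up front plus the
             | first year of running costs against one year of dev time):
             | 
             |     dev_cost_per_year = 400_000           # total comp
             |     dev_hours_per_year = 2_080            # ~40 h/week
             |     dev_per_hour = dev_cost_per_year / dev_hours_per_year   # ~$192
             | 
             |     server_capex = 2_000                  # 40-core / 6TB box
             |     server_opex_year = 175                # power, cooling, racking, ...
             |     first_year_server = server_capex + server_opex_year     # ~$2,175
             | 
             |     breakeven_hours = first_year_server / dev_per_hour      # ~11-12 h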
             | 
             | The big cost I'm missing out of both the server and the
             | developer is the building they work in. What's the recharge
             | for a desk at a FAANG, $150k/yr? I have no idea how much a
             | rack slot works out at.
             | 
             | Unless I've screwed up the figures anywhere, we should
             | probably all be looking at replacing Python with Ruby if we
             | can squeeze out more developer productivity!
        
       | SuaveSteve wrote:
       | Why not switch to making __slots__ in classes the default and
       | then making attribute changes to an object during runtime an opt-
       | in? It will require a long grace period but wouldn't it help
       | optimisation efforts immensely?
        
         | BurningFrog wrote:
         | Where can I read about what kind of performance improvements
         | `__slots__` brings?
        
           | WillDaSilva wrote:
           | The Python docs themselves are a good place to start:
           | https://docs.python.org/3/reference/datamodel.html#slots
           | 
           | The Python wiki also has some good info about it:
           | https://wiki.python.org/moin/UsingSlots
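           | 
           | A tiny illustration of the visible behaviour change (memory and
           | attribute-lookup savings are the other half of the story):
           | 
           |     class Plain:
           |         def __init__(self, x, y):
           |             self.x, self.y = x, y   # stored in a per-instance __dict__
           | 
           |     class Slotted:
           |         __slots__ = ("x", "y")      # fixed layout, no per-instance __dict__
           |         def __init__(self, x, y):
           |             self.x, self.y = x, y
           | 
           |     p, s = Plain(1, 2), Slotted(1, 2)
           |     p.z = 3   # fine: lands in p.__dict__
           |     s.z = 3   # AttributeError: 'Slotted' object has no attribute 'z'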
        
         | gjulianm wrote:
         | That's going to require quite a lot of changes; it's a giant
         | breaking change. All classes would need someone to go around
         | finding all the attributes that are created and adding a
         | __slots__ declaration, to avoid regular attribute initialization
         | in __init__ failing. It's a massive task, and it would
         | completely break backwards compatibility for performance gains
         | that not everybody will need.
        
         | anamax wrote:
         | default __slots__ breaks a lot of monkey patching.
         | 
         | An "easier" change would be to add a class attribute
         | "no__dict__", which says that the __dict__ attribute can't be
         | used, which lets the implementation do whatever it wants. That
         | can be incrementally added to classes.
         | 
         | Another option is a "no__getattr__" attribute, which disables
         | getattr and friends.
        
         | yedpodtrzitko wrote:
         | That would mean all installed dependencies need to comply with
         | this change as well, which is unlikely to happen in any
         | realistic timeframe.
        
       | [deleted]
        
       | g42gregory wrote:
       | 15 years ago I remember reading Guido van Rossum saying that
       | Python is a connector language and if you need performance, just
       | drop into C and write/use a C module. I thought it was crazy at
       | the time, but now I see that he was absolutely right. It took a
       | while, but now Python has a high-performing C module for pretty
       | much every task.
        
         | chrisseaton wrote:
         | But these don't compose, right? Each is a black box to the
         | others? A black-box add and a black-box multiply don't fuse.
        
           | kylebarron wrote:
           | They can! Numpy exposes a C API to other Python programs [0].
           | It's not hard to write a Cython library that uses the Numpy C
           | API directly and does not cross into Python [1].
           | 
           | [0]: https://numpy.org/doc/stable/reference/c-api/index.html
           | 
           | [1]: https://github.com/kylebarron/pymartini/blob/4774549ffa2
           | 051c...
        
             | chrisseaton wrote:
             | So they can if you use their specific API? It doesn't
             | naturally compose in conventional Python code?
        
           | dekhn wrote:
           | C and Python are not black boxes to each other. The entire
           | python interpreter is literally a C API. You can create
           | pyobjects, add heterogeneous PyObjects to PyLists, etc. So
           | everything in Python can be introspected from C.
           | 
           | Turned around, Python has arbitrary access into the C
           | programming space (really, the UNIX or Windows process it's
           | running inside), so long as it has access to headers or other
           | type info, it can see C as more than a black box.
           | 
           | Most python numerics is implemented in numpy; the low levels of
           | numpy are actually (or were) effectively a C API implementing
           | multidimensional arrays, with a python wrapper.
        
             | klyrs wrote:
             | You're talking past chrisseaton's point here. If you want
             | two C extensions to interoperate with bare-metal
             | performance, you can't just do
             | 
             |     from lib1 import makedata
             |     from lib2 import processdata
             | 
             |     data = makedata()
             |     print(processdata(data))
             | 
             | Because makedata needs to provide a c->py bridge and
             | processdata needs a py->c bridge, so your process
             | inherently has python in the middle unless lib2 has
             | intimate knowledge of lib1. It can absolutely be done (I've
             | written plenty of c extensions that handle numpy arrays,
             | for example) but if somebody hasn't done the work, you
             | don't get it for free. If your c extension expects a list
             | of lists of floats, the numpy array totally supports that
             | interface... but (last I checked) that path is way
             | slower than calling list(map(list, data)) and throwing that
             | into your numpy-naive c extension.
        
             | chrisseaton wrote:
             | > C and Python are not black boxes to each other
             | 
             | Yes they are - the Python interpreter knows _nothing_ about
             | what your C extension does. It can't optimise it because all
             | it has is your machine code - no higher-level logic.
        
       | n8ta wrote:
       | Print doesn't have to be re-resolved on every access... Not sure
       | about python but many interpreters do a resolution pass that
       | matches declarations and usages (and decides where data lives,
       | stack, heap, virtual register, whatever)
        
         | SnowflakeOnIce wrote:
         | In Python semantics, indeed, 'print' does need to be looked up
         | each time!
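         | 
         | A quick demonstration of why the lookup can't simply be cached
         | away (nothing exotic, just rebinding the builtin between calls):
         | 
         |     import builtins
         | 
         |     def hello():
         |         print("hi")       # 'print' is looked up at call time
         | 
         |     hello()               # prints: hi
         | 
         |     real_print = builtins.print
         |     builtins.print = lambda *a, **k: None   # silence it
         |     hello()               # prints nothing - the rebinding is seen
         |     builtins.print = real_print             # restore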
        
       | dataflow wrote:
       | > Python can quickly check to see if they are using the dynamic
       | features
       | 
       | I don't understand how this is supposed to be "quickly"
       | verifiable?
       | 
       | Nothing prevents you from doing eval('gl' + 'obals')()['len'] =
       | ...; how is the interpreter supposed to quickly check that this
       | isn't the case when you're calling a function that might not even
       | be in the current module?
       | 
       | Doing this correctly would seem to require a ton of static
       | analysis on the source or bytecode that I imagine will at _best_
       | be slow, and at worst impossible due to the halting problem.
        
         | [deleted]
        
         | kmod wrote:
         | Python dictionaries now have version counters that track how
         | many times they were modified, so the quick check is to ask
         | "was len not overridden last time and is the number of
         | modifications to the globals the same as it was last time".
        
         | gpderetta wrote:
         | One possibility is to move the cost to the assignment, so the
         | code that assigns a new value to the global 'len' function is
         | going to track and invalidate all cached lookups. Hopefully you
         | are changing the binding of 'len' less often than you are
         | calling it :)
        
           | kmod wrote:
           | Cinder does this (invalidation), and both Faster CPython and
           | Pyston use guarding.
        
             | gpderetta wrote:
             | Right, of course, guarded devirtualization is a common
             | technique.
        
         | [deleted]
        
         | bootwoot wrote:
         | I was reading this as an undetailed description of state
         | available WITHIN the interpreter. Probably there is a table of
         | globals that you can simply check last modification on or
         | something like this. Whether you hit it with eval or some other
         | tricky code, you can't modify a global without the interpreter
         | knowing about it.
        
           | dataflow wrote:
           | If that's what they mean, how would that be any faster than
           | what's going on right now? I thought normally when you hit a
           | callable, the interpreter would just look up its name, check
           | to see if it's a built-in, and then call the built-in if
           | so... whereas in this case you'd still have to look up the
           | name of the callable (is the idea to bypass this somehow?
           | what do they do currently?), check to see if it's different
             | than the built-in you'd _expect_ from the name (i.e. if it's
           | ever been reassigned to), then call that expected built-in if
           | it's not... which seems like the same thing? At best it would
           | seem to convert 1 indirect call to a direct call, which would
           | be negligible for something like Python. Is the current
           | implementation somehow much slower than I'm imagining? What
           | am I missing?
        
             | the-lazy-guy wrote:
             | You could do something like a primitive inline cache. Store a
             | "version" of the globals in another variable. Each time the
             | globals are modified, bump the version. For each call site,
             | keep what the global name resolved to plus the version of the
             | globals object in a static variable. Now you can avoid the
             | name resolution if the version hasn't changed between two
             | executions of the line. In the fast path you just pay the
             | price of a single (easily predicted, because globals almost
             | never change) compare-and-jump instead of a full hash-table
             | lookup.
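             | 
             | In toy Python the idea looks roughly like this (names are made
             | up for illustration; real implementations do this in C, and
             | CPython's dict version tags come from PEP 509):
             | 
             |     GLOBALS = {"len": len}
             |     GLOBALS_VERSION = 0                # bumped on every write
             | 
             |     def set_global(name, value):
             |         global GLOBALS_VERSION
             |         GLOBALS[name] = value
             |         GLOBALS_VERSION += 1
             | 
             |     _site_cache = {"version": -1, "value": None}  # one per call site
             | 
             |     def call_len(arg):
             |         # Fast path: one compare instead of hashing "len" again.
             |         if _site_cache["version"] != GLOBALS_VERSION:
             |             _site_cache["value"] = GLOBALS["len"]     # slow path
             |             _site_cache["version"] = GLOBALS_VERSION
             |         return _site_cache["value"](arg)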
        
               | dataflow wrote:
               | I think the core of the optimization you're mentioning
               | hinges on a normal lookup being a slow hashtable lookup
               | (of a string?)... whereas I imagined the first thing the
               | interpreter would do would be to intern each name and
               | assign it a unique ID (as soon as during parsing, say)
               | and use that thereafter whenever they're not forced to
               | use a string (like with globals()). That integer could
               | literally be a global integer index into a table of
               | interned strings, so you could either avoid hashing
               | entirely (if the table isn't too big) or reduce it to
               | hashing an int, both of which are much faster than
               | hashing a string. Do they not do that already? Any idea
               | why? I feel like that's the real optimization you'd need
               | if checking a key in a hashtable is the slow part (and
               | it's independent of whether the value is being modified).
        
         | blagie wrote:
         | I don't think the world is quite so bad.
         | 
         | x86 processors solve this by speculating about what's going on.
         | If you suddenly run into a 1976-era operation, everything slows
         | down dramatically for a bit (but still goes faster than an
         | 8086). If you have a branch or cache miss, things slow down a
         | little bit.
         | 
         | One has a few possibilities:
         | 
         | - A static analysis /proves/ something. print is print. You
         | optimize a lot.
         | 
         | - A static analysis /suggests/ something. print is print,
         | unless redefined in an eval. You just need to go into a slow
         | path in operations like `eval`, so if print is modified, you
         | invalidate the static analysis.
         | 
         | - A static or dynamic analysis suggests something
         | probabilistically. You can make the fast path fast, and the
         | slow path eventually work. If print isn't print, you raise an
         | internal exception, do some recovery, and get back to it.
         | 
         | I'm also okay with this analysis being run in prod and not in
         | dev.
         | 
         | As a footnote, JITs, especially in Java, show that this kind of
         | analysis can be pretty fast. You don't need it to work 100% of
         | the time. The case of a variable being redefined in a dozen
         | places, you just ignore. The case where I call a function from
         | three places which increments an integer each time, I can find
         | with hardly any overhead at all. The latter tends to be where
         | most of the bottlenecks are.
        
         | chrisseaton wrote:
         | > I don't understand how this is supposed to be "quickly"
         | verifiable?
         | 
         | You don't verify, and instead you run assuming no verification
         | is needed. Then if someone wants to violate that assumption,
         | it's their problem to stop everyone who may have made that
         | assumption, and to ask them to not make it going forward.
         | 
         | You shift the cost to the person who's doing the
         | metaprogramming and keep it free for everyone who isn't.
         | 
         | https://chrisseaton.com/truffleruby/deoptimizing/
        
         | marcosdumay wrote:
         | Hum... You are getting lost on theoretical undecidability.
         | 
         | In the real world, when faced with a generally undecidable
         | problem, we don't run away and lose all hope. We decide the
         | cases that can be decided, and do something safe when they
         | can't be decided.
         | 
         | In your example, Python can just re-optimize everything after
         | an eval. That doesn't stop it from running optimized code if
         | the eval does not happen. It can do even more and only re-
         | optimize things that the eval touched, which has some extra
         | benefits and costs, so it may or may not be better.
         | 
         | Besides, when there isn't an eval on the code, the interpreter
         | can just ignore anything about it.
        
           | dataflow wrote:
           | > You are getting lost on theoretical undecidability. [...]
           | We decide the cases that can be decided, and do something
           | safe when they can't be decided.
           | 
           | I'm not lost on that at all; I'm well aware of that. That's
           | precisely why I wrote
           | 
           | >> [...] require _a ton of static analysis_ on the source or
           | bytecode that I imagine will _at best_ be slow, and _at
           | worst_ impossible due to the halting problem
           | 
           | and not
           | 
           | >> static analysis is impossible in the general case so we
           | run away and lose all hope.
           | 
           | I'm not sure how you read that sentiment from my comment.
        
             | marcosdumay wrote:
             | Hum... Ok. Then the answer is that most cases do not demand
             | as much analysis time as you expect, and the ones that
             | demand more still can gain something from dynamic behavior
             | analysis in a JIT.
             | 
             | Also, you can combine the two to get something better than
             | any single analysis alone.
        
       | peatmoss wrote:
       | During Perl's hegemony as The Glue Language, I feel like the folk
       | wisdom was:
       | 
       | "Performance is a virtue; if Perl ceases to be good enough, or
       | you need to write 'serious' software, rewrite in C."
       | 
       | And during Python's ascension, the common narrative shifted very
       | slightly:
       | 
       | "Performance is a virtue, but developer productivity is a virtue
       | too. Plus, you can drop to C to write performance critical
       | portions."
       | 
       | Then for our brief all-consuming affair with Ruby, the wisdom
       | shifted more radically:
       | 
       | "Developer productivity is paramount. Any language that delivers
       | computational performance is suspect from a developer
       | productivity standpoint."
       | 
       | But looking at "high-level" languages (i.e. languages that
       | provide developer productivity enhancing abstraction), we can
       | rewind the clock to look at language families that evolved during
       | more resource-constrained times.
       | 
       | Those languages, the lisps, schemes, smalltalks, etc. are now
       | really, really fast compared to Python, and rarely require
       | developers to shift to alternative paradigms (e.g. dropping to C)
       | just to deliver acceptable performance.
       | 
       | Perl and Python exploded right at the time that Lisp/Scheme
       | hadn't quite shaken the myth that they were slow, with
       | Python/Perl achieving acceptable performance by having dropped to
       | C most of the time.
       | 
       | Now the adoption moat is the wealth of libraries that exist for
       | Python--and it's a hell of a big moat. If I were a billionaire,
       | I'd hire a team of software developers to systematically review
       | libraries that were exemplars in various languages, and write /
       | improve idiomatic, performant, stylistically consistent versions
       | in something modern like Racket. I'd like to imagine that someone
       | would use those things :-)
        
         | zdw wrote:
         | This sounds a lot like what some Python package developers are
         | trying with Rust (example being the cryptography package),
         | which also has the unfortunate side effect of limiting support
         | for some less popular platforms.
        
         | edflsafoiewq wrote:
         | Perl/Python/Ruby grew up in the 90s, the "Bubble economy" of
         | the single-core performance world, the likes of which had never
         | been seen and probably will never be seen again on the face of the Earth.
         | In the post-Bubble world, throwing out 90% of your performance
         | before you even start writing code, especially when the same
         | dynamic features could be delivered via JIT without the cost,
         | seems crazy.
        
           | rockyj wrote:
           | So true, excellent point! I just do not understand startups
           | choosing Python/Ruby in 2022 when you can get most of the
           | features, type safety, concurrency, async and 5 times more
           | speed in other languages.
        
             | WJW wrote:
             | I don't think it is such a surprise. The ecosystems around
             | Rails (for Ruby) and numpy/pandas/etc (for python) are
             | orders of magnitude larger than you get in the modern
             | languages. In Rails for example, adding an entire user
             | management system (including niceties like password reset
             | mails and must-haves like proper security for obscure
             | vulnerabilities most people will have never heard of) is
             | literally a single extra line in the gemfile and two
             | console commands. In python the ML and numerics ecosystems
             | are completely beyond anything another language has to
             | offer at the moment, even more so when you compare the time
             | to get started.
             | 
             | In addition, "real" performance is often tricky to measure
             | and may be irrelevant compared to other parts of the
             | system. Yes, Ruby is 10-100x slower than C. But if a user
             | of my web service already has a latency of (say) 200ms to
             | the server then it barely matters if the web service
             | returns a response in 5 ms or in 0.5 ms. Similarly for
             | rendering an email: no user will notice their email
             | arriving half a second earlier. Similarly for a python
             | notebook: if it takes 1 or 2 seconds to prepare some data
             | for a GPU processing job that will take several hours, it
             | doesn't really matter that the data preparation could have
             | been done in 0.1 seconds instead if it had been done in
             | Rust.
             | 
             |  _Especially_ for startups where often you're not sure if
             | you're building the right thing in the first place, a big
             | ecosystem of prebuilt libraries is super important. If it
             | turns out people actually want to buy what you've made in
             | sufficient numbers that the inefficiency of
             | Ruby/Python/JS/etc becomes a problem then you can always
             | rewrite the most CPU intensive parts in another language.
             | Most startup code will never have the problem of "too many
             | users" though, so it makes no sense to optimize for that
             | from the start.
        
             | ReflectedImage wrote:
             | Well if you choose Python/Ruby you only need 1/3 of the
             | developers you would need with another language.
             | 
             | The productivity gain is so great it outweighs everything
             | else. It's as simple as that.
        
         | peatmoss wrote:
         | Is it gauche to offer my own counterpoint?
         | 
         | Another possibility is that the requirement to "drop to C" is a
         | virtue by de-democratizing access to serious performance. In
         | other words, let the commoners eat Python, while the anointed
         | manage their own memory.
         | 
         | I personally find this argument a bit distasteful / disagree
         | with it, but there was a thread the other day that talked about
         | the, uh, variable quality of code in the Julia ecosystem (Julia
         | being another language where dropping to C isn't important for
         | performance). In Julia, the academics can just write their code
         | and get on with their work--the horror!
        
         | munificent wrote:
         | _> Those languages, the lisps, schemes, smalltalks, etc._
         | 
         | The main reason those languages got fast despite being highly
         | dynamic is because of _very_ complex JIT VM implementations.
         | (See also: JavaScript.)
         | 
         | The cost of that is that a complex VM is much less hackable and
         | makes it harder to evolve the language. (See also: JavaScript.)
         | 
         | Python and Ruby have, I think, reasonably chosen to have slower
         | simpler implementations so that they are able to nimbly respond
         | to user needs and evolve the language without needing massive
         | funding from giant corporations in order to support an
         | implementation. (See also: JavaScript.)
         | 
         | There are other effects at play, too, of course.
         | 
         | Once your implementation's strategy for speed is "drop to C and
         | use FFI", then it gets much harder to optimize the core
         | language with stuff like a JIT and inlining because the FFI
         | system itself gets in the way. Not having an FFI for JS on the
         | web essentially forced JavaScript users to push to make the
         | core language itself faster.
        
           | peatmoss wrote:
           | Spending a weekend or two writing a Scheme that beats Python
           | in performance has been a pastime for computer science
           | students for at least a couple decades now. I'm not sure that
           | I believe that a performant Scheme implementation has more
           | complexity than e.g. PyPy. In fact, I'd wager the converse.
        
             | mrtranscendence wrote:
             | You're either exaggerating or the computer science students
             | you're familiar with are wizards. I've never known the
             | student who could write a Scheme implementation, from
             | scratch, in one weekend that is both complete and which
             | beats Python from a performance perspective.
        
               | peatmoss wrote:
               | If it's an exaggeration, it's not much of one.
               | 
               | Two parts to your argument:
               | 
               | - Writing a Scheme implementation quickly: Google "Write
               | a Scheme in 48 hours" and "Scheme from scratch." 48 hours
               | to a functioning Scheme implementation seems to be a feat
               | replicated in multiple programming languages.
               | 
               | - Performance: I haven't benchmarked every hobby scheme,
               | but given the proliferation of Scheme implementations
               | that, despite limited developer resources, beat (pure)
               | Python with its massive pool of developers (CPython,
               | PyPy), I still don't buy the idea that optimizing Scheme
               | is a harder task than optimizing Python. Again, I'd
               | strongly suggest that optimizing Scheme is a much easier
               | task than optimizing Python simply by virtue of how often
               | the feat has been accomplished.
        
               | eatonphil wrote:
               | I would not include PyPy in a list of easy to beat
               | implementations.
        
               | JulianWasTaken wrote:
               | Nor ones with massive pools of developers.
        
               | peatmoss wrote:
               | Compared to most Scheme implementations?
        
               | mrtranscendence wrote:
               | If you can give me an implementation that implements
               | almost all of R5RS, in 48 hours, beating Python in
               | performance, and all by a single developer, I'll tip my
               | hat to that guy or gal. But I can't imagine it's too
               | commonly done.
        
               | eatonphil wrote:
               | Nobody said you can implement a full Scheme
               | implementation in 48 hours or two weeks. That's very much
               | beside the point about how poor CPython performance is.
        
               | eatonphil wrote:
               | Substitute computer science student with "developer" and
               | it holds for me. Definitely some CS students can do it
               | too. Actually at my school we did have to implement a
               | Scheme compiler. So yeah it's not too big of a stretch to
               | say.
               | 
               | I think people who haven't implemented a language
               | underestimate how slow CPython is. And overestimate how
               | hard it is to build a compiler for a dynamic language.
               | 
               | I think every professional developer or CS student can
               | and should build a compiler for a dynamic language!
        
               | mrtranscendence wrote:
               | But the claim was that a student could write a conformant
               | Scheme implementation in 48 hours that beats Python.
               | Clearly it's possible for a student to write a Scheme
               | that's faster than Python, but is it a reasonably
               | _complete_ Scheme done in a single weekend?
               | 
               | Even I, very much a non-computer scientist, could write a
               | fast Scheme quickly if I could keep myself to a very
               | small subset, so that's not interesting to me.
        
               | eatonphil wrote:
               | Conformant is a word you introduced, they didn't say
               | that.
        
             | munificent wrote:
             | Sure, but that's because Python has objects.
             | 
             | If you write an object system on top of your performant
             | hobby Scheme implementation, you'll likely find that the
             | performance of its method dispatch is about as slow as it
             | is in Python. Probably even slower.
             | 
             | Purely procedural Python code isn't as slow as object-
             | oriented Python code.
        
               | peatmoss wrote:
               | That's fair, but also the fact that we're comparing hobby
               | scheme implementations to two mainstream extremely
               | popular implementations of Python and setting up
               | conditions that force (hobby) Scheme to play to Python's
               | relative strengths is telling. :-)
               | 
               | The Python ecosystem has certainly received a lot of
               | developer resources and attention the past couple of
               | decades. Shall we compare the performance of CLOS on
               | SBCL, which again has seen comparatively few developer
               | resources, to Python's performance in dealing with
               | objects? I'd take that performance wager.
        
               | Spivak wrote:
               | This isn't as much of a gotcha as you think. Python is
               | slow because the language is so dynamic and simply has to
               | do more behind the scenes work on each line. It's not
               | impressive that a language that does less is faster.
               | What's impressive is that a language that does _more_,
               | like JS on V8, is faster.
        
               | CraigJPerry wrote:
               | Is CLOS doing less than Python?
               | 
               | I'm thinking CLOS has more dynamism than Python - they're
               | both dynamically typed, they're both doing a lookup then
               | dispatch, but then CLOS adds dynamism on top of that,
               | it's also looking in the metadata thingy (I'm not a lisp
               | developer, do they call it the hash? I mean the key
               | value store on every "atom" - I'm so out of my depth
               | here, is atom the right word?) plus if I remember right
               | the way CLOS works you use multiple dispatch not just
               | single dispatch like python.
        
           | igouy wrote:
           | > Python and Ruby have, I think, reasonably chosen to have
           | slower simpler implementations...
           | 
           | ?
           | 
           | https://shopify.engineering/yjit-just-in-time-compiler-cruby
        
             | munificent wrote:
             | Yes, CRuby is slowly moving towards a JIT now because
             | performance is a major blocker for user adoption.
             | 
             | The larger Python ecosystem has tried that a number of
             | times too (Unladen Swallow, PyPy, etc.)
             | 
             | It's quite difficult since both of those languages already
             | lean heavily on C FFI and having frequent hops in and out
             | of FFI code tends to make it harder to get the JIT fast.
             | JITs work best when all of the code is in the host language
             | and can be optimized and inlined together.
        
           | eatonphil wrote:
           | Javascript the language seems to have evolved much more than
           | Python despite CPython's very simple implementation.
        
             | munificent wrote:
             | Hence my point about "massive funding from giant
             | corporations in order to support an implementation". :)
        
               | eatonphil wrote:
               | Well almost all the JavaScript language innovation was
               | syntax sugar and was implemented as transforms before the
               | browsers implemented it. I think JavaScript devs mostly
               | were fine to keep using transforms indefinitely and it's
               | just been more convenient that the browsers have moved to
               | implement it.
               | 
               | Python could have done this easily too but evolving as a
               | language just isn't as big a priority (not that I'm
               | saying it should be) and that's completely (or mostly)
               | disconnected from their backend implementation decisions.
        
       | boringg wrote:
       | What does "Modern" python even mean?
        
         | digisign wrote:
         | Focuses on 3.8+, but 3.7 has another year of life in it.
        
       | didip wrote:
       | If you are building server-side applications using Python 3 and
       | the async APIs and you aren't using
       | https://github.com/MagicStack/uvloop, you are missing out on
       | performance big time.
       | 
       | Also, if you happen to build microservices, don't forget to try
       | PyPy, that's another easy performance booster (if it's compatible
       | with your app).
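       | 
       | Opting in is only a couple of lines (a minimal sketch; the real win
       | shows up under I/O-heavy load):
       | 
       |     import asyncio
       |     import uvloop
       | 
       |     async def main():
       |         await asyncio.sleep(0)   # your application code goes here
       | 
       |     uvloop.install()    # swap in uvloop's event loop policy
       |     asyncio.run(main())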
        
         | mrslave wrote:
         | > if it's compatible to your app
         | 
         | Every time I experiment with PyPy (on a set of non-trivial web
         | services) I encounter at least one incompatibility with PyPy in
         | the dependency tree and leave disappointed.
        
         | [deleted]
        
       | s_Hogg wrote:
       | Great read; it vaguely reminds me that someone or other was trying
       | to get cpython going with cosmopolitan libc. Wonder what that would do
       | for speed.
        
         | make3 wrote:
         | Why do you think that this would help performance? A quick read
         | says cosmo is slower than regular libc. Maybe it would be more
         | portable, but not faster.
        
       ___________________________________________________________________
       (page generated 2022-05-05 23:00 UTC)