[HN Gopher] Modern Python Performance Considerations ___________________________________________________________________ Modern Python Performance Considerations Author : chmaynard Score : 226 points Date : 2022-05-05 12:50 UTC (10 hours ago) (HTM) web link (lwn.net) (TXT) w3m dump (lwn.net) | kzrdude wrote: | Faster-cpython is not the main topic here but certainly welcome | since it's the most used Python. They've done great things so | far. Though I remember I heard the promise of 50% improvement in | each of five separate steps :) | joncatanio wrote: | This is a great read, and it's fantastic to see all the work | being done to evaluate and improve the language! | | The dynamic nature of the language is actually something that I | studied a few years back [1]. Particularly the variable and | object attribute lookups! My work was just a master's thesis, so | we didn't go too deep into the trickier dynamic aspects of the | language (e.g. eval, which we restricted entirely). But we did | see performance improvements by restricting the language in | certain ways that aid in static analysis, which allowed for more | performant runtime code. For those interested, the abstract | of my thesis [2] gives more insight into what we were evaluating. | | Our results showed that restricting dynamic code (code that is | constructed at run time from other source code) and dynamic | objects (mutation of the structure of classes and objects at run | time) significantly improved the performance of our benchmarks. | | There was also some great discussion on HN when I posted our | findings [3]. | | [1]: https://github.com/joncatanio/cannoli | | [2]: https://digitalcommons.calpoly.edu/theses/1886/ | | [3]: https://news.ycombinator.com/item?id=17093051 | Animats wrote: | _But we did see performance improvements by restricting the | language in certain ways that aid in static analysis, which | allowed for more performant runtime code._ | | Well, yes.
In Python, one thread can monkey-patch the code in | another thread while running. That feature is seldom used. In | CPython, the data structures are optimized for that. | Underneath, everything is a dict. This kills most potential | optimizations, or even hard-code generation. | | It's possible to deal with that efficiently. PyPy has a | compiler, an interpreter, and something called the "backup | interpreter", which apparently kicks in when the program being | run starts doing weird stuff that requires doing everything in | dynamic mode. | | I proposed adding "freezing", immutable creation, to Python in | 2010, as a way to make threads work without a global lock.[1] | Guido didn't like it. Threads in Python still don't do much for | performance. | | [1] | http://www.animats.com/papers/languages/pythonconcurrency.ht... | chrisseaton wrote: | > This kills most potential optimizations, or even hard-code | generation. | | It doesn't - this has been a basically solved problem since | Self and deoptimisation were invented. | Animats wrote: | In theory, yes. In CPython, apparently not. In PyPy, | yes.[1] PyPy has to do a lot of extra work to permit some | unlikely events. | | [1] https://carolchen.me/blog/jits-impls/ | [deleted] | chrisseaton wrote: | You're trying to correct me by posting my own mentee's | blog post at me. | jerf wrote: | "Those techniques are based on the idea that most code "does not | use the full dynamic power that it could at any given time" and | that Python can quickly check to see if they are using the | dynamic features." 
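Animats's point that in CPython "underneath, everything is a dict" is easy to see from the interpreter itself. A minimal sketch (the class and attribute names here are made up for illustration):

```python
class Greeter:
    def greet(self):
        return "hello"

g = Greeter()
print(g.greet())             # hello

# Any code -- even another thread -- can rewrite the class at runtime:
Greeter.greet = lambda self: "patched"
print(g.greet())             # patched -- existing instances see the change

# The machinery is visible: classes and instances are backed by dicts,
# so every attribute access is (conceptually) a chain of dict lookups.
print("greet" in Greeter.__dict__)   # True
g.color = "red"                      # the per-instance dict grows on demand
print(g.__dict__)                    # {'color': 'red'}
```

Because a lookup like `g.greet` can be invalidated at any moment by a write like the one above, the interpreter cannot naively compile it down to a fixed offset; this is the property the deoptimization techniques discussed below are designed to work around.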
| | If anyone has a burning desire to try to write the next big | dynamically-typed scripting language, I've often noodled in my | head with the idea of a language that has a dynamically-typed | startup phase, but at some point you call "DoneBeingDynamic()" on | something (program, module, whatever, some playing would have to | be done here) and the dynamic system basically freezes everything | into place and becomes a static system. (Or you have an explicit | startup phase to your module, or something like that.) | | The core observation I'm driving at is much the same as the | quote I give from the article. You generally set up the vast | majority of your "dynamicness" once at runtime, e.g., you set up | your monkeypatches, you read the tables out of the DB to set up | your active classes, you read the config files and munge together | the configurations, etc. But then forever after, your dynamic | language is constantly reading this stuff, over and over and | _over and over_ again, millions, billions, trillions of times, | with it never changing. But it has to be read for the language to | work. | | Combine that with perhaps some work on a system backed by a | struct-like representation of things rather than a hash-like | representation, and you might be able to build something that | gets, say, 80% of the dynamicness of a 1990s-era dynamic | scripting language, while performing at something more like | compiled language speeds, albeit with a startup cost. If you | could skip over the dozens of operations that a dynamically-typed | language like Python needs to resolve | x.y.z.q = 25 | | and get down to a runtime that can do the same thing compiled | languages do by pre-computing the offset into a struct and just | setting the value, you might get near static-language | performance with dynamic typing affordances.
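CPython already offers a small taste of the struct-backed representation described above: `__slots__` replaces the per-instance dict with storage declared up front, so the set of attributes is fixed at class creation. A minimal sketch (class names are illustrative):

```python
class Boxed:
    """Ordinary class: attributes live in a per-instance dict."""
    def __init__(self):
        self.q = 25

class Slotted:
    """__slots__ class: attribute storage is declared up front, no __dict__."""
    __slots__ = ("q",)
    def __init__(self):
        self.q = 25

b, s = Boxed(), Slotted()
print(hasattr(b, "__dict__"))   # True  -- hash-like representation
print(hasattr(s, "__dict__"))   # False -- struct-like, shape fixed at class creation
# s.z = 1  would raise AttributeError: new attributes cannot be added
```

This only freezes the *shape* of instances, not methods or module globals, so it captures a sliver of the "DoneBeingDynamic()" idea rather than the whole proposal.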
| | You can also view this as a Lisp-like thing that has an | integrated phase where it has macros, but then at some point puts | this capability down. | | I tend to think it's just fundamentally flawed to take a language | that intrinsically defines "x.y.z.q" as requiring dozens of | runtime operations, versus trying to define a new one where it is | a first-class priority from day one that the system be able to | resolve that down to some static understanding of what "x.y.z.q" | is. e.g., it's OK if y is a property and z is some fancy override | if the runtime can simply hardcode the relevant details instead | of having to resolve them every time. You can outrun even JIT-like | optimizations if you can get this down to the point where | you don't even have to check incoming types, you just know. | marcosdumay wrote: | I disagree. You are just doing those same optimizations by | hand, instead of in a JIT. The computer is there to help us, | and a lot of the value in a dynamic language comes from being | able to override things at any time. | | If you just set your structure up and run it statically, you | are better off with a static language, which can extract all kinds of | value from that fixed structure. | borodi wrote: | This feels like you are describing Julia, startup cost included | :). | coldtea wrote: | > _I've often noodled in my head with the idea of a language | that has a dynamically-typed startup phase, but at some point | you call "DoneBeingDynamic()" on something (program, module, | whatever, some playing would have to be done here) and the | dynamic system basically freezes everything into place and | becomes a static system.
(Or you have an explicit startup phase | to your module, or something like that.)_ | | V8 tries to guess that for classes and objects based on runtime | information - that's how it gets some of its speed (it still | needs checks about whether this is violated at any point, so | that it can get rid of the proxy/stub "static" object it | guessed). | | For a more static guarantee, there are also things like | Object.freeze which does about what you describe for dynamic | objects in JS (#). | | # https://gist.github.com/briancavalier/3772938 | jerf wrote: | I'd be curious to see if a language developed with the idea | that this is what it's going to do from scratch could do | better than trying to bodge it on afterwards. Rather than | pecking around what could be done literally decades after the | language is specified, what if you started out with this | idea? | | I dunno. It's possible the real world would stomp all over | this idea in practice, or the resulting language would just | be too complex to be usable. It does imply a rather weird | bifurcation between being in "the init phase" and "the normal | runtime phase", and who knows what other "phases" could | emerge. Although technically, Perl actually already has this | split, although generally it can be ignored because it's of | much less consequence in Perl precisely because there mostly | isn't much utility to having something done in the earlier | phase, unlike this hypothetical language. | gpderetta wrote: | It seems that lisp-like macros or more generally multistage | compilation is close to what you have in mind. | jerf wrote: | Yes, it's not a brand-new dimension of programming | languages, merely a refinement of existing ideas. However | I'm not aware of anything quite like it out there. Lisp | could be used to implement it, but, I mean, that's not a | very strong statement now is it? Lisp can be used to | implement anything. The question is about whether it | exists. 
| | Partially I throw this idea out as a bone to those who | like dynamic languages. Personally I don't have this | problem anymore because I've basically given them up, | except in cases where the problem is too small to matter. | And if you already know and like Lisp, you don't really | have this problem either. | | But if you are a devotee of the 1990s dynamic scripting | languages, you're getting really squeezed right now by | performance issues. You can run 40-50x slower than C, or | you can run circa 10x slower than C with an amazing JIT | that requires a ton of effort and will forever be very | quirky with performance, and in both cases you'll be | doing quite a lot of work to use more than one core at a | time. Python is just hanging in there with the amazing | amount of work being poured into NumPy, and from what I | gather from my limited interactions with data scientists, | as data sets get larger and the pipelines more complex, | the odds you'll fall out of what NumPy can do and fall | back to pure Python go up, and the price of that goes up | too. | | I think a new dynamic scripting language built from the | ground up to be multithreadable and high performance via | some techniques like this would have some room to run, | and while hordes of people will come out of the woodwork | promising that one of the existing ones will get there | Real Soon Now, just wait, they've almost got it, the | reality is I think that the current languages have pretty | much been pushed as far as they can be. Unless someone | writes this language, dynamic scripting languages are | going to continue slowly, quite slowly, but also quite | surely, just getting squeezed out of computing entirely. | I mean, I'm looking at the road ahead and I'm not sure | how Go or C# is going to navigate a world where even | low-end CPUs casually have 128 cores on consumer hardware....
| Python _qua_ Python is going to face a real uphill battle | when the decision to use it entails basically committing | to not only using less than 1% of the available cores | (without offloading on to the programmer a significant | amount of work to get past that), but also using that | core ~1.5 orders of magnitude less efficiently than a | compiled language. You 've always had to pay some to use | Python, sure, but that's an awful lot of _orders of | magnitude_ for "a nice language". Surely we can have "a | nice language" for less price than that. | ufo wrote: | This kind of sounds similar to what a JIT compiler does, except | that a JIT will silently fall back to slower code if you do | those forbidden dynamic things. I think the most appealing | thing about what you're suggesting here is less about the peak | performance and more about having better guarantees about | startup cost and that performance won't be degraded (prefer | failing loudly to chugging along unoptimized). These two areas | often aren't the strongest point in JIT-ed systems... | tln wrote: | This approach kind of describes Graal. | | Interestingly, GraalPython never seems to come up on these | speeding-up-Python articles & benchmarks while TruffleRuby is a | heavyweight in the speeding-up-Ruby space. | kmod wrote: | I tried to benchmark GraalPython for the talk but the | compatibility situation was so poor that I wasn't even close | to being able to run any benchmarks. | w-m wrote: | This may be a naive question (I have very little knowledge | about building languages and compilers): Would this be possible | in Python by introducing a keyword like `final`? Any object, | variable, method that is marked final just has to be looked up | once by the interpreter, the re-fetching the article describes | doesn't have to happen again. Trying to change a final thing | results in an exception. 
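Python has no such `final` keyword today; `typing.Final` exists, but it is only a hint for static checkers and has no runtime effect. The closest runtime approximation of what w-m describes is a hand-rolled freeze via `__setattr__`. A rough sketch (the class name and `_frozen` flag are made up for illustration):

```python
class Frozen:
    """Allow mutation during __init__, then lock the instance."""
    _frozen = False          # class-level default; flipped per instance

    def __init__(self, x):
        self.x = x
        self._frozen = True  # from here on, writes are rejected

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError(f"{name} is final; instance is frozen")
        super().__setattr__(name, value)

p = Frozen(42)
print(p.x)    # 42
# p.x = 99  would raise AttributeError
```

Note the gap between this and a real `final`: the guard makes mutation fail loudly, but the interpreter still performs the full dynamic lookup on every read. Letting the interpreter cache that lookup because it *knows* nothing can change is exactly the part today's Python cannot express.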
| uncomputation wrote: | With JavaScript, these kinds of optimizations in an engine make | sense because the web is bottlenecked by it, so speed is a huge | factor. With Python, however, if a Python web framework is "too" | slow, I would honestly say the problem is using Python at all for | a web server. Python shines beautifully as a (somewhat) cross-platform | scripting language: file reading and writing, | environment variables, simple implementations of basic utilities: | sort, length, max, etc. that would be cumbersome in C. The move of | Python out of this and into practically everything is the issue, | and then we get led into rabbit holes such as this where since we | are using Python, a dynamic scripting language, for things a | second-year computer science student should know are not "the | right jobs for the tool." | | Instead of performance, I'd like to see more effort in | portability, package management, and stability for Python | because, essentially since it is often enterprise managed, | juggling fifteen versions of Python where 3.9.x supports native | collection type annotations but we use 3.7.x, etc. is my | biggest complaint. Also up there is pip and just the general mess | of dependencies and lack of a lock file. Performance doesn't even | make the list. | | This is not to discredit anyone's work. There is a lot of | excellent technical work and research done as discussed in the | article. I just think honestly a lot of this effort is wasted on | things low on the priority tree of Python. | Barrin92 wrote: | waprin wrote: | On paper, Python is not the right tool for the job. Both | because of its bad performance characteristics and because it's | so forgiving/flexible/dynamic, it's tough to maintain large | Python codebases with many engineers. | | At Google there is some essay saying that Python should be avoided for | large projects. | | But then there's the reality that YouTube was written in | Python. Instagram is a Django app.
Pinterest serves 450M | monthly users as a Python app. As far as I know Python was a | key language for the backend of some other huge web scale | products like Lyft, Uber, and Robinhood. | | There's this interesting dissonance where all the second-year | CS students and their professors agree it's the wrong tool for | the job yet the most successful products in the world did it | anyway. | | I guess you could interpret that to mean all these people | building these products made a bad choice that succeeded | despite using Python, but I'd interpret it as another instance | of Worse is Better. Just like Linus was told monolithic kernels | were the wrong tool for the job but we're all running Linux | anyway. | | Sometimes all these "best practices" are just not how things | work in reality. In reality Python is a mission-critical | language in many massively important projects, and its | performance characteristics matter a ton and efforts to improve | them should be lauded rather than scrutinized. | arinlen wrote: | > But then there's the reality that YouTube was written in | Python. Instagram is a Django app. Pinterest serves 450M | monthly users as a Python app. As far as I know Python was a | key language for the backend of some other huge web scale | products like Lyft, Uber, and Robinhood. | | All those namedrops mean and matter nothing. Hacking together | proofs of concept is a time-honoured tradition, as is pushing | to production hacky code that's badly stitched up. Who knows | if there was any technical analysis to pick Python over any | alternative? Who knows how much additional engineering work | and additional resources were required to keep that Python | code from breaking apart in production? I mean, Python always | figured very low in webapp framework benchmarks. Did that | change just because <trendy company> claims it used Python? | | Also serving a lot of monthly users says nothing about a tech | stack.
It says a lot about the engineering that went into | developing the platform. If a webapp is architected so that | it can scale well to meet its real-world demand, even after | paying a premium for the poor choice of tech stack some guy | who is no longer around made in the past for god knows what | reason, what would that say about the tech stack? | dataflow wrote: | I don't think "I could use tool X for job Y" implies "X was | the right tool for job Y". You could commute with a truck to | your workplace 300 feet away for 50 years straight and I | would still argue you probably used the wrong tool for the | job. "Wrong tool" doesn't imply "it is impossible to do | this", it just means "there are better options". | ChrisLomont wrote: | >the most successful products in the world did it anyway | | A few successful projects in the world did it. There are likely | far more successful products that didn't use it. | | The key metric along this line is how often each language | allows success to some level and how often they fail | (especially when due to the choice of language). | | >should be lauded rather than scrutinized | | One can do both at the same time. | sjtindell wrote: | Instagram has one billion monthly users generating $7 | billion a year. There are almost zero products on earth as | successful. | arinlen wrote: | > Instagram has one billion monthly users generating $7 | billion a year. | | Doesn't Instagram serve mostly static content that's put | together in an appealing way by mobile apps? I'd figure | Instagram's CDN has far more impact than whatever Python | code it's running somewhere in its entrails. | | Cargo cult approaches to tech stacks don't define | quality. | xboxnolifes wrote: | The point is that it's still _one_ project. You need to | count the failures as well to rule out survivorship bias.
| slt2021 wrote: | Just compare Instagram written in Python to Google Wave, | Google+ or any other Google social media product, written in | C++/Java :)))) | jeremycarter wrote: | And you can put $7 billion of effort into tweaking your | Python application's performance? | fddhjjj wrote: | > The key metric along this line is how often each language | allows success to some level and how often they fail | | How does Python score on these key metrics? | w1nk wrote: | > There's this interesting dissonance where all the second | year CS students and their professors agree it's the wrong | tool for the job yet the most successful products in the | world did it anyway. | | > I guess you could interpret that to mean all these people | building these products made a bad choice that succeeded | despite using Python but I'd interpret it as another instance | of Worse is Better. Just like Linus was told monolithic | kernels were the wrong tool for the job but we're all running | Linux anyway. | | This isn't the correct perspective or takeaway. The 'tool' | for the job when you're talking about building/scaling a | website changes over time as the business requirements shift. | When you're trying to find market fit, iterating quickly | using 'RAD'-style tools is what you need to be doing. Once | you've found that fit and you need to scale, those tools will | need to be replaced by things that are capable of scaling | accordingly. | | Evaluating this binary right choice / wrong choice only makes | sense when qualified with a point in time and/or scale. | digisign wrote: | The folks that work on performance are not the folks working on | packaging. Shall we stop their work until the packaging team | gets in gear? | rmbyrro wrote: | Totally agree that performance is not on my top 10 wish list | for Python. | | But I disagree on " _not the right jobs for the tool_ ".
| | Python is extremely versatile and can be used as a valid tool | for a lot of different jobs, as long as it fits the _job | requirements_, performance included. | | It doesn't require a CS degree to know that fitting _job | requirements_ and other factors like the team expertise, speed, | budget, etc., are more important than fitting a theoretical | sense of "right jobs for the tool". | blagie wrote: | > It doesn't require a CS degree to know that fitting job | requirements and other factors like the team expertise, | speed, budget, etc, are more important than fitting a | theoretical sense of "right jobs for the tool". | | It requires experience. | | A lot of those lessons only come after you've seen how much | more expensive it is to maintain a system than to develop | one, and how much harder people issues are than technical | issues. | | A CS degree, or even a junior developer, won't have that. | moffkalast wrote: | Python can do just about anything... but it will take its | time doing it. | pjmlp wrote: | Agreed, my only use for Python since version 1.6 is portable | shell scripting or when sh scripts get too complicated. | | Anything beyond that, there are compiled languages with REPLs | available. | mrtranscendence wrote: | What compiled languages do you have in mind? I suppose | technically there are REPLs for C or Rust or Java, but I | wouldn't consider them ideal for interactive programming. | Functional programming might do a bit better -- Scala and | GHCi work fine interactively. Does Go have a REPL? | eatonphil wrote: | > compiled languages | | Might be tripping you up. Very few languages require that | _implementations_ be compiled or interpreted. For most | languages, having a compiler or interpreter is an | implementation decision. | | I can implement Python as an interpreter (CPython) or as a | compiler (mypyc). I can implement Scheme as an interpreter | (Chicken Scheme's csi) or as a compiler (Chicken Scheme's | csc).
The list goes on: Standard ML's Poly/ML | implementation ships a compiler and an interpreter; OCaml | ships a compiler and an interpreter. | | There are interpreted versions of Go like | https://github.com/traefik/yaegi. And there are natively | AOT-compiled versions of Java like GraalVM's native-image. | | For most languages there need be no relationship at all | between compiler vs interpreter, static vs dynamic, strict | or no typing. | pjmlp wrote: | Java, C#, F#, Lisp variants, and C++. | | Eclipse has had Java scratchpads for ages, Groovy also works | out for trying out ideas, and nowadays we have jshell. | | F# has a REPL in the ML lineage, and nowadays C# also shares a | REPL with it in Visual Studio. | | Lisp variants have been at it for 60 years. | | For C++, there are hot-reload environments, scripting variants, | and even C and C++ debuggers can be quite interactive. | | I used GDB in 1996, alongside XEmacs, as a poor man's REPL | while creating a B+Tree library in C. | | Yes, there are Go interpreters available, | | https://github.com/traefik/yaegi | blagie wrote: | I want a common language I can work with. Right now, Python is | the only tool which fits the bill. | | A critical thing is Python does numerics very, very well. With | machine learning, data science, and analytics being what they | are, there aren't many alternatives. R, Matlab, and Stata won't | do web servers. That's not to mention wonderful integrations | with OpenCV, torch, etc. | | Python is also competent at dev-ops, with tools like Ansible, | Fabric, and similar. | | It does lots of niches well. For example, it talks to hardware. | If you've got a quadcopter or some embedded thing, Python is | often a go-to. | | All of these things need to integrate. A system with | Ruby+R+Java will be much worse than one which just uses Python. | From there, it's network effects. Python isn't the ideal server | language, but it beats a language which _just_ does servers.
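For readers who haven't seen it, the numerics ergonomics blagie alludes to look like this in NumPy (one caveat: in NumPy, `*` is the elementwise product; the matrix product is the `@` operator added in Python 3.5):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

print(A @ B)   # matrix product -- reads like the math
print(A * B)   # elementwise product, same one-character spelling

# Change the data type and the same expression still works:
C = (A + 1j) @ B   # complex matrices, no code changes
print(C.dtype)     # complex128
```

The dispatch that makes `@` work across float, complex, and integer arrays without code changes is exactly the "things just work when I change data types" property discussed further down the thread.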
| | As a footnote, Python does package management much better than | alternatives. | | pip+virtualenv >> npm + (some subset of require.js / rollup.js | / ES2015 modules / AMD / CommonJS / etc.) | | JavaScript has finally gone from a horrible, no-good, bad | language to a somewhat competent one with ES2015, but it has at | least another 5-10 years before it can start to compete with | Python for numerics or hardware. It's a sane choice if you're | front-end heavy, or mobile-heavy. If you're back-end heavy | (e.g. an ML system) or hardware-heavy (e.g. something which | talks to a dozen cameras), Python often is the only sane | choice. | Denvercoder9 wrote: | > As a footnote, Python does package management much better | than alternatives. | | If you use it as a scripting language, that might very well | be the case (it's at least simpler). When you're building | libraries or applications, no, definitely not. It's a huge | mess, and every 3 years or so we get another new tool that | promises to solve it, but just ends up creating a bigger | mess. | whimsicalism wrote: | I think poetry actually does solve it | Thrymr wrote: | Oh, there are a half dozen different tools that solve | python package management. Unfortunately, they are | mutually incompatible and none solve it for all use | cases. | whimsicalism wrote: | > it has at least another 5-10 years before it can start to | compete with Python for numerics or hardware | | More, given that no language competes at high-level numerics | with Python outside of Julia and numerics in general only | adds C++. | Robotbeat wrote: | Fortran >:D | whimsicalism wrote: | For low-level, fair. I only know of people in astronomy | academia who actually use it nowadays though. | DeathArrow wrote: | >. R, Matlab, and Stata won't do web servers. | | Not unless they're pushed to, like Python was. | | >A critical thing is Python does numerics very, very well. | | That's not Python doing numerical stuff. 
That's C code, | called from Python. | fractalb wrote: | > Not unless they're pushed to, like Python was. | | Readability of code and ease of use is a big thing. It's | not just about pushing hard till we make it. | | edit: formatting | jonnycomputer wrote: | I wouldn't want to do a web server in MATLAB. I like | MATLAB, but no, not that. | mrtranscendence wrote: | > That's not Python doing numerical stuff. That's C code, | called from Python. | | That's sort of a distinction without a difference, isn't | it? Python can be good for numeric code in many instances | because someone has gone through the effort of implementing | wrappers atop C and Fortran code. But I'd rather be using | the Python wrappers than C or especially Fortran directly, | so it makes at least a little sense to say that Python | "does numerics [...] well". | | > Not unless they're pushed to, like Python was. | | R and Matlab, maybe. A web server in Stata would be a | horrible beast to behold. I can't imagine what that would | look like. Stata is a _terrible_ general-purpose language, | excelling only at canned econometrics routines and | plotting. I had to write nontrivial Stata code in grad | school and it was a painful experience I'd just as soon | forget. | disgruntledphd2 wrote: | You can do web stuff in R, but it's a lot harder than it | needs to be. R sucks for string interpolation, and a lot | of web-related stuff is string interpolation. | mrtranscendence wrote: | Yeah, I'm not surprised by that. The extent of my web | experience in R is calling rcurl occasionally, so I've | never tried and failed to do anything complicated. | blagie wrote: | It's not C code. It calls into a mixture of C, CUDA, | Fortran, and a slew of other things. Someone did the work | of finding the best library for me, and integrating them. | | As for me, I write: | | A * B | | It multiplies two matrices. C can't do that. In C, I'd have | some unreadable matrix64_multiply(a, b). Readability is a | big deal.
Math should look more-or-less like math. I can | handle 2^4, or 2**4, but if you have mpow(2, 4) in the | middle of a complex equation, the number of bugs goes way | up. | | I'd also need to allocate and free memory. Data wrangling | is also a disaster in C. Format strings were a really good | idea in the seventies, and were a huge step up from BASIC. | For 2022? | | And for that A * B? If I change data types, things just | work. This means I can make large algorithmic changes | painlessly. | | Oh, and I can develop interactively. IPython and Jupyter | are great calculators. Once the math is right, I can copy | it into my program. | | I won't even get started on things like help strings and | documentation. | | Or closures. Closures and modern functional programming are | huge. Even in the days of C and C++, I'd rather do math in | a Lisp (usually, Scheme). | | I used to do numerics in C++, and in C before that. It's at | least a 10x difference in programmer productivity stepping | up to Python. | | Your comment sounds like someone who has never done | numerical stuff before, or at least not serious numerical | stuff. | danuker wrote: | > the number of bugs goes way up | | In case you are forced to use the unreadable long-named | unintuitively-syntaxed methods, add unit tests, and check | that input-output pairs match with whatever formula you | started with. | tomrod wrote: | Yet, Python (and most of her programmers including data | scientists, of which I am one) stumble with typing. | if 0.1 + 0.2 == 0.3: | print('Data is handled as expected.') | else: | print('Ruh roh.') | | This fails on Python 3.10 because floats are not | decimals, even if we really want them to be. So most | folks ignore the complexity (due to naivety or | convenience) or architect appropriately after seeing | weird bugs. But the "Python is easiest and gets it right" | notion that I'm often guilty of has some clear edge | cases. | fnord123 wrote: | This is an issue for accountancy.
Many numerical fields | have data coming from noisy instruments so being lossy | doesn't matter. In the same vein as why GPUs offer f16 | typed values. | dullcrisp wrote: | Why would you want decimals for numeric computations | though? Rationals might be useful for algebraic | computations, but that'd be pretty niche. I'd think | decimals would only be useful for presentation and maybe | accountancy. | tomrod wrote: | Well, for starters folks tend to code expecting | 0.1+0.2=0.3, rather than abs(0.3-0.2-0.1) < | tolerance_value | | Raw floats don't get you there unfortunately. | gjm11 wrote: | They also expect 1/3 + 1/3 + 1/3 == 1. Decimals won't | help with that. | kbenson wrote: | That's slightly different in that most programmers won't | read 1/3 as "one third" but instead "one divided by | three", and interpret that as three divisions added | together, and the expectations are different. Seeing a | constant written as a decimal invites people to think of | them as decimals, rather than the actual internal | representation, which is often "the float that most | closely represents or approximates that decimal". | dekhn wrote: | https://docs.python.org/3/library/decimal.html | tomrod wrote: | Correct! Many python users don't know about this and | similar libraries that assist with data types. Numpy has | several as well. | kbenson wrote: | > As a footnote, Python does package management much better | than alternatives | | No offense meant, but that sounds like the assessment of | someone that has only experienced really shitty package | management systems. PyPI has had their XMLRPC search | interface disabled for months (a year?) now, so you can't | even easily figure out what to install from the shell and | have to use other tools/a browser to figure it out. | | Ultimately, I'm moving towards thinking that most scripting | languages actually make for fairly poor systems and admin | languages. 
It used to be the ease of development made all the | other problems moot, but there have been large advances in | compiled language usability. | | For scripting languages you're either going to follow the | path of Perl or the path of Python, and they both have | their problems. For Perl, you get amazing stability at the | expense of eventually the language dying out because there's | not enough new features to keep people interested. | | For Python, the new features mean that module writers want to | use them, and then they do, and you'll find that the system | Python you have can't handle what modules need for things you | want to install, and so you're forced to not just have a | separate module environment, but fully separate pythons | installed on servers so you can make use of the module | ecosystem. For a specific app you're shipping around this is | fine, but when maintaining a fleet of servers and trying to | provide a consistent environment, this is a big PITA that you | don't want to deal with when you've already chosen a major | LTS distro to avoid problems like this. | | Compiling a scripting language usually doesn't help much | either, as that usually results in extremely bloated binaries | which have their own packaging and consistency problems. | | This is a cyclical problem we've had so far. A language is used | for admin and system work, the requirements of administrators | grate up against the usage needs of people that use the | language for other things, and it fails for non-admin work | and loses popularity and gets replaced by something more | popular (Perl -> Python) or it fails for admin work because | it caters to other uses and eventually gets replaced by | something more stable (what I think will happen to Python, | what I think somewhat happened to bash earlier for slightly | different reasons). | | I'm not a huge fan of Go, but I can definitely see why people | switch to it for systems work.
It alleviates a decent chunk | of the consistency problems, so it's at least better in that | respect. | jonnycomputer wrote: | >No offense meant, but that sounds like the assessment of | someone that has only experienced really shitty package | management systems. PyPI has had their XMLRPC search | interface disabled for months (a year?) now, so you can't | even easily figure out what to install from the shell and | have to use other tools/a browser to figure it out. | | Yes, this is, frankly, an absurd situation for python. | | And then there is the fact that I end up depending on | third-party solutions to manage dependencies. Python is | big-time now; stop the amateur hour crap. | the__alchemist wrote: | I agree! Here's a related point: Rust seems ideal for web | servers, since it's fast, and is almost as ergonomic as Python | for things you listed as cumbersome in C. So, why do I use | Python for web servers instead of Rust? Because of the robust | set of tools Django provides. When evaluating a | language, fundamentals like syntax and performance are one | part. Given web server bottlenecks are I/O limited (mitigating | Python's slowness for many web server uses), and that I'd have | to reinvent several wheels in Rust, I use Python for current | and future web projects. | | Another example, with a different take: MicroPython, on | embedded. The only good reason I can think of for this is to | appeal to people who've learned Python, and don't want to learn | another language. | rootusrootus wrote: | > the problem is using Python at all for a web server | | I don't agree with this. Maybe for a web server where | performance is really going to matter down to the microsecond, | and I've got no other way to scale it. I write server code in | both Javascript and Python, and despite all of my efforts I | still find that I can spin up a simple site in something like | django and then add features to it much more easily than I can | with node.
It just has less overhead, is simpler, lets me get | directly to what I need without having to work too hard. It's | not like express is _hard_ per se, but python is such an easy | language to work with and it stays out of my way as long as | I'm not trying to do exotic things. | | And then it pays dividends later, as well, because it's really | easy for a python developer to pick up code and maintain it, | but for JS it's more dependent on how well the original | programmer designed it. | srcreigh wrote: | The problem with Django services is the insanely low | concurrency level compared to other server frameworks | (including node). | | Django is single request at a time with no async. The | standard fix is gunicorn worker processes, but then you | require entire server memory * N memory instead of | lightweight thread/request struct * N memory for N requests. | | I shudder to think that whenever a Django server is doing an | HTTP request to a different service or running a DB query, | it's just doing nothing while other requests are waiting in | the gunicorn queue. | | The difference is if you have an endpoint with 2s+ queries | taking 2s for one customer, with Django, it might cause the | entire service to stall for everybody, whereas with a decent | async server framework other fast endpoints can make progress | while the 2s ones are slow. | pdhborges wrote: | You can configure gunicorn to use multiple threads to | recover quite a bit of concurrency in those scenarios and | that is enough for many applications. | srcreigh wrote: | What threading/workers configuration do you use? | | I'm looking at a page now which recommends 9 concurrent | requests for a Django server running on a 4 core | computer. | | Meanwhile node servers can easily handle hundreds of | concurrent requests. | pdhborges wrote: | We use the ncpu * 2 + 1 formula for the number of workers | that serve API requests.
| | I don't think in 'handling x concurrent requests' terms | because I don't even know what that means. Usually I | think about throughput, latency distributions and the number | of connections that can be kept open (for servers that | deal with web sockets). | | For example if you have the 4 core computer and you have | 4 workers and your requests take around 50ms each you can | get to a throughput of 80 requests per second. If the | fraction of request time for IO is 50% you can bump your | thread count to try to reach 160 requests per second. Note | that in this case each request consumes 25ms of CPU so | you would never be able to get more than 40 requests per | second per CPU whether you are using node or python. | manfre wrote: | Django has async support for everything except the ORM. | async db is possible without the ORM or by doing some | thread pool/sync to async wrapping. A PR for that was under | review last I checked. | | Either way, high concurrency websites shouldn't have | queries that take multiple seconds and it's still possible | to block async processes in most languages if you mix in a | blocking sync operation. | dirnctiwnsidj wrote: | This sounds like sour grapes. Python is a general-purpose | language. Languages like Awk and Perl and Bash are clearly | domain-specific, but Python is a pretty normal procedural | language (with OO bolted on). The fact that it is dynamic and | high-level does not mean it is unsuited for applications or the | back-end. People use high-level dynamic languages for servers | all the time, like Groovy or Ruby or, hell, even Node.js. | | What about Python makes it unsuitable for those purposes other | than its performance? | make3 wrote: | I'm not sure it's very relevant to say, in a discussion of | "how do we improve Python", that the answer is "don't use Python". | People have all kinds of valid reasons to use Python.
Let's keep this on topic, please. | heavyset_go wrote: | > _Also up there is pip and just the general mess of | dependencies and lack of a lock file._ | | You can use pyproject.toml or requirements.txt as lock files, | Poetry can use the former and poetry.lock files, as well. | marius_k wrote: | > and lack of a lock file | | Is it possible to solve your problem using pip freeze? | robotsteve2 wrote: | The world doesn't revolve around web development. It's not the | only use case. Scientific Python is huge and benefits | tremendously from the language being faster. If Python can be | 1% faster, that's a significant force multiplier for scientific | research and engineering analysis/design (in both academia and | industry). | mrtranscendence wrote: | Because most of the really huge scientific Python libraries | are written as wrappers over lower-level language code, I'd | be curious to what extent speeding up Python by, say, 10% | would speed up "normal" scientific Python code on average. | 1%? 5%? | animatedb wrote: | If you are talking about large sets of numbers, then the | speed up will be far below 1%. | DeathArrow wrote: | >The first topic he raised, "why Python is slow", is somewhat | divisive | | What dynamic, interpreted, single threaded language is fast? | baisq wrote: | Practically every other language that ticks those boxes is | faster than Python. | bsder wrote: | > What dynamic, interpreted, single threaded language is fast? | | Javascript. End of list. | | The problem is that a Javascript implementation is now _so_ | complicated that you can't develop a new one without massive | investment of resources. | brokencode wrote: | As far as interpreted languages go, Wren is pretty quick, but | still not fast compared to compiled languages. | | But for dynamic, single threaded languages, JavaScript is | famously fast with a modern JIT compiler like V8. | Qem wrote: | Lua (LuaJIT implementation). Some Smalltalk VMs are also quite | fast.
For example, see Eliot Miranda's work on CogVM. | astrobe_ wrote: | You're pushing it a bit too far if you say that JIT is | interpreted. | | To answer OP, if you replace "dynamic" by "untyped", Forth | qualifies. And it actually can go where there's no JIT to | save your A from the "just throw more hardware (and software) | at the problem" mindset. | Qem wrote: | I think someone once said dynamic langs must cheat to be | performant. Jitted runtimes are just interpreters cheating. | DeathArrow wrote: | What's wrong with using the right tool for the right job? Python | for utility scripts, Javascript for Web frontend, C and C++ for | system programming, C# for Web backend, R for statistical stuff | and data analysis? | | It seems to me some guys learned a language suited to a thing and | instead of learning other languages better fitted for other | purposes, they push for their one and only language to be used | everywhere, resulting in delays and financial losses. | | It's not very hard to learn another language. Or, if you are that | lazy, you can stay with the language you know and use it for what | it was intended for. | ReflectedImage wrote: | Python is dominant on web backend, statistical stuff and data | analysis nowadays. | supreme_berry wrote: | Dumbest comment on the thread? | dijit wrote: | As a very partial, almost unrelated question: Is there any python | module that you use day-to-day that you'd like to have a | significant speedup with? | | I'm thinking of reimplementing some python modules in rust, as | that seems like the kind of weird thing I'm in to. I've done it | with some success (using the excellent work of the pyo3 project) | professionally, but I'd be interested in doing more. | yedpodtrzitko wrote: | Pydantic is a quite popular library. Its author is doing exactly | this - rewriting its core [0] in Rust. It's still WIP, but the | readme mentions that "Pydantic-core is currently around 17x | faster than Pydantic Standard."
| | [0] https://github.com/samuelcolvin/pydantic-core | tclancy wrote: | Not working in Python right now, but I have 15 years of Python | + Django on the web and while there are any number of attempts | at this (I keep a list at | https://pinboard.in/u:tclancy/t:json/t:python/), any | improvement in JSON serialization and unserialization speeds is | a huge boon to projects. I am trying to think of similar | bottlenecks where a drop-in replacement can be a huge | performance improvement. | JackC wrote: | The missing thing last time I looked was a fast python json | library that's byte-compatible with stdlib -- same inputs, | same outputs. There are good fast options but they tend to | add some (perfectly reasonable) limitation like fixed | indentation size, for the sake of speed, that blocks them | from being dropped into an existing public API. | dotnet00 wrote: | Definitely matplotlib. Navigating image plots in interactive | mode with even just 10000x10000 pixels is painfully slow. While | I've picked up some alternatives, they don't feel as clean as | matplotlib. | wcunning wrote: | 10000% -- matplotlib for visualization of a lot of different | data I've looked at, but esp things like high res images in | machine learning contexts is incredibly slow, even on good | computers. It does fine for small vector stuff and render | once and save graphs, but it's bad for what a lot of people | use it for. | mritchie712 wrote: | pandas | curiousgal wrote: | I remember, when trying to squeeze some performance out of | it, that a lot of the overhead came from it trying to infer | types. | w-m wrote: | This is a curious reply for me. I would think that there are | very few parts in pandas that could be sped-up by | reimplementing them with a compiled language. Pandas is | plenty fast for the built-in methods, it only gets slow when | you start interfacing with Python, e.g. by doing an `.apply` | with your custom Python method. 
Obviously this interfacing | part is impossible to speed up by reimplementing parts of | pandas (you'd need a different API instead). | mynameis_jeff wrote: | I'd give https://github.com/modin-project/modin a shot | SnooSux wrote: | It's been done: https://github.com/pola-rs/polars | | But I'm sure there's always room for improvement | rytill wrote: | It's not like polars is a drop-in replacement, it has a | totally different API. | mrtranscendence wrote: | You wrote "it has a totally different API", did you mean | "it has an actually sane API?" Because that's what I | think of when I compare pandas to polars. | fgh wrote: | The answer would then be to have a look at polars. | zmgsabst wrote: | You'd be awesome if you wrote a library for large image | processing. | | You can make large Numpy arrays fine -- eg, 20k x 20k or 500k x | 500k, but trying to render that to anything but SVG or manual | tilings pukes badly. | | That's my main blocker on rendering high dimensional shapes: | you can do the math, but visualizations immediately fall over | (unless you do tiling yourself). | | There's probably someone with a more useful idea than | "gigapixel rendering" though. | DeathArrow wrote: | As I see it, Python is good for glue code and small scripts where | performance usually doesn't matter. Even if it would be more | performant, it would be a nightmare for large code bases since | it's dynamically typed. | | I really enjoy Nim which is "slick as Python, fast as C". | supreme_berry wrote: | You wouldn't believe how many near-FAANGs have hundreds of | large backend services on Python without any issues and from | times where typing was in docstrings. | baisq wrote: | Because they have insane amounts of money that they can throw | at the machines. | bjourne wrote: | I once had a database-backed website serving 50k unique | visitors/day written in Django and hosted on a low-budget | vps. Worked like a charm with very few hiccups.
| CraigJPerry wrote: | I was curious so I had a bash at comparing the cost of just | buying another server to throw at the problem vs telling a | FAANG dev to optimise the code. | | A dedicated 40core / 6Tb server is around $2k but will be | amortized over the years of its life. It needs power, | cooling, someone to install it in a rack, someone to | recycle it afterwards, ..., around $175/yr | | A FAANG dev varies wildly but $400k seems fair-ish (given | how many have TC > 750k). | | So that's about 12 hours of time optimising the code vs | throwing another 40c / 6Tb machine at the problem for 365 | days. | | The big cost I'm missing out of both the server and the | developer is the building they work in. What's the recharge | for a desk at a FAANG, $150k/yr ? I have no idea how much a | rack slot works out at. | | Unless I've screwed up the figures anywhere, we should | probably all be looking at replacing Python with Ruby if we | can squeeze more developer productivity! | SuaveSteve wrote: | Why not switch to making __slots__ in classes the default and | then making attribute changes to an object during runtime an | opt-in? It will require a long grace period but wouldn't it help | optimisation efforts immensely? | BurningFrog wrote: | Where can I read about what kind of performance improvements | `__slots__` brings? | WillDaSilva wrote: | The Python docs themselves are a good place to start: | https://docs.python.org/3/reference/datamodel.html#slots | | The Python wiki also has some good info about it: | https://wiki.python.org/moin/UsingSlots | gjulianm wrote: | That's going to require quite a lot of changes, it's a giant | breaking change. All classes would need someone to go around | finding all the attributes that are created and adding a | __slots__ declaration, to avoid regular attribute initialization | in __init__ failing. It's a massive task, and it would | completely break backwards compatibility for performance gains | that not everybody will need.
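As background for the `__slots__` subthread above, here is a minimal sketch of what declaring `__slots__` actually changes; the `Point` class is a made-up example, not something from the thread:

```python
# With __slots__, instances get fixed storage for the named attributes
# instead of a per-instance __dict__, which saves memory and makes the
# attribute layout predictable for the interpreter.
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
assert p.x == 1 and p.y == 2
assert not hasattr(p, "__dict__")  # no per-instance dict is allocated

# Creating attributes outside __slots__ fails -- this is exactly the
# dynamism the proposal above would make opt-in rather than default.
try:
    p.z = 3
except AttributeError as exc:
    print("rejected:", exc)
```

Whether this translates into a measurable speedup depends on the workload; the Python docs and wiki pages linked above discuss the memory and lookup effects in detail.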
| anamax wrote: | A default __slots__ breaks a lot of monkey patching. | | An "easier" change would be to add a class attribute | "no__dict__", which says that the __dict__ attribute can't be | used, which lets the implementation do whatever it wants. That | can be incrementally added to classes. | | Another option is a "no__getattr__" attribute, which disables | getattr and friends. | yedpodtrzitko wrote: | That would mean all installed dependencies need to comply with | this change as well, which is unlikely to happen in any | realistic timeframe. | [deleted] | g42gregory wrote: | 15 years ago I remember reading Guido van Rossum saying that | Python is a connector language and if you need performance, just | drop into C and write/use a C module. I thought it was crazy at | the time, but now I see that he was absolutely right. It took a | while, but now Python has a high-performing C module for pretty | much every task. | chrisseaton wrote: | But these don't compose, right? Each is a black-box to each | other? A black-box add and a black-box multiply don't fuse. | kylebarron wrote: | They can! Numpy exposes a C API to other Python programs [0]. | It's not hard to write a Cython library that uses the Numpy C | API directly and does not cross into Python [1]. | | [0]: https://numpy.org/doc/stable/reference/c-api/index.html | | [1]: https://github.com/kylebarron/pymartini/blob/4774549ffa2051c... | chrisseaton wrote: | So they can if you use their specific API? It doesn't | naturally compose in conventional Python code? | dekhn wrote: | C and Python are not black boxes to each other. The entire | python interpreter is literally a C API. You can create | pyobjects, add heterogenous PyObjects to PyLists, etc. So | everything in Python can be introspected from C.
| | Turned around, Python has arbitrary access into the C | programming space (really, the UNIX or Windows process it's | running inside), so long as it has access to headers or other | type info it can see C with more than black box info. | | Most python numerics is implemented in numpy; the low levels of | numpy are actually (or were) effectively a C API implementing | multiple dimension arrays, with a python wrapper. | klyrs wrote: | You're talking past chrisseaton's point here. If you want | two C extensions to interoperate with bare-metal | performance, you can't just do
|             from lib1 import makedata
|             from lib2 import processdata
|             data = makedata()
|             print(processdata(data))
| | Because makedata needs to provide a c->py bridge and | processdata needs a py->c bridge, so your process | inherently has python in the middle unless lib2 has | intimate knowledge of lib1. It can absolutely be done (I've | written plenty of c extensions that handle numpy arrays, | for example) but if somebody hasn't done the work, you | don't get it for free. If your c extension expects a list | of lists of floats, the numpy array totally supports that | interface... but (last I checked) the overhead there makes it | way slower than calling list(map(list, data)) and throwing that | into your numpy-naive c extension. | chrisseaton wrote: | > C and Python are not black boxes to each other | | Yes they are - the Python interpreter knows _nothing_ about what | your C extension does. It can't optimise it because all it | has is your machine code - no higher level logic. | n8ta wrote: | Print doesn't have to be re-resolved on every access... Not sure | about python but many interpreters do a resolution pass that | matches declarations and usages (and decides where data lives, | stack, heap, virtual register, whatever) | SnowflakeOnIce wrote: | In Python semantics, indeed, 'print' does need to be looked up | each time!
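SnowflakeOnIce's point is easy to demonstrate: the name `print` is resolved at call time (module globals first, then builtins), so rebinding it changes what existing bytecode calls. A small sketch, with `greet` as a made-up example function:

```python
import builtins

def greet():
    # "print" is looked up on every call; nothing was bound at
    # compile time, so the resolution cannot be skipped naively.
    print("hello")

greet()  # calls the built-in print

captured = []
original = builtins.print
builtins.print = lambda *args, **kwargs: captured.append(args)
try:
    greet()  # the same bytecode now calls the replacement
finally:
    builtins.print = original  # always restore the real print

assert captured == [("hello",)]
```

This is why an interpreter must either re-resolve the name each time or guard a cached resolution against exactly this kind of rebinding.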
| dataflow wrote: | > Python can quickly check to see if they are using the dynamic | features | | I don't understand how this is supposed to be "quickly" | verifiable? | | Nothing prevents you from doing eval('gl' + 'obals')()['len'] = | ...; how is the interpreter supposed to quickly check that this | isn't the case when you're calling a function that might not even | be in the current module? | | Doing this correctly would seem to require a ton of static | analysis on the source or bytecode that I imagine will at _best_ | be slow, and at worst impossible due to the halting problem. | [deleted] | kmod wrote: | Python dictionaries now have version counters that track how | many times they were modified, so the quick check is to ask | "was len not overridden last time and is the number of | modifications to the globals the same as it was last time". | gpderetta wrote: | One possibility is to move the cost to the assignment, so the | code that assigns a new value to the global 'len' function is | going to track and invalidate all cached lookups. Hopefully you | are changing the binding of 'len' less often than you are | calling it :) | kmod wrote: | Cinder does this (invalidation), and both Faster CPython and | Pyston use guarding. | gpderetta wrote: | Right, of course, guarded devirtualization is a common | technique. | [deleted] | bootwoot wrote: | I was reading this as an undetailed description of state | available WITHIN the interpreter. Probably there is a table of | globals that you can simply check last modification on or | something like this. Whether you hit it with eval or some other | tricky code, you can't modify a global without the interpreter | knowing about it. | dataflow wrote: | If that's what they mean, how would that be any faster than | what's going on right now? I thought normally when you hit a | callable, the interpreter would just look up its name, check | to see if it's a built-in, and then call the built-in if | so...
whereas in this case you'd still have to look up the | name of the callable (is the idea to bypass this somehow? | what do they do currently?), check to see if it's different | than the built-in you'd _expect_ from the name (i.e. if it's | ever been reassigned to), then call that expected built-in if | it's not... which seems like the same thing? At best it would | seem to convert 1 indirect call to a direct call, which would | be negligible for something like Python. Is the current | implementation somehow much slower than I'm imagining? What | am I missing? | the-lazy-guy wrote: | You could do something like a primitive inline cache. Store the | "version" of the globals in another variable. Each time | globals are modified - bump the version. For each call-site, | keep what the global name resolved to + the version of the | "globals object" in a static variable. Now you can avoid | name resolution if the version hasn't changed between two | executions of the line. Now in the fast-path you just pay the | price of a single (easily predicted, because globals almost never | change) compare and jump vs a full hash-table lookup. | dataflow wrote: | I think the core of the optimization you're mentioning | hinges on a normal lookup being a slow hashtable lookup | (of a string?)... whereas I imagined the first thing the | interpreter would do would be to intern each name and | assign it a unique ID (as soon as during parsing, say) | and use that thereafter whenever they're not forced to | use a string (like with globals()). That integer could | literally be a global integer index into a table of | interned strings, so you could either avoid hashing | entirely (if the table isn't too big) or reduce it to | hashing an int, both of which are much faster than | hashing a string. Do they not do that already? Any idea | why? I feel like that's the real optimization you'd need | if checking a key in a hashtable is the slow part (and | it's independent of whether the value is being modified).
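The version-tag scheme this subthread describes (kmod's dict version counters, the-lazy-guy's primitive inline cache) can be modelled in pure Python. This is a toy sketch with made-up class names; CPython implements the real mechanism in C, with the dict version tag specified by PEP 509:

```python
class VersionedNamespace:
    """A dict-like namespace that bumps a version tag on every write."""
    def __init__(self):
        self._names = {}
        self.version = 0

    def assign(self, name, value):
        self._names[name] = value
        self.version += 1            # any mutation invalidates caches

    def lookup(self, name):
        return self._names[name]     # the "slow" full lookup

class CallSite:
    """Caches one resolved name, guarded by the namespace version."""
    def __init__(self, name):
        self.name = name
        self.cached_version = -1
        self.cached_value = None

    def resolve(self, ns):
        # Fast path: a single integer compare instead of a dict lookup.
        if ns.version != self.cached_version:
            self.cached_value = ns.lookup(self.name)   # slow path
            self.cached_version = ns.version
        return self.cached_value

ns = VersionedNamespace()
ns.assign("len", len)
site = CallSite("len")
assert site.resolve(ns)([1, 2, 3]) == 3   # slow path fills the cache
assert site.resolve(ns)([1, 2]) == 2      # fast path, version unchanged
ns.assign("len", lambda seq: -1)          # monkey-patching bumps the tag
assert site.resolve(ns)([1, 2, 3]) == -1  # cache correctly invalidated
```

As gpderetta notes above, the cost moves to the (rare) assignment, while every call pays only the guard.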
| blagie wrote: | I don't think the world is quite so bad. | | x86 processors solve this by speculating about what's going on. | If you suddenly run into a 1976-era operation, everything slows | down dramatically for a bit (but still goes faster than an | 8086). If you have a branch or cache miss, things slow down a | little bit. | | One has a few possibilities: | | - A static analysis /proves/ something. print is print. You | optimize a lot. | | - A static analysis /suggests/ something. print is print, | unless redefined in an eval. You just need to go into a slow | path in operations like `eval`, so if print is modified, you | invalidate the static analysis. | | - A static or dynamic analysis suggests something | probabilistically. You can make the fast path fast, and the | slow path eventually work. If print isn't print, you raise an | internal exception, do some recovery, and get back to it. | | I'm also okay with this analysis being run in prod and not in | dev. | | As a footnote, JITs, especially in Java, show that this kind of | analysis can be pretty fast. You don't need it to work 100% of | the time. The case of a variable being redefined in a dozen | places, you just ignore. The case where I call a function from | three places which increments an integer each time, I can find | with hardly any overhead at all. The latter tends to be where | most of the bottlenecks are. | chrisseaton wrote: | > I don't understand how this is supposed to be "quickly" | verifiable? | | You don't verify, and instead you run assuming no verification | is needed. Then if someone wants to violate that assumption, | it's their problem to stop everyone who may have made that | assumption, and to ask them to not make it going forward. | | You shift the cost to the person who's doing the | metaprogramming and keep it free for everyone who isn't. | | https://chrisseaton.com/truffleruby/deoptimizing/ | marcosdumay wrote: | Hum... You are getting lost on theoretical undecidability. 
| | In the real world, when faced with a generally undecidable | problem, we don't run away and lose all hope. We decide the | cases that can be decided, and do something safe when they | can't be decided. | | In your example, Python can just re-optimize everything after | an eval. That doesn't stop it from running optimized code if | the eval does not happen. It can do even more and only | re-optimize things that the eval touched, which has some extra | benefits and costs, so may or may not be better. | | Besides, when there isn't an eval on the code, the interpreter | can just ignore anything about it. | dataflow wrote: | > You are getting lost on theoretical undecidability. [...] | We decide the cases that can be decided, and do something | safe when they can't be decided. | | I'm not lost on that at all; I'm well aware of that. That's | precisely why I wrote | | >> [...] require _a ton of static analysis_ on the source or | bytecode that I imagine will _at best_ be slow, and _at | worst_ impossible due to the halting problem | | and not | | >> static analysis is impossible in the general case so we | run away and lose all hope. | | I'm not sure how you read that sentiment from my comment. | marcosdumay wrote: | Hum... Ok. Then the answer is that most cases do not demand | as much analysis time as you expect, and the ones that | demand more still can gain something from dynamic behavior | analysis in a JIT. | | Also, you can combine the two to get something better than | any single analysis alone.
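marcosdumay's point can be shown concretely: dynamically constructed code can rebind a global, so any cached assumption about it must be invalidated after the `exec`/`eval` runs, while code paths containing no `eval` pay nothing. A tiny illustration (`double` and `scale` are made-up names):

```python
def double(x):
    return scale * x   # "scale" is re-read from globals on each call

scale = 2
assert double(10) == 20

# Code built at runtime mutates the module's globals; an optimizer
# that had assumed "scale" was constant must now fall back to the
# slow path or re-optimize.
exec("scale = 3", globals())
assert double(10) == 30
```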
| | Then for our brief all-consuming affair with Ruby, the wisdom | shifted more radically: | | "Developer productivity is paramount. Any language that delivers | computational performance is suspect from a developer | productivity standpoint." | | But looking at "high-level" languages (i.e. languages that | provide developer productivity enhancing abstraction), we can | rewind the clock to look at language families that evolved during | more resource-constrained times. | | Those languages, the lisps, schemes, smalltalks, etc. are now | really, really fast compared to Python, and rarely require | developers to shift to alternative paradigms (e.g. dropping to C) | just to deliver acceptable performance. | | Perl and Python exploded right at the time that Lisp/Scheme | hadn't quite shaken the myth that they were slow, with | Python/Perl achieving acceptable performance by having dropped to | C most of the time. | | Now the adoption moat is the wealth of libraries that exist for | Python--and it's a hell of a big moat. If I were a billionaire, | I'd hire a team of software developers to systematically review | libraries that were exemplars in various languages, and write / | improve idiomatic, performant, stylistically consistent versions | in something modern like Racket. I'd like to imagine that someone | would use those things :-) | zdw wrote: | This sounds a lot like what some Python package developers are | trying with Rust (example being the cryptography package), | which also has the unfortunate side effect of limiting support | for some less popular platforms. | edflsafoiewq wrote: | Perl/Python/Ruby grew up in the 90s, the "Bubble economy" of | the single core performance world, the likes of which had never | and probably will never be seen again on the face of the Earth. 
| In the post-Bubble world, throwing out 90% of your performance | before you even start writing code, especially when the same | dynamic features could be delivered via JIT without the cost, | seems crazy. | rockyj wrote: | So true, excellent point! I just do not understand startups | choosing Python/Ruby in 2022 when you can get most of the | features, type safety, concurrency, async and 5 times more | speed in other languages. | WJW wrote: | I don't think it is such a surprise. The ecosystems around | Rails (for Ruby) and numpy/pandas/etc (for python) are | orders of magnitude larger than you get in the modern | languages. In Rails for example, adding an entire user | management system (including niceties like password reset | mails and must-haves like proper security for obscure | vulnerabilities most people will have never heard of) is | literally a single extra line in the gemfile and two | console commands. In python the ML and numerics ecosystem | are completely beyond anything another language has to | offer at the moment, even more so when you compare the time | to get started. | | In addition, "real" performance is often tricky to measure | and may be irrelevant compared to other parts of the | system. Yes, Ruby is 10-100x slower than C. But if a user | of my web service already has a latency of (say) 200ms to | the server then it barely matters if the web service | returns a response in 5 ms or in 0.5 ms. Similarly for | rendering an email: no user will notice their email | arriving half a second earlier. Similarly for a python | notebook: if it takes 1 or 2 seconds to prepare some data | for a GPU processing job that will take several hours, it | doesn't really matter that the data preparation could have | been done in 0.1 seconds instead if it had been done in | Rust. | | _Especially_ for startups where often you're not sure if | you're building the right thing in the first place, a big | ecosystem of prebuilt libraries is super important.
If it | turns out people actually want to buy what you've made in | sufficient numbers that the inefficiency of | Ruby/Python/JS/etc becomes a problem then you can always | rewrite the most CPU-intensive parts in another language. | Most startup code will never have the problem of "too many | users" though, so it makes no sense to optimize for that | from the start. | ReflectedImage wrote: | Well, if you choose Python/Ruby you only need 1/3 of the | developers you would need with another language. | | The productivity gain is so great it outweighs everything | else. It's as simple as that. | peatmoss wrote: | Is it gauche to offer my own counterpoint? | | Another possibility is that the requirement to "drop to C" is a | virtue by de-democratizing access to serious performance. In | other words, let the commoners eat Python, while the anointed | manage their own memory. | | I personally find this argument a bit distasteful / disagree | with it, but there was a thread the other day that talked about | the, uh, variable quality of code in the Julia ecosystem (Julia | being another language where dropping to C isn't important for | performance). In Julia, the academics can just write their code | and get on with their work--the horror! | munificent wrote: | _> Those languages, the lisps, schemes, smalltalks, etc._ | | The main reason those languages got fast despite being highly | dynamic is their _very_ complex JIT VM implementations. | (See also: JavaScript.) | | The cost of that is that a complex VM is much less hackable and | makes it harder to evolve the language. (See also: JavaScript.) | | Python and Ruby have, I think, reasonably chosen to have slower | simpler implementations so that they are able to nimbly respond | to user needs and evolve the language without needing massive | funding from giant corporations in order to support an | implementation. (See also: JavaScript.) | | There are other effects at play, too, of course.
| | Once your implementation's strategy for speed is "drop to C and | use FFI", then it gets much harder to optimize the core | language with stuff like a JIT and inlining because the FFI | system itself gets in the way. Not having an FFI for JS on the | web essentially forced JavaScript users to push to make the | core language itself faster. | peatmoss wrote: | Spending a weekend or two writing a Scheme that beats Python | in performance has been a pastime for computer science | students for at least a couple of decades now. I'm not sure that | I believe that a performant Scheme implementation has more | complexity than e.g. PyPy. In fact, I'd wager the converse. | mrtranscendence wrote: | You're either exaggerating or the computer science students | you're familiar with are wizards. I've never known a | student who could write a Scheme implementation, from | scratch, in one weekend that is both complete and which | beats Python from a performance perspective. | peatmoss wrote: | If it's an exaggeration, it's not much of one. | | Two parts to your argument: | | - Writing a Scheme implementation quickly: Google "Write | a Scheme in 48 hours" and "Scheme from scratch." 48 hours | to a functioning Scheme implementation seems to be a feat | replicated in multiple programming languages. | | - Performance: I haven't benchmarked every hobby Scheme, | but given the proliferation of Scheme implementations | that, despite limited developer resources, beat (pure) | Python with its massive pool of developers (CPython, | PyPy), I still don't buy the idea that optimizing Scheme | is a harder task than optimizing Python. Again, I'd | strongly suggest that optimizing Scheme is a much easier | task than optimizing Python simply by virtue of how often | the feat has been accomplished. | eatonphil wrote: | I would not include PyPy in a list of easy-to-beat | implementations. | JulianWasTaken wrote: | Nor ones with massive pools of developers.
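The FFI pattern described above ("drop to C and use FFI") can be seen with the standard-library ctypes module: each call is an opaque jump into native code that the interpreter, or any JIT sitting on top of it, cannot inline or specialize. A POSIX-only sketch (on Windows the loading step differs):

```python
import ctypes

# dlopen(NULL): load the running process, which exposes libc symbols on POSIX.
libc = ctypes.CDLL(None)

# Declare the C signature; ctypes marshals Python ints to C ints and back.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

# Each call crosses the FFI boundary, invisible to any Python-level optimizer.
print(libc.abs(-7))  # 7
```

The marshalling on every call is exactly the kind of boundary that makes "optimize the whole program together" hard for Python JITs.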
| peatmoss wrote: | Compared to most Scheme implementations? | mrtranscendence wrote: | If you can give me an implementation that implements | almost all of R5RS, in 48 hours, beating Python in | performance, and all by a single developer, I'll tip my | hat to that guy or gal. But I can't imagine it's too | commonly done. | eatonphil wrote: | Nobody said you can implement a full Scheme | implementation in 48 hours or two weeks. That's very much | beside the point about how poor CPython performance is. | eatonphil wrote: | Substitute "computer science student" with "developer" and | it holds for me. Definitely some CS students can do it | too. Actually at my school we did have to implement a | Scheme compiler. So yeah, it's not too big of a stretch to | say. | | I think people who haven't implemented a language | underestimate how slow CPython is. And overestimate how | hard it is to build a compiler for a dynamic language. | | I think every professional developer or CS student can | and should build a compiler for a dynamic language! | mrtranscendence wrote: | But the claim was that a student could write a conformant | Scheme implementation in 48 hours that beats Python. | Clearly it's possible for a student to write a Scheme | that's faster than Python, but is it a reasonably | _complete_ Scheme done in a single weekend? | | Even I, very much a non-computer scientist, could write a | fast Scheme quickly if I could keep myself to a very | small subset, so that's not interesting to me. | eatonphil wrote: | "Conformant" is a word you introduced; they didn't say | that. | munificent wrote: | Sure, but that's because Python has objects. | | If you write an object system on top of your performant | hobby Scheme implementation, you'll likely find that the | performance of its method dispatch is about as slow as it | is in Python. Probably even slower. | | Purely procedural Python code isn't as slow as | object-oriented Python code.
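The dispatch cost behind that last point is concrete in CPython: by default, instance attributes live in an ordinary dict, and methods are resolved by dict lookups at call time that remain mutable while the program runs. A toy illustration (the Point class is invented for the example):

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def dist2(self):
        return self.x * self.x + self.y * self.y

p = Point(3, 4)

# Instance attributes are stored in a plain dict...
assert p.__dict__ == {"x": 3, "y": 4}
assert p.dist2() == 25

# ...and methods are looked up in the class's dict on every call,
# so they can be swapped out at run time (monkey-patching).
Point.dist2 = lambda self: 0
assert p.dist2() == 0
```

Every one of those lookups is work that a static language resolves at compile time, which is why fast dynamic-language VMs rely on tricks like hidden classes and inline caches to short-circuit them.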
| peatmoss wrote: | That's fair, but also the fact that we're comparing hobby | Scheme implementations to two mainstream, extremely | popular implementations of Python and setting up | conditions that force (hobby) Scheme to play to Python's | relative strengths is telling. :-) | | The Python ecosystem has certainly received a lot of | developer resources and attention the past couple of | decades. Shall we compare the performance of CLOS on | SBCL, which again has seen comparatively few developer | resources, to Python's performance in dealing with | objects? I'd take that performance wager. | Spivak wrote: | This isn't as much of a gotcha as you think. Python is | slow because the language is so dynamic and simply has to | do more behind-the-scenes work on each line. It's not | impressive that a language that does less is faster. | What's impressive is that a language that does _more_, | like JS on V8, is faster. | CraigJPerry wrote: | Is CLOS doing less than Python? | | I'm thinking CLOS has more dynamism than Python - they're | both dynamically typed, they're both doing a lookup then | dispatch, but then CLOS adds dynamism on top of that: | it's also looking in the metadata thingy (I'm not a Lisp | developer, do they call it the hash? I mean the key-value | store on every "atom" - I'm so out of my depth here, is | "atom" the right word?), plus if I remember right, the | way CLOS works, you use multiple dispatch, not just | single dispatch like Python. | igouy wrote: | > Python and Ruby have, I think, reasonably chosen to have | slower simpler implementations... | | ? | | https://shopify.engineering/yjit-just-in-time-compiler-cruby | munificent wrote: | Yes, CRuby is slowly moving towards a JIT now because | performance is a major blocker for user adoption. | | The larger Python ecosystem has tried that a number of | times too (Unladen Swallow, PyPy, etc.)
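On the multiple- versus single-dispatch point raised above: Python's standard library does ship generic functions, but functools.singledispatch selects an implementation from the type of the first argument only, whereas CLOS generic functions can dispatch on the types of all arguments. A quick sketch of the single-dispatch side (the area function and its registered types are invented for illustration):

```python
from functools import singledispatch

@singledispatch
def area(shape):
    raise TypeError(f"no area() implementation for {type(shape).__name__}")

@area.register
def _(shape: tuple):
    # Interpret a tuple as (width, height) of a rectangle.
    w, h = shape
    return w * h

@area.register
def _(shape: float):
    # Interpret a float as a circle's radius.
    return 3.141592653589793 * shape * shape

# Dispatch happens on the first (and here, only) argument's type.
assert area((3, 4)) == 12
assert round(area(1.0), 3) == 3.142
```

Dispatching on two or more argument types in Python requires a third-party library or hand-rolled type checks, which is part of why comparing its dispatch machinery to CLOS is apples to oranges.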
| | It's quite difficult since both of those languages already | lean heavily on C FFI, and having frequent hops in and out | of FFI code tends to make it harder to get the JIT fast. | JITs work best when all of the code is in the host language | and can be optimized and inlined together. | eatonphil wrote: | JavaScript the language seems to have evolved much more than | Python despite CPython's very simple implementation. | munificent wrote: | Hence my point about "massive funding from giant | corporations in order to support an implementation". :) | eatonphil wrote: | Well, almost all the JavaScript language innovation was | syntax sugar and was implemented as transforms before the | browsers implemented it. I think JavaScript devs mostly | would have been fine to keep using transforms indefinitely, | and it's just been more convenient that the browsers have | moved to implement it. | | Python could have done this easily too, but evolving as a | language just isn't as big a priority (not that I'm | saying it should be), and that's completely (or mostly) | disconnected from their backend implementation decisions. | boringg wrote: | What does "Modern" Python even mean? | digisign wrote: | Focuses on 3.8+, but 3.7 has another year of life in it. | didip wrote: | If you are building server-side applications using Python 3 and | the async APIs, and you aren't using | https://github.com/MagicStack/uvloop, you are missing out on | performance big time. | | Also, if you happen to build microservices, don't forget to try | PyPy; that's another easy performance booster (if it's compatible | with your app). | mrslave wrote: | > if it's compatible with your app | | Every time I experiment with PyPy (on a set of non-trivial web | services) I encounter at least one incompatibility with PyPy in | the dependency tree and leave disappointed. | [deleted] | s_Hogg wrote: | Great read; vaguely reminds me that someone or other was trying | to get CPython going with cosmopolitan libc.
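For reference, the uvloop suggestion above is a drop-in change: uvloop replaces asyncio's default event loop with one built on libuv. A sketch that degrades gracefully when uvloop isn't installed (the handle coroutine is a stand-in for real server work):

```python
import asyncio

try:
    import uvloop  # third-party: pip install uvloop
    uvloop.install()  # swap in the libuv-based event loop policy
except ImportError:
    pass  # the stdlib event loop still works, just slower under load

async def handle():
    # Placeholder for actual request handling.
    await asyncio.sleep(0)
    return "ok"

print(asyncio.run(handle()))
```

Because the swap happens at the event-loop-policy level, application code using asyncio doesn't change at all.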
Wonder what that would do | for speed. | make3 wrote: | Why do you think this would help performance? A quick read | says cosmo is slower than regular libc. Maybe it would be more | portable, but not faster. ___________________________________________________________________ (page generated 2022-05-05 23:00 UTC)