[HN Gopher] Pyston v2: Faster Python
       Pyston v2: Faster Python
       Author : kmod
       Score  : 163 points
       Date   : 2020-10-28 17:42 UTC (5 hours ago)
 (HTM) web link (blog.pyston.org)
 (TXT) w3m dump (blog.pyston.org)
       | jedberg wrote:
       | I'm not versed in the details of the politics of CPython, but why
       | did this project fork instead of just contributing to CPython? Is
       | CPython really slow in integrating community contributions?
       | Edit: I read the blog post closer, and found this: "Our plan is
       | to open-source the code in the future, but since compiler
       | projects are expensive and we no longer have benevolent corporate
       | sponsorship, it is currently closed-source while we iron out our
       | business model."
         | kmod wrote:
         | I would say there's a small minority of our changes that we can
         | upstream, and eventually we'd like to upstream them. For the
         | rest, I would say the different set of priorities for the two
         | projects means they probably will stay separate:
         | - In the blog post I linked to an issue of someone trying to
         | upstream a quickening implementation and meeting resistance due
         | to the added complexity
         | - CPython prefers portability over performance, and we added a
         | number of big build dependencies that may not work on all the
         | platforms that CPython supports, though from our perspective it
         | works on all the important ones
         | - CPython has included a number of performance-degrading
         | features over the years and we plan to start cutting out some
         | that are only used for debugging but still hurt release mode. I
         | don't know for sure but I would expect resistance to backing
         | out those changes.
         | There are more, but the general idea is that the two projects
         | will make different tradeoffs. Maybe there is a world where the
         | code all lives in the same repository but is gated between
         | different behaviors, but the CPython maintainers have already
         | rejected a proposal like that.
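
For readers unfamiliar with the "quickening" mentioned above, here is a minimal sketch of the idea on a toy stack machine. It is not CPython's or Pyston's code: the opcode names (`PUSH`, `ADD`, `ADD_INT`, `RET`) and the `run` function are hypothetical, made up for illustration. The interpreter rewrites a generic instruction in place into a specialized one once it has seen the operand types, and falls back if that guess later turns out wrong.

```python
# Toy illustration of quickening; all names here are hypothetical.
PUSH, ADD, ADD_INT, RET = "PUSH", "ADD", "ADD_INT", "RET"


def run(code):
    """Execute a list of mutable [opcode, arg] pairs on a small stack machine."""
    stack = []
    pc = 0
    while True:
        instr = code[pc]
        op, arg = instr
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            # Generic path: works for any operand types.
            b = stack.pop()
            a = stack.pop()
            if type(a) is int and type(b) is int:
                # Quicken: rewrite this instruction for the next execution.
                instr[0] = ADD_INT
            stack.append(a + b)
        elif op == ADD_INT:
            # Specialized path: a real VM would skip generic dispatch here.
            b = stack.pop()
            a = stack.pop()
            if not (type(a) is int and type(b) is int):
                # Guess was wrong: fall back to the generic instruction.
                instr[0] = ADD
            stack.append(a + b)
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1


program = [[PUSH, 2], [PUSH, 3], [ADD, None], [RET, None]]
first = run(program)   # runs the generic ADD and quickens it
second = run(program)  # runs the specialized ADD_INT
```

In pure Python the specialized branch is no faster. The sketch only shows the bookkeeping (in-place rewriting plus a guard) that adds the complexity the bullet above refers to.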
         | samcolvin wrote:
         | Victor Stinner (Python core developer focused on performance)
         | has a great speech about this question:
         | There are a lot of reasons actually
         | - Performance limited by old CPython design. If you fork it you
         | have to deal with all the legacy code.
         | - CPython is limited to 1 thread because of the GIL.
         | - Specific memory allocators, C structures, reference counting,
         | specific garbage collector etc.
         | You can find that video in here: https://youtu.be/TXRPCZ7Nmh4
         | simonh wrote:
         | They intend to monetise it.
           | [deleted]
         | dwheeler wrote:
         | As far as I know CPython is quite welcoming.
         | However, per the blog post, this version of Pyston is closed
         | source. So CPython won't be interested, and many others won't
         | be interested either.
           | rrdharan wrote:
           | Python is a welcoming community, but they are decidedly not
           | welcoming of contributions that significantly increase the
           | complexity of the CPython reference implementation.
           | This has been discussed extensively:
           | https://news.ycombinator.com/item?id=11125769
             | orf wrote:
             | I might have to eat those old words as there is some form
             | of runtime code specialisation being considered for
             | inclusion in CPython. It doesn't sound like a fully-fledged
             | JIT, but that might be a distinction without a difference.
               | ak217 wrote:
               | Link?
               | orf wrote:
               | https://github.com/markshannon/faster-cpython/blob/master/pl...
               | And some discussion:
               | https://news.ycombinator.com/item?id=24848318
               | ak217 wrote:
               | Thanks - Mark Shannon is a core committer, right? So
               | hopefully he has standing to get this done.
               | The plan looks very high level at this point, but it
               | looks like Mark is an expert in interpreter VM and JIT
               | technologies. All I can hope for is that he doesn't get
               | blocked by the "keep cpython simple" obstructionism.
               | mrtranscendence wrote:
               | He thinks (they think?) the project will need to be
               | funded to the tune of $2M. That seems like a hefty sum to
               | me, though maybe it's doable.
               | ak217 wrote:
               | It's not hefty at all once they pitch this improvement to
               | the many wealthy VC-funded companies whose business and
               | data science divisions depend on Python.
             | throwaway894345 wrote:
             | This is why I see little hope for Python, which is to say
             | that while I'm sure it will continue to have a large
             | following for many years a la C, C++, etc, I don't have
             | hope for it being an exciting language or one that is
             | particularly productive. Python already has performance and
             | packaging problems which don't seem to be easily divorced
             | from CPython, since virtually the whole reference
             | implementation is depended upon directly by much of the
             | ecosystem due to the sprawling C-extension interface.
             | Pypy has done yeoman's work in improving performance while
             | maintaining compatibility with an impressive amount of the
             | ecosystem, and even still there are many important packages
             | which aren't compatible with Pypy and for which Pypy-
             | compatible analogs don't exist (or aren't
             | supported/maintained). For example, the only Pypy-
             | compatible Postgres drivers were unsupported last I
             | checked.
             | Moreover, the Python community (or at least its leadership)
             | seems to have very little energy around tackling these
             | longstanding problems. Meanwhile, there are many other
             | languages which are not only performant, but which are
             | rapidly encroaching on Python's historically unique(ish)
             | "easiness" (in the sense that Python is considered "easy",
             | which is to say for people who don't have to manage
             | build/deploy/packaging/etc or otherwise have performance
             | issues). Further, many of these languages continue to
             | improve at a remarkable pace, while Python is content to
             | rest on the laurels of its scientific computing mindshare--
             | and given the rather poor nature of the numeric computing
             | package APIs and their somewhat low performance ceiling, I
             | don't expect Python to be so dominant in this domain in
             | another 5-10 years, especially as more companies need to
             | figure out how to productionize scientific workloads.
               | mrtranscendence wrote:
               | > Meanwhile, there are many other languages which are not
               | only performant, but which are rapidly encroaching on
               | Python's historically unique(ish) "easiness"
               | What are those languages? I may have a blindspot, but the
               | languages that get enough buzz for me to notice are
               | either not competing with Python in important dimensions
               | (e.g. Rust) or have a narrower focus (e.g. Julia). Elixir
               | maybe? JavaScript and its derivatives? But these lack the
               | scientific programming ecosystem that's helped drive
               | Python recently.
               | I have no doubt that languages exist that might fit the
               | bill of being both as easy as and more performant than
               | Python -- there are a _lot_ of languages out there. But
               | I'm unaware of any whose mindshare has been sufficiently
               | growing that it threatens Python in the mid-term.
               | If I'm off base please let me know. I'd love to find a
               | viable competitor to Python that's strictly better than
               | it.
               | newen wrote:
               | I think Julia is strictly better than Python, both for
               | data oriented applications and for web development.
               | mrtranscendence wrote:
               | Julia might be. It's certainly higher performance from
               | what I understand, and it offers some syntactic
               | flexibility that I miss from R when using Python (Python
               | could never have a fully complete dplyr). It seems from
               | afar to be rather more complex than those languages,
               | though.
               | throwaway894345 wrote:
               | > I'd love to find a viable competitor to Python that's
               | strictly better than it.
               | I strongly recommend Go as a better Python. Personally, I
               | think it's easier to write than Python (although people
               | who care very little about correctness will be bothered a
               | bit by the type checker), and the tooling is many times
               | better (single-binary deployments, great dependency
               | management, etc are awesome). Also, the performance is
               | about 100-1000 times better for serial execution, and
               | Go's goroutines allow you to take advantage of multiple
               | cores much more easily than with Python.
               | JavaScript and TypeScript are similarly easy-to-use,
               | performant languages with a better-than-Python tooling
               | story. I've also heard similar things about Elixir,
               | Clojure, and Kotlin.
               | vosper wrote:
               | What's Go like for REPL-driven / exploratory development?
               | That's mostly how I use Python.
               | bywaterstreet wrote:
               | Go has nothing on Python in this regard. I write Go every
               | day, and come from a Python background. I often describe
               | Go as the strongly-typed, more performant version of
               | Python. I say this mostly because my Go code isn't too
               | dissimilar from my Python code (structure, naming,
               | packages). But I still drop into Python if I want to do
               | something quickly. I don't really know why. Maybe it's
               | the Go tooling, e.g. unused variables cause compilation
               | errors, so things like this slow me down. Or maybe it's
               | because Python just offers _so_ much out of the box, e.g.
               | all the data structures you'll ever need (list, set,
               | dict, tuples), and all small things you take for granted
               | (like writing "is a in list", which would require a
               | function in Go). The REPL is Python's killer feature.
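
As a small illustration of the "is a in list" point above, the sketch below contrasts Python's built-in membership operator with the hand-written loop you would otherwise need. The helper name `contains` and the sample data are made up for the example.

```python
# Membership testing with built-in containers versus a hand-written helper.
# `contains` is a hypothetical helper showing what `in` saves you from writing.

def contains(items, target):
    """Linear search, roughly what you would write without an `in` operator."""
    for item in items:
        if item == target:
            return True
    return False


names = ["ada", "grace", "linus"]

builtin_result = "grace" in names          # built-in operator on a list
manual_result = contains(names, "grace")   # same answer, more code

seen = {"ada", "grace"}                    # set literal, also built in
missing = "linus" in seen                  # hash lookup rather than a scan
```

The same `in` operator works on lists, tuples, sets, dicts (keys), and strings, which is part of why quick exploratory code tends to be short in Python.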
               | mrtranscendence wrote:
               | I'll be honest, as someone whose non-Python programming
               | (paltry as it is) is mostly done in statically-typed
               | functional languages, I have a bit of a bias against Go
               | for the whole generics thing. I'll give it a closer look
               | at some point.
               | omginternets wrote:
               | I gotta be honest, the lack of generics has never
               | bothered me.
               | The type-assertion escape hatch has always been largely
               | sufficient for all but the most performance-critical
               | projects (of which I have exactly 1, and it was a side
               | project), and runtime panics due to failed assertions are
               | quite easy to eliminate via wrapper types.
               | My advice: just go for it. You almost certainly won't
               | miss generics.
               | (Edit: I've broken my own rule and given advice without
               | first asking what kind of programming you do. My
               | assertion holds for most run-of-the-mill stuff, e.g.
               | writing REST interfaces, network servers, etc. If you're
               | within 2-sigma of the industry, you won't miss generics.)
               | saurik wrote:
               | 100% agree. To a very real extent, Go probably only
               | exists at all because Google poured an infinite amount of
               | effort into Unladen Swallow--a project designed to remove
               | the GIL from Python 2 to make it more usable for
               | concurrent programs and add an LLVM-based JIT compiler to
               | improve performance--and Python's response was to not
               | only not merge it, but to break the entire Python
               | ecosystem for a decade by forking the language... while
               | somehow leaving "concurrency" and "performance" off the
               | list of goals while making their backwards incompatible
               | change (which of course also destroyed all the work on
               | Unladen Swallow); hell: early versions of Python 3 were
               | benchmarking universally slower than Python 2 :/. Google
               | seems to have "gotten the hint", and around the same time
               | (2009) hired Rob Pike to do Go, and has since "moved on"
               | from Python and taken a massive part of the ecosystem of
               | users Python used to have with it (the rest having bailed
               | for node.js; remember when Python was the future of web
               | development, competing with Ruby thanks to Django? those
               | were the days): people still use Python, but it is now an
               | entirely different crop of data scientists and AI
               | people... all of whom are starting to run into the
               | performance issues in even their glue layer, and so if
               | the Swift tensorflow efforts ever work out, Python is
               | done.
               | petters wrote:
               | > Google poured an infinite amount of effort into Unladen
               | Swallow
               | Google most certainly did not.
               | hitekker wrote:
               | Huh, I was wondering why Unladen Swallow faded rapidly
               | since 2009 and why Python seems secondary at Google
               | compared to Java and Go.
               | Silence and distance seem to be a common echo of failed
               | projects.
               | rhencke wrote:
               | Google did not hire Rob Pike with Go in mind. Go did not
               | even come around as an idea until Pike had worked at
               | Google for a while. [1]
               | Don't forget Rob Pike created Sawzall at Google first
               | (2005). [2]
               | [1] https://golang.org/doc/faq#history [2]
               | https://research.google/pubs/pub61/
               | fnord123 wrote:
               | The early Go compiler reused the code generator from
               | Limbo. To the point that many comments referred to Limbo.
               | So while Rob Pike might not have joined Google to work on
               | Go or had Go in mind, the idea and direction for Go was
               | probably preordained.
               | It doesn't really support or attack your claims. It's
               | just a thing I wanted to share.
               | jnwatson wrote:
               | There's so little truth in this it is hard to know where
               | to start.
               | jashmatthews wrote:
               | Unladen Swallow was only an internship project.
               | http://qinsb.blogspot.com/2011/03/unladen-swallow-retrospect...
               | N1H1L wrote:
               | The glue layers speedups are really bothering me. I
               | absolutely fail to understand how a data science heavy
               | language in 2020 can have such convoluted parallel
               | processing. Dask, numba, joblib all are unfinished
               | projects and absolutely not headache-free. CuPy works
               | great, but it is not multi-GPU capable as of today.
               | And anytime you point this out, people will trot out a
               | toy problem in Cython and try to prove you wrong, which
               | has zero relevance to real world issues. Hell, Cython
               | cannot even compile something as basic as Numpy FFTs,
               | something that is absolutely critical in signal
               | processing.
               | uranusjr wrote:
               | Didn't the original Go creators mention on multiple
               | occasions they intended to build a replacement to C, only
               | accidentally produced an application language? Assuming
               | that's true, Go would have happened regardless. Google
               | couldn't have pulled out of Unladen Swallow before Go was
               | usable, since they could not have known it was actually an
               | alternative at the time.
               | woeirua wrote:
               | I agree that packaging and dependency resolution can be a
               | total nightmare at times. Anaconda helps to an extent,
               | but it's far from ideal especially compared to the
               | default options for other languages.
               | The main problem I have with Python is its
               | maintainability, especially when you have multiple
               | developers working in the same code base. I would _never_
               | choose a dynamically typed language again to build
               | something that is going to be more than several thousand
               | lines of code, especially because of how it cripples your
               | IDE which is essential for helping junior developers
               | understand the existing codebase. Type hints help to an
               | extent, but they're just a small bandaid over an oozing
               | sore. Languages like Go are significantly better if
               | you're going to build large codebases that need to
               | survive for a long period of time.
               | willseth wrote:
               | A different perspective is that Python leadership has
               | been overwhelmed addressing the concerns of the enormous
               | and growing Python community, for whom generally
               | performance is not yet the primary concern - believe it
               | or not. It's probably fair to suggest PSF has stumbled in
               | executing some of their goals, most notably and publicly
               | the transition to v3, but overall it seems like the
               | general Python community is most interested in language
               | and ecosystem features, while performance-critical
               | workloads are already being addressed in a number of
               | projects (not just PyPy, but Numba, Cython, and others).
               | Data science may be at the heart of Python's strengths,
               | but it's simply not accurate to suggest Python (whoever
               | that is!) is resting on its laurels, as evidenced by the
               | very active, albeit sprawling, package ecosystem. And
               | the CPython devs seem to have found their stride in
               | 3.x releases.
               | It's also inaccurate to suggest numeric computing in
               | Python has a "low performance ceiling," considering you
               | can get near-C performance via JIT or AOT using a package
               | like Numba, and in most cases it's not even necessary
               | because of Numpy and the many other highly optimized
               | compute packages that can do most of the heavy lifting.
               | I think the main draw of Python is not just that the
               | syntax and language features are approachable, but that
               | the package ecosystem is so broad and active, you are
               | likely to make lighter work of the same job done in
               | another language. I think to displace Python you would
               | have to displace the package ecosystem, which seems as
               | big and broad as it's ever been.
               | ak217 wrote:
               | I agree with your overall characterization. As someone
               | who works a lot with Python and is very familiar with its
               | package ecosystem, I find the lack of leadership from PSF
               | to be discouraging and upsetting. The past few Python
               | releases have what I would characterize as cosmetic
               | improvements while repeatedly missing opportunities to
               | improve interpreter, packaging, and interface
               | fundamentals. The position that CPython has to be kept
               | simple as a reference implementation is untenable in the
               | absence of an active collaboration or effort to produce a
               | performance oriented implementation.
               | As far as I'm concerned, CPython as an interpreter
               | technology has not advanced in the past decade - and
               | Python has grown thanks to the data science community
               | efforts and excellent library ecosystem you mention, not
               | due to PSF. PSF can only miss so many opportunities
               | before something comprehensively better starts to eclipse
               | it.
               | willseth wrote:
               | I do get the impression that position is not the
               | overwhelming consensus among CPython core devs (nor does
               | it really make sense, given the huge dependency of many
               | packages on the C API), whether or not it is what PSF may
               | have publicly communicated. There are a few glimmers that
               | interpreter improvements are on the horizon, with some
               | proposals like subinterpreters and GIL mitigation getting
               | a lot of attention, all of which (to my knowledge) are
               | necessary and prerequisite to serious, bold performance
               | improvements.
               | I agree performance is important, and I think we have
               | reason to be optimistic, but with the understanding that
               | that level of improvement, even if currently underway,
               | will just take a lot of effort and time. Meanwhile, as
               | someone fairly new to Python and working on several
               | performance critical pieces, I've been pretty impressed
               | with what you can do with the current compute packages
               | (after taking months to work my way through most of
               | them).
               | visarga wrote:
               | Not just the package ecosystem, but you also have to have
               | a large number of developers and jobs, so you can do
               | hiring. Developing a project in a non-mainstream language
               | means more difficult hiring.
               | sk2020 wrote:
               | Scipy has a poor performance ceiling? Numpy has a poor
               | API? Compared to what? Eigen? Whatever the Scala guys
               | use? That sounds kind of silly to me, especially when
               | hardly anyone is actually CPU-bound, anyway.
               | newen wrote:
               | Numpy has a very poor API compared to Julia, Matlab,
               | Mathematica, R. That's just me comparing to the ones I
               | know. It's a mishmash of methods and functions, in-place
               | operations and non-modifying operations, confusing
               | indexing and broadcasting API. There are much better
               | things available for array manipulation.
               | sk2020 wrote:
               | I like R, but there is nothing elegant or consistent
               | about its standard library. I don't think you're making a
               | good-faith argument.
               | newen wrote:
               | Don't just randomly accuse people of making bad faith
               | arguments. I've used R a lot and while it has its issues,
               | it has a much better interface than numpy.
               | throwaway894345 wrote:
               | WRT performance ceiling, I'm mostly talking about things
               | like Pandas which eagerly evaluate and which aren't
               | amenable to a parallel execution model (multiple threads
               | operating on the same data frame with minimal
               | contention).
               | WRT poor APIs, I'm talking about things like matplotlib
               | or pandas or etc that take a whole slew of arguments and
               | try to guess the caller's intent by inspecting the types
               | of the arguments. The referent isn't "some other
               | scientific computing API" (although I'm sure there are
               | some sane scientific computing APIs), but rather "other
               | APIs in general" since there's nothing inherent to any
               | particular domain that demands this kind of 'magical'
               | API.
               | WRT 'hardly anyone is CPU bound'--the context is numeric
               | computing; what are people bound by if not CPU? I've seen
               | several projects where web endpoints were timing out
               | while grinding in Pandas, largely because there weren't
               | good options for taking advantage of multiple processors.
               | Based on prototypes I did, I'm confident that other
               | languages could serve those requests in single-digit
               | seconds if not sub-second.
               | willseth wrote:
               | "Based on prototypes I [spent a limited amount of time on
               | and didn't research better methods], I'm confident..."
               | ihnorton wrote:
               | In the context of pandas, 3 GB of (raw, uncompressed)
               | data could easily require 30 GB of RAM, and that kind of
               | overhead adds up quickly.
               | bearzoo wrote:
               | many scientific computing applications are considered to
               | be bounded by io
               | xapata wrote:
               | matplotlib and pandas were designed with the idea of
               | mimicking interfaces more popular than the project (when
               | they were first conceived). The "easy" interface is a
               | large part of why those projects are now more popular
               | than their inspirations.
               | sk2020 wrote:
               | So who does it right? If all these APIs suck compared to
               | an imaginary perfect library, then that isn't a useful
               | comparison.
               | Also, if an endpoint is spending minutes to respond, then
               | I would think actually profiling the application would be
               | a good start. Maybe researching prior art in the problem
               | domain would be good too. If nobody can be bothered to
               | explore the several solutions to distributing pandas
               | computations over multiple cores, like Dask, and get the
               | NPV of just buying more or faster cores, then "Python
               | sucks" isn't your problem.
               | mrtranscendence wrote:
               | You're getting some pushback, but I tend to agree with
               | you on matplotlib and pandas. Great libraries are
               | designed so that you can get a feel for them and -- with
               | practice -- use them intuitively. Even after years of
               | (admittedly light) use I still find pandas' multi-indexes
               | confusing, and I always have to look up the best of
               | myriad ways to do something in matplotlib. In comparison,
               | R's dplyr and ggplot have stuck with me even ages after
               | giving up day-to-day use of R.
             | Jasper_ wrote:
             | It's a shame how the core maintainers see CPython as a
             | "reference implementation", rather than an opportunity to
             | make a world-class programming language implementation.
             | That the CPython maintainers have, time and time again,
             | decided for a "simple implementation" has pushed away many
             | professional VM engineers and researchers who would be more
             | than willing to help maintain a JIT.
             | I also will say that a lot of the complexity of making a
             | Python JIT is all the weird edge cases and special
             | interpreter functionality. To a VM engineer, CPython is a
             | very bizarre codebase; all the complexity is tucked away in
             | corners other than its C codebase.
             | Your complexity has to go somewhere.
               | coldtea wrote:
               | > _It's a shame how the core maintainers see CPython as
               | a "reference implementation"_
               | Ironically this also makes CPython less interesting as a
               | reference language - which keeps PyPy and co
               | insignificant in adoption...
             | jnwatson wrote:
             | That post was 4 years ago. Even though I haven't heard
             | explicitly, certainly some of the recent comments from core
             | developers talk about performance techniques that would
             | definitely make things more complicated.
             | There's new boss(es) over the Python project, and things
             | might be changing in the near future.
             | N1H1L wrote:
             | I read the thread, it's from 2016. But hasn't numba made
             | some big progress in JITting CPython?
             | toolslive wrote:
             | it's _still_ essentially the same switch(opcode) based
             | interpreter it was 20 years ago. no threading, no super
             | instructions, no jitting, nothing.
             | https://github.com/python/cpython/blob/master/Python/ceval.c...
               | jashmatthews wrote:
               | Direct threading is in here
               | https://github.com/python/cpython/blob/0564aafb71a153dd0aca4...
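
To make the "super instructions" term from the exchange above concrete, here is a minimal sketch on a toy bytecode. It is an assumption-laden illustration, not CPython's `ceval.c`: the opcodes (`PUSH`, `ADD`, `PUSH_ADD`) and the functions `fuse` and `run` are hypothetical. A peephole pass merges a common instruction pair into one fused instruction, so the interpreter loop dispatches fewer times for the same result. Direct threading (computed gotos in C) cannot be expressed in Python and is not shown.

```python
# Toy superinstruction example; opcode and function names are hypothetical.

def fuse(code):
    """Peephole pass: merge each `PUSH x; ADD` pair into one `PUSH_ADD x`.

    Assumes straight-line code with no jump targets to preserve.
    """
    out = []
    i = 0
    while i < len(code):
        op, arg = code[i]
        if op == "PUSH" and i + 1 < len(code) and code[i + 1][0] == "ADD":
            out.append(("PUSH_ADD", arg))
            i += 2  # consumed two original instructions
        else:
            out.append((op, arg))
            i += 1
    return out


def run(code):
    """Interpret the code; return (result, number of dispatches)."""
    stack = []
    dispatches = 0
    for op, arg in code:
        dispatches += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "PUSH_ADD":
            # Fused form: add the constant to the top of the stack directly.
            stack.append(stack.pop() + arg)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop(), dispatches


program = [("PUSH", 1), ("PUSH", 2), ("ADD", None), ("PUSH", 3), ("ADD", None)]
plain = run(program)        # five dispatches
fused = run(fuse(program))  # three dispatches, same result
```

The saving in a real interpreter comes from fewer trips through the dispatch `switch` and less stack traffic. A production version would also have to keep jump targets and line-number tables consistent, which this sketch ignores.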
       | tedunangst wrote:
       | Everybody should note that the existence of pyston does not
       | prevent _you_ from making cpython or pypy faster if that's what
       | you want to happen.
       | ihnorton wrote:
       | > A very-low-overhead JIT using DynASM
       | Interesting. DynASM [1] is the template assembler used in LuaJIT,
       | so it sounds like they might be JIT'ing CPython bytecode. IIRC
       | this is also what the first version of Pyston did. I'm curious
       | how this is working out, both implementation and performance-wise
       | compared to LLVM (used in Pyston v1). That could mean there is a
       | lot of performance still on the table, at least for some kinds of
       | code, but also a big complexity jump to get further gains.
       | [1] https://luajit.org/dynasm.html
       | theamk wrote:
       | Wanted to see redistribution rules, and was surprised to see
       | there is no license anywhere for the binaries... The closest
       | thing I found is "copyright" file inside .deb:
       | Copyright: 2020 The Pyston Team <support@pyston.org>
       | License: Closed source, all rights reserved.
       | I guess it means no one should be touching the file, as they
       | haven't even granted access to run it.
       | SoSoRoCoCo wrote:
       | Ugh. At some point we need to stop using side-forks/-projects
       | like this because then they become competing standards that pull
       | resources away from the main projects and evolve into their own
       | incompatible beasts. I hope they instead contribute to the main
       | branch, instead of wandering off into NIH land.
         | rudi-c wrote:
         | That's not how progress happens in practice. First of all, it's
         | not a known fact that a Python JIT implementation with
         | reasonable maintenance cost, functionality, and performance can
         | exist. Pyston is trying to prove that it is possible, but it's
         | not exactly a weekend project.
         | Even if you solved all the political issues around getting
         | CPython maintainers to accept performance contributions
         | (discussed already in this thread), adding a JIT to the main
         | CPython codebase would surely slow the pace of development of
         | other Python features. If the effort were to fail, then
         | CPython's primary maintainers will have wasted a whole bunch of
         | time coordinating with the JIT effort. I'm personally rooting
         | for Pyston to succeed but it's admittedly an ambitious project.
         | So forking off an experiment is the right move here. Pyston's
         | existence as a side-project is pulling no resources away from
         | the main project -- but it would if it was trying to send
         | changes upstream.
         | Hypothetically the Pyston developers could be making non-JIT
         | contributions to CPython instead, but developers aren't
         | interchangeable commodities. Pyston's engineers have expertise
         | in optimizations -- they may not be as skilled or interested in
         | language development.
         | imtringued wrote:
         | The solution to that is not to stick your head into sand, it's
         | to merge the desired changes.
         | JoeAltmaier wrote:
         | An issue. But isn't that forking fundamental to the principle
         | of Open Source? The market will choose what it wants, like some
         | Genetic Algorithm where the fittest thrives and the rest go
         | extinct. Else how does a project evolve at all?
         | WesolyKubeczek wrote:
         | Disagree. Currently, many people have "Python" projects which
         | only CPython can run, for some reason or other. Having a lot of
         | competing interpreters/compilers helps define what the core of
         | the language actually is, and which of your assumptions are
         | hinged on that single implementation.
         | yegle wrote:
         | Sure, but who to blame? There's no such thing as "Python
         | Language Specification", and everyone is trying hard to be
         | CPython-compatible (but ~impossible as your implementation needs
         | to be compatible with various C-written modules).
         | Should there be a "language spec", competing implementations
         | would make the language stronger not weaker. Examples include
         | gccgo, various Java VMs, C++ compilers.
           | xapata wrote:
           | I'm pretty sure you can find the Python language
           | specification in the form of documentation at python.org.
           | Some things that CPython does are occasionally misconstrued
           | as specification, but the docs usually call that out.
             | yegle wrote:
             | The closest thing to a "language spec" is
             | https://docs.python.org/3/reference/index.html, which still
             | contains a lot of CPython specific/internal implementations
             | (e.g. https://docs.python.org/3/reference/datamodel.html?hi
             | ghlight...).
             | This, and combined with https://www.hyrumslaw.com/, makes
             | this "Python Language Reference" a "CPython Language
             | Reference".
       | The_rationalist wrote:
       | Interesting to see that it significantly outperforms pypy on some
       | metrics, but wouldn't it have been better to allocate the human
       | resources towards pypy instead of a duplicated effort?
         | Rochus wrote:
         | Do you mean the Pylint result? Was this confirmed elsewhere?
         | It's unlikely that an interpreter is faster than a tracing JIT
         | in geomean over a relevant set of benchmarks.
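As a reminder of what "geomean over a set of benchmarks" means, here is a small sketch. The benchmark names and speedup figures are made up for illustration and are not results from the Pyston post.

```python
from statistics import geometric_mean  # available since Python 3.8

# Hypothetical speedups relative to a baseline interpreter (1.0 = same speed).
speedups = {"bench_a": 2.0, "bench_b": 0.5, "bench_c": 4.0}

values = list(speedups.values())
geo = geometric_mean(values)       # cube root of 2.0 * 0.5 * 4.0
arith = sum(values) / len(values)  # arithmetic mean, for comparison
```

The geometric mean is the usual choice for averaging ratios because a 2x speedup and a 2x slowdown cancel out, whereas the arithmetic mean lets a single large win dominate.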
         | jnwatson wrote:
         | They are completely different, incompatible approaches. Pypy in
         | particular has a C compatibility issue.
       | shadykiller wrote:
       | I read this as Python V2 and was wondering why they would make V2
       | faster than V3 :D
       | nickjj wrote:
       | If anyone is curious, it is available on the Docker Hub but it
       | seems it's a version from 4 years ago based on
       | https://hub.docker.com/r/pyston/pyston/tags.
         | kmod wrote:
         | Whoops, I forgot we had that up, I took it down until we can
         | post an up-to-date image
       | fastball wrote:
       | Now we just need a Python implementation re-written in Rust.
       | (only kinda joking)
         | Recursing wrote:
         | https://github.com/RustPython/RustPython RustPython is still in
         | active development, I don't know how compatible they are though
       | nickjj wrote:
       | It's really awesome that it's not just a speed boost but a
       | drastic decrease in memory usage too.
       | Going from a 230mb Flask app down to 55mb is huge if it's really
       | a drop in replacement. If you factor in gunicorn process count,
       | the wins are even higher because if you had 4 gunicorn processes
       | each using 230mb but now they use 55mb, you're really going from
       | 920mb down to 220mb of RAM.
       | Edit: This isn't true in the end, a brain malfunction mis-read
       | the table thinking PyPy was actually the regular Python
       | interpreter. It would be interesting to see how it compares to
       | the default Python implementation for memory usage tho.
         | shoo wrote:
         | From the benchmark numbers reported in the post, pyston uses
         | slightly more memory than cpython for the flaskblogging
         | benchmark. Switching to pyston is only a win in terms of reducing
         | memory consumption if you're using pypy, and it would be more
         | of a win to switch to cpython
           | nickjj wrote:
           | Wow thanks for the clarification.
           | I don't know why but I read PyPy 7.3.2 as Python 3.7 in the
           | table. Talk about a brain auto-complete failure haha.
       | beervirus wrote:
       | 20% isn't nothing... but is it really worth switching to a non-
       | standard, closed-source version of the interpreter?
         | sjansen wrote:
         | Depends on your scale. If each of your web servers has a hand
         | picked name, definitely not. But if you stopped naming servers
         | a long time ago, and if the pricing structure is favorable, it
         | could mean a huge cost saving without an expensive rewrite.
           | gnulinux wrote:
           | What does it have to do with naming servers, can you explain?
           | Or do you mean if you have very few servers such that you can
           | name each and every one of them, this wouldn't be worth it?
             | sjansen wrote:
             | Pretty much. If you're spending $1000/month on servers,
             | you'd only save < $200/month and the added complexity
             | probably isn't worth it. If you're spending $100000/month
             | on servers, saving ~$20000/month is probably worth it.
             | Where is the tipping point between those two numbers?
             | Depends on context.
             | Rewriting in a more efficient language might seem even
             | cheaper, but you have to factor in risk and opportunity
             | cost. At some point it does become smarter to rewrite, but
             | your app needs to be pretty simple or your server bill
             | pretty huge before it's actually the best option.
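The arithmetic in this comment can be sketched as a rough break-even calculation. This assumes, as the comment does, that a 20% speedup translates into at most 20% of the server bill; the `switching_cost` figure is a hypothetical placeholder, not a number from the thread.

```python
def monthly_saving(server_bill, speedup=0.20):
    """Upper bound on monthly savings, assuming the speedup fraction
    translates directly into a smaller server bill."""
    return server_bill * speedup


def months_to_break_even(server_bill, switching_cost, speedup=0.20):
    """Months until a one-off switching cost is paid back by the savings."""
    return switching_cost / monthly_saving(server_bill, speedup)


switching_cost = 5_000  # hypothetical one-off cost of migrating and testing

small = months_to_break_even(1_000, switching_cost)
large = months_to_break_even(100_000, switching_cost)
```

With these assumed numbers the small deployment takes about two years to recoup the cost, while the large one pays it back within the first month, which is the "depends on your scale" point.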
           | Foober223 wrote:
           | Electricity costs can far outweigh developer salaries.
           | In some cases a rewrite may save more money than moving to a
           | 20% faster python.
         | jedberg wrote:
         | Well it's a drop in replacement, so you can always just go back
         | to CPython.
       | ghj wrote:
       | I stalked the author's linkedin and noticed he has competitive
       | programming experience:
       | https://www.topcoder.com/members/kmod/details/?track=DATA_SC...
       | (and top 15 putnam, ICPC world finals, etc)
       | I wonder if he would be interested in optimizing for purely
       | algorithmic tasks?
       | There are a lot of active and successful CPython and PyPy users on
       | https://atcoder.jp/. For example:
       | https://atcoder.jp/contests/practice2/submissions?f.Task=&f....
       | (the user "maspy" is rated at 2750 using only cpython!!!)
       | https://atcoder.jp/contests/practice2/submissions?f.Task=&f....
       | (though pypy is more practical)
       | I am linking to atcoder because their testing data is public so
       | you can rerun contestants solutions using both
       | pyston/cpython/pypy for benchmarking purposes:
       | https://www.dropbox.com/sh/arnpe0ef5wds8cv/AAAk_SECQ2Nc6SVGi...
       | Right now, other than a handful of people who figured out how to
       | make numba's jit work, only pypy is viable for competitive
       | programming. I wonder if you can do better than pypy?
       | There are also a few red coders on codeforces.com who mostly use
       | pypy (cpython is completely unviable there because numpy and
       | numba are not installed)
       | https://codeforces.com/submissions/pajenegod
       | https://codeforces.com/submissions/conqueror_of_tourist
       | But codeforces' test cases aren't public anyway so it's not as
       | relevant.
         | kmod wrote:
         | All my CodeJam solutions are in Python :)
         | While we could certainly go in this direction, we're not
         | planning to, because in our experience optimizations for
         | different workloads are largely distinct, and this use case is
         | already handled well by PyPy.
           | ghj wrote:
           | Isn't this use case the scientific computing use case? That's
           | a fairly large part of the ecosystem to give up on!
           | I think it's still a relatively low effort way (just need to
           | write a scraper) to create a benchmark on a diverse set of
           | algorithmic tasks that have clearcut criteria on AC/TLE/WA.
           | PyPy is often 10x faster than cpython on these problems (and
           | just 2x slower than equivalent C++ solution) so it will be a
           | much nicer headline too if you can achieve similar
           | performances!
           | Though I can also see how it can be completely irrelevant for
           | server workloads. Pypy's unicode is so slow, some people on
           | codeforces still use pypy2 over pypy3 just to avoid it. And C
           | extensions are so bad on pypy, you can often get better
           | performance on cpython if you need to use numpy.
       | cbsmith wrote:
       | Can we talk about the name??
       | Like, if you wanted to create a confusing name, I can't think of
       | worse ideas than naming it "v2" for a runtime that is Python _3_
       | compatible, particularly for a project that has traditionally
       | been Python 2 compatible...
       | pwinnski wrote:
       | I feel like anybody really searching for speed is using something
       | other than Python. I don't use Python for speed, but for ease-of-
       | use.
       | It took me some clicks to see it's supposed to be a drop-in
       | replacement, so that's good.
         | RandallBrown wrote:
         | Sure, but there's lots of people already using Python that
         | would probably love some more speed.
         | tgv wrote:
         | I feel you, but then again, Python applications tend to be
         | deployed anyway, so a speed boost can be welcome for some.
         | Memory usage looks better, too.
         | irrational wrote:
         | Maybe. One of my projects at work is supporting a legacy
         | ColdFusion website that is nearly 20 years old (well, it was
         | started 20 years ago, but has seen updates since). We'd love to
         | move it to something faster, but it is huge and it would
         | literally take years to rewrite it. Other things take priority
         | so it will probably never be rewritten. I imagine there are
         | Python projects in a similar boat.
           | pastage wrote:
           | Rewriting parts of it will not take years. Most of it can
           | probably be thrown away, replaced by standard components.
           | (only talking from personal experience, sure we had legacy
           | systems running for years).
             | irrational wrote:
             | Unfortunately that wouldn't work with this site. Between
             | the five of us in my group we have nearly 100 years of
             | developer experience. We've talked a lot about if there is
             | any way we could divest of it a little at a time or replace
             | parts of it, but we really can't.
         | coldtea wrote:
         | > _I feel like anybody really searching for speed is using
         | something other than Python._
         | Well, anybody using Python for all its other benefits, still
         | would very much like more speed.
         | qorrect wrote:
         | I use python because of Numpy, Scikit and Tensorflow. I don't
         | know of any other languages with libraries as productive as
         | these, so speeding these apps up is a big win for a lot of
         | people.
         | Also 20% is huge, I look forward to trying it!
           | ehsankia wrote:
           | Well that still goes in the "ease-of-use" bucket in my
           | opinion.
           | That being said, those libraries are already highly optimized
           | and all the heavy stuff running on C anyways, so making the
           | Python itself faster won't make that much of a difference in
           | those workflows.
           | wil421 wrote:
           | The 20% speed up looked like it was for flask and Django. In
           | their benchmarks PyTorch did not have a speed increase.
             | austinpena wrote:
             | Likely because pytorch uses C++ under the hood.
           | hhas01 wrote:
           | 20% faster is nothing. You want at least an order of magnitude
           | faster to justify the cost and risk of switching.
           | The Python language is already 30 years old, and it wasn't
           | even the cutting-edge in imperative language design
           | (<koff>Smalltalk, Lisp</koff>) back then. It's positively
           | antiquated now.
           | I've never understood this tunnel-vision obsession with
           | endlessly chasing ever-diminishing returns. It's Zawinski's
           | Law of Software by way of Greenspun's Tenth Rule, and a
           | fundamental failure of courage.
           | Learn the lessons from both the good _and_ bad of what's been
           | done before, and move on. A better long-term answer would be
           | to design a much faster, more efficient language for running
           | Numpy, Scikit, and Tensorflow, then port those libraries over
           | to that. If that language turns out to be good for other
           | things too, then great. If not, let a thousand flowers bloom.
           | There is a much larger learning opportunity here, to get a
           | whole lot better at migrating extant code bases from an old,
           | popular, dead-ended language to a new, upcoming one. But it's
           | like finding your way to Carnegie Hall: it takes practise,
           | practise, practise.
             | attractivechaos wrote:
             | Interesting that you are down voted. I agree with you that
             | 20% faster is really nothing in comparison to the many new
             | languages. I was saddened by the path Python3 chose. If
             | compatibility was already broken at the time, they should
             | have designed something with performance in mind from the
             | beginning. The V8 javascript engine was there. We knew
             | things could get >10X faster with JIT. Python is too big to
             | die, but it is an inferior programming language in many
             | ways.
               | hhas01 wrote:
               | "Interesting that you are down voted."
               | Doesn't bother me; I've got points to spare.
               | What downvoters haven't got, it seems, is any arguments.
             | newen wrote:
             | I agree. Almost all arguments in favor of Python is about
             | sunk cost. Not much about the actual language is appealing
             | compared to modern languages.
               | hhas01 wrote:
               | Ah, sunk costs. Where the future goes to die.
               | And I say this as a 20-year Python user myself, 'cos
               | while it has scratched many itches and continues to do
               | so, I am not the least bit sentimental about it. The best
               | compliment would be to kill it with something far better,
               | that steals all its good parts and replaces the rest.
               | coldtea wrote:
               | > _Almost all arguments in favor of Python is about sunk
               | cost. Not much about the actual language is appealing
               | compared to modern languages._
               | Well, I, for one, use Python because "the actual language
               | is appealing compared to modern languages".
               | dragonwriter wrote:
               | > Almost all arguments in favor of Python is about sunk
               | cost.
               | I think you are confusing ecosystem and other established
               | advantages with sunk costs, they are different things.
               | It's true that (from the perspective of the people who
               | built them), those advantages are the products of sunk
               | costs, but the argument is about the ongoing value
               | delivered, not the sunk cost involved in delivering it.
               | > Not much about the actual language is appealing
               | compared to modern languages.
               | Even if that was true, many of the actual languages
               | competing with Python are _less_ modern by any measure,
               | and in any case so what? Does it matter when choosing a
               | language if an advantage is produced by the abstract
               | design of a language, its ecosystem, or the
               | peculiarities of the available implementations?
               | Advantages are advantages, value is value.
               | Sure, if you are considering how to promote a "modern"
               | language against Python, it's important to distinguish
               | whether your current barrier is the design of your
               | language or Python's ecosystem to know how to direct your
               | efforts, but if you aren't a tool evangelist and instead
               | are choosing a language for a project, I don't see that
               | it matters _why_ Python is a net advantage, as long as it
               | is.
               | hhas01 wrote:
               | "the argument is about the ongoing value delivered"
               | Which is an admirable sentiment... but the title of this
               | thread is not "Python: still doing useful work" but
               | "Python: now 20% faster", and being ridiculously self-
               | congratulatory about this when the correct response is to
               | laugh at the silly pointless frivolity of it.
               | Trying to make Python fast is a fool's errand, because
               | Python is _slow by design_.
               | A useful argument would be that Python is faster overall
               | at solving various real-world problems than current
               | alternatives; but that's not the popular argument being
               | made, because the population fixates on minutiae instead
               | of overall perspective.
           | throwaway5752 wrote:
           | Aren't all of those mostly implemented in C or C++
           | extensions? Not to minimize speeding up the glue code, but I
           | don't know if they will see that large of an improvement.
           | anakaine wrote:
           | I'm in the same bucket here. My other ecosystem products are
           | built on top of python, so we make use of pandas, numpy,
           | dask, and our vendor's own modules (which have taken them 10+
           | years to put together with a reasonably full team).
         | chrisseaton wrote:
         | > I don't use Python for speed, but for ease-of-use.
         | Right - but if you can get that with better performance that's
         | good isn't it?
         | I don't get why people object to performance work on languages
         | not intended for performance.
           | mhh__ wrote:
           | > I don't get why people object to performance work on
           | languages not intended for performance
           | My gut feeling is that Python is just not safe or static
           | enough to ever be worth trying to compete with (say) C++.
           | Python sure is easy but I think the asymptotic cost of
           | using it for a big project (in my hands at least) is just not
           | worth it.
           | I like Python's syntax quite a bit but I feel bad watching
           | people learn to program using it - partly because it's an
           | oddly low-level language (It's closer to being C than
           | Haskell) and there's no compiler to stop you shooting
           | yourself in the foot (If I write something fundamentally
           | unsound I want to know about it _now_ not when the process
           | has been running all day)
             | saluk wrote:
             | As someone who learned on python (many years ago), I think
             | the balance of being allowed to shoot myself in the foot,
             | while only having to learn complexity when complex concepts
             | came up, was a good combination rather than a bad one.
             | Trying to learn C++ and Java before, having to type
             | everything was an impediment. On python, you find out that
             | types still matter when you try to add a string to an
             | integer, for example. The moment the type matters to you is
             | when you need to dig into those details. I'm not sure I
             | would have the career I do if it hadn't been around to be
             | the one language that fit my 16-year old brain.
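The "add a string to an integer" moment mentioned above looks roughly like this. The helper name `add` is just for illustration.

```python
def add(a, b):
    """No declared types: the operands are only checked when + runs."""
    return a + b


try:
    add("1", 1)  # str + int is rejected at run time, not before
except TypeError as exc:
    error = str(exc)

# Converting explicitly is the point where the type starts to matter.
result = add(int("1"), 1)
```

The same function works for two integers or two strings; the error only appears when the mismatched call is actually executed.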
           | kazinator wrote:
           | People object to changes in requirements and implementations
           | that they don't control and don't perceive as being
           | beneficial to their use case.
             | chrisseaton wrote:
             | But it's a fork. And what do you mean by 'requirements'?
             | It's compatible. Forget it exists if you don't want it.
           | omginternets wrote:
           | >I don't get why people object to performance work on
           | languages not intended for performance.
           | It's simple, really. They're concerned about the trade-offs.
             | chrisseaton wrote:
             | If it's in a fork that you can pretend doesn't exist if you
             | don't want it... what is the trade-off?
               | omginternets wrote:
               | A fragmented ecosystem, incompatibilities, etc.
               | And more importantly, your original question isn't asking
               | about forks, specifically. It was asking why someone
               | might oppose performance work.
               | chrisseaton wrote:
               | > incompatibilities, etc
               | But it's compatible. It's a drop-in replacement.
           | Shared404 wrote:
           | > I don't get why people object to performance work on
           | languages not intended for performance.
           | Presumably because they see it as effort that could be better
           | applied elsewhere.
           | I don't know that I agree, but I can understand the
           | viewpoint.
           | Narann wrote:
           | > I don't get why people object to performance work on
           | languages not intended for performance.
           | You're right, this objection is not relevant in general.
           | But in the situation we talk about, having 20% more
           | performance requires you to choose a fork that can come with
           | its own limitations. It's definitely not free.
           | This trade-off makes the point about the language choice
           | relevant.
       | Animats wrote:
       | How about PyPy?
         | ralph87 wrote:
         | PyPy is the awesomest thing created since sliced bread, but C
         | extension interoperability is still a source of perf problems
         | in a few scenarios AIUI. Glad to have Pyston in addition to
         | PyPy
         | [deleted]
         | mivade wrote:
         | They compare PyPy and CPython in their benchmarks.
       | oscargrouch wrote:
       | Funny their choice over luajit's Dynasm, when you have something
       | like Turbofan laying around.
       | The V8 Javascript interpreter can generate machine code
       | dynamically targeting the host arch for each bytecode
       | instruction.
       | You can define the target assembly directly in C++ without
       | resorting to any specific machine code.
       | Maybe complexity or because they wanted to stick to C?
       | simonw wrote:
       | "After the project ended, some of us from the team brainstormed
       | how we would do it differently if we were to do it again. In
       | early 2020, enough pieces were in place for us to start a company
       | and work on Pyston full-time."
       | I didn't know the Pyston team had split off to form their own
       | company! Anyone know who's involved, or if they've raised money
       | for it?
         | simonw wrote:
         | Looks like https://www.linkedin.com/in/kevinmodzelewski/ is the
         | founder, as-of May this year.
       | acrefoot wrote:
       | It's exciting to see this reborn outside of Dropbox!
       | EdSchouten wrote:
       | Not that it's necessarily a bad thing, but it looks like the
       | Pyston project isn't getting a lot of updates:
       | https://github.com/pyston/pyston/commits/v2.0
         | soperj wrote:
         | Project says: A faster and highly-compatible implementation of
         | the Python programming language. __The code here is out of
         | date, please follow our blog __
           | kapilvt wrote:
           | where is the code? the repo looks relatively untouched, the
           | blog directs to this repo for filing issues.
           | https://github.com/pyston/pyston
             | soperj wrote:
             | I think it's private. Looks like a for profit fork.
             | mkl wrote:
             | Pyston v2 is closed source (for now?). From the blog post:
             | > Our plan is to open-source the code in the future, but
             | since compiler projects are expensive and we no longer have
             | benevolent corporate sponsorship, it is currently closed-
             | source while we iron out our business model.
       (page generated 2020-10-28 23:00 UTC)