[HN Gopher] Accelerate Python code by importing Taichi
___________________________________________________________________
 
Accelerate Python code by importing Taichi
 
Author : synergy20
Score  : 241 points
Date   : 2022-09-08 15:48 UTC (7 hours ago)
 
(HTM) web link (docs.taichi-lang.org)
(TXT) w3m dump (docs.taichi-lang.org)
 
| stevenhuang wrote:
| Is this doing automatic memoization or actually emitting
| optimized JIT-compiled code?
| IanCal wrote:
| I thought I recognized the name. Taichi is also linked with
| differentiable programming:
| https://docs.taichi-lang.org/docs/differentiable_programming
|
| An extremely interesting area. I keep wanting to use it for
| something but haven't had a good use case yet, nor frankly do I
| think I really understand it.
| okasaki wrote:
| Would be even more interesting if the examples were also
| implemented in C.
| v3ss0n wrote:
| How many times has this been posted?
| bragr wrote:
| 6 times: https://news.ycombinator.com/from?site=taichi-lang.org
| forgotpwd16 wrote:
| This shows everything from the domain. The specific article
| was posted once but went unnoticed.
| jack_pp wrote:
| I'll take this opportunity to point out that if you're doing
| anything numpy-related that seems too slow, you should run numba
| on it. In my case we were doing a lot of cosine distance
| calculations, and our inference time sped up 10x simply by
| running the cosine distance function from numpy through numba;
| it's as easy as adding a decorator.
| ipunchghosts wrote:
| How does this differ from numba?
| garyrob wrote:
| Taichi vs. Numba: As its name indicates, Numba is tailored for
| Numpy. Numba is recommended if your functions involve
| vectorization of Numpy arrays. Compared with Numba, Taichi
| enjoys the following advantages:
|
| Taichi supports multiple data types, including struct,
| dataclass, quant, and sparse, and allows you to adjust memory
| layout flexibly. This feature is extremely desirable when a
| program handles massive amounts of data. However, Numba only
| performs best when dealing with dense NumPy arrays. Taichi can
| call different GPU backends for computation, making large-scale
| parallel programming (such as particle simulation or rendering)
| as easy as winking. But it would be hard even to imagine
| writing a renderer in Numba.
| make3 wrote:
| Did they show benchmarks to support the claim that it does
| better?
| kburman wrote:
| NotYourLawyer wrote:
| Sounds kind of like the old psyco, but with GPU support.
| brrrrrm wrote:
| How does this compare to a normal JIT-compiled language? Hacking
| performance onto Python seems really brittle.
| brrrrrm wrote:
| To answer the question, JavaScript is 31x faster out of the box
| on size 1000000 (compared to the 6x claimed in the post). 71x
| on 10M.
|     bwasti@bwasti-mbp code % time python3 prime.py
|     78498
|     python3 prime.py  3.86s user 0.02s system 98% cpu 3.938 total
|     bwasti@bwasti-mbp code % time bun prime.js
|     78498
|     bun prime.js  0.07s user 0.02s system 74% cpu 0.125 total
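The prime-counting script being timed above is not reproduced in the
thread. For reference, here is a sketch consistent with the output
shown (78498 primes below 1,000,000) and with the is_prime snippet
quoted further down: a plain-Python baseline plus the kind of
Taichi-decorated variant the article describes. This is an
illustration, not the post's exact code; ti.init, @ti.func and
@ti.kernel are standard Taichi API, the rest is assumed.

    import taichi as ti

    ti.init(arch=ti.cpu)  # ti.gpu also works if a GPU backend is available

    # Plain-Python baseline (what `prime.py` above presumably measures)
    def is_prime_py(n: int) -> bool:
        for k in range(2, int(n ** 0.5) + 1):
            if n % k == 0:
                return False
        return True

    def count_primes_py(n: int) -> int:
        return sum(1 for i in range(2, n) if is_prime_py(i))

    # Taichi variant: same logic, but the outermost kernel loop is
    # compiled and run in parallel by Taichi.
    @ti.func
    def is_prime(n: int):
        result = True
        for k in range(2, int(n ** 0.5) + 1):
            if n % k == 0:
                result = False
                break
        return result

    @ti.kernel
    def count_primes(n: int) -> int:
        count = 0
        for i in range(2, n):   # top-level loop is parallelized
            if is_prime(i):
                count += 1      # atomic add inside the parallel loop
        return count

    print(count_primes_py(1_000_000))  # 78498
    print(count_primes(1_000_000))     # 78498, typically much faster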
| zeugmasyllepsis wrote:
| I was curious about node/jitless node versus python/pypy:
|     src  time node primes.js
|     78498
|     node primes.js  1.75s user 0.05s system 102% cpu 1.761 total
|     src  time node --jitless primes.js
|     78498
|     node --jitless primes.js  5.06s user 0.04s system 100% cpu 5.088 total
|     src  time python primes.py
|     78498
|     python primes.py  4.27s user 0.03s system 100% cpu 4.276 total
|     src  asdf shell python pypy3.9-7.3.9
|     src  time python primes.py
|     78498
|     python primes.py  0.91s user 0.07s system 100% cpu 0.970 total
|
| Actually relatively on par with one another.
| tomaszsobota wrote:
| I enjoyed learning about Reaction-Diffusion from this article
| more than actually finding out I can run them 93475x faster with
| Taichi ;D
| salty_biscuits wrote:
| Interestingly enough, Alan Turing is the originator of the study
| of reaction-diffusion equations in biology:
| https://www.semanticscholar.org/paper/The-chemical-basis-of-...
| bragr wrote:
| > I am a loyal C++/Fortran user but would like to try out Python
| as it is gaining increasing popularity. However, rewriting code
| in Python is a nightmare - I feel the performance must be more
| than 100x slower than before!
|
| I think I found your problem. TBH you might like Julia more than
| Python, and you won't have to invent a new DSL in the process.
| [deleted]
| naillo wrote:
| If you switch to julia you'd have to port over the entire
| pytorch ecosystem as well if you want to use this together with
| that (taichi has great compatibility via `to_tensor`). Also,
| the kernels you write in taichi are hardly a DSL; they're nearly
| indistinguishable from normal Python code (minus the rare
| occasional caveat).
| bragr wrote:
| > If you switch to julia you'd have to port over the entire
| pytorch ecosystem
|
| Not if you're porting from Fortran in the first place, as the
| author claims.
| stackbutterflow wrote:
| Python is rising in popularity? I can't remember the last time
| I reached for Python for something. Not that it says anything
| about the language, but it kind of disappeared from my bubble.
| mountainriver wrote:
| Ever heard of machine learning?
| ShamelessC wrote:
| Python is extremely popular, yes.
| BerislavLopac wrote:
| Python is pretty much the second best available language for
| _everything_.
| PufPufPuf wrote:
| Precisely. Python often isn't the best choice, but it's
| always a good enough choice.
| azinman2 wrote:
| It's the first thing I'll pick for most problems.
| yazzku wrote:
| I tried Julia and I had to wait several seconds for my program
| to start due to JIT, even for very trivial applications. Did I
| do anything wrong? Seemed like the worst of both worlds. Python
| runs slow but at least it starts fast.
| robomartin wrote:
| > I feel the performance must be more than 100x slower than
| before!
|
| Not too far off. This paper [0] compared 27 languages for speed
| and energy efficiency (which is very interesting).
|
| Python was 72 times slower than C and consumed 76 times more
| energy.
|
| I think Python is a very useful language at many levels. Great
| for prototyping stuff. No doubt about that. If performance and
| energy efficiency are important, it seems obvious one has to
| look elsewhere.
|
| [0] https://haslab.github.io/SAFER/scp21.pdf
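For reference, the Numba pattern jack_pp mentions near the top of the
thread ("as easy as adding a decorator") amounts to something like the
sketch below. The function body and array sizes are made up for
illustration; @njit is Numba's standard compilation decorator.

    import numpy as np
    from numba import njit

    @njit(fastmath=True)
    def cosine_distance(a, b):
        # Written as an explicit loop so Numba compiles it into one tight pass
        dot = 0.0
        norm_a = 0.0
        norm_b = 0.0
        for i in range(a.shape[0]):
            dot += a[i] * b[i]
            norm_a += a[i] * a[i]
            norm_b += b[i] * b[i]
        return 1.0 - dot / np.sqrt(norm_a * norm_b)

    a = np.random.rand(1024)
    b = np.random.rand(1024)
    print(cosine_distance(a, b))  # first call includes JIT compilation time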
| 19h wrote:
| Pretty sure isPrime can't simply be replaced with k % 2 == 0 ;)
| MobiusHorizons wrote:
| What makes you think that it was?
| cyphar wrote:
| The indentation is broken, so one possible way to interpret
| it is:
|     def is_prime(n: int):
|         result = True
|         for k in range(2, int(n ** 0.5) + 1):
|             if n % k == 0:
|                 result = False
|             break
|         return result
|
| Which is just a more complicated (n % 2 != 0). Obviously the
| break should be in the if block.
| [deleted]
| tzot wrote:
| Yes, the indentation is incorrect in the blog post (running the
| code would throw an IndentationError anyway); it's correct in
| the GitHub code though.
| coldpie wrote:
| Great, now you tell me.
|
| * Tosses doctoral thesis in the trash.
| [deleted]
| graton wrote:
| Wouldn't you also need to change the function name to `isEven`?
| jesuslop wrote:
| I now get that Reaction-Diffusion business.
|
| 1) "Diffusion" is species vs time equals species spatial
| Laplacian.
|
| 2) The "reaction" equations are non-painfully derived from Baez
| stochastic Petri nets/chemical reaction networks in [1] (species
| vs time = multivariate polynomial in species, "space dependent
| rate equation").
|
| So Reaction-Diffusion is just adding up: species vs time =
| species spatial Laplacian _plus_ multivariate polynomial in
| species. One more for the toolbox!
|
| [1] https://arxiv.org/abs/1209.3632
| jokoon wrote:
| Question to OP:
|
| How much faster could the code in those SO answers be?
|
| https://stackoverflow.com/questions/73473074/speed-up-set-pa...
|
| This code is recursive and generates set partitions for large N
| values (N larger than 12); it essentially works by skipping small
| partitions and small subsets to target desirable set partitions.
| Solutions that don't skip those suffer from "combinatory
| explosion".
|
| I did not write this code. I want to test it later with taichi,
| but I'm curious whether taichi can run this faster.
| speps wrote:
| I thought it was a parsing issue in Python when doing "import
| taichi as ti" vs "import taichi". No, it's just presenting Taichi,
| a Python package to do parallel computation.
|
| EDIT: title of the thread was "Accelerate Python code 100x by
| import taichi as ti" like TFA
| [deleted]
| rjh29 wrote:
| Me too - it wouldn't be unheard of in a language where
| referencing multiple.levels.of.variable in a loop is orders of
| magnitude slower than doing "a = multiple.levels.of.variable"
| outside the loop and referencing a inside of it.
|
| * may have been fixed in recent versions of Python - I heard of
| this many years ago!
| stingraycharles wrote:
| Isn't that expected behaviour, as you're only looking up "a"
| once when you do it outside the loop, while doing it every
| time when inside the loop?
|
| Because any reference in the whole hierarchy could change
| during the looping (e.g. one could say "multiple.levels = {}"
| at some point), the interpreter really would need to check it
| every time unless it can somehow "prove" that these changes
| will never happen / haven't happened.
|
| Just keeping a reference to "a" is semantically very
| different, and I'd consider that a normal optimisation.
| 20after4 wrote:
| The issue is just how slow Python is: it takes a very long
| time to resolve those references. So it's expected that
| multiple.levels.of.dereference would be slower, but it's
| perhaps orders of magnitude slower, where in another
| language it might be much less of a performance hit.
| stingraycharles wrote:
| Maybe, but then the loop has nothing to do with it, and
| the problem is rather the slow lookups.
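A minimal timing harness for the pattern rjh29 and stingraycharles are
discussing above - repeated attribute lookups inside a loop versus
hoisting the value once. The object names are made up for illustration;
only the standard library is used.

    import time
    from types import SimpleNamespace

    # Build a small attribute chain: multiple.levels.of.variable
    multiple = SimpleNamespace(
        levels=SimpleNamespace(of=SimpleNamespace(variable=42)))

    N = 5_000_000

    start = time.perf_counter()
    total = 0
    for _ in range(N):
        total += multiple.levels.of.variable  # three lookups per iteration
    print("lookups in loop:", time.perf_counter() - start)

    start = time.perf_counter()
    a = multiple.levels.of.variable           # hoisted once, as rjh29 suggests
    total = 0
    for _ in range(N):
        total += a
    print("hoisted:        ", time.perf_counter() - start)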
| nerdponx wrote:
| Historically this was a legitimate performance micro-optimization
| in Python:
|     def f(i):
|         ...
|
|     def g():
|         _f = f
|         for i in range(100000):
|             _f(i)
|
| because looking up a local variable was faster than looking up a
| global. I'm not sure if that's still true in newer versions.
| nequo wrote:
| Sounds like Lua. But I see no difference in execution time
| with Python 3.10.
| tomthe wrote:
| Unfortunately, phytran is missing in the comparison. Phytran
| works in a lot of cases and is easy to use by just using Python
| types. I would like to see a comparison with taichi, as taichi
| also seems to be interesting.
| mkl wrote:
| I'm pretty sure you mean Pythran. That's been disappointing in
| my experiments with it. Nuitka is another one that's missing.
| skykooler wrote:
| This page is unreadable if your system theme is dark mode - dark
| grey text on a black background.
| sysop073 wrote:
| The text shows up white for me.
| MagerValp wrote:
| Just a wild guess - are you using Firefox with the Facebook
| container and seeing font issues on multiple sites? Upgrade to
| the newly released 2.3.4.
| insane_dreamer wrote:
| Impressive performance gains.
|
| Any Taichi vs Julia benchmarks?
| ChrisRackauckas wrote:
| The only ones that I know of are in
| https://arxiv.org/abs/2012.06684 (with Julia
| DifferentialEquations.jl and DiffTaichi), but those are more
| algorithmic. There, Julia does extremely well, but the
| conclusion that I would draw from it is more: use a programming
| language with robust and differentiable differential equation
| solvers rather than writing simple Euler loops by hand (as this
| article does).
| visarga wrote:
| Does it support dicts? I guess not.
___________________________________________________________________
(page generated 2022-09-08 23:00 UTC)