[HN Gopher] Accelerate Python code by importing Taichi
___________________________________________________________________
 
Accelerate Python code by importing Taichi
 
Author : synergy20
Score  : 241 points
Date   : 2022-09-08 15:48 UTC (7 hours ago)
 
(HTM) web link (docs.taichi-lang.org)
(TXT) w3m dump (docs.taichi-lang.org)
 
| stevenhuang wrote:
| Is this doing automatic memoization or actually emitting
| optimized JIT-compiled code?
| IanCal wrote:
| I thought I recognized the name. Taichi is also linked with
| differentiable programming:
| https://docs.taichi-lang.org/docs/differentiable_programming
|
| An extremely interesting area. I keep wanting to use it for
| something but haven't had a good use case yet, nor frankly do I
| think I really understand it.
| okasaki wrote:
| Would be even more interesting if the examples were also
| implemented in C.
| v3ss0n wrote:
| How many times has this been posted?
| bragr wrote:
| 6 times: https://news.ycombinator.com/from?site=taichi-lang.org
| forgotpwd16 wrote:
| This shows everything from the domain. The specific article
| was posted once but went unnoticed.
| jack_pp wrote:
| I'll take this opportunity to point out that if you're doing
| anything numpy-related that seems too slow, you should run numba
| on it. In my case we were doing a lot of cosine distance
| calculations, and our inference time sped up 10x simply by
| running the cosine distance function from numpy through numba;
| it's as easy as adding a decorator.
| ipunchghosts wrote:
| How does this differ from numba?
| garyrob wrote:
| Taichi vs. Numba: As its name indicates, Numba is tailored for
| Numpy. Numba is recommended if your functions involve
| vectorization of Numpy arrays. Compared with Numba, Taichi
| enjoys the following advantages:
|
| Taichi supports multiple data types, including struct,
| dataclass, quant, and sparse, and allows you to adjust memory
| layout flexibly. This feature is extremely desirable when a
| program handles massive amounts of data. However, Numba only
| performs best when dealing with dense NumPy arrays. Taichi can
| call different GPU backends for computation, making large-scale
| parallel programming (such as particle simulation or rendering)
| as easy as winking. But it would be hard even to imagine
| writing a renderer in Numba.
| make3 wrote:
| Did they show benchmarks to support the claim that it does
| better?
| kburman wrote:
| NotYourLawyer wrote:
| Sounds kind of like the old psyco, but with GPU support.
| brrrrrm wrote:
| How does this compare to a normal JIT-compiled language? Hacking
| performance onto Python seems really brittle.
| brrrrrm wrote:
| To answer the question, JavaScript is 31x faster out of the box
| on size 1000000 (compared to the 6x claimed in the post). 71x
| on 10M.
|     bwasti@bwasti-mbp code % time python3 prime.py
|     78498
|     python3 prime.py  3.86s user 0.02s system 98% cpu 3.938 total
|     bwasti@bwasti-mbp code % time bun prime.js
|     78498
|     bun prime.js  0.07s user 0.02s system 74% cpu 0.125 total
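The prime-counting script being timed above is not reproduced in the
thread. For reference, here is a sketch consistent with the output
shown (78498 primes below 1,000,000) and with the is_prime snippet
quoted further down: a plain-Python baseline plus the kind of
Taichi-decorated variant the article describes. This is an
illustration, not the post's exact code; ti.init, @ti.func and
@ti.kernel are standard Taichi API, the rest is assumed.

    import taichi as ti

    ti.init(arch=ti.cpu)  # ti.gpu also works if a GPU backend is available

    # Plain-Python baseline (what `prime.py` above presumably measures)
    def is_prime_py(n: int) -> bool:
        for k in range(2, int(n ** 0.5) + 1):
            if n % k == 0:
                return False
        return True

    def count_primes_py(n: int) -> int:
        return sum(1 for i in range(2, n) if is_prime_py(i))

    # Taichi variant: same logic, but the outermost kernel loop is
    # compiled and run in parallel by Taichi.
    @ti.func
    def is_prime(n: int):
        result = True
        for k in range(2, int(n ** 0.5) + 1):
            if n % k == 0:
                result = False
                break
        return result

    @ti.kernel
    def count_primes(n: int) -> int:
        count = 0
        for i in range(2, n):   # top-level loop is parallelized
            if is_prime(i):
                count += 1      # atomic add inside the parallel loop
        return count

    print(count_primes_py(1_000_000))  # 78498
    print(count_primes(1_000_000))     # 78498, typically much faster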
| zeugmasyllepsis wrote:
| I was curious about node/jitless node versus python/pypy:
|     src  time node primes.js
|     78498
|     node primes.js  1.75s user 0.05s system 102% cpu 1.761 total
|     src  time node --jitless primes.js
|     78498
|     node --jitless primes.js  5.06s user 0.04s system 100% cpu 5.088 total
|     src  time python primes.py
|     78498
|     python primes.py  4.27s user 0.03s system 100% cpu 4.276 total
|     src  asdf shell python pypy3.9-7.3.9
|     src  time python primes.py
|     78498
|     python primes.py  0.91s user 0.07s system 100% cpu 0.970 total
|
| Actually relatively on par with one another.
| tomaszsobota wrote:
| I enjoyed learning about Reaction-Diffusion from this article
| more than actually finding out I can run them 93475x faster with
| Taichi ;D
| salty_biscuits wrote:
| Interestingly enough, Alan Turing is the originator of the study
| of reaction-diffusion equations in biology:
| https://www.semanticscholar.org/paper/The-chemical-basis-of-...
| bragr wrote:
| > I am a loyal C++/Fortran user but would like to try out Python
| as it is gaining increasing popularity. However, rewriting code
| in Python is a nightmare - I feel the performance must be more
| than 100x slower than before!
|
| I think I found your problem. TBH you might like Julia more than
| Python, and you won't have to invent a new DSL in the process.
| [deleted]
| naillo wrote:
| If you switch to julia you'd have to port over the entire
| pytorch ecosystem as well if you want to use this together with
| that (taichi has great compatibility via `to_tensor`). Also,
| the kernels you write in taichi are hardly a DSL; they're nearly
| indistinguishable from normal Python code (minus the rare
| occasional caveat).
| bragr wrote:
| > If you switch to julia you'd have to port over the entire
| pytorch ecosystem
|
| Not if you're porting from Fortran in the first place, as the
| author claims.
| stackbutterflow wrote:
| Python is rising in popularity? I can't remember the last time
| I reached for Python for something. Not that it says anything
| about the language, but it kind of disappeared from my bubble.
| mountainriver wrote:
| Ever heard of machine learning?
| ShamelessC wrote:
| Python is extremely popular, yes.
| BerislavLopac wrote:
| Python is pretty much the second best available language for
| _everything_.
| PufPufPuf wrote:
| Precisely. Python often isn't the best choice, but it's
| always a good enough choice.
| azinman2 wrote:
| It's the first thing I'll pick for most problems.
| yazzku wrote:
| I tried Julia and I had to wait several seconds for my program
| to start due to JIT, even for very trivial applications. Did I
| do anything wrong? Seemed like the worst of both worlds. Python
| runs slow but at least it starts fast.
| robomartin wrote:
| > I feel the performance must be more than 100x slower than
| before!
|
| Not too far off. This paper [0] compared 27 languages for speed
| and energy efficiency (which is very interesting).
|
| Python was 72 times slower than C and consumed 76 times more
| energy.
|
| I think Python is a very useful language at many levels. Great
| for prototyping stuff. No doubt about that. If performance and
| energy efficiency are important, it seems obvious one has to
| look elsewhere.
|
| [0] https://haslab.github.io/SAFER/scp21.pdf
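For reference, the Numba pattern jack_pp mentions near the top of the
thread ("as easy as adding a decorator") amounts to something like the
sketch below. The function body and array sizes are made up for
illustration; @njit is Numba's standard compilation decorator.

    import numpy as np
    from numba import njit

    @njit(fastmath=True)
    def cosine_distance(a, b):
        # Written as an explicit loop so Numba compiles it into one tight pass
        dot = 0.0
        norm_a = 0.0
        norm_b = 0.0
        for i in range(a.shape[0]):
            dot += a[i] * b[i]
            norm_a += a[i] * a[i]
            norm_b += b[i] * b[i]
        return 1.0 - dot / np.sqrt(norm_a * norm_b)

    a = np.random.rand(1024)
    b = np.random.rand(1024)
    print(cosine_distance(a, b))  # first call includes JIT compilation time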
| 19h wrote:
| Pretty sure isPrime can't simply be replaced with k % 2 == 0 ;)
| MobiusHorizons wrote:
| What makes you think that it was?
| cyphar wrote:
| The indentation is broken, so one possible way to interpret
| it is:
|     def is_prime(n: int):
|         result = True
|         for k in range(2, int(n ** 0.5) + 1):
|             if n % k == 0:
|                 result = False
|             break
|         return result
|
| Which is just a more complicated (n % 2 != 0). Obviously the
| break should be in the if block.
| [deleted]
| tzot wrote:
| Yes, the indentation is incorrect in the blog post (running the
| code would throw an IndentationError anyway); it's correct in
| the GitHub code though.
| coldpie wrote:
| Great, now you tell me.
|
| * Tosses doctoral thesis in the trash.
| [deleted]
| graton wrote:
| Wouldn't you also need to change the function name to `isEven`?
| jesuslop wrote:
| I now get that Reaction-Diffusion business.
|
| 1) "Diffusion" is species vs time equals species spatial
| Laplacian.
|
| 2) The "reaction" equations are non-painfully derived from Baez
| stochastic Petri nets/chemical reaction networks in [1] (species
| vs time = multivariate polynomial in species, "space dependent
| rate equation").
|
| So Reaction-Diffusion is just adding up: species vs time =
| species spatial Laplacian _plus_ multivariate polynomial in
| species. One more for the toolbox!
|
| [1] https://arxiv.org/abs/1209.3632
| jokoon wrote:
| Question to OP:
|
| How much faster could the code in those SO answers be?
|
| https://stackoverflow.com/questions/73473074/speed-up-set-pa...
|
| This code is recursive and generates set partitions for large N
| values (N larger than 12); it essentially works by skipping small
| partitions and small subsets to target desirable set partitions.
| Solutions that don't skip those suffer from "combinatory
| explosion".
|
| I did not write this code. I want to test it later with taichi,
| but I'm curious whether taichi can run this faster.
| speps wrote:
| I thought it was a parsing issue in Python when doing "import
| taichi as ti" vs "import taichi". No, it's just presenting Taichi,
| a Python package to do parallel computation.
|
| EDIT: title of the thread was "Accelerate Python code 100x by
| import taichi as ti" like TFA
| [deleted]
| rjh29 wrote:
| Me too - it wouldn't be unheard of in a language where
| referencing multiple.levels.of.variable in a loop is orders of
| magnitude slower than doing "a = multiple.levels.of.variable"
| outside the loop and referencing a inside of it.
|
| * may have been fixed in recent versions of Python - I heard of
| this many years ago!
| stingraycharles wrote:
| Isn't that expected behaviour, as you're only looking up "a"
| once when you do it outside the loop, while doing it every
| time when inside the loop?
|
| Because any reference in the whole hierarchy could change
| during the looping (e.g. one could say "multiple.levels = {}"
| at some point), the interpreter really would need to check it
| every time unless it can somehow "prove" that these changes
| will never happen / haven't happened.
|
| Just keeping a reference to "a" is semantically very
| different, and I'd consider that a normal optimisation.
| 20after4 wrote:
| The issue is just how slow Python is: it takes a very long
| time to resolve those references. So it's expected that
| multiple.levels.of.dereference would be slower, but it's
| perhaps orders of magnitude slower, where in another
| language it might be much less of a performance hit.
| stingraycharles wrote:
| Maybe, but then the loop has nothing to do with it, and
| the problem is rather the slow lookups.
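A minimal timing harness for the pattern rjh29 and stingraycharles are
discussing above - repeated attribute lookups inside a loop versus
hoisting the value once. The object names are made up for illustration;
only the standard library is used.

    import time
    from types import SimpleNamespace

    # Build a small attribute chain: multiple.levels.of.variable
    multiple = SimpleNamespace(
        levels=SimpleNamespace(of=SimpleNamespace(variable=42)))

    N = 5_000_000

    start = time.perf_counter()
    total = 0
    for _ in range(N):
        total += multiple.levels.of.variable  # three lookups per iteration
    print("lookups in loop:", time.perf_counter() - start)

    start = time.perf_counter()
    a = multiple.levels.of.variable           # hoisted once, as rjh29 suggests
    total = 0
    for _ in range(N):
        total += a
    print("hoisted:        ", time.perf_counter() - start)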
| nerdponx wrote:
| Historically this was a legitimate performance micro-optimization
| in Python:
|     def f(i):
|         ...
|
|     def g():
|         _f = f
|         for i in range(100000):
|             _f(i)
|
| because looking up a local variable was faster than looking up a
| global. I'm not sure if that's still true in newer versions.
| nequo wrote:
| Sounds like Lua. But I see no difference in execution time
| with Python 3.10.
| tomthe wrote:
| Unfortunately, phytran is missing in the comparison. Phytran
| works in a lot of cases and is easy to use by just using Python
| types. I would like to see a comparison with taichi, as taichi
| also seems to be interesting.
| mkl wrote:
| I'm pretty sure you mean Pythran. That's been disappointing in
| my experiments with it. Nuitka is another one that's missing.
| skykooler wrote:
| This page is unreadable if your system theme is dark mode - dark
| grey text on a black background.
| sysop073 wrote:
| The text shows up white for me.
| MagerValp wrote:
| Just a wild guess - are you using Firefox with the Facebook
| container and seeing font issues on multiple sites? Upgrade to
| the newly released 2.3.4.
| insane_dreamer wrote:
| Impressive performance gains.
|
| Any Taichi vs Julia benchmarks?
| ChrisRackauckas wrote:
| The only ones that I know of are in
| https://arxiv.org/abs/2012.06684 (with Julia
| DifferentialEquations.jl and DiffTaichi), but those are more
| algorithmic. There, Julia does extremely well, but the
| conclusion that I would draw from it is more: use a programming
| language with robust and differentiable differential equation
| solvers rather than writing simple Euler loops by hand (as this
| article does).
| visarga wrote:
| Does it support dicts? I guess not.
___________________________________________________________________
(page generated 2022-09-08 23:00 UTC)