[HN Gopher] Intel Extension for TensorFlow
___________________________________________________________________
 
Intel Extension for TensorFlow
 
Author : hochmartinez
Score  : 75 points
Date   : 2022-10-28 18:29 UTC (4 hours ago)
 
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
 
| echelon wrote:
| I love the hardware Nvidia makes, but for the good of our field
| and for progress, we need our ML stacks to work with all other
| vendors. CUDA is the wrong platform to build our future on.
| 
| _aavaa_ wrote:
| Disappointed by the lack of benchmark comparisons in the README.
| 
| minimaxir wrote:
| For context, in other Python ML packages like scikit-learn,
| running the Intel extension on a compatible CPU can result in a
| _massive_ performance increase.
| 
| Although since TensorFlow models should be trained on a GPU,
| unlike sklearn, that's less useful, and there are better tools
| for CPU inference (e.g. SavedModels or ONNX).
| 
| benreesman wrote:
| I don't know much about Intel's toolchain: people say good
| things about TBB and the like.
| 
| Historically, doing inference on Intel gear was mostly about
| whether to target AVX2 or AVX-512 when building Eigen or
| whatever. A few years ago the net win was AVX2, because the
| de-clock and re-clock just killed you.
| 
| What's the game these days? Long term I doubt inference will be
| done on x86, but a lot of people still do it.
| 
| make3 wrote:
| Yes, but Intel now has mainstream GPUs that are reasonably
| performant (~ RTX 3060), which is what I assume this is for.
| 
| Roark66 wrote:
| Are you talking about Intel Arc? I have yet to see any ML-
| relevant benchmarks on Intel Arc. If you are aware of any,
| please let me know.
| 
| throwaway1851 wrote:
| Same here. At 16 GB of VRAM and only $349, it could fill a
| really nice slot for DL.
| 
| dweekly wrote:
| For the curious, Apple has an analogous TensorFlow Metal plugin
| to allow Apple Silicon (and AMD GPU) acceleration using the
| same plugin architecture.
| 
| https://developer.apple.com/metal/tensorflow-plugin/
| 
| muxamilian wrote:
| Which is basically unusably buggy.
| 
| For example, tf.sort only sorts up to 16 values and overwrites
| the rest with -0. Apparently not fixed for over a year:
| https://developer.apple.com/forums/thread/689299
| 
| Also, tf.random always returns the same random numbers:
| https://developer.apple.com/forums/thread/69705
| 
| Although I guess these bugs are not the fault of TensorFlow's
| plugin architecture but rather of Apple's implementation.
| 
| Roark66 wrote:
| Intel did a great thing for people interested in ML and numeric
| research by making their MKL library and compiler free and
| cross-platform. Even today, on my AMD Zen 3 Ryzen machine,
| MKL-linked numpy and pytorch are in some operations 10x (yes,
| that really is ten times) faster than the next best alternative
| (OpenBLAS etc.). I was shocked to discover how much of a
| difference MKL makes for CPU workloads. This is mostly because
| it makes use of the AVX2 CPU extensions, which make certain
| matrix operations a lot faster.
| 
| galangalalgol wrote:
| Eigen compiled for AVX-512 works just fine. I don't think it is
| quite the discriminator it once was. However, I am happy they
| made it open, and it would be great if oneAPI became a real
| alternative to CUDA for more tasks. I tried getting it to work
| with flux.jl a while back and ran into some difficulty.
| 
| ysleepy wrote:
| They crippled the performance on non-Intel CPUs on purpose
| until recently.
| 
| Intel's anti-competitive behaviour follows them throughout
| their history.
| 
| pantalaimon wrote:
| They crippled their own consumer CPUs by retroactively
| disabling AVX-512.
| 
| Roark66 wrote:
| Yes, this is true, but even back then (while they crippled it)
| it was possible to pretend to be running on Intel hardware. I
| haven't done this myself, but I read it was possible.
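A quick way to see which BLAS backend is behind claims like the ones above is to inspect the numpy build itself. This is a minimal sketch, not a benchmark: `numpy.show_config()` reports the linked BLAS/LAPACK libraries, and a single large matmul timing gives a rough feel for the backend's throughput (the matrix size and the one-shot timing are illustrative choices, not a rigorous methodology):

```python
import time
import numpy as np

# Print which BLAS/LAPACK libraries this numpy build is linked
# against (MKL, OpenBLAS, BLIS/AOCL, ...).
np.show_config()

# Rough single-run timing of a large matrix multiply; backend
# differences show up here, though a real benchmark would warm up
# and average many runs.
a = np.random.rand(2000, 2000)
b = np.random.rand(2000, 2000)
t0 = time.perf_counter()
c = a @ b
print(f"2000x2000 matmul took {time.perf_counter() - t0:.3f}s")
```

Running the same script against MKL-linked and OpenBLAS-linked environments is one way to reproduce the kind of comparison Roark66 describes on your own hardware.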
| ysleepy wrote:
| I just felt your first sentence, without a qualifier, gave
| too much credit looking at the context.
| 
| chrchang523 wrote:
| That was my previous experience too, but have you tried linking
| against AMD AOCL recently? I would not expect the performance
| gap between Intel MKL and AMD AOCL to still be as large as you
| describe on a Zen 3.
| 
| Scene_Cast2 wrote:
| Do you know if AOCL is supported by numpy? And are there any
| benchmarks out there? (Especially interested in Zen 4.)
___________________________________________________________________
(page generated 2022-10-28 23:00 UTC)