[HN Gopher] Launch HN: Tensil (YC S19) - Open-Source ML Accelerators
       ___________________________________________________________________
        
       Launch HN: Tensil (YC S19) - Open-Source ML Accelerators
        
       Hello HN! I'm Tom, co-founder at Tensil (https://www.tensil.ai/).
       We design free and open source machine learning accelerators
       that anyone can use. A machine learning inference accelerator is
       a specialized chip that runs the operations used in ML models
       very quickly and efficiently. It can be either an ASIC or an
       FPGA, with ASICs giving better performance and FPGAs more
       flexibility. Custom accelerators offer dramatically better
       performance per watt than existing GPU and CPU options. Massive
       companies like Google and Facebook use them to make training and
       inference cheaper. However, everyone else has been left out:
       small and mid-sized companies, students and academics, hobbyists
       and tinkerers currently have no chance of getting custom ML
       hardware. We aim to change that, starting with ML inference on
       embedded and edge FPGA platforms. Our dream is that our
       accelerators help people build new applications that simply
       weren't feasible before.
        
       We believe that advances in AI go hand in hand with advances in
       computing hardware. As a couple of software and ML engineers
       hoping to live in a world alongside intelligent machines, we
       wanted to know why those hardware advances were taking so long!
       We taught ourselves digital design and gradually realized that
       the next generation of hardware will need to be finely
       customized to enable state-of-the-art ML models at the edge,
       that is, running on your devices and not in the cloud. In the
       CPU world, the RISC-V Rocket Chip implementation has proven the
       value of customizable compute hardware. The problem was that no
       one was building that kind of capability for ML acceleration. We
       started Tensil to build customizable ML accelerators and see
       what kind of applications people can create with them.
       Tensil is a set of tools for running ML models on custom
       accelerator architectures. It includes an RTL generator, a model
       compiler, and a set of drivers. It enables you to create a
       custom accelerator, compile an ML model targeted at it, and then
       deploy and run that compiled model. To see how to do this and
       get it running on an FPGA platform, check out our tutorial at
       https://www.tensil.ai/docs/tutorials/resnet20-ultra96v2/.
       We developed an accelerator generator in Chisel and then wrote a
       parameterizable graph compiler in Scala. (Fun fact: unlike in
       software, formal verification is actually a totally viable way
       to test digital circuits, and we have made great use of this
       technique.) The accelerator generator takes in the desired
       architecture parameters and produces an instance of the
       accelerator, which can be synthesized using standard EDA tools.
       The compiler implements ML models using the accelerator's
       instruction set and can target any possible instance of the
       accelerator.
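        
       To give a sense of what those architecture parameters are,
       here's an illustrative sketch (written as a Python dict) in the
       shape of one of our .tarch architecture files. The field names
       and values below are indicative rather than authoritative, so
       check the docs for the exact schema:
        
           # Illustrative Tensil-style architecture parameters. Field
           # names and values are indicative -- see
           # https://www.tensil.ai/docs for the authoritative .tarch
           # schema.
           arch = {
               "data_type": "FP16BP8",    # 16-bit fixed point, binary
                                          # point at 8 bits
               "array_size": 16,          # 16x16 systolic array of MACs
               "dram0_depth": 1048576,    # memory depths, in vectors
               "dram1_depth": 1048576,
               "local_depth": 20480,
               "accumulator_depth": 2048,
               "simd_registers_depth": 1,
               "stride0_depth": 8,
               "stride1_depth": 8,
           }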
       Currently, the accelerator architecture is based around a
       systolic array, similar to well-known ML ASICs. You can view the
       architecture spec in our documentation. The compiler performs a
       wide variety of tasks but is optimized for convolutional neural
       networks. There are also drivers for each supported platform,
       currently limited to FPGAs running bare-metal or with a host OS.
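        
       If you haven't met a systolic array before, here's a toy Python
       model of the idea: a grid of multiply-accumulate cells through
       which operands flow on a fixed schedule, so that each cell
       accumulates one entry of the output. To be clear, this is a
       conceptual sketch of the dataflow, not our actual
       microarchitecture:
        
           import numpy as np

           def systolic_matmul(a, b):
               # Toy output-stationary systolic array computing a @ b.
               # Inputs are skewed so that a[i, t] meets b[t, j] at
               # cell (i, j) on cycle i + j + t; each cell performs
               # one multiply-accumulate per cycle.
               m, k = a.shape
               _, n = b.shape
               acc = np.zeros((m, n))
               for cycle in range(m + n + k - 2):
                   for i in range(m):
                       for j in range(n):
                           t = cycle - i - j  # operand pair arriving now
                           if 0 <= t < k:
                               acc[i, j] += a[i, t] * b[t, j]
               return acc

           a = np.arange(6.0).reshape(2, 3)
           b = np.arange(12.0).reshape(3, 4)
           assert np.allclose(systolic_matmul(a, b), a @ b)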
       When you tell the driver to run your ML model, it sets up the
       input data and then streams the compiled model into the
       accelerator. The accelerator independently accesses host memory
       during execution. When the accelerator is done, the driver is
       notified and looks for the output in the pre-assigned area of
       host memory.
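        
       From the host's point of view, that whole flow is only a few
       lines of Python. Here's a rough sketch along the lines of our
       PYNQ-based tutorial; the module, class and tensor names below
       are approximate, so follow the tutorial linked above for
       working code:
        
           # Rough host-side sketch based on our PYNQ tutorial; names
           # are approximate -- see the tutorial for the exact code.
           import numpy as np
           from pynq import Overlay
           from tcu_pynq.driver import Driver
           from tcu_pynq.architecture import ultra96

           # Load the synthesized accelerator bitstream, then attach
           # the Tensil driver to it over AXI DMA.
           overlay = Overlay("/home/xilinx/tensil_ultra96v2.bit")
           tcu = Driver(ultra96, overlay.axi_dma_0)
           tcu.load_model(
               "/home/xilinx/resnet20v2_cifar_onnx_ultra96v2.tmodel")

           img = np.zeros((32, 32, 3), dtype=np.float32)  # stand-in input
           outputs = tcu.run({"x:0": img})  # tensor name from the model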
       How are we different from other accelerator options? There are
       many ML ASICs out there, but they are all locked into a single
       architecture, whereas we have customization at the core of our
       technology. This offers the potential for a better trade-off
       between performance, price, power and accuracy. Compared with
       other FPGA options: the Xilinx DPU is great, but it's closed
       source and can be difficult to work with if your model is in
       any way customized. By going open source, we aim to support the
       widest possible range of models. FINN is a very cool project
       but requires big changes to your model in order to work, and it
       also typically requires large FPGAs, which are unsuitable for
       edge deployments. We work out of the box with any model (no
       need to quantize), and on small edge FPGAs. For embedded
       systems, TFLite and TFLite Micro are great for deploying very
       small ML models on extremely constrained edge devices, but they
       are limited in the performance and accuracy that can be
       achieved. Our tools let you work with full-size,
       state-of-the-art models at high accuracy and speed.
        
       Currently we're focused on the edge and embedded ML inference
       use case. If you run ML models using any of the major
       frameworks (TensorFlow/Keras, PyTorch, etc.) on small, embedded
       or edge devices, then Tensil is a good fit for you right now.
       If you primarily run inference in the data center or need lots
       of training acceleration, reach out to us and we can walk you
       through our roadmap. For now we are focused on CNN inference on
       edge FPGA platforms, but our aim is to support all model
       architectures on a wide variety of fabrics, for both training
       and inference.
        
       The core technology will always be free and open source, but we
       plan to offer a "pro" version with extra enterprise features
       under a dual-license arrangement, similar to GitLab. We are
       also working on a cloud service for running our tools in a
       hosted setup, in which you'll be able to run a search across
       all possible Tensil architectures to automatically find the
       best FPGA for your model.
        
       If you're interested in learning more, check out our docs
       (https://www.tensil.ai/docs) and our GitHub repo
       (https://github.com/tensil-ai/tensil), and join our Discord
       (https://discord.gg/TSw34H3PXr). And feel free to reach out any
       time (email in profile).
        
       We're here to enable you to develop amazing new ML-based
       applications, so we'd love to hear your experiences of working
       with ML compute hardware, whether it be CPU, GPU, or some other
       specialized platform. Have you had to make major changes to
       your ML models to get them to run on the available hardware?
       Are there any cool features or UX improvements that you wish
       hardware makers would add? Are there features that you'd like
       to add to your own applications but don't know how you'd get
       them to work on an edge device? Looking forward to your
       comments!
        
       Author : tdba
       Score  : 46 points
       Date   : 2022-03-11 18:00 UTC (5 hours ago)
        
       | sathergate wrote:
        | How does this compare to Apache TVM?
        
         | tdba wrote:
         | Great question - TVM / OctoML are a great option if you have an
         | off-the-shelf ML model and off-the-shelf hardware. Tensil is
         | different in that you can actually customize the accelerator
         | hardware itself, allowing you to get the best trade-off of
         | performance / accuracy / power usage / cost given your
         | particular ML workload. This is especially useful if you want
         | to avoid degrading the accuracy of your models (e.g. through
         | quantization) to achieve performance targets.
        
           | sathergate wrote:
           | That makes sense. So is this only for edge compute use cases,
           | or can I use tensil on an FPGA I have running in my data
           | centre?
        
             | tdba wrote:
             | You absolutely can use it in a data centre. You can even
             | tape out an ASIC using these designs! Currently we've done
             | most of our prototyping with edge FPGA platforms but if you
             | want to try other platforms we'd love to help you get
             | started. You can email me at tom@tensil.ai or use the
             | contact methods on the website.
        
       | bjourne wrote:
       | Wow! This looks amazingly impressive. Super-duper good. If the
       | software is as impressive as the website (haven't tried it out
       | yet!), you'll make tons of $$$ on this product. If I were a vc
       | with millions I'd be begging you to take some. I wonder if you
       | plan to support CGRAs and LSTMs? Also, what about quantized and
       | compressed models? Model compression is a sore point and afaik,
       | there aren't any good tools that lets you make tradeoffs between
       | accuracy and compute efficiency.
        
         | 323454 wrote:
          | Just saw your edit re: model compression. One thing that
          | Tensil can do is help you avoid the need to quantize or
          | compress your model at all! For example, we've found that
          | using a 16-bit fixed point numeric data type preserves
          | almost all the model accuracy while not sacrificing
          | performance, thanks to the huge amount of parallelism
          | available on FPGAs.
         | 
         | The broader point is that Tensil is extremely flexible, so you
         | can try out lots of different accelerator configurations to
         | find the one that works best for your ML model. Think of it as
         | optimizing the hardware first, then the software if needed.
         | 
         | We're actually working on a tool to manage and automate this
         | hardware architecture search - watch this space!
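          | 
          | To make this concrete, here's a toy sketch in plain Python
          | of what a 16-bit fixed point type with 8 fractional bits
          | does to a set of weights - just numpy arithmetic to show
          | the idea, not our actual implementation:
          | 
          |     import numpy as np
          | 
          |     FRAC_BITS = 8
          |     SCALE = 1 << FRAC_BITS  # values stored as ints * 2**-8
          | 
          |     def to_fixed(x):
          |         # round to nearest, saturate to the int16 range
          |         q = np.clip(np.rint(x * SCALE), -32768, 32767)
          |         return q.astype(np.int16)
          | 
          |     def from_fixed(q):
          |         return q.astype(np.float32) / SCALE
          | 
          |     w = np.random.randn(1000).astype(np.float32)
          |     err = np.abs(from_fixed(to_fixed(w)) - w).max()
          |     print(err)  # at most half an LSB, ~2**-9 = 0.002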
        
         | tdba wrote:
         | Thank you for the kind words! Just to clarify, the core
         | technology here is free and open source, anyone can use it
         | right now for free. We do have commercialization plans in
         | addition - we may explore things like additional paid features
         | for enterprise use or paid tiers of extra support.
         | 
         | Regarding LSTMs, yes. We're aiming to support all machine
         | learning model architectures: do you have any particular models
         | you're interested in that we should be prototyping with?
         | 
         | For CGRAs, we don't have any immediate plans to explicitly
         | support them. What kind of use case do you have in mind?
         | Generally, any platform that can implement a blob of generated
         | RTL should be something we can work with quite easily.
        
       | lagrange77 wrote:
        | Wow! After repeatedly and unsuccessfully trying to get an
        | overview of NN accelerators today, I just found this on the
        | HN homepage. Looks very promising, and to me this seems like
        | a very logical approach in terms of efficiency (besides
        | analog computers).
       | 
       | I would also be very interested in some benchmarks comparing the
       | generated hardware with things like Google Coral or Nvidia
       | Jetson.
       | 
       | I am sure this will be a success.
        
         | touisteur wrote:
          | One more thing to keep an eye on in the NN accelerator
          | world is Tenstorrent. That thing looks amazing, but it's
          | mostly for datacenter and 'heavy' edge (PCIe board, at
          | least 75W, so measured against the Alveo U50/U55 and Tesla
          | T4/A30, and up to 300W, so the A40/A100).
        
           | tdba wrote:
           | Thanks, we'll take a look!
        
         | tdba wrote:
         | Glad this helped clarify things for you! The tricky thing about
         | benchmarks is that one of the key benefits of Tensil is the
         | flexibility to find a trade-off between performance, accuracy,
         | cost and power usage that works for you. Benchmarks that only
         | consider performance or performance per watt can be a bit
         | narrow from that point of view. That said, this is a good idea
         | and we'll add some comparisons that we think make sense to the
         | docs!
        
           | touisteur wrote:
            | I wanted to add something about the Xilinx DPU - you
            | brushed on the subject, but I was quite unhappy with the
            | soft-IP thing. It embeds the instructions for all kinds
            | of networks, taking up a lot of gates for unused
            | features, so it's not very customizable, and perf for
            | anything other than vanilla conv2d stuff quickly
            | degrades. Buying an Alveo board only to get such low
            | inference perf was a gut punch.
            | 
            | FINN seems far better there. At least you get millions
            | of inferences/sec on simple quantized CNN1Ds.
            | 
            | The XRT API is simple and relatively OK, too. Stream
            | data, execute inference, fetch results - mostly sync, so
            | you have to wrap a lot of threading around it, but the
            | basics are there.
        
             | tdba wrote:
             | Yep, this is something we've heard before. If you're really
             | familiar with the Xilinx ecosystem, one way we've described
             | Tensil is that it is the "Microblaze for ML" - easy to use,
             | lots of flexibility and customizability, with performance
             | good enough for most applications. The DPU and FINN would
             | then be the more specialized tool for situations where you
             | need specific features they are optimized for.
        
               | touisteur wrote:
               | Ha, now you've made me curious. Let's see how everything
               | progresses then. Thanks for the earnestness on these
               | comment threads.
        
       | mwcampbell wrote:
       | How does this compare to Coral's USB Accelerator [1], which
       | apparently uses Google's TPU? I'm guessing Tensil is better for
       | companies that are already either working with an FPGA or
       | producing custom silicon, but the Coral product might be easier
       | to get started with when prototyping on something like a
       | Raspberry Pi.
       | 
       | [1]: https://coral.ai/products/accelerator
        
         | tdba wrote:
         | Coral is a great project, especially if you are using a
         | completely vanilla off-the-shelf model. However if you've ever
         | tried compiling a custom ML model for it, you know how finicky
         | it can be. There are lots of ways that you can accidentally
         | make it impossible for Coral to run your model, and it can be
         | difficult to figure out what went wrong.
         | 
         | With Tensil, you circumvent that problem by changing the
         | hardware to make it work for your model. If you have made
         | modifications to an off-the-shelf model or have trained your
         | own one from scratch, it might be a better option from the
         | point of view of ease-of-use and even performance.
        
           | touisteur wrote:
            | Heh, very similar experience with the Myriad X there.
            | Going off the beaten path is a pain, especially since
            | the low-level interface is now so hidden...
        
             | tdba wrote:
             | Absolutely, the UX for compiler tools often leaves a lot to
             | be desired. This is something we want to fix!
        
               | touisteur wrote:
                | This is hard, very hard stuff. Between MLIR, the
                | XLA world, most HLS things (generalist stuff leaves
                | a lot of perf on the table and you often end up in
                | VHDL/asm anyway, while specialised stuff is often
                | too restricted...) and the Vivado 'let's write
                | HDL/RTL like C' approach, many have broken their
                | teeth.
                | 
                | I wish you good luck there, but you're taking on a
                | huge task. You have all my congrats for going open
                | source, and I think it's now mostly the only way
                | forward. FINN is OSS, and I'm very happy to have an
                | OSS alternative. If only old Altera would go full
                | OSS on new AI+FPGA stuff, maybe we'd see great
                | cross-pollination.
                | 
                | Anyway, if Intel FPGA people aren't watching this, I
                | can assure you they'll be looking soon.
        
               | tdba wrote:
               | Thank you - we'd love to see more OSS support from FPGA
               | vendors too and we'll be watching closely for any
               | developments there.
        
           | mwcampbell wrote:
           | Ah, thanks for that clarification. I see that your tutorial
           | is using the Avnet Ultra96 V2 dev board. Do you have anything
           | that would work with a Raspberry Pi? Maybe some kind of FPGA
           | addon board? Or do you feel that the Raspberry Pi isn't a
           | good starting point for developing a real commercial product?
        
             | tdba wrote:
             | This is a great idea, we're looking at boards that could be
             | used in combination with a Raspberry Pi. The reason we
             | haven't investigated this so far is that most of the dev
             | boards we've tested with have an ARM core embedded in the
             | FPGA fabric, so the additional CPU the Raspberry Pi would
             | provide wasn't necessary.
        
       | [deleted]
        
       | ColonelPhantom wrote:
       | Great project! How does the performance compare with conventional
       | CPU/GPU based inference? Those devices are usually a lot higher
       | power (and bigger/more expensive), but obviously do not benefit
       | from specialization.
        
         | tdba wrote:
         | Thanks! The general answer is that it depends on your model and
         | on which FPGA platform we're talking about, but in a head-to-
         | head benchmark test you'll find results in the ballpark of
         | 2-10x CPU and 0.5-2x GPU. As you point out, the power and cost
         | are big differentiators. The other thing to consider is (as
         | another commenter mentioned) that usually inference on CPU or
         | GPU will require you to do some model quantization or
         | compression, which can degrade model accuracy. Tensil can give
         | you a way around that dilemma, so that you can have great
         | performance without sacrificing accuracy.
        
           | touisteur wrote:
            | Hi, I'm curious what you mean about model quantization
            | being necessary on CPU and GPU? It's not necessary by
            | default, as OpenVINO, TVM and TensorRT can run
            | single-precision inference on most classic models quite
            | fast. If you're reaching for very low power or ultimate
            | perf, yeah, you can downgrade to fp16 (well... mixed
            | precision) with NVIDIA tensor cores or AVX512-FP16, or
            | bf16 in some Intel VNNI configs. Going to integer will
            | give you more throughput too, but it's not necessary.
            | Even the Myriad X is supposed to handle some kind of
            | fp16 with the SHAVE cores.
            | 
            | The only time I had to reach for quantized (integer)
            | networks to do anything at all was inferencing on FPGAs.
            | Are you targeting DSP slices by default or implementing
            | full IEEE 754 floating point by default?
            | 
            | Are you saying that with Tensil you can run
            | single-precision non-quantized models at up to 2x GPU
            | perf?
            | 
            | I probably misunderstood your last sentence, sorry.
            | 
            | Genuinely curious!
        
             | 323454 wrote:
             | Sorry if this was unclear - in a datacenter use case you
             | are right, but for an edge deployment, you will usually
             | need to quantize, prune or compress your ML model to get it
             | working as fast as you'd like on a sufficiently small
             | CPU/GPU. Compared with running your ML model unchanged on
             | those platforms, Tensil can run with the performance ranges
             | listed above. You can also quantize and use Tensil too!
        
           | forgotmyoldacc wrote:
            | It'd be great if you could add benchmark numbers
            | comparing against CPU/GPU on inferences/sec and
            | inferences/watt.
        
       | rowanG077 wrote:
        | What kind of FPGAs can this reasonably run on? Is that model
        | dependent? Could a small model run on an iCE40 FPGA? I
        | looked over the docs but couldn't find anything concrete.
        
         | tdba wrote:
         | It depends on the model, yes. Here are some examples in the
         | benchmarks section of our docs:
         | https://www.tensil.ai/docs/reference/benchmarks/
         | 
          | We haven't specifically tested on any iCE40 FPGAs yet - if
          | this is something that you'd really like to see, let me
          | know! Taking a look at the lineup, the iCE40 LP8K and LP4K
          | would be suitable for running a very small version of the
          | Tensil accelerator. You'd want to run a small model in
          | order to get reasonable performance.
         | 
         | Generally speaking, FPGAs with some kind of DSP (digital signal
         | processing) capability will work best, since they can most
         | efficiently implement the multiply-accumulate operations
         | needed.
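          | 
          | As a rough back-of-envelope sketch of why model size
          | matters here (plain Python; every number below is a
          | ballpark assumption, not a measured Tensil result):
          | 
          |     array_size = 4    # tiny 4x4 systolic array
          |     clock_hz = 50e6   # conservative small-FPGA clock
          |     # Ideal throughput: one MAC per cell per cycle.
          |     macs_per_sec = array_size ** 2 * clock_hz
          |     # ResNet-20 on CIFAR-10 is roughly 41M multiply-adds.
          |     resnet20_macs = 41e6
          |     print(macs_per_sec / resnet20_macs)  # ~20 inferences/sec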
        
           | ColonelPhantom wrote:
           | I think iCE40 LP/HX series are the biggest ones, but the
           | iCE40UP5K is also neat: it has hardware multipliers unlike
           | the LP/HX, and a relatively large 1 megabit RAM on-chip.
           | Unfortunately, I think the UP family is relatively slow (as
           | in propagation delay/max clock frequency).
        
             | 323454 wrote:
             | Thanks for pointing this out! The UP5K does look promising.
        
           | rowanG077 wrote:
            | Cool! Yeah, I'd be interested in that. I'd actually have
            | some use cases for edge compute if it can fit into tiny
            | FPGAs like the iCE40.
        
             | 323454 wrote:
             | That's excellent - feel free to join our Discord if you'd
             | like to brainstorm ideas or get help choosing models and
             | boards https://discord.gg/TSw34H3PXr
        
           | mochomocha wrote:
           | I'm curious if you have any benchmark (or anecdotal evidence)
           | on the relative perf&power efficiency of using the DSP blocks
           | of the FPGA boards or not?
        
             | 323454 wrote:
             | I don't have hard numbers at hand, but I'd estimate
             | something like an order of magnitude improvement for using
             | DSP for multiplication vs not. If they're available on the
             | fabric, you'll definitely want to use them! If this is an
             | experiment you want to run, I'd be very happy to help you
             | figure out how to do it.
        
       | dang wrote:
       | All: these guys did a Show HN yesterday at
       | https://news.ycombinator.com/item?id=30615605 (there was a
       | scheduling mixup on my part). I mention it here to (a) explain
       | the dupe, for anyone who saw that thread; but also (b) to tell
       | everyone that the discussion there was unusually high-quality, so
       | you might want to check out those comments first.
       | 
       | Actually, maybe we should just merge that thread into this one.
       | I'll double check if that makes sense.
       | 
       | Edit: ok, I've moved the comments in here now. Some of the
       | timestamps are messed up, but I think it makes more sense for the
       | comments to be in one place so readers don't have to go back and
       | forth. Sorry for any confusion!
        
       | emacs28 wrote:
       | What physical connection is required between the FPGA and host?
       | For example, do they communicate through a PCIe connection?
        
         | tdba wrote:
         | In our current demos, the Tensil logic talks to the host
         | through a couple of AXI and AXI Stream interfaces. There are
         | AXI adapters for many other protocols, including PCIe, that
         | should be able to support many different kinds of connectivity.
         | Here's a link to our docs explaining the host<->Tensil
         | connection:
         | https://www.tensil.ai/docs/howto/integrate/#2-connect-the-ax...
        
       | erulabs wrote:
       | Congrats Tom! Can't wait to have a use-case for this (soon!)
        
       | YayaScript wrote:
       | What's the difference? https://hailo.ai/
        
         | tdba wrote:
         | Generally the comparison between Tensil and any fixed ASIC is
         | going to run along similar lines, which we explain in this
         | comment regarding the Coral accelerator:
         | https://news.ycombinator.com/item?id=30643520#30645318
         | 
         | The big difference is that while those fixed ASICs offer great
         | performance on the set of models they were optimized for, there
         | can be big limitations on their ability to implement other more
         | custom models efficiently. Tensil offers the flexibility to
         | solve that problem.
        
       | touisteur wrote:
       | Hi, maybe you've addressed this somewhere and I haven't read
       | fully (sorry) but how does it compare to FINN from Xilinx?
        
         | 323454 wrote:
         | FINN is a very cool project, but usually requires big changes
         | to your model in order to work, e.g. quantizing down to 1 or 2
         | bit weights. It also works best on large FPGAs which are
         | unsuitable for edge deployments. Tensil works out of the box
         | with any model (no need to quantize / compress) and on small
         | edge FPGAs.
        
           | touisteur wrote:
           | Thanks for your answers.
           | 
            | True that: to get crazy perf with FINN, one needs to
            | quantize like crazy (at least that's the default
            | strategy, though it might change if/when it can
            | synthesize to use DSP slices or shiny Versal weird
            | cores). Now I'll _have_ to take a look at Tensil. How
            | would it scale on large FPGAs, though? Would you leave
            | the floor planning to a seasoned VHDL person? Does
            | Tensil handle it (generating parallel pipelines, maxing
            | out performance using all resources on chip)? Say, for
            | someone doing 1D CNNs or some 1D VAEs at (tens of)
            | millions of inferences/second on a continuous stream
            | (low batch size)? :-)
           | 
            | I'm not sure what Intel proposes nowadays on that front,
            | with the abandonment of OpenVINO for FPGA. No idea how
            | one could use the Stratix 10 NX with its 'AI cores' with
            | actual neural networks. Tensil might be a gateway for
            | all this (I sadly don't have much hope for FINN becoming
            | cross-platform...).
        
             | 323454 wrote:
              | So far we've been focused on edge devices like the
              | Zynq, Artix and Zynq UltraScale+ families. Tensil
              | certainly works on larger devices, but it's not as
              | optimized there as we'd like. If that's interesting to
              | you, I'd love to talk and understand your use case in
              | more depth.
             | 
             | The Intel FPGA side is interesting, as you say there are
             | fewer projects targeting their technologies for ML use
             | cases. We haven't tested support for their boards yet, but
             | there is nothing in our generated RTL that is exclusive to
             | Xilinx. The only thing we'd need to add is new drivers for
             | their platforms.
        
               | vmaccel wrote:
               | Would love to take a look at this. We just launched our
               | FPGA-based cloud platform last year and currently we
               | offer all of the Alveo series and some Intel as well.
               | vmaccel.com
        
               | tdba wrote:
               | VMAccel looks very interesting! Send me an email and we
               | can explore how to collaborate.
        
         | dang wrote:
         | (This comment was originally posted at
         | https://news.ycombinator.com/item?id=30615605, where the
         | question made more sense, but I've moved it into the new thread
         | because it's interesting.)
        
       | mochomocha wrote:
        | Very cool work, congratulations on the launch! Can you
        | comment on how you see the trend of edge computing evolving
        | for SBCs? In terms of perf per watt, could FPGAs compete
        | against a Coral-style TPU? What if we had open Mali GPU or
        | NPU APIs to program against the chips already present on
        | SBCs? I'm just a hobbyist, so I know very little of what
        | people actually deploy in industrial settings - which would
        | be your target customers.
        
         | 323454 wrote:
         | Cheers, and great question! FPGAs are pretty amazing devices,
         | but one thing that's been holding them back is how difficult
         | they have been to work with. Typically to actually make use of
         | an FPGA you'd need to have an FPGA expert and an embedded
         | software engineer on your team, along with all the requisite
         | tools and materials.
         | 
         | That has started to change dramatically in the last decade,
         | with open source FPGA toolchains like yosys, runtimes like the
         | PYNQ framework and RTL generator tools like Tensil being
         | developed. When you put these things together, working with
         | FPGAs starts to become as easy as using any other compute
         | platform. For that reason, I think there are lots of
         | applications involving FPGAs that will soon be invented to take
         | advantage of this trend. One could speculate that the reason
         | Intel and AMD are buying up FPGA vendors is because they see
         | the potential there.
         | 
         | As far as head-to-head comparisons go, as long as you're
         | running the workload it was designed for in the environment it
         | was designed for, an ASIC will always be the best possible perf
         | per watt. The question is what happens when you go outside
         | those bounds. Can you take your model, swap out a layer, and
         | have it run just as fast on your Coral or NPU? Probably not, at
         | least right now. But with Tensil, you can re-run your
         | architecture search to find the best accelerator, and take
         | advantage of it right away.
        
       | ZeroCool2u wrote:
        | So, Tensil looks really cool. One of the constraints listed
        | in the docs, though, is that it only supports convolutional
        | networks at the moment.
        | 
        | What does the timeline look like for supporting some of the
        | more popular transformer/attention-based architectures?
        
         | tdba wrote:
         | We're working on our roadmap right now and prioritizing support
         | based on user interest. If there's a particular model or set of
         | models you're interested in accelerating, I'd love to hear
         | about it!
         | 
         | If there's a lot of interest in transformers, we'd aim to offer
         | support in the next couple of months.
        
           | ZeroCool2u wrote:
            | A lot of SOTA models seem to be gravitating towards
            | transformer-based architectures. Obviously, I can't
            | speak for the entire field, but you can just go take a
            | look at the most popular HuggingFace repos and see what
            | I mean. They started out focused on language, but
            | because transformers have become so popular, they're
            | expanding into the audio and vision domains quickly.
            | Their 'transformers' library is, outside of research,
            | most people's go-to high-level framework, as it largely
            | abstracts away a lot of the boilerplate that writing in
            | pure TF, PyTorch, or JAX requires.
           | 
           | See:
           | 
           | https://huggingface.co/spaces
           | 
           | https://github.com/huggingface/transformers
        
             | tdba wrote:
             | Agreed, this is the way things seem to be trending. We'll
             | definitely add support for transformers in the near future,
             | the question is only whether there are other things we
             | should work on first, especially with respect to the edge
             | and embedded domain where smaller conv models still
             | dominate. Thank you for the links!
        
       | normcoreashore wrote:
       | Sweet!
        
       ___________________________________________________________________
       (page generated 2022-03-11 23:00 UTC)