[HN Gopher] A new programming language for high-performance computers
___________________________________________________________________

  A new programming language for high-performance computers

  Author : rbanffy
  Score  : 69 points
  Date   : 2022-02-11 20:01 UTC (2 hours ago)

  (HTM) web link (news.mit.edu)
  (TXT) w3m dump (news.mit.edu)

  | going_ham wrote:
  | There is also this:
  |
  | https://anydsl.github.io/
  |
  | They have some framework that achieves high-level compute!

  | ogogmad wrote:
  | I think formal verification makes sense for programs like optimising
  | compilers. You have a reasonable hypothesis to prove: that the
  | optimisations don't change the behaviour of the program in any way,
  | except for speed.

  | agambrahma wrote:
  | Hmm, I would've expected them to double down on differentiation
  | within Julia (via Zygote etc.)

  | simeonschaub wrote:
  | It reads to me like ATL is actually more of a tensor compiler than a
  | general-purpose language. In fact, I'd be curious whether we could
  | lower (subsets of) Julia to ATL, let it optimize the tensor
  | expressions, and transpile everything back into Julia/LLVM IR for
  | further optimizations.
  |
  | In Julia, there is already Tullio.jl [1], which is basically a front
  | end for tensor expressions. It can target both CPUs and GPUs with AD
  | support and automatically parallelizes the computation. It doesn't
  | really optimize the generated code much right now, though, so
  | something like this could be interesting.
  |
  | [1] https://github.com/mcabbott/Tullio.jl

  | eigenspace wrote:
  | MIT is a big place; there are lots of people there who don't use
  | Julia, and I don't think there are any structures in place to stop
  | MIT people from making new languages (nor should there be).

  | UncleOxidant wrote:
  | Yeah, based on the title and where it's from, I half expected it to
  | be some kind of DSL based on Julia. Instead it's based on Coq, which
  | gives them formal verification capability.

  | dan-robertson wrote:
  | In mathematics, there are two ways to deal with tensors:
  |
  | 1. You come up with notation for tensor spaces and tensor product
  | operations, and a way to specify the pairs of dimensions you would
  | like to contract, either using algebra or many-argument linear maps.
  |
  | 2. You write down a symbol for the tensor and one suffix for each of
  | its indices. For an arbitrary-rank tensor you write something like
  | $T_{ij\cdots k}$. You write tensor operations using the summation
  | convention: every index variable must appear exactly once in a term
  | (indicating that it is free) or twice (indicating that it should be
  | contracted - that is, you take the sum over every possible value of
  | that index). Special tensors are used to express special things.
  |
  | So for the product of two matrices C = AB, you could write:
  |
  |     c_{ik} = a_{ij} b_{jk}
  |
  | Or a matrix times a vector:
  |
  |     v_i = a_{ij} u_j
  |
  | Or the trace of A:
  |
  |     t = a_{ij} \delta_{ij}
  |
  | (delta is the name of the identity matrix, aka the Kronecker delta).
  | For the determinant of a matrix you could write:
  |
  |     d = a_{1i} a_{2j} \cdots a_{(n)k} \epsilon_{ij\cdots k}
  |
  | (Epsilon is the antisymmetric tensor: a rank-n, n-dimensional tensor
  | whose element at i, j, ..., k is 1 if those indices are an even
  | permutation of 1, 2, ..., n, -1 if they are an odd permutation, and
  | 0 otherwise.)
  |
  | The great advantage is that you don't spend lots of time faffing
  | around with bounds for the indices or their positions in the tensor
  | product. But to some extent this is only possible because the
  | dimensions (i.e. the bounds) tend to have a physical or geometric
  | meaning. This is also why, when the notation is used in physics, you
  | have both subscripts and superscripts and you must contract each
  | subscript index with a corresponding superscript index.
  |
  | I wonder if it is possible to achieve a similarly concise,
  | convenient notation for expressing these kinds of vector
  | computations.
  |
  | You can imagine abusing the notation a bit while leaving most loops
  | and ranges implicit, e.g.
  |
  |     let buf[i] = f(x[i]);
  |     let buf' = buf or 0;
  |     let out[i] = buf'[i-1] + buf[i];
  |     out
  |
  | (With this having a declarative meaning and not specifying how the
  | iteration should be done.)
  |
  | A convolution with a 5x5 kernel could be something like:
  |
  |     /* in library */
  |     indicator(p) = if p then 1 else 0
  |     delta[[is]][[js]] = indicator(is == js)
  |     windows(radii)[[is]][[js]][[ks : (-radii..radii)]] =
  |         delta[[is]][[js + ks]]
  |     /* in user code */
  |     out[i,j] = in[p,q] * windows([2, 2])[p,q,i,j,k,l] * weight[k,l]
  |     /* implicit sum over all k, l, p, q, but only loops over k, l,
  |        as the p, q values are implied */
  |
  | where I've invented the syntax [[is]] to refer to multiple indices
  | at once.
  |
  | I don't know whether it would be possible to generalise this enough
  | to avoid having to specify the size of everything everywhere, or
  | whether a compiler could recognise enough patterns to turn it into
  | efficient computation.
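  (For concreteness: the summation-convention expressions above map
  almost directly onto NumPy's einsum, which uses the same "repeated
  index means contract" rule. This is only a rough sketch in plain
  NumPy, not ATL and not from the paper; sliding_window_view needs
  NumPy 1.20 or newer.)

      import numpy as np

      A = np.random.rand(4, 4)
      B = np.random.rand(4, 4)
      u = np.random.rand(4)

      C = np.einsum("ij,jk->ik", A, B)  # c_ik = a_ij b_jk    (matrix product)
      v = np.einsum("ij,j->i", A, u)    # v_i  = a_ij u_j     (matrix times vector)
      t = np.einsum("ii->", A)          # t    = a_ij delta_ij (trace)

      assert np.allclose(C, A @ B)
      assert np.allclose(v, A @ u)
      assert np.isclose(t, np.trace(A))

      # The 5x5 "windows" idea reads much the same way: materialise the
      # window view, then contract the window indices against the weights.
      img = np.random.rand(32, 32)
      weight = np.random.rand(5, 5)
      win = np.lib.stride_tricks.sliding_window_view(img, (5, 5))  # (28, 28, 5, 5)
      out = np.einsum("ijkl,kl->ij", win, weight)  # out_ij = in_{i+k,j+l} w_kl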
  | floren wrote:
  | Meanwhile, HPC software will continue to be written with C+MPI or
  | Fortran+MPI.
  |
  | Source: I was involved in the development of multiple HPC frameworks
  | and attended multiple HPC conferences; despite much wailing and
  | gnashing of teeth, all the actual software continues to be written
  | the same way it's always been.

  | brrrrrm wrote:
  | It'd be quite a boring world if observations of the past could
  | reliably predict the future.

  | [deleted]

  | AnimalMuppet wrote:
  | I'm not an expert. But with current CPUs, doesn't HPC critically
  | depend on memory alignment? (For two reasons: avoiding cache misses,
  | and enabling SIMD.) Sure, algorithms matter, probably more than
  | memory alignment. But when you've got the algorithm perfect and you
  | still need more speed, you're probably going to need to be able to
  | tune memory alignment.
  |
  | C lets you control memory alignment in a way that very few other
  | languages do. Until the competitors in HPC catch up to that, C is
  | going to continue to outrun those other languages.
  |
  | (Now, for doing the tweaking to get the most out of the algorithm,
  | do I want to do that in C? No. I want to do it in something that
  | makes me care about fewer details while I experiment.)

  | brrrrrm wrote:
  | You're correct that low-level details very much matter in the HPC
  | space. The types of optimizations described in this paper are
  | exactly that! To elucidate, check out Halide's cool visualizations
  | and documentation (which this paper compares itself to):
  | https://halide-lang.org/tutorials/tutorial_lesson_05_schedul...
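  (A hedged aside on the memory-behaviour point above: alignment itself
  is hard to control from most high-level languages, but the closely
  related effect of access stride on cache behaviour is easy to measure
  even from plain NumPy. Illustration only; timings vary by machine.)

      import timeit
      import numpy as np

      n = 1 << 20
      backing = np.random.rand(16 * n)
      dense = backing[:n].copy()   # n doubles, contiguous
      strided = backing[::16]      # n doubles, 128-byte stride: every
                                   # element read lands on a new cache line

      t_dense = timeit.timeit(dense.sum, number=100)
      t_strided = timeit.timeit(strided.sum, number=100)
      print(f"contiguous: {t_dense:.3f}s   strided: {t_strided:.3f}s")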
  | agumonkey wrote:
  | No mention of Chapel? Just asking - it's been in development for
  | years, and I just saw its name on the benchmarks game, so some
  | people still have an interest in it.

  | mindleyhilner wrote:
  | Paper:
  | http://people.csail.mit.edu/lamanda/assets/documents/LiuPOPL...

  | jimbob45 wrote:
  | > speed and correctness do not have to compete ... they can go
  | together, hand-in-hand
  |
  | Is Rust considered slow these days outside of compilation speed?

  | ska wrote:
  | Not fast in this sense, but definitely not slow.

  | stonewareslord wrote:
  | Not fast in which sense? HPC? Or single/multithreaded processing on
  | one machine?

  | wrnr wrote:
  | Back in the day the compile times were legendary for being slow, and
  | that reputation has stuck around, but it is no longer true today.

  | brrrrrm wrote:
  | For the type of computation described in this article, Rust is
  | considered slow. I recently stumbled across a site that tracks its
  | progress in changing that: https://www.arewelearningyet.com

  | whatshisface wrote:
  | No, it's quite fast.

  | UncleOxidant wrote:
  | This sounds kind of like Halide [1], which I think is also from the
  | same group at MIT. However, Halide is an embedded DSL in C++. It
  | sounds like ATL, being based on Coq, would be more likely to always
  | produce correct code.
  |
  | [1] https://halide-lang.org/

  | xqcgrek2 wrote:
  | So, basically a linear algebra template library hiding the
  | implementation from the user.
  |
  | big deal /s

  | dang wrote:
  | "_Please don't post shallow dismissals, especially of other people's
  | work. A good critical comment teaches us something._"
  |
  | https://news.ycombinator.com/newsguidelines.html
  |
  | Edit: you've unfortunately been posting a lot of unsubstantive
  | and/or flamebait comments. We ban accounts that do that. I don't
  | want to ban you again, so could you please review the site
  | guidelines and fix this? We'd be grateful.

  | jll29 wrote:
  | Shame that reporting for "normal" earthlings is unlikely to include
  | a few sample lines of code of this new language.
  |
  | Draft paper: https://arxiv.org/pdf/2008.11256.pdf
  |
  | Sample code (looks like a DSL embedded in Python):
  | https://github.com/gilbo/atl/blob/master/examples/catenary.p...

  | ogogmad wrote:
  | One of the innovations in the paper is that they find a method for
  | implementing reverse-mode autodiff in a stateless manner, making the
  | optimisations easier to verify.
  |
  | I might be wrong, but they take the principle of dual numbers, a +
  | b*epsilon, and change the b component into its linear dual. So b is
  | a function, not a number. If b is a function instead of a number,
  | there is more flexibility regarding flow of control, and that
  | flexibility lets you express reverse-mode autodiff.

  | brrrrrm wrote:
  | A pure functional language is trivially differentiated, no? View a
  | program as a data-flow graph and then append the differentiation
  | data-flow graph. How this is executed (reverse mode vs. forward mode
  | vs. some exotic mix) is just a leaked abstraction. Ideally, it
  | wouldn't need to be exposed to the user.
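  (A minimal sketch of the idea described in the comment above - an
  interpretation only, not the paper's actual construction, and all
  names here, Dual, grad, _vadd, are hypothetical: make the "epsilon
  part" of a dual number a function from the output sensitivity back to
  the inputs, and reverse mode falls out of ordinary closure
  composition, with no mutable tape.)

      # Dual numbers whose second component is a *function* (a
      # backpropagator) rather than a number. Helper names are made up.

      def _vadd(a, b):
          # elementwise sum of two gradient vectors (plain lists)
          return [x + y for x, y in zip(a, b)]

      class Dual:
          def __init__(self, value, backprop):
              self.value = value        # primal value
              self.backprop = backprop  # sensitivity here -> gradient w.r.t. inputs

          def __add__(self, other):
              return Dual(self.value + other.value,
                          lambda s: _vadd(self.backprop(s), other.backprop(s)))

          def __mul__(self, other):
              return Dual(self.value * other.value,
                          lambda s: _vadd(self.backprop(s * other.value),
                                          other.backprop(s * self.value)))

      def grad(f, xs):
          # Seed input i with a backpropagator that drops the sensitivity
          # into slot i of the gradient vector.
          def seed(i):
              return lambda s: [s if j == i else 0.0 for j in range(len(xs))]
          out = f(*[Dual(v, seed(i)) for i, v in enumerate(xs)])
          return out.backprop(1.0)

      # f(x, y) = x*y + x  ->  df/dx = y + 1 = 5, df/dy = x = 3
      print(grad(lambda x, y: x * y + x, [3.0, 4.0]))  # [5.0, 3.0]

  (This toy version only handles expression trees; shared subexpressions
  would be recomputed, which a real implementation would need to avoid.)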