[HN Gopher] Lfortran: Modern interactive LLVM-based Fortran comp... ___________________________________________________________________ Lfortran: Modern interactive LLVM-based Fortran compiler Author : zaikunzhang Score : 124 points Date : 2023-08-28 15:00 UTC (7 hours ago) (HTM) web link (lfortran.org) (TXT) w3m dump (lfortran.org) | pseudosavant wrote: | Was I the only one hoping that Lfortran was something for music | synthesis (LFOs...) using Fortran? I can't not see LFO - like | that keming joke. | ruste wrote: | This is probably fantastic from a maintainability perspective, | but I'm curious if some performance is left on the table by using | LLVM IR instead of compiling directly to machine code. I know | there are a number of optimizations that can be made for Fortran | that can't be made for C-like languages and I wonder if some of | those C-like assumptions are implicitly encoded in the IR. | certik wrote: | The original author of LFortran. Great question. | | We designed LFortran to first "raise" the AST (Abstract Syntax | Tree) to ASR (Abstract Semantic Representation). The ASR keeps | all the semantics of the original code, but it is otherwise as | abstract/simple as possible. Thus by definition it allows us to | do any optimization possible, as ASR->ASR optimization pass. We | do some already, we will do many more in the future. This | optimizes all the things where you need to know the high level | information about Fortran. Then once we can't do any more | optimizations, we lower to LLVM. If in the future it turns out | we need some representation between ASR and LLVM, such as MLIR, | we can add it. | | We also have a direct ASR->WASM and WASM->x64 machine code, and | even direct ASR->machine code, but the ASR->LLVM backend is the | most advanced, after that probably our ASR->C backend and after | that our ASR->WASM backend. | throwaway17_17 wrote: | How does LLVM cope with the array semantics? 
I was under the | impression that the noalias attribute in the IR was not | activated in such a way as to enable the optimizations that | make Fortran so fast. | certik wrote: | My experience with LLVM so far has been that it is possible to | get maximum speed as long as we generate the correct and | clean LLVM IR, and do many of the high level optimizations | ourselves. | | If LLVM has any downsides, it is that it is hard to run in | the browser, so we don't use it for | https://dev.lfortran.org/, and that it is slow to compile | (both LLVM itself, as well as it makes LFortran slow to | compile, compared to our direct WASM/x64 backends). But | when it comes to runtime performance of the generated code, | LLVM seems very good. | galangalalgol wrote: | Rust drove the fixes needed in llvm to support noalias. | They went through a couple reverts before seemingly fixing | everything. If lfortran emits noalias, llvm can probably | handle it now. | csjh wrote: | WASM -> x64 as in a full WASM AOT compiler? Don't those | already exist? What's the benefit to making one specifically | for LFortran? Unless I'm misunderstanding | | Super cool stuff though | certik wrote: | Yes, we could make the WASM->x64 standalone. The main | motivation is speed of compilation. We do not do any | optimizations, but we want to generate the x64 binary as | quickly as possible, with the idea that it would be used in | Debug mode, for development. Then for Release mode you | would use LLVM, which is slow to compile, but gives good runtime | performance. And since we already have an ASR->WASM backend | (used for example at https://dev.lfortran.org/), | maintaining WASM->x64 is much simpler than ASR->x64 | directly. | konradha wrote: | LFortran is not necessarily using LLVM IR to compile. It's | building up an ASR [1] structure that's already being used in | the LFortran-specific backends. Potentially it can make full | use of Fortran semantics!
| | [1] https://docs.lfortran.org/en/design/ | zik wrote: | > there are a number of optimizations that can be made for | Fortran that can't be made for C-like languages | | That used to be true a long time ago but since the restrict | keyword was introduced in C99 it's not really true any more. | leephillips wrote: | As an indirect answer, consider Julia, which is based on LLVM | and seems to be competitive with Fortran on large scale, | numerically intensive calculations. | queuebert wrote: | Can Julia pre-compile to a binary executable now? If not, | they can't replace Fortran. | uoaei wrote: | https://docs.juliahub.com/PackageCompiler/MMV8C/1.2.1/devdo | c... | krestomantsi wrote: | That is not a true binary. Making julia truly compile | into binaries is now the number 1 goal of the language | according to Tim Holy and the julia team. | certik wrote: | LFortran can translate your Fortran code to Julia via our | Julia backend. Once Julia can compile to a binary, it | will be exciting to do some comparisons, like speed of | compilation and performance of the generated binary. As | well as the quality of the Julia code that we generate, | we'll be happy to improve it to create canonical Julia | code, if at all possible. | tombert wrote: | As someone who's only played with Fortran, and never done | anything too serious with it, can you explain an optimization | that can be done in Fortran that can't be done in a C-like | language? | | I'm not being argumentative, I'm actually really curious. | bogeholm wrote: | Here's a link to a StackOverflow answer that gives a good | example: "Is Fortran easier to optimize than C for heavy | calculations?" [0] | | [0]: https://stackoverflow.com/questions/146159/is-fortran- | easier... | pklausler wrote: | The most significant distinction is that dummy arguments in | Fortran can generally be assumed by an optimizer to be free | of aliasing, when it matters. 
Modifications to one dummy | argument can't change values read from another, or from | global data. So a loop like
|
|     subroutine foo(a, b, n)
|       integer n
|       real a(n), b(n)
|       do j = 1, n
|         a(j) = 2 * b(j)
|       end do
|     end
|
| can be vectorized with no concern about what might happen if | the `b` array shares any memory with the `a` array. The | burden is on the programmer to not associate these dummy | arguments on a call with data that violate this requirement. | | (This freedom from aliasing doesn't extend to Fortran's | POINTER feature, nor does it apply to the ASSOCIATE | construct, some compilers notwithstanding.) | 3836293648 wrote: | This can be done in C, but not C++, though in practice all | C++ compilers support it. It's the `restrict` keyword | jcranmer wrote: | Fortran has true multidimensional arrays in a way that C | doesn't have--if you know an array is 5x3, you know that A[6, | 1] doesn't map to a valid element whereas in C, it does map | to a valid element. This turns out to make a lot of loop | optimizations easier. (Also, being Fortran, you tend to pass | around arrays with size information anyways, which C doesn't | do, since you typically just get pointers with C). | WanderPanda wrote: | Is the size info compile-time or runtime in Fortran? | certik wrote: | It can be both. If you know the dimension at compile | time, it is compile time, if you don't it will be | runtime. | certik wrote: | A simple example is returning an allocatable array from a | function, where the Fortran compiler can decide to allocate | on a stack instead, or even inline the function and eliminate | completely. While in C the compiler would need to understand | the semantics of an allocatable array. If you use raw C | pointer and malloc, and use Clang, my understanding is that | Clang translates quite directly to LLVM and LLVM is too low | level to optimize this out, depending on the details of how | you call malloc.
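For readers who have not used the feature, the allocatable-array case described above might look roughly like the sketch below. It is an illustration only, assuming a Fortran 2003 (or later) compiler; the names `demo_mod`, `squares`, and `demo` are hypothetical and come neither from the thread nor from LFortran, and nothing here shows what any particular compiler actually does with the temporary.

```fortran
! Illustrative sketch only: demo_mod, squares, and demo are hypothetical names.
module demo_mod
  implicit none
contains
  ! Returns a freshly allocated array holding 1**2, 2**2, ..., n**2.
  function squares(n) result(r)
    integer, intent(in) :: n
    real, allocatable :: r(:)
    integer :: i
    allocate(r(n))
    do i = 1, n
      r(i) = real(i)**2
    end do
  end function squares
end module demo_mod

program demo
  use demo_mod, only: squares
  implicit none
  real, allocatable :: x(:)
  ! x is allocated automatically on assignment (Fortran 2003 semantics).
  ! Because "allocatable array" is a language-level concept, a compiler
  ! may choose stack storage for the result or inline the call entirely;
  ! whether it does so is an implementation decision.
  x = squares(4)
  print *, sum(x)   ! sums 1 + 4 + 9 + 16
end program demo
```

The point of the sketch is that the source never calls an allocator by name or frees the result by hand; the lifetime of `r` and `x` is fully described by the language, which is the information a Fortran front end can use before lowering to LLVM.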
| | Of course, you can rewrite your C code by hand to generate | the same LLVM code from Clang, as LFortran generates for the | Fortran code. So in principle I think anything can be done in | C, as anything can be done in assembly or machine code. But | the advantage of Fortran is that it is higher level, and thus | allows you to write code using arrays in a high level way, and | you do not have to do many special things as a programmer, and | the compiler can then highly optimize your code. While in C | very often you might need to do some of these optimizations | by hand as a user. | bee_rider wrote: | I don't think such an optimization exists. | | The nice thing about Fortran is that it does the sensible | thing by default for the type of scientific computing codes | that are inside its wheelhouse (the trivial example, it | assumes arguments don't alias by default). | | C can beat anything, assuming unlimited effort. Fortran is | nice for scientists who want to write pretty good code. Or | grad students who are working on dissertations in something | other than hand-tuning kernels. | queuebert wrote: | This is the correct answer. They almost entirely compile to | the same machine code for the computationally intensive | parts. (Even Julia does that these days.) But the | limitations of Fortran prevent a lot of difficult-to-debug | C bugs, while not affecting typical scientific and | numerical capability. | sakras wrote: | How does this compiler compare with Flang? I saw it shouted out | on the main page but didn't really see any comparisons for why | you'd pick one or the other. | certik wrote: | It's hard to have a fair comparison, and both compilers are | also moving targets. I tried to do some comparison in a sibling | comment: https://news.ycombinator.com/item?id=37300279 | | The best is to mention both (as well as GFortran), and users | can decide.
For LPython (https://lpython.org/) we list all of | the about 30 Python compilers at the webpage, but beyond that | it's very hard to have a meaningful comparison. | slavapestov wrote: | > LFortran is structured around two independent modules, AST | (Abstract Syntax Tree) and ASR (Abstract Semantic | Representation), both of which are standalone (completely | independent of the rest of LFortran) and users are encouraged to | use them independently for other applications and build tools on | top. | | Modern frontend architecture comes to Fortran! Awesome. | cjohnson318 wrote: | Is this the same outfit that did LPython? It looks like the | same/similar web design. | certik wrote: | Yes, both LPython and LFortran are our two thin frontends to | ASR (Abstract Semantic Representation). Not just the website is | reused, but the internals are reused, so LPython runs your code | at exactly the same speed as LFortran would, since ASR and all | optimizations and backends are shared. | dang wrote: | Related ongoing thread: | | _Fortran_ - https://news.ycombinator.com/item?id=37291504 - Aug | 2023 (193 comments) | | (current thread is better because less generic) | fanf2 wrote: | I wanted to see a comparison with flang, which I thought is the | main LLVM Fortran front end. | certik wrote: | I am the original author of LFortran. What kind of comparison | would you like to see? If you have any specific questions, I am | happy to answer. | fanf2 wrote: | What are the relative strengths and weaknesses? If LFortran | is newer, what problems with flang does it aim to address? | How should someone choose between them, and how does the | decision process change in different circumstances? | certik wrote: | There is old Flang, which motivated me to start LFortran. | The new Flang, which presumably you are referring to, | started possibly in the same month as LFortran, but we | didn't know about each other. 
| | It's best if you ask Flang developers what they see as the | advantages of Flang over LFortran. From my biased | perspective, LFortran can run interactively, it is fast to | compile the compiler (30s on my laptop) and LFortran | compiles your code very quickly (especially with our direct | x64 or C backends). It runs in a browser: | https://dev.lfortran.org/. We have many backends (LLVM, C, | C++, Julia, WASM, x64). We plan to add Python and Fortran | (the latter could be used to modernize your old Fortran | code). It is easy to add new backends, and it is also easy | to add new frontends, so we have LPython and LFortran as | two thin frontends, to our intermediate representation that | we call ASR (Abstract Semantic Representation). The | internal design is simple, so a small team can develop | LCompilers at a fast pace. New contributors without any | prior compiler experience get up to speed very quickly | (typically a few weeks or even less). | | We are still in alpha, which means it is expected to break | for your code (and when it does, please report all bugs!). | To choose between them, I recommend to test them out and | pick the one that you like the most, based on your | criteria. Note that the most mature and widespread open | source Fortran compiler is GFortran. | gcr wrote: | I misread LLVM as LLM and wondered what on earth language models | have to do with Fortran compilation. (They don't. LLVM is a | compiler framework.) | | Anyways, great work to the team! It's fun to see such a flurry of | articles from the Fortran community today | Alifatisk wrote: | > I misread LLVM as LLM and wondered what on earth language | models have to do with Fortran compilation. | | You gave me a good laugh this evening! | certik wrote: | Turns out Fortran is a great fit for LLM, I am not joking: | https://github.com/certik/fastGPT/. | Conscat wrote: | I was talking to someone on Grinder yesterday who made the same | mistake. 
I wonder how common it is to misread "LLVM" as "LLM". | certik wrote: | Thanks! I know, LLM came much later after LLVM. But if you are | interested in LLM that LFortran can compile, check out: | https://github.com/certik/fastGPT/. | csjh wrote: | Would be cool for there to be a `llama2.f`, like | https://github.com/karpathy/llama2.c, to demo its | capabilities | certik wrote: | Yes indeed, we'll do it next (unless somebody beats me to | it). First I am focusing on compiling fastGPT with | LFortran, we can do it, but have a few workarounds that I | want to fix. Then we'll do llama2. ___________________________________________________________________ (page generated 2023-08-28 23:00 UTC)