[HN Gopher] Lfortran: Modern interactive LLVM-based Fortran comp...
       ___________________________________________________________________
        
       Lfortran: Modern interactive LLVM-based Fortran compiler
        
       Author : zaikunzhang
       Score  : 124 points
       Date   : 2023-08-28 15:00 UTC (7 hours ago)
        
 (HTM) web link (lfortran.org)
 (TXT) w3m dump (lfortran.org)
        
       | pseudosavant wrote:
       | Was I the only one hoping that Lfortran was something for music
       | synthesis (LFOs...) using Fortran? I can't not see LFO - like
       | that keming joke.
        
       | ruste wrote:
       | This is probably fantastic from a maintainability perspective,
       | but I'm curious if some performance is left on the table by using
       | LLVM IR instead of compiling directly to machine code. I know
       | there are a number of optimizations that can be made for Fortran
       | that can't be made for C-like languages and I wonder if some of
       | those C-like assumptions are implicitly encoded in the IR.
        
         | certik wrote:
         | The original author of LFortran. Great question.
         | 
         | We designed LFortran to first "raise" the AST (Abstract Syntax
         | Tree) to ASR (Abstract Semantic Representation). The ASR keeps
         | all the semantics of the original code, but it is otherwise as
         | abstract/simple as possible. Thus by definition it allows us to
         | do any optimization possible, as an ASR->ASR optimization pass. We
         | do some already, and we will do many more in the future. This
         | optimizes all the things where you need to know the high level
         | information about Fortran. Then once we can't do any more
         | optimizations, we lower to LLVM. If in the future it turns out
         | we need some representation between ASR and LLVM, such as MLIR,
         | we can add it.
         | 
         | We also have direct ASR->WASM and WASM->x64 machine code backends,
         | and even a direct ASR->machine code backend, but the ASR->LLVM
         | backend is the most advanced, followed probably by our ASR->C
         | backend and then our ASR->WASM backend.
        
           | throwaway17_17 wrote:
           | How does LLVM cope with the array semantics? I was under the
           | impression that the noalias attribute in the IR was not
           | activated in such a way as to enable the optimizations that
           | make Fortran so fast.
        
             | certik wrote:
             | My experience with LLVM so far has been that it is possible to
             | get maximum speed as long as we generate the correct and
             | clean LLVM IR, and do many of the high level optimizations
             | ourselves.
             | 
             | If LLVM has any downsides, it is that it is hard to run in
             | the browser, so we don't use it for
             | https://dev.lfortran.org/, and that it is slow to compile
             | (both LLVM itself and, through it, LFortran are slow to
             | compile, compared to our direct WASM/x64 backends). But
             | when it comes to runtime performance of the generated code,
             | LLVM seems very good.
        
             | galangalalgol wrote:
             | Rust drove the fixes needed in llvm to support noalias.
             | They went through a couple reverts before seemingly fixing
             | everything. If lfortran emits noalias, llvm can probably
             | handle it now.
        
           | csjh wrote:
           | WASM -> x64 as in a full WASM AOT compiler? Don't those
           | already exist? What's the benefit to making one specifically
           | for LFortran? Unless I'm misunderstanding
           | 
           | Super cool stuff though
        
             | certik wrote:
             | Yes, we could make the WASM->x64 standalone. The main
             | motivation is speed of compilation. We do not do any
             | optimizations, but we want to generate the x64 binary as
             | quickly as possible, with the idea that it would be used in
             | Debug mode, for development. Then for Release mode you
             | would use LLVM, which is slow to compile but gives good runtime
             | performance. And since we already have an ASR->WASM backend
             | (used for example at https://dev.lfortran.org/),
             | maintaining WASM->x64 is much simpler than ASR->x64
             | directly.
        
         | konradha wrote:
         | LFortran is not necessarily using LLVM IR to compile. It's
         | building up an ASR [1] structure that's already being used in
         | the LFortran-specific backends. Potentially it can make full
         | use of Fortran semantics!
         | 
         | [1] https://docs.lfortran.org/en/design/
        
         | zik wrote:
         | > there are a number of optimizations that can be made for
         | Fortran that can't be made for C-like languages
         | 
         | That used to be true a long time ago but since the restrict
         | keyword was introduced in C99 it's not really true any more.
        
         | leephillips wrote:
         | As an indirect answer, consider Julia, which is based on LLVM
         | and seems to be competitive with Fortran on large scale,
         | numerically intensive calculations.
        
           | queuebert wrote:
           | Can Julia pre-compile to a binary executable now? If not,
           | they can't replace Fortran.
        
             | uoaei wrote:
             | https://docs.juliahub.com/PackageCompiler/MMV8C/1.2.1/devdo
             | c...
        
               | krestomantsi wrote:
               | That is not a true binary. Making julia truly compile
               | into binaries is now the number 1 goal of the language
               | according to Tim Holy and the julia team.
        
               | certik wrote:
               | LFortran can translate your Fortran code to Julia via our
               | Julia backend. Once Julia can compile to a binary, it
               | will be exciting to do some comparisons, like speed of
               | compilation and performance of the generated binary. As
               | well as the quality of the Julia code that we generate,
               | we'll be happy to improve it to create canonical Julia
               | code, if at all possible.
        
         | tombert wrote:
         | As someone who's only played with Fortran, and never done
         | anything too serious with it, can you explain an optimization
         | that can be done in Fortran that can't be done in a C-like
         | language?
         | 
         | I'm not being argumentative, I'm actually really curious.
        
           | bogeholm wrote:
           | Here's a link to a StackOverflow answer that gives a good
           | example: "Is Fortran easier to optimize than C for heavy
           | calculations?" [0]
           | 
           | [0]: https://stackoverflow.com/questions/146159/is-fortran-
           | easier...
        
           | pklausler wrote:
           | The most significant distinction is that dummy arguments in
           | Fortran can generally be assumed by an optimizer to be free
           | of aliasing, when it matters. Modifications to one dummy
           | argument can't change values read from another, or from
           | global data. So a loop like
           | 
           |     subroutine foo(a, b, n)
           |       integer n
           |       real a(n), b(n)
           |       do j = 1, n
           |         a(j) = 2 * b(j)
           |       end do
           |     end
           | 
           | can be vectorized with no concern about what might happen if
           | the `b` array shares any memory with the `a` array. The
           | burden is on the programmer to not associate these dummy
           | arguments on a call with data that violate this requirement.
           | 
           | (This freedom from aliasing doesn't extend to Fortran's
           | POINTER feature, nor does it apply to the ASSOCIATE
           | construct, some compilers notwithstanding.)
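
To make the aliasing point concrete, here is a minimal C sketch of the same loop, written twice. It is an illustration under assumptions, not code from the thread: the function names `scale_may_alias` and `scale_no_alias` are hypothetical, and `restrict` is used as the closest C counterpart to the Fortran dummy-argument rule.

```c
#include <assert.h>
#include <stddef.h>

/* Plain pointers: the compiler must assume a and b may overlap, so each
   store to a[j] could change a value that a later b[j] reads. */
void scale_may_alias(float *a, const float *b, size_t n) {
    assert(a != NULL && b != NULL);
    for (size_t j = 0; j < n; ++j) {
        a[j] = 2.0f * b[j];
    }
}

/* restrict: the caller promises that a and b do not overlap, which
   mirrors the Fortran rule and leaves the compiler free to vectorize.
   Calling this with overlapping arrays is undefined behavior. */
void scale_no_alias(float *restrict a, const float *restrict b, size_t n) {
    assert(a != NULL && b != NULL);
    for (size_t j = 0; j < n; ++j) {
        a[j] = 2.0f * b[j];
    }
}
```

Calling `scale_may_alias(buf + 1, buf, 4)` on a five-element buffer of ones leaves 2, 4, 8, 16 in `buf[1]` through `buf[4]`, because each store feeds the next load. A vectorized loop that read all of `b` before writing would produce 2, 2, 2, 2 instead, which is why the compiler has to be conservative unless it is told the arguments cannot alias.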
        
             | 3836293648 wrote:
             | This can be done in C, but not C++, though in practice all
             | C++ compilers support it. It's the `restrict` keyword
        
           | jcranmer wrote:
           | Fortran has true multidimensional arrays in a way that C
           | doesn't have--if you know an array is 5x3, you know that A[6,
           | 1] doesn't map to a valid element whereas in C, it does map
           | to a valid element. This turns out to make a lot of loop
           | optimizations easier. (Also, being Fortran, you tend to pass
           | around arrays with size information anyways, which C doesn't
           | do, since you typically just get pointers with C).
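
A small C sketch may help show what "maps to a valid element" means when a matrix is passed around as a bare pointer. The assumptions: the helper names `flat_offset` and `at` are hypothetical, the layout is row-major and 0-based (Fortran is column-major and 1-based), and a flat buffer is used because indexing a true `double a[5][3]` past its inner bound is undefined in standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major offset into a flat buffer that the programmer treats as
   nrows x ncols. Only ncols enters the arithmetic, so nothing here can
   tell that j is out of range for its row. */
size_t flat_offset(size_t i, size_t j, size_t ncols) {
    return i * ncols + j;
}

/* Reads "element (i, j)" with no bounds check, the way pointer-based
   C code typically does. */
double at(const double *data, size_t ncols, size_t i, size_t j) {
    assert(data != NULL);
    return data[flat_offset(i, j, ncols)];
}
```

For a 5x3 matrix, `flat_offset(0, 3, 3)` and `flat_offset(1, 0, 3)` are both 3, so the out-of-range index pair (0, 3) silently lands on element (1, 0). A compiler that has to allow for this cannot treat the two index dimensions as independent when it analyzes loops.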
        
             | WanderPanda wrote:
             | Is the size info compile-time or runtime in Fortran?
        
               | certik wrote:
               | It can be both. If you know the dimension at compile
               | time, it is compile time, if you don't it will be
               | runtime.
        
           | certik wrote:
           | A simple example is returning an allocatable array from a
           | function, where the Fortran compiler can decide to allocate
           | on the stack instead, or even inline the function and eliminate
           | it completely. In C, the compiler would need to understand
           | the semantics of an allocatable array. If you use a raw C
           | pointer and malloc, and use Clang, my understanding is that
           | Clang translates quite directly to LLVM and LLVM is too low
           | level to optimize this out, depending on the details of how
           | you call malloc.
           | 
           | Of course, you can rewrite your C code by hand to generate
           | the same LLVM code from Clang, as LFortran generates for the
           | Fortran code. So in principle I think anything can be done in
           | C, as anything can be done in assembly or machine code. But
           | the advantage of Fortran is that it is higher level, and thus
           | allows you to write code using arrays in a high level way
           | without doing many special things as a programmer, and the
           | compiler can then highly optimize your code. In C, by contrast,
           | you very often need to do some of these optimizations by hand
           | as a user.
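
As a rough illustration of the "by hand" rewrite, here is a C sketch with two versions of the same routine. It is not LFortran output, and the function names `squares_heap` and `squares_into` are hypothetical. The first version returns a freshly allocated array, loosely analogous to a function returning an allocatable array. The second is the manual transformation in which the caller supplies the storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Heap version: the function allocates its result. The caller owns the
   returned pointer and must free() it. Returns NULL if malloc fails. */
double *squares_heap(size_t n) {
    double *out = malloc(n * sizeof *out);
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; ++i) {
        out[i] = (double)i * (double)i;
    }
    return out;
}

/* Hand-optimized version: the caller passes in storage (for example a
   stack array), so no allocation happens inside the function. */
void squares_into(double *out, size_t n) {
    assert(out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[i] = (double)i * (double)i;
    }
}
```

Both versions compute the same values. The difference is that in C the programmer chooses between them up front, whereas a compiler that understands allocatable-array semantics could in principle make that choice itself.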
        
           | bee_rider wrote:
           | I don't think such an optimization exists.
           | 
           | The nice thing about Fortran is that it does the sensible
           | thing by default for the type of scientific computing codes
           | that are inside its wheelhouse (the trivial example: it
           | assumes arguments don't alias by default).
           | 
           | C can beat anything, assuming unlimited effort. Fortran is
           | nice for scientists who want to write pretty good code. Or
           | grad students who are working on dissertations in something
           | other than hand-tuning kernels.
        
             | queuebert wrote:
             | This is the correct answer. They almost entirely compile to
             | the same machine code for the computationally intensive
             | parts. (Even Julia does that these days.) But the
             | limitations of Fortran prevent a lot of difficult-to-debug
             | C bugs, while not affecting typical scientific and
             | numerical capability.
        
       | sakras wrote:
       | How does this compiler compare with Flang? I saw it shouted out
       | on the main page but didn't really see any comparisons for why
       | you'd pick one or the other.
        
         | certik wrote:
         | It's hard to have a fair comparison, and both compilers are
         | also moving targets. I tried to do some comparison in a sibling
         | comment: https://news.ycombinator.com/item?id=37300279
         | 
         | The best is to mention both (as well as GFortran), and users
         | can decide. For LPython (https://lpython.org/) we list all of
         | the roughly 30 Python compilers on the webpage, but beyond that
         | it's very hard to have a meaningful comparison.
        
       | slavapestov wrote:
       | > LFortran is structured around two independent modules, AST
       | (Abstract Syntax Tree) and ASR (Abstract Semantic
       | Representation), both of which are standalone (completely
       | independent of the rest of LFortran) and users are encouraged to
       | use them independently for other applications and build tools on
       | top.
       | 
       | Modern frontend architecture comes to Fortran! Awesome.
        
       | cjohnson318 wrote:
       | Is this the same outfit that did LPython? It looks like the
       | same/similar web design.
        
         | certik wrote:
         | Yes, both LPython and LFortran are our two thin frontends to
         | ASR (Abstract Semantic Representation). Not just the website is
         | reused, but the internals are reused, so LPython runs your code
         | at exactly the same speed as LFortran would, since ASR and all
         | optimizations and backends are shared.
        
       | dang wrote:
       | Related ongoing thread:
       | 
       |  _Fortran_ - https://news.ycombinator.com/item?id=37291504 - Aug
       | 2023 (193 comments)
       | 
       | (current thread is better because less generic)
        
       | fanf2 wrote:
       | I wanted to see a comparison with flang, which I thought is the
       | main LLVM Fortran front end.
        
         | certik wrote:
         | I am the original author of LFortran. What kind of comparison
         | would you like to see? If you have any specific questions, I am
         | happy to answer.
        
           | fanf2 wrote:
           | What are the relative strengths and weaknesses? If LFortran
           | is newer, what problems with flang does it aim to address?
           | How should someone choose between them, and how does the
           | decision process change in different circumstances?
        
             | certik wrote:
             | There is old Flang, which motivated me to start LFortran.
             | The new Flang, which presumably you are referring to,
             | started possibly in the same month as LFortran, but we
             | didn't know about each other.
             | 
             | It's best if you ask Flang developers what they see as the
             | advantages of Flang over LFortran. From my biased
             | perspective, LFortran can run interactively, it is fast to
             | compile the compiler (30s on my laptop) and LFortran
             | compiles your code very quickly (especially with our direct
             | x64 or C backends). It runs in a browser:
             | https://dev.lfortran.org/. We have many backends (LLVM, C,
             | C++, Julia, WASM, x64). We plan to add Python and Fortran
             | (the latter could be used to modernize your old Fortran
             | code). It is easy to add new backends, and it is also easy
             | to add new frontends, so we have LPython and LFortran as
             | two thin frontends, to our intermediate representation that
             | we call ASR (Abstract Semantic Representation). The
             | internal design is simple, so a small team can develop
             | LCompilers at a fast pace. New contributors without any
             | prior compiler experience get up to speed very quickly
             | (typically a few weeks or even less).
             | 
             | We are still in alpha, which means it is expected to break
             | for your code (and when it does, please report all bugs!).
             | To choose between them, I recommend testing them out and
             | pick the one that you like the most, based on your
             | criteria. Note that the most mature and widespread open
             | source Fortran compiler is GFortran.
        
       | gcr wrote:
       | I misread LLVM as LLM and wondered what on earth language models
       | have to do with Fortran compilation. (They don't. LLVM is a
       | compiler framework.)
       | 
       | Anyways, great work to the team! It's fun to see such a flurry of
       | articles from the Fortran community today
        
         | Alifatisk wrote:
         | > I misread LLVM as LLM and wondered what on earth language
         | models have to do with Fortran compilation.
         | 
         | You gave me a good laugh this evening!
        
           | certik wrote:
           | Turns out Fortran is a great fit for LLM, I am not joking:
           | https://github.com/certik/fastGPT/.
        
         | Conscat wrote:
         | I was talking to someone on Grinder yesterday who made the same
         | mistake. I wonder how common it is to misread "LLVM" as "LLM".
        
         | certik wrote:
         | Thanks! I know, LLMs came much later than LLVM. But if you are
         | interested in an LLM that LFortran can compile, check out:
         | https://github.com/certik/fastGPT/.
        
           | csjh wrote:
           | Would be cool for there to be a `llama2.f`, like
           | https://github.com/karpathy/llama2.c, to demo its
           | capabilities
        
             | certik wrote:
             | Yes indeed, we'll do it next (unless somebody beats me to
             | it). First I am focusing on compiling fastGPT with
             | LFortran. We can do it, but we have a few workarounds that I
             | want to fix. Then we'll do llama2.
        
       ___________________________________________________________________
       (page generated 2023-08-28 23:00 UTC)