[HN Gopher] Jax vs. Julia (Vs PyTorch)
___________________________________________________________________
Jax vs. Julia (Vs PyTorch)

Author : sebg
Score  : 63 points
Date   : 2022-05-04 17:36 UTC (5 hours ago)

(HTM) web link (kidger.site)
(TXT) w3m dump (kidger.site)

| longemen3000 wrote:
| I feel called out on the academic part hahaah. I simply want to
| code state of the art (thermodynamic) models, and at least Julia
| helps by providing easy testing and publishing infrastructure.
| But obviously we can't compete with a corporation in code quality
| (we are trying!)
|
| Unrelated, but for small sizes, I really prefer to use forward
| mode in Julia (via ForwardDiff.jl) instead of Zygote. The
| overhead of reverse ADing over an arbitrary function with
| mutation is not worth it.

| tagrun wrote:
| In the context of neural networks with differential equations
| (which appears to be the original poster's field), the trade-off
| depends:
| https://diffeqflux.sciml.ai/dev/ControllingAdjoints/

| machinekob wrote:
| I strongly agree with readability in my opinion its cause
| Academia people live in "bubbles" and they assume everyone knew
| what a domain specific terms and greek letters means so it's
| easier to read some omega than, for example, learning_rate or lr.
|
| But for us mortals who cross multiple domains it's just
| extremely frustrating to read full math-based notation without
| any extra info about the notation in packages/functions etc., so
| debugging multiple sub-packages gets too time consuming, as you
| have to learn the person's style of writing code, the whole
| scientific notation, and the domain knowledge before you can
| even touch the code.

| aaplok wrote:
| > Academia people live in "bubbles" and they assume everyone
| knew what a domain specific terms and greek letters
|
| Naming things by their English name is not more universal than
| using Greek letters. It's just serving another group of people
| who live in a different bubble.
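For readers unfamiliar with the forward-mode point longemen3000 raises at the top of the thread, here is a minimal sketch of the idea in plain Python using dual numbers. It is not how ForwardDiff.jl or Zygote are implemented; the `Dual` class and the `derivative` helper are hypothetical names made up for illustration. Each value carries its derivative along with it, so a single pass through the function yields one derivative with no tape to record, which is roughly why forward mode tends to be attractive when there are only a few inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (hypothetical helper type)."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: derivatives add.
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, i.e. with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input's derivative with 1."""
    return f(Dual(float(x), 1.0)).deriv


def f(x):
    return 3 * x * x + 2 * x + 1


slope = derivative(f, 2.0)  # analytically f'(x) = 6x + 2
```

Differentiating with respect to n inputs takes n such passes, one per seeded input, which is where reverse mode starts to pay for its overhead.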

| belval wrote:
| Yes and no, the example that the author gives is actually a
| very good one:
|
| > Many Julia APIs look like Optimiser(η=...) rather than
| Optimiser(learning_rate=...). This is a pretty unreadable
| convention.
|
| The learning rate is a well known name that basically every
| one will understand, on the other hand, "η" or eta, is not
| even used everywhere in the literature with some papers using
| alpha instead.
|
| This just looks clever, it's a pretty bad parameter name.

| melissalobos wrote:
| > The learning rate is a well known name that basically
| every one will understand
|
| Absolutely! Because as we all know, everyone speaks
| English.
|
| The GP's point was that Greek letters are used in lots and
| lots of papers even written in other languages. I have read
| quite a few papers in Japanese that used exactly the same
| conventions with respect to the Greek letters and Latin
| letters used.

| 127 wrote:
| Google Translate is one click away. I can easily
| translate both Japanese and Chinese comments and variable
| names to get the gist of it. Using single hieroglyphs for
| it makes the entire endeavor impossible.

| belval wrote:
| How many researchers in the ML/DL community don't speak
| English? I don't have hard numbers but I highly doubt
| that it's a significant proportion. What is the reach of
| your Japanese papers when almost no-one outside of Japan
| can read Japanese?
|
| Even China, despite their best effort to de-westernize
| their culture, still uses English in their research
| papers.
|
| And if all the above wasn't enough, Julia's libraries are
| still all in English so if a hypothetical researcher's
| English is so poor that they don't know what "learning
| rate" is, I'd venture that they'll have trouble
| programming in Julia/JAX/PyTorch.

| SiempreViernes wrote:
| How many don't speak it as a native language? Quite a lot
| as most of the world uses something else as their primary
| language.
|
| If you're instead asking of how many can struggle through
| an English text supported by machine translators, then
| that's clearly almost everyone.
|
| There's very often a significant gap between the ease
| with which the native and the foreign language can be
| used for reasoning, but surely I don't need to point that
| out since any bilingual person knows this.

| agumonkey wrote:
| I understand both camps but I believe these are superficial
| problems. It's like worrying about the comfort of the seat in
| the operating room of a nuclear plant.

| abakus wrote:
| superficial you say. How about I name these in Chinese in my
| package?

| melissalobos wrote:
| Sure, just try to properly document what it does. There are
| some characters that are easily confused at first glance
| even for native speakers, so be sure to use some common
| sense.

| nextos wrote:
| Exactly. I actually find Julia's ecosystem (not the language)
| _way_ more approachable than Python's.
|
| In Python, most libraries are big monoliths. Whereas in
| Julia, libraries are small and composable. Furthermore, it's
| the same language all the way down.
|
| Python's libraries are superb, but the learning curve to
| develop (not to use them) is really steep.

| jstx1 wrote:
| I don't understand. What do you mean by "learning curve to
| develop" an existing Python library?

| bobbylarrybobby wrote:
| And if you ever _do_ want to edit the code, you have to know
| the name of every non-ASCII symbol the codebase uses if you
| want to type out those same symbols without copying and pasting
| them. If you're not familiar with the material, entering a
| character like ξ can be a real challenge, and is actually more
| keystrokes than just typing "xi".

| leephillips wrote:
| For me it's one keystroke, with the dead-Greek modifier key.
|
| I'll do it right here, by typing <G-x>: ξ

| tagrun wrote:
| Difference is 1 keystroke: you type \xi. Why is that a "real
| challenge"?

| tlb wrote:
| What editors does this work in?

| tagrun wrote:
| Juno, Jupyter support it out-of-the-box. With plug-ins:
| VS Code (unicode-latex), Atom (latex-completions), Emacs
| (which has TeX input), Vim (latex-unicoder), ...

| leephillips wrote:
| Any editor, on Linux, if you have your keyboard set up to
| type Greek letters and the Unicode symbols that you use
| frequently. I can do it directly in the comment box: ξ

| animal_spirits wrote:
| The challenge comes from a person like me, who doesn't know
| off the top of my head which Greek letter ξ is. So for each
| of these symbols I'd have to google them and learn it, or
| have some notepad where I can copy paste the needed symbol

| tagrun wrote:
| So you don't know the Greek alphabet, but write high-performance
| computing code involving non-trivial math?
| (Julia's main use case is HPC)

| ShamelessC wrote:
| What? Did you not read the parent of this thread?
|
| > strongly agree with readability in my opinion its cause
| Academia people live in "bubbles" and they assume
| everyone knew what a domain specific terms and greek
| letters means
|
| In any case, I program HPC stuff myself with pytorch and
| no - I don't know the Greek alphabet and probably don't
| understand "non trivial math". The assumption that these
| people can't contribute is pretty off-putting honestly.
| More engineers would join such efforts if there wasn't so
| much gatekeeping.

| belval wrote:
| The author's example is using the Greek letter "eta"
| instead of spelling out "learning_rate", this is a pretty
| damning example.

| dTal wrote:
| You should probably take the time to learn the names of
| the Greek letters. There are only 24 of them and they're
| even related to the English ones. It's not a huge time
| investment, and it's probably worth it if you work in
| engineering.
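One possible middle ground between the two camps above, sketched in Python under stated assumptions: the `SGD` class, its parameters, and the `describe` helper are hypothetical and are not the API of Flux or any other library. Python accepts Greek letters as identifiers, so an API could take the spelled-out name and offer the symbol as an alias. The standard `unicodedata` module also reports a character's official name, which may help when you do not know which letter you are looking at.

```python
import unicodedata


class SGD:
    """Hypothetical optimiser that accepts either spelling of the same knob."""

    def __init__(self, learning_rate=None, *, η=None):
        # Exactly one of the two names must be supplied.
        if (learning_rate is None) == (η is None):
            raise TypeError("pass exactly one of learning_rate or η")
        self.learning_rate = learning_rate if learning_rate is not None else η

    def step(self, params, grads):
        """Plain gradient descent update: p <- p - learning_rate * g."""
        return [p - self.learning_rate * g for p, g in zip(params, grads)]


def describe(symbol):
    """Return the official Unicode name of a single character."""
    return unicodedata.name(symbol)


by_name = SGD(learning_rate=0.5)
by_symbol = SGD(η=0.5)
updated = by_name.step([1.0], [2.0])
xi_name = describe("ξ")
```

Both constructors above build the same optimiser, and `describe` turns an unfamiliar symbol into something searchable. Whether the alias is worth the extra surface area is exactly the judgment call the thread is arguing about.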

| blindseer wrote:
| Another massive frustration for me is that Julia has no formal
| way to say "here are public functions and these are private" but
| does have a completely orthogonal way of saying "these are
| functions that will populate your global namespace if you use
| `using`", i.e. `export variable_name`, and people absolutely
| confuse the hell out of these. I don't think there's even
| agreement in the Julia community if you should use `export` for
| your public API or not.
|
| And if you misspell the exports or change the variable in
| question, Julia won't even warn you about it. That is straight
| crazy behavior to me, and I still don't understand how that
| hasn't been changed.
|
| The `using` + `import` packages in Julia combined with how
| `export`s work make it SUCH a confusing and frustrating
| experience for beginners in Julia.
|
| I personally like mathematical symbols when I'm writing and
| reading code in my domain, but I do feel very lost when I'm
| reading Julia code outside of my area of expertise. All my
| colleagues hate it too (hard to grep, hard to type if you don't
| know the math and are just a software engineer) and I'm coming
| around to the idea of not using it or documenting it explicitly.
|
| The fact that mutable structs are easier to use but immutable
| structs are more performant, the lack of composition of fields
| the way Go handles it, the lack of traits or interfaces, the
| sorry state of compilation time, the non existent tooling all
| lead a beginner / intermediate Julia developer in the wrong
| direction in my opinion. It's very easy to write code that is
| straight up broken or not efficient in Julia, and that's probably
| why I won't pick it for a big project going forward.
|
| But I'm still keeping my eye on it. Maybe in 5 years it'll be the
| language for a lot of the jobs?
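For comparison with the `export` complaint above, here is a rough Python analogue, offered as a sketch and not as a description of Julia's semantics. Python also has two separate mechanisms: `__all__` decides what a star-import copies into the caller's namespace, while a leading underscore marks a name as private by convention. The module names `demo_ok` and `demo_typo` and the helpers `make_module` and `star_import` are made up for this example. The last part shows what Python does with a misspelled entry in `__all__`.

```python
import sys
import types


def make_module(name, source):
    """Build a throwaway in-memory module so the example needs no files."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name):
    """Return the names that `from <name> import *` would bring in."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    namespace.pop("__builtins__", None)
    return sorted(namespace)


make_module("demo_ok", """
__all__ = ["solve"]          # what a star-import copies into the caller
def solve(): return 42       # public and exported
def residual(): return 0.0   # public by convention, but not exported
def _scratch(): return None  # leading underscore: private by convention
""")

make_module("demo_typo", """
__all__ = ["slove"]          # misspelled on purpose
def solve(): return 42
""")

exported = star_import("demo_ok")

try:
    star_import("demo_typo")
    typo_error = None
except AttributeError as err:
    # The star-import fails because the listed name does not exist.
    typo_error = err
```

Here `residual` stays reachable as `demo_ok.residual` even though the star-import skips it, so "exported" and "public" are different questions in Python too. The difference from what blindseer describes is that the misspelled entry fails loudly at import time.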

| adgjlsfhk1 wrote:
| """
| Even in the major well-known well-respected Julia packages -
| I'll avoid naming names - the source code has very obvious cases
| of unused local variables, dead code branches that can never be
| reached, etc.
| """
| Please name names. That way we can fix stuff.
| Other than that, great post!

| lern_too_spel wrote:
| I think the author is pushing for better code quality tooling
| in Julia instead of having people manually fix these problems.

| adgjlsfhk1 wrote:
| we absolutely should, but in the short term, fixing issues is
| good.

| SemanticStrengh wrote:
| The most next gen autodiff library probably is
| https://github.com/breandan/kotlingrad because of its features,
| ergonomy and type safety

| martinsmit wrote:
| Idk, Enzyme is pretty next gen, all the way down to LLVM code.
|
| https://github.com/EnzymeAD/Enzyme

| tagrun wrote:
| In his item #1, he links to
| https://discourse.julialang.org/t/loaderror-when-using-inter...
| The issue is actually a bug in Zygote, a Julia package for
| auto-differentiation, and is not directly related to the Julia
| codebase (or the Flux package) itself. Furthermore, the
| problematic code is working fine now, because DiffEqFlux has
| switched to Enzyme, which doesn't have that bug. He should first
| confirm whether the problem he is citing is actually a problem
| or not.
|
| Item #2, again another Zygote bug.
|
| Item #3, which package? That sounds like hyperbole, an
| extrapolation from a small sample, and "I'll avoid naming names"
| is a lazy excuse that would hide this. It is similarly easy to
| point to poorly written (or poorly documented) JAX code or Python
| code as well, so that doesn't prove that "Julia is lacklustre and
| JAX shines". Also, as an academician, I strongly disagree that
| learning_rate=... is better than η=..., but that's a matter of
| convention & taste, with no bearing on the correctness or
| performance of the package/language. It's bikeshedding.
| I agree that errors are usually not "instructable" in Julia ML
| packages (which needs to be improved), so monkey typing is less
| likely to succeed.
|
| Item #4 is such a nitpicker. Sure, Julia may not have a special
| syntax for that particular fringe array slicing like Numpy does
| and you'll instead need to make a function call, but
| matrix/tensor code in Numpy is usually filled with calls to zips
| and cats with explicit indices, whereas in Julia, no explicit
| indexing needs to be done usually. Also, one can nitpick
| similarly in the opposite direction: Julia has many language
| features lacking from JAX or Python, why not talk about those as
| well?
|
| I'm not a huge fan of Julia either (mainly because of its
| garbage collected nature), but this is such a low-effort
| criticism of it.

| blindseer wrote:
| I've been using Julia since 2017 and still do on a day to day
| basis, and I agree with the author in a lot of cases, even his
| subjective naming conventions gripes.
|
| The author's biggest criticism is that Julia doesn't have
| tooling to make the developer experience better. There's
| Revise, JuliaFormatter, LanguageServer and Jet, but the
| development experience in Python is enviable. There's like 3
| different REPLs, at least two competing linters and auto
| formatters. It's okay to admit that these are places Julia is
| lacking.
|
| I think your kind of response to criticism about Julia is what
| gives the Julia community a bad name, in my opinion. What is
| wrong with saying these things suck and need improvements?
| Would you rather Julia not improve and stay the way it is right
| now forever? Surely I hope not.

| tagrun wrote:
| What exactly do you think I said about Julia's linters and
| auto-formatters?

| nullstyle wrote:
| > but that's a matter of convention & taste, with no bearing on
| the correctness or performance of the package/language. It's
| bikeshedding.
|
| I heartily disagree with that.
| Bikeshedding is about focussing
| on the trivial, and the symbols we choose in our codebases are
| hardly trivial, and indeed many of us regard naming things as
| one of the central problems in programming[1].
|
| [1]: https://medium.com/hackernoon/naming-the-things-in-programmi...

| tagrun wrote:
| Unlike the example in the link you give, η isn't a generic
| random name like a,x that can mean anything. If you ever read
| a paper on stochastic gradient optimization, you'd know that
| η means learning rate in the context.
|
| It is bikeshedding because it is analogous to insisting that
| using "angle" instead of "θ", or "radius" instead of "r" in
| a 2D geometry library is superior and takes your code from
| being lackluster to something that shines (in the words of
| the original author), while not having anything useful to say
| about the mathematical/technical aspects of the code
| itself.
|
| Here is the definition of bikeshedding:
|
| > The term was coined as a metaphor to illuminate Parkinson's
| Law of Triviality. Parkinson observed that a committee whose
| job is to approve plans for a nuclear power plant may spend
| the majority of its time on relatively unimportant but
| easy-to-grasp issues, such as what materials to use for the staff
| bikeshed, while neglecting the design of the power plant
| itself, which is far more important but also far more
| difficult to criticize constructively. It was popularized in
| the Berkeley Software Distribution community by Poul-Henning
| Kamp[1] and has spread from there to the software industry at
| large.
|
| from https://en.wiktionary.org/wiki/bikeshedding

| 00ajcr wrote:
| My interpretation of the point in the blog post was that
| explicitly spelling out variable names makes APIs and the
| underlying code much more accessible to a wider audience.
|
| Sure, there'll be a subset of users of these libraries that
| have read ML/textbooks and are familiar with what η means
| in this context.
|
| Today, many (most?) users of ML libraries will probably not
| know what η means without looking it up. Adhering to
| mathematical notation puts up an unnecessary barrier to
| using the API/code and ultimately limits wider
| engagement/collaboration.
|
| To attract a bigger slice of the ML community, choosing
| names that the ML hobbyist can read, understand and use
| without pause is the better path forward.

| tagrun wrote:
| You are saying most people don't know what η in that
| context means (=people who likely haven't read a book or
| a paper on stochastic gradient, and don't know how it
| actually works), but they would somehow magically figure
| out what it actually does if we call it "learning_rate"
| in ASCII letters. How does that work?
|
| FYI, the documentation of the function
| https://fluxml.ai/Flux.jl/stable/training/optimisers/
| explicitly says it is learning rate:
|
| > Learning rate (η): Amount by which gradients are
| discounted before updating the weights.
|
| so this is already explicit to anyone who reads the
| documentation. The quibble in the post is about the named
| parameter.

| jstx1 wrote:
| > How does that work?
|
| You can look up "learning rate" much easier than to look
| up "what is this Greek letter on my screen" followed by
| "what is the use of this Greek letter in my context" and
| only then followed by searching for "learning rate"
|
| More importantly, it's possible to know what a learning
| rate is without knowing what Greek letter it's commonly
| denoted as. Especially since mathematical notation is so
| inconsistent across authors. I want less ambiguity in
| code, not more. Explicit is better than implicit.
|
| Mathematical notation is notorious for being an absolute
| mess of inconsistencies. Who in their right mind looked
| at it and went "yep, I want more of this in my source
| code".
___________________________________________________________________
(page generated 2022-05-04 23:00 UTC)