[HN Gopher] JEP 430: String Templates (Preview) Proposed to Targ...
       ___________________________________________________________________
        
       JEP 430: String Templates (Preview) Proposed to Target Java 21
        
       Author : za3faran
       Score  : 111 points
       Date   : 2023-03-03 17:52 UTC (5 hours ago)
        
 (HTM) web link (openjdk.org)
 (TXT) w3m dump (openjdk.org)
        
       | spankalee wrote:
       | This looks great.
       | 
       | I maintain the lit-html template library which uses JavaScript
       | tagged template literals to embed HTML in JS.
       | 
       | A key feature of JS template literals is that the tag receives
       | the string fragments and values separately, and can return non-
       | string results. This means we can escape values and validate
       | template structure to prevent XSS attacks.
       | 
       | This approach in Java should lead to a lot more lightweight but
       | safe DSL embeddings.
       | 
       | (I don't love the syntax, but... it's Java)
       | 
       | edit: the other critical feature of JS tagged template literals,
       | that I'm not sure this has, is referential equality of the
       | template strings passed to the tag function across multiple
       | invocations.
       | 
       | This is required to be able to do one-time preparation work on a
       | template and re-use that with different sets of values. Think a
       | `SQL.` template processor that turns the strings into a prepared
       | statement once and re-uses that for each query with different
       | parameters.
       | 
       | edit 2: awesome, it does:
       | 
       | > The fragments() of a StringTemplate are constant across all
       | > evaluations of a template expression, while values() is
       | > computed fresh for each evaluation. For example:
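       | 
       | A minimal sketch of that one-time preparation idea (this is not
       | the JEP's own example). It avoids the preview syntax entirely:
       | `TemplateSketch` and `PreparedCacheSketch` are hypothetical
       | stand-ins, and the cache is keyed on list equality rather than
       | on reference identity, which is a looser assumption than the
       | one described above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a template: the literal fragments plus the
// values captured by one evaluation. Not the JEP's StringTemplate type.
record TemplateSketch(List<String> fragments, List<Object> values) {}

final class PreparedCacheSketch {
    // One prepared SQL text per distinct fragment list.
    private final Map<List<String>, String> prepared = new HashMap<>();

    // Counts how often the one-time preparation actually ran.
    int preparations = 0;

    String sqlFor(TemplateSketch t) {
        return prepared.computeIfAbsent(t.fragments(), fragments -> {
            preparations++;
            // Every embedded value becomes a "?" placeholder.
            return String.join("?", fragments);
        });
    }
}
```

       | Two evaluations with the same fragments but different values
       | share one preparation; only the values change per call.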
        
         | aardvark179 wrote:
         | This spec doesn't need to specify that as it should fall out
         | naturally from Constable and co (JEP 303 and 334) in
         | combination with string templates.
        
       | nezirus wrote:
       | That database stuff looks horrible. Why do they feel the need to
       | introduce DB query templating into string templates? No matter
       | what you do on the client side, the database engine itself should
       | escape/validate the data. Didn't we learn that lesson with PHP?
       | 
       | In addition to that, not every database needs prepared statements
       | for safe queries e.g. "Parametrized queries" in PostgreSQL
       | (available in libpq as PQExecParams and exposed in many other
       | higher level languages)
        
         | bcrosby95 wrote:
         | The database stuff looks like an example, and shows how you
         | could extend the template system in a way that doesn't
         | introduce security problems.
        
           | nezirus wrote:
           | That is exactly the point, you should not use general string
           | templating system for SQL queries, together with "roll your
           | own" escape and validation mechanisms. I really don't see why
           | they included that part, if not to show how to shoot yourself
           | in the foot.
        
             | bcrosby95 wrote:
             | There is no roll your own escape mechanics. The example
             | uses prepared statements.
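             | 
             | A rough sketch of why no escaping is involved, written
             | without the preview syntax. `SqlSketch` and `Query` are
             | hypothetical names, and the fragments/values lists stand
             | in for what a template processor would receive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SqlSketch {
    // SQL text with placeholders, plus the values to bind separately.
    record Query(String sql, List<Object> params) {}

    static Query toQuery(List<String> fragments, List<Object> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected one more fragment than values");
        }
        // Values never touch the SQL text; they only travel as parameters.
        String sql = String.join("?", fragments);
        return new Query(sql, Collections.unmodifiableList(new ArrayList<>(values)));
    }
}
```

             | With JDBC, `sql` would go to Connection.prepareStatement
             | and each parameter to PreparedStatement.setObject, so a
             | hostile value stays data and never becomes query text.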
        
       | Phelinofist wrote:
       | What happens if I already have a field called STR or FMT?
       | 
       | > STR is a public static final field that is automatically
       | imported in every Java source file.
        
         | kaba0 wrote:
         | Then your STR would be used, and if you want to use this string
         | template STR you would have to use its qualified name.
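         | 
         | As an analogy only, here is the same shadowing rule with an
         | ordinary static import. `ShadowSketch` is a hypothetical
         | class, and it is an assumption that the automatic STR import
         | resolves the same way.

```java
import static java.lang.Math.*;

final class ShadowSketch {
    // A field declared in the class shadows the statically imported Math.PI.
    static final String PI = "my own PI";

    static String simpleName() {
        return PI; // resolves to ShadowSketch.PI
    }

    static double qualifiedName() {
        return Math.PI; // the imported constant is still reachable when qualified
    }
}
```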
        
           | ivan_gammel wrote:
           | Or have an alias:
           | 
           |     public static final ValidationProcessor $ = ....STR;
           |     $."My name is \{name}"
        
       | ko27 wrote:
       | It's interesting how C# is always far ahead of Java, they
       | introduced it way earlier, the syntax is simpler, and you can
       | make it safe by using FormattableString as the param type, for
       | example in EF you can do this without worrying about SQL
       | injection:
       | 
       | FromSql($"EXECUTE dbo.GetMostPopularBlogsForUser {user}")
       | 
       | https://learn.microsoft.com/en-us/ef/core/querying/sql-queri...
        
         | DaiPlusPlus wrote:
         | A major shortcoming with FormattableString is that the C#
         | compiler is hard-coded to always prefer implicitly converting
         | an interpolated string to a string, which makes it impossible
         | to write extension-methods for FormattableString objects.
         | 
         | ...and they also always default to formatting with
         | CurrentCulture instead of InvariantCulture. Apparently this was
         | by-design as interpolated strings were never originally
         | intended for generating machine-readable strings.
         | 
         | Finally, there's no way to perform common "mini-templating"
         | with a FormattableString, such as repeating regions, show/hide
         | regions within a string, or little things like inflection (e.g.
         | rendering "{0:N} items" or "{0:N0} item" when arg0 is 1 or not).
         | 
         | I'm happy to see Java (finally) gain a similar feature, but it,
         | like C#'s, seems... limited in its abilities.
        
         | pjmlp wrote:
         | Not always, e.g. default interface methods, just to give one
         | example.
        
       | _old_dude_ wrote:
       | It seems the design team has lost its mojo.
       | 
       | First, the lessons of Log4Shell have not been learnt. Allowing
       | more libraries to do formatting, what can go wrong?
       | 
       | And String.format() is slow because every value is boxed. This
       | JEP repeats the same mistake with the templates.
        
         | pron wrote:
         | The template instantiation itself is done by the language, not
         | libraries, and the entire mechanism was designed for security
         | (read the JEP); for example, templates are (virtually) limited
         | to literals and can't come from user input.
         | 
         | As to boxing, the built-in template processors, STR and FMT,
         | don't do boxing (they use MethodHandle mechanisms similar to
         | those used by lambdas) -- FMT is ~40x faster than
         | String.format, I'm told -- but that capability hasn't been
         | exposed as an API yet. As with other features, we try exposing
         | basic usages first, and more sophisticated ones later.
        
       | MBCook wrote:
       | I thought the choice of STR was ugly at first but as you continue
       | reading and see how that allows multiple options that provide
       | more than generic string joining it really starts to make sense.
       | 
       | Unlike many here I do like the choice of \{}, although
       | personally I would have preferred \() like Swift.
       | 
       | I can't wait.
        
       | slt2021 wrote:
       | Fantastic and long awaited feature. Little bit clunky/ugly choice
       | of chars, but I guess we will get used to it
        
       | MH15 wrote:
       | A bit ugly, but seems powerful. Very similar to tagged template
       | literals in JS.
        
       | kaba0 wrote:
       | Not sure where I should give this minor heads up regarding the
       | JEP:
       | 
       |     StringProcessor INTER = (StringTemplate st) -> {
       |         String placeHolder = "*";
       |         String stencil = String.join(placeHolder, fragments);
       |         for (Object value : st.values()) {
       |             String v = String.valueOf(value);
       |             stencil = stencil.replaceFirst(placeHolder, v);
       |         }
       |         return stencil;
       |     };
       | 
       | 'fragments' should be 'st.fragments()' here I believe to make it
       | compile.
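       | 
       | For comparison, a sketch of the same interleaving done directly,
       | without a placeholder and without the preview API. The class
       | and method names are hypothetical, and plain lists stand in for
       | st.fragments() and st.values(). It sidesteps replaceFirst, whose
       | first argument is a regex and whose replacement string treats
       | `$` and `\` specially.

```java
import java.util.List;

final class InterleaveSketch {
    static String interpolate(List<String> fragments, List<Object> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected one more fragment than values");
        }
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(String.valueOf(values.get(i))); // value i goes between fragments i and i+1
            sb.append(fragments.get(i + 1));
        }
        return sb.toString();
    }
}
```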
        
       | tofflos wrote:
       | I wonder if annotations and annotation pre-processors could have
       | been an alternate way to approach this had they been applicable
       | to String constants.
       | 
       |     String name = "Joan";
       |     PreparedStatement query = @SQL "select * from users where firstname = \{name}";
        
         | ivan_gammel wrote:
         | Annotations are declarative, processors are imperative. You
         | need an interface with a method to run the processing.
        
       | rcoveson wrote:
       | > Gosling: For me as a language designer, which I don't really
       | count myself as these days, what "simple" really ended up meaning
       | was could I expect J. Random Developer to hold the spec in his
       | head. That definition says that, for instance, Java isn't -- and
       | in fact a lot of these languages end up with a lot of corner
       | cases, things that nobody really understands. Quiz any C
       | developer about unsigned, and pretty soon you discover that
       | almost no C developers actually understand what goes on with
       | unsigned, what unsigned arithmetic is. Things like that made C
       | complex. The language part of Java is, I think, pretty simple.
       | The libraries you have to look up.[0]
       | 
       | So we can have this super powerful, extensible string
       | interpolation system using never-before-seen syntax as a
       | _language_ feature, not a standard library extension, but not
       | unsigned integers?
       | 
       | I'll echo what other commenters have said: This looks powerful
       | but ugly. I'll add to it by pointing out that "powerful but ugly"
       | is what we've already got. So we gain the ability to move format
       | arguments inline with the string contents _if_ we're willing to
       | use the `STR."\{}"` syntax? That seems like a lateral move.
       | 
       | 0. http://www.gotw.ca/publications/c_family_interview.htm
        
         | pron wrote:
         | We _could_ have unsigned integers but we're choosing not to
         | because on the whole their disadvantages outweigh their
         | advantages. On the other hand, once we have user-defined value
         | types, you'll be able to define unsigned integers in a library
         | if you want.
         | 
         | The extensibility and power of string templates were a
         | requirement; security experts quite simply vetoed adding string
         | interpolation as it's just too dangerous, especially in server-
         | side software, where Java is mostly used.
         | 
         | As to `STR.` -- which you _won't_ need to use most of the time
         | -- see my other comment:
         | https://news.ycombinator.com/item?id=35013470
        
           | Kwpolska wrote:
           | Python is also widely used server-side, and they introduced
           | f-strings with simple and friendly syntax a few years ago. JS
           | added template literals in 2015's ES6, when Node.js was very
           | much a thing. Why is Java special here?
        
             | pron wrote:
             | Java's syntax is just as friendly and simple -- see my
             | other comments on the subject -- it just requires the
             | receiver to define a policy, which is essential for
             | security. You only need to use STR when the receiver does
             | _not_ define a template processor and works with strings. I
             | have no idea what Python's or JS's security experts
             | advised, but that code injection is one of the most common
             | vulnerabilities in memory-safe languages is a fact reported
             | by all security advisories. String interpolation is one of
             | the most dangerous features a language can have.
        
               | Kwpolska wrote:
               | I'm not a fan of the special `log.info."x: \{x}";`
               | syntax, it looks like a weird mix of field access and a
               | string literal.
               | 
               | And even with the fancy new syntax, I'm sure there will
               | be people passing STR."SELECT * FROM x WHERE y = \{y}"
               | to their database. You can educate people all you want,
               | but not all developers will read the docs telling them
               | this is dangerous. Even if all the docs do the right
               | thing, people might end up reading old tutorials, and
               | will then notice the "inefficiency" of the prepared
               | statement syntax and will just do a STR."" format. Or
               | they might consider the database.executeQuery."SELECT
               | \{x}"; syntax "invalid" and try to fix it.
        
       | mabbo wrote:
       |     String name = "Joan";
       |     String info = STR."My name is \{name}";
       | 
       | I don't love it, to be honest. `STR` is a bit much. Requiring the
       | \ for the first but not second brace. I want string templates in
       | Java, but this feels ugly.
        
         | v-erne wrote:
         | I'm guessing that the second backslash was deemed unnecessary
         | (and in my opinion it is: no need for additional keystrokes in
         | the name of some arbitrary consistency). Besides, you are
         | probably looking at this wrong: the backslash here is the
         | equivalent of the hash or dollar sign; it denotes the beginning
         | of an expression (just like in most languages). And they
         | literally could not use another one because of backward
         | compatibility if they want to keep the "no STR"/simple version
         | of interpolation possible. They will just add a new escape
         | sequence, \{. It's kinda brilliant if you think about it.
        
         | pron wrote:
         | STR is not some special syntax, just a receiver for a method
         | call. You would only need STR to interpolate a template into a
         | String, but any receiver can specify that interpretation of the
         | template -- or a different one. So, for example, a logger could
         | use:
         | 
         |     log.info."x: \{x}";
         | 
         | and similarly for other uses that don't produce strings but
         | JSON, SQL etc. So STR would only be used -- hopefully rarely --
         | for old APIs that have not determined their own template
         | interpretation strategy and only accept String.
         | 
         | The use of the backslash is important to distinguish between a
         | string literal and a template literal because "x: \{x}" is not
         | a valid string literal today. Swift uses \(...), BTW.
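         | 
         | To make "any receiver can specify that interpretation"
         | concrete, here is a sketch without the preview syntax.
         | `ProcessorSketch` and `HtmlSketch` are hypothetical and are
         | not the JEP's API; the escaping shown is only meant for HTML
         | text content, not for attribute or URL contexts.

```java
import java.util.List;

// Hypothetical processor shape: it sees fragments and values separately
// and decides what to build from them.
interface ProcessorSketch<R> {
    R process(List<String> fragments, List<Object> values);
}

final class HtmlSketch {
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // Policy: trust the literal fragments, escape every embedded value.
    static final ProcessorSketch<String> HTML = (fragments, values) -> {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(escape(String.valueOf(values.get(i))));
            sb.append(fragments.get(i + 1));
        }
        return sb.toString();
    };
}
```

         | A plain interpolating processor would differ only in skipping
         | the escape call, which is the sense in which the policy
         | belongs to the receiver rather than to the template.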
        
           | Bjartr wrote:
           | > So STR would only be used -- hopefully rarely
           | 
           | If that's the hope, why have it automatically imported
           | statically for all files?
        
             | pron wrote:
             | Because it doesn't hurt, and because it will take some time
             | until all relevant libraries expose their chosen template
             | processor, reducing the need for STR. The tax on less
             | restrained uses needs to be neither too high nor too low.
        
         | Alupis wrote:
         | It is indeed ugly. Having switched full-time to Kotlin a little
         | less than a year ago, I honestly prefer their approach to
         | String interpolation using `${var}`.
         | 
         | I understand Java has a backwards compatibility issue that
         | Kotlin does not, though: adopting `${var}` would be a breaking
         | change for all the hard-coded strings containing `$`, which
         | would then need to be escaped. This is discussed in the JEP.
         | 
         | Backwards compatibility - simultaneously Java's strongest and
         | weakest asset.
        
         | lstamour wrote:
         | Yeah, the choice of \{ to create an expression instead of
         | printing a literal { character is an odd one when you compare
         | how you output a double-quote character:
         | 
         | > To aid refactoring, double-quote characters can be used
         | inside embedded expressions without escaping them as \".
         | 
         | I would expect for consistency that \"{expression}\" outputs an
         | expression and \"\{expression}\" would not.
        
           | kaba0 wrote:
           | Why would you have to escape '{'?
           | 
           | I don't get your examples. The JEP meant that they can be
           | used _inside_, not in-between, so:
           | 
           |     \{ "a" + "b" + 3 }
           | 
           | could work.
        
       | Noe2097 wrote:
       | Why oh why is that a backslash!?
       | 
       | Over the 9 languages mentioned: 5 languages (inc. JVM ones like
       | groovy and kotlin) are using `$`, 1 language (swift) is using
       | `\`.
       | 
       | `\` is a pain to type on many keyboard layouts -- actually most
       | but the US one. It seemed to me that `$` would have been a much
       | more "conventional" choice.
       | 
       | This really makes me sad. It looks like the choice was made on
       | purpose to be different.
        
         | kothar wrote:
         | I assume because `\{` was not a valid escape sequence, which
         | means any use of this character pair can be identified as a
         | template without changing the semantics of existing string
         | literals.
        
           | aardvark179 wrote:
           | Bingo!
        
         | qw wrote:
         | I agree. On my keyboard I need to press [?] + [?] + 7
         | 
         | It's actually made worse because { and } also need special
         | combinations.
         | 
         | To type \{X} I need to press
         | 
         |     ([?] + [?] + 7) (SHIFT + [?] + 8) X (SHIFT + [?] + 9)
         | 
         | I know this is something we need to live with due to historical
         | reasons, but I would prefer that new syntax is made simpler.
        
           | lenkite wrote:
           | What about parenthesis () and square brackets [] ? (writing a
           | lisp DSL that might have users from non-US layouts)
        
           | bberrry wrote:
           | Agreed, the economics for this syntax is terrible.
        
         | spankalee wrote:
         | I don't understand this rationale either:
         | 
         | > For the syntax of embedded expressions we considered using
         | ${...}, but that would require a tag on string templates
         | (either a prefix or a delimiter other than ") to avoid
         | conflicts with legacy code.
         | 
         | Can't the template processor expression itself function as the
         | tag? Is STR."..." already legal now?
        
         | pdevr wrote:
         | Seriously, yes. Great thought process behind the design, marred
         | by the good-for-nothing (subjective opinion) backslash.
         | 
         | If any of you design a language or a DSL, please, please -
         | avoid the backslash - for the reasons stated above. Hard to
         | type, introduces unseen problems, most will hate it. It
         | (backslash) is unbecoming of anything elegant.
        
           | SyrupThinker wrote:
           | But if you do follow that advice consider if it is worth just
           | using a different escape character for _all_ interpolations.
           | That way you could still make use of this very sensible syntax of
           | reusing the escape for interpolation whilst avoiding the
           | backslash issues.
        
         | cs02rm0 wrote:
         | I'm not a fan either. Java seems to be declining in prevalence
         | in my corner of industry, I'm sure these changes are made by
         | wiser minds than mine, but I'm sceptical about whether such a
         | choice is really right for users.
        
         | jacobn wrote:
         | I'd also really have preferred $, but apparently there's some
         | backwards compatibility issue with using that character?
        
           | kaba0 wrote:
           | "${val}" is a legal string literal that is certainly used in
           | plenty of existing programs.
           | 
           | In current Java you would have to escape \, so only
           | "\\{val}" could have existed before.
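           | 
           | A tiny illustration of the first point in plain Java (the
           | class name is hypothetical): the variable is never looked
           | at, because "${val}" is just six ordinary characters.

```java
final class DollarLiteralSketch {
    static String today() {
        String val = "ignored";   // never read: nothing is interpolated
        return "${val}";          // an ordinary string literal in existing Java
    }
}
```

           | Giving `${...}` a new meaning inside plain double quotes
           | would silently change what code like this returns.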
        
             | _old_dude_ wrote:
             | That's why JS uses ` and does not reuse ".
             | 
             | You want the lexer to be able to tell the difference
             | between a constant string and a template string, and you
             | want users to be able to read the code.
             | 
             | This JEP solves the former, not the latter.
        
               | kaba0 wrote:
               | It's not as bad as people make it out to be. Swift
               | without any backwards compatibility constraints chose
               | "\(val)". It does need a slight getting used to but it
               | is not any worse than ${}.
        
               | Kwpolska wrote:
               | Swift's choice is strange and ugly IMO, even if it did
               | not involve backwards compatibility constraints.
        
       | supriyo-biswas wrote:
       | The proposal is uglier than f-strings in Python. A better
       | approach would be to use backticks like JavaScript does.
        
         | geodel wrote:
         | As long as it is technically solid, ugly is just fine in places
         | where Java is used most. Even if it were the most lovely
         | syntax, Python/JS devs are not gonna jump to code in Java.
        
       | jerf wrote:
       | I am legit impressed, and I say this as one who has been _very_
       | hard on string interpolation in other languages; see
       | https://www.jerf.org/iri/post/2942/ and the matching
       | https://pkg.go.dev/github.com/thejerf/strinterp#section-read...
       | for instance. I have criticisms most developers don't even think
       | about.
       | 
       | This isn't exactly what I laid out, of course, but I think it
       | achieves the goals I was looking for, which is the real issue. In
       | particular, having the default string interpolation be prefixed
       | with "STR." is enough for linters and scanners to get a chew on
       | (it's easy by scanner standards to track that a STR.-interpolated
       | string got fed into a database query), and for code reviewers to
       | develop an instinct to look at such interpolations more closely
       | than they need to for a DB. interpolation. An STR annotation is
       | not technically necessary for the former, if it were simply the
       | default, but it is a big deal for the latter. I want people to
       | have a chance to notice and think about their use of bare string
       | interpolation for at least a fraction of a second as they type
       | "STR." (or autocomplete it or whatever).
       | 
       | This does put a heavy burden on the libraries to implement it
       | correctly, but if they do it's even safer than what I was
       | thinking.
       | 
       | One thing though: Please for the love of the internet, for those
       | of you writing interpolators, DO NOT write an interpolator that
       | picks apart the values passed in through a \{ ... } and starts
       | instantiating arbitrary classes via Java reflection. Just stay
       | away from that entirely, OK?
        
         | jsiepkes wrote:
         | I guess it's an advantage of playing catch-up. You can learn
         | from other languages' experience and mistakes.
         | 
         | Still, props to the Java team for doing a good job.
        
           | pron wrote:
           | Back in 1997, when James Gosling outlined his vision for
           | Java, he said it should be a conservative language (wrapping
           | a very innovative runtime) that would ideally only adopt
           | features that have proven themselves, for some time, in other
           | languages. Being a last mover is at the very core of Java's
           | evolution strategy. It's not playing catch-up because we're
           | not trying to adopt all features other, more "adventurous",
           | languages have, but rather to selectively pick the _fewest_
           | features that would have the biggest impact. That's the
           | aspiration, at least.
        
             | oblio wrote:
             | Yeah, you're mostly right, but then... generics.
        
             | avgcorrection wrote:
             | Thanks for choosing only tried and proven tech like
             | pervasive nullable pointers.
        
             | The_Colonel wrote:
             | Right, and that's why Java introduced checked exceptions in
             | 1.0. A feature which wasn't proven back then and remains
             | unproven now.
             | 
             | Another example being overengineered streams. Parallel
             | streams look really cool as a demo, but have been proven
             | dangerous in their default configuration (they can exhaust
             | the default thread pool).
        
               | kaba0 wrote:
               | Checked exceptions are exact analogs to Result types, but
               | better (auto-unwrap, bubble up, stack traces). It is
               | unfortunately far from flawless (inheritance is not a
               | great combo with it), but that has been part of the
               | language since the beginning.
               | 
               | Streams are not a language feature, it is only a library.
               | And I honestly don't find them over-engineered, they
               | really make plenty of logic very readable.
        
               | lenkite wrote:
               | Pity that the Streams functions don't support checked
               | exceptions.
        
             | taftster wrote:
             | And honestly, in my opinion, that approach is just really
             | proving itself right now. I am so happy to see Java
             | seemingly on the right path again, adopting solid
             | capabilities, innovating, but letting other languages take
             | some punches first. We have had some dark days (maybe a
             | dark decade), but full steam ahead now. Thank you pron and
             | team!
        
             | rileyphone wrote:
             | Do you have a link for the 1997 document? The history is
             | surprisingly difficult to explore for being pretty recent.
        
               | pron wrote:
               | Here's a public 1997 document where much of that is said:
               | 
               | https://www.win.tue.nl/~evink/education/avp/pdf/feel-of-java...
        
         | aardvark179 wrote:
         | I think I've discussed this on and off with people for almost
         | six years at this point. I'm so glad to see it's finally
         | happening.
         | 
         | Now we just need to combine this, the constants work that has
         | been done over the last few years, and regular expressions, to
         | make the regexp API so much nicer.
        
         | bokchoi wrote:
         | > One thing though: Please for the love of the internet, for
         | those of you writing interpolators, DO NOT write an
         | interpolator that picks apart the values passed in through a
         | \{ ... } and starts instantiating arbitrary classes via Java
         | reflection. Just stay away from that entirely, OK?
         | 
         | Indeed. Please no more log4shell hell
        
       | lanna wrote:
       | Are they trying to make Scala look more complicated than the
       | other languages on purpose? The Scala example f"$x%d plus $y%d
       | equals ${x + y}%d" could be written simply as s"$x plus $y equals
       | ${x + y}"
        
         | geodel wrote:
         | Yes, Java is legit scared of Scala's rising popularity.
        
           | oblio wrote:
           | Is Scala rising? It used to be hyped a ton back in the
           | earlier Twitter days, about 10 years ago, but since then,
           | everything has been quiet.
           | 
           | If anything, Kotlin is probably ahead of Scala now,
           | adoption-wise.
        
           | isbvhodnvemrwvn wrote:
           | Sarcasm carries poorly over text :)
        
           | AtlasBarfed wrote:
           | If anything Java is becoming groovy. Almost all features
           | adopted are groovy features.
        
           | avgcorrection wrote:
           | Wait. Did we just get teleported back to 2018?
        
         | ndriscoll wrote:
         | It's funny too because the Scala ecosystem has had safe string
         | interpolation for years. e.g. with Slick sql"SELECT * FROM
         | Person p WHERE p.last_name = $lastName" will produce a
         | parameterized query/prepared statement that you can run, and it
         | does this at compile time so runtime injection like log4shell
         | is impossible.
        
       | clhodapp wrote:
       | It's interesting that this feature is essentially a clone of how
       | string interpolation has been made extensible in Scala. However,
       | the JEP only mentions Scala once.... in the form of an example
       | which has been made strangely more complex than it has to be (it
       | would be much shorter if they had used the s interpolator instead
       | of the f interpolator).
       | 
       | The one major difference is that, because Scala has macros and
       | typeclasses, it can turn malformed interpolations for things like
       | JSON or SQL into compile-time errors instead of runtime errors.
        
       | ninkendo wrote:
       | One thing I'm not sure is possible in this proposal, is there a
       | way to make the interpolation "lazy", such that the string (and
       | the evaluation of its interpolated components) can be skipped if
       | the string isn't ultimately used?
       | 
       | In Swift, there's some nice quasi-laziness you can add to
       | function parameters with the `@autoclosure` syntax, so that
       | (say) a logging function can fully skip evaluating a string
       | sent to it:
       | 
       |     func log(message: @autoclosure () -> String, level: Level = .info) {
       |         guard level >= configuredLevel else {
       |             return
       |         }
       |         actuallyLog(message())
       |     }
       | 
       | And call it like:
       | 
       |     log(message: "User is \(someExpensiveFunction(user))", level: .debug)
       | 
       | And if the configured log level does not include debug messages,
       | `someExpensiveFunction(user)` doesn't get called.
       | 
       | This works because @autoclosure lets you take a parameter that is
       | "function returning String", but callers can just pretend they're
       | passing it a String, without having to decorate it in a function.
       | The compiled code will turn it into a closure behind the scenes,
       | and thus it'll be evaluated lazily.
       | 
       | Not sure if there's any way to do something like this in Java
       | with this proposal...
        
         | lazulicurio wrote:
         | It's a tad more verbose, but I assume the simplest way would
         | just be to use a standard closure.
         | 
         |     void log(Supplier<? extends String> messageSupplier);
         |     log(() -> STR."User is \{someExpensiveFunction(user)}");
         | 
         | The pattern of using Supplier to provide a lazy argument is
         | pretty well-established afaict.
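         | 
         | A runnable sketch of that Supplier pattern, using plain
         | concatenation instead of the preview template syntax. All
         | names here (`LazyLogSketch`, `Level`, the call counter) are
         | hypothetical and exist only to make the laziness observable.

```java
import java.util.function.Supplier;

final class LazyLogSketch {
    enum Level { DEBUG, INFO }

    static Level configured = Level.INFO;
    static int expensiveCalls = 0;                       // how often the costly work ran
    static final StringBuilder out = new StringBuilder(); // stands in for the log sink

    static String someExpensiveFunction(String user) {
        expensiveCalls++;
        return "details(" + user + ")";
    }

    static void log(Level level, Supplier<String> message) {
        if (level.compareTo(configured) < 0) {
            return; // below the configured level: the supplier is never invoked
        }
        out.append(message.get()).append('\n');
    }
}
```

         | A DEBUG call under an INFO configuration returns before
         | message.get(), so the expensive function is not evaluated.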
        
       | ivan_gammel wrote:
       | I love it. The design choices are great, given the number of
       | constraints. Looking forward to IDE and Maven plugin support of
       | this feature, which can add a lot of juice in pre-compile time
       | enabling very well integrated DSLs.
        
       ___________________________________________________________________
       (page generated 2023-03-03 23:00 UTC)