[HN Gopher] JEP 430: String Templates (Preview) Proposed to Targ... ___________________________________________________________________ JEP 430: String Templates (Preview) Proposed to Target Java 21 Author : za3faran Score : 111 points Date : 2023-03-03 17:52 UTC (5 hours ago) (HTM) web link (openjdk.org) (TXT) w3m dump (openjdk.org) | spankalee wrote: | This looks great. | | I maintain the lit-html template library which uses JavaScript | tagged template literals to embed HTML in JS. | | A key feature of JS template literals is that the tag receives | the string fragments and values separately, and can return non- | string results. This means we can escape values and validate | template structure to prevent XSS attacks. | | This approach in Java should lead to a lot more lightweight but | safe DSL embeddings. | | (I don't love the syntax, but... it's Java) | | edit: the other critical feature of JS tagged template literals, | that I'm not sure this has, is referential equality of the | template strings passed to the tag function across multiple | invocations. | | This is required to be able to do one-time preparation work on a | template and re-use that with different sets of values. Think a | `SQL.` template processor that turns the strings into a prepared | statement once and re-uses that for each query with different | parameters. | | edit 2: awesome, it does: The fragments() of a | StringTemplate are constant across all evaluations of a template | expression, while values() is computed fresh for each evaluation. | For example: | aardvark179 wrote: | This spec doesn't need to specify that as it should fall out | naturally from Constable and co (JEP 303 and 334) in | combination with string templates. | nezirus wrote: | That database stuff looks horrible. Why do they feel the need to | introduce DB query templating into string templates? No matter | what you do on the client side, the database engine itself should | escape/validate the data. 
Didn't we learn that lesson with PHP? | | In addition to that, not every database needs prepared statements | for safe queries e.g. "Parametrized queries" in PostgreSQL | (available in libpq as PQexecParams and exposed in many other | higher level languages) | bcrosby95 wrote: | The database stuff looks like an example, and shows how you | could extend the template system in a way that doesn't | introduce security problems. | nezirus wrote: | That is exactly the point, you should not use a general string | templating system for SQL queries, together with "roll your | own" escape and validation mechanisms. I really don't see why | they included that part, if not to show how to shoot yourself | in the foot. | bcrosby95 wrote: | There is no roll your own escape mechanics. The example | uses prepared statements. | Phelinofist wrote: | What happens if I already have a field called STR or FMT? | | > STR is a public static final field that is automatically | imported in every Java source file. | kaba0 wrote: | Then your STR would be used, and if you want to use this string | template STR you would have to use its qualified name. | ivan_gammel wrote: | Or have an alias: public static final | ValidationProcessor $ = ....STR; $."My name is | \{name}" | archgoon wrote: | [dead] | ko27 wrote: | It's interesting how C# is always far ahead of Java, they | introduced it way earlier, the syntax is simpler, and you can | make it safe by using FormattableString as the param type, for | example in EF you can do this without worrying about SQL | injection: | | FromSql($"EXECUTE dbo.GetMostPopularBlogsForUser {user}") | | https://learn.microsoft.com/en-us/ef/core/querying/sql-queri... | DaiPlusPlus wrote: | A major shortcoming with FormattableString is that the C# | compiler is hard-coded to always prefer implicitly converting | an interpolated string to a string, which makes it impossible | to write extension-methods for FormattableString objects. 
| | ...and they also always default to formatting with | CurrentCulture instead of InvariantCulture. Apparently this was | by-design as interpolated strings were never originally | intended for generating machine-readable strings. | | Finally, there's no way to perform common "mini-templating" | with a FormattableString, such as repeating regions, show/hide | regions within a string, or little things like inflection (e.g. | rendering "{0:N} items" or "{0:N0} item" when arg0 is 1 or not). | | I'm happy to see Java (finally) gain a similar feature, but it, | like C#'s, seems... limited in its abilities. | pjmlp wrote: | Not always, e.g. default interface methods, just to give one | example. | [deleted] | _old_dude_ wrote: | It seems the design team has lost its mojo. | | First, the lessons of Log4Shell have not been learnt. Allowing | more libraries to do formatting, what can go wrong? | | And String.format() is slow because every value is boxed. This | JEP repeats the same mistake with the templates. | pron wrote: | The template instantiation itself is done by the language, not | libraries, and the entire mechanism was designed for security | (read the JEP); for example, templates are (virtually) limited | to literals and can't come from user input. | | As to boxing, the built-in template processors, STR and FMT, | don't do boxing (they use MethodHandle mechanisms similar to | those used by lambdas) -- FMT is ~40x faster than | String.format, I'm told -- but that capability hasn't been | exposed as an API yet. As with other features, we try exposing | basic usages first, and more sophisticated ones later. | [deleted] | MBCook wrote: | I thought the choice of STR was ugly at first but as you continue | reading and see how that allows multiple options that provide | more than generic string joining it really starts to make sense. | | Unlike many here I do like the choice of \{}, although | personally I would have preferred \() like Swift. | | I can't wait. 
| slt2021 wrote: | Fantastic and long awaited feature. Little bit clunky/ugly choice | of chars, but I guess we will get used to it | MH15 wrote: | A bit ugly, but seems powerful. Very similar to tagged template | literals in JS. | kaba0 wrote: | Not sure where should I give this minor heads up regarding the | JEP: StringProcessor INTER = (StringTemplate st) | -> { String placeHolder = "*"; String stencil | = String.join(placeHolder, fragments); for (Object | value : st.values()) { String v = | String.valueOf(value); stencil = | stencil.replaceFirst(placeHolder, v); } return | stencil; }; | | 'fragments' should be 'st.fragments()' here I believe to make it | compile. | tofflos wrote: | I wonder if annotations and annotation pre-processors could have | been an alternate way to approach this had they been applicable | to String constants. String name = "Joan"; | PreparedStatement query = @SQL "select * from users where | firstname = \{name}"; | ivan_gammel wrote: | Annotations are declarative, processors are imperative. You | need an interface with a method to run the processing. | rcoveson wrote: | > Gosling: For me as a language designer, which I don't really | count myself as these days, what "simple" really ended up meaning | was could I expect J. Random Developer to hold the spec in his | head. That definition says that, for instance, Java isn't -- and | in fact a lot of these languages end up with a lot of corner | cases, things that nobody really understands. Quiz any C | developer about unsigned, and pretty soon you discover that | almost no C developers actually understand what goes on with | unsigned, what unsigned arithmetic is. Things like that made C | complex. The language part of Java is, I think, pretty simple. | The libraries you have to look up.[0] | | So we can have this super powerful, extensible string | interpolation system using never-before-seen syntax as a | _language_ feature, not a standard library extension, but not | unsigned integers? 
| | I'll echo what other commenters have said: This looks powerful | but ugly. I'll add to it by pointing out that "powerful but ugly" | is what we've already got. So we gain the ability to move format | arguments inline with the string contents _if_ we're willing to | use the `STR."\{}"` syntax? That seems like a lateral move. | | 0. http://www.gotw.ca/publications/c_family_interview.htm | pron wrote: | We _could_ have unsigned integers but we're choosing not to | because on the whole their disadvantages outweigh their | advantages. On the other hand, once we have user-defined value | types, you'll be able to define unsigned integers in a library | if you want. | | The extensibility and power of string templates were a | requirement; security experts quite simply vetoed adding string | interpolation as it's just too dangerous, especially in server- | side software, where Java is mostly used. | | As to `STR.` -- which you _won't_ need to use most of the time | -- see my other comment: | https://news.ycombinator.com/item?id=35013470 | Kwpolska wrote: | Python is also widely used server-side, and they introduced | f-strings with simple and friendly syntax a few years ago. JS | added template literals in 2015's ES6, when Node.js was very | much a thing. Why is Java special here? | pron wrote: | Java's syntax is just as friendly and simple -- see my | other comments on the subject -- it just requires the | receiver to define a policy, which is essential for | security. You only need to use STR when the receiver does | _not_ define a template processor and works with strings. I | have no idea what Python's or JS's security experts | advised, but that code injection is one of the most common | vulnerabilities in memory-safe languages is a fact reported | by all security advisories. String interpolation is one of | the most dangerous features a language can have. 
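The "receiver defines a policy" point in the comment above can be sketched in ordinary Java, without the preview syntax. The class below is hypothetical and is not the JEP 430 API: it models a template as two plain lists (the names fragments and values are borrowed from the StringTemplate description quoted earlier in the thread) and assumes there is always one more fragment than there are values. The same template is run through two policies, plain interpolation and parameterization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the JEP 430 API: a template is modeled as two plain
// lists so the example compiles on Java 9+ without --enable-preview.
final class SqlPolicySketch {

    /** SQL text containing '?' placeholders, plus the values to bind to them. */
    static final class Query {
        final String sql;
        final List<Object> params;

        Query(String sql, List<Object> params) {
            this.sql = sql;
            this.params = params;
        }
    }

    // Assumption of this sketch: n values are surrounded by n + 1 fragments.
    private static void checkShape(List<String> fragments, List<?> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected values.size() + 1 fragments");
        }
    }

    /** Policy A, plain interpolation: values are spliced into the text itself. */
    static String interpolate(List<String> fragments, List<?> values) {
        checkShape(fragments, values);
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    /** Policy B, parameterization: values never become part of the SQL text. */
    static Query parameterize(List<String> fragments, List<?> values) {
        checkShape(fragments, values);
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append('?').append(fragments.get(i + 1));
        }
        return new Query(sb.toString(), new ArrayList<>(values));
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("SELECT * FROM users WHERE id = ", "");
        List<Object> values = List.of("1 OR 1=1"); // hostile input

        System.out.println(interpolate(fragments, values));
        Query q = parameterize(fragments, values);
        System.out.println(q.sql + " with params " + q.params);
    }
}
```

With policy A the hostile value changes the meaning of the statement; with policy B it stays a bound parameter. A real processor would presumably hand the sql text and params to java.sql.PreparedStatement (for example via setObject) rather than return them, but that wiring is left out here.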
| Kwpolska wrote: | I'm not a fan of the special `log.info."x: \{x}";` | syntax, it looks like a weird mix of field access and a | string literal. | | And even with the fancy new syntax, I'm sure there will | be people passing STR."SELECT * FROM x WHERE y = \{y}" | to their database. You can educate people all you want, | but not all developers will read the docs telling them | this is dangerous. Even if all the docs do the right | thing, people might end up reading old tutorials, and | will then notice the "inefficiency" of the prepared | statement syntax and will just do a STR."" format. Or | they might consider the database.executeQuery."SELECT | \{x}"; syntax "invalid" and try to fix it. | mabbo wrote: | String name = "Joan"; String info = STR."My name is | \{name}"; | | I don't love it, to be honest. `STR` is a bit much. Requiring the | \ for the first but not second brace. I want string templates in | Java, but this feels ugly. | archgoon wrote: | [dead] | v-erne wrote: | I'm guessing that the second backslash was deemed unnecessary (and | in my opinion it is - no need for additional key strokes in the | name of some arbitrary consistency). Besides you are probably | looking at this wrong - backslash here is the equivalent of a hash or | dollar sign - it denotes the beginning of an expression (just like most | languages do). And they literally could not use another one | because of backward compatibility if they want to keep the "no | STR"/simple version of interpolation possible. They will just | add a new escape sequence \{ - it's kinda brilliant if you think | about it. | pron wrote: | STR is not some special syntax, just a receiver for a method | call. You would only need STR to interpolate a template into a | String, but any receiver can specify that interpretation of the | template -- or a different one. So, for example, a logger could | use: log.info."x: \{x}"; | | and similarly for other uses that don't produce strings but | JSON, SQL etc. 
So STR would only be used -- hopefully rarely -- | for old APIs that have not determined their own template | interpretation strategy and only accept String. | | The use of the backslash is important to distinguish between a | string literal and a template literal because "x: \{x}" is not | a valid string literal today. Swift uses \(...), BTW. | Bjartr wrote: | > So STR would only be used -- hopefully rarely | | If that's the hope, why have it automatically imported | statically for all files? | pron wrote: | Because it doesn't hurt, and because it will take some time | until all relevant libraries expose their chosen template | processor, reducing the need for STR; the tax on less | restrained uses needs to be neither too high nor too low. | Alupis wrote: | It is indeed ugly. Having switched full-time to Kotlin a little | less than a year ago, I honestly prefer their approach to | String interpolation using `${var}`. | | I understand Java has a backwards compatibility issue that | Kotlin does not though, which would make breaking changes to | all the hard-coded strings containing `$` that would now need | to be escaped. This is discussed in the JEP. | | Backwards compatibility - simultaneously Java's strongest and | weakest asset. | lstamour wrote: | Yeah, the choice of \{ to create an expression instead of | printing a literal { character is an odd one when you compare | how you output a double-quote character: | | > To aid refactoring, double-quote characters can be used | inside embedded expressions without escaping them as \". | | I would expect for consistency that \"{expression}\" outputs an | expression and \"\{expression}\" would not. | kaba0 wrote: | Why would you have to escape '{'? | | I don't get your examples. The JEP meant that they can be | used _inside_, not in-between, so: \{ "a" + | "b" + 3 } | | could work. | Noe2097 wrote: | Why oh why is that a backslash!? | | Over the 9 languages mentioned: 5 languages (inc. 
JVM ones like | Groovy and Kotlin) are using `$`, 1 language (Swift) is using | `\`. | | `\` is a pain to type on many keyboard layouts -- actually most | but the US one. It seemed to me that `$` would have been a much | more "conventional" choice. | | This really makes me sad. It looks like the choice was made on | purpose to be different. | kothar wrote: | I assume because `\{` was not a valid escape sequence, which | means any use of this character pair can be identified as a | template without changing the semantics of existing string | literals. | aardvark179 wrote: | Bingo! | qw wrote: | I agree. On my keyboard I need to press [?] + [?] + 7 | | It's actually made worse because { and } also need special | combinations. | | To type \{X} I need to press ([?] + [?] + 7) | (SHIFT + [?] + 8) X (SHIFT + [?] + 9) | | I know this is something we need to live with due to historical | reasons, but I would prefer that new syntax is made simpler. | lenkite wrote: | What about parentheses () and square brackets []? (writing a | lisp DSL that might have users from non-US layouts) | bberrry wrote: | Agreed, the economics for this syntax is terrible. | spankalee wrote: | I don't understand this rationale either: | | > For the syntax of embedded expressions we considered using | ${...}, but that would require a tag on string templates | (either a prefix or a delimiter other than ") to avoid | conflicts with legacy code. | | Can't the template processor expression itself function as the | tag? Is STR."..." already legal now? | pdevr wrote: | Seriously, yes. Great thought process behind the design, marred | by the good-for-nothing (subjective opinion) backslash. | | If any of you design a language or a DSL, please, please - | avoid the backslash - for the reasons stated above. Hard to | type, introduces unseen problems, most will hate it. It | (backslash) is unbecoming, of anything elegant. 
| SyrupThinker wrote: | But if you do follow that advice consider if it is worth just | using a different escape character for _all_ interpolations. | That way you could still make use of this very sensible syntax of | reusing the escape for interpolation whilst avoiding the | backslash issues. | cs02rm0 wrote: | I'm not a fan either. Java seems to be declining in prevalence | in my corner of industry, I'm sure these changes are made by | wiser minds than mine, but I'm sceptical about whether such a | choice is really right for users. | jacobn wrote: | I'd also really have preferred $, but apparently there's some | backwards compatibility issue with using that character? | kaba0 wrote: | "${val}" is a legal string literal that is certainly used in | plenty of existing programs. | | In current Java you would have to escape \, so only | "\\{val}" could have existed before. | _old_dude_ wrote: | That's why JS uses ` and does not reuse ". | | You want the lexer to be able to make the difference | between a constant string and a template string and you | want users to be able to read the code. | | This JEP solves the former not the latter. | kaba0 wrote: | It's not as bad as people make it out to be. Swift | without any backwards compatibility constraints chose | "\(val)". It does need a slight getting used to but it | is not any worse than ${}. | Kwpolska wrote: | Swift's choice is strange and ugly IMO, even if it did | not involve backwards compatibility constraints. | supriyo-biswas wrote: | The proposal is uglier than f-strings in Python. A better | approach would be to use backticks like JavaScript does. | geodel wrote: | As long as it is technically solid, ugly is just fine in places | where Java is used most. Even if it were the most lovely syntax, | Python/JS devs are not gonna jump to code in Java. 
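The backwards-compatibility argument made earlier in this subthread (a dollar sign with braces is already a legal literal, while a backslash in front of a brace had to be written as an escaped backslash) can be checked with plain Java and no preview features. The class and method names below are made up for illustration.

```java
// Shows which literals already compile today; no preview features are used.
final class LegacyLiteralSketch {

    // A dollar sign followed by braces is ordinary text in a string literal,
    // so existing programs may already contain it.
    static String dollarBraces() {
        return "${val}";
    }

    // To put a backslash in front of '{' today, the backslash itself has to
    // be escaped, so the source text contains two backslashes.
    static String escapedBackslash() {
        return "\\{val}";
    }

    // A single backslash directly followed by '{' inside a literal is rejected
    // by compilers that predate string templates, so no existing program can
    // contain it. That is what leaves the sequence free for a new meaning.

    public static void main(String[] args) {
        System.out.println(dollarBraces());
        System.out.println(escapedBackslash());
    }
}
```

Both methods return six-character strings; neither performs any interpolation.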
| jerf wrote: | I am legit impressed, and I say this as one who has been _very_ | hard on string interpolation in other languages; see | https://www.jerf.org/iri/post/2942/ and the matching | https://pkg.go.dev/github.com/thejerf/strinterp#section-read... | for instance. I have criticisms most developers don't even think | about. | | This isn't exactly what I laid out, of course, but I think it | achieves the goals I was looking for, which is the real issue. In | particular, having the default string interpolation be prefixed | with "STR." is enough for linters and scanners to get a chew on | (it's easy by scanner standards to track that a STR.-interpolated | string got fed into a database query), and for code reviewers to | develop an instinct to look at such interpolations more closely | than they need to for a DB. interpolation. An STR annotation is | not technically necessary for the former, if it were simply the | default, but it is a big deal for the latter. I want people to | have a chance to notice and think about their use of bare string | interpolation for at least a fraction of a second as they type | "STR." (or autocomplete it or whatever). | | This does put a heavy burden on the libraries to implement it | correctly, but if they do it's even safer than what I was | thinking. | | One thing though: Please for the love of the internet, for those | of you writing interpolators, DO NOT write an interpolator that | picks apart the values passed in through a \{ ... } and starts | instantiating arbitrary classes via Java reflection. Just stay | away from that entirely, OK? | jsiepkes wrote: | I guess it's an advantage of playing catch-up. You can learn | from other languages' experience and mistakes. | | Still props for the Java team for doing a good job. 
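The advice earlier in this subthread, that an interpolator should treat embedded values as inert data, can be illustrated with a small HTML-flavoured sketch in plain Java. It is hypothetical and is not the JEP 430 API: the template is again modeled as a list of trusted fragments plus a list of untrusted values, and the escaping table is a minimal assumption, not a complete HTML sanitizer.

```java
import java.util.List;

// Hypothetical sketch, NOT the JEP 430 API: fragments come from source code
// and are trusted; values are untrusted and are only ever turned into text.
final class InertValuesSketch {

    /** Minimal escaping of the characters that matter in HTML text and attributes. */
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    static String render(List<String> fragments, List<?> values) {
        // Assumption of this sketch: n values are surrounded by n + 1 fragments.
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected values.size() + 1 fragments");
        }
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            // The value is converted to text and escaped. It is never parsed,
            // looked up by name, or used to load or instantiate a class.
            sb.append(escapeHtml(String.valueOf(values.get(i))));
            sb.append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("<p>Hello, ", "</p>");
        List<String> values = List.of("<script>alert(1)</script>");
        System.out.println(render(fragments, values));
    }
}
```

The design choice is that the only operations ever applied to a value are String.valueOf and escaping, so hostile input cannot steer the processor into doing anything beyond producing text.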
| pron wrote: | Back in 1997, when James Gosling outlined his vision for | Java, he said it should be a conservative language (wrapping | a very innovative runtime) that would ideally only adopt | features that have proven themselves, for some time, in other | languages. Being a last mover is at the very core of Java's | evolution strategy. It's not playing catch-up because we're | not trying to adopt all features other, more "adventurous", | languages have, but rather to selectively pick the _fewest_ | features that would have the biggest impact. That's the | aspiration, at least. | oblio wrote: | Yeah, you're mostly right, but then... generics. | avgcorrection wrote: | Thanks for choosing only tried and proven tech like | pervasive nullable pointers. | The_Colonel wrote: | Right, and that's why Java introduced checked exceptions in | 1.0. A feature which wasn't proven back then and remains | unproven now. | | Another example being overengineered streams. Parallel | streams look really cool as a demo, but have been proven | dangerous in their default configuration (they can exhaust | the default thread pool). | [deleted] | kaba0 wrote: | Checked exceptions are exact analogs to Result types, but | better (auto-unwrap, bubble up, stack traces). It is | unfortunately far from flawless (inheritance is not a | great combo with it), but that has been part of the | language since the beginning. | | Streams are not a language feature, they are only a library. | And I honestly don't find them over-engineered, they | really make plenty of logic very readable. | lenkite wrote: | Pity that the Streams functions don't support checked | exceptions. | taftster wrote: | And honestly, in my opinion, that approach is just really | proving itself right now. I am so happy to see Java | seemingly on the right path again, adopting solid | capabilities, innovating, but letting other languages take | some punches first. 
We have had some dark days (maybe a | dark decade), but full steam ahead now. Thank you pron and | team! | rileyphone wrote: | Do you have a link for the 1997 document? The history is | surprisingly difficult to explore for being pretty recent. | pron wrote: | Here's a public 1997 document where much of that is said: | | https://www.win.tue.nl/~evink/education/avp/pdf/feel-of-java... | aardvark179 wrote: | I think I've discussed this on and off with people for almost | six years at this point. I'm so glad to see it's finally | happening. | | Now we just need to combine this, the constants work that has | been done over the last few years, and regular expressions, to | make the regexp API so much nicer. | bokchoi wrote: | > One thing though: Please for the love of the internet, for | those of you writing interpolators, DO NOT write an | interpolator that picks apart the values passed in through a | \{ ... } and starts instantiating arbitrary classes via Java | reflection. Just stay away from that entirely, OK? | | Indeed. Please no more log4shell hell | lanna wrote: | Are they trying to make Scala look more complicated than the | other languages on purpose? The Scala example f"$x%d plus $y%d | equals ${x + y}%d" could be written simply as s"$x plus $y equals | ${x + y}" | geodel wrote: | Yes, Java is legit scared of Scala's rising popularity. | oblio wrote: | Is Scala rising? It used to be hyped a ton back in the | earlier Twitter days, about 10 years ago, but since then, | everything has been quiet. | | If anything, Kotlin is probably ahead of Scala now, adoption | wise. | isbvhodnvemrwvn wrote: | Sarcasm carries poorly over text :) | AtlasBarfed wrote: | If anything Java is becoming Groovy. Almost all features | adopted are Groovy features. | avgcorrection wrote: | Wait. Did we just get teleported back to 2018? | ndriscoll wrote: | It's funny too because the Scala ecosystem has had safe string | interpolation for years. e.g. 
with Slick sql"SELECT * FROM | Person p WHERE p.last_name = $lastName" will produce a | parameterized query/prepared statement that you can run, and it | does this at compile time so runtime injection like log4shell | is impossible. | clhodapp wrote: | It's interesting that this feature is essentially a clone of how | string interpolation has been made extensible in Scala. However, | the JEP only mentions Scala once.... in the form of an example | which has been made strangely more complex than it has to be (it | would be much shorter if they had used the s interpolator instead | of the f interpolator). | | The one major difference is that, because Scala has macros and | typeclasses, it can turn malformed interpolations for things like | JSON or SQL into compile-time errors instead of runtime errors. | ninkendo wrote: | One thing I'm not sure is possible in this proposal, is there a | way to make the interpolation "lazy", such that the string (and | the evaluation of its interpolated components) can be skipped if | the string isn't ultimately used? | | In swift, there's some nice quasi-laziness you can add to | function parameters, so that (say) a logging function that can | fully skip evaluating a string sent to it with the `@autoclosure` | syntax: func log(message: @autoclosure () -> | String, level: Level = .info) { guard level >= | configuredLevel else { return } | actuallyLog(message()) } | | And call it like: log(message: "User is | \(someExpensiveFunction(user))", level: .debug) | | And if the configured log level does not include debug messages, | `someExpensiveFunction(user)` doesn't get called. | | This works because @autoclosure lets you take a parameter that is | "function returning String", but callers can just pretend they're | passing it a String, without having to decorate it in a function. | The compiled code will turn it into a closure behind the scenes, | and thus it'll be evaluated lazily. 
| | Not sure if there's any way to do something like this in Java | with this proposal... | lazulicurio wrote: | It's a tad more verbose, but I assume the simplest way would | just be to use a standard closure. void | log(Supplier<? extends String> messageSupplier); | log(() -> log.info"User is \{someExpensiveFunction(user)}"); | | The pattern of using Supplier to provide a lazy argument is | pretty well-established afaict. | ivan_gammel wrote: | I love it. The design choices are great, given the number of | constraints. Looking forward to IDE and Maven plugin support of | this feature, which can add a lot of juice in pre-compile time | enabling very well integrated DSLs. ___________________________________________________________________ (page generated 2023-03-03 23:00 UTC)