[HN Gopher] Native Reflection in Rust
___________________________________________________________________

Native Reflection in Rust

Author : jswrenn
Score  : 199 points
Date   : 2022-12-15 15:54 UTC (7 hours ago)

(HTM) web link (jack.wrenn.fyi)
(TXT) w3m dump (jack.wrenn.fyi)

| unconed wrote:
| My version of Greenspun's Tenth [1] is that any sufficiently
| complex static language contains an ad hoc, informally specified,
| bug-ridden and slow version of a dynamic "any" type.
|
| Thx OP for providing an example.
|
| [1] https://en.wikipedia.org/wiki/Greenspun's_tenth_rule

| kibwen wrote:
| Rust has a dynamic any type, `std::any::Any`.

| 8jy89hui wrote:
| This is a beautiful (hacky) demo of something that I didn't think
| was possible in Rust (yet). I hope other applications don't
| accidentally start using it just to discover that it doesn't work
| in release mode.
|
| Very impressive work!

| jswrenn wrote:
| Oh, I should add a note about that. Fortunately, it's quite
| easy to tell Rust to generate debuginfo even in release mode.

| bouk wrote:
| It would be really cool if it was possible to natively inspect
| the state of a Rust generator in a type-safe way

| Animats wrote:
| _"When you call .reflect on a dyn Reflect value, deflect figures
| out its concrete type in four steps:"_
|
| * _invokes local_type_id to get the memory address of your
| value's static implementation of local_type_id_
|
| * _maps that memory address to an offset in your application's
| binary_
|
| * _searches your application's debug info for the entry
| describing the function at that offset_
|
| * _parses that debugging information entry (DIE) to determine the
| type of local_type_id's &self parameter_.
|
| This is a rather strange thing to bolt onto a language. I could
| see this as an external tool. The use case seems to be programs
| which used "async" so much they can't figure out the resulting
| state machine.
| External debug tools to view and examine the async
| state machine might be helpful.
|
| My experience with Rust has been that debugging of safe code is
| just not a problem. Print statements and logging are enough.

| pcwalton wrote:
| > This is a rather strange thing to bolt onto a language. I
| could see this as an external tool.
|
| It _is_ an external tool. This is a crate, not a part of the
| compiler.

| loeg wrote:
| > This is a rather strange thing to bolt onto a language.
|
| It can just be an extremely fun and cute demo, without
| practical application.

| jerf wrote:
| It can also be something that looks cool and doesn't
| necessarily ever get past "kinda works", but piques the
| interest of the core dev team and they take steps to make it
| work even better, resulting in the ultimate "deprecation" of
| this sort of thing by virtue of it being even better
| integrated into the core.
|
| I don't have the context to judge the probability of that in
| this specific case (lots of technical nitty-gritty comes in
| to this sort of thing), but I've certainly seen similar
| things happen in other communities.

| More-nitors wrote:
| how about adding this to debuggers for better object-views?
| (could it be possible to provide near-js/python/java level of
| obj view?)

| gpderetta wrote:
| This is already using DWARF debug info. Using this for
| debugging would be a long way around to arrive where you
| started.
|
| You can already script gdb to provide rich views of any
| data structure.

| olvy0 wrote:
| I've used a very similar method, at work, to provide C++
| "reflection" between my own system and a system from another
| team.
|
| Basically, the other system is a dynamic library which sends and
| receives C structures from my application. Those structures are
| then mapped into a buffer that is supposed to have the same size
| and there are pointers with metadata pointing into the buffer
| that are supposed to be exactly like the struct elements.
| Those structures can have arbitrary complexity, and are passed
| around through type erasure (essentially char*).
|
| I wrote "reflection" code for the other team, which runs when
| they register the struct instance to be sent, checks if there's a
| matching PDB [0] around, reads it, and outputs a json including
| the metadata needed, which can then be used to define the
| structures' metadata on our side correctly.
|
| This is all in C/C++ since in some contexts we have soft
| real-time requirements, else I would have used any of the many RPC
| frameworks available.
|
| This has been working for several years now.
|
| This is not a generic solution but it's good enough for in-house
| communication between 2 systems that are maintained by different
| parts of the organization, where the API between them, that like
| I said is based on passing around char* buffers, has been more or
| less set in stone a long time ago. Conway's law [1] and all that.
| Sigh.
|
| [0] We are a Windows shop although the same thing should work
| with DWARF info, same as the OP library works. In fact he says
| "It may never work on Windows, which does not use DWARF to encode
| debug info" but I can say that the same approach does work on
| Windows, for C++ at least. The PDB format might be a tad
| undocumented, but its documentation has been improved in the last
| decade or so since I started working on my library. Writing some
| small test programs is enough to understand how to access it, if
| all you need is meta info on C-style structures. Other stuff is
| more... challenging. But it wasn't necessary for my use-case.
|
| [1] https://en.wikipedia.org/wiki/Conway%27s_law

| kp995 wrote:
| Can't we rely more on Rust's Pattern Matching and its strong
| type system?
|
| Reflection seems more helpful when the programming language is
| a little unsound.

| jswrenn wrote:
| Absolutely! That's the approach that frunk [0] takes.
| Frunk (and other reflection libraries like it) are suitable for
| most use cases, and make better use of Rust's affordances.
|
| My crate is suitable for cases where you cannot know (or
| control) the set of types you might need to reflect on in
| advance. Its primary use cases are related to debugging.
|
| [0]: https://docs.rs/frunk

| halfmatthalfcat wrote:
| Is Frunk Rust's Shapeless (from Scala)?

| jswrenn wrote:
| Yep!

| Thaxll wrote:
| Today I learned that Rust does not have reflection.

| estebank wrote:
| Reflection is usually not available in AoT compiled languages.
| The prevalent Rust coding styles rely heavily on monomorphic
| data types and functions, meaning there's nothing _left_ to
| reflect at runtime. But if you want to deal with trait objects
| and need to access the underlying type, you need to use
| Any::downcast or rely on annotations on every type you want to
| reflect on. Or now, leverage DWARF info on Linux with deflect.

| omginternets wrote:
| What are monomorphic data types? What should be my first read
| on the subject?

| estebank wrote:
| It's a fancy way of saying "every time this type is used,
| replace all the generic type params with what was used and
| generate code for it". It's how generics are implemented in
| Rust. If you have
|
|     struct Foo<T>(T);
|
| And you create Foo(42i32) and Foo(0.0f64), the compiler
| will create the equivalent to
|
|     struct Fooi32(i32);
|     struct Foof64(f64);
|
| In other languages like Java, generics are implemented the
| way that Rust does "trait objects" (&dyn Trait).
|
| Rust is not the only language that does this, to be clear.
|
| If you're interested in a quick intro on the _compiler_
| side of this, you can read
| https://rustc-dev-guide.rust-lang.org/backend/monomorph.html

| shpongled wrote:
| Nice examples - you can also have languages (like SML)
| where monomorphization is simply an implementation
| detail. Some compilers (e.g., MLton) perform
| monomorphization and others don't.
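
To make the monomorphization discussion above concrete, here is a minimal sketch contrasting static dispatch (one generated copy per concrete type) with dynamic dispatch through a trait object. `Foo` is taken from the comment above; the helper names `show_static` and `show_dyn` are hypothetical and exist only for illustration.

```rust
use std::fmt::Display;
use std::mem::size_of;

// Generic wrapper, as in the comment above.
struct Foo<T>(T);

// Static dispatch: the compiler emits a separate copy of this
// function for every concrete T it is called with.
fn show_static<T: Display>(foo: &Foo<T>) -> String {
    format!("Foo({})", foo.0)
}

// Dynamic dispatch: a single compiled copy; the Display call is
// resolved through the trait object's vtable at run time.
// (Hypothetical helper, for illustration only.)
fn show_dyn(value: &dyn Display) -> String {
    format!("dyn({})", value)
}

fn main() {
    let a = Foo(42i32);
    let b = Foo(0.5f64);

    // Two instantiations: show_static::<i32> and show_static::<f64>.
    println!("{}", show_static(&a));
    println!("{}", show_static(&b));

    // One function, two different vtables behind the references.
    println!("{}", show_dyn(&a.0));
    println!("{}", show_dyn(&b.0));

    // Foo<i32> and Foo<f64> are distinct types with their own layouts.
    println!("{} {}", size_of::<Foo<i32>>(), size_of::<Foo<f64>>());

    // A trait-object reference carries a data pointer plus a vtable pointer.
    println!("{}", size_of::<&dyn Display>() == 2 * size_of::<usize>());
}
```

Once the concrete type sits behind `&dyn Display`, nothing in the value itself says what it was, which is the gap that `Any` downcasts or a debug-info approach like deflect has to fill.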
| yakubin wrote:
| That depends on what you mean. SML has "polymorphism"
| boiling down to being able to plug an arbitrary type in
| some places, which is denoted like _'a_. But when people
| talk about generics, they more often talk about C++
| templates, Java generics, Rust traits, etc. whose SML
| equivalents are signatures, structs and functors.
| Signatures are a bit like Rust traits, structs are a bit
| like Rust implementations of traits, whereas functors are
| like Rust's "templates", i.e. wherever you swap angle
| brackets to parametrise something with types constrained
| by traits, or values constrained by types. Except in Rust
| this parametrisation can be slapped on a bunch of things.
| It can be on structs, on functions, on traits, on
| implementations of traits etc. In SML you need to group
| all the "parametrised" things into a struct (and a
| corresponding signature), which is going to be returned
| by a functor.
|
| And now the thing is: with transparent signature
| ascriptions, functors are monomorphised in SML, instead
| of everything being hidden behind signatures (as is in
| the case of Rust with traits when you use _dyn_), which
| has semantic consequences. E.g. a struct returned by a
| functor may contain a type. You can't perform proper
| type-checking without monomorphising, because you don't
| know what the exact type is. E.g.
| in the following program, the final line couldn't be
| type-checked without monomorphisation:
|
|     signature ITERABLE =
|     sig
|       type ElemT
|       type SrcT
|       val new_iter: SrcT -> unit -> ElemT option
|     end
|
|     signature LIST_ELEM_TYPE =
|     sig
|       type T
|     end
|
|     functor ListIterFun (ListElemType: LIST_ELEM_TYPE): ITERABLE =
|     struct
|       type ElemT = ListElemType.T
|       type SrcT = ElemT list
|       fun new_iter l =
|         let
|           val lr = ref l
|         in
|           fn () =>
|             case !lr of
|               nil => NONE
|             | (x::xs) => (lr := xs; SOME x)
|         end
|     end
|
|     structure IntElemType: LIST_ELEM_TYPE =
|     struct
|       type T = int
|     end
|
|     structure IntListIter = ListIterFun(IntElemType)
|
|     val next = IntListIter.new_iter [1, 2, 3, 4, 5]
|
| If I change the signature ascription on ListIterFun to an
| opaque ascription (_:> ITERABLE_), the final line won't
| type-check, because it's not obvious from the signature,
| that ElemT is int. So transparent signature ascriptions
| require monomorphisation (Rust traits without _dyn_), and
| opaque signature ascriptions free the compiler from
| having to do monomorphisation (Rust traits with _dyn_).
|
| There was a lot of discussion of this issue when Go was
| settling on a design for its generics, under the phrase
| "reified generics".

| codeflo wrote:
| I only recently realized that certain type system
| features, like polymorphic recursion, make
| monomorphization impossible in the general case. In
| Haskell for example, it's by necessity only an
| optimization that's used where applicable, and not the
| general strategy.

| gloryjulio wrote:
| I think cpp does this too

| estebank wrote:
| It indeed does. The only difference is that Rust has
| traits (similar to C++'s concepts) which require explicit
| mention of what interface the type parameters have inside
| the function, whereas C++'s templates will have a compile
| error _after_ instantiation if you passed something that
| didn't meet the expected contract. This is closer to
| Rust's macros in operation.
|
| Given
|
|     fn foo<T>(a: T, b: T) -> T { a + b }
|
| The compiler will complain that you should have been
| explicit on how T is going to be used:
|
|     error[E0369]: cannot add `T` to `T`
|      --> src/lib.rs:1:32
|       |
|     1 | fn foo<T>(a: T, b: T) -> T { a + b }
|       |                              - ^ - T
|       |                              |
|       |                              T
|       |
|     help: consider restricting type parameter `T`
|       |
|     1 | fn foo<T: Add<Output = T>>(a: T, b: T) -> T { a + b }
|       |         +++++++++++++++++
|
| whereas in C++ this would have been accepted _until_ you
| called foo with two things that couldn't be added
| together, like a Rust macro[1].
|
| [1]: https://play.rust-lang.org/?version=nightly&mode=debug&editi...

| codeflo wrote:
| To add to this, even the Foo-wrapper is gone, just the
| i32 remains. Rust values are amorphous data blobs at
| runtime.

| CryZe wrote:
| ABI wise that is not true though. structs have struct
| ABI, even just a newtype struct around an integer will
| not use integer ABI unless annotated with
| #[repr(transparent)].

| estebank wrote:
| Yes, that's true but that is an implementation detail
| that only comes into play when dealing with ABI, and
| _then_ you should be using #[repr(transparent)] to ensure
| that the compiler won't do something else :)

| codeflo wrote:
| Sure, it's good to point out the difference between "the
| behavior of a typical optimizing compiler" and "things
| actually guaranteed by the language". The context of the
| discussion was the former, I think. I'm not even that
| certain that monomorphization is actually required in
| theory.
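
As a rough illustration of two points from this part of the thread, the sketch below shows the generic `foo` with the `Add<Output = T>` bound the compiler suggests, plus a newtype marked `#[repr(transparent)]`. The `Meters` type is hypothetical and not from the discussion; it only stands in for "a newtype struct around an integer".

```rust
use std::mem::{align_of, size_of};
use std::ops::Add;

// The bound states up front how T is used, so the body is
// type-checked once, before any instantiation happens.
fn foo<T: Add<Output = T>>(a: T, b: T) -> T {
    a + b
}

// Hypothetical newtype around an integer. repr(transparent)
// guarantees the same layout and ABI as the wrapped i32.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(i32);

impl Add for Meters {
    type Output = Meters;

    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

fn main() {
    // Each call below gets its own monomorphized copy of foo.
    println!("{}", foo(1, 2));
    println!("{}", foo(1.5, 2.25));
    println!("{:?}", foo(Meters(3), Meters(4)));

    // The wrapper adds no size or alignment over the inner i32.
    println!("{}", size_of::<Meters>() == size_of::<i32>());
    println!("{}", align_of::<Meters>() == align_of::<i32>());
}
```

Calling `foo` with a type that does not implement `Add` is rejected at the call site, with an error about the unsatisfied bound rather than about the function body.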
| estebank wrote:
| Yes, monomorphization isn't _needed_ in theory, as long
| as the user-visible behavior remains the same, and in
| practice the team is exploring options[1] to identify
| cases where the currently manual practice of writing
|
|     pub fn foo<T: AsRef<X>>(x: T) {
|         inner_foo(x.as_ref());
|     }
|     fn inner_foo(_: &X) { todo!() }
|
| can be instead done by the compiler automatically
| (turning monomorphized code back into polymorphic code,
| hence the polymorphization name).
|
| [1]: https://rustc-dev-guide.rust-lang.org/backend/monomorph.html...

| estebank wrote:
| Expanding on trait objects: these are implemented as
| "V-Tables", structs holding pointers to the trait's
| methods and to the underlying type. This means that if
| you _need_ to know what the underlying type is, you have to
| do something fancy, usually referred to as "reflection".
| Also, invocation of generic functions that use V-Tables
| requires "chasing pointers", which makes cache locality
| worse (because data might not be in the same cache read
| as the v-table itself), but makes the generated binary
| smaller (because if you have something like Foo<T> used
| with 1000 types, with monomorphization you end up with
| 2000 generated types in the binary, instead of 1001 with
| trait objects).

| Joker_vD wrote:
| Pretty sure that some usage patterns of polymorphic types
| can not be completely monomorphized.
| Here's an example in Golang:
|
|     package main
|
|     import (
|         "fmt"
|     )
|
|     type wrapper[T any] struct {
|         Value T
|     }
|
|     func (w wrapper[T]) String() string {
|         return fmt.Sprintf("{%v}", w.Value)
|     }
|
|     func stringWrapped[T any](n int, v T) string {
|         if n == 0 {
|             return fmt.Sprintf("%v", v)
|         }
|         return stringWrapped(n-1, wrapper[T]{Value: v})
|     }
|
|     func main() {
|         n := 0
|         fmt.Scanf("%d", &n)
|         result := stringWrapped(n, "test")
|         fmt.Println(result)
|     }
|
| Go refuses to compile because it can't possibly generate
| all instances of wrapper[T] that this program may use:
| wrapper[string], wrapper[wrapper[string]],
| wrapper[wrapper[wrapper[string]]], etc.

| estebank wrote:
| Rust will complain about a recursion limit being reached
| during instantiation[1]. The solution in Rust is to use
| &dyn Trait or Box<dyn Trait> instead.[2]
|
| [1]: https://play.rust-lang.org/?version=stable&mode=debug&editio...
|
| [2]: https://play.rust-lang.org/?version=stable&mode=debug&editio...
|
| ^ This blows the stack because it keeps calling itself
| with no break condition, but shows how the type system
| accepted the code.

| gpderetta wrote:
| I think this is called polymorphic recursion in Haskell
| circles.
|
| In C++ you can monomorphize as long as you can somehow
| prove the recursion terminates at compile time (for
| example by threading a static recursion counter).

| dgb23 wrote:
| Not exactly the same thing but JITs can turn dynamic
| objects into structs if the structure is consistent. JS
| runtimes and Julia do this as far as I know.

| adgjlsfhk1 wrote:
| Julia doesn't do this. It just has structs in the first
| place.

| mmis1000 wrote:
| Firefox's JS runtime also does tricks like generating
| multiple copies of an optimized function when the function
| has multiple call sites, instead of making one with lots of
| if/else. So it no longer suffers from the problem that a
| function that frequently gets different types of parameters
| from different call sites has poor performance.
|
| It's probably exactly how templates work, except the
| details are invisible to users.
|
| https://hacks.mozilla.org/2020/11/warp-improved-js-performan...

| estebank wrote:
| Yes! Java as well. And this is how those languages can
| show impressive benchmarks for consistent workloads. In
| theory they can even surpass AoT languages. In practice
| it depends on the specifics.

| [deleted]

| planede wrote:
| That's runtime reflection.
|
| Compile time reflection AFAIK is available in D and Zig, and
| is planned for C++.

| elcritch wrote:
| That's right. Nim does as well. It's amazing. Once you get
| used to having CTTI and being able to use it, it's hard to
| program without it. Bonus points if you can do basic
| dependent types too.
|
| With SFINAE you can effectively do CTTI-style
| programming in C++. C++ has long had runtime type
| reflection as well (RTTI), though it needs to be compiled
| in. Looks like there's a boost library for CTTI.

| Conscat wrote:
| The C++ reflection improves a lot in C++20, but it's
| still very limited compared to that aspect of Nim, or
| even Zig. The std::meta::info and "splices" based on
| Haskell for C++26 are incredibly exciting to me. I have
| many use cases in mind. Splices in combination with
| std::embed will make C++ basically just a bad Racket (but
| one with inline assembly!).

| yakubin wrote:
| Yup. I consider runtime reflection an antifeature, which
| has negative performance effects, is unsafe (see e.g.
| log4j) and leads to fragile code.
|
| I would however welcome static reflection with open arms.
| In Rust in particular, I'd prefer it if derive was
| implemented using static reflection, rather than proc
| macros.

| nestorD wrote:
| The usual argument is that between having macros and focusing on
| a strong type system, there are very few legitimate use cases for
| reflection left in Rust.

| snordgren wrote:
| Rust has very little influence from reflection-heavy languages
| like Java and C#.
| On their list of influences
| (https://doc.rust-lang.org/reference/influences.html), Java is
| not even mentioned, and C# is only mentioned for its
| attributes. There is very little overlap between the design
| philosophies that influenced Rust and Java/C#.
|
| Rust does not support inheritance either. But I have never
| missed either feature in a Rust program.

| Tuna-Fish wrote:
| Reflection is typically provided by a runtime, and languages
| that don't have runtimes usually don't have it. You shouldn't
| expect a low-level systems language to have reflection. There
| is no zero-cost way of implementing it.

| spacechild1 wrote:
| This is of course only true for runtime reflection. And which
| language does not have a runtime?

| Joker_vD wrote:
| Except Rust has a runtime: [0]. And so, usually, does C (in
| hosted implementations).
|
| [0] https://doc.rust-lang.org/reference/runtime.html

| pornel wrote:
| These are a couple of functions executables can call at run
| time, but they're more like an extra standard library. It's
| not a runtime in the same sense as a runtime in dynamic or
| GC languages that manages all objects and is able to know
| types of arbitrary objects and inspect/trace them.
|
| Rust has no run-time type information except limited
| downcasts via `dyn Any` or explicitly derived traits on
| a per-type basis, and these features compile to type-specific
| monomorphic code rather than calling some run-time
| reflection.

| throwaway894345 wrote:
| Pretty sure you don't need a runtime to track runtime
| type info. What we think of as a "runtime" in GC
| languages is usually several distinct things (a
| scheduler, a GC, and maybe some other stuff in the case
| of Java/.Net).

| [deleted]

| armchairhacker wrote:
| Does this still work if the application is compiled in release
| mode or with optimizations?
|
| Even if not, this is still very useful for debugging.

| jswrenn wrote:
| It only works if DWARF is generated.
| By default, the `release` profile of Cargo sets
| `debug = false` [0]. But, it's quite easy
| to override this setting, and have a build that is both
| optimized and includes debuginfo.
|
| [0]: https://doc.rust-lang.org/cargo/reference/profiles.html#rele...

| jeroenhd wrote:
| Does using DWARF info imply that this will break when you strip
| the resulting executable? I often strip my Rust binaries because
| it practically halves the application size, which can become
| quite a lot in a language where you're statically linking
| everything.
|
| Regardless, quite an ingenious use of standard ELF features, I
| didn't think this would be possible in Rust without adding some
| kind of VM around reflection code.

| jswrenn wrote:
| Yes, unfortunately that's a tradeoff here. Rust does support
| splitting debug info into other files, but Deflect doesn't
| support loading split debuginfo _yet_.

| HideousKojima wrote:
| C# has similar issues where they have to be conservative about
| what they trim from binaries for AoT in case it is used for
| reflection, so I imagine you'd run into the same issues for
| almost any compiled language you want to implement reflection
| for.

| davidhyde wrote:
| Great writeup! The defmt logging crate uses a linker script to
| extract debug symbols so that you get nicely formatted stack
| traces on embedded systems. It works on Linux, macOS and Windows.
| I wonder if the same technique can be applied to this project. It
| needs a runner though so may not be the right approach.
|
| https://github.com/knurling-rs/defmt
___________________________________________________________________
(page generated 2022-12-15 23:00 UTC)