[HN Gopher] Eliminating Data Races in Firefox
       ___________________________________________________________________
        
       Eliminating Data Races in Firefox
        
       Author : caution
       Score  : 319 points
       Date   : 2021-04-06 15:28 UTC (7 hours ago)
        
 (HTM) web link (hacks.mozilla.org)
 (TXT) w3m dump (hacks.mozilla.org)
        
       | wyldfire wrote:
       | > What is ThreadSanitizer? ThreadSanitizer (TSan) is compile-time
       | instrumentation to detect data races according to the C/C++
       | memory model on Linux. It is important to note that these data
       | races are considered undefined behavior within the C/C++
       | specification.
       | 
       | TSan, ASan, UBSan - if you are writing code and your toolchain
       | supports these, you should use them. Over the past 6-7+ years
       | I've used them, I have never seen a false positive (I'm sure some
       | have occasionally existed but they seem rare in practice).
       | 
       | > However, developers claimed on various occasions that a
       | particular report must be a false positive. In all of these
       | cases, it turned out that TSan was indeed right and the problem
       | was just very subtle and hard to understand.
       | 
       | Yes, I have seen this phenomenon!
        
         | inetknght wrote:
         | Address Sanitizer -- never seen a false positive. Ever.
         | 
         | UBSan -- again never had a false positive.
         | 
         | Thread Sanitizer _does_ have false positives in some
         | circumstances. They 're hard to verify to be falsy and in many
         | ways it's better to refactor to eliminate the complaint because
         | you'll end up with better cleaner code. For example, Qt uses
         | those _atomic fences_ that the article describes [0] [1].
         | 
         | [0]: _TSan does not produce false positive data race reports
         | when properly deployed, which includes instrumenting all code
         | that is loaded into the process and avoiding primitives that
         | TSan doesn't understand (such as atomic fences)._
         | 
         | [1]: https://lists.qt-
         | project.org/pipermail/interest/2014-March/0...
        
           | derf_ wrote:
           | I have seen false positives in ASAN.
           | 
           | In Firefox, media threads use a reduced stack size (128 kB),
           | due to, historically, 32-bit address-space concerns with lots
           | of media elements on a page. Opus, in its default
           | configuration, requires a fair amount of working stack memory
           | (a few dozen kB in some cases). "A few dozen" is much smaller
           | than 128 kB, but apparently enabling ASAN increases the stack
           | usage, quite a bit. So we got crashes under ASAN when it
           | overflowed the stack that would not have been possible
           | without ASAN enabled.
           | 
           | https://bugzilla.mozilla.org/show_bug.cgi?id=750231
        
             | inetknght wrote:
             | Address Sanitizer inserts sentinels on the stack to watch
             | for certain kinds of dangerous stack behavior. I would be
             | willing to bet that the stack overflow _could_ occur
             | without ASAN and it 's not a false positive.
        
               | derf_ wrote:
               | You are right to be concerned. The primary culprit is the
               | use of C99 variable-length arrays (or alloca when those
               | are not available). I can see why ASAN would want to pad
               | those.
               | 
               | Fortunately, the allocation sizes of these arrays are
               | tightly constrained by the limits of the Opus format
               | (this is not an accident: I helped design the format). I
               | performed a careful analysis (by hand) to ensure the
               | maximum possible stack usage (without ASAN) was not
               | anywhere close to the size allocated for the stack. It is
               | of course possible that I made mistakes, but I tried very
               | hard not to.
               | 
               | The broader point is that this is an entire class of
               | memory access error that ASAN _cannot_ properly validate
               | due to its design. Because it changes the system under
               | test in ways that differ from a real deployment, I cannot
               | use it to mechanically validate my hand analysis.
               | 
               | It is worth noting that valgrind does not have this
               | problem (because it instruments the original binary,
               | instead of making compile-time changes), and is also used
               | by Firefox in testing.
        
               | inetknght wrote:
               | Address Sanitizer isn't a tool that solves every problem.
               | But it solves well the problems it's intended.
               | 
               | I'm assuming you're still talking about Firefox running
               | Opus [0]? I'm not familiar with it. But I do think that
               | any platform that can run Firefox and wants to listen to
               | audio should have _a lot_ more than 128k lying around.
               | Otherwise your user isn 't using the right tool for the
               | job.
               | 
               | I understand memory constraints can be tight and it feels
               | like there's 128k of space _right there_ that you can
               | use. And to achieve that you 've written it in C and
               | manually calculated the stack space needed. It feels
               | _awesome_ to have that kind of knowledge and experience
               | with opportunity to leverage it! It 's great that you've
               | helped design a media format and implement something to
               | process it!
               | 
               | But Firefox wastes tons of _megabytes_ just to load,
               | store, and transmit telemetry, preferences, search
               | providers... the list can go on. Do you really want to
               | try to convince me that a 128k slab for audio that the
               | user specifically loaded is too expensive in the face of
               | all of the telemetry loaded by Firefox these days?
               | 
               | Why add all the maintenance headache when you can instead
               | allocate a single slab for your own use? If I were to
               | write it and manually optimize then I'd use an allocated
               | slab. And, as a bonus, the slab's ownership can move to
               | any other thread if needed. And it would make it clean in
               | Address Sanitizer. And, even more important, I wouldn't
               | have to worry about what happens if the frames before my
               | entrypoint changes. For example, Firefox's software
               | design could change and adds a couple of wrappers around
               | the manually-sized stack. What happens if I want to
               | compile Firefox on a different platform which doesn't
               | permit a manually-sized thread stack?
               | 
               | So it sounds like you've written something that's going
               | to be very difficult to maintain if you were to vanish.
               | And you've written it for a software product where that
               | level of optimization is, perhaps, better done
               | automatically by tools available to you.
               | 
               | If you build a standalone media player specifically for
               | Opus then that'd be pretty cool. But Firefox shouldn't
               | have pieces that are difficult to maintain.
               | 
               | [0]: https://hacks.mozilla.org/2018/10/introducing-
               | opus-1-3/
        
           | nyanpasu64 wrote:
           | I've gotten false positives in ASan when trying to instrument
           | a Win32 application code compiled in MSVC using MSVC's then-
           | experimental ASan support, using the wrong combination of the
           | maze of ASan libraries (static library combined with dynamic
           | MFC). And I also get startup crashes when running MSVC ASan
           | binaries in Wine. But it's obscure beta use-cases and some
           | may be fixed by now.
        
         | perryizgr8 wrote:
         | > Yes, I have seen this phenomenon!
         | 
         | I've had this happen to me personally. Looking again and again
         | at the tsan error and the code it pointed to, I was absolutely
         | sure it was a false positive. Thankfully I got a friend to look
         | at it and after a good night's sleep he was able to explain to
         | me how exactly the code was wrong.
         | 
         | Tl;dr: It is extremely hard to write correct threaded code in
         | C/C++, and it is quite hard to recognize errors even when they
         | are pointed out to you.
        
         | CodeBrad wrote:
         | In my opinion, the main limitation of TSan is that it cannot
         | find all possible races. (An impossible task so I think TSan is
         | a pretty great tool to get to the point that this is the main
         | limitation)
         | 
         | As a dynamic tool, if a race is observed during execution by
         | TSan, the race is very likely to be real. But it is impossible
         | to "observe" execution for every part of a large code base, and
         | so real races are likely still hiding in some edge case that
         | TSan did not execute.
         | 
         | Static Analysis tools are interesting in that they have the
         | opposite problem. Static tools can detect races across the
         | entire code base, but without runtime information it is
         | difficult to know if the races are real. This leads to more
         | real races being detected, but also comes with more false
         | positives.
         | 
         | There is a lot of interesting working being done on data race
         | detection and how we can bring these two sides closer together
         | to find more real races, without overwhelming developers with
         | false positives.
        
           | inetknght wrote:
           | > _In my opinion, the main limitation of TSan is that it
           | cannot find all possible races._
           | 
           | In my experience TSan finds all possible races if it's built
           | with unit tests and unit tests provide enough coverage. If
           | unit tests don't provide enough coverage then there are
           | likely other bugs lurking beyond just data races.
        
             | CodeBrad wrote:
             | I agree, good code coverage can probably get most of the
             | way there. But there are some problems with relying on
             | coverage.
             | 
             | Coverage can be misleading. Maybe there is some function
             | being executed in parallel:                 int global;
             | void foo() {         int local;                  if
             | (complex_condition)           bar(&local);         else
             | bar(&global);       }
             | 
             | And assume maybe somewhere way down the call chain the data
             | passed to bar is modified. The bar function and all of the
             | functions called in bar can have 100% test coverage, but if
             | the else branch is not covered, the race will be missed.
             | 
             | So without true 100% test coverage, it is possible races
             | are being missed by dynamic tools, and true 100% test
             | coverage is probably not possible or feasible in large
             | complex projects.
        
               | inetknght wrote:
               | > _but if the else branch is not covered, the race will
               | be missed._
               | 
               | Yes.
               | 
               | > _true 100% test coverage is probably not possible or
               | feasible in large complex projects._
               | 
               | I have yet to encounter a single large complex project
               | where 100% test coverage is not possible. I _have_
               | encountered such projects where 100% test coverage
               | required massive refactoring of the code to facilitate
               | testing and doing so resulting in fixing most customer
               | complaints received.
        
               | tsimionescu wrote:
               | So you truly believe that Firefox or Windows or Ubuntu
               | could not only be 100% covered by unit tests, but that it
               | would actually be worth taking the time to do so? (edit:
               | typos)
        
               | inetknght wrote:
               | Yes, I firmly believe that Firefox, Windows, and Ubuntu
               | can achieve 100% unit test coverage of software. I truly
               | believe it would be worth its while.
        
               | Yoric wrote:
               | As someone who has worked on the code of Firefox, I
               | strongly doubt that.
               | 
               | For instance, pretty much every piece of code (whether
               | native or JS) I've seen is programmed defensively,
               | assuming that for some reason invariants can be broken -
               | something that typically cannot happen unless there is a
               | memory corruption involved. Testing all these defensive
               | programming branches is not possible without introducing
               | memory corruptions.
        
             | nitwit005 wrote:
             | Unit tests tend to be written to run without parallelism,
             | or at least with less parallelism than the normal app has.
        
               | hedora wrote:
               | Not my unit tests. :-)
        
             | tene wrote:
             | I've never actually used TSan, so please correct me if I've
             | missed something significant about how it works, but
             | doesn't TSan need to observe the race happening in order to
             | detect it?
             | 
             | The code coverage analysis I've seen in the past has only
             | evaluates what code is run at least once, but how do you
             | get coverage analysis to evaluate the coverage of what code
             | runs concurrently with what other code effectively enough
             | to get confidence that you'll detect data races?
             | 
             | For example, if you've got a cache that safely handles
             | concurrent use for most operations, but has a data race
             | when two threads both cache miss on the same key within a
             | short window of each other, are there kinds of code-
             | coverage analysis that could theoretically point out that
             | none of your tests involve a cache miss of the same key in
             | two different threads?
             | 
             | I've seen concurrency permutation testing tools, that can
             | reorder the sequence of operations between different
             | threads searching for a permutation that causes a data
             | race, but that relies on actually executing the
             | potentially-racy behaviour at all during testing, right?
             | 
             | I guess this is also what fuzzing is about, but are there
             | tools or techniques you can use to increase your confidence
             | in how well your tests cover the space of concurrent use of
             | your data structures?
        
               | steveklabnik wrote:
               | TSan uses run-time instrumentation, yes.
        
       | BlueTemplar wrote:
       | Pff, I always suspected that they were a bunch of racists...
        
         | dang wrote:
         | Please stop posting unsubstantive and/or flamebait comments to
         | HN. It looks like you've been doing it a lot, and it's not what
         | this site is for.
         | 
         | https://news.ycombinator.com/newsguidelines.html
        
           | BlueTemplar wrote:
           | Well, it was supposed to be a joke, but I now realize that it
           | was in bad taste, so I apologize.
        
       | xvilka wrote:
       | Do they have a roadmap for complete Firefox conversion to Rust?
        
         | steveklabnik wrote:
         | It's been a while, but I don't believe that is a goal, so I
         | don't believe there's a roadmap for it.
        
         | stjohnswarts wrote:
         | I think you should change your question to "which -parts- of
         | firefox will be redone in rust and is there a road map for
         | that?"
        
         | jdashg wrote:
         | The current roadmap for that is "never". :) There just isn't
         | enough ROI for rewriting all old code.
        
           | villasv wrote:
           | I'd go further and say that the ROI is not only
           | "insufficient", it is negligible. Unless your legacy code is
           | written in a language/framework where available experts are
           | disappearing (which is not the case here), there is
           | practically zero value on rewriting code that works as
           | intended.
        
             | zozbot234 wrote:
             | Plenty of legacy code was reimplemented from C++ to Java (a
             | memory-safe language, not unlike Rust) when the latter was
             | released. The ROI is definitely there. But given the scale
             | of that task, it makes sense to pick the lowest-hanging
             | fruit first. AIUI, most of the effort wrt. "oxidation" is
             | now around WebRender, specifically making it the default
             | render backend.
        
           | jhgb wrote:
           | Surely at one point the return for getting rid of the rest of
           | the C++ code would be getting rid of the C++ build process?
        
       | est31 wrote:
       | > Overall Rust appears to be fulfilling one of its original
       | design goals: allowing us to write more concurrent code safely.
       | Both WebRender and Stylo are very large and pervasively multi-
       | threaded, but have had minimal threading issues. What issues we
       | did find were mistakes in the implementations of low-level and
       | explicitly unsafe multithreading abstractions -- and those
       | mistakes were simple to fix.
       | 
       | > This is in contrast to many of our C++ races, which often
       | involved things being randomly accessed on different threads with
       | unclear semantics, necessitating non-trivial refactorings of the
       | code.
        
         | WalterBright wrote:
         | D takes a different approach. D has a type constructor
         | `shared`. Only types marked as `shared` (or `immutable`) can be
         | accessed by multiple threads. If you've got concurrency
         | problems, the scope of the problems are immediately reduced to
         | the subset of the code typed as `shared`.
         | 
         | In addition, `shared` types cannot be accessed directly. They
         | can only be accessed via library routines, such as D's atomic
         | library.
         | 
         | Having an `immutable` data type also helps, as immutable data
         | can be accessed by multiple threads without need for
         | synchronization.
        
           | eximius wrote:
           | It's not that different. Rust has Sync + Send + &refs and
           | &mut refs.
        
           | secondcoming wrote:
           | Does this approach have to deal with potential recursive
           | locking?
        
             | WalterBright wrote:
             | Not by itself. But D's prototype ownership-borrowing system
             | can help with that.
        
           | d110af5ccf wrote:
           | Did D ever get around to specifying a formal memory model?
           | Because when I tried to adopt D a couple years ago I felt
           | there was a lot of ambiguity surrounding the semantics of
           | `shared`, particularly when interfacing with C and C++ code.
           | 
           | I ended up just casting to and from `shared`, and that
           | _seemed_ to work, but it was pretty verbose for highly
           | parallel code and I was never quite sure under what
           | circumstances doing that might violate the compiler 's
           | invariants.
           | 
           | Also such casting appeared to eliminate most of the benefits,
           | since what appeared to be local data might have been cast and
           | had a pointer sent across to another thread elsewhere. In the
           | end `shared`, @nogc antics (the language internals weren't
           | fully annotated), closures requiring the GC (compare to C++),
           | and the friction of interfacing with highly templated C++
           | code such as the Eigen library caused me to abandon the
           | attempt. I sure learned a lot in the process though!
        
           | aliceryhl wrote:
           | That description seems very similar to how Rust's type system
           | enforces thread safety. You have two kinds of references:
           | shared and exclusive. The shared references will typically
           | only provide immutable access, whereas if you have an
           | exclusive reference, you can mutate it. Beyond this, there
           | are some special types that can be mutated through shared
           | references, e.g. the AtomicI32 type. Similarly, the Mutex
           | type lets you lock it given only a shared reference and
           | provides you with an exclusive reference when locked.
        
         | kibwen wrote:
         | And when it comes to finding bugs in your unsafe concurrency
         | primitives in Rust programs, there is a tool called Loom that
         | helps with that, as a powerful complement to what
         | ThreadSanitizer provides: https://github.com/tokio-rs/loom .
         | You can think of Loom sort of like QuickCheck, except the
         | permutations are all the possible thread states of your program
         | based on the memory orderings that you have specified.
         | 
         | Here's a video that briefly talks about what Loom can do
         | https://youtu.be/rMGWeSjctlY?t=7519 (it's part of a larger
         | video about memory orderings, and is a great introduction to
         | the topic).
        
           | gamegoblin wrote:
           | Loom is really awesome, though it is focused on exhaustive
           | testing, so not suitable for code that has a lot of possible
           | interleavings (e.g. due to a ton of threads, or a large body
           | of code).
           | 
           | There is a new project out of AWS called Shuttle [1] which is
           | like Loom, but it does random exploration instead of
           | exhaustive exploration, which enables massively distributed
           | testing of really complicated stuff.
           | 
           | [1] https://github.com/awslabs/shuttle
        
             | kibwen wrote:
             | Excellent, thanks for the tip! I have to ask though, given
             | a tool that does exhaustive testing (like Loom), shouldn't
             | it be trivial to adapt it to do random testing merely by
             | randomizing rather than enumerating the test inputs? Is
             | there something more going on that I'm not seeing?
        
             | wyldfire wrote:
             | Is there anything like Loom or Shuttle for C/C++?
             | 
             | Or are these only implemented in rust and they'd still work
             | with C/C++?
        
         | evmar wrote:
         | It's interesting, I read this and had the opposite conclusion
         | -- people are still writing racy code using Rust.
         | 
         | I mean, it's great that you can write threaded code with more
         | confidence, but it seems the conclusion here is that Rust gives
         | you _more_ confidence but not _absolute_ confidence and you
         | still need the normal stable of runtime detection tooling. The
         | second fix[1] in particular resembles the bad pattern of
         | "sprinkle atomics it until it works" you see in non-Rust
         | languages.
         | 
         | (If this comment raises your hackles, please read the "no true
         | Scotsman[2]" fallacy before responding.)
         | 
         | [1]:
         | https://hg.mozilla.org/integration/autoland/rev/044f8b594c71
         | [2]: https://en.wikipedia.org/wiki/No_true_Scotsman
        
           | fluidcruft wrote:
           | Is that "sprinkle atomics" stuff in bindings/interfaces for
           | non-Rust external code? I'm not particularly surprised that
           | you would have to wrap atomics around externalities and not
           | doing that would be a source of errors if the external code
           | has race conditions.
        
             | evmar wrote:
             | It's not, I linked the patch. (I don't fully understand the
             | patch so I'm happy to be corrected here, but it looks like
             | it was an ordinary "use atomics incorrectly" bug.)
        
           | dan-robertson wrote:
           | Another fallacy to keep in mind is that two things not being
           | perfect does not make them equal failures.
        
           | est31 wrote:
           | You are correct. The "no data races" slogan of Rust is
           | accurate but you need to mention the way it needs to be
           | understood. First, data races are only a subset of all race
           | conditions. Second, like the other safety guarantees, the
           | statement only applies to safe Rust. Once you use unsafe Rust
           | or any non-Rust language (and both usually happens in any non
           | trivial program), this guarantee stops being a guarantee. But
           | it doesn't mean it vanishes into nothingness. It instead
           | becomes a quantitative statement: it's much easier to avoid
           | data races in Rust than in unsafe alternatives.
           | 
           | I think when you build Rust projects that require
           | reliability, it's foolish to lean back and believe that
           | marketing slogan and not look for concurrency bugs. However,
           | once you found them, it's way easier to fix them, at least
           | when your unsafety is well encapsulated like how it's
           | recommended.
           | 
           | I think the big challenge for Rust in this context is to
           | improve its tooling so that using it is comparable or even
           | easier than the C++ tooling. This blog post is proof that
           | such tools should be ran otherwise you are missing out on
           | important bugs.
        
           | coliveira wrote:
           | I think the mistaken idea that Rust allows "bug free"
           | multithreaded programs will lead to a new generation of
           | difficult to maintain multithreaded software... The lesson of
           | multithreading should be that you need to use less of it and
           | only when necessary, not that you should write more. It is
           | similar to what we got when Java was the darling against C++:
           | you now have lots of Java programs that don't have the same
           | problems as the C++ version, but still are difficult to
           | understand, leading to the well known FactoryFactory
           | syndrome.
        
             | mplanchard wrote:
             | I honestly don't see "less concurrency" as a viable route
             | for modern software. We're adding cores at a much faster
             | rate than we're improving clock speeds. Anything that makes
             | concurrency easier is a win in my book.
        
               | coliveira wrote:
               | There are several ways to increase concurrency: low level
               | and high level. UNIX, for example, provided high level
               | tools for concurrency (processes). You can also use
               | message passing if you want. It is possible to write more
               | concurrent software using safer primitives, instead of
               | relying in low level techniques such as Rust/C++
               | multithreading.
        
               | tsimionescu wrote:
               | I don't think it makes sense to call processes either
               | higher or lower level than threads. Splitting your
               | program into multiple processes often comes with many
               | other drawbacks than spitting it into multiple threads.
               | You can even still have data races if you use files or
               | mmap to communicate between processes (you can also get
               | something similar even with pipes/sockets for complex
               | chunked data structures).
               | 
               | Even worse, there are very few, if any, tools that could
               | help you analyze a multiprocess program for any kind of
               | race conditions.
        
           | brundolf wrote:
           | > What issues we did find were mistakes in the
           | implementations of low-level and _explicitly unsafe_
           | multithreading abstractions
           | 
           | (emphasis mine)
           | 
           | The important thing is that the races only happened in unsafe
           | { } blocks. It's well-established that these blocks should
           | basically be treated like C++ in terms of scrutiny, but (I
           | believe) you can still have roughly "absolute confidence" in
           | non-unsafe code.
           | 
           | It's true and interesting that unsafe { } blocks deserve
           | sanitization analysis like C++ code does - and maybe this
           | hasn't been discussed enough - but I don't think it's fair to
           | suggest that the enormous language benefits are outweighed by
           | some false sense of confidence. The proportions in that idea
           | just seem way out of whack.
        
             | benschulz wrote:
             | AFAIU one can create races in safe code using atomics with
             | insufficient ordering.
             | 
             | (That notwithstanding, I agree that Rust is a vast
             | improvement over C/++.)
        
               | steveklabnik wrote:
               | "races" is too general. Safe Rust cannot have _data
               | races_. It can have _race conditions_.
        
               | benschulz wrote:
               | I'm not convinced this is a meaningful distinction in
               | this context. Assuming a developer consciously creates a
               | race which turns out to be a bug, how does it matter
               | whether it was a data race or some other kind of race
               | condition?
        
               | steveklabnik wrote:
               | Data races are undefined behavior, and race conditions
               | are a logic error. Yes, both are bugs, and both are
               | important to fix. But they have (at least to me)
               | different severities.
               | 
               | (Worth noting that data races have a clear, unambiguous
               | definition, and therefore are possible to eliminate
               | through tooling. Race conditions rely on program intent
               | or logic, and are therefore... not really.)
               | 
               | Also, I think it matters, generally, that we are truthful
               | about what Rust prevents and does not prevent.
        
               | brundolf wrote:
               | > Worth noting that data races have a clear, unambiguous
               | definition, and therefore are possible to eliminate
               | through tooling. Race conditions rely on program intent
               | or logic, and are therefore... not really.
               | 
               | This is a great point. Usually people only talk about the
               | two categories in terms of vague severity, but this is a
               | hard distinction which has lots of implications for how
               | each can be dealt with.
        
               | [deleted]
        
               | bcoates wrote:
               | In general data races can break program invariants,
               | violate memory safety and do other undefined behavior.
               | 
               | In C++, data races can cause jumps to invalid addresses
               | and the compiler will assume data races away and delete
               | checks that are redundant in the single-thread case. The
               | equivalent problem happens in most languages where data
               | races are possible (two Python threads writing at once
               | can put a value into an impossible/illegal state)
               | 
               | An ordinary race condition is just an observable effect
               | of the nondeterministic progress of threads, so it's
               | possible to have a truly benign race condition, so long
               | as the correctness of the whole program doesn't depend on
               | the outcome of the race. I/O causes nondeterminism
               | anyway, so it's not really practical to avoid all
               | ordinary races.
        
               | anp wrote:
               | "This context" is the original article, which is
               | discussing data races specifically.
        
               | volta83 wrote:
               | You can also create races in safe Rust by using the file
               | system, no concurrency primitives necessary (just create
               | a new file, close it, and re-open it assuming it's still
               | there..).
               | 
               | What safe Rust doesn't have is _data races_.
        
           | stjohnswarts wrote:
           | They literally found tons of new races conditions in c++ and
           | 2 in the rust code with their tool. And you want to act as if
           | that little factoid makes rust useless somehow. Anyone who
           | thinks a language is flawless is not thinking correctly and
           | that includes rust. However it does a much better job at this
           | than C/C++ yet somehow you twisted that into declaring rust
           | as useless.
        
             | evmar wrote:
             | I'm not sure how you got from "Rust gives you more
             | confidence but not absolute confidence and you still need
             | the normal stable of runtime detection tooling" to
             | "useless".
        
               | frenchy wrote:
               | It's probably because "Rust gives you absolute
               | confidence" is a strawman, I don't think anyone was
               | arguing that, and often people's first reaction to a
               | strawman is to construct a contrary strawman.
        
           | agumonkey wrote:
           | hopefully it will lead to a new borrow checker logic
        
         | chowells wrote:
         | Yeah, that was the part that stood out to me as well. I thought
         | it was pretty intuitively obvious that having your data races
         | confined to small segments is better than allowing them to be
         | spread over the entire codebase, but a lot of people act like
         | it negates the advantage of working in a safer language
         | entirely. It's nice to have empirical reports that back up the
         | "obvious" though. Sometimes "obvious" things turn out to be
         | wrong. It's always good to check them.
        
           | setr wrote:
           | > but a lot of people act like it negates the advantage of
           | working in a safer language entirely.
           | 
           | I think the underlying assumption is more along the lines of:
           | the easy cases are made easier, and the hard cases are made
           | impossible -- or rather the concern being that rust hasn't
           | made the problem easier, it just moved it around.
           | 
           | Which is often the case with claims of solving long-lasting
           | problems
        
             | stouset wrote:
             | > I think the underlying assumption is more along the lines
             | of: the easy cases are made easier, and the hard cases are
             | made impossible -- or rather the concern being that rust
             | hasn't made the problem easier, it just moved it around.
             | 
             | Except in this case the article linked demonstrates that
             | they've made things dramatically easier overall. Data races
             | went from being endemic in a C++ codebase to being in a few
             | low-level and easy-to-locate areas.
             | 
             | The easy stuff stayed easy, the bulk of the rest was folded
             | into the easy case, and even the hard cases appear to have
             | been made easier to find, identify, and fix.
        
               | Quekid5 wrote:
               | Once you pick away enough at the straws, the needle
               | becomes a lost easier to find.
        
             | pdpi wrote:
             | Not all forms of moving things around are equivalent. Some
             | just keep the problem as diffuse as it was before, others
             | restrict the problem to a smaller area.
             | 
             | Rust does not, by any means, solve this problem. What it
             | does do is help you prove that the problem is contained to
             | a smaller subset of your codebase that you can audit/reason
             | about much more easily.
        
             | kibwen wrote:
             | _> the concern being that rust hasn't made the problem
             | easier, it just moved it around._
             | 
             | This is largely the point of Rust, though. Rust takes
             | problems that used to manifest at runtime and causes them
             | to manifest at compile-time, shortening the feedback loop
             | drastically. It also takes problems that used to manifest
             | all over the codebase and roots their causes in the leaf
             | nodes of the program that are making use of `unsafe`
             | blocks. Moving around the problem turns out to be a
             | valuable way of addressing the problem, without ignoring
             | any of the essential complexity or making anything
             | fundamentally impossible.
        
               | wtetzner wrote:
               | > shortening the feedback loop drastically
               | 
               | More importantly, forcing you to catch them. Problems at
               | runtime don't get caught until that condition actually
               | occurs during an actual run of the software.
        
             | chriswarbo wrote:
             | > Which is often the case with claims of solving long-
             | lasting problems
             | 
             | This is what I really like about programming language
             | theory: problems which are literally undecidable in
             | existing languages, can be made almost _trivial_ by
             | designing a language with that problem in mind. The trick
             | is to have the language keep track of the information
             | needed for the solution, so we 're not dealing with "black
             | boxes".
             | 
             | A classic example is annotating every variable with a type,
             | and propagating that information through compound
             | expressions; lets us spot problems like trying to use a
             | number as a string, without worrying about the halting
             | problem (we do this by forbidding some technically-valid
             | programs, like `1 + (true? 10 : "hello")`).
             | 
             | Of course, this leaves the much harder problems of (a)
             | keeping the burden imposed by this extra information down
             | to a reasonable level, and (b) trying to make the language
             | compelling enough for people to actually use. Rust is one
             | of the few examples of going all-in on a new ecosystem, and
             | it actually being embraced. Often it's easier to "embed" a
             | new language inside an existing language, to piggy-back on
             | the existing tooling (e.g. the variety of auto-
             | differentiating languages built in Python)
        
               | Quekid5 wrote:
               | Yes, people often forget that TMs really are "pure
               | runtime". Sure, the program itself is static, but nothing
               | else is. There's no imposed semantics on _anything_
               | beyond the pure mechanism of the interpreter itself. This
               | lessens the  'problem' part of the Halting Problem
               | drastically if you just impose some constraints on the
               | programs, e.g. static types, maybe a borrow-checker, etc.
        
         | [deleted]
        
       | ChiefOBrien wrote:
       | Who knew using already existing data race analysis tools is much
       | more feasible than shoehoning a huge project into a brand new
       | language. Way to go mozilla!
        
         | Jweb_Guru wrote:
         | If you believe that this tool has found all the data races in
         | the C++ code, or even a majority of them, I have a bridge to
         | sell you.
        
           | ChiefOBrien wrote:
           | Oh come on now. If it was really important for them then why
           | did they chose C++ a long time ago after deciding to rewrite
           | Netscape from scratch? Their history is full of dumb choices
           | and now suddenly going full Rust is the most important thing
           | because... the codebase is lousy? It's not gonna bring back
           | any new/old users, and right now the web needs a healthy
           | marketshare.
        
             | dang wrote:
             | Hey, could you please not take HN threads into flamewar,
             | including programming language flamewar? It's tedious, and
             | one reason this site got started was to have a forum that
             | avoids that kind of thing.
             | 
             | Thoughtful critique is welcome, but that would require (a)
             | adding solid information, and (b) taking out the swipes.
             | 
             | https://news.ycombinator.com/newsguidelines.html
        
             | steveklabnik wrote:
             | > now suddenly going full Rust is the most important thing
             | 
             | As said above, Mozilla does not seem to have ever
             | suggested, in the past or now, that "going full Rust" is a
             | goal, let alone "the most important thing."
        
               | ChiefOBrien wrote:
               | Yeah and Webrender, Stylo, and Servo just happen to
               | appear out of thin air.
        
               | steveklabnik wrote:
               | Webrender and Stylo were big projects, but they're
               | already in tree, and that means about 10% of Firefox is
               | Rust. That's hardly "full Rust."
        
               | dralley wrote:
               | Firefox has something like 8 million lines of C++. Stylo
               | replaced about 150,000 lines of that, and WebRender is
               | more of a fast path than a true replacement of anything.
               | 
               | Both of those rewrites came with massive speedups due to
               | the amount of parallelism they were able to use, which
               | was impractical in C++. Other parts of Firefox are not as
               | critical or parallelizable, hence why they aren't being
               | replaced any time soon.
               | 
               | You seem to be quite aggressively misinformed, if you are
               | not deliberately trolling.
        
               | ChiefOBrien wrote:
               | > the amount of parallelism they were able to use, which
               | was impractical in C++
               | 
               | You mean after not using OpenMP, basic multithreading
               | core guidelines, and not doing the refactoring with
               | sanity checkup in TLA+, tools that end up on the
               | frontpage everytime people rediscover them.
        
               | Yoric wrote:
               | Out of curiosity, have you used OpenMP or TLA+? I seem to
               | remember that solve problems that are really, really
               | different from the use cases for which Rust is designed.
        
             | dralley wrote:
             | Please a more productive use of time than constructing
             | ridiculous strawmen on the internet.
        
       ___________________________________________________________________
       (page generated 2021-04-06 23:00 UTC)