[HN Gopher] Java Concurrency - Understanding the Basics of Threads
___________________________________________________________________

Java Concurrency - Understanding the Basics of Threads

Author : turkogluc
Score  : 117 points
Date   : 2020-10-30 09:32 UTC (1 day ago)

(HTM) web link (turkogluc.com)
(TXT) w3m dump (turkogluc.com)

| danielhlockard wrote:
| The font colors in the code examples are quite hard to read on my
| machine. `public class Main {` is so dark that I have to
| highlight it to read it.

| itsmemattchung wrote:
| Years ago, I tried learning how to use threads by following
| tutorials similar to this one, where you are taught how to
| implement threads from {python, java, c++}. However, it wasn't
| until I studied operating systems (when I returned to graduate
| school for computer science) that I was able to wrap my mind
| around threads -- from a language-agnostic viewpoint, how and what
| lightweight processes are, how to implement locks and
| synchronization barriers -- and how they help facilitate
| concurrency.

| derefr wrote:
| Seconded. It's silly to learn threads "from the outside in" --
| thinking of them as an opaque abstraction and trying to
| understand the API they present. There's no coherent
| abstraction there; you'll only learn to cargo-cult the API,
| without gaining an intuition for what threads "are" or when and
| where you'd want to use those APIs.
|
| The key thing to know is that threads _aren't_ a first-class
| kernel object. In OS kernels, there are only _OS processes_ and
| _memory regions_.
|
| To learn about threads, you should just learn about OS
| processes; and then learn that distinct OS processes can share
| memory regions between them, often via subprocess-spawn-time
| inheritance. Learn what fork(2) does on POSIX, and how it
| manages to be fast.
|
| Starting with that intuition, it's simple to then absorb what
| "threads" actually are: a _usage pattern_ for spawning and
| managing OS processes that share memory; and a set of
| convenience APIs (that may be in-kernel, as in Windows; or
| purely in userland, as in Linux) for setting up this usage
| pattern. Everything these "threading" APIs can do, you can do
| yourself directly using the process-management and
| memory-mapping APIs. And those same calls are all that e.g.
| libpthread is doing.

| formerly_proven wrote:
| This strikes me as rather focused on Linux kernel
| implementation details, since in Windows processes and
| threads are actually distinct concepts (as opposed to the
| Linux kernel, which really only knows about tasks), where
| every live process has n>=1 threads and the address space of
| threads is afaik strictly defined through the process it is
| part of.

| djeiasbsbo wrote:
| I know about unix/linux processes, ipc and the relevant
| system calls (exec*, fork, clone, ...) but where do I
| continue from there?
|
| Studying C, I haven't really come across threads other than
| trying out the things in `pthread.h`.
|
| Would you recommend just reading the source code of that
| header for a better understanding?

| rusk wrote:
| I just visualise it as different instruction pointers with
| their own stack and shared heap. But I'm coming from Java so
| that might be an oversimplification!

| selimthegrim wrote:
| What books are used in your program?

| dbsmith83 wrote:
| Idk about GP, but one book I highly recommend is Java
| Concurrency in Practice -
| https://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz...
|
| It's old, but the material holds up well since it covers a
| lot of fundamentals.

| johnnycerberus wrote:
| I also recommend it. It's a little bit old because it doesn't
| cover new features, but the fundamentals are strong. Brian
| Goetz really did a great job.
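A small Java sketch of rusk's picture from earlier in this subthread: each thread gets its own instruction pointer and stack, and all threads share one heap. The class and method names are made up for illustration. `local` lives on each worker's own stack, while the `AtomicLong` is a single heap object that every worker can reach.

```java
import java.util.concurrent.atomic.AtomicLong;

class StackVsHeapSketch {
    // Hypothetical helper: every worker sums 1..upTo privately, then
    // publishes its result into one shared heap object.
    static long sumWithThreads(int threads, int upTo) {
        AtomicLong shared = new AtomicLong();      // one object on the shared heap
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long local = 0;                    // lives on this thread's own stack
                for (int i = 1; i <= upTo; i++) {
                    local += i;                    // no other thread can see this
                }
                shared.addAndGet(local);           // the only touch of shared state
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();                          // wait for each worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(sumWithThreads(4, 100)); // 4 * 5050 = 20200
    }
}
```

Keeping the loop on stack-local state and touching the shared object once per thread is one way to keep the "shared heap" part of the picture as small as possible.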
| robto wrote:
| We were an all-Java shop and we were considering how to
| make our application a SaaS cloud application. Our senior
| engineers read this book. They all agreed that it was very
| educational, but the conclusion was that Java concurrency
| in practice has too many footguns, and so we ended up
| adopting Clojure.
|
| I think modern Java has better support for it, but if
| you've got mutable state spread throughout your application
| you're going to have a hard time no matter what.

| secondcoming wrote:
| > senior engineers read this book
|
| How does one become a senior engineer if you don't
| understand concurrency?
|
| Mutable state is most easily solved by having copies of
| everything, but then that's a tradeoff between
| performance and infrastructure/resource costs, but I
| guess that if you're in an all-Java shop that isn't much
| of an issue.

| wging wrote:
| Rich Hickey is supposed to have said that he created
| Clojure because he was tired of telling people to read
| that book.
|
| Best I can find as a source for now is
| https://www.youtube.com/watch?v=2y5Pv4yN0b0 -- I thought
| there was a link somewhere to Hickey himself saying this,
| but can't find it.

| yetkin wrote:
| The problem is not the threads, it is the mutations of variables
| which boost the complexity of the code. So a tutorial on the
| creation of threads is actually an invitation to hell. Nothing is
| cool about it. The cool thing is achieving concurrency without
| threads/race conditions/shared memory.

| Nursie wrote:
| There's a lot cool about threads, and you can learn to
| implement them well.
|
| Threads handled well do not need to have race conditions, and
| race conditions/deadlocks are also very possible in
| distributed, message-passing systems.

| cle wrote:
| A computer is a mutation machine, you cannot escape mutation by
| hand-waving it away.
| If you are writing programs in which you
| can achieve concurrency without threads and shared memory, it's
| because you're building on the shoulders of all the engineers
| who didn't hand-wave it away. Many of us, due to product
| requirements, don't have the luxury of using higher level
| abstractions like that.

| mrkeen wrote:
| And yet we're comfortable hand-waving GOTO away - that is,
| not calling computers GOTO machines.

| cle wrote:
| There are thousands of engineers (at least) who use it or
| its equivalent every day. Just because they've built
| abstractions that allow you to ignore it, doesn't mean
| nobody has to deal with it anymore.

| lostcolony wrote:
| But that's identical to what he's saying; he's not saying
| no one has to deal with mutable memory. Just that most
| developers who need concurrency shouldn't have to. Same
| as registers, GOTO, etc.

| cle wrote:
| That's not what they said. They said "nothing is cool
| about a tutorial on creation of threads", and that the
| cool thing is "achieving concurrency without threads/race
| conditions/shared memory" which ironically is only
| enabled by all the engineers who spend their time working
| on and maintaining those "uncool" things.
|
| If they don't need to use threads, then good for them.
| But to dismiss threads and learning material about
| threads as "uncool" is just silly. The thing that enables
| that misunderstanding is all the work that's done on them
| in the first place.

| mrfox321 wrote:
| Can databases be efficiently implemented without shared access?
|
| Can message passing accomplish this at the same level of
| performance?
|
| although I agree that simplifying resource access should
| probably be considered before fully shared state.

| mrkeen wrote:
| > Can databases be efficiently implemented without shared
| access?
|
| Let me ask a different question: Why did databases take off
| in the way they did? Sure they persist stuff to disk, but so
| do files.
| What they offer is a concurrency model so good that
| you almost never think about it. Beginner programmers can
| competently write large, concurrent systems by writing
| single-threaded programs which are backed by a central DB,
| without even knowing the term "race condition".
|
| If beginner database articles told users how to make database
| Threads, Thread groups, and how to signal and catch
| interruptions, I don't think databases would have enjoyed
| nearly as much popularity.
|
| While Threads are fundamental to Java concurrency, I kinda
| agree with yetkin's point. The article introduces the Thread
| footgun without even paying lip service to the problems of
| shared, mutable state.

| agumonkey wrote:
| Do you use other paradigms / languages? (Clojure comes to mind,
| but maybe others)

| abhishekjha wrote:
| The Akka actor framework comes to mind. I am in the process of
| learning it and it is definitely simpler to wrap your head
| around.

| itronitron wrote:
| For Java at least, the Java Concurrency API is preferred.

| mrjoelkemp wrote:
| Actor model with Elixir/Erlang and the BEAM VM.

| rowls66 wrote:
| I'll add Pony to the list. This language uses the actor model
| like Akka and Erlang, but allows for the safe sharing of data
| between actors enforced by an ingenious use of the type
| system. The result is an actor programming model with better
| performance than Erlang because mutable data can be safely
| shared.
|
| I have been a long time Java developer, and I have worked a
| lot with highly concurrent code. Pony really opened my eyes
| to what was possible.
|
| Unfortunately, the language, standard library and runtime are
| still pretty immature. It does however have very good 'C'
| interop. So for some problems it would be a very good fit.

| [deleted]

| Igelau wrote:
| All the cool people are in Hell. Only go to Heaven for the
| climate.
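The actor-style suggestions in this subthread (Akka, Erlang, Pony) share one idea: mutable state is owned by a single thread, and everyone else sends it messages. A rough sketch of that idea with plain JDK classes follows. It is not Akka's API; `MailboxCounterSketch`, the `STOP` sentinel and the `demo` helper are invented for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

class MailboxCounterSketch {
    // Sentinel message that tells the owner thread to stop (an assumption of this sketch).
    private static final int STOP = Integer.MIN_VALUE;

    private final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
    private final CompletableFuture<Long> result = new CompletableFuture<>();

    private MailboxCounterSketch() {}

    // Creates the mailbox and starts the single thread that owns the state.
    static MailboxCounterSketch start() {
        MailboxCounterSketch actor = new MailboxCounterSketch();
        Thread owner = new Thread(actor::drainMailbox, "mailbox-owner");
        owner.setDaemon(true);
        owner.start();
        return actor;
    }

    private void drainMailbox() {
        long total = 0;                        // only the owner thread ever touches this
        try {
            while (true) {
                int message = mailbox.take();  // blocks until a message arrives
                if (message == STOP) {
                    break;
                }
                total += message;
            }
            result.complete(total);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            result.completeExceptionally(e);
        }
    }

    // Senders never see the total; they only enqueue messages.
    void send(int amount) {
        mailbox.add(amount);
    }

    long stopAndGet() {
        mailbox.add(STOP);
        return result.join();
    }

    // Several sender threads each send `messagesPerSender` ones to the owner.
    static long demo(int senders, int messagesPerSender) {
        MailboxCounterSketch actor = start();
        Thread[] threads = new Thread[senders];
        for (int s = 0; s < senders; s++) {
            threads[s] = new Thread(() -> {
                for (int i = 0; i < messagesPerSender; i++) {
                    actor.send(1);
                }
            });
            threads[s].start();
        }
        for (Thread t : threads) {
            try {
                t.join();                      // all messages are queued before STOP
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return actor.stopAndGet();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1_000));    // prints 4000
    }
}
```

There is no lock in the user-level code, but the queue itself is a shared, synchronized structure, which is cle's point above: the abstraction sits on top of someone else's careful handling of shared memory.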
| FpUser wrote:
| The concepts of threads and concurrent data access are simple
| enough for any decent programmer to comprehend. There is no
| hell here. Sure there are some complex cases but complex cases
| will arise in many situations when programming things.
|
| And achieving concurrency without shared memory is impossible
| in the general case. Sure it is possible to isolate such access
| to a separate layer and make it transparent for the rest of the
| program but someone still has to program such a layer.

| JackFr wrote:
| The problem for novices is that a program that behaves
| correctly looks a lot like a correct program. Until one day
| it doesn't.
|
| And because you're in production and getting random spurious
| failures, the panicked (but common) reaction is to wrap every
| shared resource in a synchronized block. Which makes an
| incorrect implementation worse but possibly correct.

| FpUser wrote:
| If the resource is shared and being accessed from many
| threads and is both written to and read from then it is the
| correct behavior to lock it with the proper type of lock
| at access time. Depending on the resource it might be possible
| to split it into a few with more granular access.
|
| As for novices: they are called that for a reason and are
| supposed to be under supervision rather than allowed to
| run wild.

| secondcoming wrote:
| Why is this being downvoted? It's the truth.
|
| HN needs to only allow downvotes that have an
| accompanying explanation comment.

| FpUser wrote:
| HN uses downvotes mostly to boo the people with opinions
| deviating from the common party line. As for a reasonable
| explanation - you're asking too much. Programming, like many
| other things, is often treated as a religion. No
| arguments, it just is.

| teslalang wrote:
| As with most other compromised social sites, no badthink
| allowed here and how dare you.

| ubercow13 wrote:
| What's a better alternative to synchronizing access to
| shared resources?
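To make the exchange above concrete, here is a minimal Java sketch of one shared counter guarded two ways: with a `synchronized` method (the coarse lock JackFr describes) and with an `AtomicInteger` (one "proper type" of tool for a single read-modify-write, in FpUser's terms). `SharedCounterSketch` and its methods are hypothetical names, and the pool size of 4 is arbitrary.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedCounterSketch {
    private int lockedCount = 0;   // guarded by the monitor of "this"
    private final AtomicInteger atomicCount = new AtomicInteger();

    // Coarse-grained: the whole read-modify-write runs under one lock.
    synchronized void incrementLocked() {
        lockedCount++;
    }

    synchronized int lockedValue() {
        return lockedCount;
    }

    // Lock-free: the JDK performs the read-modify-write atomically.
    void incrementAtomic() {
        atomicCount.incrementAndGet();
    }

    int atomicValue() {
        return atomicCount.get();
    }

    // Submits `tasks` tasks to a small pool; each bumps both counters `perTask` times.
    static SharedCounterSketch run(int tasks, int perTask) {
        SharedCounterSketch counter = new SharedCounterSketch();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < tasks; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perTask; i++) {
                    counter.incrementLocked();
                    counter.incrementAtomic();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        SharedCounterSketch c = run(8, 10_000);
        System.out.println(c.lockedValue() + " " + c.atomicValue()); // prints "80000 80000"
    }
}
```

Dropping `synchronized` from `incrementLocked` would still compile and would often print the right number in a quick test, which is the "behaves correctly until one day it doesn't" trap described above.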
| mrkeen wrote:
| Treat it like GC and don't leave it up to the programmer.

| FpUser wrote:
| And make a programmer unable to achieve the highest
| performance when needed. We live in a supposedly free
| world. If you want to be "protected" be my guest and use
| languages with GC. Plenty of those. For somebody who needs
| the opposite and uses "unprotected" tools - leave them
| alone. You have no right to decide how other people do
| their work unless they're under your direct control.

| formerly_proven wrote:
| Novices don't build working concurrent systems of any kind
| with any toolkit, period. Concurrency is _hard_ and
| thinking all the "concurrency problems" go away with some
| message passing is both ludicrous and dangerous. Fearless
| concurrency can only be attained through understanding, not
| by thinking all your problems went away because you're
| using a "cool approach".

| mrkeen wrote:
| It's pretty easy to make the leap from individual SQL
| statements to SQL statements which are wrapped in a
| transaction.

| formerly_proven wrote:
| Excellent example for making my point, since "just wrap
| it in a transaction" usually leads to concurrency bugs
| like the beloved lost update.

| FpUser wrote:
| If you're talking about database-like transactions, it
| "usually" leads to concurrency bugs _only_ if the transaction
| isolation level is not _strictly_ serializable. It does not
| hurt to know things before labeling them.

| mrkeen wrote:
| This is not something I'm familiar with. What's the
| beloved lost update and what transactions are you using
| that suffer from it?

| formerly_proven wrote:
| Transactions give varying degrees of "isolation" between
| them, depending on the database (and its version +
| configuration). For example, in what SQL would call READ
| COMMITTED, where transactions will only read data that
| has been committed, read-modify-write updates are
| generally bugs.
| The classic example:
|
|     - Intent: both transactions deduct 50 money
|     - transaction 1: SELECT balance FROM account; // = 100
|     - transaction 2: SELECT balance FROM account; // = 100
|     - transaction 1: UPDATE account SET balance = 50
|     - transaction 1: COMMIT
|     - transaction 2: UPDATE account SET balance = 50
|     - transaction 2: COMMIT
|     - Result: balance is 50, but should be 0
|
| With serializable transactions (not all databases have
| this, particularly if you look beyond SQL):
|
|     - Intent: both transactions deduct 50 money
|     - transaction 1: SELECT balance FROM account; // = 100
|     - transaction 2: SELECT balance FROM account; // = 100
|     - transaction 1: UPDATE account SET balance = 50
|     - transaction 1: COMMIT
|     - transaction 2: UPDATE account SET balance = 50
|     - transaction 2: COMMIT -> Fails, needs to retry
|     - transaction 2b: SELECT balance FROM account; // = 50
|     - transaction 2b: UPDATE account SET balance = 0
|     - transaction 2b: COMMIT -> Ok!
|     - Result: balance is 0
|
| Because this is needed so frequently, databases have
| calculated updates, basically atomic operations:
|
|     - transaction 1: UPDATE account SET balance = balance - 50;
|       // values indeterminate
|     - transaction 2: UPDATE account SET balance = balance - 50;
|       // values indeterminate
|     - transactions 1,2: COMMIT
|     - Result: balance is 0
|
| Or, one could lock the rows, like so:
|
|     - transaction 1: SELECT balance FROM account FOR UPDATE; // = 100
|     - transaction 2: SELECT balance FROM account FOR UPDATE;
|       // transaction 2 is stalled until transaction 1 commits
|       // or rolls back
|     - transaction 1: UPDATE account SET balance = 50
|     - transaction 1: COMMIT
|       // transaction 2 can now continue and gets balance = 50
|     - transaction 2: UPDATE account SET balance = 0
|     - transaction 2: COMMIT
|     - Result: balance is 0
|
| And this is just one simple example of the problems you
| can have concurrently accessing _one_ table, even while
| using transactions.
| Not to speak of the issues you can
| run into when interacting with systems outside a single
| database, which don't interact with the transaction
| semantics of the DB.
|
| Concurrency is just very non-trivial regardless of the
| abstraction.

| abhishekjha wrote:
| Surprisingly this is what the Akka framework promises:
| message passing and immutability of objects.

| FpUser wrote:
| Software usually has state (unless that state is
| completely kept and managed externally in a database for
| example). And the state mutates. A simple example is a
| big array that has to be processed in place.

| [deleted]

| LgWoodenBadger wrote:
| With Futures and Executors/ExecutorServices I find that I rarely
| ever need to use raw Threads these days. Most of the
| thread-safety issues commonly encountered are eliminated with
| this approach as well.

| alyandon wrote:
| Pretty much. I can't even recall the last time I've had to
| touch a low level threading class in Java since the executors
| cover so many of the common use cases.

| silvestrov wrote:
| yeah, calling Thread.interrupt(), join() or similar methods is
| often a code smell for a bad "programming 101" teacher.
|
| Executors are the way to go for almost all finite-time
| concurrency.
|
| New threads should normally only be used for stuff that keeps
| running until the process quits.

| cmckn wrote:
| Absolutely, executors offer much better (safer, more
| consistent) lifecycle management of a Runnable vs. a homegrown
| solution in my experience. The last time I extended Thread I
| think it was just to pull off a custom name format.

| grandinj wrote:
| Except for that pesky swallowing of exceptions, I agree

| exabrial wrote:
| I was going to comment that as well. Futures, Managed
| Executors, tasks, runnables, are higher-level structures
| that are better suited for general use.
| Those constructs are
| often implemented using threads though, so it's worth knowing
| what's happening one layer below the abstraction layer.

| Nursie wrote:
| Likewise, with futures and executors I haven't had to touch
| threads directly for some time.
|
| They give you the tools to just say "go away and do these
| things", which after years of dealing directly with pthreads in
| C was a breath of fresh air!

| thaumasiotes wrote:
| But compare this very recent submission:
| https://news.ycombinator.com/item?id=24921657
|
| which presents threads as the solution to the pain of using
| futures.

| JSavageOne wrote:
| I'm not a Java developer, but isn't RxJava the current best
| practice around managing concurrency in Java? I thought the
| consensus was that manually dealing with thread creation is too
| error-prone and unmanageable.

| vips7L wrote:
| No, ExecutorServices are the current best practice around
| managing concurrency in Java. RxJava is only something you
| should use if you have specific performance requirements (with
| data to back it up), and you need the Observable pattern.

| pjmlp wrote:
| RxJava got hyped in Android before Google went with Kotlin
| first, now they are into co-routines and depending on the
| Jetpack project, Java developers might still be able to use it,
| or be forced into Kotlin.

| sk5t wrote:
| RxJava has some nice tools for async buffering, debounce, etc.,
| but it pays to understand CountDownLatch, Semaphore, Mutex,
| ExecutorService, etc., and I would definitely not consider
| RxJava a substitute for other things.
|
| Do avoid anything related to the hoary old Java "Future" class,
| though. CompletionStage or get out!

| abhishekjha wrote:
| We are using the Akka framework so as not to have to deal
| with threads directly. Message passing and immutable objects
| simplify a lot while adding one more abstraction layer.

| cultus wrote:
| Project Loom[0] is going to be coming out at some point.
| That brings direct JVM support for delimited continuations and
| fibers.
|
| That's really going to change and simplify JVM concurrency and I
| think many other things. Delimited continuations can be used to
| implement algebraic effects, which is exciting for functional
| programming.
|
| [0] https://openjdk.java.net/projects/loom/

| mrkeen wrote:
| > Delimited continuations can be used to implement algebraic
| effects, which is exciting for functional programming.
|
| How so? I thought the type system was the precluding factor for
| algebraic effects, not the threading model.

| cle wrote:
| I honestly don't think it will, not for a long time. It's going
| to make things more complicated, because now we have to figure
| out how to move the entire ecosystem to it incrementally, while
| operating and maintaining systems during the long transition
| period, and making technical decisions at every step about how
| best to do concurrency with all those extra constraints.

| cultus wrote:
| I don't see it used much in legacy systems, but the benefits
| are large enough that I could see new Akka-ish libraries
| built on it.

| an_opabinia wrote:
| Their compatibility story is alright though, at least they
| are trying to make "Thread" forward-compatible with the new
| runtime. And they are working on getting Netty to work, which
| will immediately get things compiling (not working) for a lot
| of projects.
|
| The biggest challenge will be reconciling Futures, Netty and
| ThreadLocal (FiberLocal) patterns. Think defining a SQL
| transaction lifetime, a distributed lock, or an OpenTracing
| span. For Spring Framework people, no big deal. For everyone
| else, lots of complex decisions to make.

| abhishekjha wrote:
| What a strange coincidence! I have started learning Java
| concurrency from the book[1]. I am on the synchronization chapter
| and it looks like managing threads via Runnable directly is going
| to be painful.
| I am hoping I get a good intro to Executors
| somewhere down the road. Adding the OP tutorial to go through
| once I have a better hold on writing concurrent programs.
|
| [1] https://www.amazon.in/Java-Threads-Concurrency-Utilities-Fri...

| rurban wrote:
| And understand its limitations:
| https://news.ycombinator.com/item?id=24955376 Java's insecure
| parallelism

| edem wrote:
| I wouldn't accept anyone's PR with explicit Thread usage in it.
| You should either use some high-level construct like
| CompletableFuture or a concurrent data structure instead.
___________________________________________________________________
(page generated 2020-10-31 23:00 UTC)