[HN Gopher] Java Concurrency - Understanding the Basics of Threads
       ___________________________________________________________________
        
       Java Concurrency - Understanding the Basics of Threads
        
       Author : turkogluc
       Score  : 117 points
       Date   : 2020-10-30 09:32 UTC (1 days ago)
        
 (HTM) web link (turkogluc.com)
 (TXT) w3m dump (turkogluc.com)
        
       | danielhlockard wrote:
       | The font colors in the code examples are quite hard to read on my
       | machine. `public class Main {` is so dark that I have to
       | highlight to read
        
       | itsmemattchung wrote:
       | Years ago, I tried learning how to use threads by following
       | tutorials similar to this one, where you are taught how to
       | implement threads from {python, java, c++}. However, it wasn't
       | until I studied operating systems (when I returned to graduate
       | school for computer science) that I was able to wrap my mind
       | around threads -- from a language-agnostic viewpoint, how and
       | what lightweight processes are, how to implement locks and
       | synchronization barriers -- and how they help facilitate
       | concurrency.
        
         | derefr wrote:
         | Seconded. It's silly to learn threads "from the outside in" --
         | thinking of them as an opaque abstraction and trying to
         | understand the API they present. There's no coherent
         | abstraction there; you'll only learn to cargo-cult the API,
         | without gaining an intuition for what threads "are" or when and
         | where you'd want to use those APIs.
         | 
         | The key thing to know, is that threads _aren't_ a first-class
         | kernel object. In OS kernels, there are only _OS processes_ and
         | _memory regions_.
         | 
         | To learn about threads, you should just learn about OS
         | processes; and then learn that distinct OS processes can share
         | memory regions between them, often via subprocess-spawn-time
         | inheritance. Learn what fork(2) does on POSIX, and how it
         | manages to be fast.
         | 
         | Starting with that intuition, it's simple to then absorb what
         | "threads" actually are: a _usage pattern_ for spawning and
         | managing OS processes that share memory; and a set of
         | convenience APIs (that may be in-kernel, as in Windows; or
         | purely in userland, as in Linux) for setting up this usage
         | pattern. Everything these "threading" APIs can do, you can do
         | yourself directly using the process-management and memory-
         | mapping APIs. And those same calls are all that e.g. libpthread
         | is doing.
        
           | formerly_proven wrote:
           | This strikes me as rather focused on Linux kernel
           | implementation details, since in Windows processes and
           | threads are actually distinct concepts (as opposed to the
           | Linux kernel, which really only knows about tasks), where
           | every live process has n>=1 threads and the address space of
           | threads is afaik strictly defined through the process it is
           | part of.
        
           | djeiasbsbo wrote:
           | I know about unix/linux processes, ipc and the relevant
           | system calls (exec*, fork, clone, ...) but where do I
           | continue from there?
           | 
           | Studying C, I haven't really come across threads other than
           | trying out the things in `pthread.h`.
           | 
           | Would you recommend just reading the source code of that
           | header for a better understanding?
        
           | rusk wrote:
           | I just visualise it as different instruction pointers with
           | their own stack and shared heap. But I'm coming from Java so
           | that might be an oversimplification!
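
A minimal Java sketch of the picture rusk describes, under the assumption that "own stack, shared heap" maps onto local variables versus a shared object. The class name `StackVsHeapDemo` and its `run` method are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the names are illustrative only.
class StackVsHeapDemo {

    // Runs two threads and returns {local total of thread 0, local total of thread 1, shared total}.
    static int[] run() {
        AtomicInteger shared = new AtomicInteger(); // one object on the heap, visible to both threads
        int[] localTotals = new int[2];

        Thread[] threads = new Thread[2];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                int local = 0; // lives on this thread's own stack
                for (int i = 0; i < 1_000; i++) {
                    local++;
                    shared.incrementAndGet(); // atomic update of the shared heap object
                }
                localTotals[id] = local; // each thread writes only its own slot
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // join() also makes the finished thread's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] {localTotals[0], localTotals[1], shared.get()};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("local totals: " + r[0] + ", " + r[1] + "; shared total: " + r[2]);
    }
}
```

Each thread counts to 1000 in its own local variable, while the shared counter ends at 2000 because both threads update the same object.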
        
         | selimthegrim wrote:
         | What books are used in your program?
        
           | dbsmith83 wrote:
           | Idk about GP, but one book I highly recommend is Java
           | Concurrency in Practice - https://www.amazon.com/Java-
           | Concurrency-Practice-Brian-Goetz...
           | 
           | It's old, but the material holds up well since it covers a
           | lot of fundamentals
        
             | johnnycerberus wrote:
             | I also recommend it, a little bit old because it doesn't
             | cover new features but the fundamentals are strong, Brian
             | Goetz really did a great job.
        
             | robto wrote:
             | We were an all-Java shop and we were considering how to
             | make our application a SAAS cloud application. Our senior
             | engineers read this book. They all agreed that it was very
             | educational, but the conclusion was that Java concurrency
             | in practice has too many footguns, and so we ended up
             | adopting Clojure.
             | 
             | I think modern Java has better support for it, but if
             | you've got mutable state spread throughout your application
             | you're going to have a hard time no matter what.
        
               | secondcoming wrote:
               | > senior engineers read this book
               | 
               | How does one become a senior engineer if you don't
               | understand concurrency?
               | 
               | Mutable state is most easily solved by having copies of
               | everything, but then that's a tradeoff between
               | performance and infrastructure/resource costs, but I
               | guess that if you're in an all-Java shop that isn't much
               | of an issue.
        
               | wging wrote:
               | Rich Hickey is supposed to have said that he created
               | Clojure because he was tired of telling people to read
               | that book.
               | 
               | Best I can find as a source for now is
               | https://www.youtube.com/watch?v=2y5Pv4yN0b0 -- I thought
               | there was a link somewhere to Hickey himself saying this,
               | but can't find it.
        
       | yetkin wrote:
       | The problem is not the threads, it is the mutation of variables,
       | which boosts the complexity of the code. So a tutorial on the
       | creation of threads is actually an invitation to hell. Nothing is
       | cool about it. The cool thing is achieving concurrency without
       | threads/race conditions/shared memory.
        
         | Nursie wrote:
         | There's a lot cool about threads, and you can learn to
         | implement them well.
         | 
         | Threads handled well do not need to have race conditions, and
         | race conditions/deadlocks are also very possible in
         | distributed, message-passing systems.
        
         | cle wrote:
         | A computer is a mutation machine, you cannot escape mutation by
         | hand-waving it away. If you are writing programs in which you
         | can achieve concurrency without threads and shared memory, it's
         | because you're building on the shoulders of all the engineers
         | who didn't hand-wave it away. Many of us, due to product
         | requirements, don't have the luxury of using higher level
         | abstractions like that.
        
           | mrkeen wrote:
           | And yet we're comfortable hand-waving GOTO away - that is,
           | not calling computers GOTO machines.
        
             | cle wrote:
             | There are thousands of engineers (at least) who use it or
             | its equivalent every day. Just because they've built
             | abstractions that allow you to ignore it, doesn't mean
             | nobody has to deal with it anymore.
        
               | lostcolony wrote:
               | But that's identical to what he's saying; he's not saying
               | no one has to deal with mutable memory. Just that most
               | developers who need concurrency shouldn't have to. Same
               | as registers, GOTO, etc.
        
               | cle wrote:
               | That's not what they said. They said "nothing is cool
               | about a tutorial on creation of threads", and that the
               | cool thing is "achieving concurrency without threads/race
               | conditions/shared memory" which ironically is only
               | enabled by all the engineers who spend their time working
               | on and maintaining those "uncool" things.
               | 
               | If they don't need to use threads, then good for them.
               | But to dismiss threads and learning material about
               | threads as "uncool" is just silly. The thing that enables
               | that misunderstanding is all the work that's done on them
               | in the first place.
        
         | mrfox321 wrote:
         | Can databases be efficiently implemented without shared access?
         | 
         | Can message passing accomplish this at the same level of
         | performance?
         | 
         | Although I agree that simplifying resource access should
         | probably be considered before fully shared state.
        
           | mrkeen wrote:
           | > Can databases be efficiently implemented without shared
           | access?
           | 
           | Let me ask a different question: Why did databases take off
           | in the way they did? Sure they persist stuff to disk, but so
           | do files. What they offer is a concurrency model so good that
           | you almost never think about it. Beginner programmers can
           | competently write large, concurrent systems by writing
           | single-threaded programs which are backed by a central DB,
           | without even knowing the term "race condition".
           | 
           | If beginner database articles told users how to make database
           | Threads, Thread groups, and how to signal and catch
           | interruptions, I don't think databases would have enjoyed
           | nearly as much popularity.
           | 
           | While Threads are fundamental to Java concurrency, I kinda
           | agree with yetkin's point. It introduces the Thread footgun
           | without even paying lip service to the problems of shared,
           | mutable state.
        
         | agumonkey wrote:
         | Do you use other paradigms / languages? (Clojure comes to mind,
         | but maybe others)
        
           | abhishekjha wrote:
           | Akka actor framework comes to mind. I am in the process of
           | learning it and it is definitely simpler to wrap your head
           | around it.
        
           | itronitron wrote:
           | For Java at least, the Java Concurrency API is preferred.
        
           | mrjoelkemp wrote:
           | Actor model with Elixir/Erlang and the BEAM VM.
        
           | rowls66 wrote:
           | I'll add Pony to the list. This language uses the actor model
           | like Akka and Erlang, but allows for the safe sharing of data
           | between actors enforced by an ingenious use of the type
           | system. The result is an actor programming model with better
           | performance than Erlang because mutable data can be safely
           | shared.
           | 
           | I have been a long time Java developer, and I have worked a
           | lot with highly concurrent code. Pony really opened my eyes
           | to what was possible.
           | 
           | Unfortunately, the language, standard library and runtime are
           | still pretty immature. It does however have very good 'C'
           | interop. So for some problems it would be a very good fit.
        
         | Igelau wrote:
         | All the cool people are in Hell. Only go to Heaven for the
         | climate.
        
         | FpUser wrote:
         | The concepts of threads and concurrent data access are simple
         | enough for any decent programmer to comprehend. There is no
         | hell here. Sure there are some complex cases but complex cases
         | will arise in many situations when programming things.
         | 
         | And achieving concurrency without shared memory is impossible
         | in the general case. Sure, it is possible to isolate such access
         | to a separate layer and make it transparent for the rest of the
         | program, but someone still has to program such a layer.
        
           | JackFr wrote:
           | The problem for novices is that a program that behaves
           | correctly looks a lot like a correct program. Until one day
           | it doesn't.
           | 
           | And because you're in production and getting random spurious
           | failures, the panicked (but common) reaction is to wrap every
           | shared resource in a synchronized block. Which makes an
           | incorrect implementation worse but possibly correct.
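
A small Java sketch of the kind of bug being described, assuming the shared resource is a plain `int` counter. `CounterDemo` and its methods are hypothetical names. The unsynchronized counter can pass every test on a quiet machine and still lose updates under load, which is why it "looks a lot like a correct program".

```java
// Hypothetical counter class; all names are made up for this sketch.
class CounterDemo {
    private int unsafeCount;
    private int safeCount;

    // Plain read-modify-write: two threads can read the same value and lose an update.
    void unsafeIncrement() {
        unsafeCount++;
    }

    // The intrinsic lock lets only one thread at a time run the increment.
    synchronized void safeIncrement() {
        safeCount++;
    }

    int unsafeCount() {
        return unsafeCount;
    }

    synchronized int safeCount() {
        return safeCount;
    }

    // Starts `threads` threads that each perform `perThread` increments of both counters.
    static CounterDemo run(int threads, int perThread) {
        CounterDemo counter = new CounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.unsafeIncrement();
                    counter.safeIncrement();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        CounterDemo counter = run(4, 100_000);
        System.out.println("synchronized counter:   " + counter.safeCount());
        System.out.println("unsynchronized counter: " + counter.unsafeCount()); // may be lower than 400000
    }
}
```

The synchronized counter always reaches threads x perThread. The unsynchronized one may or may not, depending on timing, so no particular value is claimed for it here.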
        
             | FpUser wrote:
             | If the resource is shared, accessed from many threads, and
             | both written to and read from, then the correct behavior is
             | to lock it with the proper type of lock at access time.
             | Depending on the resource, it might be possible to split it
             | into a few parts with more granular access.
             | 
             | As for novices: they are called that for a reason and are
             | supposed to be under supervision rather than allowed to run
             | wild.
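
One possible reading of "the proper type of lock", sketched in Java: a read-mostly map guarded by `java.util.concurrent.locks.ReentrantReadWriteLock`, so that readers do not block each other while writers get exclusive access. The `GuardedSettings` class is a hypothetical example, not something from the article or the comment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical read-mostly settings store; names are illustrative.
class GuardedSettings {
    private final Map<String, String> values = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    String get(String key) {
        lock.readLock().lock();
        try {
            return values.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer waits for all readers to leave and then has exclusive access.
    void put(String key, String value) {
        lock.writeLock().lock();
        try {
            values.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        GuardedSettings settings = new GuardedSettings();
        settings.put("mode", "fast");
        System.out.println(settings.get("mode"));
    }
}
```

Whether a read/write lock beats a plain `synchronized` block depends on the read-to-write ratio and on contention, so this is a design option to measure rather than a default.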
        
               | secondcoming wrote:
               | Why is this being downvoted? It's the truth.
               | 
               | HN needs to only allow downvotes that have an
               | accompanying explanation comment.
        
               | FpUser wrote:
               | HN uses downvotes mostly to boo people with opinions
               | deviating from the common party line. As for a reasonable
               | explanation - you're asking too much. Programming, like
               | many other things, is often treated as a religion. No
               | arguments, it just is.
        
               | teslalang wrote:
               | As with most other compromised social sites, no badthink
               | allowed here and how dare you.
        
             | ubercow13 wrote:
             | What's a better alternative to synchronizing access to
             | shared resources?
        
               | mrkeen wrote:
               | Treat it like GC and don't leave it up to the programmer.
        
               | FpUser wrote:
               | And make a programmer unable to achieve the highest
               | performance when needed. We live in a supposedly free
               | world. If you want to be "protected", be my guest and use
               | languages with GC. Plenty of those. For somebody who needs
               | the opposite and uses "unprotected" tools - leave them
               | alone. You have no right to decide how other people do
               | their work unless they're under your direct control.
        
             | formerly_proven wrote:
             | Novices don't build working concurrent systems of any kind
             | with any toolkit, period. Concurrency is _hard_ and
             | thinking all the "concurrency problems" go away with some
             | message passing is both ludicrous and dangerous. Fearless
             | concurrency can only be attained through understanding, not
             | by thinking all your problems went away because you're
             | using a "cool approach".
        
               | mrkeen wrote:
               | It's pretty easy to make the leap from individual SQL
               | statements to SQL statements which are wrapped in a
               | transaction.
        
               | formerly_proven wrote:
               | Excellent example for making my point, since "just wrap
               | it in a transaction" usually leads to concurrency bugs
               | like the beloved lost update.
        
               | FpUser wrote:
               | If you're talking about database-like transactions, it
               | "usually" leads to concurrency bugs _only_ if the
               | transaction isolation level is not _strictly_ serializable.
               | It does not hurt to know things before labeling them.
        
               | mrkeen wrote:
               | This is not something I'm familiar with. What's the
               | beloved lost update and what transactions are you using
               | that suffer from it?
        
               | formerly_proven wrote:
               | Transactions give varying degrees of "isolation" between
               | them, depending on the database (and its version +
               | configuration). For example, in what SQL would call READ
               | COMMITTED, where transactions will only read data that
               | has been committed, read-modify-write updates are
               | generally bugs. The classic example:
               | 
               |     - Intent: both transactions deduct 50 money
               |     - transaction 1: SELECT balance FROM account; // = 100
               |     - transaction 2: SELECT balance FROM account; // = 100
               |     - transaction 1: UPDATE account SET balance = 50
               |     - transaction 1: COMMIT
               |     - transaction 2: UPDATE account SET balance = 50
               |     - transaction 2: COMMIT
               |     - Result: balance is 50, but should be 0
               | 
               | With serializable transactions (not all databases have
               | this, particularly if you look beyond SQL):
               | 
               |     - Intent: both transactions deduct 50 money
               |     - transaction 1: SELECT balance FROM account; // = 100
               |     - transaction 2: SELECT balance FROM account; // = 100
               |     - transaction 1: UPDATE account SET balance = 50
               |     - transaction 1: COMMIT
               |     - transaction 2: UPDATE account SET balance = 50
               |     - transaction 2: COMMIT -> Fails, needs to retry
               |     - transaction 2b: SELECT balance FROM account; // = 50
               |     - transaction 2b: UPDATE account SET balance = 0
               |     - transaction 2b: COMMIT -> Ok!
               |     - Result: balance is 0
               | 
               | Because this is needed so frequently, databases have
               | calculated updates, basically atomic operations:
               | 
               |     - transaction 1: UPDATE account SET balance = balance - 50; // values indeterminate
               |     - transaction 2: UPDATE account SET balance = balance - 50; // values indeterminate
               |     - transactions 1,2: COMMIT
               |     - Result: balance is 0
               | 
               | Or, one could lock the rows, like so:
               | 
               |     - transaction 1: SELECT balance FROM account FOR UPDATE; // = 100
               |     - transaction 2: SELECT balance FROM account FOR UPDATE; // transaction 2 is stalled until transaction 1 commits or rolls back
               |     - transaction 1: UPDATE account SET balance = 50
               |     - transaction 1: COMMIT
               |     // transaction 2 can now continue and gets balance = 50
               |     - transaction 2: UPDATE account SET balance = 0
               |     - transaction 2: COMMIT
               |     - Result: balance is 0
               | 
               | And this is just one simple example of the problems you
               | can have concurrently accessing _one_ table, even while
               | using transactions. Not to speak of the issues you can
               | run into when interacting with systems outside a single
               | database, which don't interact with the transaction
               | semantics of the DB.
               | 
               | Concurrency is just very non-trivial regardless of the
               | abstraction.
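
The same lost-update shape shows up in plain Java whenever a value is read, modified and written back. The sketch below is only an in-memory analogy and involves no database: a compare-and-set retry loop loosely mirrors "fail and retry" under serializable isolation, and `addAndGet` loosely mirrors the calculated update `SET balance = balance - 50`. `BalanceDemo` and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory analogy of the account example; no database involved.
class BalanceDemo {

    // Optimistic read-modify-write: if another thread changed the balance between
    // our read and our write, compareAndSet fails and we retry with a fresh read.
    static void deductWithRetry(AtomicInteger balance, int amount) {
        while (true) {
            int seen = balance.get();
            int updated = seen - amount;
            if (balance.compareAndSet(seen, updated)) {
                return;
            }
        }
    }

    // Single atomic update, comparable in spirit to a calculated SQL update.
    static void deductAtomically(AtomicInteger balance, int amount) {
        balance.addAndGet(-amount);
    }

    // Two threads each deduct 50 from a balance of 100; returns the final balance.
    static int run() {
        AtomicInteger balance = new AtomicInteger(100);
        Thread t1 = new Thread(() -> deductWithRetry(balance, 50));
        Thread t2 = new Thread(() -> deductAtomically(balance, 50));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return balance.get();
    }

    public static void main(String[] args) {
        System.out.println("final balance: " + run());
    }
}
```

Both deductions are applied regardless of interleaving, so the final balance is 0 rather than the 50 a naive read-then-write would sometimes produce.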
        
               | abhishekjha wrote:
               | Surprisingly, this is what the Akka framework promises:
               | message passing and immutability of objects.
        
               | FpUser wrote:
               | Software usually has state (unless that state is
               | completely kept and managed externally in a database for
               | example). And the state mutates. Simple case example is a
               | big array that has to be processed in place.
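
A hedged sketch of the "big array processed in place" case in Java. It assumes the work can be split into disjoint index ranges, in which case the array is mutated by several threads without any lock because no element is touched by more than one task. `InPlaceArrayDemo` and `doubleInPlace` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: doubles every element of an array in place using several workers.
class InPlaceArrayDemo {

    static void doubleInPlace(int[] data, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            int chunk = (data.length + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(data.length, w * chunk);
                final int to = Math.min(data.length, from + chunk);
                // Each task owns the half-open range [from, to); ranges never overlap.
                pending.add(pool.submit(() -> {
                    for (int i = from; i < to; i++) {
                        data[i] *= 2;
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion; also makes the task's writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        doubleInPlace(data, 2);
        System.out.println(java.util.Arrays.toString(data));
    }
}
```

The state still mutates, as the comment says. What removes the need for locking is the partitioning, and that assumption breaks as soon as tasks need to read or write each other's ranges.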
        
       | LgWoodenBadger wrote:
       | With Futures and Executors/ExecutorServices I find that I rarely
       | ever need to use raw Threads these days. Most of the thread-
       | safety issues commonly encountered are eliminated with this
       | approach as well.
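
A minimal sketch of the Futures-and-ExecutorService style mentioned here, using only standard `java.util.concurrent` calls. The task (summing squares) and the names `ExecutorSumDemo` and `sumOfSquares` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: no Thread is created or joined by hand.
class ExecutorSumDemo {

    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                // submit(Callable) returns a Future that will hold the task's result.
                results.add(pool.submit(() -> value * value));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // 1^2 + ... + 10^2 = 385
    }
}
```

Each task works only on its own captured value and hands its result back through the `Future`, which is one reason this style tends to avoid shared mutable state.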
        
         | alyandon wrote:
         | Pretty much. I can't even recall the last time I've had to
         | touch a low level threading class in Java since the executors
         | cover so many of the common use cases.
        
         | silvestrov wrote:
         | Yeah, calling Thread.interrupt(), join() or similar methods is
         | often a code smell for a bad "programming 101" teacher.
         | 
         | Executors are the way to go for almost all finite-time
         | concurrency.
         | 
         | New threads should normally only be used for stuff that keeps
         | running until the process quits.
        
         | cmckn wrote:
         | Absolutely, executors offer much better (safer, more
         | consistent) lifecycle management of a Runnable vs. a homegrown
         | solution in my experience. The last time I extended Thread I
         | think it was just to pull off a custom name format.
        
         | grandinj wrote:
         | Except for that pesky swallowing of exceptions, I agree
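
The "swallowing" presumably refers to how `ExecutorService.submit` captures a task's exception inside the returned `Future` instead of reporting it on the worker thread. The sketch below shows that behaviour under that assumption. `SwallowedExceptionDemo` is a hypothetical name.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: the failure only surfaces when someone calls get().
class SwallowedExceptionDemo {

    static String run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The cast selects submit(Runnable); the exception is stored in the Future.
            Future<?> future = pool.submit((Runnable) () -> {
                throw new IllegalStateException("boom");
            });
            try {
                future.get();
                return "no failure observed";
            } catch (ExecutionException e) {
                // The original exception is available as the cause.
                return "task failed: " + e.getCause().getMessage();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If the returned `Future` is discarded and `get()` is never called, nothing reports the failure, so fire-and-forget tasks usually need their own try/catch and logging.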
        
         | exabrial wrote:
         | I was going to comment that as well. Futures, Managed
         | Executors, tasks, and runnables are higher-level structures
         | that are better suited for general use. Those constructs are
         | often implemented using threads though, so it's worth knowing
         | what's happening one layer below the abstraction layer.
        
         | Nursie wrote:
         | Likewise, with futures and executors I haven't had to touch
         | threads directly for some time.
         | 
         | They give you the tools to just say "go away and do these
         | things", which after years of dealing directly with pthreads in
         | C was a breath of fresh air!
        
           | thaumasiotes wrote:
           | But compare this very recent submission:
           | https://news.ycombinator.com/item?id=24921657
           | 
           | which presents threads as the solution to the pain of using
           | futures.
        
       | JSavageOne wrote:
       | I'm not a Java developer, but isn't RxJava the current best
       | practice around managing concurrency in Java? I thought the
       | consensus was that manually dealing with thread creation is too
       | error-prone and unmanageable.
        
         | vips7L wrote:
         | No, ExecutorServices are the current best practice around
         | managing concurrency in Java. RxJava is only something you
         | should use if you have specific performance requirements (with
         | data to back it up), and you need the Observable pattern.
        
         | pjmlp wrote:
         | RxJava got hyped in Android before Google went Kotlin-first;
         | now they are into coroutines, and depending on the Jetpack
         | project, Java developers might still be able to use it, or be
         | forced into Kotlin.
        
         | sk5t wrote:
         | RxJava has some nice tools for async buffering, debounce, etc.,
         | but it pays to understand CountDownLatch, Semaphore, Mutex,
         | ExecutorService, etc., and I would definitely not consider
         | RxJava a substitute for other things.
         | 
         | Do avoid anything related to the hoary old Java "Future" class,
         | though. CompletionStage or get out!
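
For contrast with a blocking `Future.get()`, here is a small sketch of the `CompletionStage` style via `CompletableFuture`, where follow-up steps are attached as callbacks. The class name, the greeting logic and the strings are invented for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical example of composing asynchronous steps without blocking in between.
class CompletionStageDemo {

    static CompletionStage<String> greeting(String name) {
        return CompletableFuture
                .supplyAsync(() -> name.trim())          // runs on the common pool
                .thenApply(String::toUpperCase)          // transform the result when it arrives
                .thenCombine(                            // join with a second, independent stage
                        CompletableFuture.supplyAsync(() -> "HELLO"),
                        (upperName, hello) -> hello + ", " + upperName)
                .exceptionally(ex -> "failed: " + ex.getMessage()); // fallback if any stage threw
    }

    public static void main(String[] args) {
        // join() blocks only here, at the edge of the program.
        System.out.println(greeting(" world ").toCompletableFuture().join());
    }
}
```

Running `main` prints `HELLO, WORLD`. The stages compose declaratively, which is the main thing the plain `Future` interface lacks.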
        
         | abhishekjha wrote:
         | We are using the Akka framework so as not to have to deal
         | with threads directly. Message passing and immutable objects
         | simplify a lot while adding one more abstraction layer.
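
To show the idea without depending on Akka's actual API, here is a plain-Java sketch of message passing with immutable messages: one receiver thread owns the state and drains a `BlockingQueue` mailbox. Everything here (`MailboxDemo`, `Deposit`, the `STOP` sentinel) is hypothetical and is not how Akka itself is implemented.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical mini "actor": state is confined to one thread, others only send messages.
class MailboxDemo {

    // Immutable message: a final field and no setters.
    static final class Deposit {
        final int amount;

        Deposit(int amount) {
            this.amount = amount;
        }
    }

    // Sentinel message that tells the receiver to stop.
    static final Deposit STOP = new Deposit(0);

    static int run(int senders, int depositsPerSender) {
        BlockingQueue<Deposit> mailbox = new LinkedBlockingQueue<>();
        int[] balance = {0}; // touched only by the receiver thread while it runs

        Thread receiver = new Thread(() -> {
            try {
                while (true) {
                    Deposit message = mailbox.take(); // blocks until a message arrives
                    if (message == STOP) {
                        break;
                    }
                    balance[0] += message.amount; // no lock needed: single owner
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        Thread[] producers = new Thread[senders];
        for (int s = 0; s < senders; s++) {
            producers[s] = new Thread(() -> {
                for (int i = 0; i < depositsPerSender; i++) {
                    mailbox.add(new Deposit(1));
                }
            });
            producers[s].start();
        }
        try {
            for (Thread producer : producers) {
                producer.join();
            }
            mailbox.add(STOP);  // sent only after all deposits are queued
            receiver.join();    // after join(), reading balance[0] is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return balance[0];
    }

    public static void main(String[] args) {
        System.out.println("balance: " + run(3, 100));
    }
}
```

The queue is still shared, synchronized memory underneath, which echoes the point made elsewhere in the thread that someone has to implement the layer that hides it.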
        
       | cultus wrote:
       | Project Loom[0] is going to be coming out at some point. That
       | brings direct JVM support for delimited continuations and fibers.
       | 
       | That's really going to change and simplify JVM concurrency and I
       | think many other things. Delimited continuations can be used to
       | implement algebraic effects, which is exciting for functional
       | programming.
       | 
       | [0] https://openjdk.java.net/projects/loom/
        
         | mrkeen wrote:
         | > Delimited continuations can be used to implement algebraic
         | effects, which is exciting for functional programming.
         | 
         | How so? I thought the type system was the precluding factor for
         | algebraic effects, not the threading model.
        
         | cle wrote:
         | I honestly don't think it will, not for a long time. It's going
         | to make things more complicated, because now we have to figure
         | out how to move the entire ecosystem to it incrementally, while
         | operating and maintaining systems during the long transition
         | period, and making technical decisions at every step about how
         | best to do concurrency with all those extra constraints.
        
           | cultus wrote:
           | I don't see it used much in legacy systems, but the benefits
           | are large enough that I could see new Akka-ish libraries
           | built on it.
        
           | an_opabinia wrote:
           | Their compatibility story is alright though, at least they
           | are trying to make "Thread" forward-compatible with the new
           | runtime. And they are working on getting Netty to work, which
           | will immediately get things compiling (not working) for a lot
           | of projects.
           | 
           | The biggest challenge will be reconciling Futures, Netty and
           | ThreadLocal (FiberLocal) patterns. Think defining a SQL
           | transaction lifetime, a distributed lock, or an OpenTracing
           | span. For Spring Framework people, no big deal. For everyone
           | else, lots of complex decisions to make.
        
       | abhishekjha wrote:
       | What a strange coincidence! I have started learning Java
       | concurrency from the book[1]. I am on the synchronization chapter
       | and it looks like managing threads via Runnable directly is going
       | to be painful. I am hoping I get a good intro to Eecutors
       | somewhere down the road. Adding the OP tutorial to go through
       | once I have a better hold on writing concurrent programs.
       | 
       | 1. https://www.amazon.in/Java-Threads-Concurrency-Utilities-
       | Fri...
        
       | rurban wrote:
       | And understand its limitations:
       | https://news.ycombinator.com/item?id=24955376 Java's insecure
       | parallelism
        
       | edem wrote:
       | I wouldn't accept anyone's PR with explicit Thread usage in it.
       | You should either use some high-level construct like
       | CompletableFuture or a concurrent data structure instead.
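
As an illustration of the "concurrent data structure" option, this sketch counts words from several pool threads with `ConcurrentHashMap.merge`, which performs each per-key update atomically. The `WordCountDemo` class and its inputs are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example: no explicit Thread objects and no synchronized blocks.
class WordCountDemo {

    static Map<String, Integer> count(List<String> words, int workers) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (String word : words) {
            // merge() inserts 1 for a new key or atomically adds 1 to the existing count.
            pool.execute(() -> counts.merge(word, 1, Integer::sum));
        }
        pool.shutdown(); // accept no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("a", "b", "a", "c", "a"), 3));
    }
}
```

One task per word is wasteful in practice. It is kept here only to make many threads hit the same keys and show that the counts still come out right.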
        
       ___________________________________________________________________
       (page generated 2020-10-31 23:00 UTC)