Item 66: Synchronize access to shared mutable data
Effective Java, Second Edition - Joshua Bloch
The synchronized keyword ensures that only a single thread can execute a method or block at one time. Many programmers think of synchronization solely as a means of mutual exclusion, to prevent an object from being observed in an inconsistent state while it’s being modified by another thread. In this view, an object is created in a consistent state (Item 15) and locked by the methods that access it. These methods observe the state and optionally cause a state transition, transforming the object from one consistent state to another. Proper use of synchronization guarantees that no method will ever observe the object in an inconsistent state.
This view is correct, but it’s only half the story. Without synchronization, one thread’s changes might not be visible to other threads. Not only does synchronization prevent a thread from observing an object in an inconsistent state, but it ensures that each thread entering a synchronized method or block sees the effects of all previous modifications that were guarded by the same lock.
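As a minimal sketch of both halves of this guarantee, the hypothetical SynchronizedCounter class below (not part of the original item) guards an int field with the object's intrinsic lock. Because every read and write goes through a synchronized method on the same lock, increments cannot interleave, and the final read sees every earlier increment. The class name, the runDemo helper, and the thread and iteration counts are illustrative assumptions; the lambda syntax assumes Java 8 or later.

```java
// Hypothetical example class; not taken from the original item.
final class SynchronizedCounter {
    private int count; // guarded by the intrinsic lock on "this"

    // Both the write and the read acquire the same lock.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts the given number of threads, each incrementing the shared counter.
    static int runDemo(int threads, int incrementsPerThread) throws InterruptedException {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join(); // wait for every worker to finish
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(4, 100_000)); // prints 400000
    }
}
```

If the synchronized modifier were removed from increment, the count++ read-modify-write could interleave across threads and the total could come out lower than expected.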
The language specification guarantees that reading or writing a variable is atomic unless the variable is of type long or double [JLS, 17.4.7]. In other words, reading a variable other than a long or double is guaranteed to return a value that was stored into that variable by some thread, even if multiple threads modify the variable concurrently and without synchronization.
You may hear it said that to improve performance, you should avoid synchronization when reading or writing atomic data. This advice is dangerously wrong. While the language specification guarantees that a thread will not see an arbitrary value when reading a field, it does not guarantee that a value written by one thread will be visible to another. Synchronization is required for reliable communication between threads as well as for mutual exclusion. This is due to a part of the language specification known as the memory model, which specifies when and how changes made by one thread become visible to others [JLS, 17, Goetz06 16].
The consequences of failing to synchronize access to shared mutable data can be dire even if the data is atomically readable and writable. Consider the task of stopping one thread from another. The libraries provide the Thread.stop method, but this method was deprecated long ago because it is inherently unsafe—its use can result in data corruption. Do not use Thread.stop. A recommended way to stop one thread from another is to have the first thread poll a boolean field that is initially false but can be set to true by the second thread to indicate that the first thread is to stop itself. Because reading and writing a boolean field is atomic, some programmers dispense with synchronization when accessing the field:
    // Broken! - How long would you expect this program to run?
    import java.util.concurrent.TimeUnit;

    public class StopThread {
        private static boolean stopRequested;

        public static void main(String[] args) throws InterruptedException {
            Thread backgroundThread = new Thread(new Runnable() {
                public void run() {
                    int i = 0;
                    // Unsynchronized read: may never observe the main thread's write
                    while (!stopRequested)
                        i++;
                }
            });
            backgroundThread.start();

            TimeUnit.SECONDS.sleep(1);
            stopRequested = true;
        }
    }
You might expect this program to run for about a second, after which the main thread sets stopRequested to true, causing the background thread’s loop to terminate. On my machine, however, the program never terminates: the background thread loops forever!
The problem is that in the absence of synchronization, there is no guarantee as to when, if ever, the background thread will see the change in the value of stopRequested that was made by the main thread. In the absence of synchronization, it’s quite acceptable for the virtual machine to transform this code:
    while (!done)
        i++;
into this code:
    if (!done)
        while (true)
            i++;
This optimization is known as hoisting, and it is precisely what the HotSpot server VM does. The result is a liveness failure: the program fails to make progress. One way to fix the problem is to synchronize access to the stopRequested field. This program terminates in about one second, as expected:
    // Properly synchronized cooperative thread termination
    import java.util.concurrent.TimeUnit;

    public class StopThread {
        private static boolean stopRequested;

        // Write and read both lock on StopThread.class
        private static synchronized void requestStop() {
            stopRequested = true;
        }

        private static synchronized boolean stopRequested() {
            return stopRequested;
        }

        public static void main(String[] args) throws InterruptedException {
            Thread backgroundThread = new Thread(new Runnable() {
                public void run() {
                    int i = 0;
                    while (!stopRequested())
                        i++;
                }
            });
            backgroundThread.start();

            TimeUnit.SECONDS.sleep(1);
            requestStop();
        }
    }
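Note that both the write method (requestStop) and the read method (stopRequested) are synchronized, and that both acquire the same lock. It is not sufficient to synchronize only the write method! In fact, synchronization is not guaranteed to work unless both read and write operations are synchronized: an unsynchronized reader has no guarantee of ever seeing the writer's change.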
The best way to avoid the problems discussed in this item is not to share mutable data. Either share immutable data (Item 15), or don’t share at all. In other words, confine mutable data to a single thread. If you adopt this policy, it is important to document it, so that it is maintained as your program evolves. It is also important to have a deep understanding of the frameworks and libraries you’re using, as they may introduce threads that you are unaware of.
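One way to picture thread confinement is sketched below, under stated assumptions: the hypothetical ConfinedWork class gives each task its own scratch list that no other thread ever touches, and only an immutable Integer result crosses thread boundaries. The use of ExecutorService and Future to hand the result back is an illustrative choice, not something this item prescribes; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; names are illustrative only.
final class ConfinedWork {
    // The scratch list is created, mutated, and discarded by a single thread.
    static int sumOfSquares(int from, int to) {
        List<Integer> scratch = new ArrayList<>(); // confined to the calling thread
        for (int n = from; n < to; n++) {
            scratch.add(n * n);
        }
        int sum = 0;
        for (int value : scratch) {
            sum += value;
        }
        return sum; // only this immutable result leaves the thread
    }

    static int runDemo() throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Integer> low = pool.submit(() -> sumOfSquares(0, 50));
            Future<Integer> high = pool.submit(() -> sumOfSquares(50, 100));
            // Future.get hands each result to the caller safely.
            return low.get() + high.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo()); // prints 328350
    }
}
```

No field in this sketch is both mutable and shared, so none of the methods needs the synchronized keyword.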
It is acceptable for one thread to modify a data object for a while and then to share it with other threads, synchronizing only the act of sharing the object reference. Other threads can then read the object without further synchronization, so long as it isn’t modified again. Such objects are said to be effectively immutable [Goetz06 3.5.4]. Transferring such an object reference from one thread to others is called safe publication [Goetz06 3.5.3]. There are many ways to safely publish an object reference: you can store it in a static field as part of class initialization; you can store it in a volatile field, a final field, or a field that is accessed with normal locking; or you can put it into a concurrent collection (Item 69).
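A minimal sketch of safe publication through a volatile field follows. The hypothetical ConfigHolder class builds a map in one thread, stops modifying it, and publishes the reference with a single volatile write; a reader thread then uses the map without any locking. The class name, the setting keys, and the runDemo helper are assumptions made for illustration, and wrapping the map with Collections.unmodifiableMap is an extra safeguard rather than a requirement of the technique.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; the names and settings are illustrative only.
final class ConfigHolder {
    // The volatile write in publish() is what safely publishes the map.
    private static volatile Map<String, Integer> settings;

    static void publish() {
        Map<String, Integer> m = new HashMap<>();
        m.put("retries", 3);           // mutate freely before publication
        m.put("timeoutSeconds", 30);
        settings = Collections.unmodifiableMap(m); // never modified after this point
    }

    static int runDemo() throws InterruptedException {
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            Map<String, Integer> local;
            // Volatile read: observes the published reference once it is written.
            while ((local = settings) == null) {
                Thread.yield();
            }
            seen[0] = local.get("retries"); // read without further synchronization
        });
        reader.start();
        publish();
        reader.join(); // join makes seen[0] visible to the calling thread
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo()); // prints 3
    }
}
```

If the settings field were not volatile, the reader would have no guarantee of ever seeing the reference, or of seeing the map's contents fully populated.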
In summary, when multiple threads share mutable data, each thread that reads or writes the data must perform synchronization. Without synchronization, there is no guarantee that one thread’s changes will be visible to another. The penalties for failing to synchronize shared mutable data are liveness and safety failures. These failures are among the most difficult to debug. They can be intermittent and timing-dependent, and program behavior can vary radically from one VM to another.