Thread safety

Hi! A few days ago I ended up in a StackOverflow thread from almost 7 years ago: someone was playing with their multithreaded code but was getting some funny behavior from the execution. Well, it looked like a good excuse to recall some topics and do some new research.

To start with, let's just quickly review the differences between Multiprogramming, Multiprocessing, Multitasking, and Multithreading.

Multiprogramming

In a multiprogramming environment more than one program can be loaded into main memory, but only one of them can use the CPU and execute instructions at a time. In this kind of environment, programs that are not using the CPU wait for their turn. Of course this alone is not acceptable; imagine the same program holding the CPU for an hour. To solve this problem the OS interrupts a program as soon as it starts to perform operations that don't require the CPU, giving control to one of the programs in memory that is ready to execute, a.k.a. a context switch. This way no CPU time is wasted waiting for e.g. I/O tasks to finish, or hoping that the program will voluntarily release the CPU.

Please notice that at this point we're really talking about whole programs.

Multiprocessing

To the end user, multiprocessing may look similar to multiprogramming, since we can see multiple programs executing at the same time. But unlike multiprogramming, which performs the trick through context switches and memory segregation, multiprocessing is about CPU units: if the hardware provides more than one processor, more than one program can execute at literally the same time. Keep in mind that multiprocessing and multiprogramming are not exclusive; a system can be both at the same time.

Multitasking

Multitasking is similar to what we have in multiprogramming, but at this point we're not just talking about programs; we're talking about programs, processes, tasks, and threads running at the same time if we have more than one CPU, or with the illusion of "same time" through context switches if not. Once again, they are not exclusive. The difference now is that we may have small processes or even sub-tasks sharing the CPU in a fair way, with CPU time quantums defined by the scheduler, instead of letting a process use the CPU until it blocks.

Multithreading

Multithreading is a bit different from the previous models. The idea here is that we now have multiple sub-segments, "threads", running concurrently inside the context of another process; in other words, threads are children of a process that share the parent process's resources but execute independently. This is interesting because in the previous models each process has its own share of resources and they're not shared; with multithreading the children of the main process share its resources, so in this scenario we're allowed to perform complex computations while still doing something else. But this advantage comes with a burden too: the programmer now needs to handle race conditions.

So what can we do to handle race conditions and prevent the system from ending up in an inconsistent state? There are a few approaches (a minimal sketch of such a race follows the list):

  • Stop sharing state across threads
  • Make state immutable
  • Use synchronization whenever shared mutable data is accessed
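
As promised, here is a minimal sketch of such a race (my example, not from the original post), the classic check-then-act pattern: two threads can both see the field as null and both create an instance, so callers may end up with different "singletons".

// Hypothetical sketch: a check-then-act race on lazy initialization.
public class UnsafeLazyInit {
    private static ExpensiveObject instance;

    public static ExpensiveObject getInstance() {
        if (instance == null) {               // check
            instance = new ExpensiveObject(); // act: another thread may be doing the same right now
        }
        return instance;
    }

    static class ExpensiveObject { }
}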

To start with, I would like to do another recap, now about the Java thread memory model:
Java thread memory model

As you can see in the image above, objects are kept in the heap space and shared across all threads. Do you remember the parent-child process idea? The JVM process is the parent and the threads are its children, so Java threads have access to the parent's resources, including the heap space.

JVMs can use a technique called escape analysis, by which they can tell that certain objects remain confined to a single thread for their entire lifetime, and that lifetime is bounded by the lifetime of a given stack frame. Such objects can be safely allocated on the stack instead of the heap.

But each Java thread also has its own private area, the thread stack: it holds local variables and partial results, and plays a part in method invocation and return (see the JVM spec for details).

Another important aspect when talking about threads is the CPU cache. In a multithreaded environment, for performance reasons, each thread may keep a copy of a variable in the CPU cache. In that case, if the thread running on CPU A changes the variable, there's no guarantee that the change will be written back to main memory right away, so another thread could be looking at a stale value: the visibility problem.

CPU Memory Cache
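
Here is a minimal sketch of the visibility problem (my example, not from the post): without volatile, the reader thread is allowed to keep using a cached value of the flag and may never notice the write made by the main thread.

// Hypothetical sketch: the reader may spin forever because it never sees the write.
public class VisibilityDemo {
    private static boolean running = true; // try declaring this volatile

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (running) {
                // busy wait: without volatile this thread may keep reading a
                // cached value and never observe the write below
            }
            System.out.println("Reader noticed the flag change");
        });
        reader.start();
        Thread.sleep(1000);
        running = false; // written by the main thread
        System.out.println("Main thread cleared the flag");
    }
}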

The mechanism available in Java to address the visibility problem is the volatile keyword. Declaring a variable volatile indicates that changes to it will be written back to main memory immediately, and reads will go directly to main memory as well. But even volatile may not be enough: suppose you need to read the current value first to figure out the next value before writing; you could still end up with a race condition inside that small gap.
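
To illustrate that gap (my sketch, not from the original post): volatile makes the writes visible, but it does not make a read-modify-write like counter++ atomic; java.util.concurrent.atomic.AtomicInteger (or a synchronized block) closes that gap.

// Hypothetical sketch: volatile gives visibility, but counter++ is still read-then-write.
import java.util.concurrent.atomic.AtomicInteger;

public class VolatileIsNotAtomic {
    private static volatile int volatileCounter = 0;
    private static final AtomicInteger atomicCounter = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                volatileCounter++;               // racy: updates can be lost in the read/write gap
                atomicCounter.incrementAndGet(); // atomic compare-and-set loop
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("volatile counter: " + volatileCounter);     // usually less than 200000
        System.out.println("atomic counter:   " + atomicCounter.get()); // always 200000
    }
}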

Enough recap, let's take a look at 4 different approaches to designing a thread-safe class: thread confinement, stack confinement, object confinement, and immutability. Yes, it's the designer's duty to address race conditions internally. Do you remember encapsulation? It applies here too: if you encapsulate your class properly, you can change the approach you use to take care of race conditions in the future without penalizing the clients.

There are many different scenarios we could use to provoke race condition problems, but I'll use Dr. Kabutz's example: a simple utility class used to format dates.

public class DateFormatter {
    private final DateFormat df = new SimpleDateFormat("yyyy-MM-dd");

    public String format(Date date) {
        return df.format(date);
    }

    public Date parse(String date) throws ParseException {
        return df.parse(date);
    }
}

Thread confinement

Now, if we simply execute this code for 1_000_000 rounds against 4 innocent threads, what do we have?

java.util.concurrent.ExecutionException: java.lang.NumberFormatException: multiple points
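
The post's actual benchmark isn't shown, but a harness along these lines (a sketch of mine, with hypothetical names) reproduces the failure: 4 worker threads hammering the same DateFormatter, each task formatting a date and parsing it back.

// Hypothetical harness: 1_000_000 round-trips through the shared formatter on 4 threads.
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DateFormatterStressTest {
    public static void main(String[] args) throws Exception {
        DateFormatter formatter = new DateFormatter(); // the class above
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Date>> results = new ArrayList<>();

        for (int i = 0; i < 1_000_000; i++) {
            results.add(pool.submit(() -> formatter.parse(formatter.format(new Date()))));
        }
        for (Future<Date> result : results) {
            result.get(); // rethrows any failure wrapped in an ExecutionException
        }
        pool.shutdown();
    }
}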

Looks like DateFormat is not thread-safe, oops. We could use ThreadLocal and confine our SimpleDateFormat to the current running thread.

@ThreadSafe
public class DateFormatter {

   private static final ThreadLocal<SimpleDateFormat> threadLocal = ThreadLocal.withInitial(
         () -> new SimpleDateFormat("yyyy-MM-dd")
   );

   public String format(Date date) {
      return threadLocal.get().format(date);
   }

   public Date parse(String date) throws ParseException {
      return threadLocal.get().parse(date);
   }
}

Now we are safe, and the GC log looks like this:

time = 2038
durationOfLog=PT2.161S
numberOfGCs=12
numberOfYoungGCs=12
numberOfOldGCs=0
memoryReclaimedDuringYoung=3.562GB
memoryReclaimedDuringOld=0.000B
maxHeapAfterGC=1.484MB
totalMemoryAllocated=3.563GB
averageCreationRate=1.65GB/s
timeInGCs=PT0.015351S
timeInYoungGCs=PT0.015351S
averageTimeInYoungGCs=DoubleSummaryStatistics{count=12, sum=0.015351, min=0.000543, average=0.001279, max=0.002282}
timeInOldGCs=PT0S
percentageOfTimeInGC=0.71%

All right, I may be paranoid, but internally ThreadLocal keeps a static class called ThreadLocalMap, which keeps an Entry[] called table, where Entry extends WeakReference. I'm not sure of what I'm about to say, but I remember seeing posts where people pointed to this as a source of problems: if ThreadLocal values are not cleaned up properly after a transaction is done, the next TransactionProcessingTask might inherit values from a previous, unrelated task. There are even articles about how to clean up ThreadLocals, so be careful.
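
If that worries you, a sketch of the usual cleanup pattern (mine, not from the original post) is to call remove() in a finally block once the work on the current thread is done:

// Hypothetical sketch: explicit ThreadLocal cleanup for code running on pooled threads.
import java.text.SimpleDateFormat;
import java.util.Date;

public class ThreadLocalCleanupDemo {
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    public static String formatOnce(Date date) {
        try {
            return FORMAT.get().format(date);
        } finally {
            FORMAT.remove(); // drops this thread's entry from its ThreadLocalMap
        }
    }
}

Of course, removing after every call throws away exactly the reuse that made ThreadLocal attractive here, so this pattern matters more for per-task state on pooled threads than for a cached formatter.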

Stack confinement

Another solution would be stack confinement. Do you remember the first image above? If not, take another look. What about this:

The thread stack, it holds local variables and partial results, and plays a part in method invocation and return

@ThreadSafe
public class DateFormatter {

   public String format(Date date) {
      final DateFormat df = getDateFormat();
      return df.format(date);
   }

   public Date parse(String date) throws ParseException {
      final DateFormat df = getDateFormat();
      return df.parse(date);
   }

   private DateFormat getDateFormat() {
      return new SimpleDateFormat("yyyy-MM-dd");
   }
}

Now we have local variables created on each method invocation, which means the reference to our DateFormat lives inside the thread stack and is never shared with other threads. The formatter objects themselves are still allocated on the heap (unless escape analysis kicks in), which is why the allocation numbers go up. This is how the GC log looks now.

time = 4599
durationOfLog=PT4.962S
numberOfGCs=63
numberOfYoungGCs=63
numberOfOldGCs=0
memoryReclaimedDuringYoung=15.573GB
memoryReclaimedDuringOld=0.000B
maxHeapAfterGC=1.508MB
totalMemoryAllocated=15.574GB
averageCreationRate=3.14GB/s
timeInGCs=PT0.0453345S
timeInYoungGCs=PT0.0453345S
averageTimeInYoungGCs=DoubleSummaryStatistics{count=63, sum=0.045335, min=0.000501, average=0.000720, max=0.002288}
timeInOldGCs=PT0S
percentageOfTimeInGC=0.91%

Again, 1_000_000 rounds against 4 lovable threads.

Hmm, easier, but apparently twice as slow, and the total memory allocated was quite a bit higher too. But still, safe and easy.

Object confinement

Another option is object confinement: here we declare DateFormat as a private final field and synchronize access to it, so that two threads can never use it at the same time.

@ThreadSafe
public class DateFormatter {
   @GuardedBy("this")
   private final DateFormat df = new SimpleDateFormat("yyyy-MM-dd");

   public synchronized String format(Date date) {
      return df.format(date);
   }

   public synchronized Date parse(String date) throws ParseException {
      return df.parse(date);
   }
}

Let's see the GC log result:

time = 7307
durationOfLog=PT7.61S
numberOfGCs=98
numberOfYoungGCs=98
numberOfOldGCs=0
memoryReclaimedDuringYoung=4.124GB
memoryReclaimedDuringOld=0.000B
maxHeapAfterGC=1.492MB
totalMemoryAllocated=4.125GB
averageCreationRate=555.06MB/s
timeInGCs=PT0.0619821S
timeInYoungGCs=PT0.0619821S
averageTimeInYoungGCs=DoubleSummaryStatistics{count=98, sum=0.061982, min=0.000455, average=0.000632, max=0.002529}
timeInOldGCs=PT0S
percentageOfTimeInGC=0.81%

All right, apparently this approach spares some memory allocation, but the execution was even worse than stack confinement. This is probably because we're introducing some contention by synchronizing these methods.
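
A common variation of this approach, not covered in the post, is to guard the formatter with a private lock object instead of synchronizing on this, so client code can never interfere with the lock. The contention stays the same; this is purely about encapsulation.

// Hypothetical sketch: object confinement with a private lock object.
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateFormatterPrivateLock {
    private final Object lock = new Object(); // clients cannot synchronize on this lock
    private final DateFormat df = new SimpleDateFormat("yyyy-MM-dd");

    public String format(Date date) {
        synchronized (lock) {
            return df.format(date);
        }
    }

    public Date parse(String date) throws ParseException {
        synchronized (lock) {
            return df.parse(date);
        }
    }
}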

Immutability

This one may look a bit unfair because it addresses the specific problem of the date API, but you can abstract it and think about immutability for any other scenario.

A new date API was introduced with Java 8, and a useful class here is DateTimeFormatter, which by definition is thread-safe.

A formatter created from a pattern can be used as many times as necessary, it is immutable and is thread-safe.

@ThreadSafe
public class DateFormatter {
   private static final DateTimeFormatter df = DateTimeFormatter.ISO_LOCAL_DATE;

   public String format(LocalDate date) {
      return df.format(date);
   }

   public LocalDate parse(String date) {
      return LocalDate.parse(date, df);
   }
}

The GC log result once again, 1_000_000 executions and 4 threads.

time = 1371
durationOfLog=PT1.434S
numberOfGCs=10
numberOfYoungGCs=10
numberOfOldGCs=0
memoryReclaimedDuringYoung=1.999GB
memoryReclaimedDuringOld=0.000B
maxHeapAfterGC=1.391MB
totalMemoryAllocated=2.000GB
averageCreationRate=1.39GB/s
timeInGCs=PT0.0147572S
timeInYoungGCs=PT0.0147572S
averageTimeInYoungGCs=DoubleSummaryStatistics{count=10, sum=0.014757, min=0.000533, average=0.001476, max=0.002788}
timeInOldGCs=PT0S
percentageOfTimeInGC=1.03%

What do we have here? Around 1.4 seconds and just 2GB of memory allocated, not bad at all; apparently those Scala people have a point. Thread-safe, quick, and cheap.

Conclusion

The conclusion here is inconclusive :) The approach to use will probably depend on what is most important for the scenario at hand: are you short on memory, do you care more about performance, or about readability? Or don't you care about any of these aspects, and what matters to you is to deliver quickly and let the problem blow up in someone else's hands? If so, you know there's a special place in hell for you, right?

Depending on what you're planning for your afterlife, or on the scenario you have at hand in this world, you should carefully consider the approach you're going to use to achieve thread safety: you can make your code concurrency-proof, but you may introduce other problems if you don't look at the options you have first.

Results summary:

Thread confinement
time = 2038
totalMemoryAllocated=3.563GB

Stack confinement
time = 4599
totalMemoryAllocated=15.574GB

Object confinement  
time = 7307
totalMemoryAllocated=4.125GB

Immutability
time = 1371
totalMemoryAllocated=2.000GB

References

https://www.javaspecialists.eu/archive/Issue229.html
https://gabrieletolomei.wordpress.com/miscellanea/operating-systems/virtual-memory-paging-and-swapping/
https://www.geeksforgeeks.org/operating-system-difference-multitasking-multithreading-multiprocessing/
https://medium.com/@unmeshvjoshi/how-java-thread-maps-to-os-thread-e280a9fb2e06
https://dzone.com/articles/java-memory-and-cpu-monitoring-tools-and-technique
http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html#jvms-2.5.2
http://tutorials.jenkov.com/java-concurrency/volatile.html
