This article is an extract of my own learning during work assignments. The intended audience is regular Java developers who want to refer back and refresh the ways a thread-safe program should be written.
Avoiding race conditions while working on a multi-threaded Java application has always been a challenge. Java provides all the techniques needed to do this, but developers who are a little out of touch might use the wrong methods. And even when the code is written with utmost care and satisfaction, you would still find some 'Monday morning quarterback' standing up to question your intentions!!
It can be difficult to defend yourself if you have been out of touch lately.
Hope this blog helps!
There are three concepts fundamental to correct concurrent programming. When a concurrent program is not correctly written, the errors tend to fall into one of three categories:
- Atomicity - deals with actions, and sets of actions, which must be executed all-or-nothing.
- Visibility - determines when the effects of one thread can be seen by another.
- Ordering - determines when actions in one thread can appear to occur out of order with respect to another. Note - when the compiler generates byte code, it can reorder certain program statements for optimization.
public class Rating {
    private int likeCount = 0;

    public void incrementLikeCount() {
        ++likeCount;
    }
}
The issue with this code is obvious: if two threads (two users LIKING the page simultaneously) try to incrementLikeCount at the same time, both may see likeCount as 0 and both may just increment it to 1. Thus, losing a LIKE!
A code section which must not be executed by more than one thread at the same time is known as a Critical Section, and access to it needs to be controlled. Here the method incrementLikeCount() contains the critical section (++likeCount), as it increments likeCount and can be called by multiple threads.
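To see the lost update in action, here is a minimal sketch (the class name RaceDemo and the thread/iteration counts are my own choices for illustration) that hammers the unsynchronized increment from several threads:

```java
// Demonstrates the lost-update race: many threads call the unsynchronized
// incrementLikeCount() and the final count usually falls short of the total.
public class RaceDemo {
    private int likeCount = 0;

    public void incrementLikeCount() {
        ++likeCount; // read-modify-write: NOT atomic
    }

    public int getLikeCount() {
        return likeCount;
    }

    public static void main(String[] args) throws InterruptedException {
        RaceDemo rating = new RaceDemo();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    rating.incrementLikeCount();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // Expected 800000, but the printed value is typically smaller
        // because concurrent increments overwrite each other.
        System.out.println(rating.getLikeCount());
    }
}
```

The exact printed value is nondeterministic; the point is simply that it is usually less than 800000.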
Volatile:
To resolve this, the most faulty way is to just use the volatile keyword:
public class Rating {
    private volatile int likeCount = 0;

    public void incrementLikeCount() {
        ++likeCount;
    }
}
Before going further, let's see how the increment of a volatile integer works internally. ++likeCount implies:
temp1 = likeCount;
temp2 = temp1 + 1;
likeCount = temp2;
When a volatile is written to (++likeCount), the value is written to main memory and not to the local processor cache, and the caches of the other cores are informed of this change by message passing. Post Java 5, volatile ensures Ordering of code instructions, and Visibility, i.e. when one thread writes to a volatile variable and another thread sees that write, the first thread is telling the second about all of the contents of memory up until it performed the write to that volatile variable. But volatile doesn't ensure Atomicity, as it does nothing to control access to the critical section!!
Please read more on volatile here - Volatile Does Not Mean Atomic! [By Jeremy Manson].
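For contrast, here is a sketch of what volatile is legitimately good for: a stop flag where only visibility matters and no compound read-modify-write is involved (the class name StopFlag is my own):

```java
// A common correct use of volatile: a stop flag whose write must become
// visible to the worker thread. There is no compound update here, so
// the missing atomicity guarantee does not matter.
public class StopFlag {
    private volatile boolean running = true;

    public void stop() {
        running = false; // this write is published to other threads
    }

    public void run() {
        while (running) { // each iteration performs a fresh volatile read
            Thread.onSpinWait(); // placeholder for real work (Java 9+)
        }
    }
}
```

Without volatile, the worker thread could loop forever on a stale cached value of running.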
As stated above, when one thread writes to a volatile variable, it is essentially sharing its memory space with other threads, showing them all of the contents of its memory up until it performed the write to that volatile variable. Hence, using volatile variables might even be considered a security risk.
Please read more on this here - What Volatile Means in Java [By Jeremy Manson]
So, to finally state it once again: volatile is not a solution to race conditions.
Synchronized:
The next way on the list is to use synchronized blocks and methods:
public class Rating {
    private volatile int likeCount = 0;

    public synchronized void incrementLikeCount() {
        ++likeCount;
    }
}
The synchronized keyword is used in two primary contexts:
- As a method modifier, marking a method so that it can only be executed by one thread at a time.
- By declaring a code block as a critical section – one that’s only available to a single thread at any given point in time.
Synchronized code blocks are implemented using two dedicated bytecode instructions, which are part of the official specification – MonitorEnter and MonitorExit. The compiler adds an implicit local variable to the method to hold the value of the locked object between the corresponding enter and exit calls.
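The block form looks like this, a sketch where the critical section is narrowed to just the increment (the class name RatingBlock and the dedicated lock object are my own choices for illustration):

```java
// The block form of synchronized guards only the critical section,
// rather than locking for the whole duration of the method.
public class RatingBlock {
    private final Object lock = new Object(); // dedicated monitor object
    private int likeCount = 0;

    public void incrementLikeCount() {
        synchronized (lock) { // compiled to MonitorEnter / MonitorExit
            ++likeCount;
        }
    }

    public int getLikeCount() {
        synchronized (lock) { // reads also take the lock for visibility
            return likeCount;
        }
    }
}
```

Using a private final lock object (instead of synchronizing on `this`) prevents outside code from interfering with the monitor.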
Note 1: synchronized ensures thread safety of the code by locking it and granting access to threads one at a time. Hence, the disadvantage is the 'performance' of the application.
Note 2: We can try using 'double-checked-locking' in certain cases to lessen the performance degradation. In our current example we can't.
Note 3: We are still using volatile here, with synchronized!
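Double-checked locking applies to lazy initialization rather than to counters, which is why it cannot help our LIKE example. A minimal sketch (the class name ConfigHolder is my own) of the pattern done correctly:

```java
// Double-checked locking for lazy initialization: the first (unlocked)
// check skips the synchronized block once the instance exists.
// The volatile modifier is essential; without it this pattern is broken.
public class ConfigHolder {
    private static volatile ConfigHolder instance;

    private ConfigHolder() { }

    public static ConfigHolder getInstance() {
        ConfigHolder local = instance;
        if (local == null) {                      // first check, no lock
            synchronized (ConfigHolder.class) {
                local = instance;
                if (local == null) {              // second check, under the lock
                    instance = local = new ConfigHolder();
                }
            }
        }
        return local;
    }
}
```

Only the very first callers pay the cost of the lock; every later call returns on the unsynchronized fast path.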
Atomic Instructions:
The next way on the list is to use the atomic classes from the Java concurrency API:
import java.util.concurrent.atomic.AtomicInteger;

public class Rating {
    private AtomicInteger likeCount = new AtomicInteger(0);

    public void incrementLikeCount() {
        likeCount.incrementAndGet();
    }
}
AtomicInteger actually makes use of volatile and CAS (Compare And Swap) for its thread-safe implementation. CAS does not make use of locking; rather, it is very optimistic in nature. It follows these steps:
- Compare the value of the primitive to the value we have in hand.
- If the values match, go ahead and swap in the new value. If they do not match, some thread in between has changed the value, so read again and retry.
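The steps above can be sketched as a retry loop built on compareAndSet (the class name CasIncrement is my own; the real incrementAndGet is implemented natively, but this captures the idea):

```java
import java.util.concurrent.atomic.AtomicInteger;

// A sketch of an increment built from CAS: read the current value, compute
// the new one, and swap it in only if no other thread changed the variable
// in the meantime; otherwise loop and retry. Optimistic and lock-free.
public class CasIncrement {
    private final AtomicInteger likeCount = new AtomicInteger(0);

    public int incrementLikeCount() {
        while (true) {
            int current = likeCount.get();  // 1. the value we have in hand
            int next = current + 1;         // 2. compute the new value
            if (likeCount.compareAndSet(current, next)) { // 3. swap if unchanged
                return next;
            }
            // CAS failed: another thread updated likeCount; retry.
        }
    }
}
```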
When there is high contention and a large number of threads want to update the same atomic variable, there is a possibility that locking will outperform the atomic variables. There is one more construct, introduced in Java 8: LongAdder. As per the documentation:
This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.
So LongAdder is not always a replacement for AtomicLong:
- When no contention is present AtomicLong performs better.
- LongAdder will allocate Cells (a final class declared in abstract class Striped64) to avoid contention which consumes memory. So in case we have a tight memory budget we should prefer AtomicLong.
Biased Locking:
Since most Java objects are locked by at most one thread during their lifetime, that thread can bias an object toward itself. Once an object is biased to a thread, that thread can subsequently lock and unlock the object without falling back to more expensive techniques such as atomic instructions or conventional locking. Biased locking is strictly a response to the latency of CAS (Compare And Swap). It's important to note that CAS incurs local latency, but does not impact scalability on modern processors.
Before Java 6, the following JVM option needed to be enabled explicitly for biased locking (it is on by default from Java 6 onward, and has since been deprecated and disabled by default starting with Java 15):
-XX:+UseBiasedLocking Enables a technique for improving the performance of uncontended synchronization. An object is "biased" toward the thread which first acquires its monitor via a monitor enter bytecode or synchronized method invocation; subsequent monitor-related operations performed by that thread are relatively much faster on multiprocessor machines.
An object can be biased toward at most one thread at any given time. That thread is termed the "bias holding thread". If another thread tries to acquire a biased object, the bias needs to be revoked from the original thread. This is accomplished by the VM operation RevokeBias.
To understand the concept of Bias Locking please read - Biased Locking in HotSpot [By David Dice].
To read more on RevokeBias - Java VM: Safepoint for RevokeBias.
Thanks for the read!!