Java – Random IT Utensils
https://blog.adamfurmanek.pl
IT, operating systems, maths, and more.
Fri, 22 Mar 2024 20:53:42 +0000

Types and Programming Languages Part 4 – Diamond problem
https://blog.adamfurmanek.pl/2021/02/06/types-and-programming-languages-part-4/
Sat, 06 Feb 2021 09:00:56 +0000

This is the fourth part of the Types and Programming Languages series. For your convenience you can find other parts in the table of contents in Part 1 — Do not return in finally

The Diamond Problem, sometimes called the Deadly Diamond of Death, is a problem in which we inherit the same thing through multiple base entities. If you think of the diamond problem as “the one in C++ in which there are multiple instances created” or “the one that Java doesn’t have but C++ does” then you focus on the technical part too much. In this post I’ll show why there is a diamond problem in Java and that the issue has been there since day one.

Inheritance

We typically say that there is a single inheritance in Java and multiple interface implementation. This is true but it hides the much bigger picture.

Inheritance lets us inherit characteristics and features from a base entity (most of the time a class or an object). There are many things we can inherit, or many levels of inheritance:

  • Signature inheritance
  • Implementation inheritance
  • State inheritance
  • Identity inheritance

I’m using slightly different wording than you may be used to because I want to redefine a couple of things. Also, I’m not going deeper into things like the Hindley-Milner type system or the theory of objects in this part; we’ll cover that some other day.

Signature inheritance

Signature inheritance can be considered an interface implementation in Java. There is some method declared in the interface, we inherit it and provide an implementation. Signature here indicates that it’s only a “header” of the method, no body or whatever else. It’s important to understand that this “inheritance signature” does not need to be the same one as the “calling signature” for method call. For instance, you cannot change return type when implementing interface in C# but return type is not a part of the “calling signature” (although there is an edge case where it is, but that’s a side note). Java allows for that (C# is also considering this feature) via bridge methods but it’s an implementation detail. What we think when talking about “signature inheritance” is just the method header we get from the base entity.

Implementation inheritance

In this type of inheritance we get not only the signature but also the whole method body. Originally neither Java nor C# allowed it via interfaces, but both now allow it via default interface implementations. We’ll cover the implications of that a little later.

We can think of this as of traits. Even though there are some differences between implementation inheritance and traits, they are pretty close to each other. Also, “trait” here is not the “trait” in Scala, even though they are similar to some extent.

State inheritance

This is an inheritance of fields. You can emulate state inheritance with implementation inheritance only but most of the times it’s considered separate. In state inheritance we get the field from the base entity which we can use in subentity (subobject or subclass).

This is similar to mixins to some extent. It’s also worth noting that we may have state inheritance without implementation inheritance but most of the times these two come together.

Identity inheritance

This can be considered “constructor inheritance” (without going much into the type theory). When you think about the difference between mixing in a mixin and inheriting from a class, it comes down to the constructor. You can create an instance and have a new identity.

Typically, we get the identity by constructing the base entity and “holding” it inside the subentity. It’s good to keep in mind that on a technical level we don’t need to hold the parent object as a part of the child object (the way it is implemented in JVM or CLR), these two can be linked via pointers but it’s not a popular way of implementing it. It is kind of similar to prototype inheritance in JavaScript but the latter uses one “base instance” which is reused across all subobjects. Also, when inheriting from multiple base classes we may end up with multiple base instances held in a single object (which can be controlled with virtual inheritance etc).
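To make the four levels a bit more tangible, here is a minimal Java sketch. The mapping of each level onto a Java construct is an interpretation of the definitions above, not an official taxonomy, and all names (Named, Greeter, Counter, Visitor) are made up for illustration.

```java
class InheritanceLevelsDemo {
    // Signature inheritance: only the method header is inherited.
    interface Named {
        String name();
    }

    // Implementation inheritance: the default method carries a body.
    interface Greeter extends Named {
        default String greet() {
            return "Hello, " + name();
        }
    }

    // State inheritance (the field) and identity inheritance (the constructor).
    static abstract class Counter {
        protected int count;

        Counter(int start) {
            this.count = start;
        }

        int next() {
            return ++count;
        }
    }

    static class Visitor extends Counter implements Greeter {
        Visitor() {
            super(10); // constructs the base part held inside this object
        }

        @Override
        public String name() {
            return "visitor";
        }
    }

    public static void main(String[] args) {
        Visitor v = new Visitor();
        System.out.println(v.greet()); // Hello, visitor
        System.out.println(v.next());  // 11
    }
}
```

Visitor gets a signature from Named, a method body from Greeter, and a field plus a constructor call from Counter, which is the only one of the three that can be inherited just once.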

Inheritance in Java

C++ had multiple inheritance and didn’t differentiate between a class and an interface. Java was so scared of multiple inheritance (because of the diamond problem) that it decided to ban multiple inheritance of everything but signatures. It also introduced different terminology for signature inheritance, added separate keywords and made this difference clear and visible.

However, it is important to understand that saying that “there is no multiple inheritance in Java” is not true. There is a multiple inheritance for signatures and single inheritance for everything else (at least until Java 7).

So Java removed multiple inheritance and C# did the same. However, we later realized that it may not have been the best idea, and so Java added default interface implementations, which are basically “implementation inheritance” (to some extent, as they don’t support full-blown polymorphism). Because of that we have the diamond problem “back”. As we’ll see later in this post, it was there from the very beginning.

Diamond problem

Wikipedia defines the diamond problem as a situation when two classes B and C inherit from class A, override something, and then class D inherits from classes B and C without overriding the thing from A. When we now want to use the thing from A in class D, we don’t know which one to use (the one from B or the one from C).

It’s important to understand that this has nothing to do with technical implementation of virtual inheritance in C++ or something like this. It’s a logical problem, not the technical one. C++ only provided a way of controlling internals a little bit better but it’s not the only approach we can take.

Before focusing on the problem itself, let’s talk about Diamond Situation. When we say “problem” we typically think of something which is not obvious how to tackle. However, the Diamond Situation can be trivially solved in some cases. For instance with the signature inheritance:

interface A{
	void foo();
}
 
interface B{
	void foo();
}
 
class C implements A, B{
	public void foo(){
		System.out.println("FOO");
	}
}
 
class Ideone
{
	public static void main (String[] args) throws java.lang.Exception
	{
		C c = new C();
		c.foo();
	}
}

Interfaces A and B both declare the method void foo(). Class C implements both interfaces. We call foo in main and it works; there is no issue here. Why is there no issue? Because it doesn’t matter which interface we use, the signatures are the same. However, if we change the return type:

interface A{
	Object foo();
}
 
interface B{
	String foo();
}
 
class C implements A, B{
	public String foo(){
		return "Foo";
	}
}
 
class Ideone
{
	public static void main (String[] args) throws java.lang.Exception
	{
		C c = new C();
		System.out.println(c.foo());
	}
}

it works correctly in Java but doesn’t work in C# (Compilation error (line 11, col 11): ‘C’ does not implement interface member ‘A.foo()’. ‘C.foo()’ cannot implement ‘A.foo()’ because it does not have the matching return type of ‘object’.).

So we can see that the Diamond Situation in Java is actually a problem in C# because C# doesn’t use bridge methods.
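The bridge method can be observed with reflection. The sketch below repeats the interfaces from the example above inside a wrapper class (BridgeMethodDemo and describeFooMethods are made-up names) and lists every method named foo that the compiler put into C.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BridgeMethodDemo {
    interface A {
        Object foo();
    }

    interface B {
        String foo();
    }

    static class C implements A, B {
        @Override
        public String foo() {
            return "Foo";
        }
    }

    // Describes every method named foo declared directly in C,
    // including the synthetic ones generated by the compiler.
    static List<String> describeFooMethods() {
        List<String> result = new ArrayList<>();
        for (Method m : C.class.getDeclaredMethods()) {
            if (m.getName().equals("foo")) {
                result.add(m.getReturnType().getSimpleName() + " foo() bridge=" + m.isBridge());
            }
        }
        Collections.sort(result); // reflection does not guarantee any order
        return result;
    }

    public static void main(String[] args) {
        describeFooMethods().forEach(System.out::println);
    }
}
```

When compiled with javac this should list two methods: the String foo() we wrote, and a compiler-generated Object foo() marked as a bridge which satisfies interface A.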

Coming back to the problem. I mentioned that with default interface implementations the Diamond Problem is back. Let’s see the code:

interface A{
	default void foo(){
		System.out.println("A");
	}
}
 
interface B{
	default void foo(){
		System.out.println("B");
	}
}
 
class C implements A, B{
}
 
class Ideone
{
	public static void main (String[] args) throws java.lang.Exception
	{
		C c = new C();
		c.foo();
	}
}

It shows the compilation error

Main.java:13: error: types A and B are incompatible;
class C implements A, B{
^
  class C inherits unrelated defaults for foo() from types A and B
1 error

You may argue that we don’t have the diamond situation here, but the code is written this way on purpose to show that it’s not about some common base type but about which thing to use. How do we solve it? In Java we can add the following method to C:

public void foo(){
	A.super.foo();
}

And it works. No problem anymore, no virtual inheritance like in C++ etc.
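For completeness, here is a self-contained variant of the fix. To make the result easy to check, the methods return a String instead of printing (a change from the snippet above), and class D is an extra, made-up example showing that the subclass may also combine both defaults.

```java
class DefaultResolutionDemo {
    interface A {
        default String foo() {
            return "A";
        }
    }

    interface B {
        default String foo() {
            return "B";
        }
    }

    // Picks one of the inherited defaults explicitly.
    static class C implements A, B {
        @Override
        public String foo() {
            return A.super.foo();
        }
    }

    // Nothing stops us from using both of them.
    static class D implements A, B {
        @Override
        public String foo() {
            return A.super.foo() + B.super.foo();
        }
    }

    public static void main(String[] args) {
        System.out.println(new C().foo()); // A
        System.out.println(new D().foo()); // AB
    }
}
```

The important part is that the decision is written down in the subclass; the compiler refuses to guess.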

So what is the Diamond Problem about? It’s not about inheriting incompatible things. It’s about deciding which one to use.

Diamond Situation is almost not interesting at all when we’re dealing with methods. It gets trickier when we introduce state in the base class. We need to decide whether we want to have independent states for each subclass (the regular inheritance) or share it between subclasses (virtual inheritance). If we share it then it may get broken easily (as two different implementations use the same variables). If we don’t share it then we need to specify which variables we’re referring to in the lowest subclass.

How did Java solve the problem? It reports a compile-time error. But other languages do not stop here; for instance, Scala relies on linearization of traits and chooses “the rightmost one” first. It’s important to understand that the problem is not about getting two things but about how we decide which one wins. A compilation error is one of the solutions.

So we can see the problem is back in Java and has a nice solution. No need to ban multiple inheritance, just show a nice compilation error. But it’s not the end of the story.

Diamond Problem in Java since the day one

There is one more thing we need to consider with the Diamond Problem — compatibility. It may happen that your perfectly valid code works today but stops working tomorrow. How? Imagine that you implement two interfaces and only one of them provides default for method foo while the other interface doesn’t have foo at all. Your code works correctly. Then someone comes and adds default foo method to the second interface. When you recompile your code — it breaks.

That’s a big issue (just like each time we break compatibility) but it’s not something new. The Diamond Problem wasn’t in Java until version 8 but the essence of the problem has been there since the beginning. As we said in the previous section, the problem is about deciding which thing wins when we have two of them. Let’s take this code:

class A {
	public void foo(long l){
		System.out.println("Long");
	}
 
	public void foo(double d){
		System.out.println("Double");
	}
}
 
class Ideone
{
	public static void main (String[] args) throws java.lang.Exception
	{
		A a = new A();
		a.foo(123);
	}
}

There are two foo methods, one accepting long, the other accepting double. Can you easily tell which one is going to be used? The answer is: the former, accepting the long parameter.

But let’s stop here and see what’s happening. We have two methods with different signatures. We call foo with a value that matches neither parameter type exactly: 123 is an int. However, Java is “clever” and implicitly widens the value to the type it likes more (here: long, because the compiler considers long more specific than double).

It’s exactly the same diamond problem as before when it comes to the essence. We have two things and we have to decide which one to use. In the Diamond Problem with default interface implementations Java shows a compilation error, but with method overloading it just chooses one method over another. It also has the same implications when it comes to breaking compatibility: imagine that someone comes and adds another foo(int i) method. What’s going to happen with your code? Previously Java was converting int to long, but after the new method is added no conversion is required, so after recompilation you’ll call the new method. It breaks compatibility.
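The effect is easy to reproduce. In the sketch below LibraryV1 and LibraryV2 are made-up names standing for two releases of the same class; the methods return a String instead of printing so that the chosen overload is visible to the caller.

```java
class OverloadDemo {
    // "Before": only long and double overloads exist.
    static class LibraryV1 {
        String foo(long l) {
            return "Long";
        }

        String foo(double d) {
            return "Double";
        }
    }

    // "After": someone added an int overload.
    static class LibraryV2 {
        String foo(int i) {
            return "Int";
        }

        String foo(long l) {
            return "Long";
        }

        String foo(double d) {
            return "Double";
        }
    }

    public static void main(String[] args) {
        // The call site is identical, only the library differs.
        System.out.println(new LibraryV1().foo(123)); // Long
        System.out.println(new LibraryV2().foo(123)); // Int
    }
}
```

Note that the overload is selected at compile time. A caller compiled against the old version keeps calling foo(long) until it is recompiled, and only then does the behavior silently change.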

While accepting different numbers is a plausible situation, there is actually a much more serious place where you may hit this issue. The post “Source compatibility issue with Google Guava library” shows what happened when the Guava library added a new overload, in a case of a varargs array versus explicit parameters.

Summary

While it’s correct to say that there is no multiple inheritance in Java, it’s better to keep in mind that there are many levels of inheritance and we should be specific. Actually, we can inherit implementation since Java 8 — is it a multiple inheritance or not?
While it’s correct to say that there was no Diamond Problem in Java before version 8, the essence of the problem is there in method overloading. And it has the same implications.
And it’s worth seeing how seemingly distant language elements lead to similar challenges. We’re all “afraid” of the Diamond Problem but we are not afraid of the method overloading. Even better — we think it’s a feature until one day we break compatibility.

Types and Programming Languages Part 3 – Finally during termination
https://blog.adamfurmanek.pl/2021/01/23/types-and-programming-languages-part-3/
Sat, 23 Jan 2021 09:00:13 +0000

This is the third part of the Types and Programming Languages series. For your convenience you can find other parts in the table of contents in Part 1 — Do not return in finally

Let’s take the following code:

try{
	throw new Exception("Exception 1");
}finally{
	// cleanup
}

Let’s say there is no catch block anywhere on this thread. What’s going to happen?

That depends on the platform. For instance C# finally documentation says:

Within a handled exception, the associated finally block is guaranteed to be run. However, if the exception is unhandled, execution of the finally block is dependent on how the exception unwind operation is triggered. That, in turn, is dependent on how your computer is set up.

so the finally block may not be executed. JVM guarantees finally is executed according to this.

But things get even more interesting because they may depend on the exception type. For instance, .NET has the HandleProcessCorruptedStateExceptions attribute:

Corrupted process state exceptions are exceptions that indicate that the state of a process has been corrupted. We do not recommend executing your application in this state.

By default, the common language runtime (CLR) does not deliver these exceptions to managed code, and the try/catch blocks (and other exception-handling clauses) are not invoked for them. If you are absolutely sure that you want to maintain your handling of these exceptions, you must apply the HandleProcessCorruptedStateExceptionsAttribute attribute to the method whose exception-handling clauses you want to execute. The CLR delivers the corrupted process state exception to applicable exception clauses only in methods that have both the HandleProcessCorruptedStateExceptionsAttribute and SecurityCriticalAttribute attributes.

So your application may survive but not all finally blocks may get executed.

A similar question arises when, instead of throwing an exception, you exit your application by calling exit(). Is the finally block going to run?

Why would we care? Because we typically release resources in the finally block. If these resources are local to the process then it’s not a big deal, but once you start using interprocess things (like system-wide mutexes) then it’s important to release them otherwise the other user may not know if the protected state is corrupted or not.

Not to mention that an unhandled exception may (.NET) or may not (JVM) take the whole application down.

Takeaway? Always put a global try-catch handler on the thread.
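A minimal Java sketch of such a handler is shown below. The helper guarded and the two static fields are made up for illustration, and catching Throwable at the top of a thread is a design choice that you may want to narrow down in real code.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

class GuardedThreadDemo {
    static final AtomicBoolean cleanupRan = new AtomicBoolean(false);
    static final AtomicReference<Throwable> lastFailure = new AtomicReference<>();

    // Hypothetical helper: wraps a task in a top-level try-catch so that
    // no exception leaves the thread unhandled.
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (Throwable t) {
                lastFailure.set(t);
                System.out.println("Top-level handler caught: " + t);
            }
        };
    }

    // Same shape as the snippet at the top of this post: throw, then clean up.
    static void work() {
        try {
            throw new IllegalStateException("Exception 1");
        } finally {
            cleanupRan.set(true); // cleanup
        }
    }

    static void runDemo() {
        Thread worker = new Thread(guarded(GuardedThreadDemo::work));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        runDemo();
        System.out.println("Cleanup ran: " + cleanupRan.get());
    }
}
```

With the handler in place the exception is always handled somewhere, so running the finally block no longer depends on how the platform treats unhandled exceptions.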

Types and Programming Languages Part 2 – Exception while handling exception
https://blog.adamfurmanek.pl/2021/01/16/types-and-programming-languages-part-2/
Sat, 16 Jan 2021 09:00:12 +0000

This is the second part of the Types and Programming Languages series. For your convenience you can find other parts in the table of contents in Part 1 — Do not return in finally

Last time we saw what happens when we return in finally and that we shouldn’t do it. Today we explore a similar case of exception while handling exception. Let’s take this code in C#:

try{
	try{
		throw new Exception("Exception 1");
	}finally{
		throw new Exception("Exception 2");
	}
}catch(Exception e){
	Console.WriteLine(e);
}

What’s the output?

This question is a bit tricky. First, there are two exceptions in play, and we know that many platforms (including .NET) implement a two-pass exception system. The first pass traverses the stack and looks for a handler capable of handling the exception, then the second pass unwinds the stack and executes all finally blocks. But what if we throw an exception in the second pass?

That depends and differs between languages. For instance, C# loses the exception, as specified by C# language specification:

If the finally block throws another exception, processing of the current exception is terminated.
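Plain try/finally in Java behaves the same way. The sketch below is a Java counterpart of the C# snippet above (the class and method names are made up); it also checks whether the platform linked the first exception to the second one in any way.

```java
class LostExceptionDemo {
    // Throws in try, then throws again in finally, and reports what the
    // outer catch block can still see.
    static String run() {
        try {
            try {
                throw new IllegalStateException("Exception 1");
            } finally {
                throw new IllegalArgumentException("Exception 2");
            }
        } catch (RuntimeException e) {
            return e.getMessage()
                    + ", cause=" + e.getCause()
                    + ", suppressed=" + e.getSuppressed().length;
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // Exception 2, cause=null, suppressed=0
    }
}
```

The first exception is gone: it is neither the cause of the second one nor recorded as suppressed. javac may warn that the finally clause cannot complete normally, which is a good hint that something is off.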

Python 2 does the same, but Python 3 in PEP 3134 changes that:

The proposed semantics are as follows:
1. Each thread has an exception context initially set to None.
2. Whenever an exception is raised, if the exception instance does not already have a __context__ attribute, the interpreter sets it equal to the thread's exception context.
3. Immediately after an exception is raised, the thread's exception context is set to the exception.
4. Whenever the interpreter exits an except block by reaching the end or executing a return, yield, continue, or break statement, the thread's exception context is set to None.

It’s worth noting that some languages provide a field in the exception class which is supposed to store the previous one but if it’s not set automatically by the platform then the original problem still exists. What’s more, if that field is read only then it’s hard to fix the issue in place.

This is important when handling resources. Some languages provide a try-with-resources construct, for instance Java:

try (BufferedReader br = new BufferedReader(new FileReader(path))) {
	return br.readLine();
}

If it was implemented like this:

BufferedReader br = new BufferedReader(new FileReader(path));
try {
	return br.readLine();
} finally {
	if (br != null) br.close();
}

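then an exception thrown in the finally block would erase the previous one. This is, for instance, how it’s implemented in C#. Java does it right, though: when both the body and close() throw, the exception from the body propagates and the one from close() is attached to it as a suppressed exception.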
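Here is a small sketch of the Java behavior. FailingResource is a made-up resource whose close() always throws, so both the body and the cleanup fail.

```java
class SuppressedDemo {
    // Hypothetical resource whose cleanup always fails.
    static class FailingResource implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("close failed");
        }
    }

    static String run() {
        try (FailingResource resource = new FailingResource()) {
            throw new IllegalArgumentException("body failed");
        } catch (RuntimeException e) {
            // The body's exception wins; the one from close() rides along.
            Throwable[] suppressed = e.getSuppressed();
            return e.getMessage() + " / suppressed: " + suppressed[0].getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // body failed / suppressed: close failed
    }
}
```

Neither exception is lost: the caller sees the original failure first and can still inspect what went wrong during cleanup via getSuppressed().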

Types and Programming Languages Part 1 – Do not return in finally
https://blog.adamfurmanek.pl/2021/01/09/types-and-programming-languages-part-1/
Sat, 09 Jan 2021 09:00:37 +0000

This is the first part of the Types and Programming Languages series. For your convenience you can find other parts using the links below:
Part 1 — Do not return in finally
Part 2 — Exception while handling exception
Part 3 — Finally during termination
Part 4 – Diamond problem
Part 5 – Sleeping and measuring time
Part 6 – Liskov substitution principle
Part 7 – Four types of experience in software engineering
Part 8 – Testing – is it worth it?
Part 9 – Real life testing
Part 10 – Dogmatic TDD
Part 11 – Principles of good debugging
Part 12 – A word on estimates and Story Points
Part 13 – Long-term planning
Part 14 – Pure functions vs impure ones
Part 15 – Prohibit vs Enable in the software engineering
Part 16 – Encapsulation and making all public
Part 17 – LSP in practice
Part 18 – Your defaults influence the way you think
Part 19 – Senior or Expert or what?
Part 20 – Measuring how hard your work is

Many languages provide an exception handling construct, typically in the form of try and catch blocks. While details differ in terms of what can be thrown, what can be handled etc., programmers generally assume these constructs work the same across languages. That’s not true, unfortunately, and details are often tricky when it comes to edge cases. We’ll cover that one day.

Some languages support an additional block called finally which is supposed to be executed “no matter what”, whether the exception was thrown or not. That’s obviously not true: there are many situations when it may not be executed, for instance unhandled exceptions, exiting the application, access violations (or segfaults) etc. I won’t be covering the details now, we’ll get to that some other time. What we’ll cover today is returning in finally.

Some languages let you return a value from the finally block. A typical implementation makes the last returned value “win” over the others. Let’s take this Java code:

class Ideone
{
	public static void main (String[] args) throws java.lang.Exception
	{
		System.out.println(foo());
	}
 
	public static int foo(){
		try{
			return 5;
		}finally{
			return 42;
		}
	}
}

The output is 42, because that’s the last returned value. You can observe the same behavior in Python, JS, Windows SEH, and probably other platforms as well. Take your favorite language and check it out. One notable exception here is C#, which doesn’t allow returning from finally, just to avoid this confusion.

Seems simple and makes sense. But what happens if you throw an exception and then return? Let’s take this Java code:

class Ideone
{
	public static void main (String[] args) throws java.lang.Exception
	{
		System.out.println(foo());
	}
	
	public static int foo(){
		try{
			throw new RuntimeException("This disappears");
		}finally{
			return 42;
		}
	}
}

What’s the output? It’s still 42. The exception was lost.

You can see the same in JS:

function foo(){
  try{
    throw "This disappears";
  }finally{
    return 42;
  }
}
console.log(foo());

Python:

def foo():
  try:
    raise Exception("This disappears")
  finally:
    return 42

print(foo())

SEH:

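#include "stdafx.h"
#include <stdio.h>   // printf
#include <windows.h> // EXCEPTION_EXECUTE_HANDLER; assumed, the original header name was lost

// Assumed declaration (missing in the original): a zero divisor read at run time
// so that the division below raises a hardware exception.
volatile int divideByZero = 0;
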
int Filter(){
	return EXCEPTION_EXECUTE_HANDLER;
}

int Foo(){
	__try{
		printf("%d", 1 / divideByZero);
		return 5;
	}
	__finally{
		return 42;
	}
}

int _tmain(int argc, _TCHAR* argv[])
{
	__try{
		printf("%d", Foo());
	}
	__except(Filter()){
	}
	return 0;
}

As a rule of thumb, never return in finally. It breaks the exception handling mechanism.

Erasure type inference issue in Java
https://blog.adamfurmanek.pl/2020/02/15/erasure-type-inference-issue-in-java/
Sat, 15 Feb 2020 09:00:13 +0000

Recently I was working with the following code:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Test {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Item> items = strings.stream().map(item -> new Item(new HashMap())).collect(Collectors.toList());
    }
}

class Item {
    public Map<String, String> metadata;

    Item(Map<String, String> metadata) {
        this.metadata = metadata;
    }
}

I was compiling it with

java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)

on Windows 10 x64. It wasn’t working because of the following:

/tmp/java_fR1LWz/Test.java:10: warning: [unchecked] unchecked method invocation: constructor <init> in class Item is applied to given types
        List<Item> items = strings.stream().map(item -> new Item(new HashMap())).collect(Collectors.toList());
                                                            ^
  required: Map<String,String>
  found: HashMap
/tmp/java_fR1LWz/Test.java:10: warning: [unchecked] unchecked conversion
        List<Item> items = strings.stream().map(item -> new Item(new HashMap())).collect(Collectors.toList());
                                                                     ^
  required: Map<String,String>
  found:    HashMap
/tmp/java_fR1LWz/Test.java:10: warning: [unchecked] unchecked method invocation: method map in interface Stream is applied to given types
        List<Item> items = strings.stream().map(item -> new Item(new HashMap())).collect(Collectors.toList());
                                               ^
  required: Function<? super T,? extends R>
  found: Function<String,Item>
  where T,R are type-variables:
    T extends Object declared in interface Stream
    R extends Object declared in method <R>map(Function<? super T,? extends R>)
/tmp/java_fR1LWz/Test.java:10: warning: [unchecked] unchecked call to <R,A>collect(Collector<? super T,A,R>) as a member of the raw type Stream
        List<Item> items = strings.stream().map(item -> new Item(new HashMap())).collect(Collectors.toList());
                                                                                            ^
  where R,A,T are type-variables:
    R extends Object declared in method <R,A>collect(Collector<? super T,A,R>)
    A extends Object declared in method <R,A>collect(Collector<? super T,A,R>)
    T extends Object declared in interface Stream
/tmp/java_fR1LWz/Test.java:10: error: incompatible types: Object cannot be converted to List<Item>
        List<Item> items = strings.stream().map(item -> new Item(new HashMap())).collect(Collectors.toList());
                                                                                            ^
1 error
4 warnings

You can try reproducing the issue at compilejava.net, it throws the error at the moment.

I was almost sure that it was a bug in javac, especially since the code was working fine in Java 12 as indicated by Ideone.

Fortunately, with some help from the 4programmers.net community I was finally pointed in the right direction. It is a bug and Oracle knows about it. You can see the details on the Oracle page.
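Since the warnings start at the raw new HashMap(), one plausible workaround is to avoid the raw type altogether with the diamond operator. This is an assumption: the sketch below compiles cleanly on current compilers, but it has not been verified against the exact 1.8.0_221 build mentioned above. Item is nested in a wrapper class here only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ErasureWorkaroundDemo {
    static class Item {
        final Map<String, String> metadata;

        Item(Map<String, String> metadata) {
            this.metadata = metadata;
        }
    }

    static List<Item> build(List<String> strings) {
        return strings.stream()
                // new HashMap<>() instead of the raw new HashMap():
                // no unchecked conversion, so nothing gets erased.
                .map(item -> new Item(new HashMap<>()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        strings.add("a");
        strings.add("b");
        System.out.println(build(strings).size()); // 2
    }
}
```

Regardless of the compiler bug, getting rid of raw types is worth doing because it removes the unchecked warnings as well.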

The takeaway is: most of the time the problem is on the user side and the compiler/platform/OS/library/CPU works well. However, sometimes one just hits a bug.

JVM Inside Out Part 4 – Locks and out of band exceptions
https://blog.adamfurmanek.pl/2020/02/01/jvm-inside-out-part-4/
Sat, 01 Feb 2020 09:00:18 +0000

This is the fourth part of the JVM Inside Out series. For your convenience you can find other parts in the table of contents in Part 1 — Getting object address

A typical locking pattern in Java (and other languages, even outside the JVM ecosystem) looks like this:

lock.lock();
try{
   ...
}finally{
    lock.unlock();
}

Simple enough, nothing should break here. However, there is a catch.

Our code is optimized a lot. The compiler (javac) does that, the JIT does that, even the CPU does that. They try to preserve the semantics of our application, but if we don’t obey the rules (e.g. we don’t use barriers when accessing variables modified in other threads) we may get unexpected results.

A try block in the JVM is implemented using metadata. There is a piece of information saying that the try block spans instructions X to Y. If execution never reaches those instructions then the try is not in effect (and finally is not called). Under the hood it is a very “basic” approach: operating system mechanisms (SEH, SJLJ, signals etc.) are used to catch the interrupt (whether hardware or software) and ultimately to compare addresses. Details may differ but the general concept is similar across platforms.

Now, what happens if JIT decides to compile the code like this:

1: call lock.lock();
2: nop
3: anything from try

We finish taking the lock and we end up at instruction 2, but we are not in the try block yet. Now, if some out-of-band exception appears, such as ThreadDeath or OutOfMemoryError, we never release the lock.

Typically we would like to kill the JVM when any of these out-of-band situations happens. But nothing stops us from catching them and stopping the thread from being killed.

Let’s take this code:

import java.sql.Date;
import java.util.concurrent.locks.ReentrantLock;

public class Play{
    public static void main(String[] args) throws InterruptedException {
        final ReentrantLock lock = new ReentrantLock();
        Thread t = new Thread(){
            @Override
            public void run(){
                try {
                    lock.lock();
                    while (new Date(2019, 9, 19).getTime() > 0) {} // This emulates nop instruction (and infinite loop which isn't clearly infinite so the compiler accepts the code)
                    try{
                        System.out.println("Try: Never gonna get here");
                    }finally{
                        System.out.println("Finally: Never gonna get here");
                        lock.unlock();
                    }
                }catch(Throwable e){
                    System.out.println(e);
                }
                System.out.println("We caught the exception and can 'safely' carry on");
            }
        };
        t.start();

        Thread.sleep(1000);
        t.stop();

        System.out.println("Checking deadlock");
        lock.lock();
        System.out.println("Done, no deadlock");
        lock.unlock();
    }
}

Output is:

Checking deadlock
java.lang.ThreadDeath
We caught the exception and can 'safely' carry on

and the application hangs forever.

So what happened? We emulated the nop instruction inserted just before the try block and exception thrown right in that place. We can see that background thread handles the exception and continues execution but the lock is never released so the main thread is blocked forever.

Now let’s see what happens if we try taking the lock in the try block (warning: this code is not correct! it is just to show the idea):

import java.sql.Date;
import java.util.concurrent.locks.ReentrantLock;

public class Play{
    public static void main(String[] args) throws InterruptedException {
        final ReentrantLock lock = new ReentrantLock();
        Thread t = new Thread(){
            @Override
            public void run(){
                try {
                    try{
                        lock.lock();
                        while (new Date(2019, 9, 19).getTime() > 0) {} // This emulates nop instruction (and infinite loop which isn't clearly infinite so the compiler accepts the code)
                        System.out.println("Try: Never gonna get here");
                    }finally{
                        System.out.println("Finally: Never gonna get here");
                        lock.unlock();
                    }
                }catch(Throwable e){
                    System.out.println(e);
                }
                System.out.println("We caught the exception and can 'safely' carry on");
            }
        };
        t.start();

        Thread.sleep(1000);
        t.stop();

        System.out.println("Checking deadlock");
        lock.lock();
        System.out.println("Done, no deadlock");
        lock.unlock();
    }
}

Output:

Checking deadlock
Finally: Never gonna get here
Done, no deadlock
java.lang.ThreadDeath
We caught the exception and can 'safely' carry on

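The application finishes successfully. Why is this code wrong? Because we try to release the lock in finally without knowing whether we actually acquired it. If the exception arrives before lock() completes, someone else may own the lock, and then we either release it incorrectly or get an exception (ReentrantLock throws IllegalMonitorStateException when a thread that does not hold the lock calls unlock()). It is worse in a recursive (reentrant) situation: if an outer frame of our own thread already holds the lock, the stray unlock() silently releases the outer hold.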
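Both failure modes can be reproduced without stopping any thread. The following is a minimal sketch, not the code from the post: the class and method names are made up for illustration, and the "exception before lock()" is emulated simply by leaving the inner lock() call out.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; the names are illustrative only.
public class StrayUnlockDemo {
    // Case 1: unlock() on a lock the current thread never acquired.
    static String unlockWithoutOwning() {
        ReentrantLock lock = new ReentrantLock();
        try {
            lock.unlock();
            return "released";
        } catch (IllegalMonitorStateException e) {
            return "IllegalMonitorStateException";
        }
    }

    // Case 2: an outer frame already holds the lock and the inner finally
    // block runs unlock() although the inner lock() never happened.
    static int holdCountAfterStrayUnlock() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // outer acquisition, hold count is 1
        try {
            // The inner lock.lock() is deliberately missing: this emulates
            // the exception arriving before it had a chance to run.
        } finally {
            lock.unlock(); // stray unlock from the inner block
        }
        // The outer frame still believes it owns the lock.
        return lock.getHoldCount();
    }

    public static void main(String[] args) {
        System.out.println(unlockWithoutOwning());       // IllegalMonitorStateException
        System.out.println(holdCountAfterStrayUnlock()); // 0
    }
}
```

In the second case nothing fails at all, which is what makes it dangerous: the outer code carries on in its critical section without holding the lock.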

Now the question is: is this just theory or did it actually happen? I don’t know of any example in the JVM world, but this did happen in .NET and was fixed in .NET 4.0. On the other hand, I am not aware of any guarantee that this cannot happen on the JVM.

How to solve it? Avoid Thread.stop(), as stopping threads is bad. But remember that this doesn’t solve the “problem”: what if you hold a distributed lock (whether it is an OS lock shared across processes or something spanning machines)? You have exactly the same issue, and saying “avoid Process.kill()” or “avoid getting your machine broken” is not an answer. This problem can always appear, so think about it whenever you take a lock. As a rule of thumb, track the owner and always take the lock with a timeout.
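The rule of thumb could look roughly like this for an in-process ReentrantLock. It is only a sketch under assumptions: TimedLockDemo and runGuarded are hypothetical names, "tracking the owner" is approximated by comparing the current thread's hold count before and after, and the abandoned lock is emulated by a daemon thread that takes the lock and never releases it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper; the names are illustrative only.
public class TimedLockDemo {
    // Runs the action under the lock, or gives up after timeoutMs milliseconds.
    static boolean runGuarded(ReentrantLock lock, long timeoutMs, Runnable action) {
        // Holds owned by the current thread before we start (non-zero when reentrant).
        int holdsBefore = lock.getHoldCount();
        try {
            if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // somebody else owns the lock for too long
            }
            action.run();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            return false;
        } finally {
            // Release only the hold this call added, never an outer one.
            if (lock.getHoldCount() > holdsBefore) {
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch taken = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock(); // taken and never released
            taken.countDown();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
        });
        holder.setDaemon(true); // do not keep the JVM alive
        holder.start();
        taken.await(); // make sure the holder really owns the lock

        boolean ran = runGuarded(lock, 200, () -> System.out.println("In critical section"));
        System.out.println(ran ? "Done, no deadlock" : "Timed out instead of hanging");
    }
}
```

The timeout does not repair the abandoned lock. It only turns a silent hang into a failure the caller can see and handle.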

JVM Inside Out Part 3 — Java raw type trickery
https://blog.adamfurmanek.pl/2020/01/25/jvm-inside-out-part-3/
Sat, 25 Jan 2020 09:00:51 +0000

This is the third part of the JVM Inside Out series. For your convenience you can find other parts in the table of contents in Part 1 — Getting object address

Erasure in Java seems pretty easy, but sometimes it has unexpected consequences. One of them is that using a raw type erases the generics in the whole class content, not only the class’s own type parameter. JLS 4.6 says:

Type erasure also maps the signature (§8.4.2) of a constructor or method to a signature that has no parameterized types or type variables. The erasure of a constructor or method signature s is a signature consisting of the same name as s and the erasures of all the formal parameter types given in s.

Let’s take this code:

import java.util.*;
import java.lang.*;
import java.io.*;
 
class Ideone
{
	public static List<? extends Object> produce(){
		return null; // Whatever
	}
 
	public static void main (String[] args) throws java.lang.Exception
	{
	}
 
	public static void a(NoGeneric noGeneric){
		noGeneric.call(produce());
	}
 
	public static <T> void b(Generic<T> generic){
		generic.call(produce());
	}
 
	public static <T, U extends Generic<T>> void d(U generic){
		generic.call(produce());
	}
 
	public static <T extends Generic> void c(T raw){
		raw.call(produce());
	}
}
 
class NoGeneric{
	public void call(List<Object> objects){	}
}
 
class Generic<T> {
	public void call(List<Object> objects){}
}

The compiler reports:

Main.java:16: error: incompatible types: List<CAP#1> cannot be converted to List<Object>
		noGeneric.call(produce());
		                      ^
  where CAP#1 is a fresh type-variable:
    CAP#1 extends Object from capture of ? extends Object
Main.java:20: error: incompatible types: List<CAP#1> cannot be converted to List<Object>
		generic.call(produce());
		                    ^
  where CAP#1 is a fresh type-variable:
    CAP#1 extends Object from capture of ? extends Object
Main.java:24: error: incompatible types: List<CAP#1> cannot be converted to List<Object>
		generic.call(produce());
		                    ^
  where CAP#1 is a fresh type-variable:
    CAP#1 extends Object from capture of ? extends Object
Note: Main.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
3 errors

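Line 16 (in method a) doesn’t compile because we try to pass a list of ? extends Object where a list of Object is expected. Java doesn’t allow this because generic types are invariant (it would work for arrays, though, since arrays are covariant).

Line 20 (in method b) doesn’t compile because Generic<T> is not a raw type, so call still expects List<Object> and we hit the same problem as on line 16.

Line 24 (in method d) doesn’t compile for the same reason: U extends Generic<T> is parameterized as well.

However, line 28 (in method c) compiles. This is because T extends Generic uses a raw type. According to the JLS (see also section 4.8 on raw types), using the raw type erases not only the generic parameters related to T in Generic but those of other things as well. So the method call(List<Object> objects) becomes call(List objects), which accepts any list with only an unchecked warning.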
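To see why the compiler rejects the parameterized variants, here is a small sketch of what the raw-type loophole permits. It is not the code from the post: RawTypeDemo and demo are hypothetical names, c takes an extra list parameter, and call is changed to add an Integer so that the damage becomes visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
public class RawTypeDemo {
    static class Generic<T> {
        // Declared with List<Object>, so adding an Integer is legal here.
        void call(List<Object> objects) {
            objects.add(42);
        }
    }

    // Same shape as method c above: the bound of T is the raw type Generic,
    // so call(List<Object>) is seen as call(List) and any list is accepted.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static <T extends Generic> void c(T raw, List<? extends Object> list) {
        raw.call(list);
    }

    static String demo() {
        List<String> strings = new ArrayList<>();
        c(new Generic<String>(), strings); // an Integer lands in a List<String>
        try {
            String first = strings.get(0); // the compiler-inserted cast fails here
            return first;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // ClassCastException
    }
}
```

The failure shows up far from the call that caused it, which is the usual symptom of heap pollution through raw types.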

Comparing numbers is hard
https://blog.adamfurmanek.pl/2019/08/17/comparing-numbers-is-hard/
Sat, 17 Aug 2019 08:00:21 +0000

We know that to compare floating point values we should use an epsilon and not just compare bits. We may run into similar issues when comparing BigDecimal in Java:

BigDecimal a = BigDecimal.valueOf(0);
BigDecimal b = BigDecimal.valueOf(Double.valueOf(0));
System.out.println(a.equals(b));

What is the output? Of course it is false, otherwise I wouldn’t write this post. This is because BigDecimal includes scale:

Unlike compareTo, this method considers two BigDecimal objects equal only if they are equal in value and scale (thus 2.0 is not equal to 2.00 when compared by this method).

So we should use this:

System.out.println(a.compareTo(b));

which prints 0, meaning that the two values are numerically equal, as we expect.
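A self-contained sketch of the difference follows. The class name and the sameValue helper are made up for illustration. The scales differ because valueOf(long) yields scale 0, while valueOf(double) goes through the canonical string form of the double ("0.0") and yields scale 1.

```java
import java.math.BigDecimal;

// Hypothetical demo class; the names are illustrative only.
public class BigDecimalCompareDemo {
    // Numerical equality that ignores scale.
    static boolean sameValue(BigDecimal x, BigDecimal y) {
        return x.compareTo(y) == 0;
    }

    public static void main(String[] args) {
        BigDecimal a = BigDecimal.valueOf(0);   // valueOf(long): scale 0
        BigDecimal b = BigDecimal.valueOf(0.0); // valueOf(double): scale 1

        System.out.println(a.scale() + " vs " + b.scale()); // 0 vs 1
        System.out.println(a.equals(b));                    // false
        System.out.println(sameValue(a, b));                // true
    }
}
```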

Spark and NegativeArraySizeException
https://blog.adamfurmanek.pl/2019/06/22/spark-and-negativearraysizeexception/
Sat, 22 Jun 2019 08:00:44 +0000

Recently I was debugging the following crash in Spark:

java.lang.NegativeArraySizeException
	at com.esotericsoftware.kryo.util.IdentityObjectIntMap.resize(IdentityObjectIntMap.java:447)
	at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:245)
	at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:239)
	at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:135)
	at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:246)
	at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:239)
	at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:135)
	at com.esotericsoftware.kryo.util.MapReferenceResolver.addWrittenObject(MapReferenceResolver.java:41)
	at com.esotericsoftware.kryo.Kryo.writeReferenceOrNull(Kryo.java:658)
	at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:623)
	at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:37)
	at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:33)
	at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
	at com.twitter.chill.TraversableSerializer$$anonfun$write$1.apply(Traversable.scala:29)
	at com.twitter.chill.TraversableSerializer$$anonfun$write$1.apply(Traversable.scala:27)
	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
	at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
	at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
	at com.twitter.chill.TraversableSerializer.write(Traversable.scala:27)
	at com.twitter.chill.TraversableSerializer.write(Traversable.scala:21)
	at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
	at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:207)
	at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$blockifyObject$2.apply(TorrentBroadcast.scala:268)
	at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$blockifyObject$2.apply(TorrentBroadcast.scala:268)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1303)
	at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:269)
	at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:126)
	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:88)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:56)
	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1411)

Disabling Kryo solves the issue. To do that, set spark.serializer to org.apache.spark.serializer.JavaSerializer.
Another workaround is to change Kryo’s reference management, as explained on GitHub:

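import com.esotericsoftware.kryo.Kryo;
// Without reference tracking Kryo no longer records written objects in the map that fails to resize
Kryo kryo = new Kryo();
kryo.setReferences(false);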
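On the Spark side both workarounds come down to configuration. The sketch below rests on assumptions: it relies on a default SparkConf picking up spark.* Java system properties, and on spark.kryo.referenceTracking being the Spark-level switch for the Kryo setting shown above. The class and method names are hypothetical. Passing the same keys via --conf or SparkConf.set should be equivalent.

```java
// Hypothetical helper; the names are illustrative only.
public class SparkSerializerSettings {
    // Workaround 1: do not use Kryo at all.
    static void useJavaSerializer() {
        System.setProperty("spark.serializer",
                "org.apache.spark.serializer.JavaSerializer");
    }

    // Workaround 2: keep Kryo but disable reference tracking.
    // Assumption: safe only when the serialized object graphs have no cycles.
    static void disableKryoReferenceTracking() {
        System.setProperty("spark.kryo.referenceTracking", "false");
    }

    public static void main(String[] args) {
        // Must run before the SparkConf / SparkContext is created.
        useJavaSerializer();
        System.out.println(System.getProperty("spark.serializer"));
    }
}
```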

Spark and NullPointerException in UTF8String.contains
https://blog.adamfurmanek.pl/2019/06/15/spark-and-nullpointerexception-in-utf8string-contains/
Sat, 15 Jun 2019 08:00:51 +0000

Recently I was debugging a NullPointerException in Spark. The stack trace indicated this:

java.lang.NullPointerException
	at org.apache.spark.unsafe.types.UTF8String.contains(UTF8String.java:284)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

After some digging I found out that the following query causes the problem:

df1
  .join(df2,
	df1("id1") === df2("id2")
	  && !isnull(df1("ref"))
	  && !isnull(df2("ref"))
	  && df2("ref").contains(df1("ref")) // <--- this is the problem
	, "left_outer"
  )
  .drop("id2")

When I commented out the marked line, the NPE was no longer there. Also, when I replaced either df2("ref") or df1("ref") with lit("ref"), it did not crash either, so something was wrong with contains operating on columns from two dataframes.

In my case removing the cache helped: I was caching df2 with the cache() method before running the join, and when I removed the caching the problem disappeared. This was Spark 2.1.0 on EMR 5.5.3.
