Recently I was debugging the following crash in Spark:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
java.lang.NegativeArraySizeException at com.esotericsoftware.kryo.util.IdentityObjectIntMap.resize(IdentityObjectIntMap.java:447) at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:245) at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:239) at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:135) at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:246) at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:239) at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:135) at com.esotericsoftware.kryo.util.MapReferenceResolver.addWrittenObject(MapReferenceResolver.java:41) at com.esotericsoftware.kryo.Kryo.writeReferenceOrNull(Kryo.java:658) at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:623) at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:37) at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:33) at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) at com.twitter.chill.TraversableSerializer$$anonfun$write$1.apply(Traversable.scala:29) at com.twitter.chill.TraversableSerializer$$anonfun$write$1.apply(Traversable.scala:27) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40) at scala.collection.mutable.HashMap.foreach(HashMap.scala:99) at com.twitter.chill.TraversableSerializer.write(Traversable.scala:27) at com.twitter.chill.TraversableSerializer.write(Traversable.scala:21) at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:207) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$blockifyObject$2.apply(TorrentBroadcast.scala:268) at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$blockifyObject$2.apply(TorrentBroadcast.scala:268) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1303) at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:269) at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:126) at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:88) at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:56) at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1411) |
Disabling Kryo solves the issue. To do that just set spark.serializer
to org.apache.spark.serializer.JavaSerializer
.
Other workaround is to change Kryo’s reference management, as explained on Github:
1 2 |
Kryo kryo = new Kryo(); kryo.setReferences(false); |