Imagine that you need to load a JAR file dynamically in Zeppelin running on your EMR cluster. One easy way is to deploy the file to the instance and load it from there. But what can you do if you have almost no access to the cluster or its filesystem? You can read the JAR from S3 and load its classes dynamically with a custom classloader.
First, load the file:
    val jarBinary = sc.binaryFiles("s3://bucket/file.jar").map(_._2.toArray).collect.head
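Since the whole file ends up in driver memory as a plain byte array, it can be worth checking that the bytes really look like a JAR before handing them to a classloader. The following is a minimal sketch; `looksLikeJar` is a hypothetical helper (not part of Spark or the JDK), and `sample` is a stand-in for the `jarBinary` value above. It only checks the ZIP signature that a non-empty archive normally starts with, so it is a sanity check rather than a full validation.

```scala
// Hypothetical helper: checks for the ZIP local-file-header signature
// (the letters "PK" followed by the bytes 3 and 4) at the start of the array.
def looksLikeJar(bytes: Array[Byte]): Boolean =
  bytes.length >= 4 &&
    bytes(0) == 0x50 && bytes(1) == 0x4B && bytes(2) == 0x03 && bytes(3) == 0x04

// Stand-in for the `jarBinary` value loaded from S3 (assumption for this sketch).
val sample: Array[Byte] = Array[Byte](0x50, 0x4B, 0x03, 0x04, 0x14, 0x00)
val ok = looksLikeJar(sample)
```

In the notebook, the same check could be applied to the real data, for example with `require(looksLikeJar(jarBinary), "not a JAR")`.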
Next, implement the classloader:
    // Child-first classloader that serves classes from a JAR held in memory.
    class RemoteClassLoader(jarBytes: Array[Byte]) extends ClassLoader {
      override def loadClass(name: String, resolve: Boolean): Class[_] = {
        // Reuse the class if this loader has already defined it.
        var clazz = findLoadedClass(name)
        if (clazz != null) {
          return clazz
        }

        // Look the class up in the JAR first; a missing entry yields null,
        // in which case the parent loader takes over.
        val in = getResourceAsStream(name.replace(".", "/") + ".class")
        if (in == null) {
          return super.loadClass(name, resolve)
        }

        try {
          val out = new java.io.ByteArrayOutputStream()
          copy(in, out)
          val bytes = out.toByteArray
          clazz = defineClass(name, bytes, 0, bytes.length)
          if (resolve) {
            resolveClass(clazz)
          }
        } catch {
          // If the class cannot be defined from the JAR bytes, fall back to the parent loader.
          case _: Exception => clazz = super.loadClass(name, resolve)
        }

        clazz
      }

      override def getResource(name: String): java.net.URL = null

      // Scans the JAR from the start and returns a stream positioned at the matching entry.
      override def getResourceAsStream(name: String): java.io.InputStream = {
        try {
          val jis = new java.util.jar.JarInputStream(new java.io.ByteArrayInputStream(jarBytes))
          var entry = jis.getNextJarEntry
          while (entry != null) {
            if (entry.getName == name) {
              return jis
            }
            entry = jis.getNextJarEntry
          }
        } catch {
          case _: Exception => return null
        }
        null
      }

      // Copies the whole stream and returns the number of bytes written.
      def copy(from: java.io.InputStream, to: java.io.OutputStream): Long = {
        val buf = new Array[Byte](8192)
        var total = 0L
        var r = from.read(buf)
        while (r != -1) {
          to.write(buf, 0, r)
          total += r
          r = from.read(buf)
        }
        total
      }
    }
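The entry scan inside `getResourceAsStream` can be tried in isolation, without S3 or a real JAR. The sketch below builds a tiny archive in memory and then walks it with `JarInputStream`, in the same way the classloader does. The helper names `buildSampleJar` and `readEntry`, as well as the entry names and contents, are made up for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.jar.{JarEntry, JarInputStream, JarOutputStream}

// Builds a small JAR in memory so the example needs no external file.
// Entry names and contents are placeholders, not real class files.
def buildSampleJar(): Array[Byte] = {
  val buffer = new ByteArrayOutputStream()
  val jar = new JarOutputStream(buffer)
  for (name <- Seq("com/example/A.class", "com/example/B.class")) {
    jar.putNextEntry(new JarEntry(name))
    jar.write(Array[Byte](1, 2, 3))
    jar.closeEntry()
  }
  jar.close()
  buffer.toByteArray
}

// Scans the archive entry by entry and returns the bytes of the first
// entry whose name matches, or None when there is no such entry.
def readEntry(jarBytes: Array[Byte], entryName: String): Option[Array[Byte]] = {
  val jis = new JarInputStream(new ByteArrayInputStream(jarBytes))
  try {
    var entry = jis.getNextJarEntry
    while (entry != null) {
      if (entry.getName == entryName) {
        val out = new ByteArrayOutputStream()
        val buf = new Array[Byte](8192)
        var read = jis.read(buf)
        while (read != -1) {
          out.write(buf, 0, read)
          read = jis.read(buf)
        }
        return Some(out.toByteArray)
      }
      entry = jis.getNextJarEntry
    }
    None
  } finally {
    jis.close()
  }
}

val found = readEntry(buildSampleJar(), "com/example/B.class")
val missing = readEntry(buildSampleJar(), "com/example/C.class")
```

Note that every lookup rescans the archive from the beginning. That is fine for a handful of classes, but for a large JAR it may be worth reading all entries once into a map keyed by entry name.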
The loader reads the JAR straight from the byte array and scans its entries for the requested resource. Finally, load the class and create an instance:
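    val loader = new RemoteClassLoader(jarBinary)
    val classToLoad = Class.forName("pl.adamfurmanek.blog.SampleClass", true, loader)
    // Class.newInstance() is deprecated since Java 9; going through the constructor works on Java 8 too.
    val instance = classToLoad.getDeclaredConstructor().newInstance()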
Of course, using this instance is harder because its class was loaded by a different classloader: the notebook code cannot refer to its type directly, so you will probably need a lot of reflection.
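A rough sketch of what that reflection looks like is shown below. Because the JAR from this post is not available here, `java.util.ArrayList` stands in for the dynamically loaded class. With the real class you would use its own method names and parameter types instead.

```scala
// Stand-in for a class returned by the custom loader; java.util.ArrayList is
// used only so that the sketch runs without the JAR from the post.
val classToLoad: Class[_] = Class.forName("java.util.ArrayList")
val instance: AnyRef = classToLoad.getDeclaredConstructor().newInstance().asInstanceOf[AnyRef]

// Look methods up by name and parameter types, then invoke them on the instance.
val add = classToLoad.getMethod("add", classOf[Object])
add.invoke(instance, "hello")

// invoke returns the result as a boxed Object.
val size = classToLoad.getMethod("size").invoke(instance)
```

If the loaded class implements an interface that the parent classloader already knows (for example `java.lang.Runnable`), casting the instance to that interface avoids most of the reflection.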