Monday, May 4, 2015

Today I learned... Using Java inner classes in Scala

I needed to copy some files between two remote servers. The source and destination are both HDFS, so I would like to use DistCp. Right now, though, the servers can't even ping each other, so until I get that resolved I have to go through my local computer first.

I used Jsch (with a local private key specified) to establish SSH connections to each server.
I get the files from Hadoop onto the filesystem of the source server using hadoop fs -copyToLocal.
Next I SFTP the files to my local computer.
Next I SFTP the files to the destination server.
Finally I put the files in HDFS using hdfs dfs -put.

To do the SFTP to local I use the ls method of ChannelSftp (which is part of the Jsch library).
This returns a Vector of Objects, but all the objects are really of type LsEntry (so why didn't they just make it generic?). LsEntry is an inner class inside the ChannelSftp class.

My application is written in Scala, which is a JVM language, so Scala and Java code can reference each other in the same way that C# code and VB.NET code can use each other in Microsoft-land.

The problem was I couldn't cast those objects to LsEntry because the compiler kept complaining that com.jcraft.jsch.ChannelSftp.LsEntry was not valid. The funny thing is, when I would just type obj.asInstanceOf[LsEntry], the IDE knew what I was referring to because the tooltip asked if I was trying to use com.jcraft.jsch.ChannelSftp.LsEntry. When I said ok, it auto-added the import statement import com.jcraft.jsch.ChannelSftp.LsEntry, and then gave a "not found" error for the import it just added.

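After much head-banging (of the against-the-wall sort, not the heavy-metal rocker sort), I found out that a non-static Java inner class such as LsEntry cannot be named in a Scala import statement. (A static nested Java class can be imported as usual; it is the non-static kind that causes the trouble.)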

For Scala code to reference a Java inner class without tying it to one particular enclosing instance, you use a type projection, which is written with the hash symbol. What finally worked was:

obj.asInstanceOf[ChannelSftp#LsEntry]
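Here is a minimal sketch of the same cast that runs without Jsch on the classpath. FakeSftp and its inner Entry class are hypothetical stand-ins I made up for ChannelSftp and LsEntry; only the shape (an outer class, an inner class, and an ls method that returns an untyped Vector) mirrors the situation above. It assumes Scala 2.13 or later for scala.jdk.CollectionConverters.

```scala
import java.util.{Vector => JVector}
import scala.jdk.CollectionConverters._

// Hypothetical stand-in for ChannelSftp: an outer class with an inner
// entry class and an ls method that hands back an untyped Vector.
class FakeSftp {
  class Entry(val filename: String)

  def ls(path: String): JVector[AnyRef] = {
    val entries = new JVector[AnyRef]()
    entries.add(new Entry(s"$path/a.txt"))
    entries.add(new Entry(s"$path/b.txt"))
    entries
  }
}

object LsDemo {
  val sftp = new FakeSftp

  // Each element is statically just AnyRef, so cast it with the
  // type projection Outer#Inner before reading its fields.
  val names: List[String] =
    sftp.ls("/data").asScala.toList
      .map(_.asInstanceOf[FakeSftp#Entry].filename)
}
```

With the real library the cast would be to ChannelSftp#LsEntry instead of FakeSftp#Entry; everything else about the stand-in is an assumption for illustration.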

Apparently the reason is that, whereas Java treats an inner class as a member of the enclosing class, Scala treats it as a member of each object of the enclosing class (a path-dependent type). So by default, if you create two enclosing objects of the same class and then create an inner-class object under each, those two inner objects will not have the same type, because they belong to different enclosing objects. The # gives you the inner class as defined under the class itself, regardless of which enclosing object it came from, which is how Java sees it.
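To make that concrete, here is a small pure-Scala sketch. Outer, Inner, and PathDemo are hypothetical names chosen for illustration, not anything from Jsch.

```scala
class Outer {
  class Inner(val label: String)
  def make(label: String): Inner = new Inner(label)
}

object PathDemo {
  val a = new Outer
  val b = new Outer

  // Each inner object's type is tied to the enclosing object it came from.
  val fromA: a.Inner = a.make("from a")
  val fromB: b.Inner = b.make("from b")

  // val wrong: a.Inner = fromB   // does not compile: type mismatch

  // Outer#Inner is a supertype of both a.Inner and b.Inner,
  // so a single list can hold inner objects from any Outer.
  val all: List[Outer#Inner] = List(fromA, fromB)
  val labels: List[String] = all.map(_.label)
}
```

The commented-out line is the interesting part: uncommenting it should give a type mismatch, because b.Inner is not a.Inner even though both come from the same class definition.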

I might actually use this property to restructure my program... Instead of having Graph and Node at the same level, I might make Node an inner class of Graph, thus guaranteeing that all Node operations happen in the same Graph (e.g., a Node addChild method that takes in another node would only accept another node in the same graph).
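A rough sketch of what that restructuring could look like, assuming a mutable child list and hypothetical names (newNode, children) that are not from my actual program:

```scala
import scala.collection.mutable.ListBuffer

class Graph(val name: String) {
  class Node(val id: Int) {
    private val kids = ListBuffer.empty[Node]

    // Inside Graph, plain "Node" means the Node type of this particular
    // Graph instance, so a node from another graph is rejected at compile time.
    def addChild(child: Node): Unit = {
      kids += child
      ()
    }

    def children: List[Node] = kids.toList
  }

  def newNode(id: Int): Node = new Node(id)
}

object GraphDemo {
  val g1 = new Graph("g1")
  val g2 = new Graph("g2")

  val root = g1.newNode(1)
  val leaf = g1.newNode(2)
  val stranger = g2.newNode(3)

  root.addChild(leaf)
  // root.addChild(stranger)   // does not compile: stranger is a g2.Node

  val childIds: List[Int] = root.children.map(_.id)
}
```

If some operation ever needs to accept nodes from any graph, its parameter can be declared as Graph#Node instead of Node.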

2 comments:

  1. Brilliant thanks! I was having the same issue from scala with the ls method and obj.asInstanceOf[ChannelSftp#LsEntry] was exactly what I was looking for!

  2. Thank you, I just encountered the exact same problem with Jsch and LsEntry. This solved it for me :)
