Profiling Java programs on OS X

Sounds easy, doesn’t it? Well, it actually is quite simple, but the error messages along the way can really trip you up! My first attempt at profiling a bit of code was to use the full-boat Eclipse stack: the Eclipse Test & Performance Tools Platform Project! Well, what they don’t tell you anywhere on the project page is that it’s only supported on Windows and Linux. A Mac port was started sometime around 2004 and never completed. Yeah, it’s been that long! ...

May 19, 2011 · 2 min · chetan

Distributing JARs for Map/Reduce jobs via HDFS

Hadoop has a built-in feature for easily distributing JARs to your worker nodes via HDFS but, unfortunately, it’s broken. There are a couple of tickets open with a patch against 0.18 and 0.21 (trunk), but for some reason they still haven’t been committed. We’re currently running 0.20, so the patch does me no good anyway. So here’s my simple solution: I essentially copied the technique used by ToolRunner when you pass a “libjars” argument on the command line. You simply pass the function the HDFS paths to the JAR files you want included and it’ll take care of the rest. ...

December 30, 2010 · 1 min · chetan

Using Hadoop's DistributedCache

Using Hadoop’s DistributedCache mechanism is fairly straightforward but, as I’m finding is common with everything Hadoop, not very well documented.

Adding files

When setting up your Job configuration:

    // Create symlinks in the job's working directory using the link name
    // provided below
    DistributedCache.createSymlink(conf);

    // Add a file to the cache. It must already exist on HDFS. The text
    // after the hash is the link name.
    DistributedCache.addCacheFile(
        new URI("hdfs://localhost:9000/foo/bar/baz.txt#baz.txt"), conf);

Accessing files

Now that we’ve cached our file, let’s access it: ...
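A quick aside on the "text after the hash" convention used when adding the file: that text is simply the fragment of the URI. The sketch below uses only `java.net.URI` from the JDK to show how such a URI splits into the HDFS path and the link name. It does not touch Hadoop at all, and the class and method names (`CacheLinkName`, `linkName`, `hdfsPath`) are made up for illustration.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop: shows how a cache URI splits
// into the file path and the symlink name (the URI fragment).
class CacheLinkName {

    // Returns the text after '#', or null if the URI has no fragment.
    static String linkName(String cacheUri) {
        return URI.create(cacheUri).getFragment();
    }

    // Returns the path portion of the URI, without scheme, host, or fragment.
    static String hdfsPath(String cacheUri) {
        return URI.create(cacheUri).getPath();
    }

    public static void main(String[] args) {
        String uri = "hdfs://localhost:9000/foo/bar/baz.txt#baz.txt";
        System.out.println("link name: " + linkName(uri));
        System.out.println("HDFS path: " + hdfsPath(uri));
    }
}
```

Since the link name is just the fragment, it does not have to match the file name on HDFS; a URI ending in `#lookup.txt` would parse to a link name of `lookup.txt`.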

December 28, 2010 · 1 min · chetan

Launching srchmvn.com

Today I’m launching my latest personal project, srchmvn.com, to help Java developers find Maven artifacts. It’s also the first project I’ve finished* and released in a very long time. The Problem While Maven is, at its core, a build system, one of the most valuable features it offers is its centralized repository and transitive dependency management for your projects. You can simply include an artifact definition and Maven will, at build time, download and provide not only the selected artifact, but all its dependencies as well. ...

March 31, 2010 · 3 min · chetan

Using MySQL with JRuby

For some reason, I had a relatively hard time finding this info all in one spot, so here it is: Using MySQL with JRuby is actually pretty easy (and no annoying arch issues on OS X! :-)

Install GEMs (DBI, JDBC driver, DBI adapter):

    $ jgem install dbi jdbc-mysql dbd-jdbc

Then use it!

    require 'dbi'
    require 'jdbc/mysql'
    dbh = DBI.connect('dbi:jdbc:mysql://localhost:3306/test', 'root', '',
                      { "driver" => "com.mysql.jdbc.Driver" })

February 24, 2010 · 1 min · chetan

Writing command line interfaces for Spring apps

I recently needed to script some tasks for a Spring-based app at work so we could shove it into a crontab. It proved to be much easier than I thought. You can use your spring.xml config file for wiring up your beans as usual, but rather than deal with various property files you can easily override properties on the fly using system properties. See the following example: In your spring.xml make sure you have this line: ...
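To illustrate the general idea of overriding configured values with system properties, here is a minimal plain-Java sketch. It is not the Spring configuration the post goes on to describe; it only shows defaults being overlaid by whatever is passed with `-D` on the command line. The class name `CliProperties` and the keys `db.host` and `batch.size` are hypothetical.

```java
import java.util.Properties;

// Hypothetical illustration in plain Java (no Spring): values passed as
// -Dkey=value system properties take precedence over built-in defaults.
class CliProperties {

    // Copies the defaults, then lets any matching override replace them.
    static Properties withOverrides(Properties defaults, Properties overrides) {
        Properties merged = new Properties();
        merged.putAll(defaults);
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        // Stand-ins for values that would normally live in a property file.
        Properties defaults = new Properties();
        defaults.setProperty("db.host", "localhost");
        defaults.setProperty("batch.size", "100");

        Properties effective = withOverrides(defaults, System.getProperties());
        System.out.println("db.host    = " + effective.getProperty("db.host"));
        System.out.println("batch.size = " + effective.getProperty("batch.size"));
    }
}
```

Running it as `java -Dbatch.size=500 CliProperties` from a crontab entry would pick up the overridden batch size while leaving `db.host` at its default.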

June 30, 2009 · 1 min · chetan