I started this exercise with the intention of developing a framework that would allow a compute-intensive task to be executed across multiple systems.

The main goals were:

  1. It should be easy to integrate with existing code, i.e. minimal effort to convert existing code into distributed code.
  2. It should be possible to execute the code on any computer, with any operating system.
  3. It should be compatible with C++ code, because many performance-intensive applications are written in C++.

Day 1

My first approach to implementing the distributed computing framework was to use OpenMP in conjunction with a mechanism for communication between machines. The inspiration for OpenMP came from my experience implementing high-performance, compute-intensive algorithms with it.
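As a rough illustration of why OpenMP appealed to me, here is a minimal sketch of how a compute-intensive loop can be parallelized across the cores of a single machine with a single pragma (illustrative code only, not from this project):

#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double data[N];
    double sum = 0.0;

    for (int i = 0; i < N; ++i)
        data[i] = 1.0;

    /* OpenMP splits the loop iterations across the available cores;
       the reduction clause combines the per-thread partial sums. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; ++i)
        sum += data[i] * data[i];

    printf("sum = %f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}

Compiled with something like 'gcc -fopenmp sum.c -o sum', this uses all available cores without touching the surrounding code, which is exactly the kind of low-effort parallelism I wanted to extend across machines.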

For the communication layer I started looking at RabbitMQ, as it seemed to provide the best cross-platform support and clients for multiple languages.

Day 2

After some initial research on RabbitMQ, it seemed it would not provide an ‘out-of-the-box’ solution for a distributed communication framework. (At this point I had not written a single line of code using RabbitMQ.)

I then started looking at Open MPI, and it turned out that Open MPI is designed for exactly this purpose.

Day 3

I then started writing some sample applications using Open MPI.

My primary machine is a MacBook, so I had to install Open MPI on OS X and get it compiling correctly with Xcode. It took a while, but with help from several online articles I was able to get a C program running successfully on my machine.
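For reference, a minimal Open MPI test program of the kind I was getting to run looks roughly like this (a sketch, not the exact code):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */

    char host[MPI_MAX_PROCESSOR_NAME];
    int len;
    MPI_Get_processor_name(host, &len);

    printf("Hello from rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}

Compiled with 'mpicc hello.c -o hello' and launched with 'mpirun -np 4 ./hello', Open MPI starts four copies of the program and wires them up so they can exchange messages.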

I then tried running the application on an Ubuntu system, and was immediately faced with a ‘problem’: a C/C++ program compiled on OS X cannot be executed directly on Ubuntu (or Windows) without being recompiled for that OS. This was an issue for me, as my goal was to deploy applications to remote machines for execution without knowing the remote machine’s architecture or operating system.

Day 4

Faced with the problem of producing a cross-architecture executable, I shifted my attention from C++ to Java. Because Java source is compiled to platform-independent bytecode that the JVM executes, the same class file runs on any machine with a Java runtime installed.

I wrote a Java program on OS X and tried running it on Ubuntu using Open MPI.

The command I was using was:

mpirun -np 4 --preload-files Scheduler.class -wd /home/dibba java Scheduler

Open MPI has some built-in assumptions that do not work well for my needs. It assumes that all the nodes in your network run the same OS and have essentially the same file-system layout, and that the executable is either already present on every node or, if it is copied over, can be placed at the same directory path on each node.
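To make these assumptions concrete: a typical multi-node launch uses a hostfile listing the machines, and the binary is expected to sit at the same path on every node (the node names and path below are placeholders):

# hostfile
node1 slots=4
node2 slots=4

mpirun -np 8 --hostfile hostfile /home/user/hello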

These assumptions are, for now, a problem for me.

My intention is to run an executable seamlessly across different machines, with different operating systems and different directory structures.

From the ‘mpirun’ command above you can see that we need to provide a working directory (-wd) from which the executable will be copied and executed. The problem is that my nodes do not share the same working directory for the executable, in which case we cannot use mpirun directly. Don’t get me wrong: I like Open MPI and the fact that you can send and receive information between processes seamlessly, but I will have to find a way to work around some of its design assumptions.

Day 5

Given the limitations of Open MPI, getting a distributed system running with it would require all nodes in the network to have the same OS, the same folder structure, and the same permissions. If I have to do all of that anyway, why not look at Hadoop and its MapReduce framework for running distributed jobs? So I have set up a couple of Ubuntu VMs in an attempt to build a Hadoop cluster and try running some MapReduce jobs.