A selection of Hadoop Docker Images

Posted on

When it comes to big data platforms one of the biggest challenges is getting a test environment setup where you can try out the various components. There are a few approaches to doing this this. The first is to setup your own virtual machine or some other container with the software. But this can be challenging to get just a handful of big data applications/software to work on one machine.

But there is an alternative approach. You can use one of the preconfigured environments from the likes of AWS, Google, Azure, Oracle, etc. But in most cases these come with a cost. Maybe not in the beginning but after a little us you will need to start handing over some dollars. But these require you to have access to the cloud i.e. wifi, to run these. Again not always possible!

So what if you want to have a local big data and Hadoop environment on your own PC or laptop or in your home or office test lab? There ware a lot of Virtual Machines available. But most of these have a sizeable hardware requirement. Particularly for memory, with many requiring 16+G of RAM ! Although in more recent times this might not be a problem but for many it still is. Your machines do not have that amount or your machine doesn’t allow you to upgrade.

What can you do?

Have you considered using Docker? There are many different Hadoop Docker images available and these are not as resource or hardware hungry, unlike the Virtual Machines.

Here is a list of some that I’ve tried out and you might find them useful.

Cloudera QuickStart image

You may have tried their VM, now go dry the Cloudera QuickStart docker image.

Read about it here.

Check our Docker Hub for lots and lots of images.

Docker Hub is not the only place to get Hadoop Docker images. There are lots on GitHub
Just do a quick Google search to find the many, many, many images.

These Docker Hadoop images are a great way for you to try out these Big Data platforms and environments with the minimum of resources.