hadoop-cdh-pseudo-docker

  • Owner: chali/hadoop-cdh-pseudo-docker
  • Platform:
  • License:: Apache License 2.0
  • Category::
  • Topic:
  • Like:
    0
      Compare:

Github stars Tracking Chart

CDH 5 pseudo-distributed cluster Docker image

Do you develop Hadoop mapreduce applications on top of Cloudera distribution? This docker image can help you. It contains basic CDH 5 setup with YARN. You can use it for developmeent and verification of your code in local environment without messing up your system with Hadoop instalation.

Docker image was prepared according to Installing CDH 5 with YARN on a Single Linux Node in Pseudo-distributed mode with a few adjustments for Docker environment.

Installed services
  • HDFS
  • YARN
  • JobHistoryServer
  • Oozie
  • Hue
  • Spark (installation for execution on top of YARN)

Execution

Get docker image

docker pull chalimartines/cdh5-pseudo-distributed

Run image with specified port mapping

docker run --name cdh -d -p 8020:8020 -p 50070:50070 -p 50010:50010 -p 50020:50020 -p 50075:50075 -p 8030:8030 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 8088:8088 -p 8040:8040 -p 8042:8042 -p 10020:10020 -p 19888:19888 -p 11000:11000 -p 8888:8888 -p 18080:18080 -p 9999:9999 chalimartines/cdh5-pseudo-distributed

Or you can use docker-compose configuration from here

If you are Mac OS user with docker machine with virtualbox driver and you would like to get from your local system to a cdh container add these port forwardings. Name of virtual machine is equal to your docker machine name (here it is "dev").

VBoxManage modifyvm "dev" --natpf1 "tcp-port8020,tcp,,8020,,8020"
VBoxManage modifyvm "dev" --natpf1 "tcp-port50070,tcp,,50070,,50070"
VBoxManage modifyvm "dev" --natpf1 "tcp-port50010,tcp,,50010,,50010"
VBoxManage modifyvm "dev" --natpf1 "tcp-port50020,tcp,,50020,,50020"
VBoxManage modifyvm "dev" --natpf1 "tcp-port50075,tcp,,50075,,50075"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8030,tcp,,8030,,8030"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8031,tcp,,8031,,8031"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8032,tcp,,8032,,8032"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8033,tcp,,8033,,8033"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8088,tcp,,8088,,8088"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8040,tcp,,8040,,8040"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8042,tcp,,8042,,8042"
VBoxManage modifyvm "dev" --natpf1 "tcp-port10020,tcp,,10020,,10020"
VBoxManage modifyvm "dev" --natpf1 "tcp-port19888,tcp,,19888,,19888"
VBoxManage modifyvm "dev" --natpf1 "tcp-port11000,tcp,,11000,,11000"
VBoxManage modifyvm "dev" --natpf1 "tcp-port8888,tcp,,8888,,8888"
VBoxManage modifyvm "dev" --natpf1 "tcp-port9999,tcp,,9999,,9999"
VBoxManage modifyvm "dev" --natpf1 "tcp-port18080,tcp,,18080,,18080"

UI entry points

Those urls consider port forwarding from localhost.

  • name node - http://localhost:50070
  • resource manager - http://localhost:8088
  • job history server - http://localhost:19888
  • oozie console - http://localhost:11000
  • hue - http://localhost:8888
  • spark history server - http://localhost:18080

Hue login

You will be asked to create account during the first login. You can pick your prefered username and password. It will create home folder on HDFS and it can be used as hadoop user.

Mac OS and Docker in Virtualbox resource setting.

This docker image doesn't completely follow philosophy (one process = one image), but it prefers developer convenience to easily set up the whole hadoop dev stack. It comes with price and I recommend add resources to your virtualbox machine. I use 4 cores and 8GB RAM.

Custom port for your usecases

This image has exposed one port (9999). It is not used by any currently running service. It can be used by you for example when you need to attach debugger to running mapreduce job. So your mapreduce job can start debugging server on this port.

Main metrics

Overview
Name With Ownerchali/hadoop-cdh-pseudo-docker
Primary LanguageShell
Program languageShell (Language Count: 1)
Platform
License:Apache License 2.0
所有者活动
Created At2014-07-11 20:32:17
Pushed At2017-05-01 03:47:47
Last Commit At2017-04-30 20:41:15
Release Count3
Last Release Name1.2 (Posted on 2015-04-09 15:51:06)
First Release Name1.0 (Posted on 2014-07-12 11:24:00)
用户参与
Stargazers Count47
Watchers Count15
Fork Count32
Commits Count41
Has Issues Enabled
Issues Count1
Issue Open Count0
Pull Requests Count4
Pull Requests Open Count0
Pull Requests Close Count0
项目设置
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private