Ember
Ember provides a solution for running Ambari and Cloudera Manager clusters in Docker (HDP, CDH, and HDF). It was designed to streamline training, testing, and development by enabling multi-node dev/test clusters to be installed on a single machine with minimal resource requirements.
Update January 24, 2019
- Rebranding to Ember
- Cloudera Manager and CDH Support
- Improved port mapping
- Updated to latest Ambari and HDP versions
Pre-built Images
Pre-built versions of the single-node samples have been published to Docker Hub. They can be configured with their respective ini files and launched with the following commands:
./ember.sh createFromPrebuiltSample samples/yarnquickstart/yarnquickstart-sample.ini
./ember.sh createFromPrebuiltSample samples/hadoopkafka/hadoopkafka-sample.ini
./ember.sh createFromPrebuiltSample samples/druidkafka/druidkafka-sample.ini
./ember.sh createFromPrebuiltSample samples/hivespark/hivespark-sample.ini
./ember.sh createFromPrebuiltSample samples/nifiNode/nifiNode-sample.ini
./ember.sh createFromPrebuiltSample samples/cm_essentials/essentials-sample.ini
Docker images are composed of layers that can be shared by other images. This allows for a great reduction in the total size of images on disk and over the network. Ember's pre-built images are structured to take as much advantage of this feature as possible.
The following diagram shows how the images are built on top of each other. For example, Ambari Agent + Ambari Server + YarnQuickstart + HadoopKafka + DruidKafka is a total of 6.09 GB in size, but each of the five layers is less than 3 GB and can be reused independently by other containers.
Prerequisites
- 8GB RAM and 30GB disk is recommended for the threeNode sample configuration. 4GB RAM or less is viable for smaller clusters.
- Docker 17+
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce
systemctl start docker
systemctl enable docker
- (Optional) Configure for External Network Access to Nodes
- Add multiple IPs to Host OS (N+1 for N nodes)
- Use the interface from VMWare, VirtualBox, or the cloud provider to add extra network adaptors to the VM
- For example, the threeNode-sample configuration can use 4 IPs: 1 for host, 3 for the cluster.
- Limit SSH on the host VM to listen on a single IP. By default, SSH listens on 0.0.0.0.
- Edit sshd_config
vi /etc/ssh/sshd_config
- Add the following line with the IP address for the host OS:
ListenAddress <IP Address>
- Restart sshd
service sshd restart
- Enable IPv4 forwarding
sysctl -w net.ipv4.ip_forward=1
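Note that `sysctl -w` only changes the setting for the running kernel. To keep forwarding enabled across reboots, the standard approach is to set it in a sysctl configuration file (the file name below is an arbitrary choice; any `.conf` file under `/etc/sysctl.d/` is read at boot):

```
# /etc/sysctl.d/99-ipforward.conf  (example file name)
net.ipv4.ip_forward = 1
```

The setting can be applied without a reboot by running `sysctl --system`.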
Configuration
An .ini file is required to define hostnames and a cluster name. An external IP list can be defined to allow external access to the containers.
HDP/HDF and CDH can be installed manually or through Ambari Blueprints/CM templates. Example blueprint and template files are provided in the samples folder.
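As a purely hypothetical sketch of what such a file covers (the section and key names below are assumptions, not Ember's documented format — consult the files under `samples/` for the real syntax), a minimal configuration defines a cluster name, the node hostnames, and optionally the external IPs:

```
; Hypothetical sketch only -- key names are illustrative assumptions.
; See the sample .ini files in the samples/ folder for the actual format.
[cluster]
name = devcluster

[nodes]
node1.devcluster = 192.168.56.101
node2.devcluster = 192.168.56.102
node3.devcluster = 192.168.56.103
```

If no external IP list is given, the containers remain reachable only from the host.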