hadoop-cluster-docker

Run Hadoop Cluster within Docker Containers

  • Owner: kiwenlau/hadoop-cluster-docker
  • License: Apache License 2.0




3 Nodes Hadoop Cluster

1. Pull the Docker image
sudo docker pull kiwenlau/hadoop:1.0
2. Clone the GitHub repository
git clone https://github.com/kiwenlau/hadoop-cluster-docker
3. Create the hadoop network
sudo docker network create --driver=bridge hadoop
4. Start the containers
cd hadoop-cluster-docker
sudo ./start-container.sh

output:

start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
root@hadoop-master:~# 
  • The script starts 3 containers: 1 master and 2 slaves.
  • You are then dropped into the /root directory of the hadoop-master container, where steps 5 and 6 are run.
5. Start Hadoop
./start-hadoop.sh
6. Run wordcount
./run-wordcount.sh

output:

input file1.txt:
Hello Hadoop

input file2.txt:
Hello Docker

wordcount output:
Docker    1
Hadoop    1
Hello    2
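
The word count in step 6 can be approximated locally to see what the job computes. The following is a minimal sketch, assuming the job splits its input on whitespace like the standard Hadoop WordCount example. count_words is a hypothetical helper that is not part of the repository, and it runs on the local machine rather than on the cluster.

```shell
#!/usr/bin/env bash
# Local approximation of a whitespace-tokenizing word count.
# count_words is a hypothetical helper, not part of hadoop-cluster-docker.
count_words() {
    # Read the given files (or stdin when no file is given),
    # put one word per line, drop empty lines, sort so equal words
    # are adjacent, count each group, and print "word<TAB>count".
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Recreate the two sample inputs in a temporary directory.
workdir=$(mktemp -d)
echo "Hello Hadoop" > "$workdir/file1.txt"
echo "Hello Docker" > "$workdir/file2.txt"

count_words "$workdir/file1.txt" "$workdir/file2.txt"

rm -rf "$workdir"
```

Run against the two sample inputs, this yields the same three counts as the wordcount output above, separated by tabs.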

Arbitrary size Hadoop cluster

1. Pull the Docker image, clone the GitHub repository, and create the hadoop network

Follow steps 1 to 3 of the 3-node cluster section above.

2. Rebuild the Docker image
sudo ./resize-cluster.sh 5
  • The parameter must be greater than 1, for example 2, 3, ...
  • The script only rebuilds the hadoop image with a different slaves file, which specifies the names of all slave nodes.
3. Start the containers
sudo ./start-container.sh 5
  • Use the same parameter as in step 2.
4. Run the Hadoop cluster

Follow steps 5 and 6 of the 3-node cluster section above.
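
To make the role of the slaves file concrete, here is a minimal sketch of how its contents could be generated for a given cluster size. It rests on two assumptions: the parameter counts every node including the master (the 3-node cluster has 1 master and 2 slaves), and slaves are named hadoop-slave1, hadoop-slave2, and so on, as in the start-container.sh output above. generate_slaves is a hypothetical helper, not the code of resize-cluster.sh.

```shell
#!/usr/bin/env bash
# Print the slave host names for an N-node cluster, one per line.
# Assumption: N includes the master, so there are N-1 slaves.
# generate_slaves is a hypothetical helper, not part of the repository.
generate_slaves() {
    local n=$1
    # Reject missing, non-numeric, or too-small values (N must be > 1).
    if ! [[ "$n" =~ ^[0-9]+$ ]] || [ "$n" -lt 2 ]; then
        echo "usage: generate_slaves N (N must be greater than 1)" >&2
        return 1
    fi
    local i
    for (( i = 1; i < n; i++ )); do
        echo "hadoop-slave$i"
    done
}

# Slave names for a 5-node cluster (1 master and 4 slaves).
generate_slaves 5
```

Redirecting the output to a file would give a slaves file in the one-host-per-line form that Hadoop reads. The path of that file inside the image is not stated here, so it is left out.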

Main metrics

Overview
  • Name with owner: kiwenlau/hadoop-cluster-docker
  • Primary language: Shell
  • Program language: Shell (language count: 1)
  • License: Apache License 2.0

Owner activity
  • Created at: 2015-05-21 12:15:22
  • Pushed at: 2024-07-01 21:01:04
  • Last commit at: 2017-06-07 09:19:44
  • Release count: 1
  • Last release name: 0.1.0
  • First release name: 0.1.0

User engagement
  • Stargazers count: 1.8k
  • Watchers count: 89
  • Fork count: 863
  • Commits count: 88
  • Issues count: 70
  • Issue open count: 40
  • Pull requests count: 5
  • Pull requests open count: 8
  • Pull requests close count: 8