hadoop-cluster-docker

  • Owner: kiwenlau/hadoop-cluster-docker
  • License: Apache License 2.0

Run Hadoop Cluster within Docker Containers

3 Nodes Hadoop Cluster

1. pull docker image
sudo docker pull kiwenlau/hadoop:1.0
2. clone github repository
git clone https://github.com/kiwenlau/hadoop-cluster-docker
3. create hadoop network
sudo docker network create --driver=bridge hadoop
4. start container
cd hadoop-cluster-docker
sudo ./start-container.sh

output:

start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
root@hadoop-master:~# 
  • this starts 3 containers: 1 master and 2 slaves
  • you are dropped into the /root directory of the hadoop-master container
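
The output above suggests one container per node, named hadoop-master, hadoop-slave1, and so on. As a rough illustration only, the sketch below prints the kind of `docker run` commands a start script could issue for a given node count. The function name `print_start_commands` and the chosen flags are assumptions and not the contents of `start-container.sh`; the image tag and network name are taken from steps 1 and 3.

```shell
#!/usr/bin/env bash
# Hypothetical dry run: print, but do not execute, one `docker run` per node.
# Assumption: the first node is the master and the remaining N-1 are slaves.
print_start_commands() {
  local n="${1:-3}" i=0 name   # n = total number of nodes, defaults to 3
  while [ "$i" -lt "$n" ]; do
    if [ "$i" -eq 0 ]; then
      name="hadoop-master"
    else
      name="hadoop-slave${i}"
    fi
    # Each container joins the "hadoop" bridge network created in step 3.
    echo "docker run -itd --net=hadoop --name ${name} --hostname ${name} kiwenlau/hadoop:1.0"
    i=$((i + 1))
  done
}

print_start_commands 3
```

Because the function only echoes commands, it is safe to run on a machine without Docker to see what naming scheme a given node count would produce.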
5. start hadoop
./start-hadoop.sh
6. run wordcount
./run-wordcount.sh

output:

input file1.txt:
Hello Hadoop

input file2.txt:
Hello Docker

wordcount output:
Docker    1
Hadoop    1
Hello    2
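
To make the expected result easier to check, the counting logic can be reproduced locally with standard Unix tools. This is only a sketch of what the job computes for the two input files shown above, not how Hadoop or `run-wordcount.sh` does it; the function name `wordcount` is made up for this example.

```shell
#!/usr/bin/env bash
# Local illustration of the counting logic only; no Hadoop is involved.
wordcount() {
  # Put one word per line, drop empty lines, sort so duplicates are adjacent,
  # count each run of duplicates, then print "word<TAB>count".
  tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

# Same content as file1.txt and file2.txt above.
printf 'Hello Hadoop\nHello Docker\n' | wordcount
```

For this input the sketch lists Docker and Hadoop once each and Hello twice, matching the wordcount output above.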

Arbitrary size Hadoop cluster

1. pull docker image, clone github repository and create hadoop network

follow steps 1~3 of the 3 Nodes Hadoop Cluster section

2. rebuild docker image
sudo ./resize-cluster.sh 5
  • specify a parameter greater than 1: 2, 3, ...
  • this script just rebuilds the hadoop image with a different slaves file, which specifies the names of all slave nodes
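
As an illustration of what a regenerated slaves file might contain, the sketch below prints one slave hostname per line. It assumes the parameter is the total node count including the master (consistent with the default of 3 nodes giving 2 slaves) and that slaves follow the hadoop-slaveN naming seen in the start output; `make_slaves` is a hypothetical helper, not part of `resize-cluster.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the slave hostnames for a cluster of N nodes.
# Assumption: N counts the master too, so N nodes means N-1 slaves.
make_slaves() {
  local n="$1" i=1
  while [ "$i" -lt "$n" ]; do
    echo "hadoop-slave${i}"
    i=$((i + 1))
  done
}

# For 5 nodes this prints hadoop-slave1 through hadoop-slave4.
make_slaves 5
```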
3. start container
sudo ./start-container.sh 5
  • use the same parameter as in step 2
4. run hadoop cluster

follow steps 5~6 of the 3 Nodes Hadoop Cluster section
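
Since steps 2 and 3 must receive the same parameter and it must be greater than 1, a small wrapper can enforce both rules. The sketch below is one possible way to do that; `is_valid_node_count` and `resize_and_start` are hypothetical names, and the wrapper assumes it is run from the hadoop-cluster-docker directory.

```shell
#!/usr/bin/env bash
# Hypothetical guard: accept only whole numbers greater than 1.
is_valid_node_count() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$1" -gt 1 ]
}

# Hypothetical wrapper: pass one validated value to both scripts.
resize_and_start() {
  local n="$1"
  if ! is_valid_node_count "$n"; then
    echo "node count must be an integer greater than 1" >&2
    return 1
  fi
  # Rebuild the image first, then start the containers with the same count.
  sudo ./resize-cluster.sh "$n" && sudo ./start-container.sh "$n"
}
```

Calling `resize_and_start 5` would then run both scripts with 5, while `resize_and_start 1` stops with an error before touching Docker.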

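Key metrics

Overview
  • Name and owner: kiwenlau/hadoop-cluster-docker
  • Primary language: Shell
  • Languages: Shell (1 language)
  • License: Apache License 2.0

Owner activity
  • Created: 2015-05-21 12:15:22
  • Last push: 2024-07-01 21:01:04
  • Last commit: 2017-06-07 09:19:44
  • Releases: 1
  • Latest release: 0.1.0
  • First release: 0.1.0

User engagement
  • Stars: 1.8k
  • Watchers: 89
  • Forks: 863
  • Commits: 88
  • Issues: 70
  • Open issues: 40
  • Pull requests: 5
  • Open pull requests: 8
  • Closed pull requests: 8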