docker-hadoop-spark-workbench

[EXPERIMENTAL] This repo contains deployment instructions for running HDFS and Spark inside Docker containers. It also includes spark-notebook and an HDFS FileBrowser (Hue).

  • Owner: big-data-europe/docker-hadoop-spark-workbench

How to use HDFS/Spark Workbench

To start an HDFS/Spark Workbench:

    docker-compose up -d

docker-compose cannot be used to scale up the spark-workers. For a distributed setup, see the swarm folder.

Starting workbench with Hive support

Start the services in stages. Before running each command, check that the services started by the previous command are running correctly (with docker logs servicename).

    docker-compose -f docker-compose-hive.yml up -d namenode hive-metastore-postgresql
    docker-compose -f docker-compose-hive.yml up -d datanode hive-metastore
    docker-compose -f docker-compose-hive.yml up -d hive-server
    docker-compose -f docker-compose-hive.yml up -d spark-master spark-worker spark-notebook hue
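
The staged startup above could be wrapped in a small script. The following is a minimal sketch, not part of the repository: the function name start_hive_workbench and the variables COMPOSE, HIVE_COMPOSE_FILE and PAUSE are assumptions introduced for this example. It only pauses between stages and does not verify service health, so checking docker logs servicename during the pause is still needed.

```shell
#!/bin/sh
# Sketch only: start the Hive-enabled workbench in the documented order.
# COMPOSE, HIVE_COMPOSE_FILE and PAUSE are hypothetical knobs for this example.
COMPOSE="${COMPOSE:-docker-compose}"
HIVE_COMPOSE_FILE="${HIVE_COMPOSE_FILE:-docker-compose-hive.yml}"
PAUSE="${PAUSE:-15}"

# Bring up one group of services per stage, pausing after each stage so the
# logs can be inspected before the next group is started.
start_hive_workbench() {
  for services in \
    "namenode hive-metastore-postgresql" \
    "datanode hive-metastore" \
    "hive-server" \
    "spark-master spark-worker spark-notebook hue"
  do
    # $services is intentionally unquoted so each name becomes its own argument.
    $COMPOSE -f "$HIVE_COMPOSE_FILE" up -d $services || return 1
    sleep "$PAUSE"
  done
}
```

After sourcing the file, calling start_hive_workbench runs the four commands in order and stops at the first one that fails. A longer pause can be chosen with, for example, PAUSE=60.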

Interfaces

  • Namenode: http://localhost:50070
  • Datanode: http://localhost:50075
  • Spark-master: http://localhost:8080
  • Spark-notebook: http://localhost:9001
  • Hue (HDFS Filebrowser): http://localhost:8088/home
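
To see quickly which of these interfaces respond, each URL can be probed once with curl. This is a rough sketch under the assumption that the workbench runs on localhost with the ports listed above; the function names workbench_urls and check_interfaces and the CURL override variable are hypothetical and not part of the repository.

```shell
#!/bin/sh
# Sketch only: probe each documented web UI once.
# CURL is a hypothetical override hook; by default the real curl is used.
CURL="${CURL:-curl}"

# Print "name url" pairs for the interfaces listed in this README.
workbench_urls() {
  cat <<'EOF'
Namenode http://localhost:50070
Datanode http://localhost:50075
Spark-master http://localhost:8080
Spark-notebook http://localhost:9001
Hue http://localhost:8088/home
EOF
}

# Report for every interface whether an HTTP request succeeded.
check_interfaces() {
  workbench_urls | while read -r name url; do
    # -s hides progress output, -f fails on HTTP errors, -o discards the body.
    if $CURL -sf -o /dev/null --max-time 5 "$url"; then
      echo "$name: up ($url)"
    else
      echo "$name: not reachable ($url)"
    fi
  done
}
```

Calling check_interfaces prints one line per interface. A service reported as not reachable may simply still be starting, so docker logs servicename remains the more reliable check.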

Important

When opening Hue, you might encounter the error NoReverseMatch: u'about' is not a registered namespace after login. I disabled the 'about' page (which is the default landing page) because it caused the Docker container to hang. To access Hue when you see this error, append /home to the URI: http://docker-host-ip:8088/home
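
When the Docker host is not localhost, the workaround URL can be built from the host address. A tiny sketch follows; hue_home_url is a hypothetical helper introduced here, and port 8088 is taken from the Interfaces list.

```shell
#!/bin/sh
# Sketch only: build the Hue URL that skips the disabled 'about' page.
# hue_home_url is a hypothetical helper; the host defaults to localhost.
hue_home_url() {
  host="${1:-localhost}"
  printf 'http://%s:8088/home\n' "$host"
}
```

For example, hue_home_url docker-host-ip prints the address to open in the browser.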

Docs

Count Example for Spark Notebooks

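    import org.apache.spark.sql.SparkSession

    // Reuse the existing session if there is one, otherwise create it.
    val spark = SparkSession
      .builder()
      .appName("Simple Count Example")
      .getOrCreate()

    // Read /data.csv as a Dataset[String] (one element per line) and count the lines.
    val tf = spark.read.textFile("/data.csv")
    tf.count()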

Maintainer

  • Ivan Ermilov @earthquakesan

Note: this repository was part of the BDE H2020 EU project and is no longer actively maintained by the project participants.

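Main metrics

Overview
  • Name and owner: big-data-europe/docker-hadoop-spark-workbench
  • Primary language: Makefile
  • Languages: Shell (language count: 2)

Owner activity
  • Created: 2016-03-21 22:26:31
  • Last push: 2020-10-01 11:30:09
  • Last commit: 2018-10-05 17:14:39
  • Releases: 0

User engagement
  • Stars: 693
  • Watchers: 38
  • Forks: 377
  • Commits: 51
  • Issues: 60
  • Open issues: 18
  • Pull requests: 5
  • Open pull requests: 2
  • Closed pull requests: 5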