hdfs

A native go client for HDFS

  • Owner: colinmarc/hdfs
  • License: MIT License


HDFS for Go


This is a native Go client for HDFS. It connects directly to the namenode using
the protocol buffers API.

It tries to be idiomatic by aping the stdlib os package, where possible, and
implements the interfaces from it, including os.FileInfo and os.PathError.
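Because failures are reported through the standard error types, error handling written for the os package is meant to carry over. The sketch below uses the local filesystem as a stand-in for an HDFS client, so it runs without a cluster. explain and statLocal are hypothetical helper names, and the assumption is that errors returned by the HDFS client can be inspected in the same way.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// explain turns an error from an os-style call into a short description.
func explain(err error) string {
	var pathErr *os.PathError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, os.ErrNotExist):
		return "not found"
	case errors.Is(err, os.ErrPermission):
		return "permission denied"
	case errors.As(err, &pathErr):
		// Any other path-related failure: report the operation and the path.
		return fmt.Sprintf("%s failed on %s", pathErr.Op, pathErr.Path)
	default:
		return "unexpected error"
	}
}

// statLocal calls os.Stat on the local filesystem, standing in for a
// Stat call on an HDFS client (an assumption made so the example runs anywhere).
func statLocal(path string) string {
	_, err := os.Stat(path)
	return explain(err)
}

func main() {
	fmt.Println(statLocal(os.TempDir()))
	fmt.Println(statLocal("/no/such/path/for/this/example"))
}
```

The checks go through errors.Is and errors.As rather than comparing error values directly, so they keep working when the error is wrapped.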

Here's what it looks like in action:

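client, err := hdfs.New("namenode:8020")
if err != nil {
	log.Fatal(err)
}

file, err := client.Open("/mobydick.txt")
if err != nil {
	log.Fatal(err)
}
defer file.Close()
// Read 59 bytes starting at byte offset 48847.
buf := make([]byte, 59)
if _, err := file.ReadAt(buf, 48847); err != nil {
	log.Fatal(err)
}

fmt.Println(string(buf))
// => Abominable are the tumblers into which he pours his poison.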

For complete documentation, check out the Godoc.
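The ReadAt call in the snippet has the same shape as the standard io.ReaderAt interface: it fills a buffer starting at a byte offset instead of advancing a read cursor. Here is a minimal sketch of that pattern, using an in-memory reader in place of an HDFS file so that it runs anywhere. readSpan and newDemoSource are hypothetical helpers, not part of the library.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readSpan reads n bytes starting at byte offset off.
// On a short read it returns the bytes it got together with the error.
func readSpan(r io.ReaderAt, off int64, n int) (string, error) {
	buf := make([]byte, n)
	read, err := r.ReadAt(buf, off)
	if read == n {
		// io.ReaderAt may report io.EOF alongside a full read;
		// the requested data is complete, so the error is dropped.
		return string(buf), nil
	}
	return string(buf[:read]), err
}

// newDemoSource is an in-memory stand-in for an opened HDFS file.
func newDemoSource(text string) io.ReaderAt {
	return strings.NewReader(text)
}

func main() {
	src := newDemoSource("Call me Ishmael.")

	span, err := readSpan(src, 8, 7)
	fmt.Printf("%q %v\n", span, err)

	// Asking for more bytes than remain yields a short read and an error.
	span, err = readSpan(src, 8, 20)
	fmt.Printf("%q %v\n", span, err)
}
```

Since readSpan only depends on io.ReaderAt, the same helper should accept any value with a matching ReadAt method.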

The hdfs Binary

Along with the library, this repo contains a commandline client for HDFS. Like
the library, its primary aim is to be idiomatic, by enabling your favorite unix
verbs:

$ hdfs --help
Usage: hdfs COMMAND
The flags available are a subset of the POSIX ones, but should behave similarly.

Valid commands:
  ls [-lah] [FILE]...
  rm [-rf] FILE...
  mv [-fT] SOURCE... DEST
  mkdir [-p] FILE...
  touch [-amc] FILE...
  chmod [-R] OCTAL-MODE FILE...
  chown [-R] OWNER[:GROUP] FILE...
  cat SOURCE...
  head [-n LINES, -c BYTES] SOURCE...
  tail [-n LINES, -c BYTES] SOURCE...
  du [-sh] FILE...
  checksum FILE...
  get SOURCE [DEST]
  getmerge SOURCE DEST
  put SOURCE DEST

Since it doesn't have to wait for the JVM to start up, it's also a lot faster
than hadoop fs:

$ time hadoop fs -ls / > /dev/null

real  0m2.218s
user  0m2.500s
sys   0m0.376s

$ time hdfs ls / > /dev/null

real  0m0.015s
user  0m0.004s
sys   0m0.004s

Best of all, it comes with bash tab completion for paths!

Installing the commandline client

Grab a tarball from the releases page
and extract it wherever you like.

To configure the client, make sure one or both of these environment variables
point to your Hadoop configuration (core-site.xml and hdfs-site.xml). On
systems with Hadoop installed, they should already be set.

$ export HADOOP_HOME="/etc/hadoop"
$ export HADOOP_CONF_DIR="/etc/hadoop/conf"
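Both files use Hadoop's standard XML layout: a configuration element containing property entries, each with a name and a value. As a rough sketch of how a client could locate and read them, the code below lists candidate directories from the two environment variables and parses one file into a map. The lookup order (HADOOP_CONF_DIR first, then a conf directory under HADOOP_HOME) is an assumption made for illustration, and candidateConfDirs and parseSiteXML are hypothetical names, not the library's API.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"os"
	"path/filepath"
)

// property mirrors one <property> entry in a Hadoop *-site.xml file.
type property struct {
	Name  string `xml:"name"`
	Value string `xml:"value"`
}

// configuration mirrors the top-level <configuration> element.
type configuration struct {
	Properties []property `xml:"property"`
}

// candidateConfDirs returns directories to search, in an assumed order.
// getenv is injected so the function can be exercised without touching
// the real environment.
func candidateConfDirs(getenv func(string) string) []string {
	var dirs []string
	if dir := getenv("HADOOP_CONF_DIR"); dir != "" {
		dirs = append(dirs, dir)
	}
	if home := getenv("HADOOP_HOME"); home != "" {
		dirs = append(dirs, filepath.Join(home, "conf"))
	}
	return dirs
}

// parseSiteXML converts the XML document into a name -> value map.
func parseSiteXML(data []byte) (map[string]string, error) {
	var conf configuration
	if err := xml.Unmarshal(data, &conf); err != nil {
		return nil, err
	}
	values := make(map[string]string, len(conf.Properties))
	for _, p := range conf.Properties {
		values[p.Name] = p.Value
	}
	return values, nil
}

func main() {
	for _, dir := range candidateConfDirs(os.Getenv) {
		data, err := os.ReadFile(filepath.Join(dir, "core-site.xml"))
		if err != nil {
			continue // try the next candidate directory
		}
		conf, err := parseSiteXML(data)
		if err != nil {
			fmt.Fprintln(os.Stderr, "parse error:", err)
			continue
		}
		fmt.Println("fs.defaultFS =", conf["fs.defaultFS"])
		return
	}
	fmt.Println("no core-site.xml found")
}
```

With the exports shown above in place, running this prints the fs.defaultFS value from the first core-site.xml it can read, which is where a Hadoop configuration normally names the default filesystem.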

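To install tab completion globally on Linux, copy or link the bash_completion
file which comes with the tarball into the right place. Use an absolute path for
the link target, since a relative target would be resolved against
/etc/bash_completion.d:

$ ln -sT "$(pwd)/bash_completion" /etc/bash_completion.d/gohdfs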

By default on non-kerberized clusters, the HDFS user is set to the
currently-logged-in user. You can override this with another environment
variable:

$ export HADOOP_USER_NAME=username
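The override rule is small enough to sketch: use HADOOP_USER_NAME when it is set, otherwise fall back to the logged-in user. effectiveUser is a hypothetical helper that illustrates the rule; it is not the client's actual code.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
)

// effectiveUser returns the HDFS user name: the HADOOP_USER_NAME override
// if present, otherwise the logged-in user passed by the caller.
func effectiveUser(getenv func(string) string, loggedIn string) string {
	if name := getenv("HADOOP_USER_NAME"); name != "" {
		return name
	}
	return loggedIn
}

func main() {
	current, err := user.Current()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine current user:", err)
		os.Exit(1)
	}
	fmt.Println(effectiveUser(os.Getenv, current.Username))
}
```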

Using the commandline client with Kerberos authentication

Like hadoop fs, the commandline client expects a ccache file in the default
location: /tmp/krb5cc_<uid>. That means it should 'just work' to use kinit:

$ kinit bob@EXAMPLE.com
$ hdfs ls /

If that doesn't work, try setting the KRB5CCNAME environment variable to
wherever you have the ccache saved.
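KRB5CCNAME follows the usual Kerberos convention of an optional type prefix, as in FILE:/path/to/ccache. The following is a minimal sketch of the lookup described above, assuming that only file-based caches are usable and that other cache types (for example KEYRING: or DIR:) are rejected. ccachePath is a hypothetical helper, not the client's actual code.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
	"strings"
)

// ccachePath resolves the credential cache file for the given numeric uid.
// getenv is injected so the lookup can be tested without changing the
// real environment.
func ccachePath(getenv func(string) string, uid string) (string, error) {
	name := getenv("KRB5CCNAME")
	switch {
	case name == "":
		// Default location when nothing is configured.
		return "/tmp/krb5cc_" + uid, nil
	case strings.HasPrefix(name, "FILE:"):
		return strings.TrimPrefix(name, "FILE:"), nil
	case strings.Contains(name, ":"):
		// Assumption: non-file cache types are not supported.
		return "", fmt.Errorf("unsupported credential cache type: %q", name)
	default:
		// A bare path with no type prefix.
		return name, nil
	}
}

func main() {
	current, err := user.Current()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine current user:", err)
		os.Exit(1)
	}
	path, err := ccachePath(os.Getenv, current.Uid)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("credential cache:", path)
}
```

Printing the resolved path this way can help when kinit succeeds but the client still cannot find a ticket.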

Compatibility

This library uses "Version 9" of the HDFS protocol, which means it should work
with hadoop distributions based on 2.2.x and above. The tests run against CDH
5.x and HDP 2.x.

Acknowledgements

This library is heavily indebted to snakebite.

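Key metrics

Overview
Name and owner: colinmarc/hdfs
Primary language: Go
Languages: Shell (3 languages in total)
License: MIT License
Owner activity
Created: 2014-10-08 19:37:57
Last pushed: 2025-01-22 22:13:07
Last commit: 2025-01-22 23:13:07
Releases: 21
Latest release: v2.4.0
First release: v0.1.0
User engagement
Stars: 1.4k
Watchers: 36
Forks: 354
Commits: 469
Issues: 198
Open issues: 41
Pull requests: 52
Open pull requests: 10
Closed pull requests: 88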