node-word2vec

Word2vec Model Reader for Node.js Client

Github stars Tracking Chart

node-word2vec

Word2vec model reader for Node.js Client.

Welcome

npm install node-word2vec-reader
var Word2vec = require("node-word2vec-reader");
var word2vec = new Word2vec();

为了保证性能,使用 Node.js C ++ Addon 模块管理词表和加载模型。

APIs

所有接口都返回 Promise

word2vec#init(model_file_path)

word2vec.init(model_file_path);

实例化后的 word2vec 先进行初始化,model_file_path是通过word2vec训练后得到的模型。

word2vec#getVocabSize()

获得词表的大小

word2vec.getVocabSize()
    .then(function(num){
        // do your magic
        })

word2vec#getEmbeddingDim()

获得词向量的维度

word2vec.getEmbeddingDim()
    .then(function(num){
        // do your magic
        })

word2vec#v(word)

获得一个词语的向量

word2vec.v("飞机")
    .then(function(vector){
        // do your magic
        })

word2vec#nearby(word, [topK])

获得一个词语最近的 k 个词语及分数。

word2vec.nearby("飞机", 10)
    .then(function(data){
        // do your magic
        })
  • 返回值 JSONArray

[[words], [scores]],包含两个列表,第一个是词语,第二个是对应位置词语的距离分数,同样是在[0~1]区间,越接近于 1 越相似。

比如:

[
    ["股市","股价","股票市场","股灾","楼市","股票","香港股市","行情","恒指","金融市场"],
    [1,0.786284,0.784575,0.751607,0.712255,0.712179,0.710806,0.694434,0.67501,0.666439]
]

word2vec#bow(words)

对传入的词语的列表返回 BoW 向量。

word2vec.bow(["飞机", "航母"])
    .then(function(vector){
        // do your magic
        })

更多详情

Contribute

admin/rebuilt.sh # 重新编译C++ Addon
admin/test.sh # 单元测试

Word2vec

word2vec 是用来训练词向量模型的工具,为了方便,将 word2vec 也放在代码库中。编译和使用 word2vec:

cd app/google
make clean
make
./word2vec

关于 Word2vec 更多信息,参考 word2vec/google/README

Give credits to

ofxMSAWord2Vec

LICENSE

Word2vec model reader for Node.js Client.
Copyright (C) 2018 Hai Liang Wanghailiang.hl.wang@gmail.com

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.

chatoper banner

Main metrics

Overview
Name With Ownerchatopera/node-word2vec
Primary LanguageC
Program languageShell (Language Count: 6)
Platform
License:Other
所有者活动
Created At2018-02-16 11:04:57
Pushed At2019-05-08 12:21:49
Last Commit At2019-05-08 07:17:08
Release Count0
用户参与
Stargazers Count15
Watchers Count1
Fork Count4
Commits Count13
Has Issues Enabled
Issues Count4
Issue Open Count1
Pull Requests Count1
Pull Requests Open Count0
Pull Requests Close Count0
项目设置
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private