ParaphraseAutoencoder-octave

Minor changes to Richard Socher's Recursive Autoencoder code so that it works with GNU Octave. (http://www.socher.org/index.php/Main/DynamicPoolingAndUnfoldingRecursiveAutoencodersForParaphraseDetection)


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Code %
% Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection %
% Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning %
% Advances in Neural Information Processing Systems (NIPS 2011) %
% See http://www.socher.org for more information or to ask questions %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

This code computes phrase vectors based on a trained,
unfolding recursive neural network as described in
the above paper.
It is designed to be easy to use: all you need to do
is put the phrases for which you want to compute a
compositional vector into a text file, one phrase or
sentence per line. The output is another
text file with the vectors.
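
As an illustration of the expected input format, the following is a minimal
Python sketch (not part of the original package) that writes a list of phrases
to a text file, one per line. The function name write_phrases is hypothetical,
and the choices to collapse internal whitespace and to skip empty entries are
assumptions made so that line n of the file always corresponds to phrase n.

```python
from pathlib import Path


def write_phrases(phrases, path="input.txt"):
    """Write one phrase or sentence per line and return the lines written."""
    cleaned = []
    for phrase in phrases:
        # Collapse internal whitespace (including newlines) so that each
        # phrase occupies exactly one line of the file.
        line = " ".join(phrase.split())
        if line:
            # Assumption: empty entries are skipped so that no blank line
            # shifts the line-to-phrase correspondence.
            cleaned.append(line)
    Path(path).write_text("".join(line + "\n" for line in cleaned), encoding="utf-8")
    return cleaned


if __name__ == "__main__":
    write_phrases(["the cat sat on the mat", "a dog barked loudly"])
```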

This code is provided as is. It is free for
academic, non-commercial purposes.
For questions, please contact richard @ socher .org

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Installation %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  • The code runs on any Linux machine with bash,
    MATLAB and Java installed.

  • After unpacking the zip file, go to the unpacked folder
    and make sure the scripts have execute permission:

chmod 777 phrase2Vector.sh
chmod 777 stanford-parser-2011-09-14/lexparser.sh
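
Before the first run, it may help to confirm that the prerequisites are in
place. The Python sketch below is not part of the original package. It assumes
the tools are reachable on PATH under the names bash, java, and either matlab
or octave (these executable names are assumptions), and that it is run from
the unpacked folder. The function name missing_prerequisites is hypothetical.

```python
import os
import shutil


def missing_prerequisites(
    tools=("bash", "java"),
    interpreters=("matlab", "octave"),  # assumed executable names
    scripts=("phrase2Vector.sh", "stanford-parser-2011-09-14/lexparser.sh"),
):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Every tool listed here must be found on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    # At least one of the numerical interpreters must be found on PATH.
    if not any(shutil.which(name) for name in interpreters):
        problems.append(f"none of {', '.join(interpreters)} found on PATH")
    # The shell scripts must exist and carry the execute permission.
    for script in scripts:
        if not os.path.isfile(script):
            problems.append(f"{script} not found")
        elif not os.access(script, os.X_OK):
            problems.append(f"{script} is not executable")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```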

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Running the Code %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  • You can see how it creates vectors for phrases
    in the input.txt file by just running:

    ./phrase2Vector.sh

  • To get phrase vectors for your own phrases,
    you need to change the file

    input.txt

  • Each line of the input.txt file should contain a
    phrase or sentence.

  • The code will then produce a text file as output:

    outVectors.txt

  • In this file, the nth line of the file is the
    vector for the nth phrase in the input.txt file.


  • For debugging purposes, the program also outputs
    the file phrases.txt, which shows which words
    were unknown to the vocabulary.

  • In summary:

INPUT
input.txt - one phrase or sentence per line for
which you want to compute vector representations

OUTPUT
outVectors.txt - the nth line of the file is
the vector for the nth phrase
phrases.txt - shows which words were in our
dictionary and which ones are unknown
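
To work with the results, the nth line of outVectors.txt can be paired with
the nth line of input.txt. The Python sketch below is not part of the original
package. It assumes each line of outVectors.txt holds the vector components as
numbers separated by whitespace and/or commas; the exact delimiter is not
specified in this README, so adjust the parsing if the file differs. The
function name load_phrase_vectors is hypothetical.

```python
from pathlib import Path


def _read_lines(path):
    """Read a text file into a list of lines, dropping trailing blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    while lines and not lines[-1].strip():
        lines.pop()
    return lines


def load_phrase_vectors(input_path="input.txt", vectors_path="outVectors.txt"):
    """Pair the nth phrase with the nth vector and return a list of (phrase, vector)."""
    phrases = [line.strip() for line in _read_lines(input_path)]
    vectors = [
        # Assumption: components are separated by whitespace and/or commas.
        [float(token) for token in line.replace(",", " ").split()]
        for line in _read_lines(vectors_path)
    ]
    if len(phrases) != len(vectors):
        raise ValueError(
            f"{len(phrases)} phrases but {len(vectors)} vectors; "
            "the files do not line up"
        )
    return list(zip(phrases, vectors))


if __name__ == "__main__":
    for phrase, vector in load_phrase_vectors():
        print(f"{phrase}: {len(vector)} dimensions")
```

The length check is there because the pairing relies only on line order, so
a mismatch in line counts would otherwise go unnoticed.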

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Included Packages %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

This archive includes 2 external packages for convenience:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Bibtex %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

If you use the code, please cite:

@incollection{SocherEtAl2011:PoolRAE,
title = {{Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection}},
author = {Richard Socher and Eric H. Huang and Jeffrey Pennington and Andrew Y. Ng and Christopher D. Manning},
booktitle = {{Advances in Neural Information Processing Systems 24}},
year = {2011}
}

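Key metrics

Overview
Name and owner: jeremysalwen/ParaphraseAutoencoder-octave
Primary language: Shell
Languages: Matlab (language count: 4)

Owner activity
Created: 2014-02-27 20:25:54
Pushed: 2014-02-27 20:29:21
Last commit: 2014-02-27 20:23:09
Releases: 0

User engagement
Stars: 15
Watchers: 3
Forks: 5
Commits: 7
Issues: 1
Open issues: 1
Pull requests: 0
Open pull requests: 0
Closed pull requests: 0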