ParaphraseAutoencoder-octave

Minor changes to Richard Socher's Recursive Autoencoder code to make it work with GNU Octave. (http://www.socher.org/index.php/Main/DynamicPoolingAndUnfoldingRecursiveAutoencodersForParaphraseDetection)


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Code %
% Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection %
% Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning %
% Advances in Neural Information Processing Systems (NIPS 2011) %
% See http://www.socher.org for more information or to ask questions %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

This code computes phrase vectors using a trained
unfolding recursive neural network, as described in
the above paper.
It is designed to be easy to use: all you need to do
is put the phrases for which you want to compute a
compositional vector into a text file, one phrase or
sentence per line. The output is another
text file with the vectors.

This code is provided as is. It is free for
academic, non-commercial purposes.
For questions, please contact richard @ socher .org

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Installation %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  • The code runs on any Linux machine with bash,
    MATLAB and Java installed.

  • After unpacking the zip file, go to the unpacked
    folder and make sure the scripts are executable:

chmod 777 phrase2Vector.sh
chmod 777 stanford-parser-2011-09-14/lexparser.sh
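
Before running anything, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of the distributed code: missing_tools is a hypothetical helper, and the command name octave is an assumption for this port (the instructions above name MATLAB).

```shell
# Print the name of every given command that is not found on PATH.
# (Hypothetical helper, not shipped with this archive.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Assumed tool names: swap "octave" for "matlab" if you run the original code.
missing=$(missing_tools bash java octave)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing from PATH:" $missing
fi
```

If the check prints nothing, all three commands were found.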

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Running the Code %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  • You can see how it creates vectors for phrases
    in the input.txt file by just running:

    ./phrase2Vector.sh

  • To get phrase vectors for your own phrases,
    edit the file

    input.txt

  • Each line of the input.txt file should contain a
    phrase or sentence.

  • The code will then produce as output a text file:

    outVectors.txt

  • In this file, the nth line is the vector for
    the nth phrase in the input.txt file.

  • For debugging purposes, the program also outputs
    the file phrases.txt, which shows which words
    were unknown to the vocabulary.

  • In summary:

INPUT
input.txt - one phrase or sentence per line for
which you want to compute vector representations

OUTPUT
outVectors.txt - the nth line of the file is
the vector for the nth phrase
phrases.txt - shows which words were in our
dictionary and which ones are unknown
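
Because line n of outVectors.txt corresponds to line n of input.txt, the two files can be viewed side by side. The sketch below is an illustration only: pair_phrases_with_vectors is a hypothetical helper, and it assumes every input line produces exactly one output line. It makes no assumption about how the numbers within a vector line are separated.

```shell
# Print each phrase next to its vector, separated by a tab.
# (Hypothetical helper, not shipped with this archive.)
pair_phrases_with_vectors() {
    phrases_file=$1
    vectors_file=$2

    # tr strips the padding some wc implementations add around the count.
    n_phrases=$(wc -l < "$phrases_file" | tr -d ' ')
    n_vectors=$(wc -l < "$vectors_file" | tr -d ' ')

    # Refuse to pair the files if they do not line up one-to-one.
    if [ "$n_phrases" -ne "$n_vectors" ]; then
        echo "line count mismatch: $n_phrases phrases, $n_vectors vectors" >&2
        return 1
    fi

    # paste joins line n of the first file with line n of the second.
    paste "$phrases_file" "$vectors_file"
}
```

After a run of phrase2Vector.sh, it could be called as:

    pair_phrases_with_vectors input.txt outVectors.txt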

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Included Packages %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

This archive includes 2 external packages for convenience:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Bibtex %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

If you use the code, please cite:

@incollection{SocherEtAl2011:PoolRAE,
title = {{Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection}},
author = {Richard Socher and Eric H. Huang and Jeffrey Pennington and Andrew Y. Ng and Christopher D. Manning},
booktitle = {{Advances in Neural Information Processing Systems 24}},
year = {2011}
}
