php-stanford-corenlp-adapter

PHP adapter for Stanford CoreNLP

PHP Stanford CoreNLP adapter

Features

  • Connect to the Stanford University CoreNLP online API
  • Connect to a Stanford CoreNLP 3.7.0 server
  • Annotators available: tokenize, ssplit, pos, parse, depparse, ner, regexner, lemma, mention, natlog, coref, openie, kbp
  • The package creates part-of-speech trees with depth, parent ID, and child ID

Requirements

  • PHP 5.5 or higher (PHP 7 is also supported)
  • Windows or Linux, 64-bit; 8 GB of memory or more recommended
  • Either the Guzzle HTTP client (installed by default) or cURL alone
  • Composer for PHP
    https://getcomposer.org/

Update 24th February 2018

PHP 7 type hinting was removed because it caused issues for some users.

Update 28th January 2019

Fixed an issue with PHP 7.1 and higher.

Installation using ZIP files

  • Install Stanford CoreNLP Server. See the installation walkthrough below.
  • Download and unpack the files from this package.
  • Copy the files to your web server directory, usually "htdocs" or "var/www".
  • Run a Composer update

Installation using Composer

  • Add the following entry to the "require" section of your "composer.json" file:
    {
        "require": {
            "dennis-de-swart/php-stanford-corenlp-adapter": "*"
        }
    }
  • Run a composer update

Using the Stanford CoreNLP online API service

The adapter by default uses Stanford's online API service. This should work right after the composer update.
Note that the online API is a public service. If you want to analyze large volumes of text or sensitive data,
please install the Java server version.

OpenIE

OpenIE creates "subject-relation-object" tuples. This is similar to (but not the same as) the "Subject-Verb-Object" concept of the English language.

Notes:

  • OpenIE is only available with the offline Java server, not in "online" mode. See the installation walkthrough below.
  • OpenIE data is not always available. The result array is sometimes empty; this is not an error.
  • OpenIE documentation: http://nlp.stanford.edu/software/openie.html
  • Subject-verb-object: https://en.wikipedia.org/wiki/Subject-verb-object
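
As a minimal sketch of how the tuples can be read: the function below assumes that each OpenIE entry is an array with "subject", "relation" and "object" keys, as in the "openie" part of Diagram B. The name formatOpenIeTriples is hypothetical and not part of the adapter. An empty input simply yields an empty result.

```php
<?php
// Hypothetical helper, not part of the adapter.
// Assumes each entry has 'subject', 'relation' and 'object' keys (see Diagram B).
function formatOpenIeTriples(array $openie)
{
    $lines = array();
    foreach ($openie as $triple) {
        // Skip incomplete entries instead of raising a notice.
        if (!isset($triple['subject'], $triple['relation'], $triple['object'])) {
            continue;
        }
        $lines[] = $triple['subject'] . ' | ' . $triple['relation'] . ' | ' . $triple['object'];
    }
    return $lines;
}

// Sample entries copied from Diagram B (span data left out).
$openie = array(
    array('subject' => 'I', 'relation' => 'will meet', 'object' => 'Mary'),
    array('subject' => 'Mary', 'relation' => 'is in', 'object' => 'New York'),
);

foreach (formatOpenIeTriples($openie) as $line) {
    echo $line, PHP_EOL;
}
```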

Installation / Walkthrough for Java server version

Step 1: install Java

https://java.com/en/download/help/index_installing.xml?os=All+Platforms&j=8&n=20

Step 2: installing the Stanford CoreNLP 3.7.0 server

http://stanfordnlp.github.io/CoreNLP/index.html#download

Step 3: Port for server

The default port for the Java server is 9000. If port 9000 is not available, you can change the port in the "bootstrap.php" file. Example:

define('CURLURL' , 'http://localhost:9000/');

Step 4: Start the CoreNLP server from the command line

Go to the download directory, then enter the following command:

java -mx8g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000

Important note: the Stanford manual says "-mx4g", but I found that this can lead to a Java OutOfMemory error. It is also important to use a 64-bit operating system with enough memory (8 GB or more recommended).

Step 5: Test whether the server has started by browsing to its URL

http://localhost:9000/

When you open this URL, you should see the CoreNLP GUI. If you have problems with the installation, check the manual:

http://stanfordnlp.github.io/CoreNLP/corenlp-server.html
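
The same check can also be scripted. The sketch below assumes the server's documented HTTP interface: the text is POSTed to the root URL and the settings are passed as JSON in a "properties" query parameter. The function names buildCoreNlpUrl and annotateText are hypothetical and are not part of this adapter; the adapter itself takes care of the request for you.

```php
<?php
// Hypothetical helpers, not part of the adapter.

// Build the request URL. The server reads its settings from a JSON
// "properties" query parameter.
function buildCoreNlpUrl($baseUrl, array $annotators)
{
    $properties = json_encode(array(
        'annotators'   => implode(',', $annotators),
        'outputFormat' => 'json',
    ));
    return rtrim($baseUrl, '/') . '/?properties=' . rawurlencode($properties);
}

// POST the raw text to the server and decode the JSON reply.
// Returns null when the server cannot be reached or the reply is not JSON.
function annotateText($baseUrl, $text, array $annotators)
{
    $ch = curl_init(buildCoreNlpUrl($baseUrl, $annotators));
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $text);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60); // the first request can be slow
    $body = curl_exec($ch);

    if ($body === false) {
        return null;
    }
    $decoded = json_decode($body, true);
    return is_array($decoded) ? $decoded : null;
}

// Print the URL that would be requested for a small annotator set.
echo buildCoreNlpUrl('http://localhost:9000/', array('tokenize', 'ssplit', 'pos')), PHP_EOL;
```

With the server running, a call such as annotateText('http://localhost:9000/', 'I will meet Mary.', array('tokenize', 'ssplit', 'pos')) should return an array, and null otherwise.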

Step 6: Set ONLINE_API to FALSE

In "bootstrap.php", set define('ONLINE_API' , FALSE). This tells the adapter to use the Java server version.

Usage examples

Instantiate the adapter:

$coreNLP = new CorenlpAdapter();

To process a text, call the "getOutput" method:

 $text = 'The Golden Gate Bridge was designed by Joseph Strauss.'; 
 $coreNLP->getOutput($text);

Note that the first time you process a text, the server takes an extra 20 to 30 seconds to load definitions. All later calls to the server will be much faster. Small texts are usually processed within seconds.

The results

If successful, the following properties and helper method will be available:

 $coreNLP->serverMemory;      // contains all of the server output
 $coreNLP->trees;             // contains the processed flat trees; each node of a tree is assigned an ID key

 $coreNLP->getWordValues($coreNLP->trees[1]);  // returns just the words from a tree
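
As a rough illustration of reading the raw server output, the sketch below walks an array laid out like Diagram B, that is $serverMemory[n]['sentences'][m]['tokens'], and pairs every word with its part-of-speech tag. The function name listWordsWithPos is hypothetical and not part of the adapter, and the sample array is a trimmed-down copy of the diagram rather than real server output.

```php
<?php
// Hypothetical helper, not part of the adapter.
// Assumes the layout of Diagram B: $serverMemory[n]['sentences'][m]['tokens'].
function listWordsWithPos(array $serverMemory)
{
    $pairs = array();
    foreach ($serverMemory as $result) {
        $sentences = isset($result['sentences']) ? $result['sentences'] : array();
        foreach ($sentences as $sentence) {
            $tokens = isset($sentence['tokens']) ? $sentence['tokens'] : array();
            foreach ($tokens as $token) {
                $pairs[] = $token['word'] . '/' . $token['pos'];
            }
        }
    }
    return $pairs;
}

// Trimmed-down sample in the shape of Diagram B.
$serverMemory = array(
    array('sentences' => array(
        array('index' => 0, 'tokens' => array(
            array('index' => 1, 'word' => 'I', 'pos' => 'PRP'),
            array('index' => 2, 'word' => 'will', 'pos' => 'MD'),
            array('index' => 3, 'word' => 'meet', 'pos' => 'VB'),
        )),
    )),
);

echo implode(' ', listWordsWithPos($serverMemory)), PHP_EOL; // I/PRP will/MD meet/VB
```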

Diagram A: Tree With Tokens


Array
(
   [1] => Array
       (
           [parent] => 
           [pennTreebankTag] => ROOT
           [depth] => 0
       )

   [2] => Array
       (
           [parent] => 1
           [pennTreebankTag] => S
           [depth] => 2
       )

   [3] => Array
       (
           [parent] => 2
           [pennTreebankTag] => NP
           [depth] => 4
       )

   [4] => Array
       (
           [parent] => 3
           [pennTreebankTag] => PRP
           [depth] => 6
           [word] => I
           [index] => 1
           [originalText] => I
           [lemma] => I
           [characterOffsetBegin] => 0
           [characterOffsetEnd] => 1
           [pos] => PRP
           [ner] => O
           [before] => 
           [after] =>  
           [openIE] => Array
               (
                   [0] => subject
                   [1] => subject
                   [2] => subject
               )

       )

   [5] => Array
       (
           [parent] => 2
           [pennTreebankTag] => VP
           [depth] => 4
       )

   [6] => Array
       (
           [parent] => 5
           [pennTreebankTag] => MD
           [depth] => 6
           [word] => will
           [index] => 2
           [originalText] => will
           [lemma] => will
           [characterOffsetBegin] => 2
           [characterOffsetEnd] => 6
           [pos] => MD
           [ner] => O
           [before] =>  
           [after] =>  
           [openIE] => Array
               (
                   [0] => subject
                   [1] => subject
                   [2] => relation
               )

       )

   [7] => Array
       (
           [parent] => 5
           [pennTreebankTag] => VP
           [depth] => 6
       )

   [8] => Array
       (
           [parent] => 7
           [pennTreebankTag] => VB
           [depth] => 8
           [word] => meet
           [index] => 3
           [originalText] => meet
           [lemma] => meet
           [characterOffsetBegin] => 7
           [characterOffsetEnd] => 11
           [pos] => VB
           [ner] => O
           [before] =>  
           [after] =>  
           [openIE] => Array
               (
                   [0] => subject
                   [1] => subject
                   [2] => relation
               )

       )

   [9] => Array
       (
           [parent] => 7
           [pennTreebankTag] => NP
           [depth] => 8
       )

   [10] => Array
       (
           [parent] => 9
           [pennTreebankTag] => NP
           [depth] => 10
       )

   [11] => Array
       (
           [parent] => 10
           [pennTreebankTag] => NNP
           [depth] => 12
           [word] => Mary
           [index] => 4
           [originalText] => Mary
           [lemma] => Mary
           [characterOffsetBegin] => 12
           [characterOffsetEnd] => 16
           [pos] => NNP
           [ner] => PERSON
           [before] =>  
           [after] =>  
           [openIE] => Array
               (
                   [1] => subject
                   [2] => object
                   [3] => subject
                   [0] => subject
               )

       )

   [12] => Array
       (
           [parent] => 9
           [pennTreebankTag] => PP
           [depth] => 10
       )

   [13] => Array
       (
           [parent] => 12
           [pennTreebankTag] => IN
           [depth] => 12
           [word] => in
           [index] => 5
           [originalText] => in
           [lemma] => in
           [characterOffsetBegin] => 17
           [characterOffsetEnd] => 19
           [pos] => IN
           [ner] => O
           [before] =>  
           [after] =>  
           [openIE] => Array
               (
                   [1] => relation
                   [3] => relation
                   [0] => relation
               )

       )

   [14] => Array
       (
           [parent] => 12
           [pennTreebankTag] => NP
           [depth] => 12
       )

   [15] => Array
       (
           [parent] => 14
           [pennTreebankTag] => NNP
           [depth] => 14
           [word] => New
           [index] => 6
           [originalText] => New
           [lemma] => New
           [characterOffsetBegin] => 20
           [characterOffsetEnd] => 23
           [pos] => NNP
           [ner] => LOCATION
           [before] =>  
           [after] =>  
           [openIE] => Array
               (
                   [1] => relation
                   [3] => object
                   [0] => object
               )

       )

   [16] => Array
       (
           [parent] => 14
           [pennTreebankTag] => NNP
           [depth] => 14
           [word] => York
           [index] => 7
           [originalText] => York
           [lemma] => York
           [characterOffsetBegin] => 24
           [characterOffsetEnd] => 28
           [pos] => NNP
           [ner] => LOCATION
           [before] =>  
           [after] =>  
           [openIE] => Array
               (
                   [1] => object
                   [3] => object
               )

       )

   [17] => Array
       (
           [parent] => 7
           [pennTreebankTag] => PP
           [depth] => 8
       )

   [18] => Array
       (
           [parent] => 17
           [pennTreebankTag] => IN
           [depth] => 10
           [word] => at
           [index] => 8
           [originalText] => at
           [lemma] => at
           [characterOffsetBegin] => 29
           [characterOffsetEnd] => 31
           [pos] => IN
           [ner] => O
           [before] =>  
           [after] =>  
           [openIE] => Array
               (
                   [1] => object
               )

       )

   [19] => Array
       (
           [parent] => 17
           [pennTreebankTag] => NP
           [depth] => 10
       )

   [20] => Array
       (
           [parent] => 19
           [pennTreebankTag] => CD
           [depth] => 12
           [word] => 10pm
           [index] => 9
           [originalText] => 10pm
           [lemma] => 10pm
           [characterOffsetBegin] => 32
           [characterOffsetEnd] => 36
           [pos] => CD
           [ner] => TIME
           [normalizedNER] => T22:00
           [before] =>  
           [after] => 
           [timex] => Array
               (
                   [tid] => t1
                   [type] => TIME
                   [value] => T22:00
               )

           [openIE] => Array
               (
                   [0] => object
                   [1] => object
               )

       )

)
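
Because the tree is flat, parent and child relations have to be followed through the ID keys. The sketch below shows one possible way to do that, assuming the layout of Diagram A: each entry is keyed by its ID, has a "parent" key, and carries "word" and "ner" keys only on leaf nodes. The function names childrenByParent and wordsWithNer are hypothetical and not part of the adapter.

```php
<?php
// Hypothetical helpers, not part of the adapter.
// Both assume the flat layout of Diagram A.

// Map each parent ID to the IDs of its direct children.
function childrenByParent(array $tree)
{
    $children = array();
    foreach ($tree as $id => $node) {
        // The root has an empty parent value, so it is skipped here.
        if (!empty($node['parent'])) {
            $children[$node['parent']][] = $id;
        }
    }
    return $children;
}

// Collect the words of all leaf nodes that carry the given NER label.
function wordsWithNer(array $tree, $label)
{
    $words = array();
    foreach ($tree as $node) {
        if (isset($node['word'], $node['ner']) && $node['ner'] === $label) {
            $words[] = $node['word'];
        }
    }
    return $words;
}

// Trimmed-down sample based on Diagram A.
$tree = array(
    1  => array('parent' => '', 'pennTreebankTag' => 'ROOT', 'depth' => 0),
    2  => array('parent' => 1, 'pennTreebankTag' => 'S', 'depth' => 2),
    3  => array('parent' => 2, 'pennTreebankTag' => 'NP', 'depth' => 4),
    5  => array('parent' => 2, 'pennTreebankTag' => 'VP', 'depth' => 4),
    15 => array('parent' => 14, 'pennTreebankTag' => 'NNP', 'depth' => 14,
                'word' => 'New', 'ner' => 'LOCATION'),
    16 => array('parent' => 14, 'pennTreebankTag' => 'NNP', 'depth' => 14,
                'word' => 'York', 'ner' => 'LOCATION'),
);

print_r(childrenByParent($tree));
echo implode(' ', wordsWithNer($tree, 'LOCATION')), PHP_EOL; // New York
```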


Diagram B: The ServerMemory contains all the server data


Array
(
    [0] => Array
        (
            [sentences] => Array
                (
                    [0] => Array
                        (
                            [index] => 0
                            [parse] => (ROOT
  (S
    (NP (PRP I))
    (VP (MD will)
      (VP (VB meet)
        (NP
          (NP (NNP Mary))
          (PP (IN in)
            (NP (NNP New) (NNP York))))
        (PP (IN at)
          (NP (CD 10pm)))))))
                            [basic-dependencies] => Array
                                (
                                    [0] => Array
                                        (
                                            [dep] => ROOT
                                            [governor] => 0
                                            [governorGloss] => ROOT
                                            [dependent] => 3
                                            [dependentGloss] => meet
                                        )

                                    [1] => Array
                                        (
                                            [dep] => nsubj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 1
                                            [dependentGloss] => I
                                        )

                                    [2] => Array
                                        (
                                            [dep] => aux
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 2
                                            [dependentGloss] => will
                                        )

                                    [3] => Array
                                        (
                                            [dep] => dobj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 4
                                            [dependentGloss] => Mary
                                        )

                                    [4] => Array
                                        (
                                            [dep] => case
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 5
                                            [dependentGloss] => in
                                        )

                                    [5] => Array
                                        (
                                            [dep] => compound
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 6
                                            [dependentGloss] => New
                                        )

                                    [6] => Array
                                        (
                                            [dep] => nmod
                                            [governor] => 4
                                            [governorGloss] => Mary
                                            [dependent] => 7
                                            [dependentGloss] => York
                                        )

                                    [7] => Array
                                        (
                                            [dep] => case
                                            [governor] => 9
                                            [governorGloss] => 10pm
                                            [dependent] => 8
                                            [dependentGloss] => at
                                        )

                                    [8] => Array
                                        (
                                            [dep] => nmod
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 9
                                            [dependentGloss] => 10pm
                                        )

                                )

                            [collapsed-dependencies] => Array
                                (
                                    [0] => Array
                                        (
                                            [dep] => ROOT
                                            [governor] => 0
                                            [governorGloss] => ROOT
                                            [dependent] => 3
                                            [dependentGloss] => meet
                                        )

                                    [1] => Array
                                        (
                                            [dep] => nsubj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 1
                                            [dependentGloss] => I
                                        )

                                    [2] => Array
                                        (
                                            [dep] => aux
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 2
                                            [dependentGloss] => will
                                        )

                                    [3] => Array
                                        (
                                            [dep] => dobj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 4
                                            [dependentGloss] => Mary
                                        )

                                    [4] => Array
                                        (
                                            [dep] => case
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 5
                                            [dependentGloss] => in
                                        )

                                    [5] => Array
                                        (
                                            [dep] => compound
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 6
                                            [dependentGloss] => New
                                        )

                                    [6] => Array
                                        (
                                            [dep] => nmod:in
                                            [governor] => 4
                                            [governorGloss] => Mary
                                            [dependent] => 7
                                            [dependentGloss] => York
                                        )

                                    [7] => Array
                                        (
                                            [dep] => case
                                            [governor] => 9
                                            [governorGloss] => 10pm
                                            [dependent] => 8
                                            [dependentGloss] => at
                                        )

                                    [8] => Array
                                        (
                                            [dep] => nmod:at
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 9
                                            [dependentGloss] => 10pm
                                        )

                                )

                            [collapsed-ccprocessed-dependencies] => Array
                                (
                                    [0] => Array
                                        (
                                            [dep] => ROOT
                                            [governor] => 0
                                            [governorGloss] => ROOT
                                            [dependent] => 3
                                            [dependentGloss] => meet
                                        )

                                    [1] => Array
                                        (
                                            [dep] => nsubj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 1
                                            [dependentGloss] => I
                                        )

                                    [2] => Array
                                        (
                                            [dep] => aux
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 2
                                            [dependentGloss] => will
                                        )

                                    [3] => Array
                                        (
                                            [dep] => dobj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 4
                                            [dependentGloss] => Mary
                                        )

                                    [4] => Array
                                        (
                                            [dep] => case
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 5
                                            [dependentGloss] => in
                                        )

                                    [5] => Array
                                        (
                                            [dep] => compound
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 6
                                            [dependentGloss] => New
                                        )

                                    [6] => Array
                                        (
                                            [dep] => nmod:in
                                            [governor] => 4
                                            [governorGloss] => Mary
                                            [dependent] => 7
                                            [dependentGloss] => York
                                        )

                                    [7] => Array
                                        (
                                            [dep] => case
                                            [governor] => 9
                                            [governorGloss] => 10pm
                                            [dependent] => 8
                                            [dependentGloss] => at
                                        )

                                    [8] => Array
                                        (
                                            [dep] => nmod:at
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 9
                                            [dependentGloss] => 10pm
                                        )

                                )

                            [openie] => Array
                                (
                                    [0] => Array
                                        (
                                            [subject] => I
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 0
                                                    [1] => 1
                                                )

                                            [relation] => will meet Mary at
                                            [relationSpan] => Array
                                                (
                                                    [0] => 1
                                                    [1] => 3
                                                )

                                            [object] => 10pm
                                            [objectSpan] => Array
                                                (
                                                    [0] => 8
                                                    [1] => 9
                                                )

                                        )

                                    [1] => Array
                                        (
                                            [subject] => I
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 0
                                                    [1] => 1
                                                )

                                            [relation] => will meet
                                            [relationSpan] => Array
                                                (
                                                    [0] => 1
                                                    [1] => 3
                                                )

                                            [object] => Mary in New York
                                            [objectSpan] => Array
                                                (
                                                    [0] => 3
                                                    [1] => 7
                                                )

                                        )

                                    [2] => Array
                                        (
                                            [subject] => I
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 0
                                                    [1] => 1
                                                )

                                            [relation] => will meet
                                            [relationSpan] => Array
                                                (
                                                    [0] => 1
                                                    [1] => 3
                                                )

                                            [object] => Mary
                                            [objectSpan] => Array
                                                (
                                                    [0] => 3
                                                    [1] => 4
                                                )

                                        )

                                    [3] => Array
                                        (
                                            [subject] => Mary
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 3
                                                    [1] => 4
                                                )

                                            [relation] => is in
                                            [relationSpan] => Array
                                                (
                                                    [0] => 4
                                                    [1] => 5
                                                )

                                            [object] => New York
                                            [objectSpan] => Array
                                                (
                                                    [0] => 5
                                                    [1] => 7
                                                )

                                        )

                                )

                            [tokens] => Array
                                (
                                    [0] => Array
                                        (
                                            [index] => 1
                                            [word] => I
                                            [originalText] => I
                                            [lemma] => I
                                            [characterOffsetBegin] => 0
                                            [characterOffsetEnd] => 1
                                            [pos] => PRP
                                            [ner] => O
                                            [before] => 
                                            [after] =>  
                                        )

                                    [1] => Array
                                        (
                                            [index] => 2
                                            [word] => will
                                            [originalText] => will
                                            [lemma] => will
                                            [characterOffsetBegin] => 2
                                            [characterOffsetEnd] => 6
                                            [pos] => MD
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [2] => Array
                                        (
                                            [index] => 3
                                            [word] => meet
                                            [originalText] => meet
                                            [lemma] => meet
                                            [characterOffsetBegin] => 7
                                            [characterOffsetEnd] => 11
                                            [pos] => VB
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [3] => Array
                                        (
                                            [index] => 4
                                            [word] => Mary
                                            [originalText] => Mary
                                            [lemma] => Mary
                                            [characterOffsetBegin] => 12
                                            [characterOffsetEnd] => 16
                                            [pos] => NNP
                                            [ner] => PERSON
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [4] => Array
                                        (
                                            [index] => 5
                                            [word] => in
                                            [originalText] => in
                                            [lemma] => in
                                            [characterOffsetBegin] => 17
                                            [characterOffsetEnd] => 19
                                            [pos] => IN
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [5] => Array
                                        (
                                            [index] => 6
                                            [word] => New
                                            [originalText] => New
                                            [lemma] => New
                                            [characterOffsetBegin] => 20
                                            [characterOffsetEnd] => 23
                                            [pos] => NNP
                                            [ner] => LOCATION
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [6] => Array
                                        (
                                            [index] => 7
                                            [word] => York
                                            [originalText] => York
                                            [lemma] => York
                                            [characterOffsetBegin] => 24
                                            [characterOffsetEnd] => 28
                                            [pos] => NNP
                                            [ner] => LOCATION
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [7] => Array
                                        (
                                            [index] => 8
                                            [word] => at
                                            [originalText] => at
                                            [lemma] => at
                                            [characterOffsetBegin] => 29
                                            [characterOffsetEnd] => 31
                                            [pos] => IN
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [8] => Array
                                        (
                                            [index] => 9
                                            [word] => 10pm
                                            [originalText] => 10pm
                                            [lemma] => 10pm
                                            [characterOffsetBegin] => 32
                                            [characterOffsetEnd] => 36
                                            [pos] => CD
                                            [ner] => TIME
                                            [normalizedNER] => T22:00
                                            [before] =>  
                                            [after] => 
                                            [timex] => Array
                                                (
                                                    [tid] => t1
                                                    [type] => TIME
                                                    [value] => T22:00
                                                )

                                        )

                                )

                        )

                )

        )
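
The token list above carries one NER label per token, so a multi-word name such as "New York" arrives as two separate LOCATION tokens. The sketch below shows one way to merge consecutive tokens with the same label into a single entity. It is only an illustration: `groupNamedEntities()` is a hypothetical helper and not part of this adapter, and the `$tokens` sample is a shortened hand-made copy of the "tokens" array printed above. How you reach that array inside the adapter's result depends on your own code.

```php
<?php
// Hypothetical helper (not part of the adapter): merges consecutive tokens
// that share the same NER label into one entity. Tokens labelled "O" are skipped.
function groupNamedEntities(array $tokens)
{
    $entities = array();
    $current  = null;

    foreach ($tokens as $token) {
        $label = isset($token['ner']) ? $token['ner'] : 'O';

        if ($current !== null && $current['ner'] === $label) {
            // Same label as the previous token: extend the open entity.
            // Joining with a single space is a simplification.
            $current['text'] .= ' ' . $token['word'];
            $current['characterOffsetEnd'] = $token['characterOffsetEnd'];
            continue;
        }

        // The label changed: store the open entity, if there is one.
        if ($current !== null) {
            $entities[] = $current;
            $current = null;
        }

        // Open a new entity for any label other than "O".
        if ($label !== 'O') {
            $current = array(
                'text'                 => $token['word'],
                'ner'                  => $label,
                'characterOffsetBegin' => $token['characterOffsetBegin'],
                'characterOffsetEnd'   => $token['characterOffsetEnd'],
            );
        }
    }

    // Store the entity that is still open at the end of the sentence.
    if ($current !== null) {
        $entities[] = $current;
    }

    return $entities;
}

// Shortened sample, copied by hand from the print_r output above.
$tokens = array(
    array('word' => 'Mary', 'ner' => 'PERSON',   'characterOffsetBegin' => 12, 'characterOffsetEnd' => 16),
    array('word' => 'in',   'ner' => 'O',        'characterOffsetBegin' => 17, 'characterOffsetEnd' => 19),
    array('word' => 'New',  'ner' => 'LOCATION', 'characterOffsetBegin' => 20, 'characterOffsetEnd' => 23),
    array('word' => 'York', 'ner' => 'LOCATION', 'characterOffsetBegin' => 24, 'characterOffsetEnd' => 28),
    array('word' => 'at',   'ner' => 'O',        'characterOffsetBegin' => 29, 'characterOffsetEnd' => 31),
    array('word' => '10pm', 'ner' => 'TIME',     'characterOffsetBegin' => 32, 'characterOffsetEnd' => 36),
);

foreach (groupNamedEntities($tokens) as $entity) {
    printf("%s (%s)\n", $entity['text'], $entity['ner']);
}
// Prints:
// Mary (PERSON)
// New York (LOCATION)
// 10pm (TIME)
```

Note that two different entities of the same type standing directly next to each other would be merged into one by this simple approach. If that matters for your text, the "before" and "after" fields or the character offsets can be used to refine the grouping.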

Any questions?

Please let me know.

Credits

Some functions are forked from this "Stanford parser" package:

 https://github.com/agentile/PHP-Stanford-NLP
