# Nit wrapper for Stanford CoreNLP
Stanford CoreNLP provides a set of natural language analysis tools which can take
raw text input and give the base forms of words, their parts of speech, whether
they are names of companies, people, etc., normalize dates, times, and numeric
quantities, mark up the structure of sentences in terms of phrases and word
dependencies, and indicate which noun phrases refer to the same entities.

This wrapper needs the Stanford CoreNLP jars that run on Java 1.8+.

See http://nlp.stanford.edu/software/corenlp.shtml.
```
var proc = new NLPProcessor("path/to/StanfordCoreNLP/jars")

var doc = proc.process("String to analyze")

for sentence in doc.sentences do
	for token in sentence.tokens do
		print "{token.lemma}: {token.pos}"
	end
end
```
## NLPServer

The NLPServer provides a wrapper around the StanfordCoreNLPServer.

See https://stanfordnlp.github.io/CoreNLP/corenlp-server.html.
```
var cp = "/path/to/StanfordCoreNLP/jars"
var srv = new NLPServer(cp, 9000)
```
## NLPClient

The NLPClient is used as an NLPProcessor with an NLPServer backend.
```
var cli = new NLPClient("http://localhost:9000")
var doc = cli.process("String to analyze")
```
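Under the hood, such a client talks to the CoreNLP server's documented HTTP API: the text to analyze is POSTed as the request body, and the desired annotators are passed as a JSON `properties` query parameter. A minimal sketch in Python of how such a request is built (the function name and base URL are illustrative, not part of this wrapper's API):

```python
import json
from urllib.parse import urlencode

def corenlp_request(base_url, text, annotators="tokenize,ssplit,pos,lemma"):
    # The annotator list and output format go in a JSON "properties"
    # query parameter; the document itself is the POST body.
    properties = {"annotators": annotators, "outputFormat": "json"}
    url = base_url + "/?" + urlencode({"properties": json.dumps(properties)})
    return url, text.encode("utf-8")

url, body = corenlp_request("http://localhost:9000", "String to analyze")
# POSTing `body` to `url` (e.g. with urllib.request) returns the
# annotated document as JSON.
```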
## NLPIndex

NLPIndex extends the StringIndex to use an NLPProcessor to tokenize, lemmatize and
tag the terms of a document.
```
var proc = new NLPProcessor("path/to/StanfordCoreNLP/jars")
var index = new NLPIndex(proc)

var d1 = index.index_string("Doc 1", "/uri/1", "this is a sample")
var d2 = index.index_string("Doc 2", "/uri/2", "this and this is another example")
assert index.documents.length == 2

var matches = index.match_string("this sample")
assert matches.first.document == d1
```
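The idea behind lemma-based indexing can be sketched independently of CoreNLP: each document's terms are reduced to their lemmas before indexing, so a query matches all inflected forms of a word. The sketch below is a toy Python illustration of that concept only (the class, a hard-coded lemma table standing in for CoreNLP's lemmatizer, and the scoring are all hypothetical, not this library's implementation):

```python
from collections import Counter

# Toy lemma table standing in for CoreNLP's lemmatizer (illustration only).
LEMMAS = {"is": "be", "samples": "sample", "examples": "example"}

def lemmatize(text):
    return [LEMMAS.get(w, w) for w in text.lower().split()]

class LemmaIndex:
    def __init__(self):
        self.documents = []  # list of (title, uri, lemma counts)

    def index_string(self, title, uri, text):
        doc = (title, uri, Counter(lemmatize(text)))
        self.documents.append(doc)
        return doc

    def match_string(self, query):
        q = Counter(lemmatize(query))
        # Score each document by the number of query lemma occurrences
        # it shares (multiset intersection), best match first.
        scored = [(sum((doc[2] & q).values()), doc) for doc in self.documents]
        return [doc for score, doc in sorted(scored, key=lambda s: -s[0]) if score > 0]

index = LemmaIndex()
d1 = index.index_string("Doc 1", "/uri/1", "this is a sample")
d2 = index.index_string("Doc 2", "/uri/2", "this and this is another example")
matches = index.match_string("this sample")
# "Doc 1" ranks first: it shares both "this" and "sample" with the query.
```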
## TODO

* Use options to choose CoreNLP analyzers
* Analyze sentence dependencies