nit.git
Alexandre Terrasa [Tue, 24 Oct 2017 03:39:00 +0000 (23:39 -0400)]
doc/commands: introduce docdown related commands

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 25 Oct 2017 01:53:24 +0000 (21:53 -0400)]
doc/docdown: render mdoc as markdown

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 03:43:11 +0000 (23:43 -0400)]
doc/commands: introduce commands group

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 03:37:44 +0000 (23:37 -0400)]
doc/commands: introduce html rendering for commands

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 26 Oct 2017 19:01:39 +0000 (15:01 -0400)]
doc/templates: introduce model to html translations

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 25 Oct 2017 02:01:36 +0000 (22:01 -0400)]
lib/html: bootstrap use optional annotation

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 20:16:56 +0000 (16:16 -0400)]
doc/commands: introduce commands results to json translation

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 20:35:36 +0000 (16:35 -0400)]
doc/commands: introduce command initialization from HTTP requests

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 03:27:16 +0000 (23:27 -0400)]
doc/commands: introduce commands parser

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 21:51:01 +0000 (17:51 -0400)]
doc/commands: introduce catalog commands

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 03:15:49 +0000 (23:15 -0400)]
doc/commands: introduce usage commands

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 03:15:35 +0000 (23:15 -0400)]
doc/commands: introduce graph commands

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 03:15:27 +0000 (23:15 -0400)]
doc/commands: introduce model commands

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 02:37:22 +0000 (22:37 -0400)]
doc/commands: introduce doc commands

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 20:56:47 +0000 (16:56 -0400)]
nitweb: use catalog_json

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 20:56:34 +0000 (16:56 -0400)]
catalog: introduce catalog to json translation

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 20:56:17 +0000 (16:56 -0400)]
catalog: create a group for catalog modules

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 24 Oct 2017 02:24:51 +0000 (22:24 -0400)]
markdown: reset headlines collection between two processing

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Sat, 14 Oct 2017 03:12:35 +0000 (23:12 -0400)]
model_collect: fix collect_ancestors

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 13 Oct 2017 17:48:12 +0000 (13:48 -0400)]
model_views: introduce `mentities_by_name` in views

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Jean Privat [Mon, 13 Nov 2017 16:03:20 +0000 (11:03 -0500)]
Merge: Model filters: extract filters from ModelVisitor

A set of filters that can be applied to an MEntity.

By default, a ModelFilter accepts every mentity.

~~~nit
var filter = new ModelFilter
assert filter.accept_mentity(my_mentity) == true
~~~

To quickly configure the filters, options can be passed to the constructor:
~~~nit
var filter = new ModelFilter(
    min_visibility = protected_visibility,
    accept_fictive = false,
    accept_test = false,
    accept_redef = false,
    accept_extern = false,
    accept_attribute = false,
    accept_empty_doc = false
)
~~~

With this, one can use temporary filters with the model visitors and views:

~~~nit
var default_filter = new ModelFilter(private_visibility)
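# `model` is assumed to be the already loaded model (the original snippet passed `view` to itself)
var view = new ModelView(model, default_filter)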
# ...
if view.accept_mentity(mentity) then
   # ...
end
# ...
var custom_filter = new ModelFilter(public_visibility)
if view.accept_mentity(mentity, custom_filter) then
   # ...
end
~~~
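
The default-plus-override behaviour can be sketched outside Nit. The following Python is only an illustration of the idea, not the Nit implementation: `Entity`, `Filter`, `View` and the numeric visibility levels are hypothetical stand-ins, and only three of the options listed above are modelled.

```python
from dataclasses import dataclass

# Hypothetical visibility levels, ordered from least to most visible.
PRIVATE, PROTECTED, PUBLIC = 0, 1, 2

@dataclass
class Entity:
    name: str
    visibility: int = PUBLIC
    is_test: bool = False
    has_doc: bool = True

@dataclass
class Filter:
    # With no option set, the filter accepts every entity.
    min_visibility: int = PRIVATE
    accept_test: bool = True
    accept_empty_doc: bool = True

    def accept(self, entity):
        if entity.visibility < self.min_visibility:
            return False
        if entity.is_test and not self.accept_test:
            return False
        if not entity.has_doc and not self.accept_empty_doc:
            return False
        return True

class View:
    def __init__(self, default_filter):
        self.default_filter = default_filter

    def accept(self, entity, custom_filter=None):
        # A filter passed for one call takes precedence over the view's default.
        chosen = custom_filter if custom_filter is not None else self.default_filter
        return chosen.accept(entity)

view = View(Filter(min_visibility=PRIVATE))
helper = Entity("helper", visibility=PRIVATE)
print(view.accept(helper))                                 # uses the default filter
print(view.accept(helper, Filter(min_visibility=PUBLIC)))  # uses the temporary filter
```

Keeping the override as an optional argument means the view's default filter is never mutated by a one-off query.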

Pull-Request: #2567
Reviewed-by: Jean Privat <jean@pryen.org>

Alexandre Terrasa [Mon, 16 Oct 2017 23:53:43 +0000 (19:53 -0400)]
nitdoc: protect package access when ModelFilters allows fictive modules

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Sun, 22 Oct 2017 21:08:18 +0000 (17:08 -0400)]
tests: fix tests for model visitor

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 23:52:33 +0000 (19:52 -0400)]
tests: fix tests for model filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 23:44:13 +0000 (19:44 -0400)]
tests: fix tests for model index

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Sun, 22 Oct 2017 21:11:32 +0000 (17:11 -0400)]
nituml: use model filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 19 Oct 2017 00:23:53 +0000 (20:23 -0400)]
nitweb: use model filters

Also introduce new options to control filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 17 Aug 2017 20:07:11 +0000 (16:07 -0400)]
nitdoc: use model filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Tue, 7 Nov 2017 17:12:54 +0000 (12:12 -0500)]
nitmetrics: use model filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 10 Nov 2017 17:28:38 +0000 (12:28 -0500)]
model: json use filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Jean Privat [Wed, 8 Nov 2017 20:34:30 +0000 (15:34 -0500)]
Merge: NLP: More Natural Language Processing features

# Nit wrapper for Stanford CoreNLP

Stanford CoreNLP provides a set of natural language analysis tools which can take
raw text input and give the base forms of words, their parts of speech, whether
they are names of companies, people, etc., normalize dates, times, and numeric
quantities, and mark up the structure of sentences in terms of phrases and word
dependencies, indicate which noun phrases refer to the same entities, indicate
sentiment, etc.

This wrapper needs the Stanford CoreNLP jars that run on Java 1.8+.

See http://nlp.stanford.edu/software/corenlp.shtml.

## NLPProcessor

### Java client

~~~nit
var proc = new NLPProcessor("path/to/StanfordCoreNLP/jars")

var doc = proc.process("String to analyze")

for sentence in doc.sentences do
    for token in sentence.tokens do
        print "{token.lemma}: {token.pos}"
    end
end
~~~

### NLPServer

The NLPServer provides a wrapper around the StanfordCoreNLPServer.

See `https://stanfordnlp.github.io/CoreNLP/corenlp-server.html`.

~~~nit
var cp = "/path/to/StanfordCoreNLP/jars"
var srv = new NLPServer(cp, 9000)
srv.start
~~~

### NLPClient

The NLPClient is used as an NLPProcessor with an NLPServer backend.

~~~nit
var cli = new NLPClient("http://localhost:9000")
var doc = cli.process("String to analyze")
~~~

## NLPIndex

NLPIndex extends the StringIndex to use an NLPProcessor to tokenize, lemmatize and
tag the terms of a document.

~~~nit
var index = new NLPIndex(proc)

var d1 = index.index_string("Doc 1", "/uri/1", "this is a sample")
var d2 = index.index_string("Doc 2", "/uri/2", "this and this is another example")
assert index.documents.length == 2

var matches = index.match_string("this sample")
assert matches.first.document == d1
~~~

Pull-Request: #2566
Reviewed-by: Jean Privat <jean@pryen.org>

Jean Privat [Wed, 8 Nov 2017 20:34:29 +0000 (15:34 -0500)]
Merge: nitcc: add nfa transformation to remove epsilon

An NFA can be transformed into another NFA without epsilon-transitions.
This is mostly useless, but it can make large NFAs more readable.

Also adds some bug fixes.
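
The transformation can be illustrated with the textbook epsilon-closure construction. The Python below is a sketch under that assumption and is not taken from nitcc: the dictionary-based automaton encoding, the `EPS` marker and both function names are hypothetical.

```python
from collections import defaultdict

EPS = None  # label standing for an epsilon-transition in this sketch

def epsilon_closure(state, trans):
    """States reachable from `state` through epsilon-transitions only."""
    seen, stack = {state}, [state]
    while stack:
        current = stack.pop()
        for target in trans.get(current, {}).get(EPS, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def remove_epsilon(states, trans, accepting):
    """Return (transitions, accepting states) of an equivalent NFA without epsilon."""
    new_trans, new_accepting = {}, set()
    for state in states:
        closure = epsilon_closure(state, trans)
        # A state accepts if it can reach an accepting state for free.
        if closure & accepting:
            new_accepting.add(state)
        # It inherits every labelled transition of its closure.
        out = defaultdict(set)
        for member in closure:
            for label, targets in trans.get(member, {}).items():
                if label is not EPS:
                    out[label] |= set(targets)
        new_trans[state] = dict(out)
    return new_trans, new_accepting

# 0 -eps-> 1 -a-> 2 -eps-> 0, with 2 accepting: recognizes one or more "a".
trans = {0: {EPS: {1}}, 1: {"a": {2}}, 2: {EPS: {0}}}
new_trans, new_accepting = remove_epsilon({0, 1, 2}, trans, {2})
print(new_trans, new_accepting)
```

The sketch keeps every original state, so states that were only reachable through epsilon-transitions (state 1 here) remain in the result even though nothing leads to them any more.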

Pull-Request: #2573

Alexandre Terrasa [Thu, 17 Aug 2017 20:04:48 +0000 (16:04 -0400)]
model: views use filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Sun, 22 Oct 2017 21:06:04 +0000 (17:06 -0400)]
model: visitor uses filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 21:47:43 +0000 (17:47 -0400)]
model: introduce filters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 21:47:59 +0000 (17:47 -0400)]
model: tag groups as tests

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 6 Nov 2017 19:40:23 +0000 (14:40 -0500)]
contrib/rss_downloader: fix nullable usage for title and link

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Jean Privat [Fri, 27 Oct 2017 14:06:42 +0000 (10:06 -0400)]
Merge: model: ModelView uses mainmodule

ModelView uses a mainmodule to flatten mclass hierarchies.

Pull-Request: #2568

Jean Privat [Fri, 27 Oct 2017 14:05:50 +0000 (10:05 -0400)]
Merge: json::dynamic: extend `get` to support arrays and keys with dots

Extend `JsonValue::get` to support arrays and keys containing the '.' character.

As a general cleanup, remove services specific to parsing errors as clients should check errors only once, and update and standardize the documentation.

Pull-Request: #2558

Jean Privat [Fri, 27 Oct 2017 14:03:37 +0000 (10:03 -0400)]
Merge: popcorn: pop_test uses NIT_TESTING_ID to determine test port

This PR modifies the *pop-tests* so they use `NIT_TESTING_ID` to select the listening port.

This should avoid port conflicts when testing multiple PRs at the same time on Jenkins.

Let's try this!
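
The idea can be sketched as follows. This Python is an illustration only: the base port and the function name are hypothetical, and the actual mapping used by `pop_test` may differ.

```python
import os

BASE_PORT = 10000  # hypothetical base value, not taken from pop_test

def test_port(environ=os.environ):
    """Derive a listening port from NIT_TESTING_ID so parallel runs do not collide."""
    testing_id = int(environ.get("NIT_TESTING_ID", "0"))
    return BASE_PORT + testing_id

print(test_port({"NIT_TESTING_ID": "3"}))
```

Two test runs with different `NIT_TESTING_ID` values then listen on different ports, as long as the identifiers themselves are unique.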

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Pull-Request: #2571

Jean Privat [Fri, 27 Oct 2017 14:03:36 +0000 (10:03 -0400)]
Merge: nitunit: some fixes and improvements for the `before_all`, `after_all` annotations

This PR does three things:
* fixes the importation of `before_all` and `after_all` methods
* allows use of `before_all` and `after_all` methods within classes
* changes the test execution order for imported test suites

### Fix the importation of `before_all` and `after_all` methods

First of all, this PR fixes the behavior of nitunit when a test suite imports another test suite.

Consider two modules `a` and `b`:

~~~nit
module a is test

class TestA
    test

    # some test cases
end

fun setup is before_all do # important things
~~~

~~~nit
module b is test
import a

class TestB
    super TestA
    test
end
~~~

In this case, because `b` does not introduce any `before_all` method, the method from `a` was not executed.
This is fixed now.

### Class-level `before_all` and `after_all` methods

`before_all` and `after_all` can now be used inside a class definition to indicate methods that must be executed before / after all methods inside the class:

~~~nit
class TestC
    test

    fun setup is before_all do # something before all test cases of `TestC`

    # some test cases
end
~~~

### Tests execution order

Methods with `before*` and `after*` annotations are linearized and called in different ways.

* `before*` methods are called from the least specific to the most specific
* `after*` methods are called from the most specific to the least specific

~~~nit
module test_bdd_connector

import bdd_connector

# Testing the bdd_connector
class TestConnector
    test
    # test cases using a server
end

# Method executed before testing the module
fun setup_db is before_all do
    # start server before all test cases
end

# Method executed after testing the module
fun teardown_db is after_all do
    # stop server after all test cases
end
~~~

When dealing with multiple test suites, nitunit allows you to import other test suites to factorize your tests:

~~~nit
module test_bdd_users

import test_bdd_connector

# Testing the user table
class TestUsersTable
    test
    # test cases using the db server from `test_bdd_connector`
end

fun setup_table is before_all do
    # create user table
end

fun teardown_table is after_all do
    # drop user table
end
~~~

In the previous example, the execution order would be:

1. `test_bdd_connector::setup_db`
2. `test_bdd_users::setup_table`
3. all test cases from `test_bdd_users`
4. `test_bdd_users::teardown_table`
5. `test_bdd_connector::teardown_db`
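
The ordering rule above can be modelled in a few lines of Python. This is a sketch of the rule only, not of nitunit's code: the `run_suite` function and the dictionary layout are hypothetical.

```python
def run_suite(modules):
    """Return the call order; `modules` goes from least to most specific."""
    order = []
    # `before_all` hooks: least specific module first.
    for module in modules:
        order += [f"{module['name']}::{hook}" for hook in module["before_all"]]
    order.append(f"all test cases from {modules[-1]['name']}")
    # `after_all` hooks: most specific module first.
    for module in reversed(modules):
        order += [f"{module['name']}::{hook}" for hook in module["after_all"]]
    return order

suites = [
    {"name": "test_bdd_connector", "before_all": ["setup_db"], "after_all": ["teardown_db"]},
    {"name": "test_bdd_users", "before_all": ["setup_table"], "after_all": ["teardown_table"]},
]
for step in run_suite(suites):
    print(step)
```

Running the teardown hooks in reverse order mirrors the setup, so the user table is dropped before the server it lives on is stopped.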

Pull-Request: #2572

Jean Privat [Fri, 27 Oct 2017 13:17:10 +0000 (09:17 -0400)]
nitcc: generate more intermediate automaton

Signed-off-by: Jean Privat <jean@pryen.org>

Jean Privat [Fri, 27 Oct 2017 13:16:46 +0000 (09:16 -0400)]
nitcc: add transformation from a NFA to a epsilonless NFA

Signed-off-by: Jean Privat <jean@pryen.org>

Jean Privat [Fri, 27 Oct 2017 13:16:17 +0000 (09:16 -0400)]
nitcc: remove a truism warning

Signed-off-by: Jean Privat <jean@pryen.org>

Jean Privat [Fri, 27 Oct 2017 13:15:54 +0000 (09:15 -0400)]
nitcc: to_minimal_dfa is a little faster

Signed-off-by: Jean Privat <jean@pryen.org>

Jean Privat [Fri, 27 Oct 2017 13:10:27 +0000 (09:10 -0400)]
nitcc: a empty automaton has at least a start state, even if non-terminal

Signed-off-by: Jean Privat <jean@pryen.org>

Alexandre Terrasa [Thu, 26 Oct 2017 22:18:58 +0000 (18:18 -0400)]
nitunit: update documentation

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 26 Oct 2017 22:05:13 +0000 (18:05 -0400)]
nitunit: fix nitunits within the README

Most of them seem to be nitish since a lot of fake modules are used.

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 26 Oct 2017 21:44:11 +0000 (17:44 -0400)]
nitunit: linearize test execution

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 26 Oct 2017 21:14:05 +0000 (17:14 -0400)]
nitunit: introduce before/after class tests

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 26 Oct 2017 21:13:27 +0000 (17:13 -0400)]
nitunit: do not execute a before/after test twice

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 26 Oct 2017 20:37:34 +0000 (16:37 -0400)]
nitunit: fix `before-all` and `after-all` detection

Before this commit, nitunit did not look up annotations from parent modules
if no local definition of Sys was found.

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 23 Oct 2017 14:05:42 +0000 (10:05 -0400)]
popcorn: pop_test uses NIT_TESTING_ID to determine test port

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 19 Oct 2017 00:27:19 +0000 (20:27 -0400)]
model_json: update ModelView

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 03:18:55 +0000 (23:18 -0400)]
nitweb: update ModelView

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Jean Privat [Tue, 17 Oct 2017 20:09:26 +0000 (16:09 -0400)]
Merge: core: implement `Float::to_precision` in C without callbacks

Fix #2561, int overflows in `Float::to_precision` with a high float value or a high precision.

The native implementation was removed by 9cb09ccf to support the interpreter. Since then, we added support for the FFI in the interpreter so we can bring back a native implementation.
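
The failure mode can be illustrated with a small Python sketch. It assumes the overflowing version scaled the float by a power of ten into a fixed-width integer, which the message suggests but does not spell out; both function names are hypothetical, and the 64-bit bound is simulated because Python integers do not overflow.

```python
INT64_MAX = 2**63 - 1

def to_precision_scaled(value, decimals):
    """Integer-scaling approach (assumed): fails once the scaled value exceeds 64 bits."""
    scaled = round(value * 10**decimals)
    if abs(scaled) > INT64_MAX:
        raise OverflowError("scaled value does not fit in a 64-bit integer")
    whole, frac = divmod(abs(scaled), 10**decimals)
    sign = "-" if scaled < 0 else ""
    if decimals == 0:
        return f"{sign}{whole}"
    return f"{sign}{whole}.{frac:0{decimals}d}"

def to_precision_formatted(value, decimals):
    """Formatting approach: lets the float formatter produce the digits directly."""
    return f"{value:.{decimals}f}"

print(to_precision_scaled(1.5, 2))
print(to_precision_formatted(1e20, 2))
```

The second function never builds an intermediate integer, so neither a large value nor a large precision can push it past an integer limit.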

Pull-Request: #2562

Jean Privat [Tue, 17 Oct 2017 20:09:25 +0000 (16:09 -0400)]
Merge: markdown: merge MDProcessor and MDEmitter

The emitter was unnecessary.

Also did some cleanup.

Pull-Request: #2563

Jean Privat [Tue, 17 Oct 2017 20:09:24 +0000 (16:09 -0400)]
Merge: lib/config: fix doc

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Pull-Request: #2564

Jean Privat [Tue, 17 Oct 2017 20:09:23 +0000 (16:09 -0400)]
Merge: lib/markdown: fix `text` for nested markdown blocks

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Pull-Request: #2565

Alexandre Terrasa [Mon, 16 Oct 2017 23:08:44 +0000 (19:08 -0400)]
tests: update tests.sh

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 03:18:46 +0000 (23:18 -0400)]
nituml: update ModelView

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 03:18:31 +0000 (23:18 -0400)]
metrics: update ModelView

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 03:18:18 +0000 (23:18 -0400)]
nitdoc: update ModelView

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 03:17:59 +0000 (23:17 -0400)]
model_index: update ModeView

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 03:17:36 +0000 (23:17 -0400)]
model_collect: update ModelView

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Mon, 16 Oct 2017 03:17:03 +0000 (23:17 -0400)]
model_views: expect a mainmodule

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 19:18:23 +0000 (15:18 -0400)]
lib/nlp: provide more examples

And remove old example nitnlp

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 27 Sep 2017 03:06:20 +0000 (23:06 -0400)]
lib/nlp: combine nlp and vsm to create a search engine index

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 22 Sep 2017 20:36:42 +0000 (16:36 -0400)]
lib/nlp: avoid crash when reading token XML

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 22 Sep 2017 20:35:56 +0000 (16:35 -0400)]
lib/nlp: add wrapper to the web REST api

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 20 Sep 2017 22:06:37 +0000 (18:06 -0400)]
lib/nlp: extract NLPProcessor from the Java wrapper

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 20 Sep 2017 21:43:05 +0000 (17:43 -0400)]
lib/curl: allow raw body string in CurlHTTPRequest

So we can send something other than POST-formatted data.

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 22 Sep 2017 20:36:23 +0000 (16:36 -0400)]
lib/dom: avoid crash on empty tags data access

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 20 Sep 2017 21:38:49 +0000 (17:38 -0400)]
lib/dom: allow `xml` in tag name

I'm not sure what the reason for this restriction was; security concerns, maybe?
If so, I can turn it into an option.

But we want this parser to correctly parse well-formed XML documents.

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 12 Oct 2017 00:48:27 +0000 (20:48 -0400)]
lib/vsm: actually use tf.idf when matching documents

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 12 Oct 2017 00:47:50 +0000 (20:47 -0400)]
lib/vsm: access to non-existing keys return 0.0

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 12 Oct 2017 00:46:11 +0000 (20:46 -0400)]
lib/markdown: fix `text` for nested markdown blocks

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Thu, 12 Oct 2017 00:44:15 +0000 (20:44 -0400)]
lib/config: fix doc

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 11 Oct 2017 03:11:51 +0000 (23:11 -0400)]
lib/markdown: merge processor and emitter

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 11 Oct 2017 02:54:30 +0000 (22:54 -0400)]
lib/markdown: fix nitunits in README

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 11 Oct 2017 02:53:10 +0000 (22:53 -0400)]
lib/markdown: remove warnings

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Jean Privat [Tue, 10 Oct 2017 16:52:04 +0000 (12:52 -0400)]
Merge: model_collect: collect more things

A lot of new shortcuts.

Pull-Request: #2560

Jean Privat [Tue, 10 Oct 2017 16:52:03 +0000 (12:52 -0400)]
Merge: model: is_accessor

Add a way to associate attributes and their getters/setters.

This will be useful for filtering things and grouping properties together.

Pull-Request: #2559

Jean Privat [Tue, 10 Oct 2017 16:52:01 +0000 (12:52 -0400)]
Merge: Vector Space Model

# Vector Space Model

Vector Space Model (VSM) is an algebraic model for representing text documents
(and any objects, in general) as vectors of identifiers, such as, for example,
index terms.

It is used in information filtering, information retrieval, indexing and
relevancy rankings.

The `vsm` package provides the following features:
* Vector comparison with cosine similarity.
* Vector indexing and matching with tf * idf.
* File indexing and matching to free text queries.

## Vectors

With VSM, documents are represented by an n-dimensional vector.
Each dimension represents an attribute of the document or object.

For text documents, the count of each term found in the document is often used to
build vectors.
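
As an illustration of term counting, here is a minimal Python sketch. It is not the `vsm` package's tokenizer: the `term_vector` helper is hypothetical and splits on whitespace only.

```python
from collections import Counter

def term_vector(text):
    """Build a sparse term-count vector: one dimension per distinct term."""
    return Counter(text.lower().split())

vector = term_vector("this and this is another example")
print(vector["this"], vector["example"], vector["missing"])
```

A `Counter` reports a count of zero for terms it has never seen, which is convenient when two vectors do not share the same dimensions.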

### Creating a vector

~~~nit
var vector = new Vector
vector["term1"] = 2.0
vector["term2"] = 1.0
assert vector["term1"] == 2.0
assert vector["term2"] == 1.0
assert vector.norm.is_approx(2.236, 0.001)
~~~

### Comparing vectors

~~~nit
var v1 = new Vector
v1["term1"] = 1.0
v1["term2"] = 2.0

var v2 = new Vector
v2["term2"] = 1.0
v2["term3"] = 3.0

var query = new Vector
query["term2"] = 1.0

var s1 = query.cosine_similarity(v1)
var s2 = query.cosine_similarity(v2)
assert s1 > s2
~~~
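
For reference, cosine similarity is the dot product of two vectors divided by the product of their norms. The Python below recomputes the comparison above on plain dictionaries; it is a sketch with hypothetical helper names, not the package's `Vector` class.

```python
import math

def norm(vec):
    """Euclidean norm of a sparse vector stored as {dimension: weight}."""
    return math.sqrt(sum(w * w for w in vec.values()))

def cosine_similarity(a, b):
    """Dot product over the product of norms; 0.0 if either vector is empty."""
    na, nb = norm(a), norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    dot = sum(w * b.get(dim, 0.0) for dim, w in a.items())
    return dot / (na * nb)

v1 = {"term1": 1.0, "term2": 2.0}
v2 = {"term2": 1.0, "term3": 3.0}
query = {"term2": 1.0}

s1 = cosine_similarity(query, v1)
s2 = cosine_similarity(query, v2)
print(s1, s2)
```

`v1` scores higher because `term2` carries most of its weight, whereas in `v2` the same term is dominated by `term3`.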

## VSMIndex

VSMIndex is a Document index based on VSM.

Using VSMIndex you can index documents associated with their vector.
Documents can then be matched to query vectors.

This represents a minimalistic search engine.

~~~nit
var index = new VSMIndex

var d1 = new Document("Doc 1", "/uri/1", v1)
index.index_document(d1)

var d2 = new Document("Doc 2", "/uri/2", v2)
index.index_document(d2)

assert index.documents.length == 2

query = new Vector
query["term1"] = 1.0

var matches = index.match_vector(query)
assert matches.first.document == d1
~~~
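
The matching step can be approximated in Python as follows. This is a sketch, not the VSMIndex implementation: it assumes a smoothed idf formula and ranks documents by the cosine similarity of tf * idf weighted vectors, and all function names are hypothetical.

```python
import math

def idf(term, docs):
    """Smoothed inverse document frequency of `term` over the indexed vectors."""
    df = sum(1 for vec in docs.values() if vec.get(term, 0.0) > 0.0)
    return math.log((1 + len(docs)) / (1 + df)) + 1.0

def weigh(vec, docs):
    """Turn raw term counts into tf * idf weights."""
    return {term: tf * idf(term, docs) for term, tf in vec.items()}

def cosine(a, b):
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match(query, docs):
    """Return (score, document name) pairs, best match first."""
    weighted_query = weigh(query, docs)
    scored = [(cosine(weighted_query, weigh(vec, docs)), name)
              for name, vec in docs.items()]
    return sorted(scored, reverse=True)

docs = {
    "Doc 1": {"term1": 1.0, "term2": 2.0},
    "Doc 2": {"term2": 1.0, "term3": 3.0},
}
matches = match({"term1": 1.0}, docs)
print(matches[0][1])
```

Weighting by idf lowers the influence of terms that appear in every document, so a rare query term such as `term1` decides the ranking.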

## StringIndex

The StringIndex provides useful services to index and match strings.

~~~nit
index = new StringIndex

d1 = index.index_string("Doc 1", "/uri/1", "this is a sample")
d2 = index.index_string("Doc 2", "/uri/2", "this and this is another example")
assert index.documents.length == 2

matches = index.match_string("this sample")
assert matches.first.document == d1
~~~

## FileIndex

The FileIndex is a StringIndex able to index and retrieve files.

~~~nit
index = new FileIndex

index.index_files(["/path/to/doc/1", "/path/to/doc/2"])
~~~

Pull-Request: #2556
Reviewed-by: Jean Privat <jean@pryen.org>

Alexandre Terrasa [Fri, 6 Oct 2017 17:46:36 +0000 (13:46 -0400)]
nlp: use new vector representation

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexis Laferrière [Tue, 3 Oct 2017 14:53:59 +0000 (10:53 -0400)]
core: implement Float::to_precision in C without callbacks

Fix int overflows in Float::to_precision with a high float value
or a high precision.

The native implementation was removed by 9cb09ccf to support the
interpreter, which, at the time, did not support the FFI. Since then, we
added support for the FFI in the interpreter.

Signed-off-by: Alexis Laferrière <alexis.laf@xymus.net>

Alexandre Terrasa [Fri, 29 Sep 2017 21:32:16 +0000 (17:32 -0400)]
nitweb: use model_collect for definitions lists

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 21:31:58 +0000 (17:31 -0400)]
model_collect: collect more things

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 21:05:48 +0000 (17:05 -0400)]
model_collect: uniformize documentation

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 19:13:23 +0000 (15:13 -0400)]
tests: update tests for vsm

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 27 Sep 2017 02:42:39 +0000 (22:42 -0400)]
lib/vsm: add README

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 22 Sep 2017 20:37:19 +0000 (16:37 -0400)]
lib/vsm: introduce an indexing process based on VSM

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 20 Sep 2017 22:23:40 +0000 (18:23 -0400)]
lib/vsm: accept anything as a dimension

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Wed, 20 Sep 2017 22:23:09 +0000 (18:23 -0400)]
lib/nlp: move vsm.nit to its own package

We don't need nlp to use vsm

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 18:46:04 +0000 (14:46 -0400)]
model: remove a warning :p

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 18:45:54 +0000 (14:45 -0400)]
modelize: define attribute getters and setters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Alexandre Terrasa [Fri, 29 Sep 2017 18:45:34 +0000 (14:45 -0400)]
model: tag getters and setters

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Jean Privat [Thu, 28 Sep 2017 23:39:24 +0000 (19:39 -0400)]
Merge: lib: introduce `fca` a module for formal concept analysis

## Building a FormalContext

We use the example from https://en.wikipedia.org/wiki/Formal_concept_analysis:

~~~nit
var fc = new FormalContext[Int, String]
fc.set_object_attributes(1, ["odd", "square"])
fc.set_object_attributes(2, ["even", "prime"])
fc.set_object_attributes(3, ["odd", "prime"])
fc.set_object_attributes(4, ["even", "composite", "square"])
fc.set_object_attributes(5, ["odd", "prime"])
fc.set_object_attributes(6, ["even", "composite"])
fc.set_object_attributes(7, ["odd", "prime"])
fc.set_object_attributes(8, ["even", "composite"])
fc.set_object_attributes(9, ["odd", "square", "composite"])
fc.set_object_attributes(10, ["even", "composite"])
~~~

## Computing the set of FormalConcept

~~~nit
var concepts = fc.formal_concepts
for concept in concepts do
    print concept
end
~~~

## Visualizing formal concepts with ConceptLattice

~~~nit
var cl = new ConceptLattice[Int, String].from_concepts(concepts)
cl.show_dot
~~~
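
To make the notion of a formal concept concrete, the same context can be processed in Python. The sketch below uses the textbook construction (close the object intents under intersection, then derive each extent); it is not necessarily how `fca` computes concepts, and the function names are hypothetical.

```python
def extent(attrs, context):
    """Objects that have every attribute in `attrs`."""
    return frozenset(obj for obj, own in context.items() if attrs <= own)

def formal_concepts(context):
    """Return the set of (extent, intent) pairs of the context."""
    all_attrs = frozenset().union(*context.values())
    # Every concept intent is an intersection of object intents
    # (the full attribute set is the intersection of no object at all).
    intents = {all_attrs}
    for own in context.values():
        intents |= {intent & own for intent in intents}
    return {(extent(intent, context), intent) for intent in intents}

context = {
    1: {"odd", "square"},
    2: {"even", "prime"},
    3: {"odd", "prime"},
    4: {"even", "composite", "square"},
    5: {"odd", "prime"},
    6: {"even", "composite"},
    7: {"odd", "prime"},
    8: {"even", "composite"},
    9: {"odd", "square", "composite"},
    10: {"even", "composite"},
}
concepts = formal_concepts(context)
for objs, attrs in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(objs), sorted(attrs))
```

Each printed pair is maximal in both directions: the objects are exactly those sharing the listed attributes, and the attributes are exactly those shared by the listed objects.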

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Pull-Request: #2557
Reviewed-by: Jean Privat <jean@pryen.org>

Jean Privat [Thu, 28 Sep 2017 23:39:23 +0000 (19:39 -0400)]
Merge: .gitignore: ignore more vim files

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Pull-Request: #2555

Jean Privat [Thu, 28 Sep 2017 23:39:22 +0000 (19:39 -0400)]
Merge: model_visitor: reject is_before and is_after as they are tests

Signed-off-by: Alexandre Terrasa <alexandre@moz-code.org>

Pull-Request: #2554