Merge: Intro Codec
author Jean Privat <jean@pryen.org>
Thu, 20 Aug 2015 14:45:44 +0000 (10:45 -0400)
committer Jean Privat <jean@pryen.org>
Thu, 20 Aug 2015 14:45:44 +0000 (10:45 -0400)
commit a672b1d158eeb2cbf1dfb5316f6f37c558971832
tree a9ced0efcab76273fc9101ae668511102f6f5ed3
parent 97eccaa9928ce2edd3b5b0b3a832463ad157bf12
parent b07676fbfece3d34c9bd305e17d4d6b6a8cdc35a

Now that UTF-8 is part of Nit, the standard requires conforming implementations to properly handle borderline cases such as overlong sequences.

The codec defined here sanitizes input before Nit processes it, avoiding potential security [issues](https://www.owasp.org/index.php/Canonicalization,_locale_and_Unicode).
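
To illustrate the kind of sanitization meant here, the following is a minimal Python sketch, not the Nit codec from this pull request. The names `sanitize_utf8`, `sequence_length` and `MIN_CODE_POINT` are hypothetical, and replacing each malformed byte with U+FFFD is an assumed policy; the commit message does not say how the Nit codec reports or replaces bad input.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

# Smallest code point that legitimately needs a sequence of each length.
# A decoded value below this bound means the sequence is overlong.
MIN_CODE_POINT = {1: 0x00, 2: 0x80, 3: 0x800, 4: 0x10000}


def sequence_length(lead: int) -> int:
    """Return the sequence length announced by a lead byte, or 0 if invalid."""
    if lead < 0x80:
        return 1
    if 0xC0 <= lead < 0xE0:
        return 2
    if 0xE0 <= lead < 0xF0:
        return 3
    if 0xF0 <= lead < 0xF8:
        return 4
    return 0  # stray continuation byte or invalid lead byte


def sanitize_utf8(raw: bytes) -> str:
    """Decode raw as UTF-8, replacing each malformed byte with U+FFFD.

    Assumed policy (hypothetical, not taken from the Nit codec): overlong
    sequences, surrogates, truncated sequences and stray bytes are rejected.
    """
    out = []
    i = 0
    while i < len(raw):
        n = sequence_length(raw[i])
        chunk = raw[i:i + n]
        # Structural check: full length, and all trailing bytes are 10xxxxxx.
        ok = n > 0 and len(chunk) == n and all(b & 0xC0 == 0x80 for b in chunk[1:])
        cp = 0
        if ok:
            # Keep only the payload bits of the lead byte.
            cp = raw[i] if n == 1 else raw[i] & (0xFF >> (n + 1))
            for b in chunk[1:]:
                cp = (cp << 6) | (b & 0x3F)
            # Semantic check: not overlong, in range, not a surrogate.
            ok = (MIN_CODE_POINT[n] <= cp <= 0x10FFFF
                  and not 0xD800 <= cp <= 0xDFFF)
        if ok:
            out.append(chr(cp))
            i += n
        else:
            out.append(REPLACEMENT)
            i += 1
    return "".join(out)
```

For example, the bytes `C0 AF` are an overlong encoding of `/` (U+002F). A lenient decoder would turn them into a slash after any path or filter checks have already run on the raw bytes, which is the class of canonicalization issue the link above describes. With this sketch, `sanitize_utf8(b"a\xc0\xafb")` contains no `/`.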

The codec architecture can also be used later to handle other encodings for source files or text. The alternative is to decide that anything that is not UTF-8 is rejected or misinterpreted.

Pull-Request: #1628
Reviewed-by: Jean Privat <jean@pryen.org>
Reviewed-by: Alexandre Terrasa <alexandre@moz-code.org>
lib/standard/stream.nit