From: Jean Privat
Date: Thu, 10 Sep 2015 00:25:10 +0000 (-0400)
Subject: Merge: UTF-8 Regex
X-Git-Tag: v0.7.8~37
X-Git-Url: http://nitlanguage.org

Merge: UTF-8 Regex

This PR closes #1684

Instead of making `byte_to_char_index` public, it has been removed, as it had no real reason to exist.

Names have been corrected and should now reflect their use.

Some examples of regular expressions with UTF-8 have been included. Note, however, that the underlying C library does not have UTF-8 semantics. When using repetition operators on UTF-8 strings, enclose the problematic (multi-byte) characters in parentheses, as in the example; otherwise the result will be erroneous.

Additionally, performance should be a bit better, since fewer allocations and `copy_to` calls should be needed.

Pull-Request: #1692
Reviewed-by: Jean Privat
Reviewed-by: Alexis Laferrière

---

afb92589df321a71b53cdf19336e5cc306e40db8
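
The caveat about repetition operators can be illustrated with a small sketch. It is not the Nit API and not the C library used by this PR; it uses Python's `re` module on `bytes` patterns as a stand-in for a byte-oriented regex engine, and the helper name `matched_bytes` is hypothetical. It shows why a repetition operator placed after a multi-byte character only repeats that character's last byte unless the character is wrapped in parentheses.

```python
import re

# U+00E9 (e with acute accent) is encoded as two bytes in UTF-8: 0xC3 0xA9.
E_ACUTE = "\u00e9"


def matched_bytes(pattern: str, text: str) -> int:
    """Return how many bytes are matched at the start of `text`, or -1.

    Both arguments are encoded to UTF-8 first, so the engine works on
    raw bytes, assumed here to mimic a regex library without UTF-8 semantics.
    """
    m = re.match(pattern.encode("utf-8"), text.encode("utf-8"))
    return m.end() if m else -1


text = E_ACUTE * 3  # 3 characters, 6 bytes

# Without parentheses, `+` applies only to the trailing byte 0xA9,
# so the match stops after the first character (2 bytes).
naive = matched_bytes(E_ACUTE + "+", text)

# With parentheses, `+` applies to the whole two-byte sequence,
# so all three characters (6 bytes) are matched.
grouped = matched_bytes("(" + E_ACUTE + ")+", text)

print(naive, grouped)
```

In this sketch the ungrouped pattern matches only the first character, while the grouped pattern consumes the whole string, which is the behavior the parentheses advice is meant to guarantee.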