utf8lib¶

char¶

function utf8.char(code: integer, ...integer)
  -> string

Receives zero or more integers, converts each one to its corresponding UTF-8 byte sequence and returns a string with the concatenation of all these sequences.
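As a minimal sketch of the behaviour described above (the code points 72, 228 and 8364 are chosen here purely for illustration), each integer becomes its own UTF-8 byte sequence and the results are concatenated:

```lua
-- Illustrative code points: U+0048 "H", U+00E4 "ä", U+20AC "€".
local s = utf8.char(72, 228, 8364)

-- "H" takes 1 byte, "ä" takes 2 bytes, "€" takes 3 bytes.
print(#s)  -- byte length, not character count

-- With no arguments the result is the empty string.
local empty = utf8.char()
```

Note that `#s` counts bytes, so it is larger than the number of characters whenever a code point is above 127.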


charpattern¶

string

A Lua pattern (a string, not a function) that matches exactly one UTF-8 byte sequence, assuming that the subject is a valid UTF-8 string.
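One common use, sketched here with an illustrative sample string, is to pass the pattern to `string.gmatch` to split a valid UTF-8 string into its characters:

```lua
-- Sample string built with \u{...} escapes: "a", U+00E9, U+20AC.
local s = "a\u{E9}\u{20AC}"

-- Collect each UTF-8 byte sequence as a separate substring.
local chars = {}
for ch in s:gmatch(utf8.charpattern) do
  chars[#chars + 1] = ch
end

print(#chars)  -- number of characters found
```

Because the pattern assumes valid input, this approach is only appropriate for strings already known to be well-formed UTF-8.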


codepoint¶

function utf8.codepoint(s: string, i?: integer, j?: integer, lax?: boolean)
  -> code: integer
  2. ...integer

Returns the code points (as integers) of all characters in s that start between byte positions i and j (both inclusive). The default for i is 1 and for j is i. It raises an error if it meets any invalid byte sequence.
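The following sketch, using an illustrative sample string, shows how the byte range controls which code points come back. Positions are byte positions, not character indices:

```lua
-- Sample string: "a" (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes).
local s = "a\u{E9}\u{20AC}"

-- With no range, only the character starting at byte 1 is decoded.
local first = utf8.codepoint(s)

-- A range of 1 to -1 covers every character in the string.
local a, b, c = utf8.codepoint(s, 1, -1)

-- Byte 2 is where the second character starts.
local second = utf8.codepoint(s, 2)

-- An invalid byte sequence raises an error, so pcall reports failure.
local ok = pcall(utf8.codepoint, "\xFF")
```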


codes¶

function utf8.codes(s: string, lax?: boolean)
  -> fun(s: string, p: integer):integer, integer

Returns values so that the construction

for p, c in utf8.codes(s) do
    body
end

will iterate over all UTF-8 characters in string s, with p being the position (in bytes) and c the code point of each character. It raises an error if it meets any invalid byte sequence.
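A concrete instance of the loop above might look like the following sketch. The sample string and the two result tables are illustrative names, not part of the library:

```lua
-- Sample string: "a" (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes).
local s = "a\u{E9}\u{20AC}"

local positions, codepoints = {}, {}
for p, c in utf8.codes(s) do
  positions[#positions + 1] = p    -- byte position where the character starts
  codepoints[#codepoints + 1] = c  -- decoded code point
end

-- Iterating over a string with an invalid byte raises an error.
local ok = pcall(function()
  for _ in utf8.codes("a\xFF") do end
end)
```

The positions advance by the byte width of each character, which is why they are not consecutive for non-ASCII text.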


len¶

function utf8.len(s: string, i?: integer, j?: integer, lax?: boolean)
  -> integer?
  2. errpos: integer?

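Returns the number of UTF-8 characters in string s that start between byte positions i and j (both inclusive). The default for i is 1 and for j is -1. If it finds any invalid byte sequence, it returns nil plus the position of the first invalid byte (errpos).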
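The sketch below, with illustrative sample strings, contrasts the character count with the byte length and shows the two-value result for malformed input:

```lua
-- Sample string: "a" (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes).
local s = "a\u{E9}\u{20AC}"

local nchars = utf8.len(s)      -- characters in the whole string
local nbytes = #s               -- bytes in the whole string
local from2  = utf8.len(s, 2)   -- characters starting at byte 2 or later

-- "\xFF" is never valid in UTF-8, so the count fails at byte 3.
local n, errpos = utf8.len("ab\xFFcd")
```

Checking the first result for nil is a simple way to validate that a string is well-formed UTF-8.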


offset¶

function utf8.offset(s: string, n: integer, i?: integer)
  -> p: integer

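Returns the position (in bytes) where the encoding of the n-th character of s (counting from byte position i) starts. A negative n counts characters before position i. The default for i is 1 when n is non-negative and #s + 1 otherwise, so that utf8.offset(s, -n) gets the offset of the n-th character from the end of the string. As a special case, when n is 0 the function returns the start of the encoding of the character that contains the i-th byte of s. If the specified character is neither in the subject nor right after its end, the function returns nil.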
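As a sketch of how character indices map to byte positions (the sample string is illustrative), the offsets can be combined with `string.sub` to extract a single character:

```lua
-- Sample string: "a" (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes).
local s = "a\u{E9}\u{20AC}"

local p1 = utf8.offset(s, 1)     -- start of the 1st character
local p2 = utf8.offset(s, 2)     -- start of the 2nd character
local p3 = utf8.offset(s, 3)     -- start of the 3rd character
local pend = utf8.offset(s, 4)   -- one past the last byte, i.e. #s + 1
local last = utf8.offset(s, -1)  -- start of the last character

-- Extract the 2nd character by slicing between two offsets.
local second = s:sub(p2, p3 - 1)
```

Asking for the character just past the end is allowed and yields `#s + 1`, which makes the slicing idiom above work for the final character as well.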
