lights FAQ Forum   github.com/luapower/libunibreak
libunibreak

Unicode
utf8
ucdn
libunibreak
harfbuzz
fribidi

libunibreak

unicode line breaking


local ub = require'libunibreak'

A ffi binding to libunibreak, a C library implementing the unicode line breaking algorithm and word breaking from unicode text segmentation.

Line breaking

ub.linebreaks_utf8(s[,size[,lang]]) -> line_breaks
ub.linebreaks_utf16(s[,size[,lang]]) -> line_breaks
ub.linebreaks_utf32(s[,size[,lang]]) -> line_breaks

The returned line_breaks is a 0-based array of flags, one for each byte of the input string:

0 Break is mandatory.
1 Break is allowed.
2 No break is possible.
3 A UTF-8/16 sequence is unfinished.

Word breaking

ub.wordbreaks_utf8(s[,size[,lang]]) -> word_breaks
ub.wordbreaks_utf16(s[,size[,lang]]) -> word_breaks
ub.wordbreaks_utf32(s[,size[,lang]]) -> word_breaks

The returned word_breaks is a 0-based array of flags, one for each byte of the input string:

0 Break is allowed.
1 No break is allowed.
2 A UTF-8/16 sequence is unfinished.

Unicode helpers

ub.chars_utf8(s) -> iter() -> i, codepoint
ub.chars_utf16(s) -> iter() -> i, codepoint
ub.chars_utf32(s) -> iter() -> i, codepoint

Iterate codepoints.

ub.len_utf8(s[,size]) -> len
ub.len_utf16(s[,size]) -> len
ub.len_utf32(s[,size]) -> len

Get the number of codepoints in string.


Last updated: 2 years ago | Edit on GitHub

Pkg type:Lua+ffi
Version: r2-22-g3872ab9
Last commit:
License: ZLIB/LIBPNG
Import ver: 1.0
Requires: luajit 
Required by: none

Top