Applications often need to manipulate sets of characters, such as the set of alphabetic characters or the set of whitespace characters. The alphabet abstraction provides an efficient implementation of sets of Unicode code points.
Returns a Unicode alphabet containing the wide characters passed as arguments.
Returns a Unicode alphabet containing the code points described by items. Items must satisfy well-formed-code-points-list?.
Returns a well-formed code-points list that describes the code points represented by alphabet.
Returns #t if object is a well-formed code-points list, otherwise returns #f. A well-formed code-points list is a proper list, each element of which is either a code point or a pair of code points. A pair of code points represents a contiguous range of code points. The car of the pair is the lower limit, and the cdr is the upper limit. Both limits are inclusive, and the lower limit must be strictly less than the upper limit.
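The rules above can be restated as a small predicate. The following is only an illustrative sketch in portable Scheme, not the system's own definition: the names sketch-code-point?, sketch-range?, and sketch-well-formed-code-points-list? are hypothetical, and the sketch assumes that a code point is an exact integer between 0 and #x10FFFF inclusive.

```scheme
;; Assumption: a code point is an exact integer in 0..#x10FFFF.
;; All names beginning with "sketch-" are hypothetical helpers.
(define (sketch-code-point? object)
  (and (integer? object)
       (exact? object)
       (<= 0 object #x10FFFF)))

;; A range is a pair (lower . upper) of code points with lower < upper.
(define (sketch-range? object)
  (and (pair? object)
       (sketch-code-point? (car object))
       (sketch-code-point? (cdr object))
       (< (car object) (cdr object))))

;; A well-formed list is a proper list whose elements are all
;; code points or ranges.
(define (sketch-well-formed-code-points-list? object)
  (and (list? object)
       (let loop ((items object))
         (or (null? items)
             (and (or (sketch-code-point? (car items))
                      (sketch-range? (car items)))
                  (loop (cdr items)))))))
```

Under these assumptions, the list (65 (97 . 122)) is accepted, while ((122 . 97)) and ((65 . 65)) are rejected because the lower limit is not strictly less than the upper limit.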
Returns #t if char is a member of alphabet, otherwise returns #f.
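To make the membership semantics concrete, here is a hedged sketch that tests a character against a code-points list directly rather than against an alphabet object. The name sketch-char-in-code-points? is hypothetical, and the sketch assumes that char->integer returns the character's Unicode code point. It models the meaning of membership only and says nothing about how alphabets are represented internally.

```scheme
;; Hypothetical helper: is CHAR's code point described by ITEMS,
;; a well-formed code-points list?
;; Assumption: char->integer yields the Unicode code point.
(define (sketch-char-in-code-points? char items)
  (let ((cp (char->integer char)))
    (let loop ((items items))
      (and (pair? items)
           (let ((item (car items)))
             (or (if (pair? item)
                     ;; Range: both limits are inclusive.
                     (<= (car item) cp (cdr item))
                     ;; Single code point.
                     (= item cp))
                 (loop (cdr items))))))))
```

With the list (65 (97 . 122)), the characters #\A and #\m are members under these assumptions, and #\B is not.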
Character sets and alphabets can be converted to one another, provided that the alphabet contains only 8-bit code points. The conversion works because 8-bit code points in Unicode map directly to ISO-8859-1 characters, which are what character sets contain.
Returns a Unicode alphabet containing the code points that correspond to characters that are members of char-set.
Returns a character set containing the characters that correspond to 8-bit code points that are members of alphabet. (Code points outside the 8-bit range are ignored.)
Returns a Unicode alphabet containing the code points corresponding to the characters in string. Equivalent to
(char-set->alphabet (string->char-set string))
Returns a newly allocated string containing the characters corresponding to the 8-bit code points in alphabet. (Code points outside the 8-bit range are ignored.)
Returns #t if alphabet contains only 8-bit code points, otherwise returns #f.