5.6 Character Sets

MIT/GNU Scheme's character-set abstraction is used to represent groups of characters, such as letters or digits. Character sets may contain only ISO-8859-1 characters; use the alphabet abstraction (see Unicode) if you need to cover the entire Unicode range.

— procedure: char-set? object

Returns #t if object is a character set; otherwise returns #f.

— variable: char-set:upper-case
— variable: char-set:lower-case
— variable: char-set:alphabetic
— variable: char-set:numeric
— variable: char-set:alphanumeric
— variable: char-set:whitespace
— variable: char-set:not-whitespace
— variable: char-set:graphic
— variable: char-set:not-graphic
— variable: char-set:standard

These variables contain predefined character sets. To see the contents of one of these sets, use char-set-members.
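
As a brief illustration, the following sketch inspects a few of the predefined sets. It uses only char-set-members and char-set-member?, both documented later in this section. The name digits is introduced here purely for illustration, and the comments state what the descriptions in this section lead one to expect rather than output captured from a particular release.

```scheme
;; List the members of a small predefined set.
;; `digits' is an illustrative name, not part of the library.
(define digits (char-set-members char-set:numeric))

;; By the description in this section, `digits' should hold the
;; ten decimal digits.
(length digits)

;; Membership tests against predefined sets.
(char-set-member? char-set:whitespace #\space) ; #\space is whitespace
(char-set-member? char-set:graphic #\tab)      ; #\tab is not graphic
```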

Alphabetic characters are the 52 upper and lower case letters. Numeric characters are the 10 decimal digits. Alphanumeric characters are those in the union of these two sets. Whitespace characters are #\space, #\tab, #\page, #\linefeed, and #\return. Graphic characters are the printing characters and #\space. Standard characters are the printing characters, #\space, and #\newline. These are the printing characters:

          ! " # $ % & ' ( ) * + , - . /
          0 1 2 3 4 5 6 7 8 9
          : ; < = > ? @
          A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
          [ \ ] ^ _ `
          a b c d e f g h i j k l m n o p q r s t u v w x y z
          { | } ~

— procedure: char-upper-case? char
— procedure: char-lower-case? char
— procedure: char-alphabetic? char
— procedure: char-numeric? char
— procedure: char-alphanumeric? char
— procedure: char-whitespace? char
— procedure: char-graphic? char
— procedure: char-standard? object

Each of these predicates returns #t if its argument is a member of the corresponding character set described above; otherwise it returns #f.
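
The predicates are convenient wherever a one-argument test on characters is needed. The sketch below counts the characters of a string that satisfy a given predicate; count-matching is a hypothetical helper written for this example, not a library procedure.

```scheme
;; Hypothetical helper: count the characters in STR that satisfy PRED.
(define (count-matching pred str)
  (let loop ((chars (string->list str)) (n 0))
    (cond ((null? chars) n)
          ((pred (car chars)) (loop (cdr chars) (+ n 1)))
          (else (loop (cdr chars) n)))))

(count-matching char-numeric? "R2-D2")     ; two digits
(count-matching char-whitespace? "a b c")  ; two spaces
```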

— procedure: char-set-members char-set

Returns a newly allocated list of the characters in char-set.

— procedure: char-set-member? char-set char

Returns #t if char is in char-set; otherwise returns #f.

— procedure: char-set=? char-set-1 char-set-2

Returns #t if char-set-1 and char-set-2 contain exactly the same characters; otherwise returns #f.
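
A short sketch of membership and equality tests follows. It builds its sets with char-set and string->char-set, which are described later in this section; vowels is an illustrative name. Note that char-set-member? takes the set first and the character second.

```scheme
;; `vowels' is an illustrative name.
(define vowels (char-set #\a #\e #\i #\o #\u))

;; The character set is the first argument, the character the second.
(char-set-member? vowels #\e)   ; #\e is a member
(char-set-member? vowels #\z)   ; #\z is not

;; Equality depends only on which characters are present, not on
;; the order in which they were supplied.
(char-set=? vowels (string->char-set "uoiea"))
```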

— procedure: char-set char ...

Returns a character set consisting of the specified ISO-8859-1 characters. With no arguments, char-set returns an empty character set.

— procedure: chars->char-set chars

Returns a character set consisting of chars, which must be a list of ISO-8859-1 characters. This is equivalent to (apply char-set chars).

— procedure: string->char-set string

Returns a character set consisting of all the characters that occur in string.
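
The three constructors above differ only in how the characters are supplied. The following sketch, with illustrative names a, b, and c, builds the same set three ways and compares the results with char-set=?.

```scheme
;; Three ways to build the set containing #\x, #\y, and #\z.
(define a (char-set #\x #\y #\z))
(define b (chars->char-set (list #\x #\y #\z)))
(define c (string->char-set "zyxzyx"))  ; repeated characters count once

(char-set=? a b)
(char-set=? a c)

;; With no arguments, char-set returns an empty set, which has no members.
(char-set-members (char-set))
```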

— procedure: ascii-range->char-set lower upper

Lower and upper must be exact non-negative integers representing ISO-8859-1 character codes, and lower must be less than or equal to upper. This procedure creates and returns a new character set consisting of the characters whose ISO-8859-1 codes are between lower (inclusive) and upper (exclusive).

For historical reasons, the name of this procedure refers to “ASCII” rather than “ISO-8859-1”.
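
Because upper is exclusive, a range that should include a given character must end one past that character's code. A minimal sketch, using the illustrative name digit-range and computing the codes with char->integer instead of hard-coding them:

```scheme
;; Build the set of characters from #\0 through #\9.
;; The upper bound is exclusive, so add 1 to the code of #\9.
(define digit-range
  (ascii-range->char-set (char->integer #\0)
                         (+ (char->integer #\9) 1)))

(char-set-member? digit-range #\9)  ; last code inside the range
(char-set-member? digit-range #\:)  ; code just past #\9, excluded
```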

— procedure: predicate->char-set predicate

Predicate must be a procedure of one argument. predicate->char-set creates and returns a character set consisting of the ISO-8859-1 characters for which predicate is true.
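
For example, a set of hexadecimal digits could be derived from a predicate as sketched below. Both hex-digit? and hex-digits are hypothetical names introduced for this example.

```scheme
;; Hypothetical predicate: true for 0-9, a-f, and A-F.
(define (hex-digit? char)
  (if (memv (char-downcase char)
            (string->list "0123456789abcdef"))
      #t
      #f))

;; Collect every ISO-8859-1 character that satisfies the predicate.
(define hex-digits (predicate->char-set hex-digit?))

(char-set-member? hex-digits #\F)  ; a hexadecimal digit
(char-set-member? hex-digits #\g)  ; not a hexadecimal digit
```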

— procedure: char-set-difference char-set1 char-set2

Returns a character set consisting of the characters that are in char-set1 but not in char-set2.

— procedure: char-set-intersection char-set ...

Returns a character set consisting of the characters that are in all of the char-sets.

— procedure: char-set-union char-set ...

Returns a character set consisting of the characters that are in at least one of the char-sets.

— procedure: char-set-invert char-set

Returns a character set consisting of the ISO-8859-1 characters that are not in char-set.
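
The set operations above can be checked against each other with char-set=?. The following is a minimal sketch; letters, vowels, and non-vowels are illustrative names, and the comments describe the results that follow from the definitions in this section.

```scheme
(define letters (string->char-set "abcdef"))
(define vowels  (string->char-set "aeiou"))

;; In letters but not in vowels: b, c, d, f.
(char-set=? (char-set-difference letters vowels)
            (string->char-set "bcdf"))

;; In both sets: a, e.
(char-set=? (char-set-intersection letters vowels)
            (string->char-set "ae"))

;; In at least one of the sets.
(char-set=? (char-set-union letters vowels)
            (string->char-set "abcdefiou"))

;; The complement contains every ISO-8859-1 character except the vowels.
(define non-vowels (char-set-invert vowels))
(char-set-member? non-vowels #\b)  ; not a vowel, so a member
(char-set-member? non-vowels #\a)  ; a vowel, so not a member

;; Inverting twice gives back the original set.
(char-set=? (char-set-invert non-vowels) vowels)
```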