GROFF_CHAR(7) GROFF_CHAR(7)

NAME

groff_char - groff glyph names

DESCRIPTION

This manual page lists the standard groff glyph names and the default input mapping, Latin-1. The glyphs in this document look different depending on which output device was chosen (with option -T for the man(1) program or the roff formatter). Glyphs not available for the device used to print or view this manual page are marked with `(N/A)'.

In the current version, groff provides only 8-bit characters for direct input and named entities for further glyphs. On ASCII platforms, input character codes in the range 0 to 127 (decimal) represent the usual 7-bit ASCII characters, while codes between 128 and 255 are interpreted as the corresponding characters in the Latin-1 (ISO-8859-1) code set by default. This mapping is contained in the file latin1.tmac and can be changed by loading a different input encoding. Note that some of the input characters are reserved by groff, either for internal use or for special input purposes.

On EBCDIC platforms, only code page cp1047 is supported (it contains the same characters as Latin-1; the input encoding file is called cp1047.tmac). Again, some input characters are reserved for internal and special purposes.

It is rather straightforward (for the experienced user) to set up other 8-bit encodings like Latin-2; since groff will use Unicode in the next major version, no additional encodings are provided.

All roff systems provide the concept of named glyphs. In traditional roff systems, only names of length 2 were used, while groff also supports longer names. It is strongly suggested that only named glyphs be used for all character representations outside of the printable 7-bit ASCII range.

Some of the predefined groff escape sequences (with names of length 1) also produce single characters; these exist for historical reasons or are printable versions of syntactical characters. They include `\\', `\'', `\`', `\-', `\.', and `\e'; see groff(7).

In groff, all of these different types of characters and glyphs can be tested positively with the `.if c' conditional.
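
As a minimal sketch of such a test, the following input prints a bullet glyph when the output device provides one and falls back to an ASCII asterisk otherwise. The choice of `\[bu]' and the fallback text are illustrative assumptions, not something this page prescribes.

```roff
.\" Use the named glyph if the current device has it.
.ie c \[bu] \[bu] first item
.\" Otherwise fall back to a plain ASCII character.
.el * first item
```

The `.ie'/`.el' pair is used here instead of a bare `.if' so that exactly one of the two lines is output.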

REFERENCE

In this section, the glyphs in groff are specified in tabular form. The meaning of the columns is as follows.

Output shows how the glyph is printed for the current device; although this can have quite a different shape on other devices, it always represents the same glyph.

Input name specifies how the glyph is input either directly by a key on the keyboard, or by a groff escape sequence.

Input code applies to glyphs which can be input with a single character, and gives the ISO Latin-1 decimal code of that input character. Note that this code is equivalent to the lowest 256 Unicode characters, including 7-bit ASCII in the range 0 to 127.

PostScript name gives the usual PostScript name of the glyph.

Unicode decomposed is the glyph name used in composite glyph names.

7-bit Character Codes 32-126

These are the basic glyphs having 7-bit ASCII code values assigned. They are identical to the printable characters of the character standards ISO-8859-1 (Latin-1) and Unicode (range C0 Controls and Basic Latin). The glyph names used in composite glyph names are `u0020' up to `u007E'.

Note that input characters in the range 0-31 and character 127 are not printable characters. Most of them are invalid input characters for groff anyway, and the valid ones have special meaning. For EBCDIC, the printable characters are in the range 66-255.

48-57 Decimal digits 0 to 9 (print as themselves).

65-90 Upper case letters A-Z (print as themselves).

97-122 Lower case letters a-z (print as themselves).

Most of the remaining characters not in the just described ranges print as themselves; the only exceptions are the following characters:

` the ISO Latin-1 `Grave Accent' (code 96) prints as `, a left single quotation mark; the original character can be obtained with `\`'.

' the ISO Latin-1 `Apostrophe' (code 39) prints as ', a right single quotation mark; the original character can be obtained with `\(aq'.

- the ISO Latin-1 `Hyphen, Minus Sign' (code 45) prints as a hyphen; a minus sign can be obtained with `\-'.

~ the ISO Latin-1 `Tilde' (code 126) is reduced in size to be usable as a diacritic; a larger glyph can be obtained with `\(ti'.

^ the ISO Latin-1 `Circumflex Accent' (code 94) is reduced in size to be usable as a diacritic; a larger glyph can be obtained with `\(ha'.
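
The exceptions listed above can be compared side by side with a short input fragment. This is only a sketch; the sample words are arbitrary, and the visible difference between the two output lines depends on the output device.

```roff
.\" Unescaped input: typeset as quotation marks, a hyphen, and small accents.
`quoted', well-known, ~ and ^
.br
.\" Escaped input: grave accent, apostrophe, minus sign, and the
.\" larger tilde and circumflex glyphs.
\`, \(aq, \-, \(ti and \(ha
```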

Output    Input name    Input code    PostScript name    Unicode decomposed    Notes

8-bit Character Codes 160 to 255

These codes are interpreted as printable characters according to the Latin-1 (ISO-8859-1) code set, being identical to the Unicode range C1 Controls and Latin-1 Supplement.

Input characters in the range 128-159 (on non-EBCDIC hosts) are not printable characters.

160 the ISO Latin-1 no-break space is mapped to `\~', the stretchable space character.

173 the soft hyphen control character. groff never uses this character for output (thus it is omitted in the table below); the input character 173 is mapped onto `\%'.

The remaining ranges (161-172, 174-255) are printable characters that print as themselves. Although they can be specified directly with the keyboard on systems with a Latin-1 code page, it is better to use their glyph names; see the next section.
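
The following fragment sketches what this recommendation looks like in practice. It assumes the usual groff glyph names `\[:u]', `\[:o]', and `\[ss]' for u dieresis, o dieresis, and sharp s; the sample text is arbitrary.

```roff
.\" Glyph names keep the input file in 7-bit ASCII.
Gr\[:u]\[ss]e aus K\[:o]ln
.br
.\" \~ is what input character 160 maps to; \% is what 173 maps to.
5\~km, hy\%phen\%ation
```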

Output    Input name    Input code    PostScript name    Unicode decomposed    Notes

Named Glyphs

Glyph names can be embedded into the document text by using escape sequences. groff(7) describes how these escape sequences look. Glyph names can consist of quite arbitrary characters from the ASCII or Latin-1 code set, not only alphanumeric characters. Here are some examples:

\c A glyph having the name c, which consists of a single character (length 1).

\(ch A glyph having the 2-character name ch.

\[char_name] A glyph having the name char_name (having length 1, 2, 3, ...).

\[base_glyph composite_1 composite_2 ...] A composite glyph; see below for a more detailed description.

In groff, each 8-bit input character can also be referred to by the construct `\[charn]', where n is the decimal code of the character, a number between 0 and 255 without leading zeros (those entities are not glyph names). They are normally mapped onto glyphs using the .trin request. Another special convention is the handling of glyphs with names directly derived from a Unicode code point; this is discussed below. Moreover, new glyph names can be created by the .char request; see groff(7).
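
The escape forms described in this section might be used as in the sketch below. The glyph name `mylogo' and its replacement text are hypothetical and serve only to illustrate the .char request; whether the composite glyph is actually available depends on the font and output device.

```roff
.\" Name of length 1, 2-character name, and bracket form.
\-
\(em
\[em]
.\" Composite glyph: base glyph e plus the acute accent glyph aa.
\[e aa]
.\" Input character 233 (e acute in Latin-1), given by its decimal code;
.\" note that no space separates `char' from the number.
\[char233]
.\" Define a new glyph name (hypothetical) and use it.
.char \[mylogo] ACME
\[mylogo]
```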

In the following, a plus sign in the `Notes' column indicates that this particular glyph name appears in the PS version of the original troff documentation, CSTR 54.

Output    Input name    PostScript name    Unicode decomposed    Notes

Ligatures and Other Latin Glyphs

Accented Characters

Accents

The composite request is used to map most of the accents to non-spacing glyph names; the values given in parentheses are the original (spacing) ones.
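
A single mapping of this kind can be sketched as follows, using the ogonek as an example. The standard macro files may well define this mapping already, so the request is shown here only to illustrate the mechanism.

```roff
.\" Map the spacing ogonek glyph name `ho' to its non-spacing
.\" counterpart, U+0328 COMBINING OGONEK.
.composite ho u0328
.\" With this mapping in effect, the composite below is looked up
.\" under the glyph name u0041_0328.
\[A ho]
```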

6 February 2006 Groff Version 1.19.2