Unicode typefaces
Unicode typefaces (also known as UCS fonts and Unicode fonts) are typefaces that contain glyphs for a wide range of characters (letters, digits, symbols, ideograms, logograms, and so on) drawn from many languages and scripts around the world and mapped to the Universal Character Set (UCS). Unlike most conventional computer fonts, which are specific to a particular language or legacy character set and cover only a small subset of the UCS, these fonts attempt to include many thousands of glyphs so that a single typeface can be used across multilingual documents.
The Unicode standard does not specify the typeface (a collection of graphical shapes called glyphs). Instead, it assigns each abstract character a number (known as a code point) and defines how shapes must change with context (e.g., combining characters, precomposed characters and letter-diacritic combinations). The choice of font, which governs how the abstract UCS characters are converted into bitmap or vector output that can be viewed on a screen or printed, is left to the user. If the chosen font does not contain a glyph for a code point used in the document, a question mark ("?"), a box, or some other substitute character is typically displayed.
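The substitution behaviour described above can be sketched in a few lines of Python. This is only an illustration: the `covered` set stands in for a font's character map, `render_with_fallback` is a hypothetical helper rather than any real rendering API, and U+FFFD is just one possible substitute character.

```python
# Hypothetical model: a font is represented only by the set of code points
# for which it has glyphs. A real renderer would read this from the font file.
SUBSTITUTE = "\uFFFD"  # REPLACEMENT CHARACTER, one possible choice


def render_with_fallback(text, covered, substitute=SUBSTITUTE):
    """Return text with every character the font cannot draw replaced."""
    return "".join(ch if ord(ch) in covered else substitute for ch in text)


# Assumed coverage for the example: printable ASCII only.
latin_only = set(range(0x20, 0x7F))

# U+00E9 and U+9AA8 are outside the assumed coverage, so both are replaced.
print(render_with_fallback("caf\u00e9 \u9aa8", latin_only, substitute="?"))
# prints "caf? ?"
```

Real text systems usually try other installed fonts before falling back to a substitute glyph; that step is left out here.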
As of July 2006, no single "Unicode font" includes all the characters defined in the present revision of the ISO 10646 (Unicode) standard. Many fonts are continually updated to incorporate characters that were previously omitted or that were added in a newer version of the standard. Fonts may also be updated to correct errors in past versions.
The UCS has over 1.1 million code points, but only the first 65,536 (Plane 0, the Basic Multilingual Plane, or BMP) had entered common use before 2000. (See the Mapping of Unicode characters article for more information on the other planes, including Plane 1: SMP, Plane 2: SIP, Plane 14: SSP, and Planes 15 and 16, which are reserved for private use (PUA).)
The first Unicode font (with a very large character set, supporting many Unicode blocks) was Lucida Sans Unicode, developed by Charles Bigelow and Kris Holmes and released in March 1993 (shipped with Windows NT 3.1). The second was the Unihan font, developed by Ross Paterson in 1993. The third was Everson Mono Unicode, developed by Michael Everson and released in 1995.
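Because each plane holds 65,536 code points, the plane of any code point is simply its value divided by 65,536 (a right shift by 16 bits). The sketch below illustrates that arithmetic; the function names and the `PLANE_NAMES` table are made up for this example and cover only the planes named above.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point: 17 planes of 65,536 each

# Only the planes named in the text; the labels are for illustration.
PLANE_NAMES = {
    0: "Basic Multilingual Plane (BMP)",
    1: "Supplementary Multilingual Plane (SMP)",
    2: "Supplementary Ideographic Plane (SIP)",
    14: "Supplementary Special-purpose Plane (SSP)",
    15: "Private Use Area (PUA)",
    16: "Private Use Area (PUA)",
}


def plane_of(code_point):
    """Return the plane number (0 to 16) that contains code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a UCS code point: {code_point:#x}")
    return code_point >> 16  # same as code_point // 65536


def plane_name(code_point):
    """Return a label for the plane, or a generic one if it is not listed."""
    plane = plane_of(code_point)
    return PLANE_NAMES.get(plane, f"Plane {plane}")


print(MAX_CODE_POINT + 1)    # 1114112 code points in total
print(plane_of(0x9AA8))      # 0
print(plane_name(0x1D7FF))   # Supplementary Multilingual Plane (SMP)
```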
Issues
Unicode contains typographical ambiguities: some unified Chinese characters are drawn differently in different regions. For example, the code point U+9AA8 (骨) has different typographical forms in simplified Chinese and traditional Chinese. This has implications for the idea that a single typeface can satisfy the needs of all locales.[1]
Application of Unicode typefaces
Despite these issues, Unicode is now the base character set for many new standards and protocols. It is built into the architecture of operating systems (Microsoft Windows, Apple Mac OS X, and many versions of Unix), programming languages (Ada, Perl, Python, Java, Common LISP, APL), libraries (IBM International Components for Unicode (ICU)), rendering engines (Pango, Graphite, Scribe, Uniscribe, and ATSUI), and font formats (TrueType and OpenType). Many other standards are also being upgraded for Unicode compliance.
Utility software
Utility software can be used to see exactly which characters a font file contains:
- Character Map applet included with Windows 2000/XP
- Font Book application included with Mac OS X
- gucharmap for GNOME
- KCharSelect for KDE
- MainType (by HighLogic, commercial)
- BabelMap (by Andrew West, free, donation-ware)
- Unicode Font Viewer (by Mike Lischke, freeware)
- Quick Key (by Nathanael Jones, open source, free)
List of Unicode fonts
Of the many Unicode fonts available, the few listed below are those most commonly used on mainstream computing platforms. More Unicode fonts can be found in the "Unicode fonts" section of the List of typefaces article.
| Font | Char(s) | Glyphs | Kernpairs | Version | Font Family | Font style | Font type | Serif style | License | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| Arial | 1,419 | 1,674 | 909 | 3.00 | Arial | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| Arial Unicode MS | 38,917 | 50,377 | 0 | 1.00 | Arial | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Office. |
| Bitstream Cyberbit | 32,910 | 29,934 | 935 | 2.0 beta | Bitstream Cyberbit | Roman | TTF | Cove | Freeware | For non-commercial use only. |
| Cardo | 2,879 | 2,882 | 216 | 0.098 (2004) | Cardo | Regular | TTF | Cove | Freeware | For non-commercial and non-profit uses only. |
| Caslon Roman | 3,684 | 3,686 | 0 | 001.000 16-12-2001 | Caslon | Roman | TTF | | BSD-like license | |
| Code2000 | 51,239 | 61,864 | 115 | 1.16 | Code2000 | Regular | TTF | Any | Shareware | Register after "reasonable" period (author's words). |
| Charis SIL | 1,958 | 3,084 | 0 | 4.002 | Charis SIL | Regular | TTF | Any | OFL | |
| Chryſanþi Unicode (Chrysanthi Unicode) | 4,818 | 4,383 | 0 | 3.1 | Chrysanthi | Regular | TTF | Cove | Freeware | |
| ClearlyU | - | 9,538 | 0 | 1.9 | - | - | - | - | Freeware | |
| DejaVu Sans | 5,223 | 5,427 | 2,558 | 2.18 | DejaVu | Book | OTF+TTO | Normal Sans | Bitstream Vera license and public domain for additions | |
| Doulos SIL | 1,958 | 3,083 | 0 | 4.014 | Doulos SIL | Regular | TTF | Any | OFL | |
| Everson Mono Unicode | 4,893 | 4,899 | 0 | 3.2b4 | Everson Mono | Regular | TTF | Any | Shareware | Monospaced width. |
| FreeSerif | 3,914 | 5,257 | 0 | 1.52 | FreeSerif | Medium | TTF | Cove | GPL | Sans serif (FreeSans) and monospaced (FreeMono) variants. |
| Gentium Regular | 1,469 | 1,699 | 2,857 | 1.0.2 (2005) | Gentium | Regular | TTF | Any | OFL | |
| GNU Unifont | 33,580 | 33,583 | 0 | 001.000 | Unifont | Medium | Bitmap | Any | GPL | |
| Junicode | 2,235 | 2,256 | 0 | 0.6.12 | Junicode | Regular | TTF | Any | GPL | |
| Linux Libertine | 1,982 | 1,985 | 0 | 2.2.0 | Linux Libertine | Regular | OTF+TTO | Any | GPL, OFL | |
| Lucida Grande | 2,245 | 2,826 | 0 | 5.0d8e1 (Revision 1.002) | Lucida Grande | Regular | OTF | Normal Sans | Proprietary | Included with Mac OS X. Any proportion. |
| Lucida Sans Unicode | 1,765 | 1,776 | 0 | 2.00 | Lucida Sans | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| Microsoft Sans Serif | 2,301 | 2,257 | 0 | 1.41 | Microsoft Sans Serif | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| New Gulim | 46,567 | 49,284 | 0 | 3.10 | New Gulim | Regular | TTF | Obtuse Cove | Proprietary | Included with Microsoft Office 2000. Any Proportion. |
| Tahoma | 1,912 | 2,034 | 674 | 3.14 | Tahoma | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows. |
| Times New Roman | 1,419 | 1,674 | 867 | 3.00 | Times New Roman | Regular | OTF+TTO | Cove | Proprietary | Included with Microsoft Windows. |
| TITUS Cyberbit Basic | 9,341 | 10,044 | 0 | 3.0 (2000) (Revision 4.00) | TITUS Cyberbit | Regular | TTF | Cove | Freeware | |
| Y.OzFontN | 21,360 | 59,678 | 0 | 9.41 | Y.OzFontN | Regular | TTF | Any | Freeware | Sans-serif (for Japanese) and Monospace (for Latin). |
- Notes
- OTF+TTO: OpenType font with TrueType outlines.
- Kernpairs: OpenType fonts sometimes contain no pair-by-pair kerning table but instead a class-based kerning table, in which groups of similarly shaped characters are treated as one kerning group. For example, V and W have nearly the same left and right geometry. A value of "0" therefore does not mean that kerning is unsupported.
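The difference between the two kinds of kerning table can be illustrated with a small Python sketch. The glyph classes and adjustment values below are invented for the example and are not taken from any font in the table; real fonts store this data in binary tables, not dictionaries.

```python
# Pair-based kerning: one entry per (left glyph, right glyph) pair.
pair_kerning = {
    ("V", "A"): -80,
    ("W", "A"): -80,
    ("V", "o"): -40,
    ("W", "o"): -40,
}

# Class-based kerning: glyphs with similar geometry share a class, and
# adjustments are stored once per (left class, right class) pair.
left_class = {"V": "VW", "W": "VW"}
right_class = {"A": "A", "o": "round"}
class_kerning = {("VW", "A"): -80, ("VW", "round"): -40}


def kern_by_class(left, right):
    """Look up the adjustment for a glyph pair through its classes."""
    key = (left_class.get(left), right_class.get(right))
    return class_kerning.get(key, 0)  # 0 means no adjustment


# Both tables give the same spacing, but the class table needs fewer entries.
print(len(pair_kerning), len(class_kerning))  # 4 2
print(kern_by_class("W", "o"))                # -40
```

A font that ships only the class-based form would show 0 in a kern-pair count even though it kerns all four pairs above.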
Comparison of fonts
The number of characters that each font version listed above includes in different Unicode blocks (or ranges) is given below.
0000-077F
- N = A number: the count of characters in that range that the font includes.
- ✓ = Some or most of the characters in that range are present in the font.
- X = No characters are included in the font for that range or Unicode block.
- - = Data not available now.
0780-139F
13A0-1DBF
1DC0-257F
2580-2DDF
2E00-4DBF
4DC0-FE2F
FE30-FFFF
10000-1D7FF
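Per-range counts of the kind this comparison describes could be computed along the following lines. This is a sketch under assumptions: `covered` is a hypothetical set of code points standing in for a font's character map, the ranges are copied from the headings above, and `coverage_by_range` is an illustrative name.

```python
# Ranges as listed in this section, given as inclusive (start, end) pairs.
RANGES = [
    (0x0000, 0x077F), (0x0780, 0x139F), (0x13A0, 0x1DBF),
    (0x1DC0, 0x257F), (0x2580, 0x2DDF), (0x2E00, 0x4DBF),
    (0x4DC0, 0xFE2F), (0xFE30, 0xFFFF), (0x10000, 0x1D7FF),
]


def coverage_by_range(covered, ranges=RANGES):
    """Map each 'START-END' label to a count, or 'X' if nothing is covered."""
    result = {}
    for start, end in ranges:
        count = sum(1 for cp in covered if start <= cp <= end)
        label = f"{start:04X}-{end:04X}"
        result[label] = count if count else "X"
    return result


# Assumed coverage for the example: A to Z plus U+9AA8.
covered = set(range(0x41, 0x5B)) | {0x9AA8}
report = coverage_by_range(covered)
print(report["0000-077F"])    # 26
print(report["4DC0-FE2F"])    # 1
print(report["10000-1D7FF"])  # X
```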
See also
References
- [1] Ken Lunde, CJKV Information Processing, O'Reilly, 1999, page 128, "CJKV character form differences".
External links
- ISO/IEC JTC1/SC2/WG2, the working group in charge of ISO 10646
- Fonts and Keyboards at Unicode.org
- Unicode Font Guide For Free/Libre Open Source Operating Systems - a huge index of high quality free fonts
- Alan Wood's Unicode Resources
- Character sets - Ken Fowles, Microsoft, 1997. - Enable Unicode for applications.
- Arial Unicode Font at AscenderCorp.com
| This page uses content from Wikipedia. The original article was at Unicode typefaces. The list of authors can be seen in the page history. The text of this Wikinfo article is available under the GNU Free Documentation License and the Creative Commons Attribution-Share Alike 3.0 license. |

