About This Code
Brief Description:
All Unicode Characters Database
Contributor:
Richard H Schwartz
Notes Version:
R6.x, R7.x, R8.x
Last Modified:
15 Feb 2010
OpenNTF Disclaimer
All of the program code and information presented in the OpenNTF.org Code Bin are provided "as-is", and should be used at your own risk. OpenNTF.org makes no express or implied warranty about anything in the Code Bin, and OpenNTF.org will not be responsible or liable for any damage caused by the use or misuse of anything from this site. OpenNTF.org makes no guarantees about anything. Please thoroughly test all of the knowledge and code you find here before you attempt to use them in your production environment.
Code / Description
The UnicodeData.nsf database contains one document for each Unicode code point listed in the standard UnicodeData.txt file on the Unicode Consortium web site. Each document contains the decimal and hex values of the code point, a text field populated with the actual LMBCS character that corresponds to it, the Unicode name for the code point, and several other values. A view lists all of the code point documents, sorted by decimal code point.
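To illustrate the kind of record each document holds, here is a minimal Python sketch that turns one line of UnicodeData.txt into a dictionary. It is not the script shipped in the database. The function name and dictionary keys are hypothetical, and the character is produced as a Python Unicode string rather than as LMBCS. It relies only on the published file layout: semicolon-separated fields, with the hex code point first, the name second, and the general category third.

```python
def parse_unicode_data_line(line):
    """Parse one UnicodeData.txt line into a record (hypothetical field names)."""
    fields = line.rstrip("\n").split(";")
    code_point = int(fields[0], 16)      # first field is the code point in hex
    return {
        "decimal": code_point,           # decimal value, usable as a sort key
        "hex": fields[0],                # hex value as written in the file
        "name": fields[1],               # Unicode character name
        "category": fields[2],           # general category, e.g. "Lu"
        "character": chr(code_point),    # the character itself (Unicode, not LMBCS)
    }


sample = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
record = parse_unicode_data_line(sample)
print(record["decimal"], record["hex"], record["name"], record["character"])
```

Sorting a list of such records by the "decimal" key would give the same ordering as the view described above.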
A second view contains several additional useful documents: a document with the original UnicodeData.txt file as an attachment, a document containing all the code points rendered 16 per line as LMBCS characters in a Notes rich text field, and a document containing all the code points rendered 1 per line as LMBCS code points. All scripts used to create the documents are included in the database.
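The 16-per-line layout can be sketched in a few lines of Python. This is only an illustration of the grouping, not the script included in the database. The function name rows_of_16 is hypothetical, and the output is plain Unicode text rather than a Notes rich text field.

```python
def rows_of_16(code_points, per_line=16):
    """Group characters into lines of per_line characters each."""
    chars = [chr(cp) for cp in code_points]
    return ["".join(chars[i:i + per_line])
            for i in range(0, len(chars), per_line)]


# Example: the 20 code points starting at U+0041
for row in rows_of_16(range(0x41, 0x41 + 20)):
    print(row)
```

Calling the same function with per_line=1 gives the one-per-line layout.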
Usage / Example
Open the database to the All Characters view and browse the view, or open an individual document, to locate any Unicode character. Open the Documents view to locate any of the miscellaneous documents included in the database.
Note that only the code points listed in the UnicodeData.txt file on the Unicode Consortium's web site are included in the database. This means that the database does not necessarily contain a complete set of LMBCS characters, as it is not clear whether all LMBCS characters match up to Unicode code points.
Also note that whether the characters actually display on screen or in hard copy depends on the fonts installed on your computer. The forms and rich text in the database are set to use the @Arial Unicode MS font that is available on my Windows Vista machine. While this font renders characters from many languages correctly, there are still many characters that do not render.