Character sets are a basic issue in information interchange. Most current national standard character sets extend 7-bit ASCII, and these extensions conflict with one another, which complicates the design of multilingual information systems. Unicode, or the Universal Character Set (UCS), is a character set that covers the symbols of the major written languages. Text files and strings usually have no header to indicate which character set is in use, so they currently default to one of the national standards. The transition from national standards to Unicode may take longer than expected. This paper presents three methods to ease the transition. (1) A text file format of fixed-width characters: if the first character in a text file is a nonzero control code, the file is in UCS; otherwise, it is in the default national standard. The control code indicates which UCS subset or byte order is in use. (2) A tagged string storage: each string carries a tag that identifies its character set or coding format, e.g., the default national standard, the 8-bit subset of UCS-2, UCS-2, or UCS-4. (3) A method for assigning the format of string literals: all string literals use the same syntax, and each literal's storage format is the same as that of its source file. These methods can improve multilingual support without introducing much complexity. Copyright © 2000 John Wiley & Sons, Ltd.
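As a rough illustration of the tagged string storage in method (2), the following Python sketch picks the narrowest fixed-width format that can hold a string and keeps a tag alongside the bytes. It is not the paper's implementation: the names `Tag`, `store`, and `load`, the numeric tag values, the big-endian byte order, and the use of ASCII as the stand-in "default national standard" are all assumptions made for this example.

```python
from enum import Enum


class Tag(Enum):
    # Hypothetical tag values; the abstract names the formats, not their codes.
    DEFAULT = 0    # default national standard (the codec is supplied by the caller)
    UCS2_8BIT = 1  # 8-bit subset of UCS-2: code points U+0000..U+00FF, 1 byte each
    UCS2 = 2       # 2 bytes per character
    UCS4 = 3       # 4 bytes per character


# Python codecs that yield the fixed-width layouts assumed above (big-endian).
# For code points below U+10000, "utf-16-be" produces the same bytes as UCS-2.
_CODECS = {
    Tag.UCS2_8BIT: "latin-1",
    Tag.UCS2: "utf-16-be",
    Tag.UCS4: "utf-32-be",
}


def store(text):
    """Return (tag, data) using the narrowest fixed-width UCS format that fits."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        tag = Tag.UCS2_8BIT
    elif highest < 0x10000:
        tag = Tag.UCS2
    else:
        tag = Tag.UCS4
    return tag, text.encode(_CODECS[tag])


def load(tag, data, default_codec="ascii"):
    """Decode tagged bytes; Tag.DEFAULT falls back to the assumed national codec."""
    return data.decode(_CODECS.get(tag, default_codec))


if __name__ == "__main__":
    for sample in ["abc", "\u00e9t\u00e9", "\u4e2d\u6587", "\U0001F600"]:
        tag, data = store(sample)
        print(tag.name, len(data), load(tag, data) == sample)
```

In this sketch the tag alone tells a reader how many bytes make up each character, so plain Latin text costs one byte per character while other scripts remain representable in the same string type.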