Page 495 - Introduction to Programming with Java: A Problem Solving Approach
11.13 GUI Track: Unicode (Optional)
escape sequence, however, so you can embed the \u#### anywhere in a string. The u must be lowercase, and there must be exactly four hexadecimal digits. (See footnote 5.)
Using Unicode in Java Programs
If you want to print characters using Unicode escape sequences, you can use System.out.println in a text-based environment for the first 128 characters, but for the other characters, System.out.println in a text-based environment doesn’t work consistently. That’s because many text-based environments reliably display only the ASCII portion of the Unicode table; that is, the first 128 characters. To print all the characters in the Unicode table, you need to use graphical user interface (GUI) commands in a GUI environment.
The program in Figure 11.11 provides a GUI window and uses it to illustrate a small sampling of the many characters that are available. The codes array contains int code values, written as Unicode escape sequences, for the first character in each block of characters that we choose to display. These Unicode escape sequences automatically promote from type char to type int in the initializing assignment. The array called descriptions contains a simple String description for each block of characters.
For the window, we use an instance of the Java API JFrame class, which is in the javax.swing package. We set the window size at 600 pixels wide and 285 pixels high. We include in the window a single JTextArea object called area, and we enable its line-wrap capability. We use JTextArea’s append method to add each new string or character to whatever is already there.
Before looping, we display some general font information. The outer for loop displays the value of the first code number in one of the chosen blocks of characters and then a description of that block. The inner for loop displays the first 73 characters in that block. In the append method’s argument, notice how we
add the loop count, j, to the initial Unicode value to get each individual Unicode value as an int. Then we cast that int into a char. Then the concatenated " " converts that char into a String, which matches the append method’s parameter type.
Figure 11.12 shows the GUI output this program generates. The values in the codes array in Figure 11.11 are the Unicode escape sequences for the first character in each block of characters shown in Figure 11.12. The hollow squares indicate code numbers that don’t have symbols assigned to them or symbols that are not present in the current computer’s software library. Notice that both the Greek and Cyrillic blocks include both upper and lower case characters, and they include some additional characters beyond the normal final values of Ω (ω) and Я (я), respectively. These (and other) additional characters are needed for some of the individual languages in the families of languages using these alphabets. Of course, the characters shown in Figure 11.12 are just a tiny sampling of all the characters in Unicode.
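The cast-and-concatenate step in the inner loop, and the idea of unassigned code numbers, can be sketched without a GUI. This is a minimal illustration, not the book's UnicodeDisplay program: the class name UnicodeSampler, the helper methods sample and isAssigned, and the three starting values in codes are assumptions chosen for the example. It builds each sample as a String instead of appending to a JTextArea, and it uses Character.isDefined to ask whether Unicode assigns a character to a code number. Whether a font actually contains a symbol for an assigned character is a separate question that this sketch does not check.

```java
class UnicodeSampler {
    // Returns count consecutive characters starting at the code value start.
    // Each int is cast to char, and the concatenated " " turns it into a String.
    static String sample(int start, int count) {
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < count; j++) {
            sb.append((char) (start + j) + " ");
        }
        return sb.toString();
    }

    // True when Unicode assigns a character to this code value.
    static boolean isAssigned(int code) {
        return Character.isDefined(code);
    }

    public static void main(String[] args) {
        // Hypothetical starting values; each char literal promotes to int.
        int[] codes = {'\u0391', '\u0410', '\u2200'};
        String[] descriptions = {"Greek", "Cyrillic", "Mathematical operators"};

        for (int i = 0; i < codes.length; i++) {
            System.out.println("0x" + Integer.toHexString(codes[i]) + " "
                + descriptions[i] + ": " + sample(codes[i], 5));
        }

        // Capital omega is assigned; 0x0378 in the Greek block is not.
        System.out.println(isAssigned(0x03A9));
        System.out.println(isAssigned(0x0378));
    }
}
```

In a text-based environment the non-ASCII characters printed by main may not display correctly, which is the reason the program in Figure 11.11 uses a GUI window.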
Notice that the different characters shown in Figure 11.12 have generally different widths. To get constant-width characters, you’d have to change the font type to something like Courier New. You could do that—and also change the style to bold and size to 10 points—by inserting a statement like this:
area.setFont(new Font("Courier New", Font.BOLD, 10));
Suppose you want the Unicode value for ≈. That’s the last mathematical operator displayed in Figure 11.12. As indicated by the third codes value in the UnicodeDisplay program, the first mathematical operator has
5 The supplementary Unicode characters have numeric values that require more than 4 hexadecimal digits. To specify one of these supplementary characters, use a decimal or hexadecimal int representation of the character, or write it as a pair of \u escapes: a high surrogate in the range \uD800 through \uDBFF followed by a low surrogate in the range \uDC00 through \uDFFF. A surrogate has no independent character association. (See documentation on Java’s Character class and http://www.unicode.org/Public/UNIDATA/Blocks.txt.) There’s also another encoding scheme, UTF-8, which represents each character with one to four 8-bit units. This latter scheme is widely used in communications.
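As a hedged illustration of the footnote, the sketch below starts from an int representation of a supplementary character and lets the Character class produce the surrogate pair. The class name SupplementaryDemo, its helper methods, and the choice of code value 0x1D11E (the musical G clef symbol) are assumptions made for this example only.

```java
import java.nio.charset.StandardCharsets;

class SupplementaryDemo {
    // Converts an int code value into a String; a supplementary
    // character becomes two char values (a surrogate pair).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Number of 8-bit units needed for the string in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        int clef = 0x1D11E; // needs more than 4 hexadecimal digits
        String s = fromCodePoint(clef);

        System.out.println(s.length());                      // char values used
        System.out.println(s.codePointCount(0, s.length())); // characters represented
        for (char c : s.toCharArray()) {
            System.out.println(Integer.toHexString(c)
                + " surrogate=" + Character.isSurrogate(c));
        }
        System.out.println(utf8Length(s));                   // 8-bit units in UTF-8
    }
}
```

The String has length 2 even though it represents a single character, so code that counts or indexes characters in text containing supplementary characters would need the code-point methods rather than charAt and length.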