======================================================================
=                   Korean language and computers                    =
======================================================================

                             Introduction                             
======================================================================
The writing system of the Korean language is a syllabic alphabet of
character parts (jamo) organized into character blocks representing
syllables. Unlike the letters of many Western languages, the character
parts cannot simply be written from left to right on a computer.
Either every possible Korean syllable must be rendered as a syllable
block by a font, or each character part must be encoded separately.
Unicode provides both options: the character parts ㅎ (h) and ㅏ (a),
and the combined syllable 하 (ha), are all encoded.


                          Character encoding
======================================================================
RFC 1557 describes ISO-2022-KR, a method for seven-bit encoding of
Korean characters in email. Where eight bits are allowed, the EUC-KR
encoding is preferred. Both encodings combine US-ASCII (ISO 646) with
the Korean standard KS X 1001:1992 (previously named KS C 5601:1987).
Another character set, KPS 9566 (similar to KS X 1001), is used in
North Korea.
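
The relationship between the two encodings can be sketched in a few
lines of Rust. This is only an illustration under stated assumptions:
the function name euc_kr_to_iso_2022_kr is hypothetical, the
designator is emitted once at the start of the output, and the code
checks only the byte ranges, not whether a KS X 1001 cell is actually
assigned. It relies on the fact that EUC-KR stores each KS X 1001
character as two bytes with the high bit set, while ISO-2022-KR
carries the same two bytes with the high bit cleared, bracketed by
the SO and SI control characters.

```rust
/// Escape sequence ESC $ ) C, which announces KS X 1001 in ISO-2022-KR.
const DESIGNATOR: [u8; 4] = [0x1B, 0x24, 0x29, 0x43];
/// Shift Out: the bytes that follow are two-byte KS X 1001 characters.
const SO: u8 = 0x0E;
/// Shift In: the bytes that follow are plain ASCII again.
const SI: u8 = 0x0F;

/// Converts EUC-KR bytes to ISO-2022-KR (hypothetical helper).
/// Returns None if a two-byte character is truncated or out of range.
pub fn euc_kr_to_iso_2022_kr(input: &[u8]) -> Option<Vec<u8>> {
    let mut out = DESIGNATOR.to_vec();
    let mut in_korean = false;
    let mut bytes = input.iter().copied();
    while let Some(lead) = bytes.next() {
        if lead < 0x80 {
            // ASCII byte: leave Korean mode first if necessary.
            if in_korean {
                out.push(SI);
                in_korean = false;
            }
            out.push(lead);
        } else {
            // Two-byte character: both bytes must lie in 0xA1..=0xFE.
            let trail = bytes.next()?;
            if !(0xA1..=0xFE).contains(&lead) || !(0xA1..=0xFE).contains(&trail) {
                return None;
            }
            if !in_korean {
                out.push(SO);
                in_korean = true;
            }
            // Clearing the high bit yields the seven-bit form.
            out.push(lead & 0x7F);
            out.push(trail & 0x7F);
        }
    }
    if in_korean {
        out.push(SI);
    }
    Some(out)
}
```

For instance, the EUC-KR bytes B0 A1 (the syllable 가) come out as the
designator followed by SO, 30 21, SI. Because a line break is an ASCII
byte, the sketch always shifts back to ASCII before the end of a line.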

The international Unicode standard contains dedicated characters for
the Korean language in the Hangul script, and supports two methods of
encoding them. The method used by Microsoft Windows gives each of the
11,172 syllable combinations its own code and a preformed font
character. The other method encodes the individual letters ('jamos')
and lets the software combine them correctly. The Windows method
requires more font memory but allows better shapes, which makes it
preferable for documents, since it is complicated to create
stylistically correct combinations on the fly.
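
As a small illustration of the two methods, the Rust sketch below
spells the syllable 한 both ways and measures each spelling. The
constant and function names are hypothetical; the code points are the
precomposed syllable U+D55C and the conjoining jamo U+1112 (initial),
U+1161 (medial) and U+11AB (final).

```rust
/// The syllable 한 as a single precomposed code point.
pub const PRECOMPOSED: &str = "\u{D55C}";
/// The same syllable as three conjoining jamo: initial, medial, final.
pub const CONJOINING: &str = "\u{1112}\u{1161}\u{11AB}";

/// Returns (number of code points, number of UTF-8 bytes) for a string.
pub fn code_points_and_bytes(text: &str) -> (usize, usize) {
    (text.chars().count(), text.len())
}
```

The precomposed spelling is one code point (three bytes in UTF-8),
while the jamo spelling is three code points (nine bytes). The two
strings also compare as unequal byte for byte, so software that
accepts both forms usually has to normalize before comparing.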

Another possibility is to stack a sequence of medials ('jungseong')
and a sequence of finals ('jongseong'), or a Middle Korean pitch mark
if needed, on top of the sequence of initials ('choseong'). This works
if the font has medial and final 'jamos' with zero-width spacing that
are drawn to the left of the cursor or caret, so that they appear in
the right place below (or to the right of) the initial. If a syllable
has a horizontal medial (ㅗ, ㅛ, ㅜ, ㅠ or ㅡ), the initial will probably
appear further left than it would in a preformed syllable, because
space must be reserved for a vertical medial. The result is
aesthetically poor, yet it may be the only way to display Middle
Korean hangul text without resorting to images, romanization,
replacement of obsolete jamo or non-standard encodings. However, most
current fonts do not support this.

The Unicode standard has also attempted to create a unified CJK
character set that can represent Chinese (Hanzi) and its Japanese
(Kanji) and Korean (Hanja) derivatives through Han unification. Han
unification does not discriminate by language or region when
rendering Chinese characters, provided that the typographic
traditions have not resulted in major differences in what a character
looks like. Han unification has been criticized.


                              Text input                              
======================================================================
On a Korean computer keyboard, text is typically entered by pressing a
key for the appropriate jamo; the operating system creates each
composite character on the fly. Depending on the input method editor
and keyboard layout, double consonants can be entered by holding the
Shift key. When all the jamo making up a syllabic block have been
entered, the user may initiate a conversion to hanja (or other special
characters) using a keyboard shortcut or interface button; South
Korean keyboards have a key for this. Subsequent semi-automated hanja
conversion is supported to varying degrees by word processors.

When using a keyboard made for another language, most operating
systems require the user to type with an original Korean keyboard
layout; the most common is Dubeolsik. This differs from languages such
as Japanese, where text can be entered on non-native keyboards using
romanization.

Operating systems such as Linux allow the setting
engine/hangul/hangul-keyboard='ro', resulting in a romaja keyboard;
typing "seonggye" results in 성계. In this configuration, ㄲ is obtained
by typing "gg". Capitalizing the "G" in "jasanGun" yields 자산군, whereas
the all-lowercase "jasangun" would produce 자상운.


 Before Korean division 
========================
Korean text input is related to the Korean typewriters (타자기) that
preceded computers. It is unclear who made the first Korean
typewriter; according to Jang Bong Seon, Horace Grant Underwood made a
Korean typewriter during the first decade of the 20th century, while
Lee Won Ik, living in the United States, has been credited with
developing the first Korean typewriter in 1914. In 1927, Song Ki Joo
invented the first Dubeolsik typewriter in Chicago; however, it no
longer exists. Song's 1934 typewriter is stored in the Hangul museum
as the oldest existing Korean typewriter. Further typewriters were
developed in 1945 by Kim Joon Sung and in 1950 by Kong Byung Woo.


 After division 
================
South Korea originally had a Nebeolsik standard, but Dubeolsik became
standard in 1985.


                                Hanja                                 
======================================================================
Some Korean fonts do not include hanja, and word processors do not
allow a user to specify which font to use as a fallback for any hanja
in a text; each hanja sequence must be manually formatted for a
desired font.


                    Pitch marks and vertical text
======================================================================
Vertical text is supported poorly (or not at all) by HTML and most
word processors. This is not an issue for modern Korean, which is
usually written horizontally; until the second half of the 20th
century, however, Korean was often written vertically.
Fifteenth-century texts written in hangul had pitch marks to the left
of syllables. These marks are included in Unicode, although current
fonts do not support them.


                               Programs                               
======================================================================
Programs designed for Korean language-related use include:
* Language recognition
** A North Korean speech recognition program is said to recognize
100,000 words, with a success rate of over 90 percent.
** 'Mongnan' (Korea Computer Center, North Korea): Optical character
recognition software, with a reported success rate of 99 percent for
printed text and 95 percent for handwriting recognition.
* Input method editors
** 'Tan'gun' (Pyongyang Information Center, North Korea): Allows
hangul on English versions of Windows.
** Nalgaeset Hangul Input Method Editor (날개셋 한글 입력기; Kim Yongmook,
South Korea): A hangul input method developed for the 3(se)-beolsik
Windows keyboard layout.
** 'Nabi', 'ami' (South Korea): Permit hangul on Linux.
** m17n: Permits revised romanization for hangul input on Unix.
** SCIM and IBus: Permit hangul and hanja input on POSIX operating
systems (including Linux and BSD).
* Word processors: The following programs include domestic hangul
fonts, non-hangul fonts and a hangul-hanja conversion utility.
** Hangul (Hancom, South Korea)
** Changdok (PIC, North Korea): MS-DOS program developed in April
1990; a Windows version was developed in 1996. It has a
personality-cult feature in which pressing either of two keys
produces titles praising Kim Il Sung and Kim Jong Il, respectively.


                          Hangul in Unicode                           
======================================================================
Hangul letters are detailed in several parts of Unicode:

* Hangul Syllables (AC00-D7A3)
* Hangul Jamo (1100-11FF)
* Hangul Compatibility Jamo (3130-318F)
* Hangul Jamo Extended-A (A960-A97F)
* Hangul Jamo Extended-B (D7B0-D7FF)
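
The ranges listed above translate directly into a lookup. The Rust
sketch below uses a hypothetical function name, hangul_block, and
takes the range boundaries exactly as given in the list; it is an
illustration rather than a complete Unicode block database.

```rust
/// Names the hangul-related Unicode range that a character falls in,
/// using the ranges from the list above. Returns None otherwise.
pub fn hangul_block(c: char) -> Option<&'static str> {
    match c as u32 {
        0xAC00..=0xD7A3 => Some("Hangul Syllables"),
        0x1100..=0x11FF => Some("Hangul Jamo"),
        0x3130..=0x318F => Some("Hangul Compatibility Jamo"),
        0xA960..=0xA97F => Some("Hangul Jamo Extended-A"),
        0xD7B0..=0xD7FF => Some("Hangul Jamo Extended-B"),
        _ => None,
    }
}
```

For example, hangul_block('한') gives Some("Hangul Syllables"), while
the standalone letter 'ㅎ' falls in "Hangul Compatibility Jamo".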


 Hangul syllables block 
========================
Pre-composed hangul syllables in the Unicode hangul syllables block
are algorithmically defined with the following formula:
: [(initial) × 588 + (medial) × 28 + (final)] + 44032

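* Initial consonants (values 0 to 18, in order): ㄱ, ㄲ, ㄴ, ㄷ, ㄸ, ㄹ, ㅁ,
ㅂ, ㅃ, ㅅ, ㅆ, ㅇ, ㅈ, ㅉ, ㅊ, ㅋ, ㅌ, ㅍ, ㅎ

* Medial vowels (values 0 to 20, in order): ㅏ, ㅐ, ㅑ, ㅒ, ㅓ, ㅔ, ㅕ, ㅖ,
ㅗ, ㅘ, ㅙ, ㅚ, ㅛ, ㅜ, ㅝ, ㅞ, ㅟ, ㅠ, ㅡ, ㅢ, ㅣ

* Final consonants (values 0 to 27, in order; 0 means no final):
(none), ㄱ, ㄲ, ㄳ, ㄴ, ㄵ, ㄶ, ㄷ, ㄹ, ㄺ, ㄻ, ㄼ, ㄽ, ㄾ, ㄿ, ㅀ, ㅁ, ㅂ, ㅄ, ㅅ,
ㅆ, ㅇ, ㅈ, ㅊ, ㅋ, ㅌ, ㅍ, ㅎ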

To find the code point of "한" in Unicode:

* The value of the initial consonant (ㅎ) is 18.
* The value of the medial vowel (ㅏ) is 0.
* The value of the final consonant (ㄴ) is 4.

Substituting these values in the formula above yields [(18 × 588) + (0
× 28) + 4] + 44032 = 54620. The Unicode value of 한 is therefore 54620
in decimal, &#54620; as a numeric character reference, and U+D55C in
hexadecimal Unicode notation.


 How to code this in Rust 
==========================
With the module below, calling e.g. hangul::from_jamo('ㅎ', 'ㅏ',
Some('ㄴ')) will return Some('한').


    mod hangul {
        /// Initial consonants; the array index is the "initial" value
        /// in the formula.
        const INITIAL_JAMO: [char; 19] = [
            'ㄱ', 'ㄲ', 'ㄴ', 'ㄷ',
            'ㄸ', 'ㄹ', 'ㅁ', 'ㅂ',
            'ㅃ', 'ㅅ', 'ㅆ', 'ㅇ',
            'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ',
            'ㅌ', 'ㅍ', 'ㅎ',
        ];

        /// Medial vowels; the array index is the "medial" value.
        const VOWEL_JAMO: [char; 21] = [
            'ㅏ', 'ㅐ', 'ㅑ', 'ㅒ',
            'ㅓ', 'ㅔ', 'ㅕ', 'ㅖ',
            'ㅗ', 'ㅘ', 'ㅙ', 'ㅚ',
            'ㅛ', 'ㅜ', 'ㅝ', 'ㅞ',
            'ㅟ', 'ㅠ', 'ㅡ', 'ㅢ',
            'ㅣ',
        ];

        /// Final consonants; index 0 (None) means "no final".
        const FINAL_JAMO: [Option<char>; 28] = [
            None,       Some('ㄱ'), Some('ㄲ'), Some('ㄳ'),
            Some('ㄴ'), Some('ㄵ'), Some('ㄶ'), Some('ㄷ'),
            Some('ㄹ'), Some('ㄺ'), Some('ㄻ'), Some('ㄼ'),
            Some('ㄽ'), Some('ㄾ'), Some('ㄿ'), Some('ㅀ'),
            Some('ㅁ'), Some('ㅂ'), Some('ㅄ'), Some('ㅅ'),
            Some('ㅆ'), Some('ㅇ'), Some('ㅈ'), Some('ㅊ'),
            Some('ㅋ'), Some('ㅌ'), Some('ㅍ'), Some('ㅎ'),
        ];

        /// Code point of '가', the first precomposed syllable.
        const GA_LOCATION: u32 = '가' as u32; // = 44_032

        /// Builds a precomposed syllable from its jamo. Returns None if
        /// any argument is not a valid jamo for its position.
        pub fn from_jamo(initial: char, medial: char, last: Option<char>) -> Option<char> {
            // `position` yields None for an unknown jamo; `?` propagates it.
            let initial_index = INITIAL_JAMO.iter().position(|&c| c == initial)? as u32;
            let medial_index = VOWEL_JAMO.iter().position(|&c| c == medial)? as u32;
            let final_index = FINAL_JAMO.iter().position(|&c| c == last)? as u32;
            char::from_u32(GA_LOCATION + 588 * initial_index + 28 * medial_index + final_index)
        }
    }
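
The formula can also be run in reverse. The sketch below, with the
hypothetical name to_indices, recovers the three values from a
precomposed syllable by integer division and remainder; it assumes
only the constants 588, 28 and 44032 (0xAC00) from the formula above
and the range AC00-D7A3 of the Hangul Syllables block.

```rust
/// Splits a precomposed hangul syllable into its (initial, medial,
/// final) values, the inverse of the composition formula.
/// Returns None for characters outside AC00-D7A3.
pub fn to_indices(syllable: char) -> Option<(u32, u32, u32)> {
    const FIRST: u32 = 0xAC00; // 44032, the syllable 가
    const LAST: u32 = 0xD7A3; // the last precomposed syllable
    let code = syllable as u32;
    if !(FIRST..=LAST).contains(&code) {
        return None;
    }
    let offset = code - FIRST;
    let initial = offset / 588; // 588 = 21 medials × 28 finals
    let medial = (offset % 588) / 28;
    let last = offset % 28; // 0 means the syllable has no final
    Some((initial, medial, last))
}
```

Calling to_indices('한') returns Some((18, 0, 4)), matching the values
used in the worked example for 한.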


 Hangul Compatibility Jamo block 
=================================
The Unicode Hangul Compatibility Jamo block has been allocated for
compatibility with the KS X 1001 character set. It is usually used to
represent hangul without distinguishing initials and finals.


 Hangul Jamo blocks 
====================
The Hangul Jamo, Hangul Jamo Extended-A and Hangul Jamo Extended-B
blocks contain initial, medial and final jamo, including obsolete
jamo.
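
To make the difference between the compatibility letters and the
positional jamo concrete, the Rust sketch below converts
compatibility jamo into a sequence of conjoining jamo from the Hangul
Jamo block. The function name to_conjoining is hypothetical. The
sketch covers only the modern letters and relies on the modern
initials, medials and finals starting at U+1100, U+1161 and U+11A8
respectively, in the same order as the value tables for the syllable
formula.

```rust
/// Modern initials as compatibility jamo, in formula order.
const INITIALS: [char; 19] = [
    'ㄱ', 'ㄲ', 'ㄴ', 'ㄷ', 'ㄸ', 'ㄹ', 'ㅁ', 'ㅂ', 'ㅃ', 'ㅅ',
    'ㅆ', 'ㅇ', 'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ',
];
/// Modern medials as compatibility jamo, in formula order.
const MEDIALS: [char; 21] = [
    'ㅏ', 'ㅐ', 'ㅑ', 'ㅒ', 'ㅓ', 'ㅔ', 'ㅕ', 'ㅖ', 'ㅗ', 'ㅘ', 'ㅙ',
    'ㅚ', 'ㅛ', 'ㅜ', 'ㅝ', 'ㅞ', 'ㅟ', 'ㅠ', 'ㅡ', 'ㅢ', 'ㅣ',
];
/// Modern finals as compatibility jamo (the "no final" case is
/// handled separately through Option).
const FINALS: [char; 27] = [
    'ㄱ', 'ㄲ', 'ㄳ', 'ㄴ', 'ㄵ', 'ㄶ', 'ㄷ', 'ㄹ', 'ㄺ', 'ㄻ', 'ㄼ',
    'ㄽ', 'ㄾ', 'ㄿ', 'ㅀ', 'ㅁ', 'ㅂ', 'ㅄ', 'ㅅ', 'ㅆ', 'ㅇ', 'ㅈ',
    'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ',
];

/// Converts compatibility jamo into conjoining jamo (hypothetical
/// helper). Returns None if a letter is not valid for its position.
pub fn to_conjoining(initial: char, medial: char, last: Option<char>) -> Option<String> {
    let l = INITIALS.iter().position(|&c| c == initial)? as u32;
    let v = MEDIALS.iter().position(|&c| c == medial)? as u32;
    let mut out = String::new();
    out.push(char::from_u32(0x1100 + l)?); // initial (choseong)
    out.push(char::from_u32(0x1161 + v)?); // medial (jungseong)
    if let Some(letter) = last {
        let t = FINALS.iter().position(|&c| c == letter)? as u32;
        out.push(char::from_u32(0x11A8 + t)?); // final (jongseong)
    }
    Some(out)
}
```

The same compatibility letter ㄴ becomes U+1102 when passed as an
initial but U+11AB when passed as a final, which is the distinction
the compatibility block does not make.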


 Hanyang Private Use Area code 
===============================
The Hangul word processor shipped with fonts from Hanyang Information
and Communication, which map obsolete hangul characters to Unicode's
Private Use Areas (PUAs). Despite the use of PUAs instead of dedicated
code points, Hanyang's mapping was the most popular way to represent
obsolete hangul in South Korea in 2007. With its Hangul 2010, however,
Hancom deprecated Hanyang PUA code and began representing obsolete
hangul characters with Unicode hangul jamo.


                               See also                               
======================================================================
* Japanese language and computers
* Vietnamese language and computers
* List of CJK fonts
* Chinese input methods for computers
* McCune-Reischauer
* Yale romanization of Korean
* Revised Romanization of Korean
* New Korean Orthography


                            External links                            
======================================================================
* [https://www.branah.com/korean Online Korean Virtual Keyboard]
* [http://inputking.com InputKing Online Input System], an online tool
for typing Korean


 License 
=========
All content on Gopherpedia comes from Wikipedia, and is licensed under CC-BY-SA
License URL: http://creativecommons.org/licenses/by-sa/3.0/
Original Article: http://en.wikipedia.org/wiki/Korean_language_and_computers

