xref: /illumos-gate/usr/src/man/man9f/kiconv_open.9f (revision c10c16de)
te
Copyright (c) 2007, Sun Microsystems, Inc., All Rights Reserved
The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License.
You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License.
When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
kiconv_open 9F "16 Oct 2007" "SunOS 5.11" "Kernel Functions for Drivers"
NAME
kiconv_open - code conversion descriptor allocation function
SYNOPSIS

#include <sys/sunddi.h>



kiconv_t kiconv_open(const char *tocode, const char *fromcode);
INTERFACE LEVEL

Solaris DDI specific (Solaris DDI).

PARAMETERS

tocode

Points to a target codeset name string.

fromcode

Points to a source codeset name string.

DESCRIPTION

The kiconv_open() function returns a code conversion descriptor that describes a conversion from the codeset specified by fromcode to the codeset specified by tocode. For state-dependent encodings, the conversion descriptor is in a codeset-dependent initial state (ready for immediate use with the kiconv() function).

Supported code conversions are between UTF-8 and the following:

Name Description

 Big5 Traditional Chinese Big5
 Big5-HKSCS Traditional Chinese Big5-Hong Kong
 Supplementary Character Set
 CP720 DOS Arabic 
 CP737 DOS Greek 
 CP850 DOS Latin-1 (Western European) 
 CP852 DOS Latin-2 (Eastern European) 
 CP857 DOS Latin-5 (Turkish) 
 CP862 DOS Hebrew 
 CP866 DOS Cyrillic Russian 
 CP932 Japanese Shift JIS (Windows) 
 CP950-HKSCS Traditional Chinese HKSCS-2001 (Windows) 
 CP1250 Central Europe
 CP1251 Cyrillic
 CP1252 Western Europe
 CP1253 Greek
 CP1254 Turkish
 CP1255 Hebrew
 CP1256 Arabic
 CP1257 Baltic
 EUC-CN Simplified Chinese EUC
 EUC-JP Japanese EUC
 EUC-JP-MS Japanese EUC MS
 EUC-KR Korean EUC
 EUC-TW Traditional Chinese EUC
 GB18030 Simplified Chinese GB18030
 GBK Simplified Chinese GBK 
 ISO-8859-1 Latin-1 (Western European)
 ISO-8859-2 Latin-2 (Eastern European) 
 ISO-8859-3 Latin-3 (Southern European) 
 ISO-8859-4 Latin-4 (Northern European) 
 ISO-8859-5 Cyrillic
 ISO-8859-6 Arabic
 ISO-8859-7 Greek
 ISO-8859-8 Hebrew
 ISO-8859-9 Latin-5 (Turkish)
 ISO-8859-10 Latin-6 (Nordic) 
 ISO-8859-13 Latin-7 (Baltic)
 ISO-8859-15 Latin-9 (Western European with euro sign)
 KOI8-R Cyrillic
 Shift_JIS Japanese Shift JIS (JIS) 
 TIS_620 Thai (a.k.a. ISO 8859-11)
 Unified-Hangul Korean Unified Hangul 
 

UTF-8 and the above names can be used at tocode and fromcode to specify the desired code conversion. The following aliases are also supported as alternative names to be used:

Aliases Original Name 
 720 CP720 
 737 CP737 
 850 CP850 
 852 CP852 
 857 CP857 
 862 CP862 
 866 CP866 
 932 CP932 
 936, CP936 GBK 
 949, CP949 Unified-Hangul 
 950, CP950 Big5 
 1250 CP1250 
 1251 CP1251 
 1252 CP1252 
 1253 CP1253 
 1254 CP1254 
 1255 CP1255 
 1256 CP1256 
 1257 CP1257 
 ISO-8859-11 TIS_620 
 PCK, SJIS Shift_JIS 

A conversion descriptor remains valid until it is closed by using kiconv_close().

RETURN VALUES

Upon successful completion, kiconv_open() returns a code conversion descriptor for use on subsequent calls to kiconv(). Otherwise, if the conversion specified by fromcode and tocode is not supported or for any other reasons the code conversion descriptor cannot be allocated, kiconv_open() returns (kiconv_t)-1 to indicate the error.

CONTEXT

kiconv_close() can be called from user context only.

EXAMPLES

Example 1 Opening a Code Conversion

The following example shows how to open a code conversion from ISO 8859-15 to UTF-8

#include <sys/sunddi.h>

kiconv_t cd;

cd = kiconv_open("UTF-8", "ISO-8859-15");
if (cd == (kiconv_t)-1) {
 /* Cannot open up the code conversion. */
 return (-1);
}
ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPEATTRIBUTE VALUE
Interface StabilityCommitted
SEE ALSO

iconv(3C), iconv_close(3C), iconv_open(3C), u8_strcmp(3C), u8_textprep_str(3C), u8_validate(3C), uconv_u16tou32(3C), uconv_u16tou8(3C), uconv_u32tou16(3C), uconv_u32tou8(3C), uconv_u8tou16(3C), uconv_u8tou32(3C), attributes(5), kiconv(9F), kiconvstr(9F), kiconv_close(9F), u8_strcmp(9F), u8_textprep_str(9F), u8_validate(9F), uconv_u16tou32(9F), uconv_u16tou8(9F), uconv_u32tou16(9F), uconv_u32tou8(9F), uconv_u8tou16(9F), uconv_u8tou32(9F)

The Unicode Standard

http://www.unicode.org/standard/standard.html

NOTES

The code conversions are available between UTF-8 and the above noted codesets. For example, to convert from EUC-JP to Shift_JIS, first convert EUC-JP to UTF-8 and then convert UTF-8 to Shift_JIS.

The code conversions supported are based on simple one-to-one mappings. There is no special treatment or processing done during code conversions such as case conversion, Unicode Normalization, or mapping between combining or conjoining sequences of UTF-8 and pre-composed characters in non-UTF-8 codesets.

All supported non-UTF-8 codesets use pre-composed characters only. However, UTF-8 allows combining or conjoining characters too. For this reason, using a form of Unicode Normalizations on UTF-8 text with u8_textprep_str() before or after doing code conversions might be necessary.