author Corinna Vinschen <corinna@vinschen.de> 2009-03-24 12:37:02 +0000
committer Corinna Vinschen <corinna@vinschen.de> 2009-03-24 12:37:02 +0000
commit 1c6743b74d5dc40545daa4b18577ae304340a446
tree eef192ae1cdc9a92a80862268809cc40a759517a
parent 161211d186a16e4f090b8b3c63040f0b9aee25d4
* cygwinenv.sgml: Move "codepage:xxx" to the removed options section.
Change text accordingly.
* new-features.sgml: Try to explain new way to define character sets.
-rw-r--r-- winsup/doc/ChangeLog 6
-rw-r--r-- winsup/doc/cygwinenv.sgml 33
-rw-r--r-- winsup/doc/new-features.sgml 28
3 files changed, 39 insertions, 28 deletions
diff --git a/winsup/doc/ChangeLog b/winsup/doc/ChangeLog
index 281c5107d..4ead25882 100644
--- a/winsup/doc/ChangeLog
+++ b/winsup/doc/ChangeLog
@@ -1,3 +1,9 @@
+2009-03-24 Corinna Vinschen <corinna@vinschen.de>
+
+ * cygwinenv.sgml: Move "codepage:xxx" to the removed options section.
+ Change text accordingly.
+ * new-features.sgml: Try to explain new way to define character sets.
+
2009-03-18 Corinna Vinschen <corinna@vinschen.de>
* cygwin-ug-net.in.sgml: Update date.
diff --git a/winsup/doc/cygwinenv.sgml b/winsup/doc/cygwinenv.sgml
index 48cb5a6c8..c7f1e98ff 100644
--- a/winsup/doc/cygwinenv.sgml
+++ b/winsup/doc/cygwinenv.sgml
@@ -12,29 +12,6 @@ by prefixing with <literal>no</literal>.</para>
<itemizedlist mark="bullet">
<listitem>
-<para><envar>codepage:[ansi|oem|utf8]</envar> - This option controls
-which single- or multibyte character set is used for file and console
-operations. Windows is using UTF-16 characters internally and this
-option specifies how 8-byte character sets are converted to UTF-16 and
-vice versa. The default setting is <envar>ansi</envar> which means,
-conversion is based on the current ANSI codepage, typically 1252 in
-many Western language versions of Windows. The name originates from the
-ANSI Latin1 (ISO 8859-1) standard, used in Windows 1.0, though the
-character sets have since diverged from any standard. The second
-setting selects an older, DOS-based character set, containing various
-line drawing and special characters. It is called <envar>oem</envar>
-since it was originally encoded in the firmware of IBM PCs by original
-equipment manufacturers (OEMs).</para>
-<para>If you find that some characters (especially non-US or 'graphical' ones)
-do not display correctly in Cygwin, you can use this option to select an
-appropriate codepage. Finally, <envar>utf8</envar> treats all file names
-and console characters as UTF-8 chars. Please note that, for correct
-operation, you have to set the environment variable LANG or LC_ALL to
-somthing like "en_US.UTF-8", otherwise many applications will not be
-able to recognize UTF-8 strings correctly.</para>
-</listitem>
-
-<listitem>
<para><envar>(no)dosfilewarning</envar> - If set, Cygwin will warn the
first time a user uses an "MS-DOS" style path name rather than a POSIX-style
path name. Defaults to set.</para>
@@ -195,6 +172,16 @@ information, read the documentation in <xref linkend="mount-table"></xref> and
</listitem>
<listitem>
+<para><envar>codepage:[ansi|oem]</envar> - This option controlled
+which character set is used for file and console operations. Since Cygwin
+now performs all character conversion itself, depending on the
+application's call to the <function>setlocale()</function> function, and in
+turn on the setting of the environment variables <envar>$LANG</envar>,
+<envar>$LC_ALL</envar>, or <envar>$LC_CTYPE</envar>, this setting
+has become useless.</para>
+</listitem>
+
+<listitem>
<para><envar>(no)ntea</envar> - This option has been removed since it
only fakes security which is considered dangerous and useless. It also
created an uncontrollably large file on FAT and was entirely useless
diff --git a/winsup/doc/new-features.sgml b/winsup/doc/new-features.sgml
index 57bac4f44..4f8db0f02 100644
--- a/winsup/doc/new-features.sgml
+++ b/winsup/doc/new-features.sgml
@@ -17,13 +17,18 @@
are only local to the current session and disappear when the last
Cygwin process in the session exits.
+- If a character in a filename cannot be represented in the current
+ character set, it is converted to the sequence Ctrl-N + UTF-8
+ representation of the character. This makes it possible to access all
+ files, even those whose filename has no valid representation in the
+ current character set (codepage). To always get a valid string, use the
+ UTF-8 charset by setting the environment variable $LANG, $LC_ALL, or
+ $LC_CTYPE to a valid POSIX value, for instance in Cygwin.bat like this:
+
+ set LC_CTYPE=en_US.UTF-8
+
- PATH_MAX is now 4096. Internally, path names can be as long as the
underlying OS can handle (32K).
-
-- UTF-8 filenames are supported now. So far, this requires to set
- the environment variable CYGWIN to contain "codepage:utf8". but this
- will likely disappear at one point. The setting of $LANG or $LC_CTYPE
- will be used instead.
- struct dirent now supports d_type, filled out with DT_REG or DT_DIR.
All other file types return as DT_UNKNOWN for performance reasons.
@@ -176,6 +181,19 @@
<sect2 id="ov-new1.7-posix"><title>Other POSIX related changes</title>
<screen>
+- A lot of character sets are supported now via a call to setlocale().
+ The setting of the environment variables $LANG, $LC_ALL or $LC_CTYPE will
+ be used. For instance, setting $LANG to "de_DE.ISO-8859-15" before
+ starting a Cygwin session will use the ISO-8859-15 character set in
+ the entire session. UTF-8 is supported as well, as in "en_US.UTF-8".
+
+ The full list of supported character sets: "ASCII", "ISO-8859-x" with x
+ in 1-16, except 12, "UTF-8", Windows codepages "CPxxx", with xxx in
+ (437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866, 874, 1125,
+ 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258), "JIS", "SJIS",
+ "eucJP", "Big5". The leading language and territory part (en_US) is not
+ used by Cygwin yet, but is required for POSIX compatibility.
+
- Allow multiple concurrent read locks per thread for pthread_rwlock_t.
- Implement pthread_kill(thread, 0) as per POSIX.
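
The new-features text above says the character set is taken from $LANG, $LC_ALL, or $LC_CTYPE, and that the leading language and territory part of a value such as "en_US.UTF-8" is not used yet. The following is a minimal sketch of that idea, not Cygwin's actual implementation. Both function names are hypothetical, and the lookup order (LC_ALL, then LC_CTYPE, then LANG) is the usual POSIX precedence, which the commit itself does not spell out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the charset part of a POSIX locale name of
   the form "language_TERRITORY.charset@modifier" into buf.
   Returns 0 on success, -1 if the name has no charset part or buf is
   too small to hold it. */
int locale_charset(const char *locale, char *buf, size_t buflen)
{
    const char *start, *end;
    size_t len;

    if (locale == NULL || buf == NULL || buflen == 0)
        return -1;

    start = strchr(locale, '.');
    if (start == NULL)
        return -1;              /* e.g. "C" or "en_US": no charset given */
    start++;                    /* skip the '.' */

    end = strchr(start, '@');   /* an optional modifier ends the charset */
    len = end ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= buflen)
        return -1;

    memcpy(buf, start, len);
    buf[len] = '\0';
    return 0;
}

/* Hypothetical helper: return the locale name that governs LC_CTYPE,
   checking LC_ALL, then LC_CTYPE, then LANG (assumed POSIX precedence).
   Empty values are treated as unset. Returns NULL if none is set. */
const char *ctype_locale_from_env(void)
{
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    size_t i;

    for (i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *value = getenv(vars[i]);
        if (value != NULL && *value != '\0')
            return value;
    }
    return NULL;
}
```

A caller would pass the result of ctype_locale_from_env() to locale_charset() and compare the extracted name against the supported list from the text ("UTF-8", "ISO-8859-15", "CP1252", and so on). An application still has to call setlocale(LC_CTYPE, "") itself for the environment setting to take effect.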