author     Indrajith K L  2022-12-03 17:00:20 +0530
committer  Indrajith K L  2022-12-03 17:00:20 +0530
commit     f5c4671bfbad96bf346bd7e9a21fc4317b4959df
tree       2764fc62da58f2ba8da7ed341643fc359873142f /coreutils-5.3.0-bin/contrib/gawk/3.1.6/gawk-3.1.6-src/README_d/README.multibyte
Adds most of the tools
Diffstat (limited to 'coreutils-5.3.0-bin/contrib/gawk/3.1.6/gawk-3.1.6-src/README_d/README.multibyte')
-rw-r--r-- coreutils-5.3.0-bin/contrib/gawk/3.1.6/gawk-3.1.6-src/README_d/README.multibyte | 29
1 files changed, 29 insertions, 0 deletions
diff --git a/coreutils-5.3.0-bin/contrib/gawk/3.1.6/gawk-3.1.6-src/README_d/README.multibyte b/coreutils-5.3.0-bin/contrib/gawk/3.1.6/gawk-3.1.6-src/README_d/README.multibyte
new file mode 100644
index 0000000..135ba86
--- /dev/null
+++ b/coreutils-5.3.0-bin/contrib/gawk/3.1.6/gawk-3.1.6-src/README_d/README.multibyte
@@ -0,0 +1,29 @@
+Fri Jun 3 12:20:17 IDT 2005
+============================
+
+As noted in the NEWS file, as of 3.1.5, gawk uses character values instead
+of byte values for `index', `length', `substr' and `match'. This works
+in multibyte and Unicode locales.
+
+Wed Jun 18 16:47:31 IDT 2003
+============================
+
+Multibyte locales can cause occasional weirdness, in particular with
+ranges inside brackets: /[....]/. Something that works great for ASCII
+will choke for, e.g., en_US.UTF-8. One such program is test/gsubtst5.awk.
+
+By default, the test suite runs with LC_ALL=C and LANG=C. You
+can change this by doing (from a Bourne-style shell):
+
+ $ GAWKLOCALE=some_locale make check
+
+Then the test suite will set LC_ALL and LANG to the given locale.
+
+As of this writing, this works for en_US.UTF-8, and all tests
+pass except gsubtst5.
+
+For the normal case of RS = "\n", the locale is largely irrelevant.
+For other single byte record separators, using LC_ALL=C will give you
+much better performance when reading records. Otherwise, gawk has to
+make several function calls *per input character* to find the record
+terminator. You have been warned.
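
The character-versus-byte distinction described in the 2005 note can be
checked from a Bourne-style shell. The following is only a minimal sketch,
not part of the gawk distribution: it assumes gawk 3.1.5 or later is on
PATH, the helper name char_len is made up for illustration, and
en_US.UTF-8 is used only because the README names it. That locale may not
be installed on a given system, in which case the C library normally falls
back to the C locale and both lines report the byte count.

```shell
#!/bin/sh
# Sketch: compare gawk's length() under two locales.
# Assumptions: gawk >= 3.1.5 is on PATH; en_US.UTF-8 may or may not exist.

# char_len LOCALE  (hypothetical helper)
# Prints length() of a single U+00E9 character, fed to gawk as its
# two-byte UTF-8 encoding (octal escapes \303\251), under LOCALE.
char_len() {
    printf '\303\251\n' | LC_ALL="$1" gawk '{ print length($0) }'
}

# Run the comparison only when gawk is actually available.
if command -v gawk >/dev/null 2>&1; then
    echo "C locale:     $(char_len C)"            # counts bytes
    echo "UTF-8 locale: $(char_len en_US.UTF-8)"  # counts characters if the locale exists
fi
```

In the C locale the input is two single-byte characters, so length()
reports 2. In a working UTF-8 locale the same bytes form one character.
The same comparison applies to `index', `substr' and `match'.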