*BSD News Article 9548



Received: by minnie.vk1xwt.ampr.org with NNTP
	id AA5954 ; Sat, 02 Jan 93 04:19:36 EST
Xref: sserve comp.unix.bsd:9605 comp.std.internat:1575
Newsgroups: comp.unix.bsd,comp.std.internat
Path: sserve!manuel.anu.edu.au!munnari.oz.au!uunet!psinntp!ficc!peter
From: peter@ferranti.com (peter da silva)
Subject: Re: Dumb Americans (was INTERNATIONALIZATION: JAPAN, FAR EAST)
Message-ID: <id.E1FW.PX5@ferranti.com>
Followup-To: comp.std.internat
Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
Organization: Xenix Support, FICC
References: <1ht8v4INNj7i@rodan.UU.NET> <1993Jan1.094759.8021@fcom.cc.utah.edu> <1i2k09INN4hl@rodan.UU.NET>
Date: Mon, 4 Jan 1993 15:35:27 GMT
Lines: 34

In article <1i2k09INN4hl@rodan.UU.NET> avg@rodan.UU.NET (Vadim Antonov) writes:
> You omitted one small "detail" -- you need to know the language of the word
> the letter belongs to to make a conversion.

Yes.

> Since Unicode does not
> provide for specifying the language it is obvious that it should be
> obtained from user or kept somewhere off the text. In both cases
> as our program ALREADY knows the language from the environment it knows
> the particular (small) alphabet -- no need to use multibyte encodings!

Unless you want your document to contain multilingual data. *Your* solution
is only useful for documents containing a single language, in which case
why bother with ISO8859.*? A separate character code table for every
language would be quite acceptable. For that matter, you can take the next
step and say "why standardise character sets when every application has
specific needs? Financial packages need dozens of currency symbols, for
example, and mathematics needs a whole host of its own symbols... each
application knows the set it needs to use... and for most text documents a
6-bit code is quite adequate"...

You have identified two problems with Unicode and ISO 10646: case conversion
and lexical ordering.
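
To make those two problems concrete, here is a rough sketch in Python. The
function names and the tiny rule tables are made up for illustration; they
are not part of Unicode, ISO 10646, or any library, and real rules are far
larger. The point is only that the same code points need a language
argument before either operation gives the right answer.

```python
# Illustrative only: hypothetical helpers with deliberately tiny rule tables.

SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"  # å, ä, ö sort after z
GERMAN_FOLD = str.maketrans("äöü", "aou")           # simplified dictionary order


def upper_for_language(text, lang):
    """Uppercase text, applying a language-specific rule where we know one."""
    if lang == "tr":
        # Turkish: dotted i uppercases to dotted capital I (U+0130).
        # Dotless ı (U+0131) already uppercases to plain I via str.upper().
        text = text.replace("i", "\u0130")
    return text.upper()


def sort_words(words, lang):
    """Sort words using a (much simplified) per-language ordering."""
    if lang == "sv":
        def key(w):
            return [SWEDISH_ALPHABET.find(c) for c in w.lower()]
    elif lang == "de":
        def key(w):
            return w.lower().translate(GERMAN_FOLD)
    else:
        key = str.lower
    return sorted(words, key=key)


if __name__ == "__main__":
    print(upper_for_language("istanbul", "en"))
    print(upper_for_language("istanbul", "tr"))
    print(sort_words(["zon", "ära", "arm"], "sv"))
    print(sort_words(["zon", "ära", "arm"], "de"))
```

Note that the language is passed per call, not per document, so nothing
stops a multilingual text from carrying a language tag on each run of words
while still using one character set throughout.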

> See how Unicode renders itself useless?

Unless you want to work on multilingual documents, yes. It could be better,
certainly, but to say it's *useless* is hyperbole.
-- 
Peter da Silva                                            `-_-'
Ferranti International Controls Corporation                'U` 
Sugar Land, TX  77487-5012 USA
+1 713 274 5180                            "Zure otsoa besarkatu al duzu gaur?"