*BSD News Article 9617


Received: by minnie.vk1xwt.ampr.org with NNTP
	id AA6110 ; Mon, 04 Jan 93 23:11:58 EST
Xref: sserve comp.unix.bsd:9674 comp.std.internat:1621
Path: sserve!manuel.anu.edu.au!munnari.oz.au!sgiblab!nec-gw!nec-tyo!wnoc-tyo-news!cs.titech!titccy.cc.titech!necom830!mohta
From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
Newsgroups: comp.unix.bsd,comp.std.internat
Subject: Dumb Terry
Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
Message-ID: <2637@titccy.cc.titech.ac.jp>
Date: 7 Jan 93 06:41:06 GMT
References: <2615@titccy.cc.titech.ac.jp> <1993Jan5.090747.29232@fcom.cc.utah.edu> <2628@titccy.cc.titech.ac.jp> <1993Jan7.045612.13244@fcom.cc.utah.edu>
Sender: news@titccy.cc.titech.ac.jp
Followup-To: comp.unix.bsd
Organization: Tokyo Institute of Technology
Lines: 118

In article <1993Jan7.045612.13244@fcom.cc.utah.edu>
	terry@cs.weber.edu (A Wizard of Earth C) writes:

>Before I proceed, I will [ once again ] remove the "dumb Americans" from my
>original topic line.

I changed the subject to reflect the content better.

>>>>>This I don't understand.  The maximum translation table from one 16 bit value
>>>>>to another is 16k.

>>>>WHAAAAT? It's 128KB, not 16k.

>It is still a translation of one 16 bit value to another.  It is *not* an
>*arbitrary* translation we are talking about, since the spanning sets will
>be known.

You wrote MAXIMUM.
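
The arithmetic behind the 128KB figure can be checked directly. A minimal sketch, assuming a flat lookup table with one 2-byte entry for every possible 16-bit input; the constant names are illustrative and do not come from the thread:

```python
# Size of a flat table that maps every 16-bit value to another 16-bit value.
ENTRIES = 1 << 16            # 65536 possible 16-bit inputs
BYTES_PER_ENTRY = 2          # each output is itself a 16-bit value

full_table_bytes = ENTRIES * BYTES_PER_ENTRY
print(full_table_bytes // 1024, "KB")

# A 16 KB table has room for only this many 2-byte entries.
entries_in_16k = (16 * 1024) // BYTES_PER_ENTRY
print(entries_in_16k, "entries")
```

The full table comes to 128 KB, while 16 KB covers 8192 entries, so a 16k table can only be a maximum if the set of inputs is restricted in advance.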

>>>>>This means 2 16k tables for translation into/out of
>>>>>Unicode for Input/Output devices,

>Sorry; I misspoke (mistyped?) here.  

You are dumb.

>I meant to refer to any arbitrary 8-bit
>set for which a localization set is available (example: an ISO 8859-x set).

Do you know what HASHING is? If not, read Knuth. 
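
One reading of the hashing remark is that the reverse direction (Unicode back to an 8-bit set) needs no flat 16-bit table, because at most 256 code points ever occur as keys. A minimal sketch, assuming ISO 8859-2 as the 8-bit set and Python's built-in `iso8859_2` codec as the source of the mapping; `build_tables` is a hypothetical helper name, and a Python `dict` stands in for the hash table:

```python
def build_tables(codec="iso8859_2"):
    """Return (to_unicode, from_unicode) for an 8-bit character set.

    to_unicode is a 256-entry list indexed by byte value.
    from_unicode is a hash table (dict) keyed by Unicode code point.
    """
    to_unicode = []
    from_unicode = {}
    for byte in range(256):
        try:
            code_point = ord(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            to_unicode.append(None)      # byte is unassigned in this set
            continue
        to_unicode.append(code_point)
        from_unicode[code_point] = byte
    return to_unicode, from_unicode

to_unicode, from_unicode = build_tables()
print(hex(to_unicode[0xA1]), hex(from_unicode[0x0104]))
```

The forward table has 256 entries and the hash table holds at most 256 keys, so neither comes anywhere near the size of a full 16-bit table.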

>Obviously, by this response, you meant "cat two files to a third file" rather
>than what you stated,

You don't have to create a third file, as the output might be piped.

>what you stated, which would have resulted in the files going to the
>screen.  Display device attribution based on supported character

While you may not know UNIX at all, "cat" has nothing to do with display.
Instead, some device drivers and terminal emulators might.
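
The point that "cat" only copies bytes and knows nothing about where they end up can be sketched as follows. This is an illustration under stated assumptions, not the real utility: the `cat` function and the in-memory streams are hypothetical stand-ins for files and a pipe.

```python
import io
import shutil

def cat(sources, sink):
    """Copy each binary source to sink in order, byte for byte."""
    for src in sources:
        shutil.copyfileobj(src, sink)

first = io.BytesIO(b"first file\n")
second = io.BytesIO(b"second file\n")
pipe_like = io.BytesIO()     # stands in for a pipe, a file, or a terminal

cat([first, second], pipe_like)
print(pipe_like.getvalue())
```

Nothing in the copy loop inspects the data or the destination, which is why any per-file attribute is lost as soon as two inputs are joined.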

>Obviously what you are asking is "how do I make two monolingual/bilingual/
>multilingual files of different language attribution into a single bilingual/
>multilingual file using cat" -- not the question as you have phrased it, nor
>as I have answered it, but in the context of the discussion, clearly the
>intended tack.

"How to "cat" files with different attributes" is the classic question
to piss off attribute-lovers, which all UNIX lovers know.

Of course, there are several other reasons not to use file attributes,
which you don't know. But I'm tired.

>Rather than pretending I don't know what you are getting at,

Then, don't post anymore.

>The answer is "you don't use 'cat'".  The "cat" command does not deal with

OK, say it in comp.unix.misc and see what happens.

>What this means is that all files which are multilingual in nature require
>a compound document architecture.

No thank you. I do want to grep my multilingual files.
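
The appeal of a flat file is that line-oriented tools keep working on it. A minimal sketch, assuming the multilingual text is a plain stream of encoded characters (UTF-8 is used here purely for convenience, as the thread does not name an encoding); `grep` is a hypothetical stand-in for the real utility:

```python
def grep(pattern, data):
    """Return the lines of data (bytes) that contain pattern (bytes)."""
    return [line for line in data.splitlines() if pattern in line]

text = "日本語の行\n中文的一行\nplain ASCII line\n".encode("utf-8")
matches = grep("中文".encode("utf-8"), text)
print([line.decode("utf-8") for line in matches])
```

The search needs no knowledge of document structure. A compound document format would require every such tool to parse that format first.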

>What this means is that a utility to combine documents (let's call it
>"combine") must have the ability to either generate language attributed
>files (if the source files are all of a single language attribution) or
>our default compound document format (TBD).

You are making a simple problem unsolvable.

>The correct approach is to note that since Unicode does not provide a
>mechanism directly for language attribution, and that file attribution
>is only a partial solution,

So, the correct approach is not to use Unicode as it is.

>What this means is that a utility to combine documents (let's call it
>"combine")

Wow!

>Does this answer your "cat" question sufficiently?  

Congratulations! You are now prepared to accept the second question.

In an internationalized environment, we often create a file with a Japanese
name. At the same time,

	1) we might have a file with a Chinese name in the same directory.
	2) we might have a file with a Chinese name in a different directory.
	3) the Japanese file's full pathname might contain Chinese in an
	   intermediate directory name.

Could you design a replacement for "ls" for such a situation?

Then, the third:

>Attribution of output and clever construction of our output device drivers
>would even allow us to switch fonts as dictated by the compound document
>architecture controls embedded in the file and/or the attribution of the
>file descriptor (the absence of such attribution being an indicator of a
>compound document).

Given the above situation for "ls", I'm afraid that the "argv" passed to any
command would have to be a compound document. Am I correct? Does it still have
type "char"? Do you think the entire OS is still UNIX?

>The problem seemed to
>be that there was not a means around the problem from your point of view.

Just include language information in character code, and the problem
disappears.
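
One way to picture this suggestion is a character code that carries its own language tag. A minimal sketch only: the bit layout, the `Lang` values, and the `tag`/`untag` helpers are invented for illustration and are not a scheme proposed in the thread or defined by any standard.

```python
from enum import IntEnum

class Lang(IntEnum):
    """Hypothetical language identifiers."""
    JAPANESE = 1
    CHINESE = 2

LANG_SHIFT = 16              # assumed layout: language above a 16-bit code

def tag(lang, code):
    """Pack a language id and a 16-bit character code into one integer."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("code must fit in 16 bits")
    return (int(lang) << LANG_SHIFT) | code

def untag(tagged):
    """Split a tagged code back into (language, 16-bit code)."""
    return Lang(tagged >> LANG_SHIFT), tagged & 0xFFFF

# U+76F4 is a unified Han character usually drawn differently in
# Japanese and Chinese fonts.
japanese_file = [tag(Lang.JAPANESE, 0x76F4)]
chinese_file = [tag(Lang.CHINESE, 0x76F4)]

combined = japanese_file + chinese_file      # plain concatenation, as "cat" does
print([untag(code) for code in combined])
```

Because each code identifies its own language, concatenation, pathnames, and "argv" strings need no side-channel attribute. The cost is a wider character code.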

						Masataka Ohta