*BSD News Article 62500


Return to BSD News archive

Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!nntp.coast.net!howland.reston.ans.net!psinntp!psinntp!psinntp!spunky.RedBrick.COM!nntp.et.byu.edu!cwis.isu.edu!news.cc.utah.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: conflicting types for `wchar_t'
Date: 19 Feb 1996 00:31:58 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 70
Message-ID: <4g8ge0$o1e@park.uvsc.edu>
References: <4eijt4$5sa@dali.cs.uni-magdeburg.de> <SODA.96Feb17221233@srapc133.sra.CO.JP>
NNTP-Posting-Host: hecate.artisoft.com

soda@sra.CO.JP (Noriyuki Soda) wrote:
] > 32 bits is a silly value for the size, since XDrawString16 exists
] > and XDrawString32 does not.
] 
] Umm...
] No.  How about representing X11 COMPOUND_TEXT characters by wide
] characters?  (Unicode violates source-code-separation on
] COMPOUND_TEXT <-> Unicode conversion.)
] In this case, 32 bits is necessary.

The X11R6 Xt compound text abstraction is broken (IMO).  It is
really toolkit-dependent in any case, and you can choose not to
use that toolkit (or not to use that feature while using the
rest of it).

The Motif XmString is proof of this posit, actually.
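
For reference, the 16-bit entry point mentioned in the quote
takes an array of XChar2b cells, one byte pair per character.
A minimal sketch (display, window, and GC setup omitted; mapping
byte1/byte2 to Unicode rows and cells is an assumption about the
font's encoding, not something Xlib itself guarantees):

    #include <X11/Xlib.h>

    void
    draw_wide(Display *dpy, Window win, GC gc, int x, int y)
    {
        /* XChar2b carries one 16-bit character as a byte pair. */
        XChar2b text[2];

        text[0].byte1 = 0x30; text[0].byte2 = 0x42;  /* U+3042 */
        text[1].byte1 = 0x30; text[1].byte2 = 0x44;  /* U+3044 */

        XDrawString16(dpy, win, gc, x, y, text, 2);
    }

There is no 32-bit analogue of this call in the protocol, which
is the point of the quoted complaint.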

] You can argue that there is no necessity of source-code-separation
] in this case.  But source-code-separation is required to support
] multi-script use of hanzi/hanja/kanji, mainly due to the font
] availability problem.

Agreed.  This is a problem with the use of Unicode as a method
of implementing multinationalization; in fact, it is only suitable
for internationalization (i.e., soft configurability for a single
locale).
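
In code terms, "soft configurability for a single locale" is just
the standard C model: one locale per process, selected at run
time from the environment.  A minimal sketch:

    #include <locale.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* Select one locale at run time from the environment
         * (LC_ALL, LANG, ...); internationalized, but only one
         * locale per process, which is not multinationalization. */
        if (setlocale(LC_ALL, "") == NULL)
            fprintf(stderr, "locale not supported\n");
        printf("running in locale: %s\n", setlocale(LC_ALL, NULL));
        return 0;
    }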

I frequently see the multinationalization argument used to
support the ISO 10646 32-bit encoding.  The problem is that most
uses are not for multilingual encoding.

Even translation, a multilingual use, frequently uses encodings
that are non-intersecting.  Look at JIS208 + JIS212 for
24-language encoding (World24).

It's only the CJK unification that causes problems, and then
usually only for translators.

I'd argue that pushing conflicting round-trip sets into the
application/library is an acceptable solution (though I have
no real love of compound documents).

I could make the CJK-unification argument with English longhand
and printed French, since the unified glyphs are not usably
interchangeable.


There seems to be a hellacious bias in the Unicode community
against fixed-cell rendering technologies in general and against
X in particular.

I dislike standardization of proprietary technology, and will be
using 16-bit (ISO10646/0) Unicode in most of my future text
processing work as the storage encoding (just like Windows 95 and
Windows NT) for a good long time.
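
As an illustration, a 16-bit storage encoding doesn't have to
ride on the platform's wchar_t at all (which is where this
thread started).  A minimal sketch; the ucs2_t type and helper
names here are my own, not from any header:

    #include <stddef.h>

    typedef unsigned short ucs2_t;    /* 16-bit storage code unit */

    /* Length of a NUL-terminated 16-bit string, like strlen(). */
    size_t
    ucs2len(const ucs2_t *s)
    {
        const ucs2_t *p = s;

        while (*p != 0)
            p++;
        return (size_t)(p - s);
    }

    /* Widen 7-bit ASCII into 16-bit cells; ASCII maps 1:1. */
    void
    ascii_to_ucs2(const char *src, ucs2_t *dst)
    {
        while ((*dst++ = (ucs2_t)(unsigned char)*src++) != 0)
            ;
    }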

I'll deal with the real problem (the inability to specify a
font with glyph variants in a single set, due to the clustering
of the "private use areas") when it becomes a pressing issue.
That is, if someone pays me a lot of money for a Hebrew, Arabic,
Tamil, or Devanagari version of an application.

The real problems with Unicode lie in areas other than encoding.


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.