*BSD News Article 8657



Newsgroups: comp.unix.bsd
Path: sserve!manuel.anu.edu.au!munnari.oz.au!metro!ipso!runxtsa!bde
From: bde@runx.oz.au (Bruce Evans)
Subject: Re: [386BSD] patch: ctype.h non-ANSI behaviour
Message-ID: <1992Dec8.172726.20540@runx.oz.au>
Organization: RUNX Un*x Timeshare.  Sydney, Australia.
References: <HGdbc8hesR@astral.msk.su>
Date: Tue, 8 Dec 92 17:27:26 GMT
Lines: 34

In article <HGdbc8hesR@astral.msk.su> ache@astral.msk.su writes:
>Hi, it is a really bug, man programs do something like:
>
>#include <ctype.h>
>...
>	for (s = string; *s; s++)
>		*s = tolower(*s);
>
>and got very strange results on non-alpha characters.

Code like isfoo(*s) and tofoo(*s) is usually broken on negative
characters.  The argument has to be cast to an unsigned char before
it can be passed to a ctype macro or function.  This bug is often
introduced when correct old code is ported to a STDC environment.
Where the old code has `isascii(*s) && isdigit(*s)' the new code
may have only isdigit(*s), because the STDC implementation didn't have
isascii and the programmer did not fully understand the ctype
functions.
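
For illustration, here is a minimal sketch of the quoted loop with the
cast added.  The helper name lower_string is made up for this example;
it is not taken from the man sources.  The ctype functions only accept
values representable as an unsigned char, or EOF, so a plain char that
may be negative has to be converted first.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * lower_string is a hypothetical helper, not code from man(1).
 * It lower-cases a NUL-terminated string in place.
 */
static void lower_string(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /*
         * Convert to unsigned char first: where plain char is signed,
         * a byte such as 0xE9 would otherwise reach tolower() as a
         * negative int that is neither EOF nor a valid character.
         */
        *s = (char)tolower((unsigned char)*s);
    }
}
```

Calling lower_string on a buffer holding "Hello, World 123!" should
leave "hello, world 123!" in it, and bytes with the top bit set are
passed to tolower() as values in the range 0..UCHAR_MAX.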

>I saw fix for library functions, but didn't saw it for macros,
>here it is:

>! #define _toupper(c)     ((c) - 'a' + 'A')
>! #define _tolower(c)     ((c) - 'A' + 'a')
>! #define toupper(c)      (islower(c) ? _toupper(c) : (c))
>! #define tolower(c)      (isupper(c) ? _tolower(c) : (c))

This is not standard conforming, because the argument to the macros
is evaluated more than once.  This problem is avoided by not providing
macros for toupper or tolower.  If the functions are gcc-inline functions
then they should be fast enough.  One correct way to implement toupper
as a macro is to use a table lookup.
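
A rough sketch of the table-lookup idea follows.  All the names here
(my_toupper_tab, my_toupper_init, MY_TOUPPER) are invented for the
example, the mapping is hard-wired to the ASCII letters, and it assumes
EOF is -1; a real library would supply a prebuilt, locale-aware table.
The point is only that the macro mentions its argument exactly once.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>      /* EOF */

/*
 * Hypothetical table: slot 0 holds the result for EOF, and slots
 * 1..UCHAR_MAX+1 hold the results for unsigned char values 0..UCHAR_MAX.
 * short is used so that EOF can be stored.
 */
static short my_toupper_tab[1 + UCHAR_MAX + 1];

/* Fill the table once; assumes ASCII-style contiguous 'a'..'z'. */
static void my_toupper_init(void)
{
    int c;

    assert(EOF == -1);  /* assumption made by this sketch */
    my_toupper_tab[0] = EOF;
    for (c = 0; c <= UCHAR_MAX; c++)
        my_toupper_tab[c + 1] =
            (short)((c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c);
}

/* The argument appears once, so MY_TOUPPER(*s++) advances s only once. */
#define MY_TOUPPER(c)   ((int)((my_toupper_tab + 1)[(c)]))
```

After my_toupper_init() has run, MY_TOUPPER('a') gives 'A' and
MY_TOUPPER(EOF) gives EOF.  The caller still has to pass an unsigned
char value or EOF, exactly as for the real toupper.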
-- 
Bruce Evans  (bde@runx.oz.au)