*BSD News Article 83692



Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!nntp.coast.net!howland.erols.net!EU.net!sun4nl!fwi.uva.nl!not-for-mail
From: casper@fwi.uva.nl (Casper H.S. Dik)
Newsgroups: comp.unix.solaris,comp.unix.bsd.misc,comp.unix.internals
Subject: Re: Solaris 2.6
Supersedes: <cancel.casper.329c06bc@mail.fwi.uva.nl>
Date: 27 Nov 1996 10:15:41 +0100
Organization: Sun Microsystems, Netherlands
Lines: 248
Distribution: inet
Message-ID: <casper.329c06bc@mail.fwi.uva.nl>
References: <32986299.AC7@mail.esrin.esa.it> <casper.329abb76@mail.fwi.uva.nl> <57ej3a$7ij@panix2.panix.com> <casper.329ae8f2@mail.fwi.uva.nl> <57fipg$q7j@panix2.panix.com>
NNTP-Posting-Host: mail.fwi.uva.nl
Xref: euryale.cc.adfa.oz.au comp.unix.solaris:90594 comp.unix.bsd.misc:1605 comp.unix.internals:11342

tls@panix.com (Thor Lancelot Simon) writes:

>No, you don't.  If you absolutely have to support old .o files, keep an old
>set of headers and an old libc around until you can rebuild all of the old
>applications.  This works -- it's not tremendously elegant, but it works --
>and it's a lot cleaner than having two exposed system calls, and requiring
>some new programs to know about them!

We must support old .o files; many, many people ship libraries for customers
to use.

And think about it: Sun's compilers (F77, f90, C++, etc.) would *all*
require linking against the old C library & include files until a new
release of those compilers was out.  A release that would
then only be supported on 2.6+.

I don't think that duplicating headers and libraries is
a *clean* solution.

And what about support calls about people trying to link differently
compiled .o files together?  A nightmare.  Now they can sort-of mix
and match and have no problems.  They can upgrade to 2.6 and still
continue with their development without having to change compiler options.

2.6 also allows you to specify a linker option that in effect says "satisfy
2.5/2.4 symbols from the libraries only", so you can build for older
releases too.  That is another reason why such old symbols need to
be preserved.

>There are plenty of examples out there of commercial vendors doing just
>exactly the above.

Well, you mention HP-UX and SCO, and I have no idea how they did size
transitions.  Perhaps you can specify what changes they made and how
they made them.

I don't think it matters much what happens under the hood; what matters
much more is:

	- ease of use
	- transparency

Using a different set of libraries/headers for old programs and not
being able to mix old & new objects doesn't qualify on either count.

>Nonsense.  You're attributing to me an attitude which I don't have, and
>libelling a number of whole projects besides.  The 4.4BSD-derived systems have
>gone to great pains to ensure that not only can they run native precompiled
>software across OS revisions -- including the one which changed the size of
>off_t -- they can run precompiled software for other operating systems as
>well.  In fact, for a long time NetBSD/sparc not only ran its own binaries
>and 4.4BSD/sparc binaries quite happily, it ran quite a few SunOS 4 binaries
>that you folks somehow didn't quite get working under SunOS 5.  Still runs
>those, and many Solaris binaries, too.

Perhaps Linux is the worst offender here; I've seen many comments about
stuff needing to be recompiled when new releases arrived, or breaking with
new libraries.  The *BSD projects are much better controlled.

>Code that uses "long" to calculate with off_t's is *broken* -- but it won't
>lose unless lseek's argument is explicitly cast to "long", or its return
>value stored in a "long" -- because the header files take care of the former,
>and the compiler can and does warn about the latter.



Or if they declare lseek() themselves:

	long lseek(....);

Lint will find this, of course.

And it can be claimed, C standard in hand, that there cannot be an integral
type > long.


Yet all these things will work in Solaris 2.6.  Yes, the code is broken anyway,
but it's harder for Sun to say "that code is broken anyway" than it is
for non-commercial Unixes.

>And what happens to your snappy response on a machine where "long" is 64
>bits?

The entire argument doesn't apply because off_t would already be 64 bits.

>For compatibility on machines where "long" is 32 bits, you have to provide
>a larger type, yes.  However, in case you didn't notice, the filesystem code
>_already_ requires this -- and has since 4.2BSD -- so since you can run UFS
>we can safely presume that you can do this; you have quad_t, implemented in
>_some_ contiguous fashion.

Yes, but the filesystem code is totally out of view of user programs.
The system calls are not.  So you are in effect introducing a new type,
as the "long long" or whatever gets much more exposure.


>No! You bump the library major number and the old applications get the old
>library.

You seem to conveniently overlook the .o files we can't recompile.

>I'm telling you, we already wrestled this tiger -- and it turned out to be
>a kitten.

I still believe your vantage point is vastly different from that
of commercial OS vendors.  And as for BSDI, they started with 4.4BSD, which
already had large files.

>Those "major DB vendors" may be a lot less dumb than you think.  They can
>already handle 64 bit off_t at the source level -- have to, for the Alpha
>and various other systems -- and certainly _they_ can recompile.  In fact,
>their installation schemes are so incredibly system-dependent that one
>generally has to obtain a new set of media for the database when performing
>an OS upgrade anyway -- for example, between Solaris/x86 2.5 and 2.5.1, I
>had to get new Oracle media, because 7.2.3 would no longer install.

You didn't read what I wrote.

I said:

	major DB vendors sell APIs shipped in .o/.a/.so files.
	These would have to be linked against the old libraries in *your*
	scenario.
	In Sun's scenario they can still be used.

>Certainly, with warning, they can just type "make".  And if you really don't
>believe that they can, as a temporary expedient you can ship a "compatibility"
>set of headers and libraries for them to link against.

It makes development a whole lot messier, more complex, more error prone,
etc.

>Libraries created/maintained locally can be recompiled; think that one over
>for a moment.

Right, so you have to upgrade all your stuff to 2.6 *at once*.

No can do.   You'd first need to get all your vendors of .a/.o/.so stuff
to send you new 2.6 versions of their objects, then rebuild all your
stuff.   (Nice: rebuild X11 just for an OS upgrade.)

Or maintain two /usr/local trees for a prolonged time.


Now tell me, which solution is easier from a system administrator's or
developer's perspective?  Yours:

	- recompile all your libraries
	- throw away all old object files
	- reinstall large parts of your locally installed software, fixing
	  possible long vs off_t bugs along the way
	  [ keeping in mind that not all system administrators are all
	  that comfortable with source code ]
    or
	- go through contortions such as:
		if you want to link with X11R6, you need to compile with the
		old stuff; if you want to link with libfoo, you can just
		use the new stuff.

		Oh, and you can't combine libfoo and R6 yet until we've
		recompiled and installed R6, sorry.


Or ours:
	- everything will work just fine; it's just different
	  under the hood.  If you want to take advantage of large files,
	  please use an extra compile flag.

>Yes, objects with different ideas of the size of basic types should not be
>linked together.  This is pretty elementary.  However, this is a pretty
>small fraction of the total number of precompiled programs out there, and
>there _are_ methods of dealing with it; if you have to, you can ship a
>32-bit-offset toolchain _temporarily_ instead of gunking up interfaces and
>libraries _forever_.

Well, "forever" is a bit harsh.  There's still only a 32-bit Solaris; this
will be rectified when there is a 64-bit Solaris.

>Requiring a compile-time option and gunking up the libraries with two versions
>of "open" is so ugly I have trouble understanding how you can have talked
>yourself into thinking that it could ever really be the best option --
>remember, you're stuck with this from now on always!

It's not pretty, but it is the only solution that satisfies all constraints.


And it's not an interface you have to program to; it's all hidden from
view until you use tools to dissect it.

>There are commercial operating systems based on 4.4BSD.  In fact, the last
>time I looked, despite Sun's touting itself a bigtime player in the 
>Internet/intranet server market, it was _losing_ market share there to
>4.4-based systems, although of course everyone loses market share to Linux,
>too.

There was no installed 4.3BSD base worth mentioning on hardware now supported
by 4.4BSD.   There never was a big transition requiring serious binary
compatibility.

>No, but guess what?  In the timespan between 4.3BSD and 4.4BSD, Sun did that
>_twice_.  Tell me, can you take SunOS 3 .o files and link them to a SunOS 5
>program?  Same span of years.  Pot-kettle-black.

Twice?  Once.  And do you think this experience is something Sun or its
customers want to repeat?

>Well, yes and no.  In the first place, if you do this the 4.4 way, there _is_
>no such thing as a "large" fd.  Now, obviously, a 32-bit program can't seek
>off the end of a long file; the old lseek() won't let it.  The only problem
>you really have is with programs which explicitly do an ftruncate() without
>checking to see if they're at the EOF first.  There aren't many of these.  If
>you make sure you change all the system utilities, nobody's ever likely to
>get bit here -- almost nothing actually ever creates long files, and the
>applications which do do so, like databases, handle their files with their
>own tools, so when the database engine goes 64-bit its tools will, too.

There are many programs doing seeks, reads, and writes intermixed.

A 32-bit program can easily read past the 31-bit boundary, fetch its
current location with lseek() (truncated to 32 bits), store it, later
seek to the wrong, truncated offset, and then write there and
corrupt the file.

>Even exporting this difference to the libraries is a huge lose.  And I can't
>use non-broken 32-bit programs with the new long files without explicitly
>performing some kind of conversion on the fd before handing it to them, right?

You can; the file must be opened by a large-file-aware process.

The solution chosen at the Large File Summit had one constraint:
there shall be no file corruption when using 32-bit applications.


Your solution hasn't given us that (but then, there was no installed base
that could complain).  And making objects incompatible, whether without
the ability to tell them apart or with it (by changing the ELF version),
would have increased the hassle for system administrators and developers
alike.

The largefile solution chosen is painless for customers, though harder to
maintain.

Casper
-- 
Casper Dik - Sun Microsystems - via my guest account at the University
of Amsterdam.  My work e-mail address is: Casper.Dik@Holland.Sun.COM
Statements on Sun products included here are not gospel and may
be fiction rather than truth.