*BSD News Article 7605



Path: sserve!manuel.anu.edu.au!munnari.oz.au!spool.mu.edu!sol.ctr.columbia.edu!emory!swrinde!cs.utexas.edu!uunet!mcsun!Germany.EU.net!tools!ws
From: ws@tools.de (Wolfgang Solfrank)
Newsgroups: comp.unix.bsd
Subject: Re: Largest file size for 386BSD ?
Date: 9 Nov 92 15:32:43
Organization: TooLs GmbH, Bonn, Germany
Lines: 65
Message-ID: <WS.92Nov9153243@kurt.tools.de>
References: <1992Nov6.031757.20766@ntuix.ntu.ac.sg>
	<1992Nov6.173454.17896@fcom.cc.utah.edu>
NNTP-Posting-Host: kurt.tools.de
In-reply-to: terry@cs.weber.edu's message of 6 Nov 92 17:34:54 GMT

In article <1992Nov6.173454.17896@fcom.cc.utah.edu> terry@cs.weber.edu (A Wizard of Earth C) writes:
   In article <1992Nov6.031757.20766@ntuix.ntu.ac.sg> eoahmad@ntuix.ntu.ac.sg (Othman Ahmad) writes:
   >
   >Original Unix has the largest file size of 4Gbyte, because it uses 24-bit
   >pointer.(shouldn't it be 16Gbyte?)

   No, 4 Gig; think of identification of indirect blocks for a total of two
   levels of indirection using a 24 bit value.

While it is correct that some ancient versions of Unix (V7 up to SysV) use
24-bit pointers (actually block numbers), this limits the size of the
partition and only indirectly the size of a single file. The implied limit
on the partition size is 2^24*1024 = 16GB, as the logical block size on
these partitions in newer versions of SysV is 1K. Actually it could be even
larger, because the 24-bit block numbers are used only for the direct
blocks. The indirect blocks use 32-bit block numbers, which would extend the
possible partition size to 4TB (1TB = 1024GB).
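
The two partition figures follow from one block per distinct block number.
A minimal Python sketch of that arithmetic (the function name
max_partition_bytes is made up for illustration, and the 1K block size is
the one quoted above):

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

def max_partition_bytes(block_number_bits, block_size):
    """Bytes addressable when every distinct block number names one block."""
    return (2 ** block_number_bits) * block_size

# 24-bit block numbers with 1K logical blocks
direct_limit = max_partition_bytes(24, 1024)
# 32-bit block numbers with 1K logical blocks
indirect_limit = max_partition_bytes(32, 1024)

print(direct_limit // GIB, "GB")    # 16 GB
print(indirect_limit // TIB, "TB")  # 4 TB
```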

This file system had 10 direct blocks, 1 single-indirect, 1 double-indirect
and 1 triple-indirect block per inode. With 1K blocks and 4-byte block
numbers, each indirect block holds 1024/4 = 256 block numbers. This implies
a theoretical limit to the file size of
(10 + 256 + 256*256 + 256*256*256)*1024, about 16GB.
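
The sum above can be checked with a short Python sketch. The helper name
max_file_bytes and its parameters are made up for illustration; the count of
256 block numbers per indirect block is taken from the formula itself.

```python
GIB = 1024 ** 3

def max_file_bytes(direct, ptrs_per_indirect, block_size, levels=3):
    """Direct blocks plus one tree per indirection level, times block size."""
    indirect = sum(ptrs_per_indirect ** k for k in range(1, levels + 1))
    return (direct + indirect) * block_size

# 10 direct blocks, 256 block numbers per indirect block, 1K blocks
sysv_limit = max_file_bytes(direct=10, ptrs_per_indirect=256, block_size=1024)

print(sysv_limit)                  # 17247250432
print(round(sysv_limit / GIB, 2))  # 16.06
```

The triple-indirect term dominates: 256*256*256 blocks of 1K is exactly 16GB,
and the other terms add only about 64MB.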

   >What is the size for 386bsd?
   >If it still uses 24-bit pointers, then the largest size is still 4Gbyte.

   Yep.

In BSD filesystems the structure is a little different. The logical block size
is 512 bytes and the block numbers in the inode (and everywhere else) are
always 32-bit. Thus the maximum partition size is limited to 2^32*512 = 2TB.

The inode has 12 direct blocks and a similar structure of indirect blocks.
But as indirect blocks are always allocated in terms of filesystem blocks
and not fragments, they can store, depending on the filesystem block size,
up to 8192/4 = 2048 block addresses per indirect block. This results in a
theoretical limit to the file size of
(12 + 2048 + 2048*2048 + 2048*2048*2048)*512, about 4TB.
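
The BSD figures can be reproduced the same way. This is only a sketch of the
arithmetic as written above: the names are made up for illustration, and the
multiplier of 512 is the one used in the formula.

```python
TIB = 1024 ** 4

FS_BLOCK_SIZE = 8192      # filesystem block size used in the example above
BLOCK_NUMBER_BYTES = 4    # 32-bit block numbers
LOGICAL_BLOCK_SIZE = 512  # multiplier used in the formulas above

# Partition limit: one logical block per 32-bit block number
partition_limit = (2 ** 32) * LOGICAL_BLOCK_SIZE

# Block addresses that fit into one indirect block
ptrs = FS_BLOCK_SIZE // BLOCK_NUMBER_BYTES

# 12 direct blocks plus single, double and triple indirection
file_limit = (12 + ptrs + ptrs ** 2 + ptrs ** 3) * LOGICAL_BLOCK_SIZE

print(ptrs)                         # 2048
print(partition_limit // TIB)       # 2
print(round(file_limit / TIB, 3))   # 4.002
```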

Of course both filesystems' file (and partition) sizes are actually limited
by the file size entry in the inode, which is 32-bit (there is room in the
UFS inode for extending this to 64-bit), and by the fact that file offsets
used by lseek are signed longs. This limits the actual file size to 2GB.
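
To make the 2GB figure concrete, here is a small Python sketch. It assumes a
32-bit two's-complement long, as on the i386; the helper fits_in_long32 is
made up for illustration and uses the struct module only to check the range.

```python
import struct

LONG_BITS = 32  # assumption: long is 32 bits wide

# Largest offset a signed 32-bit long can hold
max_offset = 2 ** (LONG_BITS - 1) - 1

def fits_in_long32(offset):
    """True if offset can be stored in a signed 32-bit integer."""
    try:
        struct.pack("<i", offset)  # "<i" is a 4-byte signed int
        return True
    except struct.error:
        return False

print(max_offset)                      # 2147483647, one byte short of 2GB
print(fits_in_long32(max_offset))      # True
print(fits_in_long32(max_offset + 1))  # False
```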

   >This will be an important issue because soon we'll have hundreds of gigabytes,
   >instead of magabytes soon.
   >	It took the jump from tens mega to hundreds in just 10 years.

   Get around the problem:

   1)	Multiple partitions not exceeding the 4 Gig limit.
   2)	Larger terminal blocks.
   3)	Additional indirection levels.
   4)	Assumption of larger files = log-structure file systems (ala Sprite).

   I don't think it will be an issue that soon anyway.

As the limits on both the file size and the partition size implied by the
filesystem are large enough for quite some time (especially in the case
of the UFS, IMHO), the only change eventually required would be a modified
version of lseek. This would be far easier to add with backward compatibility
if the parameter that is now last came before the offset rather than after
it in the call structure. The now established version seems to require a new
system call.
--
ws@tools.de     (Wolfgang Solfrank, TooLs GmbH) +49-228-985800