*BSD News Article 32425



Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!msuinfo!agate!howland.reston.ans.net!swrinde!news.uh.edu!moocow.cs.uh.edu!wjin
From: wjin@moocow.cs.uh.edu (Woody Jin)
Newsgroups: comp.os.386bsd.misc
Subject: Re: FreeBSD platform
Date: 5 Jul 1994 04:39:17 GMT
Organization: University of Houston
Lines: 60
Message-ID: <2vao5l$sp2@masala.cc.uh.edu>
References: <2uacoc$gfi@reuter.cse.ogi.edu> <2v7tf4$ctj@mojo.eng.umd.edu> <2v87c0$6vo@masala.cc.uh.edu> <CsEJx0.L3@tfs.com>
NNTP-Posting-Host: moocow.cs.uh.edu

In article <CsEJx0.L3@tfs.com>, Julian Elischer <julian@tfs.com> wrote:
>In article <2v87c0$6vo@masala.cc.uh.edu>,
>Woody Jin <wjin@moocow.cs.uh.edu> wrote:
>>However, in Unix, a program does not reside in consecutive pages; they
>>are all scattered around physical memory.
>[....]
>>While the above story is not realistic, it is typical that similar stories
>>happen very often.  We may easily know this since a direct-mapped cache (DMC)
>>can't do LRU. For example,
>[...]
>>One problem is that it is very difficult to predict the hit ratio of the
>>cache in a multi-tasking OS.
>>The only way is to compare the performance by running various applications.
>
>I think you are confusing the buffer cache with the ram cache.
>The ram cache is not page aligned or even page based;
>it caches memory locations at a much smaller granularity,
>and cares not about the relative locations of pages.

Well, I am not confused about the buffer cache.  I was overloading ( 8-) )
the term "page".  Maybe just "block" would have been right.
BTW, do you know the block size of the ram cache for normal PCs ?
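For what it's worth, here is how I picture the mapping in a direct-mapped
cache; the sizes are my own guesses (a 256K cache with 16-byte lines), not
from any manual:

```python
# Toy sketch of a direct-mapped cache lookup; sizes are assumptions,
# not figures from any particular board.
CACHE_SIZE = 256 * 1024   # bytes in the external cache (assumed)
LINE_SIZE = 16            # bytes per cache line (assumed)
NUM_LINES = CACHE_SIZE // LINE_SIZE

def line_index(addr):
    """Physical address -> the one line it can occupy (direct mapped)."""
    return (addr // LINE_SIZE) % NUM_LINES

# Two addresses exactly CACHE_SIZE apart land on the same line,
# so they keep evicting each other:
a = 0x10000
b = a + CACHE_SIZE
print(line_index(a) == line_index(b))  # True
```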

>While it is true that two processes could easily be running with
>active code paths running to the same cache lines, with a large (256k)
>cache and few programs running, this doesn't happen that often. Anyway
>caches are used mostly to speed up looped code, in which case the 
>scale of loop times is so small compared to the scale of the scheduling
>quanta, that even though a cache line may be flushed out by a competing process
>it is still effective in assisting both processes because it assists
>all but the first loop on each scheduling burst.

Probably, if a single user is running just several processes, you may be
right.  In fact, I am glad to hear that a direct-mapped cache does just
the right thing for this case, since I can safely buy cheap PCs with such
a cache system (8-)).
But what I was talking about was the *multi-user* case - where possibly many
users are running many serious processes.
I doubt whether a direct-mapped cache can really help in this case.
Most quality products come with an (8-way) set-associative cache, which can
do LRU and which is not direct mapped.
If we go one level down, the PowerPC, for example, has an 8-way
set-associative cache (32K).  On the other hand, the Pentium has
two 2-way set-associative caches (8K each), which is pretty
close to direct mapped (only 2-way).
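To make the LRU point concrete, here is a toy simulation (my own model with
made-up sizes, not measurements) of two processes ping-ponging between
blocks that happen to map to the same place:

```python
# Toy cache-miss counter: direct mapped vs. 2-way set associative with LRU.
# Sizes and the trace are made up for illustration; a real 2-way cache of
# the same capacity would also have half as many sets, which I ignore here.
def misses_direct_mapped(trace, num_lines):
    lines = {}
    misses = 0
    for block in trace:
        idx = block % num_lines
        if lines.get(idx) != block:
            misses += 1
            lines[idx] = block
    return misses

def misses_two_way_lru(trace, num_sets):
    sets = {}  # set index -> resident blocks, most recently used last
    misses = 0
    for block in trace:
        ways = sets.setdefault(block % num_sets, [])
        if block in ways:
            ways.remove(block)   # hit: just refresh the LRU order
        else:
            misses += 1
            if len(ways) == 2:
                ways.pop(0)      # evict the least recently used way
        ways.append(block)
    return misses

# Two "processes" alternate between blocks 0 and 64; with 64 lines/sets
# both blocks map to index 0.
trace = [0, 64] * 10
print(misses_direct_mapped(trace, 64))  # 20 -- every access misses
print(misses_two_way_lru(trace, 64))    # 2  -- both blocks fit in one set
```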


>running with the ram cache turned off on a 486DX50 will
>result in a 3 to 5 times reduction in speed (from experience).

This is strange.
First, my motherboard manual says that in the case of the DX50, reads/writes
to the ram cache require 2 wait states using 20nsec cache SRAM
(and 0 wait states for the DX2-66 and DX-33, when 15nsec ram cache is used).
Considering that the 486DX also has an (internal) cache, a 3 to 5 times
reduction is too much, I think.  Does the 486DX also use a direct-mapped
cache for the internal one ?
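Here is the back-of-the-envelope arithmetic behind my doubt; all the cycle
counts are my assumptions, not measurements:

```python
# Rough model: average access time = (hits*t_hit + misses*t_miss) / total.
# Cycle counts below are assumptions for illustration only.
T_INTERNAL = 1   # assumed cost of an internal-cache hit, in CPU cycles
T_DRAM = 6       # assumed cost of a main-memory access, in CPU cycles

def effective_time(hits, total, t_hit, t_miss):
    """Average access time when `hits` of `total` accesses hit the cache."""
    return (hits * t_hit + (total - hits) * t_miss) / total

# Even with the external cache off, a 90%-hit internal cache keeps the
# average access cheap:
avg = effective_time(9, 10, T_INTERNAL, T_DRAM)
print(avg)           # 1.5 cycles on average
print(T_DRAM / avg)  # 4.0 -- only if *both* caches were off would the
                     # full slowdown appear
```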


Woody Jin