*BSD News Article 67680


Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.rmit.EDU.AU!aggedor.rmit.edu.au!phillip
From: phillip@mirriwinni.cse.rmit.edu.au (Phillip Musumeci)
Newsgroups: comp.os.linux.development.system,comp.unix.bsd.386bsd.misc,comp.unix.bsd.bsdi.misc,comp.unix.bsd.netbsd.misc,comp.unix.bsd.freebsd.misc,comp.os.linux.advocacy
Subject: Re: reading WORD ``attachments''    (aaaarrggh!)
Date: 04 May 1996 08:23:11 GMT
Organization: Computer Systems Engineering Department, RMIT Australia
Lines: 67
Message-ID: <PHILLIP.96May4182311@mirriwinni.cse.rmit.edu.au>
References: <4ki055$60l@Radon.Stanford.EDU> <4mb38b$680@solaria.cc.gatech.edu>
	<4mckp4$34m@rigel.tm.informatik.uni-frankfurt.de>
	<4md20d$c2@solaria.cc.gatech.edu> <4metnh$dio@sidhe.memra.com>
Reply-To: phillip@rmit.edu.au
NNTP-Posting-Host: pm.cse.rmit.edu.au
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
In-reply-to: michael@memra.com's message of 3 May 1996 23:33:21 -0700
Xref: euryale.cc.adfa.oz.au comp.os.linux.development.system:22989 comp.unix.bsd.386bsd.misc:924 comp.unix.bsd.bsdi.misc:3658 comp.unix.bsd.netbsd.misc:3495 comp.unix.bsd.freebsd.misc:18697 comp.os.linux.advocacy:47984

>>>>> "Michael" == Michael Dillon <michael@memra.com> writes:

    Michael> Nah, just save it to disk and use "less" to read
    Michael> it. Ignore the warning about a binary file. Guess what, you
    Michael> will see all the text that the person *INTENDED* you to see
    Michael> plus some other stuff they may not have intended like scraps
    Michael> of previous versions and previous filenames that the document
    Michael> had.

This sure is TRUE :-)

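Here is a very low-tech filter (not really a conversion tool) to clean up
the file before piping it into less/more.  It strips the high bit, maps CR
to NL, keeps printable characters plus NL and TAB, and replaces each run of
anything else with a single space.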
phillip

#include <stdio.h>

/* output skip status */
int skip=0;            /* =1 whenever output is deleted */

/* output c, preceded by a single ' ' if any input was skipped since the
   last output */
void out_char( char c )
{
  if( skip == 1 ) putchar( ' ' );
  putchar( c );
  skip = 0;
}

int main( void )
{
  char c;        /* input-output char */
  int x;         /* raw input         */

  while ( (x=getchar()) != EOF )
  {
    x = x & 0x7f;             /* strip the high bit */
    c = (char) x;
    if( (x >= (int) ' ') && (x < 0x7f) ){
      out_char( c );          /* output printable 7 bit char */
    } else {
      switch ( c ){
      case '\r':
        out_char( '\n' );     /* map CR to NL, then output */
        break;
      case '\n':
      case '\t':
        out_char( c );        /* output NL and TAB */
        break;
      default:
        skip = 1;             /* skip other chars */
        break;
      }
    }
  }
  return 0;
}
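
If you would rather call the same rules from other code than run a
stdin/stdout filter, something like the following might do.  It is only a
rough sketch: the name filter_buf and its signature are made up for this
illustration and are not part of the program above.  It assumes the caller
supplies an output buffer of at least n + 1 bytes, which is enough because
a space is only ever inserted in place of at least one dropped byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original filter): apply the same
   rules to a memory buffer.  out must have room for n + 1 bytes.
   Returns the number of bytes written, not counting the final NUL. */
size_t filter_buf(const unsigned char *in, size_t n, char *out)
{
    size_t w = 0;       /* write position in out */
    size_t i;
    int skip = 0;       /* =1 while input is being dropped */

    assert(in != NULL && out != NULL);

    for (i = 0; i < n; i++) {
        int x = in[i] & 0x7f;           /* strip the high bit */

        if (x == '\r')
            x = '\n';                   /* map CR to NL */

        if ((x >= ' ' && x < 0x7f) || x == '\n' || x == '\t') {
            if (skip)
                out[w++] = ' ';         /* one space per dropped run */
            out[w++] = (char) x;
            skip = 0;
        } else {
            skip = 1;                   /* drop other chars */
        }
    }
    out[w] = '\0';
    return w;
}
```

Keeping the skip flag local rather than global means each call starts
fresh, so a dropped run at the end of one buffer does not leak a space
into the next one.  Whether that is what you want depends on whether you
feed it whole files or chunks.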

-- 
---------------------------------------------------------------------------
Dr Phillip Musumeci       __  /\  Postal Address:
Telephone:               /  \/  \     Dept. of Computer Systems Engineering 
 ++61 3 96605317(w1)    /        \    RMIT, GPO Box 2476V
 ++61 3 96605383(w2)   /         /    Melbourne 3001
 ++61 3 96605340(fax)  \   __   /     AUSTRALIA
RMIT Building 87.2.15,  `-'  \*/    WWW: http://pm.cse.rmit.edu.au/~phillip
410 Elizabeth Street.         .   EMAIL: phillip@rmit.edu.au
---------------------------------------------------------------------------
UNIX _IS_ user friendly.  It's just selective about who its friends are.
                                                                  --unknown