TCLUG Archive

Re: [TCLUG:18438] regex for stripping FONT tags?



try

perl -i -p -e 'BEGIN{undef $/} s{</*font.*?>}{}igs' [filenames]

The BEGIN undefines the input record separator, so each file is read as
a single string.  The 's' causes '.' to match a newline.  Should work,
haven't tested it.

Patrick McCabe


----- Original Message -----
From: <barnabas@pobox.com>
To: <tclug-list@mn-linux.org>
Sent: Thursday, June 01, 2000 6:43 AM
Subject: Re: [TCLUG:18438] regex for stripping FONT tags?


> You might try undefining the input record separator, $/ (IIRC) or
> $INPUT_RECORD_SEPARATOR if you're use-ing English.  This will cause the
> entire file to be read into a single string and should allow you to use
> the regex below to get rid of <font> tags that cross new lines.
>
> HTH
>
> Eric
>
> Mike Hicks wrote:
> >
> > Luke Francl wrote:
> > >
> > > On Wed, 31 May 2000, Mike Hicks wrote:
> > >
> > > > I think you might try
> > > >
> > > > s/<\/*font.*?>//i
> > > >
> > > > The ? will make the regex find the nearest ">" rather than one at
> > > > the end of the line or the end of the document
> > >
> > > Ah, thank you. I really need to buy "Mastering Regular Expressions"...
> > >
> > > I also needed to add "g" to the end to find all occurrences; it was
> > > only finding one <font> tag per line.
> > >
> > > I've ended up with the following little blurb:
> > >
> > > perl -i -p -e 's/<\/*font.*?>//ig' [filenames]
> > >
> > > It works pretty nice, but doesn't match font tags that break across
> > > newlines. I tried throwing a \n* in there and adding "s" so that it
> > > treats the string as a single line, but neither helped.
> >
> > Hmm.. I think that perl is already breaking the string up into
> > line-by-line strings.  You'd probably have to somehow join() them back
> > together into one long string or prevent perl from breaking them up in
> > the first place.
> >
> > > This is OK since I can clean those culprits out by hand. Still, any
> > > ideas on how to fix that?
> > >
> > > Thanks,
> > >
> > > Luke
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: tclug-list-unsubscribe@mn-linux.org
> > > For additional commands, e-mail: tclug-list-help@mn-linux.org
> >
> > --
> >  _  _  _  _ _  ___    _ _  _  ___ _ _  __   Microsoft Windows:  A
> > / \/ \(_)| ' // ._\  / - \(_)/ ./| ' /(__   virus with mouse support.
> > \_||_/|_||_|_\\___/  \_-_/|_|\__\|_|_\ __)
> > [ Mike Hicks | http://umn.edu/~hick0088/ | mailto:hick0088@tc.umn.edu ]