File search using regex
Michael Cohen
michael.cohen at netspeed.com.au
Thu Apr 19 10:05:16 CST 2007
On Thu, Apr 19, 2007 at 12:03:36AM +0930, Mark Newton wrote:
> All you wordy python programmers are weenies. What ever happened to
> efficient code, I ask you? Sheesh!
>
> - mark
Mark,
don't mistake wordy for inefficient - we would rather type a few more
characters to make our code more readable - it's probably just as efficient
though.
Compare your example below and have someone who has never programmed in perl
read it. I used to write heaps of perl, but I haven't done any for a while now -
and could not remember exactly what $_ and $/ mean (although I can probably
figure it out from the context). It probably doesn't matter for something as
simple as this though :-)
BTW does your code handle the sliding window problem (a match that straddles
the boundary between two reads)? It's unclear to anyone who is not intimately
familiar with how perl works internally. The following questions come to my
mind when reading this (bear in mind I forgot most of my perl):
Does grep read lines or fixed-size buffers? (It must read buffers, I suppose,
or your program would not work because the pattern spans multiple lines.) How
large are the buffers then?
But the regex is anchored at ^ (start of line), so maybe it reads lines? And
how many lines does it read at once then?
If it reads lines, what is the line size? If it comes across a 10GB file with
no \n, what happens? Does it try to load the whole lot into memory or does it
have a maximum line length? And if so, what is it?
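
To make the sliding window question concrete, here is a rough Python sketch of
what I mean. It is only an illustration, not a translation of your script: the
function name and the 64KB chunk size are made up for the example, and it
looks for the literal bytes anywhere in the file rather than applying a regex
anchored with ^. It reads fixed-size buffers and carries the last
len(pattern) - 1 bytes over to the next read, so a match split across two
buffers is still found and memory use stays bounded even for a huge file with
no \n in it.

```python
# Literal byte string taken from the quoted script (spelling kept as-is).
PATTERN = b"!D2\nInvoice\n!C\nAUSTALIA EIGHT"


def stream_contains(stream, pattern, chunk_size=64 * 1024):
    """Return True if the bytes `pattern` occur anywhere in `stream`.

    `stream` is any binary file-like object. At most
    chunk_size + len(pattern) - 1 bytes are held in memory at once.
    """
    if not pattern:
        return True
    overlap = len(pattern) - 1  # bytes to carry over between reads
    tail = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of file, nothing found
            return False
        window = tail + chunk
        if pattern in window:
            return True
        # Keep just enough of the end of this window to catch a match
        # that starts here and finishes in the next chunk.
        tail = window[-overlap:] if overlap else b""


if __name__ == "__main__":
    import sys

    # usage: search.py file1 file2 ...
    for name in sys.argv[1:]:
        with open(name, "rb") as fh:
            if stream_contains(fh, PATTERN):
                print(name)
```

Every assumption (buffer size, overlap, binary mode) is spelled out in the
code, which is the kind of wordiness I am defending.
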
Perl code tends to have lots of implicit assumptions and behaviours which are
fine as long as you know them. That's why it takes so long to truly master:
it takes time to learn all those quirky implicit behaviours.
Michael.
>
> #!/usr/bin/perl
>
> # usage: search.pl filespec
> # (filespec can be a list of files, wildcard, etc...)
>
> local $/;
>
> map {
> open FH, "<$_";
> print "$_\n" if grep (/^!D2\nInvoice\n!C\nAUSTALIA EIGHT/, <FH>);
> close FH;
> } @ARGV;
>
> # end
>
>
>
> --------------------------------------------------------------------
> I tried an internal modem, newton at atdot.dotat.org
> but it hurt when I walked. Mark Newton
> ----- Voice: +61-4-1620-2223 ------------- Fax: +61-8-82356937 -----