File search using regex

Michael Cohen michael.cohen at netspeed.com.au
Thu Apr 19 10:05:16 CST 2007


On Thu, Apr 19, 2007 at 12:03:36AM +0930, Mark Newton wrote:
> All you wordy python programmers are weenies.  What ever happened to
> efficient code, I ask you?  Sheesh!
> 
>   - mark

Mark,
  don't mistake wordy for inefficient - we would rather type a few more
  characters to make our code more readable, and it's probably just as
  efficient anyway.

  Take your example below and have someone who has never programmed in Perl
  read it. I used to write heaps of Perl, but I haven't done any for a while now
  and could not remember exactly what $_ and $/ mean (although I can probably
  figure it out from the context). It probably doesn't matter for something as
  simple as this though :-)

  BTW does your code handle the sliding window problem (a match that straddles
  the boundary between two reads)? It's unclear to anyone who is not intimately
  familiar with how Perl works internally. The following questions come to mind
  when reading this (bear in mind I have forgotten most of my Perl):
  
  Does grep read lines or fixed-size buffers? (It must read buffers, I suppose,
  or your program would not work because the pattern spans multiple lines.) How
  large are the buffers then?

  But the regex is anchored with ^ (start of line), so maybe it reads lines? And
  how many lines does it then read at once?

  If it reads lines, what is the line size? If it comes across a 10GB file with
  no \n, what happens? Does it try to load the whole lot into memory, or does it
  have a maximum line length? And if so, what is it?
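
  For comparison, here is a rough Python sketch that spells those assumptions
  out instead of leaving them implicit. It is only an illustration, not a
  port of your script: the function name and the 1MB default chunk size are my
  own choices, it treats the pattern as a literal byte string rather than a
  general regex, and it assumes plain \n line endings. It reads fixed-size
  chunks and carries the tail of each chunk over to the next read, so a match
  that straddles a chunk boundary is still found.

```python
import sys


def file_has_line_starting_with(path, needle, chunk_size=1 << 20):
    """Return True if `needle` (bytes) occurs at the start of a line in `path`.

    The file is read in chunks of `chunk_size` bytes, so memory use stays
    bounded even for a huge file that contains no newline at all.
    """
    if not needle:
        raise ValueError("needle must not be empty")

    # "Start of line" means: preceded by a newline, or at the start of the file.
    anchored = b"\n" + needle
    # Keeping len(anchored) - 1 bytes is enough for any match that straddles
    # two chunks to appear whole in the next window.
    overlap = len(anchored) - 1

    # Pretend there is a newline just before the first byte of the file,
    # so a match at offset 0 is handled by the same search.
    tail = b"\n"
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if anchored in window:
                return True
            tail = window[-overlap:]


if __name__ == "__main__":
    # The literal from the quoted script, copied as-is.
    pattern = b"!D2\nInvoice\n!C\nAUSTALIA EIGHT"
    for name in sys.argv[1:]:
        if file_has_line_starting_with(name, pattern):
            print(name)
```

  It is wordier, but the buffer size, the overlap and what "start of line"
  means are all visible in the code. A general regex would also need an upper
  bound on the match length to pick a safe overlap, which this sketch avoids
  by sticking to a literal.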

  Perl code tends to have lots of implicit assumptions and behaviours, which are
  fine as long as you know them. That's why Perl takes so long to truly master:
  it takes time to learn all those quirky implicit behaviours.

Michael.
  
> 
> #!/usr/bin/perl
> 
> # usage:  search.pl filespec
> # (filespec can be a list of files, wildcard, etc...)
> 
> local $/;
> 
> map {
>     open FH, "<$_";
>     print "$_\n" if grep (/^!D2\nInvoice\n!C\nAUSTALIA EIGHT/, <FH>);
>     close FH;
> } @ARGV;
> 
> # end
> 
> 
> 
> --------------------------------------------------------------------
> I tried an internal modem,                    newton at atdot.dotat.org
>      but it hurt when I walked.                          Mark Newton
> ----- Voice: +61-4-1620-2223 ------------- Fax: +61-8-82356937 -----
> -- 
> LinuxSA WWW: http://www.linuxsa.org.au/ IRC: #linuxsa on irc.freenode.net
> To unsubscribe or change your options:
>  http://www.netcraft.com.au/mailman/listinfo/linuxsa

