This month I started a new job at Alert Logic, a cybersecurity provider with Perl (among many other things) at its beating heart. I’ve been learning a lot, and part of the process has been understanding the APIs in the code base. To that end, I’ve been writing small test scripts to tease apart data structures, using Perl array-​processing, list-​processing, and hash- (i.e., associative array)-processing functions.

I’ve covered map, grep, and friends a couple times before. Most recently, I described using List::Util’s any function to check if a condition is true for any item in a list. In the simplest case, you can use it to check to see if a given value is in the list at all:

use feature 'say';
use List::Util 'any';
my @colors =
  qw(red orange yellow green blue indigo violet);
say 'matched' if any { /^red$/ } @colors;

However, if you’re going to be doing this a lot with arbitrary strings, Perl FAQ section 4 advises turning the array into the keys of a hash and then checking for membership there. For example, here’s a simple script to check if the colors input (either from the keyboard or from files passed as arguments) are in the rainbow:

#!/usr/bin/env perl

use v5.22; # introduced <<>> for safe opening of arguments
use warnings;
 
my %in_colors = map {$_ => 1}
  qw(red orange yellow green blue indigo violet);

while (<<>>) {
  chomp;
  say "$_ is in the rainbow" if $in_colors{$_};
}

List::Util has a bunch of functions for processing lists of pairs that I’ve found useful when pawing through hashes. pairgrep, for example, acts just like grep but instead assigns $a and $b to each key and value passed in and returns the resulting pairs that match. I’ve used it as a quick way to search for hash entries matching certain value conditions:

use List::Util 'pairgrep';
my %numbers = (zero => 0, one => 1, two => 2, three => 3);
my %odds = pairgrep {$b % 2} %numbers;

Sure, you could do this by invoking a mix of plain grep, keys, and a hash slice, but it’s noisier and more repetitive:

use v5.20; # for key/value hash slice 
my %odds = %numbers{grep {$numbers{$_} % 2} keys %numbers};

pairgreps compiled C‑based XS code can also be faster, as evidenced by this Benchmark script that works through a hash made of the Unix words file (479,828 entries on my machine):

#!/usr/bin/env perl

use v5.20;
use warnings;
use List::Util 'pairgrep';
use Benchmark 'cmpthese';

my (%words, $count);
open my $fh, '<', '/usr/share/dict/words'
  or die "can't open words: $!";
while (<$fh>) {
  chomp;
  $words{$_} = $count++;
}
close $fh;

cmpthese(100, {
  grep => sub {
    my %odds = %words{grep {$words{$_} % 2} keys %words};
  },
  pairgrep => sub {
    my %odds = pairgrep {$b % 2} %words;
  },
} );

Benchmark output:

           Rate     grep pairgrep
grep     1.47/s       --     -20%
pairgrep 1.84/s      25%       --

In general, I urge you to work through the Perl documentations tutorials on references, lists of lists, the data structures cookbook, and the FAQs on array and hash manipulation. Then dip into the various list-​processing modules (especially the included List::Util and CPAN’s List::SomeUtils) for ready-​made functions for common operations. You’ll find a wealth of techniques for creating, managing, and processing the data structures that your programs need.

2 thoughts on “Perl list processing is for hashes, too

Comments are closed.