As a Perl developer, you’re probably aware of the language’s strengths in text processing and of how many computing tasks can be broken down into text-processing problems. You might not realize, though, that Perl is also a world-class list-processing language and that many problems can be expressed in terms of lists and their transformations.

Chief among Perl’s tools for list processing are the functions map and grep. I can’t count how many times in my twenty-five years as a developer I’ve run into code that could’ve been simplified if only the author had been familiar with these two functions. Once you understand map and grep, you’ll start seeing lists everywhere, along with opportunities to make your code more succinct and more expressive at the same time.

What are lists?

Before we get into functions that manipulate lists, we need to understand what they are. A list is an ordered group of elements, and those elements can be any kind of data you can represent in the language: numbers, strings, objects, regular expressions, references, etc., as long as they’re stored as scalars. You might think of a list as the thing that an array stores, and in fact Perl is fine with using an array where a list can go.

my @foo = (1, 2, 3);

Here we’re assigning the list of numbers from 1 to 3 to the array @foo. The difference between the array and the list is that the list is a fixed collection, while arrays and their elements can be modified by various operations. perlfaq4 has a great discussion on the differences between the two.
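To make that distinction a little more concrete, here’s a minimal sketch. Apart from @foo, the variable names are my own and purely illustrative. The array can be grown and changed in place, while the list on the right-hand side of an assignment is just a sequence of values that gets copied somewhere.

```perl
use strict;
use warnings;

# The list (1, 2, 3) is copied into the array @foo.
my @foo = (1, 2, 3);

# Arrays can be modified after the fact...
push @foo, 4;    # @foo is now (1, 2, 3, 4)
$foo[0] = 10;    # @foo is now (10, 2, 3, 4)

# ...and an array in scalar context yields its size.
my $size = @foo;    # 4

# A list can be unpacked straight into scalars without ever living in an array.
# The extra value 7 is simply discarded.
my ($first, $second) = (5, 6, 7);
```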

Lists are everywhere, man!

Ever wanted to sort some data? You were using a list.

join a bunch of things together into a string? List again.

split a string into pieces? You got a list back (in list context; in scalar context, you got the size of the list.)

Heck, even the humble print function and its cousin say take a list (and an optional filehandle) as arguments; it’s why you can treat Perl as an upscale AWK and feed it scalars to output with a field separator.

You’re using lists all the time and may not even know it.
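Here’s a short sketch that strings those everyday functions together. The data and variable names are made up for illustration; the special variables $, (output field separator) and $\ (output record separator) are what give print its AWK-like behavior.

```perl
use strict;
use warnings;

my @sorted = sort { $a <=> $b } (10, 9, 100);    # list in, sorted list out
my $joined = join '-', @sorted;                  # list in, string out
my @pieces = split /-/, $joined;                 # string in, list out
my $count  = split /-/, $joined;                 # scalar context: number of pieces

# print takes a list; localize the separators so the change stays in this block.
{
    local $, = ':';
    local $\ = "\n";
    print @pieces;    # prints 9:10:100
}
```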

map: The list transformer

The map function is devious in its simplicity. It takes two inputs: an expression or block of code, and a list to run it on. For every item in the list, it aliases $_ to that item and then returns zero, one, or many items in a list, depending on what the expression or code block evaluates to. You can call it like this:

my @foo = map bar($_), @list;

Or like this:

my @foo = map { bar($_) } @list;

We’re going to ignore the first way, though, because Conway (Perl Best Practices, 2005) tells us that when you specify the first argument as an expression, it’s harder to tell it apart from the remaining arguments, especially if that expression uses a built-in function where the parentheses are optional. So always use a code block!
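To see the “none, one, or many” behavior in action, here’s a small sketch using the block form. The list and variable names are my own examples.

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);

# One item out for every item in.
my @squares = map { $_ * $_ } @numbers;

# Two items out for every item in.
my @twice = map { ($_, $_) } @numbers;

# Zero or one item out: returning the empty list drops the element.
my @evens = map { $_ % 2 == 0 ? ($_) : () } @numbers;
```

That last form works, but filtering is really a job for grep, which we’ll get to shortly.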

You should always turn to map (and not, say, a for or foreach loop) when generating a new list from an old list. For example:

my @lowercased = map { lc } @mixed_case;
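For comparison, here’s a sketch of the same job done both ways, with a sample @mixed_case list I made up. The loop needs a separate declaration, an iteration variable, and a push; map says it all in one expression.

```perl
use strict;
use warnings;

my @mixed_case = qw(Perl MAP Grep);

# The loop version: declare, iterate, push.
my @from_loop;
for my $word (@mixed_case) {
    push @from_loop, lc $word;
}

# The map version: lc operates on $_ when called without an argument.
my @from_map = map { lc } @mixed_case;
```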

When paired with a lookup table, map is also an efficient way to tell whether a string is a member of a list, especially if that list is static:

use Const::Fast;

const my %IS_EXIT_WORD => map { ($_ => 1) }
  qw(q quit bye exit stop done last finish aurevoir);


die if $IS_EXIT_WORD{$command};

Here we’re using map’s ability to return multiple items per source element to generate a constant hash, and then testing membership in that hash.
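If you’d rather not install Const::Fast from CPAN, the same idea works with a plain lexical hash. This is only a sketch: is_exit_command is a hypothetical helper name, and the word list is shortened. The hash is built once, after which each membership test is a single lookup rather than a scan of the whole list.

```perl
use strict;
use warnings;

# Build the lookup table once; each word becomes a key with a true value.
my %is_exit_word = map { ($_ => 1) } qw(q quit bye exit);

# Hypothetical helper: true if $command is one of the exit words.
sub is_exit_command {
    my ($command) = @_;
    return exists $is_exit_word{$command};
}

print "Goodbye!\n" if is_exit_command('quit');
```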

grep: The list filter

You may recognize the word “grep” from the Unix command of the same name. It’s a tool for finding lines of text inside of other text using a regular expression describing the desired result.

Perl, of course, is really good at regular expressions, but its grep function goes beyond and enables you to match using any expression or code block. Think of it as a partner to map; where map uses a code block to transform a list, grep uses one to filter it down. In fact, other languages typically call this function filter.
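As a quick sketch of filtering on something other than a pattern, here are a couple of numeric tests against a made-up list. In scalar context, grep returns the number of matching elements rather than the elements themselves.

```perl
use strict;
use warnings;

my @numbers = (3, 8, 15, 22, 42);

# Keep only the elements for which the block returns true.
my @even  = grep { $_ % 2 == 0 } @numbers;
my @large = grep { $_ > 10 } @numbers;

# Scalar context: how many elements passed the test?
my $count = grep { $_ > 10 } @numbers;
```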

You can, of course, use regular expressions with grep, especially because a regexp match in Perl defaults to matching on the $_ variable and grep happens to provide that to its code block argument. So:

my @months_with_a = grep { /[Aa]/ } qw(
  January February March
  April   May      June
  July    August   September
  October November December
);

But grep really comes into its own when used for its general filtering capabilities; for instance, making sure that you don’t accidentally try to compare an undefined value:

say $_ > 5
  ? "$_ is bigger"
  : "$_ is equal or smaller"
  for grep { defined } @numbers;

Or when executing a complicated function that returns true or false depending on its arguments:

my @results = grep { really_large_database_query($_) } @list;  # @list stands in for whatever you're filtering

You might even consider chaining map and grep together. Here’s an example for getting the JPEG images out of a file list and then lowercasing the results:

my @jpeg_files = map  { lc }
                grep { /\.jpe?g$/i } @files;
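Chains like this read from right to left: the list goes in at the end, grep filters it, and map transforms whatever survives. Here’s the same chain run against a sample @files list of my own invention:

```perl
use strict;
use warnings;

my @files = qw(Cat.JPG notes.txt dog.jpeg Logo.PNG Beach.Jpg);

# Step 1 (right): keep names ending in .jpg or .jpeg, ignoring case.
# Step 2 (left):  lowercase each surviving name.
my @jpeg_files = map  { lc }
                 grep { /\.jpe?g$/i } @files;

print "$_\n" for @jpeg_files;
```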

“Side effects may include…” (updated)

When introducing map above I noted that it aliased $_ for every element in the list. I used that term deliberately because modifications to $_ will modify the original element itself, and that is usually an error. Programmers call that a “side effect,” and side effects can lead to unexpected behavior or at least difficult-to-maintain code. Consider:

my @needs_docs = grep { s/\.pm$/.pod/ && !-e } @pm_files;

The intent may have been to find files ending in .pm that don’t have a corresponding .pod file, but the actual behavior is replacing the .pm suffix with .pod, then checking whether that filename exists. If it doesn’t, it’s passed through to @needs_docs; regardless, @pm_files has had its contents modified.
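You can watch the aliasing happen without touching the filesystem. In this sketch I’ve dropped the -e test and used a made-up file list, so the only thing going on is the substitution inside the grep block:

```perl
use strict;
use warnings;

my @pm_files = qw(Foo.pm Bar.pm README);

# s/// changes $_, and $_ is an alias for the current element of @pm_files.
my @matched = grep { s/\.pm$/.pod/ } @pm_files;

print "matched:  @matched\n";
print "original: @pm_files\n";
```

After this runs, @matched holds Foo.pod and Bar.pod, and @pm_files has quietly become Foo.pod, Bar.pod, and README.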

If you really do need to modify a copy of each element, assign a variable within your code block like this:

my @needs_docs = grep {
                   my $file = $_;
                   $file =~ s/\.pm$/.pod/;
                   !-e $file
                 } @pm_files;

But at that point you should probably refactor your multi-​line block as a separate function:

my @needs_docs = grep { file_without_docs($_) } @pm_files;

sub file_without_docs {
    my $file = shift;
    $file =~ s/\.pm$/.pod/;
    return !-e $file;
}

In this case, since you’re using the substitution operator s///, you could also do this on Perl 5.14 or above to get non-destructive substitution:

use v5.14;

my @needs_docs = grep { !-e s/\.pm$/.pod/r } @pm_files;
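To convince yourself that the /r flag leaves the source alone, here’s a sketch with a made-up file list and no filesystem checks. With /r, s/// returns the modified copy instead of changing $_.

```perl
use v5.14;
use warnings;

my @pm_files = qw(Foo.pm Bar.pm);

# Each block evaluation returns a new string; $_ is never modified.
my @pod_files = map { s/\.pm$/.pod/r } @pm_files;

say "pod: @pod_files";
say "pm:  @pm_files";
```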

And if you do need side effects, just use a for or foreach loop; future code maintainers (i.e., you in six months) will thank you.
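For example, if the goal really is to change an array in place, a loop makes that intent obvious. This sketch trims whitespace from a made-up list of names:

```perl
use strict;
use warnings;

my @names = ('  alice ', 'bob  ', ' carol');

# The loop variable aliases each element, so the substitution
# deliberately changes @names itself.
for my $name (@names) {
    $name =~ s/^\s+|\s+$//g;
}

print join(',', @names), "\n";
```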

Taking you higher

map and grep are examples of higher-order functions, since they take a function (in the form of a code block) as an argument. So congratulations, you just significantly leveled up your knowledge of Perl and computer science. If you’re interested in more such programming techniques, I recommend Mark Jason Dominus’ Higher-Order Perl (2005), available for free online.
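You can write higher-order functions of your own, too. Here’s a minimal sketch of one that takes a code reference and returns a new function built on map. The name make_transformer is hypothetical, and this is just one way to do it.

```perl
use strict;
use warnings;

# Hypothetical helper: given a code reference, return a new function
# that applies it to every argument it receives.
sub make_transformer {
    my ($code) = @_;
    return sub { return map { $code->($_) } @_ };
}

my $double_all = make_transformer(sub { $_[0] * 2 });
my @doubled    = $double_all->(1, 2, 3);

print "@doubled\n";
```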

2 thoughts on “Better Perl: Using map and grep”

  1. Using modern Perl and the “r” modifier for s///, you can rewrite the last example as:

    my @needs_docs = grep { !-e s{[.]pm}{.pod}r } @pm_files;

    The /r modifier is one of the best things to come into Perl in recent years…

    • See, this is what Neil Bowers meant when he recently wrote, “People aren’t sure which features came in which version of perl, or whether they have a guard.” I had completely forgotten this was a thing and which version I’d need to specify. I’ll work on an update.
