When I first started writing Perl in my early 20s, I tended to follow a lot of the structured programming conventions I had learned in school through Pascal, especially the notion that every function has a single point of exit. For example:

sub double_even_number {
    # not using signatures, this is mid-1990's code
    my $number = shift;

    if (not $number % 2) {
        $number *= 2;
    }

    return $number; 
}

This could get pretty convoluted, especially if I was doing something like validating multiple arguments. And at the time I didn’t yet grok how to handle exceptions with eval and die, so I’d end up with code like:

sub print_postal_address {
    # too many arguments, I know
    my ($name, $street1, $street2, $city, $state, $zip) = @_;
    # also this notion of addresses is naive and US-centric

    my $error;

    if (!$name) {
        $error = 'no name';
    }
    else {
        print "$name\n";

        if (!$street1) {
            $error = 'no street';
        }
        else {
            print "$street1\n";

            if ($street2) {
                print "$street2\n";
            }

            if (!$city) {
                $error = 'no city';
            }
            else {
                print "$city, ";

                if (!$state) {
                    $error = 'no state';
                }
                else {
                    print "$state ";

                    if (!$zip) {
                        $error = 'no ZIP code';
                    }
                    else {
                        print "$zip\n";
                    }
                }
            }
        }
    }

    return $error;
}

What a mess. Want to count all those braces to make sure they’re balanced? This is sometimes called the arrow anti-pattern, with the arrowhead(s) being the most nested statement. The default ProhibitDeepNests perlcritic policy is meant to keep you from doing that.

The way out (literally) is guard clauses: checking early if something is valid and bailing out quickly if not. The above example could be written:

sub print_postal_address {
    my ($name, $street1, $street2, $city, $state, $zip) = @_;

    if (!$name) {
        return 'no name';
    }
    if (!$street1) {
        return 'no street1';
    }
    if (!$city) {
        return 'no city';
    }
    if (!$state) {
        return 'no state';
    }
    if (!$zip) {
        return 'no zip';
    }

    print join "\n",
      $name,
      $street1,
      $street2 ? $street2 : (),
      "$city, $state $zip\n";

    return;
}

With Perl’s statement modifiers (sometimes called postfix controls) we can do even better:

    ...

    return 'no name'    if !$name;
    return 'no street1' if !$street1;
    return 'no city'    if !$city;
    return 'no state'   if !$state;
    return 'no zip'     if !$zip;

    ...

(Why if instead of unless? Because the latter can be confusing with double negatives.)
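
To make the double-negative point concrete, here is a minimal sketch. The helper names (check_name_if, check_name_unless, check_enabled_unless) and the $disabled flag are hypothetical and exist only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only. Each returns an error
# string when its guard fires and nothing (undef in scalar context) otherwise.
sub check_name_if {
    my $name = shift;
    return 'no name' if !$name;       # "bail out if there is no name"
    return;
}

sub check_name_unless {
    my $name = shift;
    return 'no name' unless $name;    # same logic, still easy to read
    return;
}

# The trouble starts when the condition is itself a negation.
sub check_enabled_unless {
    my $disabled = shift;

    # "unless not disabled" takes a moment to untangle;
    # it simply means "if disabled".
    return 'feature is off' unless !$disabled;
    return;
}
```

All three guards behave the same way as their if equivalents; the last one just makes the reader do the negation in their head.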

Guard clauses aren’t limited to the beginnings of functions or even exiting functions entirely. Often you’ll want to skip an iteration or leave a loop early when certain conditions are met, like this example that processes files from standard input or the command line:

while (<>) {
    next if /^SKIP THIS LINE: /;
    last if /^END THINGS HERE$/;

    ...
}
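
The same two guards can be tried without any input files. This is a small sketch that loops over a list instead of standard input; the collect_lines helper and the sample lines are made up for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical helper: keep lines until an end marker is seen,
# skipping any line flagged to be ignored.
sub collect_lines {
    my @lines = @_;
    my @kept;

    for my $line (@lines) {
        next if $line =~ /^SKIP THIS LINE: /;    # guard: ignore this line
        last if $line =~ /^END THINGS HERE$/;    # guard: stop the loop

        push @kept, $line;
    }

    return @kept;
}

my @kept = collect_lines(
    'first',
    'SKIP THIS LINE: not wanted',
    'second',
    'END THINGS HERE',
    'never reached',
);
print "$_\n" for @kept;    # prints "first" and "second", one per line
```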

Of course, if you are validating function arguments, you should consider using actual subroutine signatures if you have Perl v5.20 (released in 2014) or newer, or one of the other type validation solutions if not. Today I would write that postal function like this, using Type::Params for validation and named arguments:

use feature qw(say state); 
use Types::Standard 'Str';
use Type::Params 'compile_named';

sub print_postal_address {
    state $check = compile_named(
        name    => Str,
        street1 => Str,
        street2 => Str, {optional => 1},
        city    => Str,
        state   => Str,
        zip     => Str,
    );
    my $arg = $check->(@_);

    say join "\n",
      $arg->{name},
      $arg->{street1},
      $arg->{street2} ? $arg->{street2} : (),
      "$arg->{city}, $arg->{state} $arg->{zip}";

    return;
}

print_postal_address(
    name    => 'J. Random Hacker',
    street1 => '123 Any Street',
    city    => 'Somewhereville',
    state   => 'TX',
    zip     => '12345',    # quoted so ZIP codes with a leading zero survive
);
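
For comparison, here is a rough sketch of the subroutine-signature route mentioned earlier. It is not a replacement for the Type::Params version: signatures only check how many arguments were passed, so guard clauses still cover the empty values. The format_postal_address name, the positional argument order, and the choice to die rather than return an error string are all assumptions made for this example:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';    # needed on Perls before v5.36

# Sketch only: the signature enforces the argument count,
# and the guard clauses reject empty values.
sub format_postal_address ($name, $street1, $city, $state, $zip, $street2 = '') {
    die "no name\n"    if !$name;
    die "no street1\n" if !$street1;
    die "no city\n"    if !$city;
    die "no state\n"   if !$state;
    die "no zip\n"     if !$zip;

    return join "\n",
      $name,
      $street1,
      $street2 ? $street2 : (),
      "$city, $state $zip";
}

print format_postal_address(
    'J. Random Hacker', '123 Any Street', 'Somewhereville', 'TX', '12345',
), "\n";
```

Calling it with too few arguments dies before any guard clause runs, because the signature check happens first.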

Note that were this part of a larger program, I’d wrap that print_postal_address call in a try block and catch exceptions such as those thrown by the code reference $check generated by compile_named. This highlights one concern of guard clauses and other “return early” patterns: depending on how much has already occurred in your program, you may have to perform some resource cleanup either in a catch block or something like Syntax::Keyword::Try’s finally block if you need to tidy up after both success and failure.
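
As a rough illustration of that cleanup concern, the sketch below uses core eval and die as a stand-in for a try block, so it is not Syntax::Keyword::Try’s actual syntax. The process_record and process_all names and the in-memory file handle are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical worker: its guard clause throws after the caller
# has already acquired a resource (the file handle).
sub process_record {
    my ($fh, $record) = @_;
    die "empty record\n" if !$record;    # guard clause via exception
    print {$fh} "$record\n";
    return;
}

sub process_all {
    my @records = @_;
    my $log = '';
    open my $fh, '>', \$log or die "cannot open in-memory file: $!";

    # Core-Perl stand-in for try/catch: the eval yields 1 only on success.
    my $ok    = eval { process_record($fh, $_) for @records; 1 };
    my $error = $ok ? '' : ($@ || 'unknown error');

    # Cleanup that runs after both success and failure,
    # which is the role a finally block would play.
    close $fh;

    return ($log, $error);
}

my ($log, $error) = process_all('first', '', 'third');
print "logged: $log";
print "error: $error";
```

Here the second record trips the guard, so only the first record is logged, yet the handle is still closed before process_all returns.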