I remember a brief time in the mid-2000s insisting on so-called “Yoda conditions” in my Perl. I would place constants to the left of equality comparisons. In case I accidentally typed a single = instead of ==, the compiler would catch it instead of blithely assigning a variable. E.g.:

if ( $foo == 42 ) { ... } # don’t do this
if ( 42 == $foo ) { ... } # do this
if ( $foo = 42  ) { ... } # to prevent this

And because a foolish consistency is the hobgoblin of little minds, I would even extend this to string and relational comparisons.

if ( 'bar' eq $foo ) { ... } # weirdo
if ( 42 > $foo )     { ... } # make it stop

It looks weird, and it turns out it’s unnecessary as long as you precede your code with use warnings;. Perl will then warn you: “Found = in conditional, should be ==”. (Sidenote: Perl v5.36, due in mid-2022, is slated to enable warnings by default if you do use v5.35; or above, in addition to the strictness that was enabled with use v5.11;. Yay for less boilerplate!)
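As a rough sketch of what that warning looks like in practice, the snippet below compiles the buggy comparison inside a string eval and collects whatever Perl complains about. The $SIG{__WARN__} handler and the @warnings array are scaffolding I've added purely for illustration; in a normal program the message simply goes to STDERR.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    # A string eval compiles at run time, so the handler above is
    # already installed when Perl notices the suspicious assignment.
    eval q{
        my $foo = 0;
        if ( $foo = 42 ) { }    # oops: assignment, not comparison
        1;
    };
}

print @warnings;
```

The eval inherits the use warnings; in effect around it, which is why the check fires without any extra setup.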

If you want to fatally catch this and many other warnings, use the strictures module from CPAN in your code like this:

use strictures 2;

This will cause your code to throw an exception if it commits many categories of mistakes. If you’re running in a version control system’s working directory (specifically Git, Subversion, Mercurial, or Bazaar), the module also prevents you from using indirect object syntax, Perl 4‑style multidimensional arrays, and bareword filehandles.

Getting back to assignments vs. conditionals, there is one case where I’ve found it to be acceptable to use an assignment inside an if statement, and that’s when I need to use the result of a check inside the condition. For example:

if ( my $foo = some_truthy_function() ) {
    ... # do something further with $foo
}

This keeps the scope of some_truthy_function()’s result inside the block so that I don’t pollute the outer scope with a temporary variable. Fortunately, Perl doesn’t warn on this syntax.
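Here is a slightly fuller sketch of the same idiom. The %user_for table and the find_user() and describe_user() functions are hypothetical stand-ins for some_truthy_function(); the point is only that $name is visible inside the if/else and nowhere else.

```perl
use strict;
use warnings;

# Hypothetical lookup table and function, standing in for
# some_truthy_function() from the text.
my %user_for = ( 1 => 'alice', 2 => 'bob' );

sub find_user {
    my ($id) = @_;
    return $user_for{$id};    # undef (false) when the ID is unknown
}

sub describe_user {
    my ($id) = @_;

    # $name exists only inside this if/else statement.
    if ( my $name = find_user($id) ) {
        return "found $name";
    }
    else {
        return 'no such user';
    }
}

print describe_user(1), "\n";    # found alice
print describe_user(9), "\n";    # no such user
```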

Look, I get it. You don’t like the Perl programming language or have otherwise disregarded it as “dead.” (Or perhaps you haven’t, in which case please check out my other blog posts!) It has weird noisy syntax, mixing regular expressions, sigils on variable names, various braces and brackets for data structures, and a menagerie of cryptic special variables. It’s old: 34 years in December, with a history of (sometimes amateur) developers that have used and abused that syntax to ship code of questionable quality. Maybe you grudgingly accept its utility but think it should die gracefully, maintained only to run legacy applications.

But you know what? Perl’s still going. It’s had a steady cadence of yearly releases for the past decade, introducing new features and fencing in bad behavior while maintaining an admirable level of backward compatibility. Yes, there was a too-​long adventure developing what started as Perl 6, but that language now has its own identity as Raku and even has facilities for mixing Perl with its native code or vice versa.

And then there’s CPAN, the Comprehensive Perl Archive Network: a continually-​updated collection of over 200,000 open-​source modules written by over 14,000 authors, the best of which are well-​tested and ‑documented (applying peer pressure to those that fall short), presented through a search engine and front-​end built by scores of contributors. Through CPAN you can find distributions for things like:

All of this is available through a mature installation toolchain that doesn’t break from month to month.

Finally and most importantly, there’s the global Perl community. The COVID-​19 pandemic has put a damper on the hundreds of global Perl Mongers groups’ meetups, but that hasn’t stopped the yearly Perl and Raku Conference from meeting virtually. (In the past there have also been yearly European and Asian conferences, occasional forays into South America and Russia, as well as hackathons and workshops worldwide.) There are IRC servers and channels for chat, mailing lists galore, blogs (yes, apart from this one), and a quirky social network that predates Facebook and Twitter.

So no, Perl isn’t dead or even dying, but if you don’t like it and favor something newer, that’s OK! Technologies can coexist on their own merits and advocates of one don’t have to beat down their contemporaries to be successful. Perl happens to be battle-tested (to borrow a term from my friend Curtis “Ovid” Poe), it runs large parts of the Web (speaking from direct and ongoing experience in the hosting business here), and it’s still evolving to meet the needs of its users.

I publish Perl stories on this blog once a week, and it seems every time there’s at least one response on social media that amounts to, “I hate Perl because of its weird syntax.” Or, “It looks like line noise.” (Perl seems to have outlasted that one; when’s the last time you used an acoustic modem?) Or the quote attributed to Keith Bostic: “The only language that looks the same before and after RSA encryption.”

So let’s address, confront, and demystify this hate. What are these objectionable syntactical, noisy, possibly encrypted bits? And why does Perl have them?

Regular expressions

Regular expressions, or regexps, are not unique to Perl. JavaScript has them. Java has them. Python has them as well as another module that adds even more features. It’s hard to find a language that doesn’t have them, either natively or through the use of a library. It’s common to want to search text using some kind of pattern, and regexps provide a fairly standardized if terse mini-language for doing so. There’s even a C-based library called PCRE, or “Perl Compatible Regular Expressions,” enabling many other pieces of software to embed a regexp engine that’s inspired by (though not quite compatible with) Perl’s syntax.

Being itself inspired by Unix tools like grep, sed, and awk, Perl incorporated regular expressions into the language as few other languages have, with binding operators of =~ and !~ enabling easy matching and substitutions against expressions, and pre-compilation of regexps into their own type of value. Perl then added the ability to space out the parts of a regexp with whitespace to improve readability, use different delimiters to avoid the leaning-toothpick syndrome of escaping slash (/) characters with backslashes (\), and name your capture groups and backreferences when substituting or extracting strings.

All this is to say that Perl regular expressions can be some of the most readable and robust when used to their full potential. Early on this helped cement Perl’s reputation as a text-​processing powerhouse, though the core of regexps’ succinct syntax can result in difficult-​to-​read code. Such inscrutable examples can be found in any language that implements regular expressions; at least Perl offers the enhancements mentioned above.
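To make those enhancements concrete, here’s a small sketch that uses the /x modifier, named capture groups, and curly-brace delimiters together. The sample $line and the variable names are made up for the example.

```perl
use strict;
use warnings;

# /x lets us space out the pattern; named captures replace
# $1, $2, $3 with self-describing names.
my $date_re = qr{
    (?<year>  \d{4} ) -
    (?<month> \d{2} ) -
    (?<day>   \d{2} )
}x;

my $line = 'perl-5.34.0 released 2021-05-20 to /usr/local/bin';

my %date;
if ( $line =~ $date_re ) {
    # %+ holds the named captures of the last successful match.
    %date = map { $_ => $+{$_} } qw(year month day);
}

# Curly-brace delimiters avoid escaping every slash in the path.
( my $moved = $line ) =~ s{/usr/local}{/opt};

print "$date{year}/$date{month}/$date{day}\n";
print "$moved\n";
```

Compare the substitution with its slash-delimited equivalent, s/\/usr\/local/\/opt/, and the toothpicks speak for themselves.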

Sigils

Perl has three built-​in data types that enable you to build all other data structures no matter how complex. Its variable names are always preceded by a sigil, which is just a fancy term for a symbol or punctuation mark.

  • A scalar contains a string of characters, a number, or a reference to something, and is preceded with a $ (dollar sign).
  • An array is an ordered list of scalars beginning with an element numbered 0 and is preceded with a @ (at sign). 
  • A hash, or associative array, is an unordered collection of scalars indexed by string keys and is preceded with a % (percent sign).

So variable names $look @like %this. Individual elements of arrays or hashes are scalars, so they $look[0] $like{'this'}. (That’s the first element of the @look array counting from zero, and the element in the %like hash with a key of 'this'.)

Perl also has a concept of slices, or selected parts of an array or hash. A slice of an array looks like @this[1, 2, 3], and a slice of a hash looks like @that{'one', 'two', 'three'}. You could write it out long-hand like ($this[1], $this[2], $this[3]) and ($that{'one'}, $that{'two'}, $that{'three'}), but slices are much easier. Plus you can even specify one or more ranges of elements with the .. operator, so @this[0 .. 9] would give you the first ten elements of @this, or @this[0 .. 4, 6 .. 9] would give you nine with the one at index 5 missing. Handy, that.

In other words, the sigil always tells you what you’re going to get. If it’s a single scalar value, it’s preceded with a $; if it’s a list of values, it’s preceded with a @; and if it’s a hash of key-​value pairs, it’s preceded with a %. You never have to be confused about the contents of a variable because the name will tell you what’s inside.
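A short sketch pulling these rules together; the @letters and %number variables are invented for the example, and the comments note which sigil applies and why.

```perl
use strict;
use warnings;

my @letters = ( 'a' .. 'j' );                        # array: @
my %number  = ( one => 1, two => 2, three => 3 );    # hash:  %

my $first = $letters[0];       # one element, so $
my $two   = $number{'two'};    # one value, so $

my @some  = @letters[ 1, 2, 3 ];           # array slice, so @
my @picks = @number{ 'one', 'three' };     # hash slice, also @
my @gap   = @letters[ 0 .. 4, 6 .. 9 ];    # nine elements, skipping index 5

print "$first $two\n";       # a 2
print "@some\n";             # b c d
print "@picks\n";            # 1 3
print scalar(@gap), "\n";    # 9
```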

Data structures, anonymous values, and dereferencing

I mentioned earlier that you can build complex data structures from Perl’s three built-​in data types. Constructing them without a lot of intermediate variables requires you to use things like:

  • lists, denoted between ( parentheses )
  • anonymous arrays, denoted between [ square brackets ]
  • and anonymous hashes, denoted between { curly braces }.

Given these tools you could build, say, a scalar referencing an array of street addresses, each address being an anonymous hash:

$addresses = [
  { 'name'    => 'John Doe',
    'address' => '123 Any Street',
    'city'    => 'Anytown',
    'state'   => 'TX',
  },
  { 'name'    => 'Mary Smith',
    'address' => '100 Other Avenue',
    'city'    => 'Whateverville',
    'state'   => 'PA',
  },
];

(The => is just a way to show correspondence between a hash key and its value, and is just a funny way to write a comma (,). And like some other programming languages, it’s OK to have trailing commas in a list as we do for the 'state' entries above; it makes it easier to add more entries later.)

Although I’ve nicely spaced out my example above, you can imagine a less sociable developer might cram everything together without any spaces or newlines. Further, to extract a specific value from this structure this same person might write the following, making you count dollar signs one after another while reading right-​to-​left then left-to-right:

say $$addresses[1]{'name'};

We don’t have to do that, though; we can use arrows that look like -> to dereference our array and hash elements:

say $addresses->[1]->{'name'};

We can even use postfix dereferencing, which is just a fancy way of saying “always reading left to right,” to pull a slice out of this structure:

say for $addresses->[1]->@{'name', 'city'};

Which prints out:

Mary Smith
Whateverville

Like I said above, the sigil always tells you what you’re going to get. In this case, we got:

  • a sliced list of values with the keys 'name' and 'city' out of…
  • an anonymous hash that was itself the second element (counting from zero, so index of 1) referenced in…
  • an anonymous array which was itself referenced by…
  • the scalar named $addresses.

That’s a mouthful, but complicated data structures often are. That’s why Perl provides a Data Structures Cookbook as the perldsc documentation page, a references tutorial as the perlreftut page, and finally a detailed guide to references and nested data structures as the perlref page.

Special variables

Perl was also inspired by Unix command shell languages like the Bourne shell (sh) or Bourne-​again shell (bash), so it has many special variable names using punctuation. There’s @_ for the array of arguments passed to a subroutine, $$ for the process number the current program is using in the operating system, and so on. Some of these are so common in Perl programs they are written without commentary, but for the others there is always the English module, enabling you to substitute in friendly (or at least more awk-like) names.

With use English; at the top of your program, you can use those friendlier names in place of the punctuation.

All of these predefined variables, punctuation and English names alike, are documented on the perlvar documentation page.
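As a small sketch of a few of those aliases in action (the variables I assign to are my own; the aliases themselves come from the English module):

```perl
use strict;
use warnings;
use English qw(-no_match_vars);    # skip the regexp match aliases

# $PROCESS_ID is the same variable as $$.
my $same_pid = ( $PROCESS_ID == $$ ) ? 'yes' : 'no';

# $LIST_SEPARATOR is the string placed between array elements
# when an array is interpolated into a double-quoted string.
my @parts = ( 2021, 5, 20 );
my $joined;
{
    local $LIST_SEPARATOR = '-';
    $joined = "@parts";
}

# $EVAL_ERROR is $@, set when an eval block dies.
eval { die "something broke\n" };
my $error = $EVAL_ERROR;

print "$same_pid $joined $error";
```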

The choice to use punctuation variables or their English equivalents is up to the developer, and some have more familiarity with and assume their readers understand the punctuation variety. Other less-friendly developers engage in “code golf,” attempting to express their programs in as few keystrokes as possible.

To combat these and other unsociable tendencies, the perlstyle documentation page admonishes, “Perl is designed to give you several ways to do anything, so consider picking the most readable one.” Developers can (and should) also use the perlcritic tool and its included policies to encourage best practices, such as prohibiting all but a few common punctuation variables.

Conclusion: Do you still hate Perl?

There are only two kinds of languages: the ones people complain about and the ones nobody uses.

Bjarne Stroustrup, designer of the C++ programming language

It’s easy to hate what you don’t understand. I hope that reading this article has helped you decipher some of Perl’s “noisy” quirks as well as its features for increased readability. Let me know in the comments if you’re having trouble grasping any other aspects of the language or its ecosystem, and I’ll do my best to address them in future posts.


Last week saw the release of Perl 5.34.0 (you can get it here), and with it comes a year’s worth of new features, performance enhancements, bug fixes, and other improvements. It seems like a good time to highlight some of my favorite changes over the past decade and a half, especially for those with more dated knowledge of Perl. You can always click on the headers below for the full releases’ perldelta pages.

Perl 5.10 (2007)

This was a big release, coming as it did over five years after the previous major 5.8 release. Not that Perl developers were idle—but it wouldn’t be until version 5.14 that the language would adopt a steady yearly release cadence.

Due to the build-​up time, many core enhancements were made but the most important was arguably the feature pragma, enabling the addition of new syntax that would otherwise break Perl’s backward compatibility. 5.10 also introduced the defined-​or operator (//), state variables that persist their previous value, the say function for automatically appending a newline on output (so much saved typing), and a large collection of improvements to regular expressions. In addition, this release introduced smart matching (~~), though version 5.18 would eventually relegate it to experimental status.
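For anyone who hasn’t met them yet, here’s a minimal sketch of three of those additions. The next_id() function and the %opt hash are invented for the example.

```perl
use v5.10;
use strict;
use warnings;

# state: initialized once, keeps its value between calls.
sub next_id {
    state $id = 0;
    return ++$id;
}

my %opt = ( verbose => 0 );

# //: falls back only when the left side is undefined, so a
# defined-but-false 0 survives (|| would have replaced it).
my $verbose = $opt{verbose} // 1;
my $level   = $opt{level}   // 'info';

next_id() for 1 .. 2;
say next_id();    # 3
say $verbose;     # 0
say $level;       # info
```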

Perl 5.12 (2010)

This release also saw many new features added, but if I had to pick one marquee item it would be experimental support for pluggable keywords, which enabled authors to extend the language itself without modifying the core. Previously one would either use plain functions, hacky source filters, or the deprecated Devel::Declare module to simulate this functionality. CPAN authors would go on to create all kinds of new syntax, sometimes prototyping features that would eventually make their way into core.

Perl 5.14 (2011)

5.14 had a big list of enhancements, including Unicode 6.0 support and a gaggle of regular expression features. My favorite of these was the /r switch for non-​destructive substitutions.
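A quick sketch of why I like /r so much: the original string is left alone and the modified copy is returned, which also makes substitutions usable inside map. The sample strings are arbitrary.

```perl
use v5.14;
use strict;
use warnings;

my $original = 'Hello, World';

# With /r the substitution returns the changed copy...
my $changed = $original =~ s/World/Perl/r;

# ...which makes it handy inside map, where modifying $_ would
# otherwise alter the source list.
my @raw     = ( '  foo ', 'bar  ' );
my @trimmed = map { s/^\s+|\s+$//gr } @raw;

say $original;             # Hello, World
say $changed;              # Hello, Perl
say join '|', @trimmed;    # foo|bar
```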

But as the first yearly cadence release, the changes in policy took center stage. The Perl 5 Porters (p5p) explicitly committed to supporting the two most recent stable release series, providing security patches only for release series occurring in the past three years. They also defined an explicit compatibility and deprecation policy, with definitions for features that may be experimental, deprecated, discouraged, and removed.

Perl 5.16 (2012)

Another year, another version bump. This time the core enhancements were all over the map (although no enhancements to the map function 😀 ).

May I highlight another documentation change, though? The perlootut Object-Oriented Programming in Perl Tutorial replaced the old perltoot, perltooc, perlboot, and perlbot pages, providing an introduction to object-oriented design concepts before strongly recommending the use of one of the OO systems from CPAN. Mentioned are Moose, its alternative Mouse, Class::Accessor, Object::Tiny, and Role::Tiny’s usage with the latter two. Later versions of perlootut would recommend Moo rather than Mouse.

Perl 5.18 (2013)

As mentioned earlier, Perl 5.18 rendered smartmatch experimental, as well as lexical use of the $_ variable. With these came a new category of warnings for experimental features and a method for overriding such warnings feature-​by-​feature. Fitting in with the security and safety theme, hashes were overhauled to randomize key/​value order, increasing their resistance to algorithmic complexity attacks.

But it wasn’t all fencing in bad behavior. Lexical subroutines made their first (experimental) appearance, and although I confess I haven’t had much call for them in my work, others have come up with some interesting uses. Four years later they became non-​experimental.
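If you haven’t seen one, a lexical subroutine looks roughly like this. It’s a sketch assuming Perl 5.26 or later, and shout_all() and shout() are hypothetical names.

```perl
use v5.26;
use strict;
use warnings;

sub shout_all {
    my @words = @_;

    # Visible only inside shout_all(); nothing is added to the
    # package's symbol table.
    my sub shout {
        my ($word) = @_;
        return uc($word) . '!';
    }

    return map { shout($_) } @words;
}

say join ' ', shout_all(qw(hello world));    # HELLO! WORLD!
```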

Perl 5.20 (2014)

Three new syntax features arrived in 2014: experimental subroutine signatures (about which I’ve written more here), key/value hash slices and index/value array slices, and experimental postfix dereferencing. This last enables cleaner left-to-right syntax when dereferencing variables:

  • @{ $array_ref } becomes $array_ref->@*
  • %{ $hash_ref } becomes $hash_ref->%*
  • Etc.

Postfix dereferencing became non-​experimental in Perl 5.24, and vigorous discussion continues on subroutine signatures’ future.
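Here’s a sketch of the postfix forms next to their older circumfix equivalents, plus one of the key/value slices from the same release. It assumes Perl 5.24 or later so that no experimental feature flags are needed, and the data is made up.

```perl
use v5.24;
use warnings;

my $array_ref = [ 10, 20, 30 ];
my $hash_ref  = { red => 1, green => 2, blue => 3 };

my @all    = $array_ref->@*;                    # same as @{ $array_ref }
my %copy   = $hash_ref->%*;                     # same as %{ $hash_ref }
my @first2 = $array_ref->@[ 0, 1 ];             # same as @{ $array_ref }[ 0, 1 ]
my @rg     = $hash_ref->@{ 'red', 'green' };    # same as @{ $hash_ref }{ 'red', 'green' }

# The key/value slice keeps the keys as well as the values.
my %warm = $hash_ref->%{ 'red' };

say "@all";             # 10 20 30
say "@first2";          # 10 20
say "@rg";              # 1 2
say join ',', %warm;    # red,1
```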

Perl 5.22 (2015)

Speaking of subroutine signatures, their location moved to between the subroutine name (if any) and the attribute list (if any). Previously they appeared after attributes. The lesson? Remain conscious of experimental features in your code, and be prepared to make changes when upgrading.

In addition to the enhancements, security updates, performance fixes, and deprecations, developers removed the historically notable CGI module. First added to core in 1997 in recognition of its critical role in enabling web development, it’s been supplanted by better alternatives on CPAN.

Perl 5.24 (2016)

Perl 5.20’s postfix dereferencing was no longer experimental, and developers removed both lexical $_ and autodereferencing on calls to push, pop, shift, unshift, splice, keys, values, and each.

Perl 5.26 (2017)

The incorporation of experimental features continued, with lexical subroutines moving into full support. I like the added readability enhancements, though: indented here-​documents; the /xx regular expression modifier for tabs and spaces in character classes; and @{^CAPTURE}, %{^CAPTURE}, and %{^CAPTURE_ALL} for regexp matches with a little more self-documentation.
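A sketch of two of those readability features, assuming Perl 5.26 or later. The usage() function and its “frobnicate” program are hypothetical.

```perl
use v5.26;
use warnings;

# <<~ strips the terminator's indentation from every line, so the
# text can be indented along with the surrounding code.
sub usage {
    return <<~'END';
        Usage: frobnicate [options]
          -h  show this help
        END
}

# @{^CAPTURE} holds all capture groups of the last successful match.
my @parts;
if ( '2021-05-20' =~ /(\d+)-(\d+)-(\d+)/ ) {
    @parts = @{^CAPTURE};
}

print usage();
say "@parts";    # 2021 05 20
```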

Perl 5.28 (2018)

Experimental subroutine signature and attribute ordering flipped back to its Perl 5.20 sequence of attributes-​then-​signature. Bit of a rollercoaster ride on this one. You could do worse than using something like Type::Params until this settles and get a wide variety of type constraints in the bargain.

Perl 5.30 (2019)

Pour one out for AWK and Fortran programmers migrating to Perl: the $[ variable for setting the lower bound of arrays could no longer be set to anything other than zero. This had a long deprecation cycle starting in Perl 5.12.

Perl 5.32 (2020)

In 2020 Perl’s development moved to GitHub. And once again, I’m going to highlight readability enhancements: the experimental isa operator could be used to say:

if ( $obj isa Some::Class ) { ... }

instead of

use Scalar::Util 'blessed';
if ( blessed($obj) and $obj->isa('Some::Class') ) { ... }

You could also chain comparison operators, leading to the more mathematically concise if ( $x < $y <= $z ) {...} rather than if ( $x < $y and $y <= $z ) {...}.
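Here’s a runnable sketch of both additions, assuming Perl 5.32 or later. The Animal and Dog classes and the in_range() function are invented for the example, and isa still has to be switched on explicitly in this release.

```perl
use v5.32;
use warnings;
use feature 'isa';
no warnings 'experimental::isa';

# Hypothetical classes for illustration.
package Animal { sub new { return bless {}, shift } }
package Dog    { our @ISA = ('Animal') }

my $dog = Dog->new;

# True for an object whose class inherits from Animal...
my $object_check = ( $dog isa Animal ) ? 'yes' : 'no';

# ...and quietly false for a plain string, with no blessed() guard.
my $string_check = ( 'Dog' isa Animal ) ? 'yes' : 'no';

# Chained comparison: the middle operand is evaluated only once.
sub in_range {
    my ( $low, $value, $high ) = @_;
    return $low < $value <= $high ? 1 : 0;
}

say $object_check;          # yes
say $string_check;          # no
say in_range( 1, 5, 5 );    # 1
say in_range( 1, 1, 5 );    # 0
```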

Perl 5.34 (2021)

Finally, we come to last week’s release and its introduction of experimental try/catch exception handling syntax. If you need to support earlier versions of Perl back to 5.14, you can use Feature::Compat::Try. Earlier this year I interviewed the feature and module’s author, Paul “LeoNerd” Evans, for Perl.com. This year also marked the debut of Perl’s new governance model with the appointment of a Core Team and a three-member Steering Council.
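For the curious, here’s a minimal sketch of the new syntax, assuming Perl 5.34 or later with the experimental warning silenced. The safe_divide() function is a made-up example.

```perl
use v5.34;
use warnings;
use feature 'try';
no warnings 'experimental::try';

sub safe_divide {
    my ( $numerator, $denominator ) = @_;

    try {
        die "division by zero\n" if $denominator == 0;
        return $numerator / $denominator;    # returns from safe_divide()
    }
    catch ($error) {
        chomp $error;
        return "error: $error";
    }
}

say safe_divide( 10, 2 );    # 5
say safe_divide( 10, 0 );    # error: division by zero
```

Note that return inside the try block leaves the enclosing subroutine, which is one of the things that distinguishes it from the older eval-block idiom.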

What are some of your favorite Perl improvements over the years? Check out the perlhist document for a detailed chronology and refresher with the various perldelta pages and leave me a comment below.


Perl is said (sometimes frustratingly) to be a do-what-I-mean programming language. Many of its statements and constructions are designed to be forgiving or have analogies to natural languages. Still others are said to be “magic,” behaving differently depending on how they’re used. Adept use of Perl asks you to not only understand this magic, but to embrace it and the expressiveness it enables. Here, then, are five ways you can bring some magic to your code.

$_

Perl has many special variables, and first among them (literally, it’s the first documented) is $_. Also spelled $ARG if you use the English module, the documentation describes it as “the default input and pattern-matching space.” Many, many functions and statements will assume it as the default or implicit argument; you can find the full list in the documentation. Here’s an example that uses it implicitly to output the numbers from 1 to 5:

say for 1 .. 5;

Output:

1
2
3
4
5

Where some languages require an iterator variable in a for or foreach loop, Perl assigns each item to $_ when you leave one out.

Statement modifiers

We then use our second trick; where some other languages require a block to enclose every loop or conditional (whether denoted by braces { } or indentation), Perl allows you to put said looping or conditional statement after a single other statement, in this case the say which prints its argument(s) followed by a newline.

However, above we have no arguments passed to say and so once again the default $_ is used, now containing a number from 1 to 5 which is then printed out. It’s a very powerful and expressive idiom, enabling both the writer and reader of code to concentrate on the important thing that’s happening. It’s also entirely optional. You can just as easily type:

for my $foo (1..5) {
    say $foo;
}

But where’s the magic in that?
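Well, here’s a little more of it: a sketch showing the implicit $_ in map, grep, and a statement modifier. The variable names are arbitrary.

```perl
use v5.10;
use strict;
use warnings;

# map and grep set $_ to each element in turn.
my @squares = map  { $_ * $_ } 1 .. 5;
my @odd     = grep { $_ % 2 } @squares;

# A looping statement modifier; uc with no argument works on $_.
my @shouted;
push @shouted, uc for qw(a b c);

# A conditional statement modifier.
say "odd squares: @odd" if @odd;    # odd squares: 1 9 25
say "@shouted";                     # A B C
```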

Magic variables and use English

We mentioned the $_ variable above, and that it could also be spelled $ARG if you add use English to your code. It can be hard to read code with large amounts of punctuation, though, and even harder to remember what each variable does. Thankfully the English module provides aliases, and the perlvar man page lists them in order. It’s much easier to read and write things like $LIST_SEPARATOR, $PROCESS_ID, or $MATCH rather than $", $$, and $&, and doing so goes a long way towards reducing Perl’s reputation as a write-only language.

List and scalar contexts

Like natural languages, Perl has a concept of “context” in which words mean different things depending on their surroundings. In Perl’s case, expressions may behave differently depending on whether they expect to produce a list of values or a single value, called a scalar. Here’s a trivial example:

my @foo = (1, 2, 3); # list context, @foo contains the list
my $bar = (1, 2, 3); # scalar context, $bar contains 3

In the first line, we assign the list of numbers (1, 2, 3) to the array @foo. But in the second line, we’re assigning to the scalar variable $bar, which now contains the last item in the list.

Here’s another example, using the reverse function:

my @foo = ('one', 'two', 'three');
my @bar = reverse @foo; # @bar contains ('three', 'two', 'one')
my $baz = reverse @foo; # $baz contains 'eerhtowteno'

In list context, reverse takes its arguments and returns them in the opposite order. But in scalar context, it concatenates all of the arguments together and returns a string with the characters in opposite order.

“In general, there is no general rule for deducing a function’s behavior in scalar context from its behavior in list context.” (Dominus 1998) You’ll just have to look up the function to determine what it does, though it usually does what you want. If you want to force scalar context, use the scalar operator:

my @foo = ('aa', 'aab', 'bbc');
my @bar = scalar grep /aa/, @foo; # returns a list (2), counting the number of matches
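Two more everyday cases, sketched below: an array evaluated in scalar context gives its element count, and the built-in wantarray function lets your own subroutines find out which context they were called in. The @items array and context() function are invented for the example.

```perl
use v5.10;
use strict;
use warnings;

my @items = qw(apple banana cherry);

my $count   = @items;    # scalar context: number of elements
my ($first) = @items;    # list context (note the parentheses): first element

# wantarray reports the context a subroutine was called in.
sub context {
    return wantarray ? 'list' : 'scalar';
}

my @in_list   = context();
my $in_scalar = context();

say $count;         # 3
say $first;         # apple
say $in_list[0];    # list
say $in_scalar;     # scalar
```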

Hash slices

One of Perl’s three built-in data types is the hash, also known as an associative array. It’s an unordered collection of scalars indexed by string, rather than the numbers used by normal arrays. It’s a useful construct, and you can develop complicated data structures using just scalars, arrays, and hashes. What’s not widely known is that you can access several elements of a hash using a hash slice, using syntax that’s similar to array slices. Here’s an example:

my ($who, $home) = @ENV{'USER', 'HOME'};

It works the other way, too: you can assign to a slice.

my %colors;
@colors{'red', 'green', 'blue'} = (0xff0000, 0x00ff00, 0x0000ff);

I use this a lot when assigning arguments received from functions or methods (see my previous article on subroutine signatures):

use v5.24; # for postfix dereferencing
use Types::Standard qw(Str Int);
use Type::Params 'compile_named';

foo(param1 => 'hello', param2 => 42); # compile_named() expects named arguments

sub foo {
    state $check = compile_named(
        param1 => Str,
        param2 => Int, {optional => 1},
    );
    my ($param1, $param2) =
        $check->(@_)->@{'param1', 'param2'};

    say $param1, $param2;
}

In the example above, $check->(@_) returns the type-checked arguments to the foo() function courtesy of Type::Params’ compile_named() function. It’s returned as a hash reference, and since hashes are unordered, we specify the order in which we want the values by dereferencing and then slicing the resulting hash. The postfix dereferencing syntax was added in Perl 5.20 and made a default feature in 5.24, and reduces the number of nested brackets and braces we have to deal with.
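If you don’t have Type::Params installed, the same slicing trick works with nothing but core Perl, minus the type checking. This is only a sketch: describe() and describe_ref() are hypothetical functions, and I’m assuming Perl 5.24 or later for the postfix slice.

```perl
use v5.24;
use warnings;

sub describe {
    my %arg = @_;

    # Hash slice: pull the values out in the order we name the keys.
    my ( $name, $size ) = @arg{ 'name', 'size' };
    $size //= 'unknown';

    return "$name ($size)";
}

sub describe_ref {
    my ($arg) = @_;

    # The same slice through a hash reference, postfix style.
    my ( $name, $size ) = $arg->@{ 'name', 'size' };
    return "$name ($size)";
}

say describe( size => 'large', name => 'box' );              # box (large)
say describe( name => 'bag' );                               # bag (unknown)
say describe_ref( { name => 'crate', size => 'small' } );    # crate (small)
```

Because the slice names the keys explicitly, callers can pass the arguments in any order.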

Conclusion

I hope this article has given you a taste of some of the magic available in the Perl language. It’s these sorts of features that make programming in it a bit more joyful. As always, check the documentation for complete information on these and other topics, or look for answers and ask questions on PerlMonks or Stack Overflow.