After a lot of pro­cras­ti­na­tion, I’ve decid­ed my talk for this week’s ePhEmeRaL mini­conf will be Cunningham’s Law: A Year of Being Wrong on the Internet, or «prêch­er le faux pour savoir le vrai.»

The event starts at 8:00 AM CST on Thursday, November 18; you can find out more about it includ­ing the full sched­ule and a time zone con­vert­er here.

animal antler big close up

At my work, we exten­sive­ly use the Moose object sys­tem to take care of what would ordi­nar­i­ly be very tedious boil­er­plate object-​oriented Perl code. In one part of the code­base, we have a fam­i­ly of class­es that, among oth­er things, map Perl meth­ods to the names of var­i­ous calls in a third-​party API with­in our larg­er orga­ni­za­tion. Those pri­vate Perl meth­ods are in turn called from pub­lic meth­ods pro­vid­ed by roles con­sumed by these class­es so that oth­er areas aren’t con­cerned with said API’s details.

Without going into too many specifics, I had a bunch of class­es all with sec­tions that looked like this:

sub _create_method    { return 'api_add'     }
sub _retrieve_method  { return 'api_info'    }
sub _search_method    { return 'api_list'    }
sub _update_method    { return 'api_update'  }
sub _cancel_method    { return 'api_remove'  }
sub _suspend_method   { return 'api_disable' }
sub _unsuspend_method { return 'api_restore' }

... # etc.

The val­ues returned by these very sim­ple meth­ods might dif­fer from class to class depend­ing on the API call need­ed, and dif­fer­ent class­es might have a dif­fer­ent mix of these meth­ods depend­ing on what roles they consume.

These meth­ods had built up over time as devel­op­ers had expand­ed the class­es’ func­tion­al­i­ty, and this week it was my turn. I decid­ed to apply the DRY (don’t repeat your­self) prin­ci­ple and cre­ate them from a sim­ple hash table like so:

my %METHOD_MAP = (
  _create_method    => 'api_add',
  _retrieve_method  => 'api_info',
  _search_method    => 'api_list',
  _update_method    => 'api_update',
  _cancel_method    => 'api_remove',
  _suspend_method   => 'api_disable',
  _unsuspend_method => 'api_restore',
);

At first, I thought to myself, These look like pri­vate read-​only attrib­ut­es!” So I wrote:

use Moose;

...

has $_ => (
  is       => 'ro',
  init_arg => undef,
  default  => $METHOD_MAP{$_},
) for keys %METHOD_MAP;

Of course, I’d have to move the class­es’ with state­ments after these def­i­n­i­tions so the roles they con­sume could see” these runtime-​defined attrib­ut­es. But some of the meth­ods used to read these are class meth­ods (e.g., called as ClassName->foo() rather than $object->foo()), and Moose attrib­ut­es are only set after the con­struc­tion of a class instance.

Then I thought, Hey, Moose has a MOP (meta-​object pro­to­col)! I’ll use that to gen­er­ate these meth­ods at runtime!”

my $meta = __PACKAGE__->meta;

while (my ($method, $api_call) = each %METHOD_MAP) {
    $meta->add_method( $method => sub {$api_call} );
}

The add_method doc­u­men­ta­tion strong­ly encourage[s]” you to pass a metamethod object rather than a code ref­er­ence, though, so that would look like:

use Moose::Meta::Method;

my $meta = __PACKAGE__->meta;

while (my ($method, $api_call) = each %METHOD_MAP) {
    $meta->add_method( $method = Moose::Meta::Method->wrap(
      sub {$api_call}, __PACKAGE__, $meta,
    );
}

This was get­ting ugly. There had to be a bet­ter way, and for­tu­nate­ly there was in the form of Dave Rolskys MooseX::ClassAttribute mod­ule. It sim­pli­fies the above to:

use MooseX::ClassAttribute;

class_has $_ => (
  is      => 'ro',
  default => $METHOD_MAP{$_},
) for keys %METHOD_MAP;

Note there’s no need for init_arg => undef to pre­vent set­ting the attribute in the con­struc­tor. Although they’re still Moose attrib­ut­es, they act like class meth­ods so long as the class con­sumes the roles that require them after the attribute definitions.

Lastly, if we were using Moo as a light­weight alter­na­tive to Moose, I could have instead select­ed Toby Inksters MooX::ClassAttribute. Although it has some caveats, it’s pret­ty much the only alter­na­tive to our ini­tial class method def­i­n­i­tions as Moo lacks a meta-​object pro­to­col.

The les­son as always is to check CPAN (or the appro­pri­ate mix of your language’s soft­ware repos­i­to­ry, forums like Stack Overflow, etc.) for any­thing that could con­ceiv­ably have appli­ca­tion out­side of your par­tic­u­lar cir­cum­stances. Twenty-​five years into my career and I’m still leap­ing into code with­out first con­sid­er­ing that some­one smarter than me has already done the work.

Template proces­sors and engines are one of those pieces of soft­ware where it seems every devel­op­er wants to rein­vent the wheel. Goodness knows I’ve done it ear­li­er in my career. Tell me if this sounds familiar:

  1. You need to mix data into a doc­u­ment so you start with Perl’s string inter­po­la­tion in "dou­ble quotes" or sprintf for­mats. (Or maybe you inves­ti­gate formats, but the less said about them the bet­ter.)
  2. You real­ize your doc­u­ments need to dis­play things based on cer­tain con­di­tions, or you want to loop over a list or some oth­er structure.
  3. You add these fea­tures via key­word pars­ing and escape char­ac­ters, think­ing it’s OK since this is just a small bespoke project.
  4. Before you know it you’ve invent­ed anoth­er domain-​specific lan­guage (DSL) and have to sup­port it on top of the appli­ca­tion you were try­ing to deliv­er in the first place.

Stop. Just stop. Decades of oth­ers who have walked this same path have already done this for you. Especially if you’re using a web frame­work like Dancer, Mojolicious, or Catalyst, where the tem­plate proces­sor is either built-​in or plug­gable from CPAN. Even if you’re not devel­op­ing a web appli­ca­tion there are sev­er­al general-​purpose options of var­i­ous capa­bil­i­ties like Template Toolkit and Template::Mustache. Investigate the alter­na­tives and deter­mine if they have the fea­tures, per­for­mance, and sup­port you need. If you’re sure none of them tru­ly meet your unique require­ments, then maybe, maybe con­sid­er rolling your own.

Whatever you decide, real­ize that as your appli­ca­tion or web­site grows your invest­ment in that selec­tion will only deep­en. Porting to a new tem­plate proces­sor can be as chal­leng­ing as port­ing any source code to a new pro­gram­ming language.

Unfortunately, there are about as many opin­ions on how to choose a tem­plate proces­sor as there are tem­plate proces­sors. For exam­ple, in 2013 Roland Koehler wrote a good Python-​oriented arti­cle on sev­er­al con­sid­er­a­tions and the dif­fer­ent approach­es avail­able. Although he end­ed up devel­op­ing his own (quelle sur­prise), he makes a good case that a tem­plate proces­sor ought to at least pro­vide var­i­ous log­ic con­structs as well as embed­ded expres­sions, if not a full pro­gram­ming lan­guage. Koehler specif­i­cal­ly warns against the lat­ter, though, as a tem­plate devel­op­er might change an application’s data mod­el, to say noth­ing of the pos­si­bil­i­ty of exe­cut­ing arbi­trary destruc­tive code.

I can appre­ci­ate this rea­son­ing. I’ve suc­cess­ful­ly used Perl tem­plate proces­sors like the afore­men­tioned Template::Toolkit (which has both log­ic direc­tives and an option­al facil­i­ty for eval­u­at­ing Perl code) and Text::Xslate (which sup­ports sev­er­al tem­plate syn­tax­es includ­ing a sub­set of Template::Toolkits, but with­out the abil­i­ty to embed Perl code). We use the lat­ter at work com­bined with Text::Xslate::Bridge::TT2Likes emu­la­tion of var­i­ous Template::Toolkit vir­tu­al meth­ods and it’s served us well.

But using those mod­ules’ DSLs means more sophis­ti­cat­ed tasks need extra time and effort find­ing the cor­rect log­ic and expres­sions. This also assumes that their designer(s) have antic­i­pat­ed my needs either through built-​in fea­tures or exten­sions. I’m already writ­ing Perl; why should I switch to anoth­er, more lim­it­ed lan­guage and envi­ron­ment pro­vid­ed I can remain dis­ci­plined enough to avoid issues like those described above by Koehler?

So for my per­son­al projects, I favor tem­plate proces­sors that use the full pow­er of the Perl lan­guage like Mojolicious’ embed­ded Perl ren­der­er or the ven­er­a­ble Text::Template for non-​web appli­ca­tions. It saves me time and I’ll like­ly want more than any DSL can pro­vide. This may not apply to your sit­u­a­tion, though, and I’m open to counter-arguments.

What’s your favorite tem­plate proces­sor and why? Let me know in the comments.

woman looking at the map

Six months ago I gave an overview of Perl’s list pro­cess­ing fun­da­men­tals, briefly describ­ing what lists are and then intro­duc­ing the built-​in map and grep func­tions for trans­form­ing and fil­ter­ing them. Later on, I com­piled a list (how appro­pri­ate) of list pro­cess­ing mod­ules avail­able via CPAN, not­ing there’s some con­fus­ing dupli­ca­tion of effort. But you’re a busy devel­op­er, and you just want to know the Right Thing To Do™ when faced with a list pro­cess­ing challenge.

First, some cred­it is due: these are all restate­ments of sev­er­al Perl::Critic poli­cies which in turn cod­i­fy stan­dards described in Damian Conway’s Perl Best Practices (2005). I’ve repeat­ed­ly rec­om­mend­ed the lat­ter as a start­ing point for higher-​quality Perl devel­op­ment. Over the years these prac­tices con­tin­ue to be re-​evaluated (includ­ing by the author him­self) and var­i­ous authors release new pol­i­cy mod­ules, but perlcritic remains a great tool for ensur­ing you (and your team or oth­er con­trib­u­tors) main­tain a con­sis­tent high stan­dard in your code.

With that said, on to the recommendations!

Don’t use grep to check if any list elements match

It might sound weird to lead off by rec­om­mend­ing not to use grep, but some­times it’s not the right tool for the job. If you’ve got a list and want to deter­mine if a con­di­tion match­es any item in it, you might try:

if (grep { some_condition($_) } @my_list) {
    ... # don't do this!
}

Yes, this works because (in scalar con­text) grep returns the num­ber of match­es found, but it’s waste­ful, check­ing every ele­ment of @my_list (which could be lengthy) before final­ly pro­vid­ing a result. Use the stan­dard List::Util module’s any func­tion, which imme­di­ate­ly returns (“short-​circuits”) on the first match:

use List::Util 1.33 qw(any);

if (any { some_condition($_) } @my_list) {
... # do something
}

Perl has includ­ed the req­ui­site ver­sion of this mod­ule since ver­sion 5.20 in 2014; for ear­li­er releas­es, you’ll need to update from CPAN. List::Util has many oth­er great list-​reduction, key/​value pair, and oth­er relat­ed func­tions you can import into your code, so check it out before you attempt to re-​invent any wheels.

As a side note for web devel­op­ers, the Perl Dancer frame­work also includes an any key­word for declar­ing mul­ti­ple HTTP routes, so if you’re mix­ing List::Util in there don’t import it. Instead, call it explic­it­ly like this or you’ll get an error about a rede­fined function:

use List::Util 1.33;

if (List::Util::any { some_condition($_) } @my_list) {
... # do something
}

This rec­om­men­da­tion is cod­i­fied in the BuiltinFunctions::ProhibitBooleanGrep Perl::Critic pol­i­cy, comes direct­ly from Perl Best Practices, and is rec­om­mend­ed by the Software Engineering Institute Computer Emergency Response Team (SEI CERT)’s Perl Coding Standard.

Don’t change $_ in map or grep

I men­tioned this back in March, but it bears repeat­ing: map and grep are intend­ed as pure func­tions, not muta­tors with side effects. This means that the orig­i­nal list should remain unchanged. Yes, each ele­ment alias­es in turn to the $_ spe­cial vari­able, but that’s for speed and can have sur­pris­ing results if changed even if it’s tech­ni­cal­ly allowed. If you need to mod­i­fy an array in-​place use some­thing like:

for (@my_array) {
$_ = ...; # make your changes here
}

If you want some­thing that looks like map but won’t change the orig­i­nal list (and don’t mind a few CPAN depen­den­cies), con­sid­er List::SomeUtilsapply function:

use List::SomeUtils qw(apply);

my @doubled_array = apply {$_ *= 2} @old_array;

Lastly, side effects also include things like manip­u­lat­ing oth­er vari­ables or doing input and out­put. Don’t use map or grep in a void con­text (i.e., with­out a result­ing array or list); do some­thing with the results or use a for or foreach loop:

map { print foo($_) } @my_array; # don't do this
print map { foo($_) } @my_array; # do this instead

map { push @new_array, foo($_) } @my_array; # don't do this
@new_array = map { foo($_) } @my_array; # do this instead

This rec­om­men­da­tion is cod­i­fied by the BuiltinFunctions::ProhibitVoidGrep, BuiltinFunctions::ProhibitVoidMap, and ControlStructures::ProhibitMutatingListFunctions Perl::Critic poli­cies. The lat­ter comes from Perl Best Practices and is an SEI CERT Perl Coding Standard rule.

Use blocks with map and grep, not expressions

You can call map or grep like this (paren­the­ses are option­al around built-​in functions):

my @new_array  = map foo($_), @old_array; # don't do this
my @new_array2 = grep !/^#/, @old_array; # don't do this

Or like this:

my @new_array  = map { foo($_) } @old_array;
my @new_array2 = grep {!/^#/} @old_array;

Do it the sec­ond way. It’s eas­i­er to read, espe­cial­ly if you’re pass­ing in a lit­er­al list or mul­ti­ple arrays, and the expres­sion forms can con­ceal bugs. This rec­om­men­da­tion is cod­i­fied by the BuiltinFunctions::RequireBlockGrep and BuiltinFunctions::RequireBlockMap Perl::Critic poli­cies and comes from Perl Best Practices.

Refactor multi-​statement maps, greps, and other list functions

map, grep, and friends should fol­low the Unix phi­los­o­phy of Do One Thing and Do It Well.” Your read­abil­i­ty and main­tain­abil­i­ty drop with every state­ment you place inside one of their blocks. Consider junior devel­op­ers and future main­tain­ers (this includes you!) and refac­tor any­thing with more than one state­ment into a sep­a­rate sub­rou­tine or at least a for loop. This goes for list pro­cess­ing func­tions (like the afore­men­tioned any) import­ed from oth­er mod­ules, too.

This rec­om­men­da­tion is cod­i­fied by the Perl Best Practices-inspired BuiltinFunctions::ProhibitComplexMappings and BuiltinFunctions::RequireSimpleSortBlock Perl::Critic poli­cies, although those only cov­er map and sort func­tions, respectively.


Do you have any oth­er sug­ges­tions for list pro­cess­ing best prac­tices? Feel free to leave them in the com­ments or bet­ter yet, con­sid­er cre­at­ing new Perl::Critic poli­cies for them or con­tact­ing the Perl::Critic team to devel­op them for your organization.

Look, I get it. You don’t like the Perl pro­gram­ming lan­guage or have oth­er­wise dis­re­gard­ed it as dead.” (Or per­haps you haven’t, in which case please check out my oth­er blog posts!) It has weird noisy syn­tax, mix­ing reg­u­lar expres­sions, sig­ils on vari­able names, var­i­ous braces and brack­ets for data struc­tures, and a menagerie of cryp­tic spe­cial vari­ables. It’s old: 34 years in December, with a his­to­ry of (some­times ama­teur) devel­op­ers that have used and abused that syn­tax to ship code of ques­tion­able qual­i­ty. Maybe you grudg­ing­ly accept its util­i­ty but think it should die grace­ful­ly, main­tained only to run lega­cy applications.

But you know what? Perl’s still going. It’s had a steady cadence of year­ly releas­es for the past decade, intro­duc­ing new fea­tures and fenc­ing in bad behav­ior while main­tain­ing an admirable lev­el of back­ward com­pat­i­bil­i­ty. Yes, there was a too-​long adven­ture devel­op­ing what start­ed as Perl 6, but that lan­guage now has its own iden­ti­ty as Raku and even has facil­i­ties for mix­ing Perl with its native code or vice versa.

And then there’s CPAN, the Comprehensive Perl Archive Network: a continually-​updated col­lec­tion of over 200,000 open-​source mod­ules writ­ten by over 14,000 authors, the best of which are well-​tested and ‑doc­u­ment­ed (apply­ing peer pres­sure to those that fall short), pre­sent­ed through a search engine and front-​end built by scores of con­trib­u­tors. Through CPAN you can find dis­tri­b­u­tions for things like:

All of this is avail­able through a mature instal­la­tion tool­chain that doesn’t break from month to month.

Finally and most impor­tant­ly, there’s the glob­al Perl com­mu­ni­ty. The COVID-​19 pan­dem­ic has put a damper on the hun­dreds of glob­al Perl Mongers groups’ mee­tups, but that hasn’t stopped the year­ly Perl and Raku Conference from meet­ing vir­tu­al­ly. (In the past there have also been year­ly European and Asian con­fer­ences, occa­sion­al for­ays into South America and Russia, as well as hackathons and work­shops world­wide.) There are IRC servers and chan­nels for chat, mail­ing lists galore, blogs (yes, apart from this one), and a quirky social net­work that pre­dates Facebook and Twitter.

So no, Perl isn’t dead or even dying, but if you don’t like it and favor some­thing new­er, that’s OK! Technologies can coex­ist on their own mer­its and advo­cates of one don’t have to beat down their con­tem­po­raries to be suc­cess­ful. Perl hap­pens to be battle-​tested (to bor­row a term from my friend Curtis Ovid” Poe), it runs large parts of the Web (speak­ing from direct and ongo­ing expe­ri­ence in the host­ing busi­ness here), and it’s still evolv­ing to meet the needs of its users.

Whether learn­ing a pro­gram­ming lan­guage, work­ing through a prob­lem, or try­ing to under­stand a new library, it may be tempt­ing to flail around craft­ing just the right search engine query or cry for help on a forum like Stack Overflow. But look at any guide to ask­ing good ques­tions and you’ll find this com­mand­ment at the top: do your research. And one of the pri­ma­ry sources of that research is the offi­cial doc­u­men­ta­tion for the lan­guage or library in question.

The perldoc command

Fortunately, Perl and its mod­ules come installed with exten­sive doc­u­men­ta­tion, also avail­able on the Web. Locally, you can use the perldoc command-​line pro­gram, or if you have the doc­u­men­ta­tion installed as Linux/​Unix man pages (short for man­u­al pages) you can use the stan­dard man com­mand. Because that isn’t always the case and because it has more Perl-​specific options, I rec­om­mend perldoc.

Note that if you’re using the Perl pro­vid­ed by your oper­at­ing sys­tem dis­tri­b­u­tion you may not have any doc­u­men­ta­tion installed at all with­out a sep­a­rate pack­age. For exam­ple, Ubuntu Linux calls this pack­age perl-doc; you can install it with this command:

sudo apt-get install perl-doc

(Splitting Perl across sev­er­al pack­ages is a com­mon oper­at­ing sys­tem dis­tri­b­u­tion tac­tic; it’s one of the rea­sons I rec­om­mend installing your own Perl.)

Once you’re sure you have the com­mand avail­able, run the fol­low­ing com­mand to get an intro­duc­tion to the Perl pro­gram­ming language:

perldoc perlintro

All of the built-​in doc­u­men­ta­tion begins with perl” as above, so you can say perldoc perlrun to find out how to exe­cute Perl, perldoc perlfaq to get a direc­to­ry of fre­quent­ly asked ques­tions, or perldoc perltoc for a table of con­tents list­ing every­thing available.

As I men­tioned above, the perldoc com­mand has sev­er­al use­ful options. You can run perldoc perldoc for a full descrip­tion, but here are some highlights:

  • perldoc -h: a brief help message
  • perldoc -f function: describe a spe­cif­ic func­tion from perlfunc
  • perldoc -q regular-expression: search the ques­tions in the var­i­ous perlfaq documents
  • perldoc -v variable-name: describe a spe­cif­ic pre­de­fined vari­able from perlvar

The last argu­ment to perldoc can be a doc­u­men­ta­tion page as described ear­li­er, an installed mod­ule, a Perl pro­gram name in your path, or the URL to a Perl mod­ule or Plain Old Documentation (POD) file.

The Perldoc Browser website

You may have noticed a num­ber of links above to the perldoc.perl.org web­site. This is a great friend­ly resource if you pre­fer read­ing in a web brows­er, and you can use fea­tures like bookmarks/​favorites and send­ing hyper­links to oth­ers. It has a search engine and you can even switch between dif­fer­ent Perl ver­sions (includ­ing development).

MetaCPAN

Perl is more than the lan­guage and its stan­dard mod­ules. The Comprehensive Perl Archive Network (CPAN) has over 200,000 open-​source Perl mod­ules con­tributed by authors all over the world, solv­ing com­mon and not-​so-​common soft­ware devel­op­ment prob­lems from asyn­chro­nous pro­gram­ming to XML devel­op­ment.

In addi­tion to using the perldoc com­mand described ear­li­er to under­stand locally-​installed mod­ules, you can use the MetaCPAN web sites search engine (in both reg­u­lar and reg­u­lar expres­sion fla­vors) to dis­cov­er and learn how to save time piec­ing togeth­er pro­grams instead of rein­vent­ing the wheel. I always check MetaCPAN first when I need to write some Perl.

Writing your own

Perl has its own markup lan­guage called Plain Old Documentation (POD), and you can either inter­leave it with your own pro­gram source code or save it in sep­a­rate files. The perlpod doc­u­men­ta­tion page has all the details and the perlpodstyle page will guide you in writ­ing in a style con­sis­tent with oth­er doc­u­men­ta­tion. And if you need to pub­lish else­where, there are trans­la­tors for HTML, LaTeX, PDF, and oth­er for­mats on CPAN. You can also pro­vide your doc­u­men­ta­tion to oth­ers via a ded­i­cat­ed web appli­ca­tion; this is an excel­lent tool for teams with internal-​only (“DarkPAN”) code.

Documentation is like a present you give to both your lat­er self and any­body else that needs to use your work. I high­ly rec­om­mend you spend time writ­ing it in addi­tion to the usu­al developer-​level source code com­ments. This is espe­cial­ly impor­tant if you plan to release open source on CPAN; oth­er­wise, no one will under­stand what it does or why they should care. I’m incred­i­bly grate­ful to every devel­op­er who has put in the effort.

clear light bulb planter on gray rock

Twitter recent­ly rec­om­mend­ed a tweet to me (all hail the algo­rithm) tout­ing what the author viewed as the top 5 web devel­op­ment stacks.”

JavaScript/​Node.js options dom­i­nat­ed the four-​letter acronyms as expect­ed, but the fifth one sur­prised me: LAMP, the com­bi­na­tion of the Linux oper­at­ing sys­tem, Apache web serv­er, MySQL rela­tion­al data­base, and Perl, PHP, or Python pro­gram­ming lan­guages. A quick web search for sim­i­lar lists yield­ed sim­i­lar results. Clearly, this meme (in the Dawkins sense) has out­last­ed its pop­u­lar­iza­tion by tech pub­lish­er O’Reilly in the 2000s.

Originally coined in 1998 dur­ing the dot-​com” bub­ble, I had thought that the term LAMP” had fad­ed with devel­op­ers in the inter­ven­ing decades with the rise of language-​specific web frame­works for:

Certainly on the Perl side (with which I’m most famil­iar), the com­mu­ni­ty has long since rec­om­mend­ed the use of a frame­work built on the PSGI spec­i­fi­ca­tion, dep­re­cat­ing 1990s-​era CGI scripts and the mod_​perl Apache exten­sion. Although general-​purpose web servers like Apache or Nginx may be part of an over­all sys­tem, they’re typ­i­cal­ly used as prox­ies or load bal­ancers for Perl-​specific servers either pro­vid­ed by the frame­work or a third-​party mod­ule.

Granted, PHP still relies on web server-​specific mod­ules, APIs, or vari­a­tions of the FastCGI pro­to­col for inter­fac­ing with a web serv­er. And Python web appli­ca­tions typ­i­cal­ly make use of its WSGI pro­to­col either as a web serv­er exten­sion or, like the Perl exam­ples above, as a prox­ied stand­alone serv­er. But all of these are deploy­ment details and do lit­tle to describe how devel­op­ers imple­ment and extend a web application’s structure.

Note how the var­i­ous four-​letter JavaScript stacks (e.g., MERN, MEVN, MEAN, PERN) dif­fer­en­ti­ate them­selves most­ly by fron­tend frame­work (e.g., Angular, React, Vue.js) and maybe by the (rela­tion­al or NoSQL) data­base (e.g., MongoDB, MySQL, PostgreSQL). All how­ev­er seem stan­dard­ized on the Node.js run­time and Express back­end web frame­work, which could, in the­o­ry, be replaced with non-​JavaScript options like the more mature LAMP-​associated lan­guages and frame­works. (Or if you pre­fer lan­guages that don’t start with P”, there’s C#, Go, Java, Ruby, etc.)

My point is that LAMP” as the name of a web devel­op­ment stack has out­lived its use­ful­ness. It’s at once too spe­cif­ic (about oper­at­ing sys­tem and web serv­er details that are often abstract­ed away for devel­op­ers) and too broad (cov­er­ing three sep­a­rate pro­gram­ming lan­guages and not the frame­works they favor). It also leaves out oth­er non-​JavaScript back-​end lan­guages and their asso­ci­at­ed frameworks.

The ques­tion is: what can replace it? I’d pro­pose NoJS” as rem­i­nis­cent of NoSQL,” but that inac­cu­rate­ly excludes JavaScript from its nec­es­sary role in the front-​end. NJSB” doesn’t exact­ly roll off the tongue, either, and still has the same ambi­gu­i­ty prob­lem as LAMP.”

How about pithy sort-​of-​acronyms pat­terned like database-​frontend-​backend? Here are some Perl examples:

  • MRDancer: MySQL, React, and Dancer (I use this at work. Yes, the M could also stand for MongoDB. Naming things is hard.)
  • MRMojo: MongoDB, React, and Mojolicious
  • PACat: PostgreSQL, Angular, and Catalyst
  • etc.

Ultimately it comes down to com­mu­ni­ty and indus­try adop­tion. If you’re involved with back-​end web devel­op­ment, please let me know in the com­ments if you agree or dis­agree that LAMP” is still a use­ful term, and if not, what should replace it.

a car parked beside airplane

One of Perl’s car­di­nal strengths is the depth and vari­ety of add-​on mod­ules to extend its capa­bil­i­ties, col­lect­ed for the past 26 years and count­ing on CPAN, the Comprehensive Perl Archive Network. Starting with ver­sion 5.004 in 1997, Perl has come pack­aged with a mod­ule (also called CPAN) and asso­ci­at­ed command-​line client for down­load­ing and installing from this ser­vice. Some devel­op­ers favor alter­na­tive tools such as CPANPLUS and its cpanp com­mand or cpan­mi­nus and its cpanm, or tools built on the lat­ter such as Carton and Carmel.

My favorite of these over the past sev­er­al years has been Shoichi Kaji’s cpm, main­ly because it’s blaz­ing­ly fast. As an exam­ple, the doc­u­men­ta­tion cites an instal­la­tion of Plack, the Perl web appli­ca­tion toolk­it, as tak­ing three times as long using cpanm ver­sus cpm. Both use the same Menlo core code but cpm achieves its speed by break­ing down depen­den­cies into indi­vid­ual streams, installing mod­ules in par­al­lel, and syn­chro­niz­ing the nec­es­sary work­er processes.

Shoichi’s pre­sen­ta­tion from The Perl Conference 2016 pro­vides a great summary:

Shoichi Kaji: Why a new CPAN client cpm’ is fast” (2016)

It’s very impor­tant to note that cpm is not a drop-​in replace­ment for the cpan or cpanm command-​line tools. Firstly, it uses the sub­com­mand install, e.g., cpm install Module::Name. Also, by default, it installs mod­ules into a sub­di­rec­to­ry named local/ as if you spec­i­fied cpanm --local-lib-contained local. You might want this if you’re set­ting up a Perl project with its non-​core depen­den­cies in a sep­a­rate loca­tion addressed by the local::lib mod­ule; oth­er­wise, you should use cpm install --global to install into a direc­to­ry in Perl’s @INC array. I tend to do the lat­ter when devel­op­ing, declar­ing my project’s depen­den­cies in a cpanfile.

Speaking of cpanfiles, like cpanm --installdeps cpm will use a cpanfile to dri­ve project depen­den­cy instal­la­tion. In fact, it defaults to look­ing for one if you don’t spec­i­fy indi­vid­ual mod­ules on the com­mand line and sup­ports the version-​controlled cpanfile.snapshot file intro­duced by Carton for track­ing exact depen­den­cies used by your project. This is great for repeat­ed­ly build­ing Docker con­tain­ers and cpm makes that process even faster.

Although speed is its most impor­tant fea­ture, cpm has a cou­ple more tricks up its sleeve like installing from a Git repos­i­to­ry or self-​hosted DarkPAN.” Check out its includ­ed tuto­r­i­al.

The perlcritic tool is often your first defense against awk­ward, hard to read, error-​prone, or uncon­ven­tion­al con­structs in your code,” per its descrip­tion. It’s part of a class of pro­grams his­tor­i­cal­ly known as lin­ters, so-​called because like a clothes dry­er machine’s lint trap, they detect small errors with big effects.” (Another such lin­ter is perltidy, which I’ve ref­er­enced in the past.)

You can use perlcritic at the com­mand line, inte­grat­ed with your edi­tor, as a git pre-​commit hook, or (my pref­er­ence) as part of your author tests. It’s dri­ven by poli­cies, indi­vid­ual mod­ules that check your code against a par­tic­u­lar rec­om­men­da­tion, many of them from Damian Conway’s Perl Best Practices (2005). Those poli­cies, in turn, are enabled by PPI, a library that trans­forms Perl code into doc­u­ments that can be pro­gram­mat­i­cal­ly exam­ined and manip­u­lat­ed much like the Document Object Model (DOM) is used to pro­gram­mat­i­cal­ly access web pages.

perlcritic enables the fol­low­ing poli­cies by default unless you cus­tomize its con­fig­u­ra­tion or install more. These are just the gen­tle” (sever­i­ty lev­el 5) poli­cies, so con­sid­er them the bare min­i­mum in detect­ing bad prac­tices. The full set of includ­ed poli­cies goes much deep­er, ratch­et­ing up the sever­i­ty to stern,” harsh,” cru­el,” and bru­tal.” They’re fur­ther orga­nized accord­ing to themes so that you might selec­tive­ly review your code against issues like secu­ri­ty, main­te­nance, com­plex­i­ty, and bug prevention.

My favorite above is prob­a­bly ProhibitEvilModules. Aside from the col­or­ful name, a devel­op­ment team can use it to steer peo­ple towards an organization’s favored solu­tions rather than dep­re­cat­ed, bug­gy, unsup­port­ed, or inse­cure” ones. By default, it pro­hibits Class::ISA, Pod::Plainer, Shell, and Switch, but you should curate and con­fig­ure a list with­in your team.

Speaking of work­ing with­in a team, although perlcritic is meant to be a vital tool to ensure good prac­tices, it’s no sub­sti­tute for man­u­al peer code review. Those reviews can lead to the cre­ation or adop­tion of new auto­mat­ed poli­cies to save time and set­tle argu­ments, but such work should be done col­lab­o­ra­tive­ly after achiev­ing some kind of con­sen­sus. This is true whether you’re a team of employ­ees work­ing on pro­pri­etary soft­ware or a group of vol­un­teers devel­op­ing open source.

Of course, rea­son­able peo­ple can and do dis­agree over any of the includ­ed poli­cies, but as a rea­son­able per­son, you should have good rea­sons to dis­agree before you either con­fig­ure perlcritic appro­pri­ate­ly or selec­tive­ly and know­ing­ly bend the rules where required. Other CPAN authors have even pro­vid­ed their own addi­tions to perlcritic, so it’s worth search­ing CPAN under Perl::Critic::Policy::” for more exam­ples. In par­tic­u­lar, these community-​inspired poli­cies group a num­ber of rec­om­men­da­tions from Perl devel­op­ers on Internet Relay Chat (IRC).

Personally, although I adhere to my employer’s stan­dard­ized con­fig­u­ra­tion when test­ing and review­ing code, I like to run perlcritic on the bru­tal” set­ting before com­mit­ting my own. What do you pre­fer? Let me know in the com­ments below.

depth of field photography of brown tree logs

A recent Lobsters post laud­ing the virtues of AWK remind­ed me that although the lan­guage is pow­er­ful and lightning-​fast, I usu­al­ly find myself exceed­ing its capa­bil­i­ties and reach­ing for Perl instead. One such appli­ca­tion is ana­lyz­ing volu­mi­nous log files such as the ones gen­er­at­ed by this blog. Yes, WordPress has stats, but I’ve nev­er let rein­ven­tion of the wheel get in the way of a good pro­gram­ming exercise.

So I whipped this script up on Sunday night while watch­ing RuPaul’s Drag Race reruns. It pars­es my Apache web serv­er log files and reports on hits from week to week.

#!/usr/bin/env perl

use strict;
use warnings;
use Syntax::Construct 'operator-double-diamond';
use Regexp::Log::Common;
use DateTime::Format::HTTP;
use List::Util 1.33 'any';
use Number::Format 'format_number';

my $parser = Regexp::Log::Common->new(
    format  => ':extended',
    capture => [qw<req ts status>],
);
my @fields      = $parser->capture;
my $compiled_re = $parser->regexp;

my @skip_uri_patterns = qw<
  ^/+robots.txt
  [-\w]*sitemap[-\w]*.xml
  ^/+wp-
  /feed/?$
  ^/+?rest_route=
>;

my ( %count, %week_of );
while ( <<>> ) {
    my %log;
    @log{@fields} = /$compiled_re/;

    # only interested in successful or cached requests
    next unless $log{status} =~ /^2/ or $log{status} == 304;

    my ( $method, $uri, $protocol ) = split ' ', $log{req};
    next unless $method eq 'GET';
    next if any { $uri =~ $_ } @skip_uri_patterns;

    my $dt  = DateTime::Format::HTTP->parse_datetime( $log{ts} );
    my $key = sprintf '%u-%02u', $dt->week;

    # get first date of each week
    $week_of{$key} ||= $dt->date;
    $count{$key}++;
}

printf "Week of %s: % 10s\n", $week_of{$_}, format_number( $count{$_} )
  for sort keys %count;

Here’s some sam­ple output:

Week of 2021-07-31:      2,672
Week of 2021-08-02:     16,222
Week of 2021-08-09:     12,609
Week of 2021-08-16:     17,714
Week of 2021-08-23:     14,462
Week of 2021-08-30:     11,758
Week of 2021-09-06:     14,811
Week of 2021-09-13:        407

I first start­ed pro­to­typ­ing this on the com­mand line as if it were an awk one-​liner by using the perl -n and -a flags. The for­mer wraps code in a while loop over the <> dia­mond oper­a­tor”, pro­cess­ing each line from stan­dard input or files passed as argu­ments. The lat­ter splits the fields of the line into an array named @F. It looked some­thing like this while I was list­ing URIs (loca­tions on the website):

gunzip -c ~/logs/phoenixtrap.com-ssl_log-*.gz | \
perl -anE 'say $F[6]'

But once I real­ized I’d need to fil­ter out a bunch of URI pat­terns and do some aggre­ga­tion by date, I turned it into a script and turned to CPAN.

There I found Regexp::Log::Common and DateTime::Format::HTTP, which let me pull apart the Apache log for­mat and its time­stamp strings with­out hav­ing to write even more com­pli­cat­ed reg­u­lar expres­sions myself. (As not­ed above, this was already a wheel-​reinvention exer­cise; no need to com­pound that further.)

Regexp::Log::Common builds a com­piled reg­u­lar expres­sion based on the log for­mat and fields you’re inter­est­ed in, so that’s the con­struc­tor on lines 11 through 14. The expres­sion then returns those fields as a list, which I’m assign­ing to a hash slice with those field names as keys in line 29. I then skip over requests that aren’t suc­cess­ful or brows­er cache hits, skip over requests that don’t GET web pages or oth­er assets (e.g., POSTs to forms or updat­ing oth­er resources), and skip over the URI pat­terns men­tioned earlier.

(Those pat­terns are worth a men­tion: they include the robots.txt and sitemap XML files used by search engine index­ers, WordPress admin­is­tra­tion pages, files used by RSS news­read­ers sub­scribed to my blog, and routes used by the Jetpack WordPress add-​on. If you’re adapt­ing this for your site you might need to cus­tomize this list based on what soft­ware you use to run it.)

Lines 38 and 39 parse the time­stamp from the log into a DateTime object using DateTime::Format::HTTP and then build the key used to store the per-​week hit count. The last lines of the loop then grab the first date of each new week (assum­ing the log is in chrono­log­i­cal order) and incre­ment the count. Once fin­ished, lines 46 and 47 pro­vide a report sort­ed by week, dis­play­ing it as a friend­ly Week of date” and the hit counts aligned to the right with sprintf. Number::Format’s format_number func­tion dis­plays the totals with thou­sands separators.

Update: After this was ini­tial­ly pub­lished. astute read­er Chris McGowan not­ed that I had a bug where $log{status} was assigned the val­ue 304 with the = oper­a­tor rather than com­pared with ==. He also sug­gest­ed I use the double-​diamond <<>> oper­a­tor intro­duced in Perl v5.22.0 to avoid maliciously-​named files. Thanks, Chris!

Room for improvement

DateTime is a very pow­er­ful mod­ule but this comes at a price of speed and mem­o­ry. Something sim­pler like Date::WeekNumber should yield per­for­mance improve­ments, espe­cial­ly as my logs grow (here’s hop­ing). It requires a bit more man­u­al mas­sag­ing of the log dates to con­vert them into some­thing the mod­ule can use, though:

#!/usr/bin/env perl

use strict;
use warnings;
use Syntax::Construct qw<
  operator-double-diamond
  regex-named-capture-group
>;
use Regexp::Log::Common;
use Date::WeekNumber 'iso_week_number';
use List::Util 1.33 'any';
use Number::Format 'format_number';

my $parser = Regexp::Log::Common->new(
    format  => ':extended',
    capture => [qw<req ts status>],
);
my @fields      = $parser->capture;
my $compiled_re = $parser->regexp;

my @skip_uri_patterns = qw<
  ^/+robots.txt
  [-\w]*sitemap[-\w]*.xml
  ^/+wp-
  /feed/?$
  ^/+?rest_route=
>;

my %month = (
    Jan => '01',
    Feb => '02',
    Mar => '03',
    Apr => '04',
    May => '05',
    Jun => '06',
    Jul => '07',
    Aug => '08',
    Sep => '09',
    Oct => '10',
    Nov => '11',
    Dec => '12',
);

my ( %count, %week_of );
while ( <<>> ) {
    my %log;
    @log{@fields} = /$compiled_re/;

    # only interested in successful or cached requests
    next unless $log{status} =~ /^2/ or $log{status} == 304;

    my ( $method, $uri, $protocol ) = split ' ', $log{req};
    next unless $method eq 'GET';
    next if any { $uri =~ $_ } @skip_uri_patterns;

    # convert log timestamp to YYYY-MM-DD
    # for Date::WeekNumber
    $log{ts} =~ m!^
      (?<day>\d\d) /
      (?<month>...) /
      (?<year>\d{4}) : !x;
    my $date = "$+{year}-$month{ $+{month} }-$+{day}";

    my $week = iso_week_number($date);
    $week_of{$week} ||= $date;
    $count{$week}++;
}

printf "Week of %s: % 10s\n", $week_of{$_}, format_number( $count{$_} )
  for sort keys %count;

It looks almost the same as the first ver­sion, with the addi­tion of a hash to con­vert month names to num­bers and the actu­al con­ver­sion (using named reg­u­lar expres­sion cap­ture groups for read­abil­i­ty, using Syntax::Construct to check for that fea­ture). On my serv­er, this results in a ten- to eleven-​second sav­ings when pro­cess­ing two months of com­pressed logs.

What’s next? Pretty graphs? Drilling down to spe­cif­ic blog posts? Database stor­age for fur­ther queries and analy­sis? Perl and CPAN make it pos­si­ble to go far beyond what you can do with AWK. What would you add or change? Let me know in the comments.