The Phoenix Trap

Code, music, philosophy, etc.

Tag: webdev

Scraping the Dragon with Perl and Mojolicious
Every extended Labor Day weekend, 80,000 fans of pop culture descend on Atlanta for Dragon Con. It’s a sprawling choose-your-own adventure of a convention with 38 programming tracks and over 5,000 hours of events. It spans five downtown host hotels, and there is no way to see it all.

Sadly, this year’s con is almost over. Still, I thought I’d share a little script I wrote to help me make sense of it all.

The official mobile app is fine for searching and bookmarking events, speakers, and exhibitors. Nonetheless, it’s not suitable for scanning the whole landscape at once. I wanted a single, scrollable view of every event, before I even packed my cosplay.

Even in the app’s tablet version, the Dragon Con Events area is a scroll-fest.

The web version of the app gave me exactly what I needed: predictable per-day URLs and semantically marked-up HTML. That meant I could skip the API hunt, skip the manual scrolling, and go straight to scraping.

Inspecting the HTML reveals per-day event URLs and per-event <div> blocks.

From Chaos to Clarity in 40 lines

We’re about to turn a messy, multi-day, multi-hotel schedule into one clean, scroll-once list. This is the forty-odd-line Perl map that gets us there, aided by the Mojolicious web toolkit.

Laying the Groundwork: Tools for the Job
```
#!/usr/bin/env perl

use v5.40;

use Carp;
use English;
use Mojo::UserAgent;
use Mojo::URL;
use Mojo::DOM;
use Mojo::Collection q(c);
use Time::Piece;
use HTML::HTML5::Entities;
use Memoize;

binmode STDOUT, ':encoding(UTF-8)'
  or croak "Couldn't encode STDOUT: $OS_ERROR";

my $ua   = Mojo::UserAgent->new();
my $site = Mojo::URL->new('https://app.core-apps.com');
my $path = '/dragoncon25/events/view_by_day';
```
What’s happening: Load the modules that will do the heavy lifting–HTTP fetches, DOM parsing, date handling, Unicode cleanup. Lock STDOUT to UTF‑8 so characters like curly quotes and em-dashes don’t break the output. Point the script at the base schedule URL.

Remembering the Days Without Re-Parsing
```
my $date_from_dom = memoize( sub ($dom) {
  return content_at( $dom, 'div.section_header[class~="alt"]' );
} );
```
What’s happening: Create a memoized helper that plucks the date from a day’s HTML and caches it. That way, if we need it again, we skip the DOM re-parse and keep the pipeline fast.

content_at is a helper function I define later.
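To see the caching in isolation, here is a minimal sketch that uses the core Memoize module on a plain named function. The slow_square function and the $calls counter are made up for illustration and are not part of the scraper.

```perl
use v5.36;
use Memoize;

my $calls = 0;    # counts how often the real function body runs

# hypothetical stand-in for an expensive lookup
sub slow_square ($n) {
    $calls++;
    return $n * $n;
}

memoize('slow_square');    # swap in a caching wrapper under the same name

# three calls with the same argument, all in list context
my @results = map { slow_square(12) } 1 .. 3;

say "results: @results";
say "function body ran $calls time(s)";
```

Only the first call does any work. The later ones are answered from the cache, which is keyed on the stringified arguments.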

Starting Where the App Starts
```
my $today_dom = Mojo::DOM->new( $ua->get("$site$path")->result->text );
```
What’s happening: Fetch the “today” view–the same default the app shows. This is so we have a known starting point for building the full timeline.

Collecting the Whole Timeline
```
my $day_doms = c(
  $today_dom,
  $today_dom->find(qq(div.filter_box-days > a[href^="$path?day="]))
    ->map( \&dom_from_anchor )
    ->to_array->@*,
)->sort( sub { day_epoch($a) <=> day_epoch($b) } );
```
What’s happening: Grab every day link from the filter bar, fetch each day’s HTML, and sort them chronologically. Now we’ve got the entire con’s schedule in memory, ready to process.

dom_from_anchor and day_epoch are two more helper functions explained further down.

Turning HTML into a Human-Readable Schedule
```
$day_doms->each( sub {    # process each day's events
  my $date = $date_from_dom->($_);

  $_->find('a.bookmark[data-type="events"] + a.object_link')
    ->each( sub {         # output start time + title

      my $time    = content_at( $_, 'div.line[class~="two"]' );
      my $title   = content_at( $_, 'div.line[class~="one"]' );
      my ($start) = split /\s*\p{Dash_Punctuation}/, $time;

      say "$date $start: ", decode_entities($title);
    } );
} );
```
What’s happening: For each day, find every event link and pull out the start time and title. Split the time cleanly on any dash and decode HTML entities so the output reads like a real schedule.

The Little Routines That Make It All Work
```
sub dom_from_anchor ($dom) {    # fetch DOM for a day link
  return Mojo::DOM->new(
    $ua->get( Mojo::URL->new( $dom->attr('href') )->to_abs($site) )
      ->result->text );
}

sub day_epoch ($dom) {    # parse date into epoch
  return Time::Piece->strptime( $date_from_dom->($dom), '%A, %b %e' )
    ->epoch;
}

# extract and trim text from selector
sub content_at ( $dom, @args ) { return trim $dom->at(@args)->content }
```
What’s happening:
1. dom_from_anchor: fetches and parses a linked day’s HTML.
2. day_epoch: turns a date string into a sortable epoch.
3. content_at: extracts and trims text from a DOM fragment, given a CSS selector.
These helpers keep the main flow readable and re-usable.
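
The sorting idea behind day_epoch can be tried without any network access. This is a small sketch using only the core Time::Piece module. The three sample headers are invented, and I’m assuming they follow the same “Weekday, Mon D” shape that the script’s format string expects.

```perl
use v5.36;
use Time::Piece;

# invented sample headers, deliberately out of order
my @headers = ( 'Sunday, Aug 31', 'Thursday, Aug 28', 'Monday, Sep 1' );

# the strings carry no year, so every date is parsed into the same default year
my @sorted = sort {
    Time::Piece->strptime( $a, '%A, %b %e' )->epoch
      <=> Time::Piece->strptime( $b, '%A, %b %e' )->epoch
} @headers;

say for @sorted;
```

Because the year is missing, this only orders dates correctly within a single calendar year, which is fine for a long-weekend convention.
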
The Schedule, Unlocked

Run the script and you get a clean, UTF-8-safe list of every event, in chronological order, across all days. No swiping around, no tapping, no “what did I miss?” anxiety. (Ha, who am I kidding? There’s too much going on at Dragon Con to not end up missing something.)

An example run of the script in my terminal. Each line is “Day, Date Time: Event Title”, sorted chronologically across the whole con.

And here’s just a small slice of the 2,500+ lines it produces:

Sunday, Aug 31 11:30 AM: Unmasking Sherlock: Beyond the Many Faces
Sunday, Aug 31 11:30 AM: Weaponization of the FCC and Other Agencies to Chill Speech
Sunday, Aug 31 11:30 AM: Where Physics Gets Weird
. . .
Sunday, Aug 31 11:50 AM: Photo Session: Amelia Tyler
Sunday, Aug 31 11:50 AM: Photo Session: Cissy Jones
Sunday, Aug 31 11:50 AM: Photo Session: Emma Gregory
. . .
Sunday, Aug 31 12:00 PM: Dragon Con Mashups
Sunday, Aug 31 12:00 PM: James J. Butcher and R.R. Virdi signing at The Missing Volume booth# 1300
Sunday, Aug 31 12:00 PM: JoeDan Worley and Eric Dontigney signing at the Shadow Alley Press Booth# 2
. . .
Sunday, Aug 31 12:00 PM: Photo Session: Robert Duncan McNeill
Sunday, Aug 31 12:00 PM: Photo Session: Robert Picardo
Sunday, Aug 31 12:00 PM: Photo Session: Tamara Taylor
Key Techniques

Here’s the fun part–the techniques that make this tidy, scroll-once list possible.

CSS selectors for precision

I used a.bookmark[data-type="events"] + a.object_link to grab only the event title links, and div.line[class~="two"] / div.line[class~="one"] for time and title, respectively. This avoids scraping unrelated elements.

Memoization for efficiency

memoize caches the date string for each day’s DOM so I didn’t end up re-parsing the HTML fragment multiple times.

Unicode-safe splitting

\p{Dash_Punctuation} matches any dash type (em, en, hyphen-minus, etc.), so I could split times reliably without worrying about which dash the site uses.
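
Here is a minimal sketch of that split on its own. The three time ranges are made-up samples (I’m assuming a “start, dash, end” layout), each using a different dash character.

```perl
use v5.36;

# hypothetical time ranges, one per dash type
my @ranges = (
    "11:30 AM - 12:30 PM",               # hyphen-minus
    "1:00 PM \N{EN DASH} 2:00 PM",       # en dash
    "5:30 PM\N{EM DASH}7:00 PM",         # em dash, no surrounding spaces
);

# keep only what comes before the first dash (and any spaces leading up to it)
my @starts = map { ( split /\s*\p{Dash_Punctuation}/, $_ )[0] } @ranges;

say for @starts;
```

All three yield a clean start time, with no special-casing per dash.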

Functional chaining

Mojo::Collection’s map, sort, and each methods let me express the scrape→transform→output pipeline in a linear, readable way.

Entity decoding at output

HTML::HTML5::Entities’ decode_entities is applied right before printing, so HTML entities like &amp;amp; or &amp;quot; come out as a human-readable & or " in the final output.
A Pattern You Can Take Anywhere

The same approach that tamed Dragon Con’s chaos works anywhere you’ve got:
- Predictable URLs–so you can iterate without guesswork
- Consistent HTML structure–so your selectors stay stable
- A need to see everything at once–so you can make decisions without paging or filtering
From fan conventions to conference schedules, from local sports fixtures to film festival line‑ups–the same pattern applies. Sometimes the right tool isn’t a sprawling framework or heavyweight API client. It’s a forty‑odd‑line Perl script that does one thing with ruthless clarity.

Because once you’ve tamed a schedule like this, the only lines you’ll stand in are the ones that feel like part of the show.
August 31, 2025
WordPress, ActivityPub, and Friends

I’ve also been messing with the Friends and ActivityPub plugins for WordPress on my blog, and I share Shelley’s concerns about the former bloating the database with feed items. You can control this somewhat by setting retention values in days or a number of posts, but you have to go into each friend’s Feeds tab and do it manually–there’s no default setting.

After reading that post, I’m also considering disabling Friends in favor of a feed reader, especially because (as Shelley also noted) there are gaps with favorites and comment conversations bridging between WordPress and Mastodon servers. Like her, I’m not keen on installing a single-user Mastodon instance or other fediverse server that requires managing an unfamiliar programming language.

I’m also trying to do this in tandem with a suite of IndieWeb plugins, and I’m running into an issue with my friends feed page not showing any posts when the Post Kinds plugin is activated. I really want to keep this plugin because it lets me interact better with other IndieWeb sites as well as the Bridgy POSSE/backfeed service connecting me to other social networks.

My ideal is a personal website where I write everything, including long-form articles, short statuses, and replies like these. Folks can then find me via a single identifiable address and then subscribe/follow the entire firehose of content or choose subsets according to post types, topics, or tags. They’d then be able to reply or react on my site or their favored platform, which my site would collect regardless of origin, with subsequent replies and reactions getting pushed out to them. Oh, and it should work with both ActivityPub clients and servers, IndieWeb sites, and syndicate/backfeed to other social networks either with or akin to the Bridgy service I mentioned above.

So far I haven’t seen anything that ticks all these boxes, and I’m getting itchy to write my own. Perl is my favorite programming language, so I’m looking at the Yancy CMS as a base. But I know that it would still be a hell of a project, and one of the reasons I chose WordPress for blogging was that it was well-established and ‑supported but still easily extensible so that I could concentrate on writing instead of endlessly tweaking the engine. Unfortunately, I’m starting to fall into that trap anyway.

January 8, 2023
How much is that BLÅHAJ in the (terminal) window?
IKEA’s toy BLÅHAJ shark has become a beloved Internet icon over the past several years. I thought it might be cute to write a little Perl to get info about it and even display a cuddly picture right in the terminal where I’m running the code. Maybe this will give you some ideas for your own quick web clients. Of course, you could accomplish all of these things using a pipeline of individual command-line utilities like curl, jq, and GNU coreutils’ base64. These examples focus on Perl as the glue, though.

Warning: dodgy API ahead

I haven’t found a publicly-documented and ‑supported official API for querying IKEA product information but others have deconstructed the company’s web site AJAX requests so we can use that instead. The alternative would be to scrape the IKEA web site directly which, although possible, would be more tedious and prone to failure should their design change. An unofficial API is also unreliable but the simpler client code is easier to change should any errors surface.

Enter the Mojolicious

My original goal was to do this in a single line issued to the perl command, and luckily the Mojolicious framework’s ojo module is tailor-made for such things. By adding a -Mojo switch to the perl command, you get over a dozen quick single-character functions for spinning up a quick web application or, in our case, making and interpreting web requests without a lot of ceremony. Here’s the start of my one-line request to the IKEA API for information on their BLÅHAJ product, using ojo’s g function to perform an HTTP GET and displaying the JSON from the response body to the terminal.
```
$ perl -Mojo -E 'say g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->body'
```
This currently returns over 2,400 lines of data, so after reading it over I’ll convert the response body JSON to a Perl data structure and dump only the main product information using ojo’s r function:
```
$ perl -Mojo -E 'say r g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}'
{
  "availability" => [],
  "breathTaking" => bless( do{\(my $o = 0)}, 'JSON::PP::Boolean' ),
  "colors" => [
    {
      "hex" => "0058a3",
      "id" => 10007,
      "name" => "blue"
    },
    {
      "hex" => "ffffff",
      "id" => 10156,
      "name" => "white"
    }
  ],
  "contextualImageUrl" => "https://www.ikea.com/us/en/images/products/blahaj-soft-toy-shark__0877371_pe633608_s5.jpg",
  "currencyCode" => "USD",
  "discount" => "",
  "features" => [],
  "gprDescription" => {
    "numberOfVariants" => 0,
    "variants" => []
  },
  "id" => 90373590,
  "itemMeasureReferenceText" => "39 \x{bc} \"",
  "itemNo" => 90373590,
  "itemNoGlobal" => 30373588,
  "itemType" => "ART",
  "lastChance" => $VAR1->{"breathTaking"},
  "mainImageAlt" => "BL\x{c5}HAJ Soft toy, shark, 39 \x{bc} \"",
  "mainImageUrl" => "https://www.ikea.com/us/en/images/products/blahaj-soft-toy-shark__0710175_pe727378_s5.jpg",
  "name" => "BL\x{c5}HAJ",
  "onlineSellable" => bless( do{\(my $o = 1)}, 'JSON::PP::Boolean' ),
  "pipUrl" => "https://www.ikea.com/us/en/p/blahaj-soft-toy-shark-90373590/",
  "price" => {
    "decimals" => 99,
    "isRegularCurrency" => $VAR1->{"breathTaking"},
    "prefix" => "\$",
    "separator" => ".",
    "suffix" => "",
    "wholeNumber" => 19
  },
  "priceNumeral" => "19.99",
  "quickFacts" => [],
  "tag" => "NONE",
  "typeName" => "Soft toy"
}
```
If I just want the price I can do:
```
$ perl -Mojo -E 'say g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}->@{qw(currencyCode priceNumeral)}'
USD19.99
```
That ->@{qw(currencyCode priceNumeral)} towards the end uses the postfix reference slicing syntax introduced experimentally in Perl v5.20 and made official in v5.24. If you’re using an older perl, you’d say:
```
$ perl -Mojo -E 'say @{g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}}{qw(currencyCode priceNumeral)}'
USD19.99
```
I prefer the former, though, because it’s easier to read left-to-right.

But I’m not in the United States! Where’s my native currency?

You can either replace the ”us/en” in the URL above or use the core I18N::LangTags::Detect module added in Perl v5.8.5 if you’re really determined to be portable across different users’ locales. This is really stretching the definition of ”one-liner,” though.
```
$ LANG=de_DE.UTF-8 perl -Mojo -MI18N::LangTags::Detect -E 'my @lang = (split /-/, I18N::LangTags::Detect::detect)[1,0]; say g("https://sik.search.blue.cdtapps.com/" . join("/", @lang == 2 ? @lang : ("us", "en")) . "/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}->@{qw(currencyCode priceNumeral)}'
EUR27.99
```
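If the list-slice trick in that one-liner is hard to follow, here is the same idea spelled out as a sketch with named variables. The locale_path helper is hypothetical, and the “region/language” path order is my reading of the “us/en” segment in the URL above.

```perl
use v5.36;

# hypothetical helper: turn a language tag such as "de-de" into a
# "region/language" path segment, falling back to "us/en"
sub locale_path ($tag) {
    my ( $language, $region ) = split /-/, lc( $tag // '' ), 2;
    return join '/', length( $region // '' ) ? ( $region, $language ) : qw(us en);
}

# a full tag, a mixed-case tag, a tag without a region, and no tag at all
my @paths = map { locale_path($_) } 'de-de', 'en-GB', 'fr', undef;

say for @paths;
```

Unlike the one-liner, this version also falls back to the default when the tag has a language but no region.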
Window dressing

It’s hard to envision cuddling a number, but luckily the product information returned above links to a JPEG file in the mainImageUrl key. My favorite terminal app iTerm2 can display images inline from either a file or Base64 encoded data, so adding an extra HTTP request and encoding from the core MIME::Base64 module yields:
```
$ perl -Mojo -MMIME::Base64 -E 'say "\c[]1337;File=inline=1;width=100%:", encode_base64(g(g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}{mainImageUrl})->body), "\cG"'
```
(You could just send the image URL to iTerm2’s bundled imgcat utility, but where’s the fun in that?)
```
$ imgcat --url `perl -Mojo -E 'print g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}{mainImageUrl}'`
```
But I don’t have iTerm2 or a Mac!

I got you. At the expense of a number of other dependencies, here’s a version that will work on any terminal that supports 256-color mode with ANSI codes using Image::Term256Color from CPAN and a Unicode font with block characters. I’ll also use Term::ReadKey to size the image for the width of your window. (Again, this stretches the definition of “one-liner.”)
```
$ perl -Mojo -MImage::Term256Color -MTerm::ReadKey -E 'say for Image::Term256Color::convert(g(g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}{mainImageUrl})->body, {scale_x => (GetTerminalSize)[0], utf8 => 1})'
```
I hate Mojolicious! Can’t you just use core modules?

Fine. Here’s retrieving the product price using HTTP::Tiny and the pure-Perl JSON parser JSON::PP, which were added to core in version 5.14.
```
$ perl -MHTTP::Tiny -MJSON::PP -E 'say @{decode_json(HTTP::Tiny->new->get("https://sik.search.blue.cdtapps.com/us/en/search-result-page?types=PRODUCT&q=BLÅHAJ")->{content})->{searchResultPage}{products}{main}{items}[0]{product}}{qw(currencyCode priceNumeral)}'
USD19.99
```
Fetching and displaying a picture of the huggable shark using MIME::Base64 or Image::Term256Color as above is left as an exercise to the reader.
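
If you want to try the JSON::PP half without hitting the network, this sketch decodes a trimmed, hand-written sample shaped like the response shown earlier. The sample is an assumption and carries only a few of the real fields.

```perl
use v5.36;
use JSON::PP;

# trimmed, hand-written sample mimicking the structure of the real response
my $json = <<~'END_JSON';
    {"searchResultPage": {"products": {"main": {"items": [
      {"product": {
        "name": "BL\u00c5HAJ",
        "currencyCode": "USD",
        "priceNumeral": "19.99",
        "onlineSellable": true
      }}
    ]}}}}
    END_JSON

my $product = decode_json($json)->{searchResultPage}{products}{main}{items}[0]{product};

# classic hash slice, as in the core-modules one-liner
my $price = join ' ', @{$product}{qw(currencyCode priceNumeral)};

say $price;
say 'can be ordered online' if $product->{onlineSellable};    # JSON true is truthy
```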
April 12, 2022
34 at 34 for v5.34: Modern Perl features for Perl’s birthday
Friday, December 17, 2021, marked the thirty-fourth birthday of the Perl programming language, and coincidentally this year saw the release of version 5.34. There are plenty of Perl developers out there who haven’t kept up with recent (and not-so-recent) improvements to the language and its ecosystem, so I thought I might list a batch. (You may have seen some of these before in May’s post “Perl can do that now!”)

The feature pragma

Perl v5.10 was released in December 2007, and with it came feature, a way of enabling new syntax without breaking backward compatibility. You can enable individual features by name (e.g., use feature qw(say fc); for the say and fc keywords), or by using a feature bundle based on the Perl version that introduced them. For example, the following:
```
use feature ':5.34';
```
…gives you the equivalent of:
```
use feature qw(bareword_filehandles bitwise current_sub evalbytes fc indirect multidimensional postderef_qq say state switch unicode_eval unicode_strings);
```
Boy, that’s a mouthful. Feature bundles are good. The corresponding bundle also gets implicitly loaded if you specify a minimum required Perl version, e.g., with use v5.32;. If you use v5.12; or higher, strict mode is enabled for free. So just say:
```
use v5.34;
```
And lastly, one-liners can use the -E switch instead of -e to enable all features for that version of Perl, so you can say the following on the command line:
```
perl -E 'say "Hello world!"'
```
Instead of:
```
perl -e 'print "Hello world!\n"'
```
Which is great when you’re trying to save some typing.

The experimental pragma

Sometimes new Perl features need to be driven a couple of releases around the block before their behavior settles. Those experiments are documented in the perlexperiment page, and usually, you need both a use feature (see above) and no warnings statement to safely enable them. Or you can simply pass a list to use experimental of the features you want, e.g.:
```
use experimental qw(isa postderef signatures);
```
Ever-expanding warnings categories

March 2000 saw the release of Perl 5.6, and with it, the expansion of the -w command-line switch to a system of fine-grained controls for warning against “dubious constructs” that can be turned on and off depending on the lexical scope. What started as 26 main and 20 subcategories has expanded into 31 main and 43 subcategories, including warnings for the aforementioned experimental features.

As the relevant Perl::Critic policy says, “Using warnings, and paying attention to what they say, is probably the single most effective way to improve the quality of your code.” If you must violate warnings (perhaps because you’re rehabilitating some legacy code), you can isolate such violations to a small scope and individual categories. Check out the strictures module on CPAN if you’d like to go further and make a safe subset of these categories fatal during development.
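
Here is a small sketch of isolating one warning category to one block. The $legacy_value variable is invented, and the $SIG{__WARN__} handler is there only so the example can count the warnings instead of printing them.

```perl
use v5.36;    # also turns on strict and warnings

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };    # collect warnings

my $legacy_value;    # deliberately left undefined

{
    no warnings 'uninitialized';    # silence one category, in this block only
    my $quiet = "value: $legacy_value";
}

my $noisy = "value: $legacy_value";    # outside the block, the warning is back

say scalar(@caught), ' warning(s) caught';
print @caught;
```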

Document other recently-introduced syntax with Syntax::Construct

Not every new bit of Perl syntax is enabled with a feature guard. For the rest, there’s E. Choroba’s Syntax::Construct module on CPAN. Rather than having to remember which version of Perl introduced what, Syntax::Construct lets you declare only what you use and provides a helpful error message if someone tries to run your code on an older unsupported version. Between it and the feature pragma, you can prevent many head-scratching moments and give your users a chance to either upgrade or work around the problem.

Make built-in functions throw exceptions with autodie

Many of Perl’s built-in functions only return false on failure, requiring the developer to check every time whether a file can be opened or a system command executed. The lexical autodie pragma replaces them with versions that raise an exception with an object that can be interrogated for further details. No matter how many functions or methods deep a problem occurs, you can choose to catch it and respond appropriately. This leads us to…
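
As a rough sketch of what that looks like, the following tries to open a file that shouldn’t exist (the path is made up) and then inspects the exception object. It uses a plain block eval so that it runs on older Perls too.

```perl
use v5.36;
use autodie;    # open, close, and friends now throw exceptions in this scope

my $path = '/no/such/directory/schedule.txt';    # hypothetical missing file

my $error;
eval {
    open my $fh, '<', $path;    # no "or die ..." needed
    1;
} or $error = $@;

if ( ref $error && $error->isa('autodie::exception') ) {
    say 'failed call: ', $error->function;                       # which built-in failed
    say 'was it open? ', $error->matches('open') ? 'yes' : 'no';
    say 'reason:      ', $error->errno;                          # $! at failure time
}
```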

try/catch exception handling and Feature::Compat::Try

This year’s Perl v5.34 release introduced experimental try/catch syntax for exception handling that should look more familiar to users of other languages while handling the issues surrounding using block eval and testing of the special $@ variable. If you need to remain compatible with older versions of Perl (back to v5.14), just use the Feature::Compat::Try module from CPAN to automatically select either v5.34’s native try/catch or a subset of the functionality provided by Syntax::Keyword::Try.
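
For a feel of the native syntax, here is a minimal sketch that needs Perl v5.34 or later and no CPAN modules. The risky_divide function is invented for the example.

```perl
use v5.34;
use feature 'try';
no warnings 'experimental::try';    # the feature is experimental in v5.34

# hypothetical function that throws on bad input
sub risky_divide {
    my ( $numerator, $denominator ) = @_;
    die "cannot divide by zero\n" if $denominator == 0;
    return $numerator / $denominator;
}

my @report;
for my $denominator ( 4, 0 ) {
    try {
        push @report, "20 / $denominator = " . risky_divide( 20, $denominator );
    }
    catch ($e) {
        chomp $e;    # the exception arrives in a lexical, not in $@
        push @report, "20 / $denominator failed: $e";
    }
}

say for @report;
```

With Feature::Compat::Try installed, replacing the use feature and no warnings lines with use Feature::Compat::Try; should give the same behavior on older Perls.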

Pluggable keywords

The abovementioned Syntax::Keyword::Try was made possible by the introduction of a pluggable keyword mechanism in 2010’s Perl v5.12. So was the Future::AsyncAwait asynchronous programming library and the Object::Pad testbed for new object-oriented Perl syntax. If you’re handy with C and Perl’s XS glue language, check out Paul “LeoNerd” Evans’ XS::Parse::Keyword module to get a leg up on developing your own syntax module.

Define packages with versions and blocks

Perl v5.12 also helped reduce clutter by enabling a package namespace declaration to also include a version number, instead of requiring a separate our $VERSION = ...; v5.14 further refined packages to be specified in code blocks, so a namespace declaration can be the same as a lexical scope. Putting the two together gives you:
```
package Local::NewHotness v1.2.3 {
    ...
}
```
Instead of:
```
{
    package Local::OldAndBusted;
    use version 0.77; our $VERSION = version->declare("v1.2.3");
    ...
}
```
I know which I’d rather do. (Though you may want to also use Syntax::Construct qw(package-version package-block); to help along with older installations as described above.)

The // defined-or operator

This is an easy win from Perl v5.10:
```
defined $foo ? $foo : $bar  # replace this
$foo // $bar                # with this
```
And:
```
$foo = $bar unless defined $foo  # replace this
$foo //= $bar                    # with this
```
Perfect for assigning defaults to variables.
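
The difference from the older || idiom shows up when a legitimate value happens to be false. A quick sketch, with an invented %config hash:

```perl
use v5.36;

my %config = ( retries => 0, name => undef );    # made-up settings

my $with_or  = $config{retries} || 3;    # || treats the valid 0 as "missing"
my $with_dor = $config{retries} // 3;    # // only replaces undef

$config{name} //= 'anonymous';           # assign a default in place

say "|| gives $with_or, // gives $with_dor, name is $config{name}";
```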

state variables only initialize once

Speaking of variables, ever want one to keep its old value the next time a scope is entered, like in a sub? Declare it with state instead of my. Before Perl v5.10, you needed to use a closure instead.
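
A minimal sketch, using a made-up ticket counter:

```perl
use v5.36;

sub next_ticket {
    state $number = 100;    # initialized on the first call only
    return ++$number;
}

my @tickets = map { next_ticket() } 1 .. 3;

say "@tickets";
```

Each call picks up where the last one left off, and nothing outside the sub can reach $number.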

Save some typing with say

Perl v5.10’s bumper crop of enhancements also included the say function, which handles the common use case of printing a string or list of strings with a newline. It’s less noise in your code and saves you four characters. What’s not to love?

Note unimplemented code with ...

The ... ellipsis statement (colloquially “yada-yada”) gives you an easy placeholder for yet-to-be-implemented code. It parses OK but will throw an exception if executed. Hopefully, your test coverage (or at least static analysis) will catch it before your users do.
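
For example, with a hypothetical export_schedule stub:

```perl
use v5.36;

sub export_schedule { ... }    # compiles fine, dies only when called

my $ok    = eval { export_schedule(); 1 };
my $error = $@;

say $ok ? 'ran' : "died: $error";
```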

Loop and enumerate arrays with each, keys, and values

The each, keys, and values functions have always been able to operate on hashes. Perl v5.12 and above make them work on arrays, too. The latter two are mainly for consistency, but you can use each to iterate over an array’s indices and values at the same time:
```
while (my ($index, $value) = each @array) {
    ...
}
```
This can be problematic in non-trivial loops, but I’ve found it helpful in quick scripts and one-liners.

delete local hash (and array) entries

Ever needed to delete an entry from a hash (e.g, an environment variable from %ENV or a signal handler from %SIG) just inside a block? Perl v5.12 lets you do that with delete local.
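
A quick sketch, using an environment variable invented for the demo:

```perl
use v5.36;

$ENV{DEMO_TOKEN} = 'secret';    # hypothetical variable, set just for this example

my $inside;
{
    delete local $ENV{DEMO_TOKEN};    # gone until the end of this block
    $inside = exists $ENV{DEMO_TOKEN} ? 'present' : 'absent';
}
my $outside = exists $ENV{DEMO_TOKEN} ? 'present' : 'absent';

say "inside: $inside, outside: $outside";
```

The entry comes back with its old value as soon as the block exits, even if the block exits by throwing an exception.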

Paired hash slices

Jumping forward to 2014’s Perl v5.20, the new %foo{'bar', 'baz'} syntax enables you to slice a subset of a hash with its keys and values intact. Very helpful for cherry-picking or aggregating many hashes into one. For example:
```
my %args = (
    verbose => 1,
    name    => 'Mark',
    extra   => 'pizza',
);
# don't frob the pizza
$my_object->frob( %args{ qw(verbose name) } );
```
Paired array slices

Not to be left out, you can also slice arrays in the same way, in this case returning indices and values:
```
my @letters = 'a' .. 'z';
my @subset_kv = %letters[16, 5, 18, 12];
# @subset_kv is now (16, 'q', 5, 'f', 18, 's', 12, 'm')
```
More readable dereferencing

Perl v5.20 introduced and v5.24 de-experimentalized a more readable postfix dereferencing syntax for navigating nested data structures. Instead of using {braces} or smooshing sigils to the left of identifiers, you can use a postfixed sigil-and-star:
```
push @$array_ref,    1, 2, 3;  # noisy
push @{$array_ref},  1, 2, 3;  # a little easier
push $array_ref->@*, 1, 2, 3;  # read from left to right
```
So much of web development is slinging around and picking apart complicated data structures via JSON, so I welcome anything like this to reduce the cognitive load.
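
Here is a sketch of a few more postfix forms on a nested structure. The $order data is invented for illustration.

```perl
use v5.36;

# made-up nested data, the kind you might get from decoded JSON
my $order = {
    items => [ 'shark', 'lamp', 'shelf' ],
    price => { shark => 19.99, lamp => 12.5 },
};

my @items     = $order->{items}->@*;              # the whole array
my @first_two = $order->{items}->@[ 0, 1 ];       # an array slice
my @priced    = sort keys $order->{price}->%*;    # the whole hash
my $last_idx  = $order->{items}->$#*;             # the last index

say "items: @items";
say "first two: @first_two";
say "priced: @priced";
say "last index: $last_idx";
```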

when as a statement modifier

Starting in Perl v5.12, you can use the experimental switch feature’s when keyword as a postfix modifier. For example:
```
for ($foo) {
    $a =  1 when /^abc/;
    $a = 42 when /^dna/;
    ...
}
```
But I don’t recommend when, given, or given’s smartmatch operations as they were retconned as experiments in 2013’s Perl v5.18 and have remained so due to their tricky behavior. I wrote about some alternatives using stable syntax back in February.

Simple class inheritance with use parent

Sometimes in older object-oriented Perl code, you’ll see use base as a pragma to establish inheritance from another class. Older still is the direct manipulation of the package’s special @ISA array. In most cases, both should be avoided in favor of use parent, which was added to core in Perl v5.10.1.

Mind you, if you’re following the Perl object-oriented tutorial’s advice and have selected an OO system from CPAN, use its subclassing mechanism if it has one. Moose, Moo, and Class::Accessor’s “antlers” mode all provide an extends function; Object::Pad provides an :isa attribute on its class keyword.
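
For plain classes with no OO framework, a sketch might look like this. Both Local:: classes are invented, and -norequire is needed here only because the parent lives in the same file rather than in its own .pm.

```perl
use v5.36;

package Local::Animal {
    sub new   ( $class, %args ) { return bless {%args}, $class }
    sub name  ($self)           { return $self->{name} }
    sub speak ($self)           { return $self->name . ' makes a sound' }
}

package Local::Shark {
    # -norequire: the parent is defined above, so don't try to load a file
    use parent -norequire, 'Local::Animal';

    sub speak ($self) { return $self->name . ' says blub' }    # override
}

my $shark = Local::Shark->new( name => 'BLAHAJ' );

say $shark->speak;
say 'inherits from Local::Animal' if $shark->isa('Local::Animal');
```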

Test for class membership with the isa operator

As an alternative to the isa() method provided to all Perl objects, Perl v5.32 introduced the experimental isa infix operator:
```
$my_object->isa('Local::MyClass')
# or
$my_object isa Local::MyClass
```
The latter can take either a bareword class name or string expression, but more importantly, it’s safer as it also returns false if the left argument is undefined or isn’t a blessed object reference. The older isa() method will throw an exception in the former case and might return true if called as a class method when $my_object is actually a string of a class name that’s the same as or inherits from isa()’s argument.
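
The following sketch runs all three cases side by side. It assumes Perl v5.32 or later, and the variable names are made up.

```perl
use v5.32;
use feature 'isa';
no warnings 'experimental::isa';    # needed on v5.32 and v5.34

package Local::MyClass { sub new { return bless {}, shift } }

my $object   = Local::MyClass->new;
my $nothing  = undef;
my $impostor = 'Local::MyClass';    # a plain string, not an object

my @results = (
    ( $object   isa Local::MyClass ) ? 'yes' : 'no',
    ( $nothing  isa Local::MyClass ) ? 'yes' : 'no',    # no exception on undef
    ( $impostor isa Local::MyClass ) ? 'yes' : 'no',    # class-name strings don't count
    $impostor->isa('Local::MyClass') ? 'yes' : 'no',    # but the method says yes
);

say "@results";
```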

Lexical subroutines

Introduced in Perl v5.18 and de-experimentalized in 2017’s Perl v5.26, you can now precede sub declarations with my, state, or our. One use of the first two is truly private functions and methods, as described in this 2018 Dave Jacoby blog and as part of Neil Bowers’ 2014 survey of private function techniques.

Subroutine signatures

I’ve written and presented extensively about signatures and alternatives over the past year, so I won’t repeat that here. I’ll just add that the Perl 5 Porters development mailing list has been making a concerted effort over the past month to hash out the remaining issues towards rendering this feature non-experimental. The popular Mojolicious real-time web framework also provides a shortcut for enabling signatures and uses them extensively in examples.

Indented here-documents with <<~

Perl has had shell-style “here-document” syntax for embedding multi-line strings of quoted text for a long time. Starting with Perl v5.26, you can precede the delimiting string with a ~ character and Perl will both allow the ending delimiter to be indented as well as strip indentation from the embedded text. This allows for much more readable embedded code such as runs of HTML and SQL. For example:
```
if ($do_query) {
    my $rows_deleted = $dbh->do(<<~'END_SQL', undef, 42);
      DELETE FROM table
      WHERE status = ?
      END_SQL
    say "$rows_deleted rows were deleted."; 
}
```
More readable chained comparisons

When I learned math in school, my teachers and textbooks would often describe multiple comparisons and inequalities as a single expression. Unfortunately, when it came time to learn programming every computer language I saw required them to be broken up with a series of and (or &&) operators. With Perl v5.32, this is no more:
```
if ( $x < $y && $y <= $z ) { ... }  # old way
if ( $x < $y <= $z )       { ... }  # new way
```
It’s more concise, less noisy, and more like what regular math looks like.

Self-documenting named regular expression captures

Perl’s expressive regular expression matching and text-processing prowess are legendary, although overuse and poor use of readability enhancements often turn people away from them (and Perl in general). We often use regexps for extracting data from a matched pattern. For example:
```
if ( /Time: (..):(..):(..)/ ) {  # parse out values
    say "$1 hours, $2 minutes, $3 seconds";
}
```
Named capture groups, introduced in Perl v5.10, make both the pattern more obvious and retrieval of its data less cryptic:
```
if ( /Time: (?<hours>..):(?<minutes>..):(?<seconds>..)/ ) {
    say "$+{hours} hours, $+{minutes} minutes, $+{seconds} seconds";
}
```
More readable regexp character classes

The /x regular expression modifier already enables better readability by telling the parser to ignore most whitespace, allowing you to break up complicated patterns into spaced-out groups and multiple lines with code comments. With Perl v5.26 you can specify /xx to also ignore spaces and tabs inside [bracketed] character classes, turning this:
```
/[d-eg-i3-7]/
/[!@"#$%^&*()=?<>']/
```
…into this:
```
/ [d-e g-i 3-7]/xx
/[ ! @ " # $ % ^ & * () = ? <> ' ]/xx
```
Set default regexp flags with the re pragma

Beginning with Perl v5.14, writing use re '/xms'; (or any combination of regular expression modifier flags) will turn on those flags until the end of that lexical scope, saving you the trouble of remembering them every time.

Non-destructive substitution with s///r and tr///r

The s/// substitution and tr/// transliteration operators typically change their input directly, often in conjunction with the =~ binding operator:
```
s/foo/bar/;  # changes the first foo to bar in $_
$baz =~ s/foo/bar/;  # the same but in $baz
```
But what if you want to leave the original untouched, such as when processing an array of strings with a map? With Perl v5.14 and above, add the /r flag, which makes the substitution on a copy and returns the result:
```
my @changed = map { s/foo/bar/r } @original;
```
Unicode case-folding with fc for better string comparisons

Unicode and character encoding in general are complicated beasts. Perl has handled Unicode since v5.6 and has kept pace with fixes and support for updated standards in the intervening decades. If you need to test if two strings are equal regardless of case, use the fc function introduced in Perl v5.16.
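
The classic case where lc falls short is the German sharp s, as in this small sketch:

```perl
use v5.16;    # the v5.16 feature bundle includes fc

my $upper = 'STRASSE';
my $lower = "stra\N{U+DF}e";    # "strasse" spelled with a sharp s (U+00DF)

my $lc_result = lc($upper) eq lc($lower) ? 'equal' : 'different';
my $fc_result = fc($upper) eq fc($lower) ? 'equal' : 'different';

say "lc: $lc_result, fc: $fc_result";
```

Lowercasing leaves the sharp s alone, while case-folding expands it to “ss”, so only the fc comparison treats the two spellings as the same word.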

Safer processing of file arguments with <<>>

The <> null filehandle or “diamond operator” is often used in while loops to process input per line coming either from standard input (e.g., piped from another program) or from a list of files on the command line. Unfortunately, it uses a form of Perl’s open function that interprets special characters such as pipes (|) that would allow it to insecurely run external commands. Using the <<>> “double diamond” operator introduced in Perl v5.22 forces open to treat all command-line arguments as file names only. For older Perls, the perlop documentation recommends the ARGV::readonly CPAN module.

Safer loading of Perl libraries and modules from @INC

Perl v5.26 removed the ability for all programs to load modules by default from the current directory, closing a security vulnerability originally identified and fixed as CVE-2016-1238 in previous versions’ included scripts. If your code relied on this unsafe behavior, the v5.26 release notes include steps on how to adapt.

HTTP::Tiny simple HTTP/1.1 client included

To bootstrap access to CPAN on the web in the possible absence of external tools like curl or wget, Perl v5.14 began including the HTTP::Tiny module. You can also use it in your programs if you need a simple web client with no dependencies.

Test2: The next generation of Perl testing frameworks

Forked and refactored from the venerable Test::Builder (the basis for the Test::More library that many are familiar with), Test2 was included in the core module library beginning with Perl v5.26. I’ve experimented recently with using the Test2::Suite CPAN library instead of Test::More and it looks pretty good. I’m also intrigued by Test2::Harness’ support for threading, forking, and preloading modules to reduce test run times.

Task::Kensho: Where to start for recommended Perl modules

This last item may not be included when you install Perl, but it’s where I turn for a collection of well-regarded CPAN modules for accomplishing a wide variety of common tasks spanning from asynchronous programming to XML. Use it as a starting point or interactively select the mix of libraries appropriate to your project.

And there you have it: a selection of 34 features, enhancements, and improvements for the first 34 years of Perl. What’s your favorite? Did I miss anything? Let me know in the comments.
December 21, 2021
Video: “A Year of Being Wrong on the Internet”

I’m busy this week hosting my parents’ first visit to Houston, but I didn’t want to let this Tuesday go by without linking to my talk from last week’s Ephemeral Miniconf. Thanks so much to Thibault Duponchelle for organizing such a terrific event, to all the other speakers for coming together to present, and to everyone who attended for welcoming me.

November 23, 2021