person reaching out to a robot

Last week the research laboratory startup OpenAI set the technology world ablaze with the debut of ChatGPT, a prototype conversational program or chatbot”. It uses a large language model tuned with machine learning techniques to provide answers on a vast variety of subjects drawn from books and the World Wide Web, including Reddit and Wikipedia. Many users and commentators wondered if its detailed and seemingly well-​reasoned responses could be used in place of human-​written content such as academic essays and explanations of unfamiliar topics. Others noticed that it authoritatively mixed in factually incorrect information that might slip past non-​experts, and wondered if that might be fixed like any other software bug.”

The fundamental problem is that an artificial intelligence” like ChatGPT is unconcerned with the outside consequences of its use. Unlike humans, it cannot hold its own life as a standard of value. It does not remain alive” through self-​sustaining and self-​generated action. It does not have to be any more or less rational than its programming to continue its existence, not that it cares” about that since it has all the life of an electrical switchboard.

AI can’t know to respect reality, reason, and rights because it has no existential connection to those concepts. It can only fake it, and it can fail without remorse or consequence at any point. In short, artificial intelligence” is a red herring. Let me know when we’re working on actual ethics. Tell me when you can teach a computer (or a human!) pride and shame and everything in between.

This week’s Perl and Raku Conference 2022 in Houston was packed with great presentations, and I humbly added to them with a five-​ish minute lightning talk on two of Perl’s more misunderstood functions: map and grep.

Sorry about the um”s and ah”s…

I’ve written much about list processing in Perl, and this talk was based on the following blog posts:

Overall I loved attending the conference, and it really invigorated my participation in the Perl community. Stay tuned as I resume regular posting!

Update for Raku

On Twitter I nudged prominent Raku hacker (and recovered Perl hacker) Elizabeth Mattijsen to write about the Raku programming language’s map and grep functionality. Check out her five-​part series on DEV.to.

IKEA BLÅHAJ shark toys

IKEA’s toy BLÅHAJ shark has become a beloved Internet icon over the past several years. I thought it might be cute to write a little Perl to get info about it and even display a cuddly picture right in the terminal where I’m running the code. Maybe this will give you some ideas for your own quick web clients. Of course, you could accomplish all of these things using a pipeline of individual command-​line utilities like curl, jq, and GNU coreutilsbase64. These examples focus on Perl as the glue, though.

Warning: dodgy API ahead

I haven’t found a publicly-​documented and ‑supported official API for querying IKEA product information but others have deconstructed the company’s web site AJAX requests so we can use that instead. The alternative would be to scrape the IKEA web site directly which, although possible, would be more tedious and prone to failure should their design change. An unofficial API is also unreliable but the simpler client code is easier to change should any errors surface.

Enter the Mojolicious

My original goal was to do this in a single line issued to the perl command, and luckily the Mojolicious framework’s ojo module is tailor-​made for such things. By adding a -Mojo switch to the perl command, you get over a dozen quick single-​character functions for spinning up a quick web application or, in our case, making and interpreting web requests without a lot of ceremony. Here’s the start of my one-​line request to the IKEA API for information on their BLÅHAJ product, using ojo’s g function to perform an HTTP GET and displaying the JSON from the response body to the terminal.

$ perl -Mojo -E 'say g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->body'

This currently returns over 2,400 lines of data, so after reading it over I’ll convert the response body JSON to a Perl data structure and dump only the main product information using ojo’s r function:

$ perl -Mojo -E 'say r g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}'
{
  "availability" => [],
  "breathTaking" => bless( do{\(my $o = 0)}, 'JSON::PP::Boolean' ),
  "colors" => [
    {
      "hex" => "0058a3",
      "id" => 10007,
      "name" => "blue"
    },
    {
      "hex" => "ffffff",
      "id" => 10156,
      "name" => "white"
    }
  ],
  "contextualImageUrl" => "https://www.ikea.com/us/en/images/products/blahaj-soft-toy-shark__0877371_pe633608_s5.jpg",
  "currencyCode" => "USD",
  "discount" => "",
  "features" => [],
  "gprDescription" => {
    "numberOfVariants" => 0,
    "variants" => []
  },
  "id" => 90373590,
  "itemMeasureReferenceText" => "39 \x{bc} \"",
  "itemNo" => 90373590,
  "itemNoGlobal" => 30373588,
  "itemType" => "ART",
  "lastChance" => $VAR1->{"breathTaking"},
  "mainImageAlt" => "BL\x{c5}HAJ Soft toy, shark, 39 \x{bc} \"",
  "mainImageUrl" => "https://www.ikea.com/us/en/images/products/blahaj-soft-toy-shark__0710175_pe727378_s5.jpg",
  "name" => "BL\x{c5}HAJ",
  "onlineSellable" => bless( do{\(my $o = 1)}, 'JSON::PP::Boolean' ),
  "pipUrl" => "https://www.ikea.com/us/en/p/blahaj-soft-toy-shark-90373590/",
  "price" => {
    "decimals" => 99,
    "isRegularCurrency" => $VAR1->{"breathTaking"},
    "prefix" => "\$",
    "separator" => ".",
    "suffix" => "",
    "wholeNumber" => 19
  },
  "priceNumeral" => "19.99",
  "quickFacts" => [],
  "tag" => "NONE",
  "typeName" => "Soft toy"
}

If I just want the price I can do:

$ perl -Mojo -E 'say g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}->@{qw(currencyCode priceNumeral)}'
USD19.99

That ->@{qw(currencyCode priceNumeral)} towards the end uses the postfix reference slicing syntax introduced experimentally in Perl v5.20 and made official in v5.24. If you’re using an older perl, you’d say:

$ perl -Mojo -E 'say @{g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}}{qw(currencyCode priceNumeral)}'
USD19.99

I prefer the former, though, because it’s easier to read left-to-right.

But I’m not in the United States! Where’s my native currency?

You can either replace the us/en” in the URL above or use the core I18N::LangTags::Detect module added in Perl v5.8.5 if you’re really determined to be portable across different users’ locales. This is really stretching the definition of one-​liner,” though.

$ LANG=de_DE.UTF-8 perl -Mojo -MI18N::LangTags::Detect -E 'my @lang = (split /-/, I18N::LangTags::Detect::detect)[1,0]; say g("https://sik.search.blue.cdtapps.com/" . join("/", @lang == 2 ? @lang : ("us", "en")) . "/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}->@{qw(currencyCode priceNumeral)}'
EUR27.99

Window dressing

It’s hard to envision cuddling a number, but luckily the product information returned above links to a JPEG file in the mainImageUrl key. My favorite terminal app iTerm2 can display images inline from either a file or Base64 encoded data, so adding an extra HTTP request and encoding from the core MIME::Base64 module yields:

$ perl -Mojo -MMIME::Base64 -E 'say "\c[]1337;File=inline=1;width=100%:", encode_base64(g(g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}{mainImageUrl})->body), "\cG"'

(You could just send the image URL to iTerm2’s bundled imgcat utility, but where’s the fun in that?)

$ imgcat --url `perl -Mojo -E 'print g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}{mainImageUrl}'`

But I don’t have iTerm2 or a Mac!

I got you. At the expense of a number of other dependencies, here’s a version that will work on any terminal that supports 256-​color mode with ANSI codes using Image::Term256Color from CPAN and a Unicode font with block characters. I’ll also use Term::ReadKey to size the image for the width of your window. (Again, this stretches the definition of one-​liner.”)

$ perl -Mojo -MImage::Term256Color -MTerm::ReadKey -E 'say for Image::Term256Color::convert(g(g("https://sik.search.blue.cdtapps.com/us/en/search-result-page", form => {types => "PRODUCT", q => "BLÅHAJ"})->json->{searchResultPage}{products}{main}{items}[0]{product}{mainImageUrl})->body, {scale_x => (GetTerminalSize)[0], utf8 => 1})'

I hate Mojolicious! Can’t you just use core modules?

Fine. Here’s retrieving the product price using HTTP::Tiny and the pure-​Perl JSON parser JSON::PP, which were added to core in version 5.14.

$ perl -MHTTP::Tiny -MJSON::PP -E 'say @{decode_json(HTTP::Tiny->new->get("https://sik.search.blue.cdtapps.com/us/en/search-result-page?types=PRODUCT&q=BLÅHAJ")->{content})->{searchResultPage}{products}{main}{items}[0]{product}}{qw(currencyCode priceNumeral)}'
USD19.99

Fetching and displaying a picture of the huggable shark using MIME::Base64 or Image::Term256Color as above is left as an exercise to the reader.

plate of eggs and hash browns

This month I started a new job at Alert Logic, a cybersecurity provider with Perl (among many other things) at its beating heart. I’ve been learning a lot, and part of the process has been understanding the APIs in the code base. To that end, I’ve been writing small test scripts to tease apart data structures, using Perl array-​processing, list-​processing, and hash- (i.e., associative array)-processing functions.

I’ve covered map, grep, and friends a couple times before. Most recently, I described using List::Util’s any function to check if a condition is true for any item in a list. In the simplest case, you can use it to check to see if a given value is in the list at all:

use feature 'say';
use List::Util 'any';
my @colors =
  qw(red orange yellow green blue indigo violet);
say 'matched' if any { /^red$/ } @colors;

However, if you’re going to be doing this a lot with arbitrary strings, Perl FAQ section 4 advises turning the array into the keys of a hash and then checking for membership there. For example, here’s a simple script to check if the colors input (either from the keyboard or from files passed as arguments) are in the rainbow:

#!/usr/bin/env perl

use v5.22; # introduced <<>> for safe opening of arguments
use warnings;
 
my %in_colors = map {$_ => 1}
  qw(red orange yellow green blue indigo violet);

while (<<>>) {
  chomp;
  say "$_ is in the rainbow" if $in_colors{$_};
}

List::Util has a bunch of functions for processing lists of pairs that I’ve found useful when pawing through hashes. pairgrep, for example, acts just like grep but instead assigns $a and $b to each key and value passed in and returns the resulting pairs that match. I’ve used it as a quick way to search for hash entries matching certain value conditions:

use List::Util 'pairgrep';
my %numbers = (zero => 0, one => 1, two => 2, three => 3);
my %odds = pairgrep {$b % 2} %numbers;

Sure, you could do this by invoking a mix of plain grep, keys, and a hash slice, but it’s noisier and more repetitive:

use v5.20; # for key/value hash slice 
my %odds = %numbers{grep {$numbers{$_} % 2} keys %numbers};

pairgreps compiled C‑based XS code can also be faster, as evidenced by this Benchmark script that works through a hash made of the Unix words file (479,828 entries on my machine):

#!/usr/bin/env perl

use v5.20;
use warnings;
use List::Util 'pairgrep';
use Benchmark 'cmpthese';

my (%words, $count);
open my $fh, '<', '/usr/share/dict/words'
  or die "can't open words: $!";
while (<$fh>) {
  chomp;
  $words{$_} = $count++;
}
close $fh;

cmpthese(100, {
  grep => sub {
    my %odds = %words{grep {$words{$_} % 2} keys %words};
  },
  pairgrep => sub {
    my %odds = pairgrep {$b % 2} %words;
  },
} );

Benchmark output:

           Rate     grep pairgrep
grep     1.47/s       --     -20%
pairgrep 1.84/s      25%       --

In general, I urge you to work through the Perl documentations tutorials on references, lists of lists, the data structures cookbook, and the FAQs on array and hash manipulation. Then dip into the various list-​processing modules (especially the included List::Util and CPAN’s List::SomeUtils) for ready-​made functions for common operations. You’ll find a wealth of techniques for creating, managing, and processing the data structures that your programs need.

arrow communication direction display

When I first started writing Perl in my early 20’s, I tended to follow a lot of the structured programming conventions I had learned in school through Pascal, especially the notion that every function has a single point of exit. For example:

sub double_even_number {
    # not using signatures, this is mid-1990's code
    my $number = shift;

    if (not $number % 2) {
        $number *= 2;
    }

    return $number; 
}

This could get pretty convoluted, especially if I was doing something like validating multiple arguments. And at the time I didn’t yet grok how to handle exceptions with eval and die, so I’d end up with code like:

sub print_postal_address {
    # too many arguments, I know
    my ($name, $street1, $street2, $city, $state, $zip) = @_;
    # also this notion of addresses is naive and US-centric

    my $error;

    if (!$name) {
        $error = 'no name';
    }
    else {
        print "$name\n";

        if (!$street1) {
            $error = 'no street';
        }
        else {
            print "$street1\n";

            if ($street2) {
                print "$street2\n";
            }

            if (!$city) {
                $error = 'no city';
            }
            else {
                print "$city, ";

                if (!$state) {
                    $error = 'no state';
                }
                else {
                    print "$state ";

                    if (!$zip) {
                        $error = 'no ZIP code';
                    }
                    else {
                        print "$zip\n";
                    }
                }
            }
        }
    }

    return $error;
}

What a mess. Want to count all those braces to make sure they’re balanced? This is sometimes called the arrow anti-​pattern, with the arrowhead(s) being the most nested statement. The default ProhibitDeepNests perlcritic policy is meant to keep you from doing that.

The way out (literally) is guard clauses: checking early if something is valid and bailing out quickly if not. The above example could be written:

sub print_postal_address {
    my ($name, $street1, $street2, $city, $state, $zip) = @_;

    if (!$name) {
        return 'no name';
    }
    if (!$street1) {
        return 'no street1';
    }
    if (!$city) {
        return 'no city';
    }
    if (!$state) {
        return 'no state';
    }
    if (!$zip) {
        return 'no zip';
    }

    print join "\n",
      $name,
      $street1,
      $street2 ? $street2 : (),
      "$city, $state $zip\n";

    return;
}

With Perl’s statement modifiers (sometimes called postfix controls) we can do even better:

    ...

    return 'no name'    if !$name;
    return 'no street1' if !$street1;
    return 'no city'    if !$city;
    return 'no state'   if !$state;
    return 'no zip'     if !$zip;

    ...

(Why if instead of unless? Because the latter can be confusing with double-​negatives.)

Guard clauses aren’t limited to the beginnings of functions or even exiting functions entirely. Often you’ll want to skip or even exit early conditions in a loop, like this example that processes files from standard input or the command line:

while (<>) {
    next if /^SKIP THIS LINE: /;
    last if /^END THINGS HERE$/;

    ...
}

Of course, if you are validating function arguments, you should consider using actual subroutine signatures if you have a Perl newer than v5.20 (released in 2014), or one of the other type validation solutions if not. Today I would write that postal function like this, using Type::Params for validation and named arguments:

use feature qw(say state); 
use Types::Standard 'Str';
use Type::Params 'compile_named';

sub print_postal_address {
    state $check = compile_named(
        name    => Str,
        street1 => Str,
        street2 => Str, {optional => 1},
        city    => Str,
        state   => Str,
        zip     => Str,
    );
    my $arg = $check->(@_);

    say join "\n",
      $arg->{name},
      $arg->{street1},
      $arg->{street2} ? $arg->{street2} : (),
      "$arg->{city}, $arg->{state} $arg->{zip}";

    return;
}

print_postal_address(
    name    => 'J. Random Hacker',
    street1 => '123 Any Street',
    city    => 'Somewhereville',
    state   => 'TX',
    zip     => 12345,
);

Note that was this part of a larger program, I’d wrap that print_postal_address call in a try block and catch exceptions such as those thrown by the code reference $check generated by compile_named. This highlights one concern of guard clauses and other return early” patterns: depending on how much has already occurred in your program, you may have to perform some resource cleanup either in a catch block or something like Syntax::Keyword::Try’s finally block if you need to tidy up after both success and failure.