man cutting tress using chainsaw

The Java world had an… inter­est­ing week­end when secu­ri­ty researchers revealed on December 9 a vul­ner­a­bil­i­ty in the pop­u­lar Apache Log4j 2 soft­ware library for record­ing and debug­ging events. Systems as diverse as Amazon Web Services, Apple iCloud, and the Minecraft video game could be exploit­ed to run arbi­trary code on a serv­er mere­ly by send­ing a specially-​crafted string of text. Information tech­nol­o­gy pro­fes­sion­als have been scram­bling ever since the ini­tial dis­clo­sure to patch, upgrade, recon­fig­ure, or oth­er­wise pro­tect affect­ed servers. It’s bad, and past unpatched vul­ner­a­bil­i­ties like this have been respon­si­ble for the expo­sure of mil­lions of people’s sen­si­tive data.

Many Perl appli­ca­tions use the similarly-​named and ‑designed Log::Log4perl library, and the good news is that as far as I can tell the lat­ter doesn’t suf­fer from the type of vul­ner­a­bil­i­ty described above. This doesn’t mean poorly-​written or ‑con­fig­ured Perl-​based sys­tems are immune to all exploits, just this par­tic­u­lar one. You should be safe to con­tin­ue using Log4perl unless some­one has delib­er­ate­ly con­fig­ured it oth­er­wise, and in fact, my work uses it extensively.

You might be sur­prised to read me sug­gest­ing a log­ging frame­work after writ­ing mul­ti­ple arti­cles espous­ing the Perl step debug­ger as an alter­na­tive. Log4perl devel­op­er Mike Schilli’s 2002 intro­duc­tion to the pack­age for Perl.com came down on the oppo­site side of the argu­ment. It can seem like one of those pro­gram­mer reli­gious issues like tabs vs. spaces, vim vs. Emacs, or Linux vs. Windows. (For the record, the cor­rect answers are spaces, BBEdit, and macOS. 😉)

But in this case, you can and should have the best of both worlds—logging at dif­fer­ent lev­els to appro­pri­ate des­ti­na­tions while still drop­ping into the inter­ac­tive debug­ger when you need to do some­thing trick­i­er like exam­ine pro­gram state or tweak a data struc­ture on the fly. I use both tech­niques and only empha­size the advo­ca­cy of step debug­ging because it’s under­stood less.

In March I wrote The Perl debug­ger can be your super­pow­er, intro­duc­ing the step debug­ger as a bet­ter way to debug your Perl code rather than lit­ter­ing your source with tem­po­rary print state­ments or log­ging. I use the debug­ger all the time, and I’ve real­ized that some more tech­niques are worth covering.

Although I men­tioned a caveat when debug­ging web appli­ca­tions, our apps at work all adhere to the Perl Web Server Gateway Interface (PSGI) spec­i­fi­ca­tion and thus we can use tools like Test::WWW::Mechanize::PSGI or Plack::Test to run tests and debug­ging ses­sions in the same Perl process. (Mojolicious users can use some­thing like Test::Mojo for the same effect.)

To demon­strate, let’s get start­ed with some­thing like this which tests that a giv­en route (/say-hello) returns a cer­tain JSON struc­ture ({"message": "Hello world!"}):

#!/usr/bin/env perl

use Test::Most;
use Test::WWW::Mechanize::PSGI;
use JSON::MaybeXS;
use Local::MyApp; # name of app's main module

my $mech = Test::WWW::Mechanize::PSGI->new(
    # a Dancer2 app, so to_app returns a PSGI coderef
    app => Local::MyApp->to_app(),
);
$mech->get_ok('/say-hello');
lives_and {
    my $json = decode_json($mech->content);
    cmp_deeply( $json, {message => 'Hello world!'} );
} 'message is Hello world!';

done_testing;

All very fine and well, but what hap­pens if that route starts return­ing a dif­fer­ent mes­sage or worse, invalid out­put that caus­es decode_json to fail? Eventually, you’ll rewrite the test in the script to out­put the offend­ing con­tent when some­thing goes wrong, but right now you want to suss out the root cause.

Debuggers have the con­cept of break­points, which are flags that tell the debug­ger to stop at a cer­tain line of code and wait for instruc­tions. We can set them while run­ning the debug­ger with the b com­mand or con­tin­ue to a one-​time break­point with the c com­mand, or we can insert them into the code our­selves before run­ning it through the debug­ger in the first place.

Add this line right after the lives_and { line:

$DB::single = 1;

This sim­u­lates hav­ing typed the s com­mand in the debug­ger at that line, stop­ping exe­cu­tion at that point. Run our test with per­l’s -d option, and then type c to con­tin­ue to that breakpoint:

$ perl -d -Ilib t/test_psgi.t

Loading DB routines from perl5db.pl version 1.60
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

[Local::MyApp:7170] core @2021-07-06 07:33:22> Built config from files: /Users/mgardner/Projects/blog/myapp/config.yml /Users/mgardner/Projects/blog/myapp/environments/development.yml in (eval 310)[/Users/mgardner/.plenv/versions/5.34.0/lib/perl5/site_perl/5.34.0/Sub/Quote.pm:3] l. 910
Test2::API::CODE(0x7ffabea39ee8)(/Users/mgardner/.plenv/versions/5.34.0/lib/perl5/site_perl/5.34.0/Test2/API.pm:71):
71:	    INIT { eval 'END { test2_set_is_end() }; 1' or die $@ }

  DB<1> c

[...]
ok 1 - GET /say-hello
main::CODE(0x7f8069caf2c8)(t/test_psgi.t:14):
15:	    my $json = decode_json($mech->content);

  DB<1> 

From here we can exam­ine vari­ables, set oth­er break­points, or even exe­cute arbi­trary lines of code. Let’s see what became of that HTTP GET request:

  DB<1> x $mech->content

0  '{"error":"Undefined subroutine &Local::MyApp::build_frog called at lib/Local/MyApp.pm line 11.\\n"}'

  DB<2> 

Aha, some­thing has returned some dif­fer­ent JSON indi­cat­ing an error. Let’s look at the lines around (1020) the offend­ing line (11):

  DB<2> f lib/Local/MyApp.pm

  DB<3> l 10-20

10:	        my $method = 'build_frog';
11:	        $method->();
12 	    }
13:	    catch ($e) {
14:	        send_as JSON => {error => $e};
15 	    }
16:	    send_as JSON => {message => 'Hello world!'};
17:	};
18
19 	sub build_frob {
20:	    return;

  DB<4>

Yep, a typo on line 11, and one that was­n’t caught at com­pile time since it’s gen­er­at­ed at runtime.

Just to be sure (and to demon­strate some oth­er cool debug­ger fea­tures), let’s set anoth­er break­point while in the debug­ger and then exer­cise that route again. Then we’ll check that $method vari­able against the list of avail­able meth­ods in the Local::MyApp package.

  DB<4> b 11

  DB<5> $mech->get('/say-hello')

[...]
Local::MyApp::CODE(0x7f8066f2db60)(lib/Local/MyApp.pm:11):
11:	        $method->();

  DB<<6>> x $method

0  'build_frog'

  DB<<7>> m Local::MyApp
any
app
body_parameters
build_frob
captures
config
content
[...]
  DB<<8>>

No doubt about it, that vari­able is being set incorrectly.

Quit out of the debug­ger with the q com­mand, make the fix (we prob­a­bly want errors to give some­thing oth­er than an HTTP 200 OK while we’re at it), and re-​run the test:

$ perl -Ilib t/test_psgi.t

[Local::MyApp:8277] core @2021-07-06 07:48:36> Built config from files: /Users/mgardner/Projects/blog/myapp/config.yml /Users/mgardner/Projects/blog/myapp/environments/development.yml in (eval 309) l. 910
Name "DB::single" used only once: possible typo at t/test_psgi.t line 13.
[...]
ok 1 - GET /say-hello
ok 2 - message is Hello world!
1..2

Note that warn­ing about leav­ing $DB::single in there. While harm­less, it’s a good reminder to remove such lines from your code so that they don’t sur­prise you or your team­mates dur­ing future debug­ging sessions.

And that’s it. Note that because we’re using PSGI, we were able to set break­points in our web app code itself and the debug­ger stopped there and enabled us to have a look around. And as you’ve seen, once you’re at a break­point you can switch to dif­fer­ent files, add/​remove more break­points, run arbi­trary code, and more. The perlde­bug doc­u­men­ta­tion page has all the details.

Happy debug­ging! For your ref­er­ence, here’s the full app mod­ule and test script used in this article:

MyApp.pm

package Local::MyApp;
use Dancer2;
use Feature::Compat::Try;

our $VERSION = '0.1';

get '/say-hello' => sub {
    try {
        no strict 'refs';
        my $method = 'build_frob';
        $method->();
    }
    catch ($e) {
        status 'error';
        send_as JSON => {error => $e};
    }
    send_as JSON => {message => 'Hello world!'};
};

sub build_frob {
    return;
}

true;

test_psgi.t

#!/usr/bin/env perl

use Test::Most;
use Test::WWW::Mechanize::PSGI;
use JSON::MaybeXS;
use Local::MyApp; # name of your app's main module goes here

my $mech = Test::WWW::Mechanize::PSGI->new(
    # a Dancer2 app, so to_app returns a PSGI coderef
    app => Local::MyApp->to_app(),
);
$mech->get_ok('/say-hello');
lives_and {
    my $json = decode_json($mech->content);
    cmp_deeply( $json, {message => 'Hello world!'} );
} 'message is Hello world!';

done_testing;

brave doctor in flying superhero cape with fist stretched

We’ve all been there: Deep in the bow­els of a pro­gram, try­ing to fig­ure out why some­thing’s not work­ing right. 

Maybe a vari­able isn’t get­ting the right val­ue. Maybe a func­tion or method isn’t get­ting called or has the wrong argu­ments. Maybe we don’t even know if a line of code is being reached.

So what do we do? We lit­ter our code with print state­ments or log­ging. Got val­ue.” Calling foo with x.” Reached line 42.”

And then we have to make sure to clean it all out, lest these infor­mal clues make their way to pro­duc­tion and the eyes of users.

There’s a better way

What if I told you that you could freeze time in the mid­dle of your pro­gram and exam­ine all these details while they hap­pen, or even before they hap­pen? That you could be alert­ed when vari­ables or expres­sions changed and imme­di­ate­ly dive in to see the cause? That you could even run explorato­ry test code and play what if” sce­nar­ios?

All of this is pos­si­ble with a debug­ger, and Perl has one built-​in. Run your pro­gram with perl -d, and after a slight delay you’re pre­sent­ed with a new com­mand line prompt:

% perl -d hello.pl

Loading DB routines from perl5db.pl version 1.57
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(hello.pl:7):	say 'Hello world!';
  DB<1> 

Before you go any fur­ther, you might want to know how to exit the debug­ger. Just type q and then press Enter and you’ll be back at your nor­mal com­mand line.

Perl also comes with a good (if incom­plete) tuto­r­i­al on using the debug­ger, and I sug­gest you work through that to famil­iar­ize your­self. It will tell you how to get help, list and view vari­ables and lines of code, set break­points, and step through line by line.

And then there’s the full man­u­al, where you can learn how to do things like get­ting a stack back­trace of all calls lead­ing up to the cur­rent line.

My favorite debugging trick

By far my favorite thing to do with the debug­ger is to run it against test scripts using the prove com­mand. This takes advan­tage of proves abil­i­ty to run an arbi­trary inter­preter through its test har­ness. Run it like this:

% prove --verbose --lib --exec 'perl -Ilib -d' t/foo.t

…and you’ll be able to step through the test script at t/foo.t, load­ing any nec­es­sary mod­ules from the lib direc­to­ry, get­ting ver­bose test out­put along the way.

This is extreme­ly help­ful when paired with mod­ules like Test::Mojo, which can load Mojolicious web appli­ca­tion class­es and then exer­cise them using its own user agent. (You can do sim­i­lar things with oth­er PSGI appli­ca­tions using Plack::Test.)

A caveat

If you’re debug­ging and test­ing your appli­ca­tions strict­ly through net­work requests to anoth­er process, such as through Test::WWW::Mechanize, run­ning the debug­ger will only show you the client-​side of things and not allow you to step through exe­cu­tion of the serv­er side. This is espe­cial­ly preva­lent among lega­cy apps not using the PSGI pro­to­col. If you’re using Apache mod_​perl have a look at Apache::DB, which enables you to run the Apache httpd web serv­er with the inter­ac­tive debugger.

So what’s your superpower?

A debug­ger is a com­mon fea­ture in many lan­guages and devel­op­ment envi­ron­ments, and Perl has had one for decades. Yet I’m con­stant­ly sur­prised that more devel­op­ers don’t know about or don’t use this pow­er­ful tool that can cut through the add prints/​run/​stop/​add more prints/​run” cycle.

How can you inte­grate the Perl debug­ger into your pro­gram­ming work­flow? Leave me a mes­sage in the com­ments and let me know.

beetle on green leaf

I spent this week debug­ging an issue with an inter­nal web tool that our company’s sup­port team relies upon to pull up infor­ma­tion about our cus­tomers, mak­ing changes on their behalf to their sub­scrip­tions of the var­i­ous pack­ages and ser­vices that we offer. Trying to view one par­tic­u­lar cus­tomer — one! — would always crash the appli­ca­tion, leav­ing an unhelp­ful mes­sage that said, Internal Server Error: Please con­tact the serv­er admin­is­tra­tor. More infor­ma­tion about this error may be avail­able in the serv­er error log.” Unfortunately, there was only a sin­gle line in the log, Error 500,” which only indi­cat­ed that yes, an error had hap­pened on the serv­er. No details from the application.

Luckily, this appli­ca­tion was writ­ten in Perl, an expres­sive pro­gram­ming lan­guage with a rich ecosys­tem of open-​source libraries. It also has a built-​in debug­ger mode that can run your pro­gram step by step, line by line. Any pro­gram that can be run from the text com­mand line can be paused, have its vari­ables and objects exam­ined, new code inter­ac­tive­ly entered, and then con­tin­ue its exe­cu­tion as if noth­ing had happened.

However, this was a web appli­ca­tion that assumed it was run­ning in a web serv­er envi­ron­ment, and the customer’s infor­ma­tion was in our pro­duc­tion data­base, safe from pry­ing eyes (includ­ing curi­ous devel­op­ers like me) due to finan­cial com­pli­ance rules. I could not sim­ply run this pro­gram on my desk­top and repro­duce the prob­lem with this one cus­tomer — I had to some­how tease out more infor­ma­tion from a run­ning sys­tem and report it back using the only tool avail­able: the serv­er error log men­tioned above.

But still, the Perl debug­ger approach was appeal­ing. Could I some­how get the appli­ca­tion to log each line of code as it was exe­cut­ed? Could I then see what was run­ning the moment before it crashed, the offend­ing line print­ed in the log like a smok­ing gun that had just mur­dered its vic­tim? And assum­ing that the prob­lem was in our code and not in the mil­lions of lines of third-​party code it depend­ed upon, could I fil­ter out all the noise?

The answer, thank­ful­ly, was yes; since the debug­ger itself is writ­ten in Perl and designed to be extend­ed or replaced, I could add code at the begin­ning of our appli­ca­tion that inter­cept­ed each line as it was run, throw out any­thing that came from a file out­side of our application’s direc­to­ry fold­er, and then report the rest (along with help­ful line num­bers) to the error log. Then turn on the debug” switch on the web serv­er run­ning the appli­ca­tion, and voilà, the log would duti­ful­ly fill up with (slow­er, more memory-​consuming) code report­ed by the debugger.

We set up our stag­ing serv­er to use the branch of code with debug­ging enabled, and then instruct­ed the appli­ca­tion to dis­play the prob­lem­at­ic cus­tomer’s records. As expect­ed, the error log imme­di­ate­ly began fill­ing up with line after line of our application’s code and then bang, crashed right after issu­ing a par­tic­u­lar data­base query for ser­vices tied to the account. I had my smok­ing gun! After extract­ing the query and run­ning it on a redact­ed copy of our data­base, I found that it was return­ing some 1.9 mil­lion rows of data as it retrieved pro­vi­sion­ing, billing, and renew­al his­to­ry for every ser­vice owned by the cus­tomer. This was far too many than nec­es­sary — the appli­ca­tion only cares about cur­rent sta­tus, and it was run­ning out of mem­o­ry as it cre­at­ed ser­vice objects for each row.

The data­base expert on my team was able to adjust the query to return only cur­rent infor­ma­tion, and after a quick test on the redact­ed data­base, the change is now wait­ing for qual­i­ty assur­ance test­ing before launch­ing to our pro­duc­tion servers. The debug­ging code branch will be saved until it’s need­ed again, and our team was once again grate­ful that we were work­ing in such a pow­er­ful pro­gram­ming lan­guage as Perl.