This week my main task for this sprint was canceled. While not as momentous as the cancellation of an entire project (I’ve been there too), deleting the past week’s work still stung. This isn’t the first time, though, so I know that there are a few things to keep in mind:

You didn’t waste your time

Bottom line: Were you paid for your work? Then your employer still sees it as valuable, if only to make sure that a given line of development was sufficiently explored before determining it wasn’t worth continuing. Developing a product or service often means saying “no” to things, and sometimes that means cutting losses before the sunk cost fallacy takes hold.

You probably learned something

Over the past week’s work, I learned about managing TLS connections (including supporting ciphers that are no longer considered secure), parameter validation, and XML namespace support in XPath. You probably learned a lot more if your project extended longer, and you can use that knowledge further on in your career. Put it on your résumé or CV, and you may get an opportunity to work on the same things in the future.

You could continue if you want

Okay, maybe you’re not going to sneak into the office for months to finish things. But as long as you have the time and inclination, you could continue to work on your project, especially if you think it could be valuable to the company later on. Consider this carefully, though: you don’t want off-the-books work taking time and energy away from your main job.

There’s no shame

Lastly, you shouldn’t feel ashamed about being part of a canceled project. They happen all the time, and probably should happen more—history is littered with failed software projects that likely could have cost less if only their problems were recognized earlier. By its nature, software development is exploratory and difficult, and not every idea pans out. As long as you can find something new to work on, you’ll be fine.


It’s been years since I’ve had to hack on anything XML-related, but a recent project at work has me once again jumping into the waters of generating, parsing, and modifying this 90s-era document format. Most developers these days likely only know of it as part of the curiously named XMLHttpRequest object in web browsers, used to retrieve data in JSON format from servers, and as the “X” in AJAX. But here we are in 2021, and there are still plenty of APIs and documents using XML to get their work done.

In my particular case, the task is to update the API calls for a new version of Virtuozzo Automator. Its API is a bit unusual in that it doesn’t use HTTP, but rather relies on opening a TLS-encrypted socket to the server and exchanging documents delimited with a null character. The previous version of our code is in 1990s-sysadmin-style Perl, with manual blessing of objects and parsing the XML using regular expressions. I’ve decided to update it to use the Moo object system and a proper XML parser. But which parser and module to use?

Selecting a parser

There are several generic XML modules for parsing and generating XML on CPAN, each with its own advantages and disadvantages. I’d like to say that I did a comprehensive survey of each of them, but this project is pressed for time (aren’t they all?) and I didn’t want to create too many extra dependencies in our Perl stack. Luckily, XML::LibXML is already available, I’ve had some previous experience with it, and it’s a good choice for performant standards-based XML parsing (using either DOM or SAX) and generation.

Given more time and leeway in adding dependencies, I might use something else. If the Virtuozzo API had an XML Schema or used SOAP, I would consider XML::Compile as I’ve had some success with that in other projects. But even that uses XML::LibXML under the hood, so I’d still be using that. Your mileage may vary.

Generating XML

Depending on the size and complexity of the XML documents to generate, you might choose to build them up node by node using XML::LibXML::Node and XML::LibXML::Element objects. Most of the messages I’m sending to Virtuozzo Automator are short and have easily interpolated values, so I’m using here-document islands of XML inside my Perl code. This also has the advantage of being easily validated against the examples in the documentation.
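For comparison, here is a minimal sketch of what the node-by-node approach might look like. The element name, attribute value, and text are made up for illustration and are not taken from the Virtuozzo documentation.

```perl
use strict;
use warnings;
use XML::LibXML;

# Create an empty document and a root element.
my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root = $doc->createElement('packet');
$root->setAttribute( version => '1.0' );    # hypothetical version value
$doc->setDocumentElement($root);

# Add a child element with text content (hypothetical element name).
my $name = $doc->createElement('name');
$name->appendText('a <sample> & more');     # escaped when serialized
$root->appendChild($name);

# Serialize with indentation.
print $doc->toString(1);
```

The main benefit over string interpolation is that special characters in text and attribute values are escaped for you when the document is serialized.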

Where the interpolated values in the messages are a little complicated, I’m using this idiom inside the here-docs:

@{[ ... ]}

This allows me to put an arbitrary expression in the … part, which is then put into an anonymous array reference, which is then immediately dereferenced into its string result. It’s a cheap and cheerful way to do minimal templating inside Perl strings without loading a full templating library; I’ve also had success using this technique when generating SQL for database queries.
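A small sketch of the idiom in a here-doc follows. The hash, the array, and the element names are hypothetical and exist only to show the interpolation.

```perl
use strict;
use warnings;

# Hypothetical values, for illustration only.
my %limits = ( cpu => 2, ram_mb => 512 );
my @tags   = qw(web staging);

# Each @{[ ... ]} evaluates its expression in list context,
# wraps the result in an anonymous array, and interpolates it.
my $xml = <<"END_XML";
<sample>
  <cpu>@{[ $limits{cpu} * 1000 ]}</cpu>
  <ram>@{[ $limits{ram_mb} ]}</ram>
  <tags>@{[ join ',', @tags ]}</tags>
</sample>
END_XML

print $xml;
```

Note that nothing here escapes XML special characters, so this only suits values you control or have already validated.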

Parser as an object attribute

Rather than instantiate a new XML::LibXML in every method that needs to parse a document, I created a private attribute:

package Local::API::Virtozzo::Agent {
    use Moo;
    use XML::LibXML;
    use Types::Standard qw(InstanceOf);
    ...
    has _parser => (
        is      => 'ro',
        isa     => InstanceOf['XML::LibXML'],
        default => sub { XML::LibXML->new() },
    );
    sub foo {
        my $self = shift;
        my $send_doc = $self->_parser
          ->parse_string(<<"END_XML");
            <foo/>
END_XML
        ...
    }
...
}

Boilerplate

XML documents can be verbose, with elements that rarely change from one document to the next. In the Virtuozzo API’s case, every document has a <packet> element containing a version attribute and an id attribute to match requests to responses. I wrote a simple function to wrap my documents in this element; it pulls the version from a constant and increments the id on every call:

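use feature 'state';    # enables state variables (Perl 5.10+)
...
sub _wrap_packet {
    # persists between calls, so each packet gets the next id
    state $send_id = 1;
    return qq(<packet version="$PACKET_VERSION" id=")
      . $send_id++ . '">' . shift . '</packet>';
}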

If I need to add more attributes to the <packet> element (for instance, namespaces for attributes in enclosed elements), I can always use XML::LibXML::Element::setAttribute after parsing the document string.
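Here is a rough, self-contained sketch of wrapping, parsing, and then adding an attribute. The version value and the priority attribute are placeholders I made up for the example, not part of the Virtuozzo API.

```perl
use strict;
use warnings;
use feature 'state';
use XML::LibXML;

my $PACKET_VERSION = '1.0';    # hypothetical value

sub _wrap_packet {
    state $send_id = 1;        # keeps its value between calls
    return qq(<packet version="$PACKET_VERSION" id=")
      . $send_id++ . '">' . shift . '</packet>';
}

# Wrap a body, parse the result, then modify the root element.
my $doc    = XML::LibXML->new->parse_string( _wrap_packet('<foo/>') );
my $packet = $doc->documentElement;
$packet->setAttribute( priority => '0' );    # hypothetical attribute

print $doc->toString, "\n";
print _wrap_packet('<bar/>'), "\n";          # second call gets the next id
```

Because the counter is a state variable, it is shared by every caller in the process, which is fine for a single connection but worth revisiting if several connections need independent id sequences.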

Parsing responses with XPath

Rather than using brittle regular expressions to extract data from the response, I use the shared parser object from above and then the full power of XPath:

use English;
...
sub get_sampleID {
    my ($self, $sample_name) = @_;
    ...
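    # documents on the socket are separated by null characters
    local $INPUT_RECORD_SEPARATOR = "\0";
    # $self->_sock is the IO::Socket::SSL connection
    my $response = $self->_sock->getline();
    chomp $response;    # strip the trailing null delimiter
    my $get_doc = $self->_parser->parse_string($response);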
    my $sample_id = $get_doc->findvalue(
        qq(//ns3:id[following-sibling::ns3:name="$sample_name"]),
    );
    return $sample_id;
}

This way, even if more elements are introduced or the surrounding structure changes, the XPath expression will continue to find the right data, as long as each id element still precedes its sibling name element.
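To try this out without a server, here is a sketch that reads null-delimited documents from an in-memory filehandle standing in for the socket. The element names and the namespace URI are invented. It also swaps in XML::LibXML::XPathContext with a locally registered prefix, a variation on the code above, so the query does not depend on whichever prefix the response happens to use.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Two null-terminated documents standing in for the socket stream.
# Element names and the namespace URI are hypothetical.
my $stream = join "\0",
    '<ns3:list xmlns:ns3="urn:example:ns3">'
  . '<ns3:item><ns3:id>42</ns3:id><ns3:name>alpha</ns3:name></ns3:item>'
  . '<ns3:item><ns3:id>43</ns3:id><ns3:name>beta</ns3:name></ns3:item>'
  . '</ns3:list>',
  '<ok/>', '';

open my $fh, '<', \$stream or die "Cannot open in-memory handle: $!";

my @ids;
{
    local $/ = "\0";    # $INPUT_RECORD_SEPARATOR under "use English"
    my $parser = XML::LibXML->new;
    while ( defined( my $record = <$fh> ) ) {
        chomp $record;  # removes the trailing "\0"
        my $doc = $parser->parse_string($record);

        # Bind our own prefix "v" to the namespace URI.
        my $xpc = XML::LibXML::XPathContext->new($doc);
        $xpc->registerNs( v => 'urn:example:ns3' );
        push @ids,
          $xpc->findvalue('//v:id[following-sibling::v:name="beta"]');
    }
}

print "id for beta: $ids[0]\n";
```

The second document has no matching element, so findvalue returns an empty string for it rather than dying, which is something to check for before trusting the result.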

Conclusion… so far

I’m only about halfway through updating these API calls, and I’ve left out some non-XML-related details such as setting up the TLS socket connection. Hopefully this article has given you a taste of what’s involved in XML processing these days. Please leave me a comment if you have any suggestions or questions.

This proposal from Dan Book seems reasonable to me. A version 7 feature bundle that renders signatures non-experimental; removes the indirect, multidimensional, and bareword filehandle features; and enables warnings and utf8 by default? Sure. More importantly, it would increment the major version every time a new feature becomes stable.

Unfortunately we’re still on Perl 5.16.3 at work, and it would be a big push to make sure our codebase is compatible with a newer version, much less adopt new features. But I’m willing to bet a reasonable release of version 7 might be just the push we need.

On the heels of my blog article and upcoming presentation comes Paul Evans’ call to de-experimentalize (is that a word?) subroutine signatures in Perl core. The feature has been stable for over four years now, and the “experimental” tag has been holding back developers big and small, so I fully support this effort. Maybe it can make it into Perl 5.34? Here’s hoping.