It’s been years since I’ve had to hack on anything XML-related, but a recent project at work has me once again jumping into the waters of generating, parsing, and modifying this 90s-era document format. Most developers these days likely only know of it as part of the curiously-named XMLHttpRequest object in web browsers used to retrieve data in JSON format from servers, and as the “X” in AJAX. But here we are in 2021, and there are still plenty of APIs and documents using XML to get their work done.
In my particular case, the task is to update the API calls for a new version of Virtuozzo Automator. Its API is a bit unusual in that it doesn’t use HTTP, but rather relies on opening a TLS-encrypted socket to the server and exchanging documents delimited with a null character. The previous version of our code is in 1990s-sysadmin-style Perl, with manual blessing of objects and parsing the XML using regular expressions. I’ve decided to update it to use the Moo object system and a proper XML parser. But which parser and module to use?
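To make the null-delimited framing concrete, here is a minimal sketch that uses an in-memory filehandle as a stand-in for the TLS socket. The two packets are made-up sample data, not real Virtuozzo messages; the only technique shown is setting the input record separator to a null character so that each read returns one document.

```perl
use strict;
use warnings;

# Stand-in for the socket: an in-memory filehandle holding two
# null-terminated documents (sample data, not real API traffic).
my $stream = qq(<packet id="1"><ok/></packet>\0<packet id="2"><ok/></packet>\0);
open my $fh, '<', \$stream or die "cannot open in-memory handle: $!";

my @documents;
{
    local $/ = "\0";    # read up to each null character instead of each newline
    while ( my $doc = <$fh> ) {
        chomp $doc;     # chomp strips the current $/, i.e. the trailing null
        push @documents, $doc;
    }
}

print "$_\n" for @documents;
```

Sending works the same way in reverse: print the document to the handle followed by "\0".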
Selecting a parser
There are several generic XML modules for parsing and generating XML on CPAN, each with its own advantages and disadvantages. I’d like to say that I did a comprehensive survey of each of them, but this project is pressed for time (aren’t they all?) and I didn’t want to create too many extra dependencies in our Perl stack. Luckily, XML::LibXML is already available, I’ve had some previous experience with it, and it’s a good choice for performant standards-based XML parsing (using either DOM or SAX) and generation.
Given more time and leeway in adding dependencies, I might use something else. If the Virtuozzo API had an XML Schema or used SOAP, I would consider XML::Compile as I’ve had some success with that in other projects. But even that uses XML::LibXML under the hood, so I’d still be using that. Your mileage may vary.
Generating XML
Depending on the size and complexity of the XML documents to generate, you might choose to build them up node by node using XML::LibXML::Node and XML::LibXML::Element objects. Most of the messages I’m sending to Virtuozzo Automator are short and have easily-interpolated values, so I’m using here-document islands of XML inside my Perl code. This also has the advantage of being easily validated against the examples in the documentation.
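For comparison, here is a rough sketch of the node-by-node approach. The element names, attribute values, and text content are illustrative only and are not taken from the Virtuozzo documentation.

```perl
use strict;
use warnings;
use XML::LibXML;

# Element and attribute names below are illustrative only.
my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root = $doc->createElement('packet');
$root->setAttribute( version => '4.0.0' );
$root->setAttribute( id      => '1' );
$doc->setDocumentElement($root);

my $login = $doc->createElement('login');
$root->appendChild($login);

for my $pair ( [ name => 'admin' ], [ realm => 'R&D' ] ) {
    my ( $tag, $text ) = @$pair;
    my $child = $doc->createElement($tag);
    $child->appendTextNode($text);    # text is escaped on serialization
    $login->appendChild($child);
}

print $doc->toString(1);    # 1 = add indentation for readability
```

This is more verbose than a here-doc, but special characters such as the ampersand are escaped when the document is serialized, which string interpolation does not do.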
Where the interpolated values in the messages are a little complicated, I’m using this idiom inside the here-docs:
@{[ ... ]}
This allows me to put an arbitrary expression in the … part. The expression is evaluated in list context inside an anonymous array reference, which is then immediately dereferenced and interpolated into the string (multiple values are joined with a space by default). It’s a cheap and cheerful way to do minimal templating inside Perl strings without loading a full templating library; I’ve also had success using this technique when generating SQL for database queries.
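As a small illustration of the idiom, the sketch below interpolates a function call, a count, and a joined list into a here-doc. The hash, the list, and the element names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical data for illustration.
my %login = ( name => 'admin' );
my @ids   = ( 101, 102, 103 );

my $xml = <<"END_XML";
<login>
  <name>@{[ uc $login{name} ]}</name>
  <count>@{[ scalar @ids ]}</count>
  <ids>@{[ join ',', @ids ]}</ids>
</login>
END_XML

print $xml;
```

Note that nothing here escapes the interpolated values, so anything that might contain characters such as < or & needs escaping before it goes into the string.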
Parser as an object attribute
Rather than instantiate a new XML::LibXML in every method that needs to parse a document, I created a private attribute:
package Local::API::Virtozzo::Agent {
    use Moo;
    use XML::LibXML;
    use Types::Standard qw(InstanceOf);
    ...
    has _parser => (
        is      => 'ro',
        isa     => InstanceOf['XML::LibXML'],
        default => sub { XML::LibXML->new() },
    );

    sub foo {
        my $self = shift;
        my $send_doc = $self->_parser
            ->parse_string(<<"END_XML");
<foo/>
END_XML
        ...
    }
    ...
}
Boilerplate
XML documents can be verbose, with elements that rarely change from one document to the next. In the Virtuozzo API’s case, every document has a <packet> element containing a version attribute and an id attribute to match requests to responses. I wrote a simple function to wrap my documents in this element; it pulls the version from a constant and increments the id by one every time it’s called:
# "state" needs "use feature 'state';" (or "use v5.10;" and later)
sub _wrap_packet {
    state $send_id = 1;
    return qq(<packet version="$PACKET_VERSION" id=")
        . $send_id++ . '">' . shift() . '</packet>';
}
If I need to add more attributes to the <packet> element (for instance, namespaces for attributes in enclosed elements), I can always use XML::LibXML::Element::setAttribute after parsing the document string.
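Here is a hedged sketch of that wrap-then-modify flow. The version string, the priority attribute, and the namespace URI are placeholders invented for the example. For the namespace declaration it uses setNamespace with a false third argument, which declares the prefix without changing the element's own namespace.

```perl
use strict;
use warnings;
use feature 'state';
use XML::LibXML;

my $PACKET_VERSION = '4.0.0';    # placeholder value, not from the article

sub _wrap_packet {
    state $send_id = 1;
    return qq(<packet version="$PACKET_VERSION" id=")
        . $send_id++ . '">' . shift() . '</packet>';
}

my $doc    = XML::LibXML->new->parse_string( _wrap_packet('<foo/>') );
my $packet = $doc->documentElement;

# A plain attribute (hypothetical name and value).
$packet->setAttribute( priority => '0' );

# Declare a namespace prefix on <packet> without applying it to the element.
$packet->setNamespace( 'http://example.com/ns/types', 'ns3', 0 );

print $doc->toString, "\n";
```

Because the id counter lives in a state variable, a second call to _wrap_packet in the same process produces id="2".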
Rather than using brittle regular expressions to extract data from the response, I use the shared parser object from above and then the full power of XPath:
use English;
...
sub get_sampleID {
    my ($self, $sample_name) = @_;
    ...
    # used to separate documents
    local $INPUT_RECORD_SEPARATOR = "\0";

    # $self->_sock is the IO::Socket::SSL connection;
    # parse_string is a method of the shared parser object
    my $get_doc = $self->_parser->parse_string(
        $self->_sock->getline(),
    );
    my $sample_id = $get_doc->findvalue(
        qq(//ns3:id[following-sibling::ns3:name="$sample_name"]),
    );
    return $sample_id;
}
This way, even if more elements are introduced or the surrounding structure shifts, the XPath pattern will continue to find the right data, as long as each id element still comes before its name sibling.
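To show the lookup in isolation, here is a self-contained sketch run against a made-up response. The document structure and the namespace URI are assumptions for illustration; only the ns3:id and ns3:name element names come from the expression above. It registers the prefix explicitly through XML::LibXML::XPathContext rather than relying on the prefixes declared in the document.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Made-up response; the real API's structure and namespace URI will differ.
my $xml = <<'END_XML';
<packet xmlns:ns3="http://example.com/ns/types" version="4.0.0" id="7">
  <data>
    <sample>
      <ns3:id>abc-123</ns3:id>
      <ns3:description>an extra element between id and name</ns3:description>
      <ns3:name>basic</ns3:name>
    </sample>
    <sample>
      <ns3:id>def-456</ns3:id>
      <ns3:name>premium</ns3:name>
    </sample>
  </data>
</packet>
END_XML

my $doc = XML::LibXML->new->parse_string($xml);
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( ns3 => 'http://example.com/ns/types' );

my $sample_name = 'premium';
my $sample_id   = $xpc->findvalue(
    qq(//ns3:id[following-sibling::ns3:name="$sample_name"]),
);

print "$sample_id\n";    # prints "def-456"
```

The extra description element in the first sample does not disturb the match, since following-sibling looks at every later sibling and not only the adjacent one.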
Conclusion… so far
I’m only about halfway through updating these API calls, and I’ve left out some non-XML-related details such as setting up the TLS socket connection. Hopefully this article has given you a taste of what’s involved in XML processing these days. Please leave me a comment if you have any suggestions or questions.
5 thoughts on “Perl and XML in 2021: A few lessons learned”
Have you found this XML::LibXML Tutorial — http://grantm.github.io/perl-libxml-by-example/ ?
It includes a “sandbox” which you can use to try out XPath expressions against an XML document (including one you provide).
[…] Perl and XML in 2021: A few lessons learned […]
Using a Mojo::IOLoop::Client for the connection, a Mojo::DOM instance for the XML parser and a Mojo::Template and/or Mojo::DOM to generate the response would also be a nice stack for this problem. I’ve done very similar things for other projects.
I’m just signing in to the API and issuing commands. How would an event loop like Mojo::IOLoop help me?
The loop wasn’t my main focus in this case; the consistent and clean API over the spaces you’re considering was my point.