Perils of Plugins

This is a very old article. It has been imported from older blogging software, and the formatting, images, etc may have been lost. Some links may be broken. Some of the information may no longer be correct. Opinions expressed in this article may no longer be held.

Plugin-based architectures can be a bad idea.

Not always. In user-facing applications, where the list of installed and enabled plugins is clear, plugins are often a good thing. This article is concerned not with end-user-facing applications but with libraries: libraries that allow their functionality to be extended through plugins and, in particular, libraries that automatically detect and load all installed plugins.

Plugins aren’t always obviously plugins. In this article, I’m defining a plugin as a software module that adds additional functionality or modifies the externally observable behaviour of the existing functionality of the core piece of software. Call it a “plugin” or an “optional dependency” – it’s the same thing.

Here’s a simple hypothetical example:

package Postcode;

our $AUTHORITY = 'local:ALICE';
our $VERSION = '1.0';

use Modern::Perl;
use Carp qw( confess );
use Class::Load qw( try_load_class is_class_loaded );

use constant {
    IDX_COUNTRY     => 0,
    IDX_POSTAL_CODE => 1,
    NEXT_IDX        => 2,
};

sub new
{
    my ($class, $country, $postal_code) = @_;

    # $country should be an upper-case ISO 3166 alpha-2 code
    $country = uc $country;
    confess "$country not a valid country identifier"
        unless $country =~ /^[A-Z]{2}$/;

    # Only the base class delegates to a locale plugin, if one is installed
    unless ($class =~ /::[A-Z]{2}$/)
    {
        my $specific_class = join '::' => ($class, $country);
        try_load_class($specific_class);
        return $specific_class->new($country, $postal_code)
            if is_class_loaded($specific_class);
    }

    return bless [$country, $postal_code] => $class;
}

sub country
{
    my $self = shift;
    $self->[ $self->IDX_COUNTRY ];
}

sub postal_code
{
    my $self = shift;
    uc $self->[ $self->IDX_POSTAL_CODE ];
}

1;

Hopefully what the above code does should be immediately apparent. You can construct postcode objects like this:

my $beverley_hills = Postcode::->new(US => 90210);
my $buckingham_palace = Postcode::->new(GB => "SW1A 1AA");

If the modules Postcode::US or Postcode::GB are installed, then locale-specific objects will be constructed which may provide extra functionality like $beverley_hills->get_state; otherwise generic Postcode objects will be constructed. Here’s an example locale-specific plugin…

package Postcode::GB;

our $AUTHORITY = 'local:ALICE';
our $VERSION = '1.0';

use Modern::Perl;
use Carp qw( confess );
use base 'Postcode';

sub new
{
    my $self = shift->SUPER::new(@_);

    # Canonicalise whitespace: strip it all, then put a single
    # space before the final three characters
    $self->[ $self->IDX_POSTAL_CODE ] =~ s{\s}{}g;
    $self->[ $self->IDX_POSTAL_CODE ] =~ s{(^.+)(...)$}{$1 $2}g;

    return $self;
}

# XXX: this regexp doesn't cover some overseas territories
# (Falklands, Pitcairn, etc) and doesn't cover BFPO codes.
#
my $regexp = qr {
    ^
    ([A-Z]{1,2})
    ([0-9]{1,2} | [0-9]{1}[A-Z]{1})
    \s
    ([0-9]{1})
    ([A-Z]{2})
    $
}x;

# The accessor inherited from Postcode is postal_code
sub postcode_area     { shift->postal_code =~ $regexp and "$1" }
sub postcode_district { shift->postal_code =~ $regexp and "$1$2" }
sub postcode_sector   { shift->postal_code =~ $regexp and "$1$2 $3" }
sub postcode_unit     { shift->postal_code =~ $regexp and "$1$2 $3$4" }

1;

Perhaps some of those locale-specific modules will be distributed alongside the base Postcode distribution; others may be written by third parties interested in dealing with addresses in specific geographic regions. Sounds like a good plan – if Alice is maintaining the Postcode distribution, she might be happy to maintain the British and Irish modules, but have no interest in maintaining modules covering China or Vietnam. (Ireland would be especially easy to implement – it is, as far as I’m aware, the only European state to not use postal codes.) Somebody else might be happy to maintain those though.

But what’s wrong with this?

Bob decides to write an Address module that makes use of Postcode. Here are some of the methods:

package Address;

use Postcode;
...;

sub get_country_iso3166
{
    my $self = shift;
    return uc(...);
}

sub get_postcode
{
    my $self = shift;
    my $pc_str = ...;
    return Postcode::->new($self->get_country_iso3166, $pc_str);
}

sub get_state
{
    my $self = shift;
    my $state = ...;

    # For the USA, if the state is missing, can infer it
    # from postal code.
    #
    $state //= $self->get_postcode->get_state
        if $self->get_country_iso3166 eq 'US';

    return $state;
}

...;
1;

Everything looks fine, but what happens if the Address package gets run on a machine without Postcode::US installed? Then get_postcode returns a plain Postcode object, which has no get_state method, and the call dies at runtime.
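The failure is easy to reproduce in isolation. What follows is a minimal sketch, not the real Postcode module: Postcode::Generic is a hypothetical stand-in for a base class with no get_state method, which is what a caller ends up holding when no US plugin is installed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the plugin-less base class: it knows nothing
# about US states, so it has no get_state method.
package Postcode::Generic;

sub new
{
    my ($class, $country, $postal_code) = @_;
    return bless [uc $country, $postal_code] => $class;
}

package main;

my $pc = Postcode::Generic->new(US => 90210);

# Method calls are resolved at runtime, so nothing complains until
# this line is actually reached.
my $state = eval { $pc->get_state };
my $error = $@;

print "get_state failed: $error" unless defined $state;
```

Because the method is only looked up when the call runs, the problem stays invisible until the first US address with a missing state comes along.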

OK, so Bob should explicitly declare a dependency on Postcode::US, but there are any number of reasons why he might not, including:

Postcode::US might be distributed as part of the core Postcode distribution. (But the fact that it is now, doesn’t mean it won’t be split out into a separate distribution in the future.)
Bob might be generating his dependency list automatically using some script that scans his source code looking for package names, but Postcode::US isn’t mentioned explicitly in his code.
Bob might have simply assumed that because the get_state method is available on his own machine, it will always be available everywhere.

And this does happen in real life. Few were prepared when HTTPS support was split out from libwww-perl.

Another more subtle issue would be if the Address package relied on Postcode::->new(GB => $postcode) performing the whitespace canonicalisation found in Postcode::GB. This variety of problem is particularly hard to debug.
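To make the difference concrete, here is a small sketch that mirrors the two code paths as plain functions. The names generic_form and gb_form are made up for illustration; the substitutions in gb_form are the ones from the Postcode::GB constructor shown earlier.

```perl
use strict;
use warnings;

# What a generic Postcode object reports: the string upper-cased,
# otherwise exactly as it was given.
sub generic_form
{
    my ($postal_code) = @_;
    return uc $postal_code;
}

# What Postcode::GB reports: all whitespace stripped, then a single
# space inserted before the final three characters.
sub gb_form
{
    my ($postal_code) = @_;
    $postal_code = uc $postal_code;
    $postal_code =~ s{\s}{}g;
    $postal_code =~ s{(^.+)(...)$}{$1 $2};
    return $postal_code;
}

my $input = 'sw1a1aa';
printf "generic: '%s'\n", generic_form($input);   # generic: 'SW1A1AA'
printf "GB:      '%s'\n", gb_form($input);        # GB:      'SW1A 1AA'
```

Code that compares postal codes with eq, or uses them as hash keys, will therefore behave differently depending on whether the plugin happens to be installed.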

A related issue is that the Address package and the GeoLocator package might require different sets of plugins to be loaded. Address may be relying on Postcode::GB being installed, but GeoLocator may be relying on it not being installed.

OK, so now we understand the problems, what are the solutions?

One resolution is to eschew plugins entirely. However, plugin-based architectures do provide certain benefits, such as the convenience of splitting non-overlapping areas of functionality (locales in the Postcode example) between maintainers, or of making particularly high-cost pieces of functionality (high CPU or memory usage; lots of CPAN dependencies; etc) optional.

So let’s assume we want to be able to write pluggable software libraries. One simple solution is to stop automatically loading all plugins found on disk. Make your library’s caller explicitly load the plugins they need. Don’t do the try_load_class thing on your caller’s behalf.

package Address;

use Postcode;
use Postcode::AU;
use Postcode::GB;
use Postcode::US;
...;

You could even provide a little syntactic sugar:

package Address;

use Postcode -locales => [qw/au gb us/];
...;

The key part of providing this sugar is that the Postcode module must attempt to load the three plugins and croak (or at least carp) if any of them is not available. If it silently ignores missing locale plugins then it’s not doing its job.
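One way such an import method might look is sketched below. This is an assumption about how the sugar could be implemented, not code from any real distribution: the package name Postcode::Sketch is a hypothetical stand-in for Postcode, and the plugin naming scheme (base class plus upper-cased locale) is borrowed from the examples above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Postcode.
package Postcode::Sketch;

use Carp qw( croak );

# Perl calls this for: use Postcode::Sketch -locales => [qw/au gb us/];
sub import
{
    my ($class, %args) = @_;

    for my $locale (@{ $args{-locales} || [] })
    {
        croak "'$locale' is not a two-letter locale code"
            unless $locale =~ /^[A-Za-z]{2}$/;

        # e.g. 'gb' becomes Postcode::Sketch::GB, Postcode/Sketch/GB.pm
        my $plugin = join '::' => ($class, uc $locale);
        (my $file = "$plugin.pm") =~ s{::}{/}g;

        # A requested plugin that cannot be loaded is a hard error,
        # never something to skip silently.
        eval { require $file; 1 }
            or croak "Could not load locale plugin $plugin: $@";
    }

    return;
}

package main;

# No Postcode::Sketch::ZZ is installed, so asking for it must fail.
my $loaded = eval { Postcode::Sketch->import(-locales => ['zz']); 1 };
print $loaded ? "loaded\n" : "refused: $@";
```

Since use runs import at compile time, a missing plugin stops the calling program before any of its own code executes, which is exactly when you want to hear about it.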

This precaution solves many of the issues we’ve looked at, but not the issue where within the lifetime of a single process we want to sometimes use the Postcode::GB plugin, but at other times use plain old Postcode.

This is really a problem of global state – the decision whether to use the plugin comes down to the contents of the %INC hash, Perl’s global variable which tracks which modules have been loaded. If Postcode::GB is loaded, then it will always be used for British postcodes; if it’s not loaded, then it will never be used.
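A quick sketch shows how process-wide that hash is. The helper is_in_inc is a made-up name for illustration, and File::Basename is used only because it ships with Perl; any module would do.

```perl
use strict;
use warnings;

# %INC is keyed by file path rather than package name:
# Foo::Bar is recorded under 'Foo/Bar.pm'.
sub is_in_inc
{
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return exists $INC{$file} ? 1 : 0;
}

# Loading a module anywhere in the process...
require File::Basename;

# ...is visible to every other piece of code in that process.
print "File::Basename loaded: ", is_in_inc('File::Basename'), "\n";
```

There is one %INC per interpreter, so a module loaded by any one library is, from that moment on, loaded for every other library sharing the process.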

So the solution is to use local state. Determine the list of plugins in use on an object-by-object basis. Let’s see how this can be applied to Postcode:

package Postcode;

...;

sub new
{
    my ($class, $country, $postal_code) = @_;

    # $country should be an upper-case ISO 3166 alpha-2 code
    $country = uc $country;
    confess "$country not a valid country identifier"
        unless $country =~ /^[A-Z]{2}$/;

    return bless [$country, $postal_code] => $class;
}

sub new_using_plugin
{
    my ($class, $country, $postal_code) = @_;

    # $country should be an upper-case ISO 3166 alpha-2 code
    $country = uc $country;
    confess "$country not a valid country identifier"
        unless $country =~ /^[A-Z]{2}$/;

    confess "call new_using_plugin on the base 'Postcode' class"
        if $class =~ /::[A-Z]{2}$/;

    # Unlike new, a plugin that cannot be loaded is a fatal error here
    my $specific_class = join '::' => ($class, $country);
    try_load_class($specific_class);
    confess "Could not load class $specific_class"
        unless is_class_loaded($specific_class);
    return $specific_class->new($country, $postal_code);
}

...;
1;

With those two constructors, the caller is forced to make a choice between using the base Postcode class, or using the plugin (but croaking if the plugin is unavailable). If they really don’t care, then they can always do this:

my $beverley_hills =
    eval { Postcode::->new_using_plugin(US => 90210) }
    || do { Postcode::->new(US => 90210) };

For more complex cases where you wish to allow plugins on an object-by-object basis, consider writing the plugins as Moose roles and using MooseX::Traits to allow your caller to construct objects using a combination of those roles.

Plugins can be a very powerful tool, if used carefully. Avoid unpredictable loading; avoid global state.