A Less Smart Smartmatch

The smartmatch operator (~~) introduced in Perl 5.10 (and borrowed from Perl 6) has been the subject of much criticism. Its behaviour changes based on the types of its arguments (arrays vs hashes vs numbers vs strings vs …). perlop lists over twenty different behaviours based on different combinations of arguments. Although the operator normally does what you want, what people would want from certain combinations (%hash ~~ @arr anybody?) is not always clear.

(Aside: in Perl 6, which has a stronger type system, the behaviour of smartmatch is more predictable.)

For this reason, it has been proposed that the smartmatch operator be simplified, or perhaps even removed in a future version of Perl 5. To this end, Perl 5.18 has introduced some warnings about its experimental nature.

Some seem to believe the smartmatch is fairly unnecessary: if you’re matching against a number, use ==; if you’re matching against a string, use eq; if you’re matching against a regexp, use =~; why do we need an “all of the above” operator?

I think the strength of smartmatch comes when you don’t know what you’re matching against. For example, in one of my projects there is an option to provide a filter of which tables in a database to skip processing. To handle this option, I used the smartmatch. This allows the caller to provide a regexp matching tables to skip, or a coderef which should return true to skip a table, or an arrayref of table names to skip, or some combination of the above. And I don’t have to worry about it. I just document that they should pass a filter that will be used as the right-hand-side of a smartmatch against the table names.

So I like the idea of smartmatch. I don’t think it can be easily replaced with other Perl operators. And I don’t think it should be removed.

But I would welcome a simplified version. And because I don’t like to wait, I’ve released a simplified implementation of smartmatch to the CPAN. It’s called match::simple.

I didn’t want to play crazy parsing games, so I could not override the actual ~~ operator. Instead I used this pattern to implement my own fake infix operator. Here’s how it looks:

   use match::simple;
   
   if ($this |M| $that) {
      ...;
   }
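
For the curious, here is a minimal sketch of how a fake infix operator of this kind can be built with operator overloading. It is not match::simple's actual source; the package name My::FakeInfix and the constant EQ are made up for illustration. The trick is that `$x |EQ| $y` parses as `($x | EQ) | $y`, so an object with an overloaded `|` can capture the left operand on the first call and run a callback on the second.

```perl
use strict;
use warnings;

package My::FakeInfix;    # hypothetical name, for illustration only

sub new {
    my ($class, $code) = @_;
    return bless { code => $code }, $class;
}

sub _pipe {
    my ($self, $other, $swapped) = @_;

    if ($swapped) {
        # Called for "$other | $self": we are on the right-hand side,
        # so remember the left operand in a fresh object.
        return bless { code => $self->{code}, lhs => $other, has_lhs => 1 },
            ref $self;
    }

    # Called for "$self | $other": we already hold the left operand,
    # so hand both operands to the callback.
    die "Fake infix operator used without a left operand\n"
        unless $self->{has_lhs};
    return $self->{code}->($self->{lhs}, $other);
}

use overload '|' => \&_pipe;

package main;

# A constant takes no arguments, so the parser expects an operator after it.
use constant EQ => My::FakeInfix->new(sub { $_[0] eq $_[1] });

print "same\n" if 'abc' |EQ| 'abc';
```

Because the operator is really two ordinary `|` operations, it has the precedence of `|`, which is worth keeping in mind when combining it with other operators.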

Alternatively, if you don’t want the infix stuff:

   use match::simple qw(match);
   
   if (match($this, $that)) {
      ...;
   }

The rules for match::simple are a lot simpler than smartmatch. They are always determined by the type of the operand on the right-hand side, which can be:

  • undef — then the match is only successful if the left-hand side is also undef
  • a string — the operator acts like eq
  • a regexp — the operator acts like =~
  • a coderef — the operator passes the left-hand side to it as an argument
  • an object — the operator calls the MATCH method on the object, or calls overloaded ~~
  • an arrayref — the operator recurses to each element in the array, and succeeds if the left-hand side matches any of them

If the right-hand side is anything else (e.g. a filehandle, or a hashref) then it throws an exception.
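
To make the dispatch concrete, the rules above can be sketched in plain Perl roughly as follows. This is an illustration under my own assumptions, not the module's real implementation: the function name simple_match is hypothetical, the overloaded ~~ fallback for objects is left out, and the exact error messages are invented.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper illustrating the rules above; not match::simple's source.
sub simple_match {
    my ($lhs, $rhs) = @_;
    no warnings 'uninitialized';    # let an undef left-hand side compare quietly

    # undef: succeeds only if the left-hand side is undef too
    return !defined($lhs) if !defined($rhs);

    # plain string (any non-reference): behave like eq
    return $lhs eq $rhs if !ref($rhs);

    # regexp: behave like =~ (checked before objects, since qr// is blessed)
    return !!($lhs =~ $rhs) if ref($rhs) eq 'Regexp';

    # coderef: pass the left-hand side as the argument
    return !!$rhs->($lhs) if ref($rhs) eq 'CODE';

    # object: call its MATCH method (the overloaded ~~ fallback is omitted here)
    if (blessed($rhs)) {
        my $method = $rhs->can('MATCH')
            or die "Cannot match against an object without a MATCH method\n";
        return !!$rhs->$method($lhs);
    }

    # arrayref: recurse, succeeding if any element matches
    if (ref($rhs) eq 'ARRAY') {
        for my $item (@$rhs) {
            return 1 if simple_match($lhs, $item);
        }
        return !!0;
    }

    # anything else (hashref, filehandle, ...) is an error
    die 'Cannot match against a ' . ref($rhs) . " reference\n";
}

# Usage: a skip-list mixing a literal name and a regexp.
my $skip = [ 'sessions', qr/^tmp_/ ];
print simple_match('tmp_import', $skip) ? "skip\n" : "process\n";
print simple_match('users',      $skip) ? "skip\n" : "process\n";
```

Note that the regexp check has to come before the object check, because compiled regexps are themselves blessed references.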

Perhaps this is not everybody’s favourite combination of matches, but it works for me.

match::simple has been on CPAN for about 10 months and is already being used in a few projects. So why is this news?

Well, I’ve re-implemented it in XS. The result is match::simple::XS. match::simple itself remains a pure Perl implementation, but will automatically switch in the XS implementation if it detects that it is installed. This has resulted in a very fast implementation of matching; in some cases (such as matching a string within an arrayref of strings) faster than smartmatch itself.
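
The "use the XS version if it happens to be installed" approach is a common Perl idiom, and a rough sketch of it looks like this. The module name My::Matcher::XS and the helper _match_pp are invented for the example; I am not claiming this is how match::simple itself is written.

```perl
use strict;
use warnings;

# Pure-Perl fallback (hypothetical; stands in for the portable implementation).
sub _match_pp {
    my ($lhs, $rhs) = @_;
    return $lhs eq $rhs;
}

# Start with the fallback, then try to load an optional accelerated backend.
# My::Matcher::XS is a made-up module name, so on most systems the require
# fails quietly inside the eval and the pure-Perl version is kept.
my $impl = \&_match_pp;
if (eval { require My::Matcher::XS; 1 }) {
    $impl = \&My::Matcher::XS::match;
}

# Public entry point: callers never need to know which backend is active.
sub match { return $impl->(@_) }

print match('users', 'users') ? "match\n" : "no match\n";
```

The appeal of this arrangement is that the XS module stays an optional dependency: nothing breaks on machines without a C compiler.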