Perl is a reasonable scripting language (as are others, so shh!). It has always had strong regular expression support; those regular expressions can also be used to do substitutions, such as:
my $pet = "I have a dog";
$pet =~ s/dog/cat/;
Neat enough. But let's say I want to look up the parts of the “s///” that define my search text and my replacement text. Easy enough:
my $pet = "I have a dog";
my $search = "dog";
my $replace = "cat";
$pet =~ s/$search/$replace/;
But let's make our substitution a little more complex – I want to match a URL and have the host and port lower-cased, but leave the path in whatever case it arrives in, and I don't want to be entering Perl expressions as the replacement text! Let's try:
my $url = 'http://www.FOo.COm/wibbLE';
my $search = '^([^:]+://[^/]+)/?(.*)?$';
my $replace = '\L$1\E/$2';
print $url if $url =~ s/$search/$replace/;
Sadly, while $search matches, the replacement comes out as the literal string “\L$1\E/$2”. It appears that we need “/e” mode (evaluate as expression) to evaluate this replacement string. Of course, when we're doing eval, we want to ensure we don't have malicious content in $replace, such as “unlink()” and friends that could do Bad Things. So my solution was to escape double quotes in my $replace string, wrap it all in double quotes, and pass in “/ee”, where the first “e” turns $replace into its value and the second evaluates that value as a quoted string:
my $search = '^([^:]+://[^/]+(:\d+)?)/?(.*)?$'; # From database
my $replace = '\L$1\E/$3'; # From database
$replace =~ s/"/\\"/g; # Protection from embedded code
$replace = '"' . $replace . '"'; # Put in a string for /ee
print $url if $url =~ s/$search/$replace/ee;
This will give us:
- Port and host name lower case
- Hostname (and port) will always have a slash after it
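- Bad code like unlink() written as plain text in $replace won't be run, because it ends up inside a quoted string. (Interpolation constructs such as @{[ ... ]} are still evaluated inside double quotes, so $replace should still come from a source you trust.)
- The expression that we initially set/fetch/got in $replace is just a vanilla replacement term, not arbitrary Perl code.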
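To try the whole thing end to end, here is a minimal sketch that wraps the quoting and the “/ee” substitution in one subroutine. The name apply_template and the sample inputs are my own, purely for illustration; the search and replace strings are the ones used above.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the original post):
# apply a search pattern and a replacement template, both plain strings
# (for example loaded from a database), to $text and return the result.
sub apply_template {
    my ($text, $search, $replace) = @_;

    # Escape embedded double quotes, then wrap the template in double
    # quotes so the second /e sees a string literal.
    $replace =~ s/"/\\"/g;
    $replace = '"' . $replace . '"';

    # First /e evaluates the expression $replace to its value (the quoted
    # template); second /e evals that value, which interpolates the
    # capture variables and applies \L...\E.
    (my $result = $text) =~ s/$search/$replace/ee;
    return $result;
}

my $search  = '^([^:]+://[^/]+(:\d+)?)/?(.*)?$';
my $replace = '\L$1\E/$3';

# Sample inputs are illustrative only.
my $with_path = apply_template('http://www.FOo.COm/wibbLE', $search, $replace);
my $no_path   = apply_template('HTTP://WWW.FOO.COM:8080',   $search, $replace);
my $inert     = apply_template('a dog', 'dog', 'unlink("x") cat');

print "$with_path\n";   # host lower-cased, path case kept
print "$no_path\n";     # slash added after host and port
print "$inert\n";       # the unlink text is inserted, not executed
```

Keeping the quoting inside the subroutine means the stored template is never modified, so the same $replace can be reused for any number of URLs.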