Hidden features of Perl?
Problem Overview
What are some really useful but esoteric language features in Perl that you've actually been able to employ to do useful work?
Guidelines:
- Try to limit answers to the Perl core and not CPAN
- Please give an example and a short description
##Hidden Features also found in other languages' Hidden Features:##
(These are all from Corion's answer)
- C
- Duff's Device
- Portability and Standardness
- C#
- Quotes for whitespace delimited lists and strings
- Aliasable namespaces
- Java
- Static Initalizers
- JavaScript
- Functions are First Class citizens
- Block scope and closure
- Calling methods and accessors indirectly through a variable
- Ruby
- Defining methods through code
- PHP
- Pervasive online documentation
- Magic methods
- Symbolic references
- Python
- One line value swapping
- Ability to replace even core functions with your own functionality
##Other Hidden Features:##
Operators:
- The bool quasi-operator
- The flip-flop operator
- Also used for list construction
- The ++ and unary - operators work on strings
- The repetition operator
- The spaceship operator
- The || operator (and // operator) to select from a set of choices
- The diamond operator
- Special cases of the m// operator
- The tilde-tilde "operator"
Quoting constructs:
Syntax and Names:
- There can be a space after a sigil
- You can give subs numeric names with symbolic references
- Legal trailing commas
- Grouped Integer Literals
- hash slices
- Populating keys of a hash from an array
Modules, Pragmas, and command-line options:
- use strict and use warnings
- Taint checking
- Esoteric use of -n and -p
- CPAN
- overload::constant
- IO::Handle module
- Safe compartments
- Attributes
Variables:
Loops and flow control:
Regular expressions:
Other features:
- The debugger
- Special code blocks such as BEGIN, CHECK, and END
- The DATA block
- New Block Operations
- Source Filters
- Signal Hooks
- map (twice)
- Wrapping built-in functions
- The eof function
- The dbmopen function
- Turning warnings into errors
Other tricks, and meta-answers:
See Also:
Perl Solutions
Solution 1 - Perl
The flip-flop operator is useful for skipping the first iteration when looping through the records (usually lines) returned by a file handle, without using a flag variable:
while(<$fh>)
{
next if 1..1; # skip first record
...
}
Run perldoc perlop and search for "flip-flop" for more information and examples.
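The same operator also works with patterns instead of line numbers. Here is a minimal sketch, assuming some made-up in-memory lines with BEGIN and END markers (the data and marker names are illustrative, not from the answer above):

```perl
use strict;
use warnings;

# Hypothetical sample data; BEGIN/END are just illustrative markers.
my @lines = ('intro', 'BEGIN', 'alpha', 'beta', 'END', 'outro');

my @block;
for (@lines) {
    # In scalar (boolean) context, .. is the flip-flop: it turns true when
    # the left pattern matches and stays true until the right one matches.
    push @block, $_ if /^BEGIN$/ .. /^END$/;
}

print join(',', @block), "\n";    # BEGIN,alpha,beta,END
```

Both marker lines are included in the result; use ... (three dots) if the right-hand test should not be tried on the same line that turned the flip-flop on.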
Solution 2 - Perl
There are many non-obvious features in Perl.
For example, did you know that there can be a space after a sigil?
$ perl -wle 'my $x = 3; print $ x'
3
Or that you can give subs numeric names if you use symbolic references?
$ perl -lwe '*4 = sub { print "yes" }; 4->()'
yes
There's also the "bool" quasi-operator, which returns 1 for true expressions and the empty string for false ones:
$ perl -wle 'print !!4'
1
$ perl -wle 'print !!"0 but true"'
1
$ perl -wle 'print !!0'
(empty line)
Other interesting stuff: with use overload you can overload string literals and numbers (and for example make them BigInts or whatever).
Many of these things are actually documented somewhere, or follow logically from the documented features, but nonetheless some are not very well known.
Update: Another nice one. Below the q{...} quoting constructs were mentioned, but did you know that you can use letters as delimiters?
$ perl -Mstrict -wle 'print q bJet another perl hacker.b'
Jet another perl hacker.
Likewise you can write regular expressions:
m xabcx
# same as m/abc/
Solution 3 - Perl
Add support for compressed files via magic ARGV:
s{
^ # make sure to get whole filename
(
[^'] + # at least one non-quote
\. # extension dot
(?: # now either suffix
gz
| Z
)
)
\z # through the end
}{gzcat '$1' |}xs for @ARGV;
(The quotes around $1 are necessary to handle filenames with shell metacharacters in them.)
Now the <> feature will decompress any @ARGV files that end with ".gz" or ".Z":
while (<>) {
print;
}
Solution 4 - Perl
One of my favourite features in Perl is using the boolean || operator to select between a set of choices.
$x = $a || $b;
# $x = $a, if $a is true.
# $x = $b, otherwise
This means one can write:
$x = $a || $b || $c || 0;
to take the first true value from $a, $b, and $c, or a default of 0 otherwise.
In Perl 5.10, there's also the // operator, which returns the left hand side if it's defined, and the right hand side otherwise. The following selects the first defined value from $a, $b, $c, or 0 otherwise:
$x = $a // $b // $c // 0;
These can also be used with their short-hand forms, which are very useful for providing defaults:
$x ||= 0; # If $x was false, it now has a value of 0.
$x //= 0; # If $x was undefined, it now has a value of zero.
Cheerio,
Paul
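The difference between the two operators matters most when 0 or the empty string is a legitimate value. A small sketch, assuming a hypothetical %config hash (the keys and defaults are made up for illustration):

```perl
use strict;
use warnings;

my %config = (retries => 0, name => '');   # hypothetical settings

# || falls through on any false value, so a legitimate 0 is replaced.
my $with_or = $config{retries} || 3;       # 3

# // (Perl 5.10+) only falls through on undef, so the 0 survives.
my $with_dor = $config{retries} // 3;      # 0

# A missing key is undef, so the default is used.
my $timeout = $config{timeout} // 30;      # 30

print "$with_or $with_dor $timeout\n";     # 3 0 30
```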
Solution 5 - Perl
The operators ++ and unary - don't only work on numbers, but also on strings.
$_ = "a";
print -$_;
prints -a
print ++$_;
prints b
$_ = 'z';
print ++$_;
prints aa
Solution 6 - Perl
As Perl has almost all "esoteric" parts from the other lists, I'll tell you the one thing that Perl can't:
The one thing Perl can't do is have bare arbitrary URLs in your code, because the // operator is used for regular expressions.
Just in case it wasn't obvious to you what features Perl offers, here's a selective list of the maybe not totally obvious entries:
Portability and Standardness - There are likely more computers with Perl than with a C compiler
A file/path manipulation class - File::Find works on even more operating systems than .Net does
Quotes for whitespace delimited lists and strings - Perl allows you to choose almost arbitrary quotes for your list and string delimiters
Aliasable namespaces - Perl has these through glob assignments:
*My::Namespace:: = \%Your::Namespace::;
Static initializers - Perl can run code in almost every phase of compilation and object instantiation, from BEGIN (code parse) to CHECK (after code parse) to import (at module import) to new (object instantiation) to DESTROY (object destruction) to END (program exit)
Functions are First Class citizens - just like in Perl
Block scope and closure - Perl has both
Calling methods and accessors indirectly through a variable - Perl does that too:
my $method = 'foo';
my $obj = My::Class->new();
$obj->$method( 'baz' ); # calls $obj->foo( 'baz' )
Defining methods through code - Perl allows that too:
*foo = sub { print "Hello world" };
Pervasive online documentation - Perl documentation is online and likely on your system too
Magic methods that get called whenever you call a "nonexisting" function - Perl implements that in the AUTOLOAD function
Symbolic references - you are well advised to stay away from these. They will eat your children. But of course, Perl allows you to offer your children to blood-thirsty demons.
One line value swapping - Perl allows list assignment
Ability to replace even core functions with your own functionality
use subs 'unlink';
sub unlink { print 'No.' }
or
BEGIN{
*CORE::GLOBAL::unlink = sub {print 'no'}
};
unlink($_) for @ARGV
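The AUTOLOAD item in the list above is easier to picture with a concrete case. This is only a sketch: the Greeter class and the frobnicate method are invented names, and a real AUTOLOAD would usually dispatch on the name rather than just report it.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical class, for illustration only

our $AUTOLOAD;      # Perl stores the fully qualified missing name here

sub new { return bless {}, shift }

sub AUTOLOAD {
    my ($self, @args) = @_;
    (my $name = $AUTOLOAD) =~ s/.*:://;   # strip the package prefix
    return if $name eq 'DESTROY';          # don't intercept object destruction
    return "Greeter has no '$name' method; got args: @args";
}

package main;

my $reply = Greeter->new->frobnicate(1, 2);
print "$reply\n";   # Greeter has no 'frobnicate' method; got args: 1 2
```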
Solution 7 - Perl
Autovivification. AFAIK no other language has it.
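For anyone who hasn't met the term: nested data structures spring into existence the moment you use them. A minimal sketch with a made-up %tree hash:

```perl
use strict;
use warnings;

my %tree;                                   # starts out completely empty

# Intermediate hashes and the array spring into existence on assignment.
$tree{fruit}{citrus}[1] = 'lemon';
push @{ $tree{fruit}{berries} }, 'blueberry';

my $kind  = ref $tree{fruit};               # 'HASH'
my $count = scalar @{ $tree{fruit}{citrus} };   # 2 (index 0 is undef)

# Caveat: merely looking deep also creates the intermediate levels.
my $found       = exists $tree{veg}{root}{carrot};   # false
my $veg_created = exists $tree{veg};                 # true, as a side effect

print "$kind $count ", ($veg_created ? 'veg created' : 'veg untouched'), "\n";
# HASH 2 veg created
```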
Solution 8 - Perl
It's simple to quote almost any kind of strange string in Perl.
my $url = q{http://my.url.com/any/arbitrary/path/in/the/url.html};
In fact, the various quoting mechanisms in Perl are quite interesting. The Perl regex-like quoting mechanisms allow you to quote anything, specifying the delimiters. You can use almost any special character like #, /, or open/close characters like (), [], or {}. Examples:
my $var = q#some string where the pound is the final escape.#;
my $var2 = q{A more pleasant way of escaping.};
my $var3 = q(Others prefer parens as the quote mechanism.);
Quoting mechanisms:
q : literal quote; only character that needs to be escaped is the end character.
qq : an interpreted quote; processes variables and escape characters. Great for strings that you need to quote:
my $var4 = qq{This "$mechanism" is broken. Please inform "$user" at "$email" about it.};
qx : Works like qq, but then executes it as a system command, non interactively. Returns all the text generated from the standard out. (Redirection, if supported in the OS, also comes out) Also done with back quotes (the ` character).
my $output = qx{type "$path"}; # get just the output
my $moreout = qx{type "$path" 2>&1}; # get stuff on stderr too
qr : Interprets like qq, but then compiles it as a regular expression. Works with the various options on the regex as well. You can now pass the regex around as a variable:
sub MyRegexCheck {
my ($string, $regex) = @_;
if ($string)
{
return ($string =~ $regex);
}
return; # returns 'null' or 'empty' in every context
}
my $regex = qr{http://[\w]+\.com/([\w]+/)+};
my @results = MyRegexCheck(q{http://myurl.com/subpath1/subpath2/}, $regex);
qw : A very, very useful quote operator. Turns a quoted set of whitespace separated words into a list. Great for filling in data in a unit test.
my @allowed = qw(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z { });
my @badwords = qw(WORD1 word2 word3 word4);
my @numbers = qw(one two three four 5 six seven); # works with numbers too
my @list = ('string with space', qw(eight nine), "a $var"); # works in other lists
my $arrayref = [ qw(and it works in arrays too) ];
It's great to use them whenever it makes things clearer. For qx, qq, and q, I most likely use the {} delimiters. The most common habit of people using qw is usually the () delimiters, but sometimes you also see qw//.
Solution 9 - Perl
The "for" statement can be used the same way "with" is used in Pascal:
for ($item)
{
s/&nbsp;/ /g;
s/<.*?>/ /g;
$_ = join(" ", split(" ", $_));
}
You can apply a sequence of s/// operations, etc. to the same variable without having to repeat the variable name.
Solution 10 - Perl
Not really hidden, but many everyday Perl programmers don't know about CPAN. This especially applies to people who aren't full time programmers or don't program in Perl full time.
Solution 11 - Perl
The quoteword operator is one of my favourite things. Compare:
my @list = ('abc', 'def', 'ghi', 'jkl');
and
my @list = qw(abc def ghi jkl);
Much less noise, easier on the eye. Another really nice thing about Perl, that one really misses when writing SQL, is that a trailing comma is legal:
print 1, 2, 3, ;
That looks odd, but not if you indent the code another way:
print
results_of_foo(),
results_of_xyzzy(),
results_of_quux(),
;
Adding an additional argument to the function call does not require you to fiddle around with commas on previous or trailing lines. The single line change has no impact on its surrounding lines.
This makes it very pleasant to work with variadic functions. This is perhaps one of the most under-rated features of Perl.
Solution 12 - Perl
The ability to parse data directly pasted into a DATA block. No need to save to a test file to be opened in the program or similar. For example:
my @lines = <DATA>;
for (@lines) {
print if /bad/;
}
__DATA__
some good data
some bad data
more good data
more good data
Solution 13 - Perl
Binary "x" is the repetition operator:
print '-' x 80; # print row of dashes
It also works with lists:
print for (1, 4, 9) x 3; # print 149149149
Solution 14 - Perl
New Block Operations
I'd say the ability to expand the language, creating pseudo block operations is one.
- You declare the prototype for a sub indicating that it takes a code reference first:
sub do_stuff_with_a_hash (&\%) {
    my ( $block_of_code, $hash_ref ) = @_;
    while ( my ( $k, $v ) = each %$hash_ref ) {
        $block_of_code->( $k, $v );
    }
}
- You can then call it in the body like so:
use feature 'say';
use Data::Dumper;
do_stuff_with_a_hash {
    local $Data::Dumper::Terse = 1;
    my ( $k, $v ) = @_;
    say qq(Hey, the key is "$k"!);
    say sprintf qq(Hey, the value is "%s"!), Dumper( $v );
} %stuff_for;
(Data::Dumper::Dumper is another semi-hidden gem.) Notice how you don't need the sub keyword in front of the block, or the comma before the hash. It ends up looking a lot like: map { } @list
Source Filters
Also, there are source filters, where Perl will pass you the code so you can manipulate it. Both this and the block operations are pretty much don't-try-this-at-home type of things.
I have done some neat things with source filters, for example like creating a very simple language to check the time, allowing short Perl one-liners for some decision making:
perl -MLib::DB -MLib::TL -e 'run_expensive_database_delete() if $hour_of_day < AM_7';
Lib::TL
would just scan for both the "variables" and the constants, create them and substitute them as needed.
Again, source filters can be messy, but are powerful. But they can mess debuggers up something terrible--and even warnings can be printed with the wrong line numbers. I stopped using Damian's Switch because the debugger would lose all ability to tell me where I really was. But I've found that you can minimize the damage by modifying small sections of code, keeping them on the same line.
Signal Hooks
It's often enough done, but it's not all that obvious. Here's a die handler that piggy backs on the old one.
my $old_die_handler = $SIG{__DIE__};
$SIG{__DIE__} = sub {
    say q(Hey! I'm DYIN' over here!);
    goto &$old_die_handler if $old_die_handler;   # chain to the previous handler, if there was one
};
That means whenever some other module in the code wants to die, they gotta come to you (unless someone else does a destructive overwrite on $SIG{__DIE__}). And you can be notified that somebody thinks something is an error.
Of course, for enough things you can just use an END { } block, if all you want to do is clean up.
overload::constant
You can inspect literals of a certain type in packages that include your module. For example, if you use this in your import sub:
overload::constant
integer => sub {
my $lit = shift;
return $lit > 2_000_000_000 ? Math::BigInt->new( $lit ) : $lit
};
it will mean that every integer greater than 2 billion in the calling packages will get changed to a Math::BigInt object. (See overload::constant).
Grouped Integer Literals
While we're at it: Perl allows you to break up large numbers with underscores (typically into groups of three digits) and still get a parsable integer out of it. Note 2_000_000_000 above for 2 billion.
Solution 15 - Perl
Taint checking. With taint checking enabled, perl will die (or warn, with -t) if you try to pass tainted data (roughly speaking, data from outside the program) to an unsafe function (opening a file, running an external command, etc.). It is very helpful when writing setuid scripts or CGIs or anything where the script has greater privileges than the person feeding it data.
Magic goto. goto &sub does an optimized tail call.
The debugger.
use strict and use warnings. These can save you from a bunch of typos.
Solution 16 - Perl
Based on the way the "-n" and "-p" switches are implemented in Perl 5, you can write a seemingly incorrect program including }{:
ls |perl -lne 'print $_; }{ print "$. Files"'
which is converted internally to this code:
LINE: while (defined($_ = <ARGV>)) {
print $_; }{ print "$. Files";
}
Solution 17 - Perl
Let's start easy with the Spaceship Operator.
$a = 5 <=> 7; # $a is set to -1
$a = 7 <=> 5; # $a is set to 1
$a = 6 <=> 6; # $a is set to 0
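Where the spaceship operator really earns its keep is inside sort blocks. A short sketch with made-up data (the @people records are hypothetical):

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

my @ascending  = sort { $a <=> $b } @numbers;   # 1 9 10 100
my @descending = sort { $b <=> $a } @numbers;   # 100 10 9 1
my @as_strings = sort @numbers;                 # 1 10 100 9 (default string sort)

# Hypothetical records: sort by age, then by name (cmp is the string twin of <=>).
my @people = (['bob', 30], ['alice', 30], ['carol', 25]);
my @sorted = sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } @people;

print join(' ', map { $_->[0] } @sorted), "\n";   # carol alice bob
```

Because <=> returns 0 for equal values, chaining with or falls through to the next comparison only on ties.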
Solution 18 - Perl
This is a meta-answer, but the Perl Tips archives contain all sorts of interesting tricks that can be done with Perl. The archive of previous tips is on-line for browsing, and can be subscribed to via mailing list or atom feed.
Some of my favourite tips include building executables with PAR, using autodie to throw exceptions automatically, and the use of the switch and smart-match constructs in Perl 5.10.
Disclosure: I'm one of the authors and maintainers of Perl Tips, so I obviously think very highly of them. ;)
Solution 19 - Perl
map (http://perldoc.perl.org/functions/map.html) - not only because it makes one's code more expressive, but because it gave me an impulse to read a little bit more about this "functional programming".
Solution 20 - Perl
My vote would go for the (?{}) and (??{}) groups in Perl's regular expressions. The first executes Perl code, ignoring the return value, the second executes code, using the return value as a regular expression.
Solution 21 - Perl
The continue clause on loops. It will be executed at the bottom of every loop, even those which are next'ed.
while( <> ){
print "top of loop\n";
chomp;
next if /next/i;
last if /last/i;
print "bottom of loop\n";
}continue{
print "continue\n";
}
Solution 22 - Perl
The m// operator has some obscure special cases:
- If you use ? as the delimiter it only matches once unless you call reset.
- If you use ' as the delimiter the pattern is not interpolated.
- If the pattern is empty it uses the pattern from the last successful match.
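The first two special cases can be seen in a few lines. This is just a sketch with invented sample words and a throwaway $name variable:

```perl
use strict;
use warnings;

# 1) m?...? matches only once until reset() is called.
my @first_only;
for my $word (qw(apple avocado banana)) {
    push @first_only, $word if $word =~ m?^a?;
}
print "@first_only\n";    # apple

# 2) With ' as the delimiter, $name is not interpolated into the pattern.
my $name    = 'world';    # hypothetical variable, deliberately not used by the regex
my $literal = 'hello $name' =~ m'\$name' ? 1 : 0;
print "$literal\n";       # 1
```

"avocado" also starts with an "a", but the once-only match has already fired on "apple", so it is skipped.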
Solution 23 - Perl
while(/\G(\b\w*\b)/g) {
print "$1\n";
}
the \G anchor. It's hot.
Solution 24 - Perl
The null filehandle diamond operator <> has its place in building command line tools. It acts like <FH> to read from a handle, except that it magically selects whichever is found first: command line filenames or STDIN. Taken from perlop:
while (<>) {
... # code for each line
}
Solution 25 - Perl
Special code blocks such as BEGIN, CHECK and END. They come from Awk, but work differently in Perl, because it is not record-based.
The BEGIN block can be used to specify some code for the parsing phase; it is also executed when you do the syntax-and-variable-check perl -c. For example, to load in configuration variables:
BEGIN {
eval {
require 'config.local.pl';
};
if ($@) {
require 'config.default.pl';
}
}
Solution 26 - Perl
rename("$_.part", $_) for "data.txt";
renames data.txt.part to data.txt without having to repeat myself.
Solution 27 - Perl
A bit obscure is the tilde-tilde "operator" which forces scalar context.
print ~~ localtime;
is the same as
print scalar localtime;
and different from
print localtime;
Solution 28 - Perl
tie, the variable tying interface.
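A tied variable looks like a plain variable but runs your code on every read and write. Here is a minimal sketch of a tied scalar; the UpperCase class is a made-up name for illustration:

```perl
use strict;
use warnings;

package UpperCase;    # hypothetical class name, for illustration only

sub TIESCALAR { my ($class) = @_; my $value = ''; return bless \$value, $class }
sub FETCH     { my ($self) = @_; return $$self }
sub STORE     { my ($self, $new) = @_; $$self = uc $new; return }

package main;

tie my $shout, 'UpperCase';
$shout = 'hello';          # calls STORE behind the scenes
print "$shout\n";          # HELLO (calls FETCH)
```

See perldoc perltie for the corresponding methods for arrays, hashes and filehandles.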
Solution 29 - Perl
The "desperation mode" of Perl's loop control constructs which causes them to look up the stack to find a matching label allows some curious behaviors which Test::More takes advantage of, for better or worse.
SKIP: {
skip() if $something;
print "Never printed";
}
sub skip {
no warnings "exiting";
last SKIP;
}
There's the little known .pmc file. "use Foo" will look for Foo.pmc in @INC before Foo.pm. This was intended to allow compiled bytecode to be loaded first, but Module::Compile takes advantage of this to cache source filtered modules for faster load times and easier debugging.
The ability to turn warnings into errors.
local $SIG{__WARN__} = sub { die @_ };
$num = "two";
$sum = 1 + $num;
print "Never reached";
That's what I can think of off the top of my head that hasn't been mentioned.
Solution 30 - Perl
The goatse operator*:
$_ = "foo bar";
my $count =()= /[aeiou]/g; #3
or
sub foo {
return @_;
}
$count =()= foo(qw/a b c d/); #4
It works because list assignment in scalar context yields the number of elements in the list being assigned.
* Note, not really an operator
Solution 31 - Perl
The input record separator can be set to a reference to a number to read fixed length records:
$/ = \3; print $_,"\n" while <>; # output three chars on each line
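To try this without a file on disk, you can read from an in-memory string. A small sketch, assuming an invented eight-character input:

```perl
use strict;
use warnings;

my $data = 'abcdefgh';                       # hypothetical input
open my $fh, '<', \$data or die "open: $!";  # read from an in-memory string

my @records;
{
    local $/ = \3;        # each read now returns at most 3 bytes
    @records = <$fh>;
}
close $fh;

print join('|', @records), "\n";   # abc|def|gh
```

The last record is simply shorter when the input length is not a multiple of the record size.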
Solution 32 - Perl
I don't know how esoteric it is, but one of my favorites is the hash slice (http://www.webquills.net/scroll/2008/05/perl-5-hash-slices-can-replace.html). I use it for all kinds of things. For example to merge two hashes:
my %number_for = (one => 1, two => 2, three => 3);
my %your_numbers = (two => 2, four => 4, six => 6);
@number_for{keys %your_numbers} = values %your_numbers;
print sort values %number_for; # 12346
Solution 33 - Perl
This one isn't particularly useful, but it's extremely esoteric. I stumbled on this while digging around in the Perl parser.
Before there was POD, perl4 had a trick to allow you to embed the man page, as nroff, straight into your program so it wouldn't get lost. perl4 used a program called wrapman (see Pink Camel page 319 for some details) to cleverly embed an nroff man page into your script.
It worked by telling nroff to ignore all the code, and then put the meat of the man page after an __END__ tag which tells Perl to stop processing code. Looked something like this:
#!/usr/bin/perl
'di';
'ig00';
...Perl code goes here, ignored by nroff...
.00; # finish .ig
'di \" finish the diversion
.nr nl 0-1 \" fake up transition to first page
.nr % 0 \" start at page 1
'; __END__
...man page goes here, ignored by Perl...
The details of the roff magic escape me, but you'll notice that the roff commands are strings or numbers in void context. Normally a constant in void context produces a warning. There are special exceptions in op.c to allow void context strings which start with certain roff commands.
/* perl4's way of mixing documentation and code
(before the invention of POD) was based on a
trick to mix nroff and perl code. The trick was
built upon these three nroff macros being used in
void context. The pink camel has the details in
the script wrapman near page 319. */
const char * const maybe_macro = SvPVX_const(sv);
if (strnEQ(maybe_macro, "di", 2) ||
strnEQ(maybe_macro, "ds", 2) ||
strnEQ(maybe_macro, "ig", 2))
useless = NULL;
This means that 'di'; doesn't produce a warning, but neither does 'die'; 'did you get that thing I sentcha?'; or 'ignore this line';.
In addition, there are exceptions for the numeric constants 0 and 1 which allow the bare .00;. The code claims this was for more general purposes.
/* the constants 0 and 1 are permitted as they are
conventionally used as dummies in constructs like
1 while some_condition_with_side_effects; */
else if (SvNIOK(sv) && (SvNV(sv) == 0.0 || SvNV(sv) == 1.0))
useless = NULL;
And what do you know, 2 while condition does warn!
Solution 34 - Perl
You can use @{[...]} to get an interpolated result of complex perl expressions
$a = 3;
$b = 4;
print "$a * $b = @{[$a * $b]}";
prints: 3 * 4 = 12
Solution 35 - Perl
sub load_file
{
local(@ARGV, $/) = shift;
<>;
}
and a version that returns an array as appropriate:
sub load_file
{
local @ARGV = shift;
local $/ = wantarray? $/: undef;
<>;
}
Solution 36 - Perl
use diagnostics;
If you are starting to work with Perl and have never done so before, this module will save you tons of time and hassle. For almost every basic error message you can get, this module will give you a lengthy explanation as to why your code is breaking, including some helpful hints as to how to fix it. For example:
use strict;
use diagnostics;
$var = "foo";
gives you this helpful message:
Global symbol "$var" requires explicit package name at - line 4.
Execution of - aborted due to compilation errors (#1)
(F) You've said "use strict vars", which indicates that all variables
must either be lexically scoped (using "my"), declared beforehand using
"our", or explicitly qualified to say which package the global variable
is in (using "::").
Uncaught exception from user code:
Global symbol "$var" requires explicit package name at - line 4.
Execution of - aborted due to compilation errors.
at - line 5
use diagnostics;
use strict;
sub myname {
print { " Some Error " };
};
you get this large, helpful chunk of text:
syntax error at - line 5, near "};"
Execution of - aborted due to compilation errors (#1)
(F) Probably means you had a syntax error. Common reasons include:
A keyword is misspelled.
A semicolon is missing.
A comma is missing.
An opening or closing parenthesis is missing.
An opening or closing brace is missing.
A closing quote is missing.
Often there will be another error message associated with the syntax
error giving more information. (Sometimes it helps to turn on -w.)
The error message itself often tells you where it was in the line when
it decided to give up. Sometimes the actual error is several tokens
before this, because Perl is good at understanding random input.
Occasionally the line number may be misleading, and once in a blue moon
the only way to figure out what's triggering the error is to call
perl -c repeatedly, chopping away half the program each time to see
if the error went away. Sort of the cybernetic version of S<20
questions>.
Uncaught exception from user code:
syntax error at - line 5, near "};"
Execution of - aborted due to compilation errors.
at - line 7
From there you can go about deducing what might be wrong with your program (in this case, print is formatted entirely wrong). diagnostics knows about a large number of errors. Now, while this would not be a good thing to use in production, it can serve as a great learning aid for those who are new to Perl.
Solution 37 - Perl
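There is also $[, the variable which decides at which index arrays start. The default is 0, so arrays start at index 0. By setting
$[=1;
you can make Perl behave more like AWK (or Fortran) if you really want to. Note that $[ has long been deprecated, and since Perl 5.30 assigning any value other than 0 to it is a fatal error.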
Solution 38 - Perl
($x, $y) = ($y, $x) is what made me want to learn Perl.
The list constructor 1..99 or 'a'..'zz' is also very nice.
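Both features fit in a few lines. A quick sketch with throwaway variable names:

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);            # swap without a temporary
print "$x $y\n";                # 2 1

my @nums    = (1 .. 5);         # 1 2 3 4 5
my @letters = ('a' .. 'e');     # a b c d e
my @labels  = ('a' .. 'zz');    # a..z, then aa..zz (magic string increment)

print scalar(@labels), "\n";    # 702
print "@labels[25, 26]\n";      # z aa
```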
Solution 39 - Perl
@Schwern mentioned turning warnings into errors by localizing $SIG{__WARN__}. You can also do this (lexically) with use warnings FATAL => "all";. See perldoc lexwarn.
On that note, since Perl 5.12, you've been able to say perldoc foo instead of the full perldoc perlfoo. Finally! :)
Solution 40 - Perl
Safe compartments.
With the Safe module you can build your own sandbox-style environment using nothing but perl. You would then be able to load perl scripts into the sandbox.
Solution 41 - Perl
Core IO::Handle module. Most important thing for me is that it allows autoflush on filehandles. Example:
use IO::Handle;
$log->autoflush(1);
Solution 42 - Perl
How about the ability to use
my @symbols = map { +{ 'key' => $_ } } @things;
to generate an array of hashrefs from an array -- the + in front of the hashref disambiguates the block so the interpreter knows that it's a hashref and not a code block. Awesome.
(Thanks to Dave Doyle for explaining this to me at the last Toronto Perlmongers meeting.)
Solution 43 - Perl
All right. Here is another. Dynamic Scoping. It was talked about a little in a different post, but I didn't see it here on the hidden features.
Dynamic scoping, like autovivification, is used by only a small number of languages. Perl and Common Lisp are the only two I know of that use dynamic scoping.
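In Perl, dynamic scoping is what local gives you. A minimal sketch, with made-up sub and variable names:

```perl
use strict;
use warnings;

our $level = 'outer';           # package variable; local works on these, not on my variables

sub show { return $level }      # sees whatever value is current when it is called

sub with_local {
    local $level = 'inner';     # temporarily replaced for everything called from here
    return show();
}

my $during = with_local();
my $after  = show();            # the old value is restored once with_local returns

print "$during $after\n";       # inner outer
```

With my instead of local, show() would never see the inner value, because lexical scoping follows the source text rather than the call stack.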
Solution 44 - Perl
Use lvalues to make your code really confusing:
my $foo = undef ;
sub bar :lvalue { $foo }   # the last expression evaluated is the lvalue
# Then later
bar() = 5 ;
print bar() ;
Solution 45 - Perl
The Schwartzian Transform is a technique that allows you to efficiently sort by a computed, secondary index. Let's say that you wanted to sort a list of strings by their md5 sum. The comments below are best read backwards (that's the order I always end up writing these anyways):
my @strings = ('one', 'two', 'three', 'four');
my @md5sorted_strings =
map { $_->[0] } # 4) map back to the original value
sort { $a->[1] cmp $b->[1] } # 3) sort by the correct element of the list
map { [$_, md5sum_func($_)] } # 2) create a list of anonymous arrays
@strings; # 1) take strings
This way, you only have to do the expensive md5 computation N times, rather than N log N times.
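Here is a runnable sketch of the same transform. It assumes the core Digest::MD5 module's md5_hex as a stand-in for md5sum_func, and adds a $calls counter (not part of the technique) purely to show that each key is computed once:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);    # core module, standing in for md5sum_func

my @strings = ('one', 'two', 'three', 'four');

my $calls = 0;                  # counts how often the "expensive" key is computed
my @md5sorted =
    map  { $_->[0] }                          # 3) unwrap the original value
    sort { $a->[1] cmp $b->[1] }              # 2) compare the cached keys
    map  { $calls++; [ $_, md5_hex($_) ] }    # 1) compute each key once
    @strings;

print "$_\n" for @md5sorted;
print "key computed $calls times\n";          # key computed 4 times
```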
Solution 46 - Perl
One useful composite operator for conditionally adding strings or lists into other lists is the x!! operator:
print 'the meaning of ', join ' ' =>
'life,' x!! $self->alive,
'the universe,' x!! ($location ~~ Universe),
('and', 'everything.') x!! 42; # this is added as a list
this operator allows for a reversed syntax similar to
do_something() if test();
Solution 47 - Perl
This one-liner illustrates how to use glob to generate all word combinations of an alphabet (A, T, C, and G -> DNA) for words of a specified length (4):
perl -MData::Dumper -e '@CONV = glob( "{A,T,C,G}" x 4 ); print Dumper( \@CONV )'
Solution 48 - Perl
My favorite semi-hidden feature of Perl is the eof function. Here's an example pretty much directly from perldoc -f eof that shows how you can use it to reset the file name and $. (the current line number) easily across multiple files loaded up at the command line:
while (<>) {
print "$ARGV:$.\t$_";
}
continue {
close ARGV if eof
}
Solution 49 - Perl
You can replace the delimiter in regexes and strings with just about anything else. This is particularly useful for "leaning toothpick syndrome", exemplified here:
$url =~ /http:\/\/www\.stackoverflow\.com\//;
You can eliminate most of the back-whacking by changing the delimiter. /bar/ is shorthand for m/bar/ which is the same as m!bar!.
$url =~ m!http://www\.stackoverflow\.com/!;
You can even use balanced delimiters like {} and []. I personally love these. q{foo} is the same as 'foo'.
$code = q{
if( this is awesome ) {
print "Look ma, no escaping!";
}
};
To confuse your friends (and your syntax highlighter) try this:
$string = qq'You owe me $1,000 dollars!';
Solution 50 - Perl
Very late to the party, but: attributes.
Attributes essentially let you define arbitrary code to be associated with the declaration of a variable or subroutine. The best way to use these is with Attribute::Handlers; this makes it easy to define attributes (in terms of, what else, attributes!).
I did a presentation on using them to declaratively assemble a pluggable class and its plugins at YAPC::2006, online here. This is a pretty unique feature.
Solution 51 - Perl
I personally love the /e modifier to the s/// operation:
while(<>) {
s/\b(\w{1,4})\b/scalar reverse $1/ge; # reverses all words of 1 to 4 letters
print;
}
Input:
This is a test of regular expressions
^D
Output (I think):
sihT si a tset fo regular expressions
Solution 52 - Perl
use Quantum::Superpositions;
if ($x == any($a, $b, $c)) { ... }
Solution 53 - Perl
There is a more powerful way to check a program for syntax errors:
perl -w -MO=Lint,no-context myscript.pl
The most important thing it can do is report 'nonexistent subroutine' errors.
Solution 54 - Perl
use re 'debug'
Doc on use re debug
and
perl -MO=Concise[,OPTIONS]
Doc on Concise
Besides being exquisitely flexible, expressive and amenable to programming in the style of C, Pascal, Python and other languages, there are several pragmas and command switches that make Perl my 'goto' language for initial kanoodling on an algorithm, regex, or quick problem that needs to be solved. These two are unique to Perl I believe, and are among my favorites.
use re 'debug':
Most modern flavors of regular expressions owe their current form and function to Perl. While there are many Perl forms of regex that cannot be expressed in other languages, there are almost no forms of other languages' flavor of regex that cannot be expressed in Perl. Additionally, Perl has a wonderful regex debugger built in to show how the regex engine is interpreting your regex and matching against the target string.
Example: I recently was trying to write a simple CSV routine. (Yes, yes, I know, I should have been using Text::CSV...) but the CSV values were simple and not quoted.
My first take was /^(?:(.*?),){$i}/ to extract the i-th record of n CSV records. That works fine -- except for the last record, n of n. I could see that without the debugger.
Next I tried /^(?:(.*?),|$){$i}/. This did not work, and I could not see immediately why. I thought I was saying (.*?) followed by a comma or EOL. Then I added use re 'debug' at the top of a small test script. Ahh yes, the alternation between ,|$ was not being interpreted that way; it was being interpreted as ((.*?),) | ($) -- not what I wanted.
A new grouping was needed. So I arrived at the working /^(?:(.*?)(?:,|$)){$i}/. While I was in the regex debugger, I was surprised how many loops it took for a match towards the end of the string. It is the .*? term that is quite ambiguous and requires excessive backtracking to satisfy. So I tried /^(?:(?:^|,)([^,]*)){$i}/. This does two things: 1) it reduces backtracking because of the greedy match of everything but a comma, and 2) it allows the regex optimizer to use the alternation only once, on the first field. Using Benchmark, this is 35% faster than the first regex. The regex debugger is wonderful and few use it.
perl -MO=Concise[,OPTIONS]
:
The B and Concise frameworks are tremendous tools to see how Perl is interpreting your masterpiece. Using -MO=Concise
prints the result of the Perl interpreter's translation of your source code. There are many options to Concise, and with B you can write your own presentation of the opcodes.
As in this post, you can use Concise to compare different code structures. You can interleave your source lines with the OP codes those lines generate. Check it out.
Solution 55 - Perl
You can use different quotes on HEREDOCS to get different behaviors.
my $interpolation = "We will interpolate variables";
print <<"END";
With double quotes, $interpolation, just like normal HEREDOCS.
END
print <<'END';
With single quotes, the variable $foo will *not* be interpolated.
(You have probably seen this in other languages.)
END
## this is the fun and "hidden" one
my $shell_output = <<`END`;
echo With backticks, these commands will be executed in shell.
echo The output is returned.
ls | wc -l
END
print "shell output: $shell_output\n";
Solution 56 - Perl
Axeman reminded me of how easy it is to wrap some of the built-in functions.
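Before Perl 5.10, Perl didn't have a pretty print (say) like Python.
So in your local program you could do something like the following. (The built-in print itself cannot be overridden, so the wrapper needs a name of its own.)
sub say {
    print @_, "\n";
}
or add in some debug.
use Data::Dumper;
sub debug_print {
    # dump the data structure for developers, print plainly otherwise
    print exists $ENV{DEVELOPER} ? Dumper(@_) : @_;
}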
Solution 57 - Perl
The following are just as short but more meaningful than "~~" since they indicate what is returned, and there's no confusion with the smart match operator:
print "".localtime; # Request a string
print 0+@array; # Request a number
Solution 58 - Perl
Two things that work well together: IO handles on in-core strings, and using function prototypes to enable you to write your own functions with grep/map-like syntax.
sub with_output_to_string(&) { # allows compiler to accept "yoursub {}" syntax.
my $function = shift;
my $string = '';
open(my $handle, '>', \$string) || die $!; # in-memory filehandle that writes into $string
my $old_handle = select $handle;
eval { $function->() };
select $old_handle;
die $@ if $@;
return $string;
}
my $greeting = with_output_to_string {
print "Hello, world!";
};
print $greeting, "\n";
Solution 59 - Perl
The ability to use a hash as a seen filter in a loop. I have yet to see something quite as nice in a different language. For example, I have not been able to duplicate this in Python.
For example, I want to print a line if it has not been seen before.
my %seen;
while (<LINE>) { # read one line at a time instead of slurping the whole file
print $_ unless $seen{$_}++;
}
Solution 60 - Perl
The new -E option on the command line:
> perl -e "say 'hello'" # does not work
String found where operator expected at -e line 1, near "say 'hello'"
(Do you need to predeclare say?)
syntax error at -e line 1, near "say 'hello'"
Execution of -e aborted due to compilation errors.
> perl -E "say 'hello'"
hello
Solution 61 - Perl
You can expand function calls in a string, for example:
print my $foo = "foo @{[scalar(localtime)]} bar";
> foo Wed May 26 15:50:30 2010 bar
Solution 62 - Perl
The feature I like the best is statement modifiers.
Don't know how many times I've wanted to do:
say 'This will output' if 1;
say 'This will not output' unless 1;
say 'Will say this 3 times. The first Time: '.$_ for 1..3;
in other languages. etc...
The 'etc' reminded me of another 5.12 feature, the Yada Yada operator.
This is great, for the times when you just want a place holder.
sub something_really_important_to_implement_later {
...
}
Check it out: Perl Docs on Yada Yada Operator.
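A small sketch of what happens when such a placeholder is actually called: the yada yada statement throws an "Unimplemented" exception, so a forgotten stub fails loudly instead of silently returning nothing. The sub name here is made up for the demo.

```perl
use strict;
use warnings;

# Hypothetical stub: the body is only the yada yada placeholder (Perl 5.12+).
sub implement_me_later {
    ...
}

# Calling the stub throws an exception, which eval traps here.
my $ok  = eval { implement_me_later(); 1 };
my $err = $@;
print "stub died: $err" unless $ok;
```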
Solution 63 - Perl
I'm a bit late to the party, but a vote for the built-in tied-hash function dbmopen()
-- it's helped me a lot. It's not exactly a database, but if you need to save data to disk it takes away a lot of the problems and Just Works. It helped me get started when I didn't have a database, didn't understand Storable.pm, but I knew I wanted to progress beyond reading and writing to text files.
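For anyone who hasn't used it, a minimal sketch of the idea. The temporary directory, the file name, and the keys are assumptions for the demo, and which DBM library ends up behind dbmopen() depends on how your Perl was built.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory and a made-up file name, just for the demo.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/scores";

# First "run": bind %scores to a DBM file on disk and store a value.
my %scores;
dbmopen(%scores, $file, 0644) or die "Cannot open DBM file: $!";
$scores{alice} = 42;
dbmclose(%scores);

# Second "run": reopen the same file; the stored value is read back from disk.
my %again;
dbmopen(%again, $file, 0644) or die "Cannot reopen DBM file: $!";
my $saved = $again{alice};
dbmclose(%again);

print "alice => $saved\n";
```

Values are stored as plain strings, so nested data structures still need something like Storable.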
Solution 64 - Perl
You might think you can do this to save memory:
@is_month{qw(jan feb mar apr may jun jul aug sep oct nov dec)} = undef;
print "It's a month" if exists $is_month{lc $mon};
but it doesn't do that. Perl still assigns a different scalar value to each key. Devel::Peek shows this. PVHV
is the hash. Elt
is a key and the SV
that follows is its value. Note that each SV has a different memory address indicating they're not being shared.
Dump \%is_month, 12;
SV = RV(0x81c1bc) at 0x81c1b0
REFCNT = 1
FLAGS = (TEMP,ROK)
RV = 0x812480
SV = PVHV(0x80917c) at 0x812480
REFCNT = 2
FLAGS = (SHAREKEYS)
ARRAY = 0x206f20 (0:8, 1:4, 2:4)
hash quality = 101.2%
KEYS = 12
FILL = 8
MAX = 15
RITER = -1
EITER = 0x0
Elt "feb" HASH = 0xeb0d8580
SV = NULL(0x0) at 0x804b40
REFCNT = 1
FLAGS = ()
Elt "may" HASH = 0xf2290c53
SV = NULL(0x0) at 0x812420
REFCNT = 1
FLAGS = ()
An undef scalar takes as much memory as an integer scalar, so you might as well just assign them all to 1 and avoid the trap of forgetting to check with exists
.
my %is_month = map { $_ => 1 } qw(jan feb mar apr may jun jul aug sep oct nov dec);
print "It's a month" if $is_month{lc $mon};
Solution 65 - Perl
The expression defined &DB::DB
returns true if the program is running from within the debugger.
Solution 66 - Perl
Interpolation of match regular expressions. A useful application of this is when matching on a blacklist. Without using interpolation it is written like so:
#detecting blacklist words in the current line
/foo|bar|baz/;
Can instead be written
@blacklistWords = ("foo", "bar", "baz");
$anyOfBlacklist = join "|", (@blacklistWords);
/$anyOfBlacklist/;
This is more verbose, but it allows the list to be populated from a data file. Also, if the list is maintained in the source for whatever reason, it is easier to maintain the array than the regexp.
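One caveat when the words come from a data file: any regex metacharacters in them will be interpreted as part of the pattern. A sketch of one way to guard against that, assuming the words are meant to match literally (the word list and sample lines are invented):

```perl
use strict;
use warnings;

# Hypothetical list; in practice it might be read from a data file.
my @blacklist_words = ('foo', 'bar.baz', 'a+b');

# quotemeta escapes regex metacharacters so each word matches literally;
# qr// compiles the joined alternation once.
my $any_of_blacklist = join '|', map { quotemeta($_) } @blacklist_words;
my $blacklist_re     = qr/$any_of_blacklist/;

my @lines = ('has foo in it', 'barxbaz is fine', 'a+b is not', 'clean line');
my @hits  = grep { /$blacklist_re/ } @lines;
print "$_\n" for @hits;
```

Without quotemeta, 'bar.baz' would also match 'barxbaz', and 'a+b' would no longer match a literal plus sign.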
Solution 67 - Perl
Using hashes (where keys are unique) to obtain the unique elements of a list:
my %unique = map { $_ => 1 } @list;
my @unique = keys %unique;
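Note that keys returns the elements in no particular order. If the original order matters, a common variation (sketched here with a made-up list) combines the hash with grep:

```perl
use strict;
use warnings;

my @list = qw(pear apple pear fig apple);

# %seen counts occurrences; grep keeps an element only the first time it appears.
my %seen;
my @unique = grep { !$seen{$_}++ } @list;

print "@unique\n";   # pear apple fig
```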
Solution 68 - Perl
Add one for the unpack() and pack() functions, which are great if you need to import and/or export data in a format which is used by other programs.
Of course these days most programs will allow you to export data in XML, and many commonly used proprietary document formats have associated Perl modules written for them. But this is one of those features that is incredibly useful when you need it, and pack()/unpack() are probably the reason that people have been able to write CPAN modules for so many proprietary data formats.
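As a rough illustration, here is a round trip through a small binary record. The record layout is invented for the example; a real format would dictate its own template.

```perl
use strict;
use warnings;

# Hypothetical record layout: 3-byte space-padded tag, 16-bit big-endian
# length, and one unsigned byte of flags.
my $template = 'A3 n C';

my $record = pack($template, 'abc', 258, 7);
my @bytes  = map { sprintf '%02x', $_ } unpack('C*', $record);
print "@bytes\n";   # 61 62 63 01 02 07

my ($tag, $length, $flags) = unpack($template, $record);
print "$tag $length $flags\n";   # abc 258 7
```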
Solution 69 - Perl
Next time you're at a geek party pull out this one-liner in a bash shell and the women will swarm you and your friends will worship you:
find . -name "*.txt"|xargs perl -pi -e 's/1:(\S+)/uc($1)/ge'
This processes all *.txt files and does an in-place find and replace using Perl's regex. This one converts the text after a '1:' to upper case and removes the '1:'. It uses Perl's 'e' modifier to treat the replacement part of the substitution as executable code. Instant one-line template system. Using xargs lets you process a huge number of files without running into the shell's command-line length limit.
Solution 70 - Perl
@Corion - Bare URLs in Perl? Of course you can, even in interpolated strings. The only time it would matter is in a string that you were actually USING as a regular expression.
Solution 71 - Perl
Showing progress in the script by printing on the same line:
$| = 1; # enable autoflush so every print shows up immediately
for my $i (1..100) {
    print "Progress $i %\r";
}
Solution 72 - Perl
One more...
Perl cache:
my $processed_input = $records || process_inputs($records_file);
On Elpeleg Open Source, Perl CMS http://www.web-app.net/
Solution 73 - Perl
$0 is the name of the perl script being executed. It can be used to get the context from which a module is being run.
# MyUsefulRoutines.pl
sub doSomethingUseful {
my @args = @_;
# ...
}
if ($0 =~ /MyUsefulRoutines\.pl$/) {
# someone is running perl MyUsefulRoutines.pl [args] from the command line
&doSomethingUseful (@ARGV);
} else {
# someone is calling require "MyUsefulRoutines.pl" from another script
1;
}
This idiom is helpful for turning a standalone script with some useful subroutines into a library that can be imported into other scripts. Python has similar functionality with the __name__ == "__main__"
idiom.
Solution 74 - Perl
Using bare blocks with redo
or other control words to create custom looping constructs.
For example, traverse a linked list of objects, returning the first one that has a ->can('print')
method:
sub get_printer {
my $self = shift;
{$self->can('print') or $self = $self->next and redo}
}
Solution 75 - Perl
Perl is great as a flexible awk/sed.
For example, let's use a simple replacement for ls | xargs stat
, naively done like:
$ ls | perl -pe 'print "stat "' | sh
This doesn't work well when the input (filenames) has spaces or shell special characters like |$\
. So single quotes are frequently required in the Perl output.
One complication with calling perl via the command line -ne
is that the shell gets first nibble at your one-liner. This often leads to torturous escaping to satisfy it.
One 'hidden' feature that I use all the time is \x27
to include a single quote instead of trying to use shell escaping '\''
So:
$ ls | perl -nle 'chomp; print "stat '\''$_'\''"' | sh
can be more safely written:
$ ls | perl -pe 's/(.*)/stat \x27$1\x27/' | sh
That won't work with funny characters in the filenames, even quoted like that. But this will:
$ ls | perl -pe 's/\n/\0/' | xargs -0 stat
Solution 76 - Perl
A quick "now" helper that pulls the HH:MM:SS part out of the scalar localtime string:
sub _now {
my ($now) = localtime() =~ /([:\d]{8})/;
return $now;
}
print _now(), "\n"; # 15:10:33
Solution 77 - Perl
B::Deparse - Perl compiler backend to produce perl code. Not something you'd use in your daily Perl coding, but could be useful in special circumstances.
If you come across some piece of code that is obfuscated, or a complex expression, pass it through Deparse
. Useful for figuring out a JAPH or a piece of golfed Perl code.
$ perl -e '$"=$,;*{;qq{@{[(A..Z)[qq[0020191411140003]=~m[..]g]]}}}=*_=sub{print/::(.*)/};$\=$/;q<Just another Perl Hacker>->();'
Just another Perl Hacker
$ perl -MO=Deparse -e '$"=$,;*{;qq{@{[(A..Z)[qq[0020191411140003]=~m[..]g]]}}}=*_=sub{print/::(.*)/};$\=$/;q<Just another Perl Hacker>->();'
$" = $,;
*{"@{[('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z')['0020191411140003' =~ /../g]];}";} = *_ = sub {
print /::(.*)/;
}
;
$\ = $/;
'Just another Perl Hacker'->();
-e syntax OK
A more useful example is to use deparse to find out the code behind a coderef, that you might have received from another module, or
use B::Deparse;
my $deparse = B::Deparse->new;
my $code = $deparse->coderef2text($coderef);
print $code;
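A self-contained version of the same idea, with a stand-in coderef defined on the spot so the snippet runs by itself (in real use the coderef would come from elsewhere):

```perl
use strict;
use warnings;
use B::Deparse;

# A stand-in for a coderef that arrived from somewhere else.
my $coderef = sub { my ($n) = @_; return $n * 2 };

my $deparse = B::Deparse->new;
my $code    = $deparse->coderef2text($coderef);

# The exact formatting depends on the Perl version, so it is only printed here.
print "sub $code\n";
```

coderef2text returns just the body in braces, which is why "sub " is prepended when printing.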
Solution 78 - Perl
I like the way we can insert an element at any position in an array, such as
=> Insert $x at position $i in array @a
@a = ( 11, 22, 33, 44, 55, 66, 77 );
$x = 10;
$i = 3;
@a = ( @a[0..$i-1], $x, @a[$i..$#a] );
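For comparison, the built-in splice does the same insertion in place, without rebuilding the array from slices. A short sketch using the same values:

```perl
use strict;
use warnings;

my @a = (11, 22, 33, 44, 55, 66, 77);
my $x = 10;
my $i = 3;

# Remove 0 elements at offset $i and insert $x there, modifying @a in place.
splice(@a, $i, 0, $x);

print "@a\n";   # 11 22 33 10 44 55 66 77
```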