Profiling and Optimizing Perl
with Devel::NYTProf and XS
Peter Karman
Premature Optimization
"We should forget about small efficiencies,
say about 97% of the time:
premature optimization is the root of all evil."
-Donald Knuth
Mature Optimization
- What about mature optimization?
- How do I know which 3% to worry about?
- Is it the algorithm or the language?
The Three Percent
- Don't benchmark, profile.
- Venerable Devel::DProf
- Devel::NYTProf
Devel::NYTProf
- HTML output
- Line-by-line as well as by sub and file
- So thorough it's scary
Observations about Perl as a language
- Optimized for developer time.
- Optimized for ASCII.
- malloc/free is expensive, so Perl takes memory without giving it back.
- Tight loops (char-level iterations) are expensive.
- Function calls, especially methods, are expensive.
- Regexes can be expensive.
Taking the plunge into C and XS
- When speed really matters.
- When you have more time than money.
- When you're convinced it's the language, not the algorithm.
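To make the "tight loops" point concrete, here is a minimal sketch of the kind of char-level iteration that is costly in Perl but cheap in C. The function name `count_tokens` and the whitespace-only token rule are assumptions for illustration; this is not the Search::Tools tokenizer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from Search::Tools): count runs of
 * non-whitespace bytes in buf[0..len). */
size_t count_tokens(const char *buf, size_t len)
{
    size_t count = 0;
    int in_token = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        /* Cast to unsigned char: isspace() is undefined for negative values. */
        if (isspace((unsigned char)buf[i])) {
            in_token = 0;
        } else if (!in_token) {
            /* First byte of a new token. */
            in_token = 1;
            count++;
        }
    }
    return count;
}
```

Each iteration is a byte comparison and a branch, with no function call or allocation per character.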
Stay away from XS
Render unto C what is C's,
and render unto Perl what is Perl's.
#include your C
/* pure C helpers */
#include "search-tools.c"
/***************************************************/
MODULE = Search::Tools PACKAGE = Search::Tools
PROTOTYPES: enable
Keep your XS minimal
boolean
is_ascii(string)
    SV* string;
    CODE:
        RETVAL = st_is_ascii(string);
    OUTPUT:
        RETVAL
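The XS stub above only forwards to a C helper, and the helper's body is not shown in the slides. As a rough sketch of what the pure-C side of such a check could look like, assume the bytes and length have already been pulled out of the SV (for example with `SvPV`). The name `is_ascii_bytes` is hypothetical and is not the real `st_is_ascii`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the C side of an ASCII check.
 * Returns 1 if every byte is below 0x80, otherwise 0. */
int is_ascii_bytes(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL);
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] >= 0x80) {
            return 0;   /* high bit set: not 7-bit ASCII */
        }
    }
    return 1;           /* empty input counts as ASCII */
}
```

Keeping the loop in plain C like this means it can be compiled and tested without the Perl headers, which is one reason to keep the XS layer thin.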
Late nights and hair loss
perlguts, perlapi, perlxs, perlcall:
write them upon the tablet of thy heart.
- Reference counting is now your job.
- Portability.
- Perl versions.
If performance matters, it's worth it
XS vs PP Search::Tools Tokenizer
                          Rate pure-perl-greek xs-greek pure-perl-ascii xs-ascii xs-ascii-heatseeker-qr
pure-perl-greek         1171/s              --     -60%            -87%     -95%                   -96%
xs-greek                2959/s            153%       --            -66%     -87%                   -91%
pure-perl-ascii         8696/s            643%     194%              --     -62%                   -74%
xs-ascii               22727/s           1841%     668%            161%       --                   -32%
xs-ascii-heatseeker-qr 33333/s           2747%    1027%            283%      47%                     --
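How to read the chart: assuming it follows the usual comparison layout, each cell shows how much faster (positive) or slower (negative) the row's rate is than the column's rate. The sketch below reproduces that arithmetic. The helper name `percent_vs` is an assumption for illustration.

```c
#include <assert.h>

/* Hypothetical helper: percentage by which row_rate differs from
 * col_rate, rounded to the nearest whole percent. */
long percent_vs(double row_rate, double col_rate)
{
    assert(col_rate > 0.0);
    double pct = (row_rate / col_rate - 1.0) * 100.0;
    /* Round half away from zero without pulling in <math.h>. */
    return (long)(pct < 0.0 ? pct - 0.5 : pct + 0.5);
}
```

For example, `percent_vs(2959, 1171)` gives 153, matching the xs-greek row against the pure-perl-greek column. Because the printed rates are themselves rounded, a recomputed cell can differ from the printed one by a point.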
Credits
- Henry at zen.co.za for prompting the Snipper refactor