PHP String Analyzer
PHP String Analyzer is a static program analyzer that
approximates the string output of a PHP program with a
context-free grammar. The analyzer can be used to check properties of a PHP program.
For example, it can be used to validate the Web pages that
a PHP program generates dynamically.
You can find the basic principle of the analysis in the following paper.
The XML validation algorithms used in the analyzer are described in
the following paper.
- XML Validation for Context-Free Grammars, Y. Minamide, A. Tozawa, In Proc. of The Fourth ASIAN Symposium on Programming Languages and Systems, LNCS 4279, pp. 357-373, 2006. (to appear, preliminary version)
The preliminary experiments on XHTML validation with the analyzer are reported
in the following paper. Note that the current version of the analyzer does not use the
validation algorithm described in that paper.
The basic idea of the analyzer comes from the Java String Analyzer,
a string analyzer for Java based on regular languages.
What is the PHP String Analyzer?
PHP String Analyzer approximates the string output of a program as a
context-free grammar. The analyzer takes two inputs: a PHP program and
an input specification.
Let us consider the following program.
<?php
for ($i = 0; $i < $n; $i++)
    $x = "0".$x."1";
echo $x;
?>
For the analysis, we need to specify the initial values of the global
variables in the program. The specification is given as follows:
$x : /abc|xyz/
$n : int
The specification /abc|xyz/ is a regular expression representing the set of
strings {abc, xyz}. The type 'int' is specified for the variable $n.
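To make the meaning of the regular-expression entry concrete, here is a minimal sketch in Python (not part of the analyzer). It assumes that a specification such as /abc|xyz/ denotes exactly the strings that match the expression as a whole; the names X_SPEC and allowed_initial_value are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical stand-in for the specification entry "$x : /abc|xyz/".
X_SPEC = re.compile(r"abc|xyz")

def allowed_initial_value(s: str) -> bool:
    """Return True if the whole string s is matched by the specification."""
    # Assumption: the specification describes complete values, so we use
    # fullmatch rather than a substring search.
    return X_SPEC.fullmatch(s) is not None

candidates = ["abc", "xyz", "abcxyz", "ab", ""]
accepted = [s for s in candidates if allowed_initial_value(s)]
print(accepted)  # ['abc', 'xyz']
```

Under this reading, only abc and xyz are possible initial values of $x.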
The analyzer is executed as follows:
% phpsa -ispec example0.ispec -simplify example0.php
where the option -simplify indicates that the analyzer tries to simplify the CFG.
Then we obtain the following context-free grammar as an approximation of the program's string output.
({$268$, $293$}, // variables
{$268$ -> $293$|0$268$1,$293$ -> abc|xyz}, // productions
$268$) // start symbol
This grammar represents the set of strings { 0^n abc 1^n | n >= 0 } U { 0^n xyz 1^n | n >= 0 }, as we expect.
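The claim above can be checked on small cases with the following Python sketch. It is only an illustration, not how the analyzer works internally: it enumerates strings derivable from the grammar up to a bounded nesting depth and compares them with the outputs of a direct simulation of the PHP loop. The nonterminals $268$ and $293$ are renamed S and A for readability, and the helper names derive and run_program, as well as the bound MAX_N, are assumptions made for this example.

```python
from itertools import product

# Grammar from the analyzer output, with $268$ renamed S and $293$ renamed A.
PRODUCTIONS = {
    "S": [["A"], ["0", "S", "1"]],
    "A": [["abc"], ["xyz"]],
}

def derive(symbol, depth):
    """Return all terminal strings derivable from symbol
    using at most `depth` nested nonterminal expansions."""
    if symbol not in PRODUCTIONS:
        return {symbol}          # terminals stand for themselves
    if depth == 0:
        return set()             # expansion budget exhausted
    results = set()
    for rhs in PRODUCTIONS[symbol]:
        parts = [derive(s, depth - 1) for s in rhs]
        for combo in product(*parts):
            results.add("".join(combo))
    return results

def run_program(x, n):
    """Mirror the PHP loop: wrap x in "0" ... "1" n times."""
    for _ in range(n):
        x = "0" + x + "1"
    return x

MAX_N = 4  # hypothetical bound on the loop count for this check
outputs = {run_program(x, n) for x in ("abc", "xyz") for n in range(MAX_N + 1)}
language = derive("S", MAX_N + 2)

print(outputs <= language)  # True
print(sorted(language, key=lambda s: (len(s), s))[:4])
# ['abc', 'xyz', '0abc1', '0xyz1']
```

For this bound, every simulated output is derivable from the grammar. A bounded check like this only illustrates the correspondence for small n; it does not replace the analysis itself.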
Authors