|
token_get_all
Split given source into PHP tokens
(PHP 4 >= 4.2.0, PHP 5)
Example 2559. token_get_all() examples

bishop
You may want to know the line and column number at which a token begins (or ends). The tokenizer interface doesn't provide the column (and reports a line number, as the third array element, only from PHP 5.2.2 onward), so you have to track positions manually, like below:

<?php
function update_line_and_column_positions($c, &$line, &$col)
{
    // update line count
    $numNewLines = substr_count($c, "\n");
    if (1 <= $numNewLines) {
        // have new lines, add them in
        $line += $numNewLines;
        $col = 1;
        // skip to right past the last new line, as it won't affect the column position
        $c = substr($c, strrpos($c, "\n") + 1);
        if ($c === false) {
            $c = '';
        }
    }
    // update column count
    $col += strlen($c);
}
?>

Now use it, something like:

<?php
$tokens = token_get_all($source); // $source is assumed to hold the PHP code to scan
$line = 1;
$col = 1;
foreach ($tokens as $token) {
    // array tokens carry their text at index 1; single-character tokens are plain strings
    $text = is_array($token) ? $token[1] : $token;
    // at this point ($line, $col) is where $text begins
    update_line_and_column_positions($text, $line, $col);
    // now ($line, $col) is where the next token begins
}
?>

Note this assumes that your desired coordinate system is 1-based (e.g. (1,1) is the upper left). Zero-based is left as an exercise for the reader.

leon atkinson
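This function splits PHP code into tokens. Here's an example of its use.

<?php
// full open tag, so the sample tokenizes even when short_open_tag is off
$code = '<?php $a = 3; ?>';
foreach (token_get_all($code) as $c) {
    if (is_array($c)) {
        // array token: index 0 is the token id, index 1 is its text
        print(token_name($c[0]) . ": '" . htmlentities($c[1]) . "'\n");
    } else {
        // single-character token, returned as a plain string
        print("$c\n");
    }
}
?>

phpcomments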
Regarding bertrand at toggg dot com's comment: there is another use of the { } curly braces in PHP that token_get_all() treats just like a code block: the string index. Example:

<?php
$text = "Hello";
if ($text{ 0 } == 'H') {
    echo "This example uses { for both a PHP block and a string index.";
}
?>

Just in case some people were wondering. Since the tokenizer returns the same '{' token for both, parsing gets a little more interesting: you can't assume that { ... } is a code block, because it may just enclose an index into a string. (The curly-brace index syntax was deprecated in PHP 7.4 and removed in PHP 8.0, so this applies to code written for earlier versions.)

bertrand
If you want to retrieve PHP blocks, count up on each opening curly brace '{' and down on each closing one '}' (a counter of zero means the block is finished).

CAUTION: the opening curly brace token can take 3 values:
1) '{' for all PHP code blocks,
2) T_CURLY_OPEN for "protected" variables within strings, as in "{$var}",
3) T_DOLLAR_OPEN_CURLY_BRACES for the extended format "${var}".

On the other hand, the closing token is always '}'! So counting up must take place on all 3 tokens: '{', T_CURLY_OPEN and T_DOLLAR_OPEN_CURLY_BRACES.

Have fun with the PHP tokenizer! |
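The counting rule from the last note can be sketched roughly as follows. This is only an illustration: first_block_source() is a hypothetical helper name, not part of the tokenizer extension, and it simply returns the text of the first brace-delimited block it finds, assuming the source is well-formed.

```php
<?php
// Hypothetical helper: returns the text of the first { ... } block in $source,
// or null if no complete block is found.
function first_block_source($source)
{
    $depth = 0;
    $block = '';

    foreach (token_get_all($source) as $token) {
        // Array tokens are array(id, text, ...); single-character tokens are strings.
        $id   = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;

        // Count up on all three opening forms.
        if ($token === '{'
            || $id === T_CURLY_OPEN
            || $id === T_DOLLAR_OPEN_CURLY_BRACES
        ) {
            $depth++;
        }

        // Collect text only while inside a block.
        if ($depth > 0) {
            $block .= $text;
        }

        // The closing token is always the plain string '}'.
        if ($token === '}' && $depth > 0) {
            $depth--;
            if ($depth === 0) {
                return $block; // counter back at zero: block finished
            }
        }
    }

    return null;
}

// Usage: the sample is single-quoted so PHP does not interpolate it here.
$sample = '<?php function f($v) { if ($v) { return "a{$v}b"; } return "x${v}y"; } echo 1;';
echo first_block_source($sample), "\n";
```

Counting only the plain '{' would lose track here, since the two openings inside the strings would be missed while their closing '}' would still be counted.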