Documentation

JS extends Tokenizer in package Application

$endScopeTokens : array<string|int, mixed>: A list of tokens that end the scope.
$ignoredLines : array<string|int, mixed>: A list of lines being ignored due to error suppression comments.
$knownLengths : array<int, int>: Known lengths of tokens.
$scopeOpeners : array<string|int, mixed>: A list of tokens that are allowed to open a scope.
$commentTokens : array<string|int, mixed>: A list of tokens that start and end comments.
$config : Config: The config data for the run.
$eolChar : string: The EOL char used in the content.
$numTokens : int: The number of tokens in the tokens array.
$stringTokens : array<string|int, mixed>: A list of string delimiters.
$tokens : array<string|int, mixed>: A token-based representation of the content.
$tokenValues : array<string|int, mixed>: A list of special JS tokens and their types.
__construct() : void: Initialise the tokenizer.
getRegexToken() : array<string, string>|null: Tokenizes a regular expression if one is found.
getTokens() : array<string|int, mixed>: Gets the array of tokens.
processAdditional() : void: Performs additional processing after main tokenizing.
replaceTabsInToken() : void: Replaces tabs in original token content with spaces.
tokenize() : array<string|int, mixed>: Creates an array of tokens when given some JS code.
isMinifiedContent() : bool: Checks the content to see if it looks minified.
createLevelMap() : void: Constructs the level map.
createParenthesisNestingMap() : void: Creates a map for the parenthesis tokens that surround other tokens.
createPositionMap() : void: Sets token position information.
createScopeMap() : void: Creates a scope map of tokens that open scopes.
createTokenMap() : void: Creates a map of brackets positions.
recurseScopeMap() : int: Recurses through the scope openers to build a scope map.

$endScopeTokens

A list of tokens that end the scope.


    public
        array<string|int, mixed>
    $endScopeTokens
     = [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET, T_BREAK => T_BREAK]

This array is a unique collection of the end tokens from the $scopeOpeners array. The data is duplicated here to save time while the file is being parsed.
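The relationship between the two arrays can be illustrated with a minimal sketch. It assumes a trimmed-down $scopeOpeners array in which plain strings stand in for the token constants; collectEndScopeTokens() is a hypothetical helper for illustration, not part of the class.

```php
<?php
// Hypothetical stand-ins: the real keys and values are token constants.
$scopeOpeners = [
    'T_IF'   => ['end' => ['T_CLOSE_CURLY_BRACKET' => 'T_CLOSE_CURLY_BRACKET']],
    'T_FOR'  => ['end' => ['T_CLOSE_CURLY_BRACKET' => 'T_CLOSE_CURLY_BRACKET']],
    'T_CASE' => ['end' => ['T_BREAK' => 'T_BREAK']],
];

/**
 * Collects the unique end tokens of every scope opener (hypothetical helper).
 */
function collectEndScopeTokens(array $scopeOpeners): array
{
    $endTokens = [];
    foreach ($scopeOpeners as $definition) {
        // Keys equal values, so the array union keeps one entry per end token.
        $endTokens += $definition['end'];
    }

    return $endTokens;
}

$endScopeTokens = collectEndScopeTokens($scopeOpeners);

// A single isset() now answers "does this token close any scope?"
// without walking every scope opener definition again.
$closesScope = isset($endScopeTokens['T_BREAK']);
```

Keeping the flattened copy trades a little memory for a constant-time lookup during parsing, which is the saving the description refers to.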

$ignoredLines

A list of lines being ignored due to error suppression comments.


    public
        array<string|int, mixed>
    $ignoredLines
     = []

$knownLengths

Known lengths of tokens.


    public
        array<int, int>
    $knownLengths
     = []

$scopeOpeners

A list of tokens that are allowed to open a scope.


    public
        array<string|int, mixed>
    $scopeOpeners
     = [T_IF => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_TRY => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_CATCH => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_ELSE => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_FOR => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_CLASS => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_FUNCTION => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_WHILE => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_DO => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_SWITCH => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_CASE => ['start' => [T_COLON => T_COLON], 'end' => [T_BREAK => T_BREAK, T_RETURN => T_RETURN, T_CONTINUE => T_CONTINUE, T_THROW => T_THROW], 'strict' => true, 'shared' => true, 'with' => [T_DEFAULT => T_DEFAULT, T_CASE => T_CASE, T_SWITCH => T_SWITCH]], T_DEFAULT => ['start' => [T_COLON => T_COLON], 'end' => [T_BREAK => T_BREAK, T_RETURN => T_RETURN, T_CONTINUE => T_CONTINUE, T_THROW => T_THROW], 'strict' => true, 'shared' => true, 'with' => [T_CASE => T_CASE, T_SWITCH => T_SWITCH]]]

This array also records which tokens open and close the scope ('start' and 'end'), whether the token strictly requires an opener ('strict'), whether the token can share a scope closer ('shared'), and which tokens it can share that closer with ('with'). An example of a token that shares a scope closer is a CASE scope.
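As a rough illustration of how an entry might be read, the sketch below copies two of the documented entries, with strings standing in for the token constants. canShareCloser() is a hypothetical helper written for this example only; the class itself may consult these fields differently.

```php
<?php
// Two documented entries, with strings standing in for token constants.
$scopeOpeners = [
    'T_IF' => [
        'start'  => ['T_OPEN_CURLY_BRACKET' => 'T_OPEN_CURLY_BRACKET'],
        'end'    => ['T_CLOSE_CURLY_BRACKET' => 'T_CLOSE_CURLY_BRACKET'],
        'strict' => false,
        'shared' => false,
        'with'   => [],
    ],
    'T_CASE' => [
        'start'  => ['T_COLON' => 'T_COLON'],
        'end'    => [
            'T_BREAK'    => 'T_BREAK',
            'T_RETURN'   => 'T_RETURN',
            'T_CONTINUE' => 'T_CONTINUE',
            'T_THROW'    => 'T_THROW',
        ],
        'strict' => true,
        'shared' => true,
        'with'   => [
            'T_DEFAULT' => 'T_DEFAULT',
            'T_CASE'    => 'T_CASE',
            'T_SWITCH'  => 'T_SWITCH',
        ],
    ],
];

/**
 * Reports whether $opener may share its scope closer with $other
 * (hypothetical helper, for illustration only).
 */
function canShareCloser(array $scopeOpeners, string $opener, string $other): bool
{
    $definition = $scopeOpeners[$opener] ?? null;
    if ($definition === null || $definition['shared'] === false) {
        return false;
    }

    // 'with' is keyed by token, so membership is a plain isset().
    return isset($definition['with'][$other]);
}
```

With this data, a CASE scope can share its closer with a DEFAULT scope, while an IF scope shares with nothing.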

$commentTokens

A list of tokens that start and end comments.


    protected
        array<string|int, mixed>
    $commentTokens
     = ['//' => null, '/*' => '*/', '/**' => '*/']

$config

The config data for the run.


    protected
        Config
    $config
     = null

$eolChar

The EOL char used in the content.


    protected
        string
    $eolChar
     = []

$numTokens

The number of tokens in the tokens array.


    protected
        int
    $numTokens
     = 0

$stringTokens

A list of string delimiters.


    protected
        array<string|int, mixed>
    $stringTokens
     = ['\'' => '\'', '"' => '"']

$tokens

A token-based representation of the content.


    protected
        array<string|int, mixed>
    $tokens
     = []

$tokenValues

A list of special JS tokens and their types.


    protected
        array<string|int, mixed>
    $tokenValues
     = ['class' => 'T_CLASS', 'function' => 'T_FUNCTION', 'prototype' => 'T_PROTOTYPE', 'try' => 'T_TRY', 'catch' => 'T_CATCH', 'return' => 'T_RETURN', 'throw' => 'T_THROW', 'break' => 'T_BREAK', 'switch' => 'T_SWITCH', 'continue' => 'T_CONTINUE', 'if' => 'T_IF', 'else' => 'T_ELSE', 'do' => 'T_DO', 'while' => 'T_WHILE', 'for' => 'T_FOR', 'var' => 'T_VAR', 'case' => 'T_CASE', 'default' => 'T_DEFAULT', 'true' => 'T_TRUE', 'false' => 'T_FALSE', 'null' => 'T_NULL', 'this' => 'T_THIS', 'typeof' => 'T_TYPEOF', '(' => 'T_OPEN_PARENTHESIS', ')' => 'T_CLOSE_PARENTHESIS', '{' => 'T_OPEN_CURLY_BRACKET', '}' => 'T_CLOSE_CURLY_BRACKET', '[' => 'T_OPEN_SQUARE_BRACKET', ']' => 'T_CLOSE_SQUARE_BRACKET', '?' => 'T_INLINE_THEN', '.' => 'T_OBJECT_OPERATOR', '+' => 'T_PLUS', '-' => 'T_MINUS', '*' => 'T_MULTIPLY', '%' => 'T_MODULUS', '/' => 'T_DIVIDE', '^' => 'T_LOGICAL_XOR', ',' => 'T_COMMA', ';' => 'T_SEMICOLON', ':' => 'T_COLON', '<' => 'T_LESS_THAN', '>' => 'T_GREATER_THAN', '<<' => 'T_SL', '>>' => 'T_SR', '>>>' => 'T_ZSR', '<<=' => 'T_SL_EQUAL', '>>=' => 'T_SR_EQUAL', '>>>=' => 'T_ZSR_EQUAL', '<=' => 'T_IS_SMALLER_OR_EQUAL', '>=' => 'T_IS_GREATER_OR_EQUAL', '=>' => 'T_DOUBLE_ARROW', '!' => 'T_BOOLEAN_NOT', '||' => 'T_BOOLEAN_OR', '&&' => 'T_BOOLEAN_AND', '|' => 'T_BITWISE_OR', '&' => 'T_BITWISE_AND', '!=' => 'T_IS_NOT_EQUAL', '!==' => 'T_IS_NOT_IDENTICAL', '=' => 'T_EQUAL', '==' => 'T_IS_EQUAL', '===' => 'T_IS_IDENTICAL', '-=' => 'T_MINUS_EQUAL', '+=' => 'T_PLUS_EQUAL', '*=' => 'T_MUL_EQUAL', '/=' => 'T_DIV_EQUAL', '%=' => 'T_MOD_EQUAL', '++' => 'T_INC', '--' => 'T_DEC', '//' => 'T_COMMENT', '/*' => 'T_COMMENT', '/**' => 'T_DOC_COMMENT', '*/' => 'T_COMMENT']
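Several keys in this map are prefixes of longer keys ('>' , '>>', '>>>', '>>>='), so a lookup has to prefer the longest match. The sketch below shows one way to do that over a subset of the map. matchLongestToken() and its return shape are assumptions made for illustration; this is not presented as the algorithm tokenize() uses.

```php
<?php
// Subset of the documented $tokenValues map.
$tokenValues = [
    '>'    => 'T_GREATER_THAN',
    '>>'   => 'T_SR',
    '>>>'  => 'T_ZSR',
    '>>>=' => 'T_ZSR_EQUAL',
    '>='   => 'T_IS_GREATER_OR_EQUAL',
    '='    => 'T_EQUAL',
];

/**
 * Returns the longest known token starting at $offset, or null if none
 * matches (hypothetical helper, for illustration only).
 */
function matchLongestToken(string $source, int $offset, array $tokenValues): ?array
{
    // The longest key bounds how far ahead we ever need to look.
    $maxLength = max(array_map('strlen', array_keys($tokenValues)));

    for ($length = $maxLength; $length >= 1; $length--) {
        $candidate = substr($source, $offset, $length);

        // Skip candidates cut short by the end of the input.
        if (strlen($candidate) === $length && isset($tokenValues[$candidate])) {
            return ['type' => $tokenValues[$candidate], 'content' => $candidate];
        }
    }

    return null;
}
```

Trying the longest candidate first means '>>>=' is reported as a single T_ZSR_EQUAL rather than as T_GREATER_THAN followed by further tokens.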

__construct()

Initialise the tokenizer.


    public
                __construct(string $content, Config $config[, string $eolChar = '\n' ]) : void

Pre-checks the content to see if it looks minified.

Parameters

$content : string: The content to tokenize.
$config : Config: The config data for the run.
$eolChar : string = '\n': The EOL char used in the content.

Return values

void —

getRegexToken()

Tokenizes a regular expression if one is found.


    public
                getRegexToken(string $char, string $string, string $chars, string $tokens) : array<string, string>|null

If a regular expression is not found, NULL is returned.

Parameters

$char : string: The index of the possible regex start character.
$string : string: The complete content of the string being tokenized.
$chars : string: An array of characters being tokenized.
$tokens : string: The current array of tokens found in the string.

Return values

array<string, string>|null —

getTokens()

Gets the array of tokens.


    public
                getTokens() : array<string|int, mixed>

Return values

array<string|int, mixed> —

processAdditional()

Performs additional processing after main tokenizing.


    public
                processAdditional() : void

This additional processing looks for properties, closures, labels and objects.

Return values

void —

replaceTabsInToken()

Replaces tabs in original token content with spaces.


    public
                replaceTabsInToken(array<string|int, mixed> &$token[, string $prefix = ' ' ][, string $padding = ' ' ][, int $tabWidth = null ]) : void

Each tab can represent between 1 and $config->tabWidth spaces, so this cannot be a straight string replace. The original content is placed into an orig_content index and the new token length is also set in the length index.

Parameters

$token : array<string|int, mixed>: The token to replace tabs inside.
$prefix : string = ' ': The character to use to represent the start of a tab.
$padding : string = ' ': The character to use to represent the end of a tab.
$tabWidth : int = null: The number of spaces each tab represents.

Return values

void —
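The reason a tab cannot be replaced with a fixed number of spaces is that its width depends on the column where it starts. The following sketch expands tabs to the next tab stop and then fills the orig_content and length indexes described above. expandTabs() is a hypothetical helper, it ignores the $prefix and $padding options, and it assumes single-byte characters.

```php
<?php
/**
 * Expands tabs to the next tab stop (hypothetical helper, not the
 * library method).
 *
 * @param string $content  Text that may contain tab characters.
 * @param int    $tabWidth Number of columns between tab stops.
 * @param int    $column   Zero-based column at which $content starts.
 */
function expandTabs(string $content, int $tabWidth, int $column = 0): string
{
    $result = '';
    $length = strlen($content);

    for ($i = 0; $i < $length; $i++) {
        if ($content[$i] !== "\t") {
            $result .= $content[$i];
            $column++;
            continue;
        }

        // A tab fills the gap to the next tab stop, so it is worth
        // between 1 and $tabWidth spaces depending on the column.
        $spaces  = $tabWidth - ($column % $tabWidth);
        $result .= str_repeat(' ', $spaces);
        $column += $spaces;
    }

    return $result;
}

// Mirror the documented side effects on a simplified token array.
$token                 = ['content' => "a\tbc\td"];
$token['orig_content'] = $token['content'];
$token['content']      = expandTabs($token['content'], 4);
$token['length']       = strlen($token['content']);
```

With a tab width of 4, the first tab above expands to three spaces and the second to two, which is why the two tabs cannot share one replacement string.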

tokenize()

Creates an array of tokens when given some JS code.


    public
                tokenize(string $string) : array<string|int, mixed>

Parameters

$string : string: The string to tokenize.

Return values

array<string|int, mixed> —

isMinifiedContent()

Checks the content to see if it looks minified.


    protected
                isMinifiedContent(string $content[, string $eolChar = '\n' ]) : bool

Parameters

$content : string: The content to tokenize.
$eolChar : string = '\n': The EOL char used in the content.

Return values

bool —
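This page does not describe the heuristic behind the check. One plausible approach, sketched below, is to compare the average line length against a threshold, since minified files tend to pack very many characters into very few lines. The function name looksMinified() and the threshold of 100 are assumptions for illustration, not documented behaviour.

```php
<?php
/**
 * Guesses whether content is minified from its average line length
 * (hypothetical sketch; name and threshold are assumptions).
 */
function looksMinified(string $content, string $eolChar = "\n", int $threshold = 100): bool
{
    $numChars = strlen($content);

    // A file with no EOL characters still counts as one line.
    $numLines = substr_count($content, $eolChar) + 1;

    return ($numChars / $numLines) > $threshold;
}
```

A cheap pre-check like this lets a caller skip tokenizing files that would be slow to process and unlikely to produce useful results.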

createLevelMap()

Constructs the level map.


    private
                createLevelMap() : void

The level map adds a 'level' index to each token, which indicates the depth of the token within a set of scope blocks. It also adds a 'conditions' index, which is an array of the scope conditions that opened each of the scopes, with position 0 being the first scope opener.

Return values

void —
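To make the 'level' and 'conditions' indexes concrete, here is a minimal sketch over a simplified token stream. The 'opener' index, the string token codes and the addLevels() helper are all assumptions for illustration; the sketch also assumes that braces sit at the level of the scope that contains them, which may differ from what the private method does.

```php
<?php
// Simplified stream for: if (x) { for (;;) { y; } }
// 'opener' (hypothetical) points at the keyword that owns each brace.
$tokens = [
    0 => ['code' => 'T_IF'],
    1 => ['code' => 'T_OPEN_CURLY_BRACKET', 'opener' => 0],
    2 => ['code' => 'T_FOR'],
    3 => ['code' => 'T_OPEN_CURLY_BRACKET', 'opener' => 2],
    4 => ['code' => 'T_STRING'],
    5 => ['code' => 'T_CLOSE_CURLY_BRACKET'],
    6 => ['code' => 'T_CLOSE_CURLY_BRACKET'],
];

/**
 * Adds 'level' and 'conditions' to every token (hypothetical helper).
 */
function addLevels(array $tokens): array
{
    $conditions = [];

    foreach ($tokens as $ptr => $token) {
        if ($token['code'] === 'T_CLOSE_CURLY_BRACKET') {
            // Leave the scope before annotating the closing brace.
            array_pop($conditions);
        }

        $tokens[$ptr]['level']      = count($conditions);
        $tokens[$ptr]['conditions'] = $conditions;

        if ($token['code'] === 'T_OPEN_CURLY_BRACKET') {
            // Enter the scope after annotating the opening brace;
            // the key is the position of the owning keyword.
            $ownerPtr              = $token['opener'];
            $conditions[$ownerPtr] = $tokens[$ownerPtr]['code'];
        }
    }

    return $tokens;
}

$mapped = addLevels($tokens);
```

In this example the token at position 4 ends up with level 2 and conditions [0 => 'T_IF', 2 => 'T_FOR'], so a consumer can tell both how deep the token is and which constructs enclose it.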

createParenthesisNestingMap()

Creates a map for the parenthesis tokens that surround other tokens.


    private
                createParenthesisNestingMap() : void

Return values

void —

createPositionMap()

Sets token position information.


    private
                createPositionMap() : void

Can also convert tabs into spaces. Each tab can represent between 1 and $width spaces, so this cannot be a straight string replace.

Return values

void —

createScopeMap()

Creates a scope map of tokens that open scopes.


    private
                createScopeMap() : void

Return values

void —

createTokenMap()

Creates a map of brackets positions.


    private
                createTokenMap() : void

Return values

void —

recurseScopeMap()

Recurses through the scope openers to build a scope map.


    private
                recurseScopeMap(int $stackPtr[, int $depth = 1 ], int &$ignore) : int

Parameters

$stackPtr : int: The position in the stack of the token that opened the scope (e.g. an IF token or a FOR token).
$depth : int = 1: How many scope levels down we are.
$ignore : int: How many curly braces we are ignoring.

Return values

int —

The position in the stack that closed the scope.
