Documentation

JS extends Tokenizer
in package

Table of Contents

$endScopeTokens  : array<string|int, mixed>
A list of tokens that end the scope.
$ignoredLines  : array<string|int, mixed>
A list of lines being ignored due to error suppression comments.
$knownLengths  : array<int, int>
Known lengths of tokens.
$scopeOpeners  : array<string|int, mixed>
A list of tokens that are allowed to open a scope.
$commentTokens  : array<string|int, mixed>
A list of tokens that start and end comments.
$config  : Config
The config data for the run.
$eolChar  : string
The EOL char used in the content.
$numTokens  : int
The number of tokens in the tokens array.
$stringTokens  : array<string|int, mixed>
A list of string delimiters.
$tokens  : array<string|int, mixed>
A token-based representation of the content.
$tokenValues  : array<string|int, mixed>
A list of special JS tokens and their types.
__construct()  : void
Initialise the tokenizer.
getRegexToken()  : array<string, string>|null
Tokenizes a regular expression if one is found.
getTokens()  : array<string|int, mixed>
Gets the array of tokens.
processAdditional()  : void
Performs additional processing after main tokenizing.
replaceTabsInToken()  : void
Replaces tabs in original token content with spaces.
tokenize()  : array<string|int, mixed>
Creates an array of tokens when given some JS code.
isMinifiedContent()  : bool
Checks the content to see if it looks minified.
createLevelMap()  : void
Constructs the level map.
createParenthesisNestingMap()  : void
Creates a map for the parenthesis tokens that surround other tokens.
createPositionMap()  : void
Sets token position information.
createScopeMap()  : void
Creates a scope map of tokens that open scopes.
createTokenMap()  : void
Creates a map of bracket positions.
recurseScopeMap()  : int
Recurses through the scope openers to build a scope map.

Properties

$endScopeTokens

A list of tokens that end the scope.

public array<string|int, mixed> $endScopeTokens = [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET, T_BREAK => T_BREAK]

This array is a unique collection of the end tokens from the $scopeOpeners array. The data is duplicated here to save time while parsing the file.
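
The relationship described above can be sketched roughly as follows. This is an illustration only, not the tokenizer's own code: the table is trimmed, strings stand in for the T_* token constants so the snippet runs without the library loaded, and `collectEndTokens()` is a hypothetical helper name.

```php
<?php
// Hypothetical, trimmed-down scope-opener table. Strings stand in for the
// T_* token constants so the sketch runs on its own.
$scopeOpeners = [
    'T_IF'   => ['end' => ['T_CLOSE_CURLY_BRACKET' => 'T_CLOSE_CURLY_BRACKET']],
    'T_FOR'  => ['end' => ['T_CLOSE_CURLY_BRACKET' => 'T_CLOSE_CURLY_BRACKET']],
    'T_CASE' => ['end' => ['T_BREAK' => 'T_BREAK']],
];

// Hypothetical helper: gather every 'end' token into one de-duplicated array.
function collectEndTokens(array $scopeOpeners): array
{
    $endTokens = [];
    foreach ($scopeOpeners as $opener) {
        // Keys equal values, so the array union drops repeats automatically.
        $endTokens += $opener['end'];
    }
    return $endTokens;
}

$endScopeTokens = collectEndTokens($scopeOpeners);

// Membership is now a single isset() rather than a scan of every opener.
var_dump(isset($endScopeTokens['T_BREAK']));
```

Keying the array by the token itself is what makes the lookup cheap, which is presumably the time saving the description refers to.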

$ignoredLines

A list of lines being ignored due to error suppression comments.

public array<string|int, mixed> $ignoredLines = []

$knownLengths

Known lengths of tokens.

public array<int, int> $knownLengths = []

$scopeOpeners

A list of tokens that are allowed to open a scope.

public array<string|int, mixed> $scopeOpeners = [T_IF => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_TRY => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_CATCH => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_ELSE => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_FOR => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_CLASS => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_FUNCTION => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_WHILE => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => false, 'shared' => false, 'with' => []], T_DO => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_SWITCH => ['start' => [T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET], 'end' => [T_CLOSE_CURLY_BRACKET => T_CLOSE_CURLY_BRACKET], 'strict' => true, 'shared' => false, 'with' => []], T_CASE => ['start' => [T_COLON => T_COLON], 'end' => [T_BREAK => T_BREAK, T_RETURN => T_RETURN, T_CONTINUE => T_CONTINUE, T_THROW => T_THROW], 'strict' => true, 'shared' => true, 'with' => [T_DEFAULT => T_DEFAULT, T_CASE => T_CASE, T_SWITCH => T_SWITCH]], T_DEFAULT => ['start' => [T_COLON => T_COLON], 'end' => [T_BREAK => T_BREAK, T_RETURN => T_RETURN, T_CONTINUE => T_CONTINUE, T_THROW => T_THROW], 'strict' => true, 'shared' => true, 'with' => [T_CASE => T_CASE, T_SWITCH => T_SWITCH]]]

For each scope opener, this array also records which tokens open and close the scope, whether the token strictly requires an opener, whether the token can share a scope closer, and which tokens it can share that closer with. An example of a token that shares a scope closer is a CASE scope.
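
As a minimal sketch of how the 'shared' and 'with' entries might be read, consider the snippet below. It assumes string stand-ins for the T_* constants and a trimmed 'end' list for T_CASE, and `canShareCloser()` is a hypothetical helper, not a method of this class.

```php
<?php
// Two entries in the same shape as the table above. Strings replace the T_*
// constants (an assumption) so the sketch is self-contained.
$scopeOpeners = [
    'T_IF' => [
        'start'  => ['T_OPEN_CURLY_BRACKET' => 'T_OPEN_CURLY_BRACKET'],
        'end'    => ['T_CLOSE_CURLY_BRACKET' => 'T_CLOSE_CURLY_BRACKET'],
        'strict' => false,
        'shared' => false,
        'with'   => [],
    ],
    'T_CASE' => [
        'start'  => ['T_COLON' => 'T_COLON'],
        'end'    => ['T_BREAK' => 'T_BREAK', 'T_RETURN' => 'T_RETURN'],
        'strict' => true,
        'shared' => true,
        'with'   => ['T_DEFAULT' => 'T_DEFAULT', 'T_CASE' => 'T_CASE', 'T_SWITCH' => 'T_SWITCH'],
    ],
];

// Hypothetical helper: may $opener share its scope closer with $other?
function canShareCloser(array $scopeOpeners, string $opener, string $other): bool
{
    $info = $scopeOpeners[$opener] ?? null;
    if ($info === null || $info['shared'] === false) {
        // Unknown tokens and non-sharing openers never share a closer.
        return false;
    }
    return isset($info['with'][$other]);
}

var_dump(canShareCloser($scopeOpeners, 'T_CASE', 'T_DEFAULT'));
var_dump(canShareCloser($scopeOpeners, 'T_IF', 'T_ELSE'));
```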

$commentTokens

A list of tokens that start and end comments.

protected array<string|int, mixed> $commentTokens = ['//' => null, '/*' => '*/', '/**' => '*/']

$eolChar

The EOL char used in the content.

protected string $eolChar = []

$numTokens

The number of tokens in the tokens array.

protected int $numTokens = 0

$stringTokens

A list of string delimiters.

protected array<string|int, mixed> $stringTokens = ['\'' => '\'', '"' => '"']

$tokens

A token-based representation of the content.

protected array<string|int, mixed> $tokens = []

$tokenValues

A list of special JS tokens and their types.

protected array<string|int, mixed> $tokenValues = ['class' => 'T_CLASS', 'function' => 'T_FUNCTION', 'prototype' => 'T_PROTOTYPE', 'try' => 'T_TRY', 'catch' => 'T_CATCH', 'return' => 'T_RETURN', 'throw' => 'T_THROW', 'break' => 'T_BREAK', 'switch' => 'T_SWITCH', 'continue' => 'T_CONTINUE', 'if' => 'T_IF', 'else' => 'T_ELSE', 'do' => 'T_DO', 'while' => 'T_WHILE', 'for' => 'T_FOR', 'var' => 'T_VAR', 'case' => 'T_CASE', 'default' => 'T_DEFAULT', 'true' => 'T_TRUE', 'false' => 'T_FALSE', 'null' => 'T_NULL', 'this' => 'T_THIS', 'typeof' => 'T_TYPEOF', '(' => 'T_OPEN_PARENTHESIS', ')' => 'T_CLOSE_PARENTHESIS', '{' => 'T_OPEN_CURLY_BRACKET', '}' => 'T_CLOSE_CURLY_BRACKET', '[' => 'T_OPEN_SQUARE_BRACKET', ']' => 'T_CLOSE_SQUARE_BRACKET', '?' => 'T_INLINE_THEN', '.' => 'T_OBJECT_OPERATOR', '+' => 'T_PLUS', '-' => 'T_MINUS', '*' => 'T_MULTIPLY', '%' => 'T_MODULUS', '/' => 'T_DIVIDE', '^' => 'T_LOGICAL_XOR', ',' => 'T_COMMA', ';' => 'T_SEMICOLON', ':' => 'T_COLON', '<' => 'T_LESS_THAN', '>' => 'T_GREATER_THAN', '<<' => 'T_SL', '>>' => 'T_SR', '>>>' => 'T_ZSR', '<<=' => 'T_SL_EQUAL', '>>=' => 'T_SR_EQUAL', '>>>=' => 'T_ZSR_EQUAL', '<=' => 'T_IS_SMALLER_OR_EQUAL', '>=' => 'T_IS_GREATER_OR_EQUAL', '=>' => 'T_DOUBLE_ARROW', '!' => 'T_BOOLEAN_NOT', '||' => 'T_BOOLEAN_OR', '&&' => 'T_BOOLEAN_AND', '|' => 'T_BITWISE_OR', '&' => 'T_BITWISE_AND', '!=' => 'T_IS_NOT_EQUAL', '!==' => 'T_IS_NOT_IDENTICAL', '=' => 'T_EQUAL', '==' => 'T_IS_EQUAL', '===' => 'T_IS_IDENTICAL', '-=' => 'T_MINUS_EQUAL', '+=' => 'T_PLUS_EQUAL', '*=' => 'T_MUL_EQUAL', '/=' => 'T_DIV_EQUAL', '%=' => 'T_MOD_EQUAL', '++' => 'T_INC', '--' => 'T_DEC', '//' => 'T_COMMENT', '/*' => 'T_COMMENT', '/**' => 'T_DOC_COMMENT', '*/' => 'T_COMMENT']
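
Because several keys in this table are prefixes of longer keys (for example '>', '>>', '>>>' and '>>>='), a consumer has to prefer the longest match. The sketch below shows one way to do that over an excerpt of the table. This page does not describe the tokenizer's actual matching strategy, so `longestMatch()` is a hypothetical helper and should be read as an assumption.

```php
<?php
// A small excerpt of the table above.
$tokenValues = [
    '>'    => 'T_GREATER_THAN',
    '>>'   => 'T_SR',
    '>>>'  => 'T_ZSR',
    '>>>=' => 'T_ZSR_EQUAL',
    '>='   => 'T_IS_GREATER_OR_EQUAL',
    '='    => 'T_EQUAL',
];

// Hypothetical helper: return the longest table key found at $offset in $code,
// or null when nothing in the table matches there.
function longestMatch(array $tokenValues, string $code, int $offset): ?array
{
    $maxLen = max(array_map('strlen', array_keys($tokenValues)));
    for ($len = $maxLen; $len >= 1; $len--) {
        $candidate = (string) substr($code, $offset, $len);
        // Skip candidates cut short by the end of the input.
        if (strlen($candidate) === $len && isset($tokenValues[$candidate])) {
            return ['type' => $tokenValues[$candidate], 'content' => $candidate];
        }
    }
    return null;
}

print_r(longestMatch($tokenValues, 'a >>>= b', 2));
```

Trying the longest candidate first keeps '>>>=' from being split into '>' followed by other tokens.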

Methods

__construct()

Initialise the tokenizer.

public __construct(string $content, Config $config[, string $eolChar = '\n' ]) : void

Pre-checks the content to see if it looks minified.

Parameters
$content : string

The content to tokenize.

$config : Config

The config data for the run.

$eolChar : string = '\n'

The EOL char used in the content.

Tags
throws
TokenizerException

If the file appears to be minified.

Return values
void

getRegexToken()

Tokenizes a regular expression if one is found.

public getRegexToken(string $char, string $string, string $chars, string $tokens) : array<string, string>|null

If a regular expression is not found, NULL is returned.

Parameters
$char : string

The index of the possible regex start character.

$string : string

The complete content of the string being tokenized.

$chars : string

An array of characters being tokenized.

$tokens : string

The current array of tokens found in the string.

Return values
array<string, string>|null

getTokens()

Gets the array of tokens.

public getTokens() : array<string|int, mixed>
Return values
array<string|int, mixed>

processAdditional()

Performs additional processing after main tokenizing.

public processAdditional() : void

This additional processing looks for properties, closures, labels and objects.

Return values
void

replaceTabsInToken()

Replaces tabs in original token content with spaces.

public replaceTabsInToken(array<string|int, mixed> &$token[, string $prefix = ' ' ][, string $padding = ' ' ][, int $tabWidth = null ]) : void

Each tab can represent between 1 and $config->tabWidth spaces, so this cannot be a straight string replace. The original content is placed into an orig_content index and the new token length is also set in the length index.

Parameters
$token : array<string|int, mixed>

The token to replace tabs inside.

$prefix : string = ' '

The character to use to represent the start of a tab.

$padding : string = ' '

The character to use to represent the end of a tab.

$tabWidth : int = null

The number of spaces each tab represents.

Return values
void
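
The reason a straight string replace does not work can be shown with a small sketch. It assumes a token array with 'content' and a 1-based 'column' index, ignores the $prefix and $padding options (both default to a space), and counts bytes rather than multi-byte characters. `expandTabs()` is a hypothetical helper, not part of this class.

```php
<?php
// Hypothetical helper: expand tabs in $content, which begins at 1-based
// column $column, so that every tab advances to the next tab stop.
function expandTabs(string $content, int $column, int $tabWidth = 4): string
{
    $result = '';
    $length = strlen($content);
    for ($i = 0; $i < $length; $i++) {
        if ($content[$i] !== "\t") {
            $result .= $content[$i];
            $column++;
            continue;
        }
        // Spaces needed to reach the next tab stop: between 1 and $tabWidth.
        $spaces  = $tabWidth - (($column - 1) % $tabWidth);
        $result .= str_repeat(' ', $spaces);
        $column += $spaces;
    }
    return $result;
}

// Mirror the bookkeeping described above on a stand-in token.
$token = ['content' => "a\tb", 'column' => 1];
$token['orig_content'] = $token['content'];
$token['content']      = expandTabs($token['content'], $token['column']);
$token['length']       = strlen($token['content']);
```

The width of each tab depends on the column at which it starts, so the same tab character can expand to a different number of spaces in different tokens.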

tokenize()

Creates an array of tokens when given some JS code.

public tokenize(string $string) : array<string|int, mixed>
Parameters
$string : string

The string to tokenize.

Return values
array<string|int, mixed>

isMinifiedContent()

Checks the content to see if it looks minified.

protected isMinifiedContent(string $content[, string $eolChar = '\n' ]) : bool
Parameters
$content : string

The content to tokenize.

$eolChar : string = '\n'

The EOL char used in the content.

Return values
bool
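
This page does not say how the check decides that content "looks minified". One plausible heuristic, given here purely as an assumption, is to compare the average line length against a threshold. `looksMinified()` and the threshold of 100 are hypothetical.

```php
<?php
// Hypothetical heuristic: treat content as minified when its average line
// length exceeds $threshold characters. The default of 100 is an assumption.
function looksMinified(string $content, string $eolChar = "\n", int $threshold = 100): bool
{
    $numChars = strlen($content);
    // A file with no EOL characters still counts as one line.
    $numLines = substr_count($content, $eolChar) + 1;
    return ($numChars / $numLines) > $threshold;
}

var_dump(looksMinified("var a = 1;\nvar b = 2;\n"));
var_dump(looksMinified(str_repeat('a=1;', 200)));
```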

createLevelMap()

Constructs the level map.

private createLevelMap() : void

The level map adds a 'level' index to each token, which indicates the depth of the token within a set of scope blocks. It also adds a 'conditions' index, which is an array of the scope conditions that opened each of the scopes, position 0 being the first scope opener.

Return values
void
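
The effect of the 'level' and 'conditions' indexes can be sketched on a simplified token list. The input shape here is an assumption: only braces carry scope data, each opening brace has a hypothetical 'condition' index pointing at its owning keyword, and strings stand in for the T_* constants. `addLevels()` is not part of this class.

```php
<?php
// Hypothetical, simplified tokens for: if (x) { for (;;) { y; } }
$tokens = [
    0 => ['code' => 'T_IF'],
    1 => ['code' => 'T_OPEN_CURLY_BRACKET', 'condition' => 0],
    2 => ['code' => 'T_FOR'],
    3 => ['code' => 'T_OPEN_CURLY_BRACKET', 'condition' => 2],
    4 => ['code' => 'T_STRING'],
    5 => ['code' => 'T_CLOSE_CURLY_BRACKET'],
    6 => ['code' => 'T_CLOSE_CURLY_BRACKET'],
];

// Hypothetical helper: annotate every token with its depth and the list of
// scope conditions (keyword index => keyword code) that enclose it.
function addLevels(array $tokens): array
{
    $conditions = [];
    foreach ($tokens as $i => $token) {
        if ($token['code'] === 'T_CLOSE_CURLY_BRACKET') {
            // Leaving a scope: the closing brace sits at the outer level.
            array_pop($conditions);
        }
        $tokens[$i]['level']      = count($conditions);
        $tokens[$i]['conditions'] = $conditions;
        if ($token['code'] === 'T_OPEN_CURLY_BRACKET') {
            // Entering a scope: tokens after the brace are one level deeper.
            $owner              = $token['condition'];
            $conditions[$owner] = $tokens[$owner]['code'];
        }
    }
    return $tokens;
}

$mapped = addLevels($tokens);
print_r($mapped[4]);
```

In this sketch the innermost token ends up at level 2, with the IF condition listed before the FOR condition.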

createParenthesisNestingMap()

Creates a map for the parenthesis tokens that surround other tokens.

private createParenthesisNestingMap() : void
Return values
void

createPositionMap()

Sets token position information.

private createPositionMap() : void

Can also convert tabs into spaces. Each tab can represent between 1 and $width spaces, so this cannot be a straight string replace.

Return values
void

createScopeMap()

Creates a scope map of tokens that open scopes.

private createScopeMap() : void
Tags
see
recurseScopeMap()
Return values
void

createTokenMap()

Creates a map of bracket positions.

private createTokenMap() : void
Return values
void
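
A bracket map of this kind is typically built with a stack. The sketch below illustrates the idea on stand-in tokens that carry only a 'content' index. The 'bracket_opener' and 'bracket_closer' index names and the `mapBrackets()` helper are assumptions made for illustration; this page does not list the indexes the method writes.

```php
<?php
// Hypothetical helper: pair up bracket tokens using a stack and record the
// position of both partners on each token of the pair.
function mapBrackets(array $tokens): array
{
    $pairs = ['{' => '}', '[' => ']'];
    $stack = [];  // Indexes of opening brackets still waiting for a closer.
    foreach ($tokens as $i => $token) {
        $content = $token['content'];
        if (isset($pairs[$content])) {
            $stack[] = $i;
            continue;
        }
        if (in_array($content, $pairs, true) === false || $stack === []) {
            continue;
        }
        $opener = array_pop($stack);
        if ($pairs[$tokens[$opener]['content']] !== $content) {
            // Mismatched bracket: restore the opener and leave both unmapped.
            $stack[] = $opener;
            continue;
        }
        foreach ([$opener, $i] as $index) {
            $tokens[$index]['bracket_opener'] = $opener;
            $tokens[$index]['bracket_closer'] = $i;
        }
    }
    return $tokens;
}

$tokens = array_map(
    fn (string $c): array => ['content' => $c],
    ['a', '[', '{', '}', ']']
);
$mapped = mapBrackets($tokens);
print_r($mapped[1]);
```

Storing both positions on both tokens lets later passes jump from either end of a pair to the other without rescanning.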

recurseScopeMap()

Recurses through the scope openers to build a scope map.

private recurseScopeMap(int $stackPtr[, int $depth = 1 ], int &$ignore) : int
Parameters
$stackPtr : int

The position in the stack of the token that opened the scope (e.g. an IF token or FOR token).

$depth : int = 1

How many scope levels down we are.

$ignore : int

How many curly braces we are ignoring.

Tags
throws
TokenizerException

If the nesting level gets too deep.

Return values
int

The position in the stack that closed the scope.
