Lexer confusion about operators #3066

@ericmorand

Consider the following template:

{{in}}

When lexed, here is what is returned:

VAR_START_TYPE()
NAME_TYPE(in)
VAR_END_TYPE()
EOF_TYPE()

Now consider the following one:

{{in }}

When lexed, here is what is returned:

VAR_START_TYPE()
OPERATOR_TYPE(in)
VAR_END_TYPE()
EOF_TYPE()

As you can see, in the latter case, in is recognized as an operator, while in the former it is a name. The lexer is not able to distinguish an operator from a variable name. It is confused by formatting characters (in the second template, the space before the }}) that are not supposed to be relevant inside blocks:

{{ foo.bar }} is lexically identical to {{foo.bar}} in Twig, just as {% foo %} is lexically identical to {%foo%}.

More generally, the lexer is not robust when it comes to operators: it is hard to predict whether it will emit an operator token or a name token for a given word.

{% for in in in %} is tokenized into:

BLOCK_START_TYPE()
NAME_TYPE(for)
OPERATOR_TYPE(in)
OPERATOR_TYPE(in)
OPERATOR_TYPE(in)
BLOCK_END_TYPE()
EOF_TYPE()

Yet the first and last in are actually variable names; only the middle one is the operator.

{{ in.in }} is tokenized into:

VAR_START_TYPE()
NAME_TYPE(in)
PUNCTUATION_TYPE(.)
OPERATOR_TYPE(in)
VAR_END_TYPE()
EOF_TYPE()

The lexically identical template {{in.in}}, however, is tokenized into:

VAR_START_TYPE()
NAME_TYPE(in)
PUNCTUATION_TYPE(.)
NAME_TYPE(in)
VAR_END_TYPE()
EOF_TYPE()
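
The token streams above can be reproduced with a toy lexer. The following is a minimal sketch in Python, not Twig's actual code: it assumes (as a guess about the cause, not something confirmed here from the Twig source) that a word operator such as in is only matched when it is followed by whitespace or a parenthesis, and that operators are tried before names. The names lex, OPERATOR, NAME and PUNCTUATION are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical toy lexer for illustration only; this is NOT Twig's implementation.
# Assumption: the word operator "in" matches only when followed by whitespace
# or a parenthesis, and the operator pattern is tried before the name pattern.
OPERATOR = re.compile(r"in(?=[\s()])")
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
PUNCTUATION = re.compile(r"[.,()\[\]|:?]")

RULES = (
    ("OPERATOR_TYPE", OPERATOR),
    ("NAME_TYPE", NAME),
    ("PUNCTUATION_TYPE", PUNCTUATION),
)


def lex(template):
    """Tokenize a single {{ ... }} or {% ... %} construct."""
    if template.startswith("{{") and template.endswith("}}"):
        start, end = "VAR_START_TYPE", "VAR_END_TYPE"
    elif template.startswith("{%") and template.endswith("%}"):
        start, end = "BLOCK_START_TYPE", "BLOCK_END_TYPE"
    else:
        raise ValueError("expected a single {{ ... }} or {% ... %} construct")

    body = template[2:-2]
    tokens = [(start, "")]
    pos = 0
    while pos < len(body):
        if body[pos].isspace():
            pos += 1  # whitespace is skipped, yet it still affects the lookahead
            continue
        for token_type, pattern in RULES:
            match = pattern.match(body, pos)
            if match:
                tokens.append((token_type, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError(f"unexpected character {body[pos]!r}")
    tokens.append((end, ""))
    tokens.append(("EOF_TYPE", ""))
    return tokens


for template in ("{{in}}", "{{in }}", "{{ in.in }}", "{{in.in}}", "{% for in in in %}"):
    print(template, lex(template))
```

Under these assumptions, the whitespace that the lexer discards is also what decides whether in becomes an operator or a name, which is why templates that differ only in spacing yield different token types.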

I can't find the official lexical specification of the language. I assume it is an internal Symfony document, so I can't be sure whether this is the expected behavior. But from an external point of view, this makes the lexer less robust than one would expect from a syntactic analysis tool.