Module tokens


A Token is – conceptually – either a control sequence, or a pair of a character and a CategoryCode. In practice, we use CommandCode instead, which omits “impossible” codes (e.g. Invalid or Comment) and adds internal ones (e.g. Primitive or Argument).

The “canonical” way to represent a Token is StandardToken, an enum with two variants. However, Tokens are read, processed, inspected, passed around, stored and retrieved extremely often, including in the most performance-critical parts of the engine, so their representation matters. In particular, we want them to be small and ideally Copy, which rules out representing control sequences as strings; hence the generic CSName type, and CompactToken as a significantly more efficient representation.
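
To illustrate why the choice of control sequence name type matters, here is a minimal sketch of a two-variant token enum that is generic over the name type. It is not the crate's actual definition: `SketchToken`, `SketchCode`, `StringToken` and `SmallToken` are hypothetical names, and the code list is a small made-up subset.

```rust
/// Hypothetical, heavily reduced stand-in for the engine's command codes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SketchCode {
    BeginGroup,
    EndGroup,
    Letter,
    Other,
    Space,
}

/// A two-variant token, generic over the control sequence name type `CS`
/// and the character type `C` (an assumption made for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SketchToken<CS, C> {
    /// A control sequence such as `\foo`, identified by its name.
    ControlSequence(CS),
    /// A character together with its code.
    Character { char: C, code: SketchCode },
}

/// Names stored as `String`: simple, but heap-allocated and not `Copy`.
pub type StringToken = SketchToken<String, char>;

/// Names stored as a small integer id (e.g. from an interner): `Copy`.
pub type SmallToken = SketchToken<u32, u8>;

/// Compiles only when `T` is `Copy`; used to check the claim above.
pub fn assert_copy<T: Copy>() {}
```

The derived `Copy` implementation only applies when both type parameters are `Copy`, so `SmallToken` is `Copy` while `StringToken` is not, and every clone of a `StringToken` control sequence clones its `String`.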

Modules§

control_sequences
A control sequence is a Token that (usually) has the form \foo. We can simply use strings to represent them, but there is room for optimization, e.g. by interning them, which requires some additional infrastructure to intern and resolve the names.
token_lists
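
The interning idea mentioned for control_sequences can be sketched as follows. This is a generic string interner written for illustration, not the infrastructure this crate provides; `NameId` and `Interner` are hypothetical names.

```rust
use std::collections::HashMap;

/// Small, `Copy` handle standing in for a control sequence name.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NameId(u32);

/// Maps names to ids and back. Each distinct name is stored once
/// in `names`; `ids` gives the reverse lookup.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, NameId>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the id for `name`, allocating a new one on first use.
    pub fn intern(&mut self, name: &str) -> NameId {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = NameId(self.names.len() as u32);
        self.names.push(name.to_owned());
        self.ids.insert(name.to_owned(), id);
        id
    }

    /// Resolves an id back to its name, if it came from this interner.
    pub fn resolve(&self, id: NameId) -> Option<&str> {
        self.names.get(id.0 as usize).map(String::as_str)
    }
}
```

Comparing two interned names is then a single integer comparison, at the cost of needing access to the interner whenever a name has to be displayed.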

Structs§

CompactToken
A compact representation of a Token with Char==u8 and CS==InternedCSName as a single u32 (similar to the way plain TeX does it) – i.e. it is small and Copy, which yields a significant performance improvement in the most performance-critical parts of the code.
DisplayToken
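
One way to fit either a control sequence id or a character/code pair into a single u32 is sketched below. The bit layout is invented for this example (the top byte is a tag, where 0 means "control sequence"); the description above does not specify the layout CompactToken actually uses, and `PackedToken` and `Unpacked` are hypothetical names.

```rust
/// A token packed into 32 bits (hypothetical layout, see lead-in).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PackedToken(u32);

/// The unpacked view of a `PackedToken`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Unpacked {
    /// Interned control sequence id (at most 24 bits in this sketch).
    ControlSequence(u32),
    /// A character byte and a raw code byte.
    Character(u8, u8),
}

/// Tag value in the top byte that marks a control sequence.
const CS_TAG: u32 = 0;
/// Mask for the low 24 bits, which hold the control sequence id.
const ID_MASK: u32 = 0x00FF_FFFF;

impl PackedToken {
    /// Packs a control sequence id; fails if it needs more than 24 bits.
    pub fn control_sequence(id: u32) -> Option<Self> {
        if id <= ID_MASK {
            Some(Self((CS_TAG << 24) | id))
        } else {
            None
        }
    }

    /// Packs a character. The tag is `code + 1` so that it is never 0;
    /// this means the code value 255 cannot be represented.
    pub fn character(ch: u8, code: u8) -> Option<Self> {
        let tag = code as u32 + 1;
        if tag > 0xFF {
            return None;
        }
        Some(Self((tag << 24) | ch as u32))
    }

    /// Recovers the original variant from the packed bits.
    pub fn unpack(self) -> Unpacked {
        let tag = self.0 >> 24;
        if tag == CS_TAG {
            Unpacked::ControlSequence(self.0 & ID_MASK)
        } else {
            Unpacked::Character((self.0 & 0xFF) as u8, (tag - 1) as u8)
        }
    }
}
```

Because the whole token is a plain integer, copying, hashing and comparing it are all cheap; the trade-off in this sketch is a cap on the number of distinct control sequence ids.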

Enums§

StandardToken
The simplest (but not most efficient) way to represent a Token as an enum.

Traits§

Token
Trait for Tokens, to be implemented for an engine (see above). Note that two Space tokens are always considered equal.
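
The rule that two Space tokens are always considered equal means equality cannot simply be derived field by field. A minimal sketch of such a comparison, using hypothetical `Tok` and `Code` types rather than the crate's own, could look like this:

```rust
/// Hypothetical, reduced set of codes for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Code {
    Letter,
    Other,
    Space,
}

/// Hypothetical token type: an interned control sequence id,
/// or a character byte with its code.
#[derive(Clone, Copy, Debug)]
pub enum Tok {
    ControlSequence(u32),
    Character(u8, Code),
}

impl PartialEq for Tok {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (Tok::ControlSequence(a), Tok::ControlSequence(b)) => a == b,
            // Two space tokens compare equal whatever character they carry.
            (Tok::Character(_, Code::Space), Tok::Character(_, Code::Space)) => true,
            // All other characters must match in both character and code.
            (Tok::Character(c1, k1), Tok::Character(c2, k2)) => c1 == c2 && k1 == k2,
            _ => false,
        }
    }
}

impl Eq for Tok {}
```

With this definition, a space token produced from a tab character and one produced from a literal space compare equal, while a letter `a` and an "other" `a` do not.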