Jun-05-2022, 06:21 PM
(This post was last modified: Jun-05-2022, 06:25 PM by Gribouillis.)
The closest thing I can think of in Python is the lexer in the PLY module, where tokens are also specified as regular expressions. As David Beazley explains in the documentation, PLY sorts the regexes by decreasing length to define priority (this applies to tokens defined as strings; tokens defined by functions keep the order in which they are written). Instead of building a DFA as logos seems to do, PLY builds a master regex and invokes the re module.
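To make the master regex idea concrete, here is a minimal sketch using only the re module. It is not PLY's actual code, just an illustration of the technique under my own assumptions: the token table TOKEN_SPECS and the helpers build_master_regex and tokenize are hypothetical names invented for this example.

```python
import re

# Hypothetical token table (name -> regex), invented for this illustration.
# ASSIGN is deliberately listed before EQ to show that the sort fixes priority.
TOKEN_SPECS = {
    "ASSIGN": r"=",
    "EQ": r"==",
    "NUMBER": r"\d+",
    "NAME": r"[A-Za-z_]\w*",
    "PLUS": r"\+",
}


def build_master_regex(specs):
    """Combine all token regexes into one alternation of named groups."""
    # Longer regexes go first, so "==" is tried before "=".
    ordered = sorted(specs.items(), key=lambda item: len(item[1]), reverse=True)
    return re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in ordered))


def tokenize(text, specs=TOKEN_SPECS):
    """Return a list of (token_name, matched_text) pairs, skipping whitespace."""
    master = build_master_regex(specs)
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = master.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x == 42"))
# [('NAME', 'x'), ('EQ', '=='), ('NUMBER', '42')]
```

Without the sort, the alternation would try ASSIGN first and "==" would come out as two ASSIGN tokens, because re picks the first alternative that matches rather than the longest one.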
Of course, you cannot expect a lexer in Python to be as blazingly fast as one in Rust. Apart from the regex sorting part, logos reminds me of the venerable flex from C, and I guess it has similar performance. For most uses, however, the lexer is not a bottleneck.