Thread: parse tree vs just tokenizing (Python Forum, https://python-forum.io, /thread-34160.html)
parse tree vs just tokenizing - Skaperen - Jul-01-2021

I do not need a parse tree result. I only want to get a list of tokens from a string of Python source code. That means breaking up a string into the parts that make sense for working with the code. I believe this is the lexical phase of a compilation. My code will only be doing simpler things, like replacing things in a certain context, such as when other tokens are present or absent. A typical tokenization might look like:

if this[x]==that[y]: where.at('this',x,y) -> ['if','this','[','x',']','==','that','[','y',']',':','where.at','(',"'this'",',','x',',','y',')']

It does not matter if blank spaces are included or not. It does not matter if it splits around a dot or not.

RE: parse tree vs just tokenizing - Gribouillis - Jul-02-2021

Simply use tokenize.tokenize()
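As a rough illustration of the tokenize.tokenize() suggestion, here is a minimal sketch applied to the example line from the first post. The helper name token_strings and the choice of which layout-only token types to drop are assumptions made for this example, not something from the thread. Note that tokenize.tokenize() expects a readline callable that returns bytes, and that it splits around the dot, which the original post says is acceptable.

```python
import io
import tokenize

# Token types that carry layout or bookkeeping only (an assumption for this
# sketch: the original post says whitespace handling does not matter).
LAYOUT_TYPES = {
    tokenize.ENCODING,
    tokenize.NEWLINE,
    tokenize.NL,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def token_strings(source):
    """Return the token strings of Python source code, without layout tokens."""
    # tokenize.tokenize() wants a readline callable that yields bytes.
    readline = io.BytesIO(source.encode("utf-8")).readline
    return [
        tok.string
        for tok in tokenize.tokenize(readline)
        if tok.type not in LAYOUT_TYPES
    ]


tokens = token_strings("if this[x]==that[y]: where.at('this',x,y)\n")
print(tokens)
```

The line does not have to be a complete, valid statement for this to work, since only the lexical phase runs. Each item yielded by tokenize.tokenize() also carries its type and its start and end positions, which is useful when a replacement depends on neighbouring tokens.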
RE: parse tree vs just tokenizing - Skaperen - Jul-03-2021

Does that tokenize according to the Python language syntax?

RE: parse tree vs just tokenizing - Skaperen - Jul-04-2021

I originally thought of tokenizing in a different way, more like what a command would need. I wrote some code to do that back in my C days and also way back in my assembler days, so I was thinking in those terms. I haven't needed anything like that yet, so I had no reason to look at the tokenize module. I knew it existed but hadn't had a reason to read up on it. Well, now I have, and not only is it a solution to my immediate need, but it looks like something I can do many things with, including a Python-oriented editor where I can have string substitution applied to specific language parts. For example, change "foo" to "bar" in names, but not in string literals. I'm playing around tonight.

RE: parse tree vs just tokenizing - Skaperen - Jul-04-2021

Now if I could find code in Python to do this for the C language.

RE: parse tree vs just tokenizing - Gribouillis - Jul-04-2021

Skaperen Wrote: now if i could find code in Python to do this for the C language.

There is a well-known Python module named pycparser for parsing the C language. It seems that it internally contains a Python module for lexical analysis of C code, written using the PLY lexer/parser. You could have a look in this direction.
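Coming back to the idea of changing "foo" to "bar" in names but not in string literals, the following is a minimal sketch of how that might be done with the standard tokenize module. The function name rename_name is hypothetical and chosen only for this example. It uses tokenize.generate_tokens(), which reads str rather than bytes, and tokenize.untokenize() to rebuild the source. Spacing is reconstructed from the original token positions, so the result is worth checking when the new name has a different length than the old one.

```python
import io
import tokenize


def rename_name(source, old, new):
    """Replace NAME tokens equal to `old` with `new`.

    String literals and comments are separate token types (STRING, COMMENT),
    so text inside them is left untouched.
    """
    result = []
    # generate_tokens() takes a readline callable that returns str.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            # TokenInfo is a named tuple, so _replace() returns a modified copy.
            tok = tok._replace(string=new)
        result.append(tok)
    # Rebuild source text from the (possibly modified) token stream.
    return tokenize.untokenize(result)


original = 'foo = "foo" + str(foo)  # foo\n'
renamed = rename_name(original, "foo", "bar")
print(renamed)
```

In this sketch only the two name occurrences of foo are expected to change, while the string literal and the comment keep their text. Keywords are also reported as NAME tokens, so a real tool would probably want to guard against renaming to or from a keyword.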