Python Forum

Full Version: help with a script that adds docstrings and type hints to other scripts
Hi,

I've built a script, initially for my own use but now public on GitHub in case anyone else can use it, that goes through Python scripts and adds type hints, docstrings and inline comments. It first runs black on the input file, then uses ast to ingest and understand the structure, calls Anthropic's Claude to generate the docstrings, type hints and comments, then reassembles the file, saves it and runs black on it again.

I'm pretty pleased with the results, but one thing is still eluding me. It sometimes, but not all the time, gets function decorators wrong. It might indent the @decorator differently than the function or nested function. It might insert whitespace between the decorator and the function signature. Etc.

I find it useful as is -- I just have to edit the results carefully and fix the docstrings and sometimes indentation levels for functions. Especially nested functions. So it isn't bulletproof like, for example, black seems to be.

I've been beating my head against the wall trying to figure it out, but am just seeing the code crosseyed at this point, and each change I introduce seems to cause more trouble than benefit. If anyone is curious about the tool and would be willing to help, it would be wonderful.

The project is public and MIT open source. It's at https://github.com/rickbunker/BetterPython

Thanks.
(In reply to rickbunk's post of Jul-26-2024, 03:00 PM)
Hi, Rick.

This looks like a wonderful tool, and I'm glad to see you share it with the community! Automating the addition of type hints, docstrings, and inline comments saves a lot of effort, and I'm confident that many users will find it useful.

I understand the frustration of getting decorators right; they can be tricky with indentation and spacing. Have you considered a more structured way to parse and reassemble the functions that have decorators? There may be some quirks in how the AST handles decorators, and knowing them could help you improve that part of your script.

I'd love to look at your GitHub repository and see if I can help. Keep up the great work—this is an excellent contribution to open source!
When you parse the code using ast, you can use ast.NodeVisitor or ast.NodeTransformer to traverse and modify the AST (Abstract Syntax Tree) in a structured way. This can help ensure that decorators and their associated functions are handled consistently.

For example, when visiting a FunctionDef node, you can check for decorators and ensure they are properly aligned with the function definition.

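import ast

class DecoratorFixer(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        # The AST stores no whitespace, only positions. Since Python 3.8,
        # node.lineno is the "def" line; decorators sit on earlier lines.
        for decorator in node.decorator_list:
            # Fix decorator handling here, e.g. use decorator.lineno to
            # find where this function's source really begins.
            pass
        return self.generic_visit(node)  # also visits nested functions

    visit_AsyncFunctionDef = visit_FunctionDef  # cover "async def" too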
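One possible cause of misplaced decorators, offered as a guess rather than a diagnosis of BetterPython, is slicing a function out of the source starting at node.lineno. On Python 3.8+ that is the "def" line, so any decorators above it are left behind. The sketch below assumes that line-based reassembly is in use; function_spans and SOURCE are hypothetical names made up for this illustration.

```python
import ast

# Hypothetical input: a nested function with two decorators.
SOURCE = (
    "def outer():\n"
    "    @staticmethod\n"
    "    @some_decorator(arg=1)\n"
    "    def inner(x):\n"
    "        return x\n"
    "    return inner\n"
)


def function_spans(source):
    """Map each function name to (first_line, last_line, indent).

    Lines are 1-based and inclusive. first_line is the earliest decorator
    line if there is one, otherwise the "def" line. Requires Python 3.8+
    for end_lineno.
    """
    spans = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            first = min([d.lineno for d in node.decorator_list] + [node.lineno])
            spans[node.name] = (first, node.end_lineno, node.col_offset)
    return spans


spans = function_spans(SOURCE)
print(spans["inner"])  # starts at the first decorator, not at "def"
```

Replacing a function by its whole span (decorators included) and re-indenting every replacement line by the recorded indent keeps the decorator and its def at the same level.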
When reassembling the code from the AST, keep in mind that the AST does not store the original whitespace, formatting or comments. Tools such as astor, or ast.unparse in the standard library from Python 3.9 on, regenerate source text from the tree with their own formatting rather than preserving yours.

If you're using ast.unparse, be aware that it does not preserve the original formatting and that comments are dropped. It is safer to use the AST to locate functions and then edit the original text than to regenerate the whole file from the tree.
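A small demonstration of what ast.unparse keeps and what it loses. This is only a sketch; SOURCE is a made-up snippet and the block needs Python 3.9 or later.

```python
import ast

# Made-up input with a trailing comment, an inline comment and tight spacing.
SOURCE = (
    "@cache  # memoize results\n"
    "def square(x):\n"
    "    # multiply x by itself\n"
    "    return x*x\n"
)

regenerated = ast.unparse(ast.parse(SOURCE))
print(regenerated)

# The structure survives the round trip, decorator included...
same_structure = ast.dump(ast.parse(regenerated)) == ast.dump(ast.parse(SOURCE))
# ...but every comment is gone, because comments never reach the AST.
comments_lost = "#" not in regenerated
```

So a round trip through ast.unparse is fine for checking structure, but it would throw away any inline comments the tool had just added.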

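Since you're already using black, you can rely on it for the final formatting, but only if the reassembled file is valid Python. black reformats code that already parses; a decorator indented differently from its function is a syntax error, and black will refuse to format the file instead of repairing it.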
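One way to catch such problems early is to parse the reassembled text before handing it to black. A minimal sketch, assuming the reassembled file is available as a string; check_syntax, GOOD and BAD are hypothetical names for this example.

```python
import ast


def check_syntax(source):
    """Return None if source parses, otherwise a short error description."""
    try:
        ast.parse(source)
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"line {exc.lineno}: {exc.msg}"
    return None


GOOD = "@decorator\ndef f():\n    pass\n"
BAD = "    @decorator\ndef f():\n    pass\n"  # decorator indented, def is not

print(check_syntax(GOOD))
print(check_syntax(BAD))
```

Running a check like this after each function is spliced back in, rather than once at the end, would point at the function whose reassembly went wrong.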

If ast is not giving you enough control over formatting, consider using libcst. It parses code into a concrete syntax tree that keeps whitespace and comments, so you can change a function and write the file back without disturbing the rest, which should make decorators and indentation easier to handle precisely.

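import libcst as cst

class DecoratorFixer(cst.CSTTransformer):
    def leave_FunctionDef(self, original_node, updated_node):
        # libcst keeps whitespace in the tree. Drop blank lines between the
        # last decorator and "def", but keep any comment lines found there.
        kept = [line for line in updated_node.lines_after_decorators if line.comment]
        return updated_node.with_changes(lines_after_decorators=kept)

# Usage: fixed_source = cst.parse_module(source).visit(DecoratorFixer()).code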