i have a str and i want to test if it is a valid variable name in the running version of Python. is str.isalpha() sufficient to perform such a test? if not, what can do the task correctly, at least in Python version 3.8 or later?
Quote: is str.isalpha() sufficient
Don't think so!
Look here for a simple guide to Python naming conventions.
Basically: letters (upper or lower case), digits and the underscore are allowed, but a name cannot start with a digit.
The guide says a class name should start with a CAPITAL LETTER. That is a convention rather than a syntax rule, so if you want to enforce it you need to know what kind of name you are dealing with. Tweak the re accordingly.
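To make the answer to the isalpha() question concrete, here is a small sketch. The sample names and the "usable as a variable name" flags are my own picks for illustration, not taken from the guide.

```python
# (name, is it actually usable as a variable name?)
cases = [
    ('bad_boy_2', True),   # underscores and digits are fine after the first character
    ('_private', True),    # a leading underscore is fine
    ('while', False),      # a keyword, even though it is purely alphabetic
    ('spam', True),
]

# Collect the names where str.isalpha() gives the wrong verdict.
mismatches = [name for name, valid in cases if name.isalpha() != valid]
print(mismatches)
```

isalpha() rejects valid names that contain an underscore or a digit, and it accepts keywords, so it gets it wrong in both directions.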
Since I've been looking at re lately, but not yet Ninja level, you could try something like this:
import re
# e.match() returns None if the name starts with a digit
e = re.compile(r'\A(?=[a-zA-Z_]+)([a-zA-Z_]+)([0-9_]+)')
# e will match s but not t, u but not v
# f will catch names beginning with _
f = re.compile(r'\A(?=[a-zA-Z_]+)([0-9_]*)([a-zA-Z_]+)([0-9_]+)')
s = 'bad_boy_2'
t = '2_bad_boy'
u = 'Bad_Boy_2'
v = '2_Bad_Boy'
w = '_2_Bad_Boy_2'
x = '__main__'
Output:
res = e.match(s)
res
<re.Match object; span=(0, 9), match='bad_boy_2'>
res = e.match(t)
res # nothing
u = 'Bad_Boy_2'
res = e.match(u)
res # <re.Match object; span=(0, 9), match='Bad_Boy_2'>
res.group(1) # 'Bad_Boy_'
res.group(2) # '2'
v = '2_Bad_Boy'
res = e.match(v)
res # nothing
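One caveat with the patterns above: they need at least one trailing digit or underscore, so a plain name like 'spam' is not matched, and match() only checks the start of the string, so trailing junk gets through. A simpler sketch for ASCII-only names, using fullmatch(), could look like this. The names ASCII_IDENTIFIER and is_ascii_identifier are made up for the example, and it deliberately ignores keywords and non-ASCII letters.

```python
import re

# One letter or underscore, then any number of letters, digits or underscores.
ASCII_IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')

def is_ascii_identifier(name: str) -> bool:
    """Return True if the whole string is an ASCII-only identifier."""
    # fullmatch() must consume the entire string, not just a prefix.
    return ASCII_IDENTIFIER.fullmatch(name) is not None

for name in ('bad_boy_2', '2_bad_boy', '_2_Bad_Boy_2', '__main__', 'spam', 'spa.m'):
    print(f'{name} : {is_ascii_identifier(name)}')
```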
I suggest invoking Python's own tokenizer:
from keyword import iskeyword
from tokenize import generate_tokens
from token import NAME
def is_valid_variable(name: str):
    t = next(generate_tokens(lambda: name))
    return t.type == NAME and t.string == name and not iskeyword(name)

if __name__ == '__main__':
    print('Are these valid variable names?')
    for x in ('spam', 'spa.m', 'while', 'case', '創作者'):
        print(f'{x} : {is_valid_variable(x)}')
Output:
Are these valid variable names?
spam : True
spa.m : False
while : False
case : True
創作者 : True
Note: the new soft keywords are accepted as valid names.
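If you would rather flag soft keywords as well, something along these lines might do. It assumes the keyword module exposes softkwlist, which as far as I know only exists from Python 3.9 on, hence the getattr fallback. The helper name is_soft_keyword is made up here.

```python
import keyword

def is_soft_keyword(name: str) -> bool:
    """Return True if the running Python lists name as a soft keyword."""
    # keyword.softkwlist is assumed to be missing on Python 3.8,
    # so fall back to an empty tuple there.
    return name in getattr(keyword, 'softkwlist', ())

for name in ('spam', 'while', 'case', 'match'):
    print(f'{name} : {is_soft_keyword(name)}')
```

The result for 'case' and 'match' depends on the Python version running the code.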
There is str.isidentifier() that can be used for this task.
It checks whether a string follows the rules for a valid Python identifier.
It returns True for keywords too, so a function can look like this, with a keyword check added.
from keyword import iskeyword
def valid_variable(name: str) -> bool:
    return name.isidentifier() and not iskeyword(name)
# Test
print(valid_variable("name_test")) # True
print(valid_variable("2variable")) # False
print(valid_variable("_variable")) # True
print(valid_variable("v@rable")) # False
print(valid_variable("for")) # False (because 'for' is a keyword)
when i read about isidentifier() it described only working with ASCII characters. yet, i have used Unicode non-ASCII characters in variable names, so i skipped over that one. can it test all Unicode, too?
(Jun-03-2024, 10:43 PM) Skaperen wrote: when i read about isidentifier() it described only working with ASCII characters. yet, i have used Unicode non-ASCII characters in variable names, so i skipped over that one. can it test all Unicode, too?
isidentifier() works for the whole range of Unicode characters that Python 3 allows in identifiers (variable and function names).
Quote:Python 3 allows Unicode characters in variable and function names, but they must be letter characters. Non-letter characters are not allowed.
>>> Σ = 20
>>> Σ
20
>>> 'Σ'.isidentifier()
True
>>> '🙁'.isidentifier()
False
>>> Gauß = 100
>>> Gauß
100
>>> 'Gauß'.isidentifier()
True
>>> ♥ = 'love'
File "<interactive input>", line 1
♥ = 'love'
^
SyntaxError: invalid character '♥' (U+2665)
>>> '♥'.isidentifier()
False
Even though Python 3 allows a set of Unicode letter characters in variable names, I would say you should try to avoid them.
my project is not creating variable names, it is testing them. so, using Unicode or just ASCII is not my decision.
it gets the name as a str so it can get any Unicode. there are a lot of variant digits in Unicode, too. as long as isidentifier() is totally consistent with the source code parser over the entire range of str values, then this would be the correct function to use.
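One way to gain some confidence in that consistency is to ask the parser directly and compare. The sketch below is only a spot check over a handful of names, not a proof for every possible str. The helper names parser_accepts and valid_variable_name and the sample list are invented for the example. It relies on the parser normalizing identifiers to NFKC, which the language reference describes.

```python
import ast
import unicodedata
from keyword import iskeyword

def parser_accepts(name: str) -> bool:
    """Return True if the parser reads 'name = None' as one assignment to name."""
    try:
        tree = ast.parse(f'{name} = None')
    except (SyntaxError, ValueError):
        return False
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Assign):
        return False
    targets = tree.body[0].targets
    if len(targets) != 1 or not isinstance(targets[0], ast.Name):
        return False
    # The parser stores identifiers in NFKC form, so compare against that.
    return targets[0].id == unicodedata.normalize('NFKC', name)

def valid_variable_name(name: str) -> bool:
    return name.isidentifier() and not iskeyword(name)

samples = ['spam', 'spa.m', 'while', '創作者', 'Σ', 'Gauß', '♥',
           '2variable', '_variable', 'x٣', '٣x', 'a b', 'a ', '']
for s in samples:
    print(repr(s), valid_variable_name(s), parser_accepts(s))
```

If the two columns ever disagree for a name, that name is worth a closer look on the Python version in question.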