Python Forum

Need help with my code (regular expression)
Hi there,

I am new to the world of python, sorry for my primitive question!

Here is my code
import re
import collections

txt = "MCV L +67 83.0 - 101.0 -34 fL something"
result = re.findall(r"[-|+]?([0-9]*.[0-9]+|[0-9]+)", txt)
word = re.findall(r"\s(mm|milj./µL|g/dL|%|fL|pg|/µL|something)", txt)

def get_number_of_elements(list):
    count = 0
    for element in list:
        count += 1
    return count

count = get_number_of_elements(result)

for i in range(0, count):
    print(result[i])

print("Value is", result[0])
print("Range is", result[1] + " - " + result[2], word[0])

Here is my question →
I was hoping result would give me +67 and -34, but it gives me 67 and 34.
Although I added an optional + or - to the pattern, the sign never shows up in the output.
What am I missing? Please help!
Thanks in advance.
First, please put your code inside python bbcode tags (use the python button above the editor) so that indentation is preserved.

Next, the character class you create with square brackets already matches everything inside. You don't need a pipe symbol (unless you actually want to match a pipe character).
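To illustrate the point about the pipe, here is a quick check. The sample string is made up purely for the demonstration:

```python
import re

sample = "a|b -c +d"  # hypothetical input, not from the original post

# Inside square brackets, "|" is a literal character, not alternation.
with_pipe = re.findall(r"[-|+]", sample)
without_pipe = re.findall(r"[-+]", sample)

print(with_pipe)     # ['|', '-', '+']
print(without_pipe)  # ['-', '+']
```

So [-|+] quietly accepts a pipe as if it were a sign, which is probably not what you want.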

Finally, your character class is matching, but it's outside your capturing parentheses, so the result you get excludes the sign. Just move it inside the parentheses.
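A small sketch of the difference, using the sample string from the question. To isolate the effect of the group, the dot is escaped here (unlike in the original pattern), and the variable names are just for illustration:

```python
import re

txt = "MCV L +67 83.0 - 101.0 -34 fL something"

# Sign outside the group: findall returns only what the group captured.
sign_outside = re.findall(r"[-+]?([0-9]*\.[0-9]+|[0-9]+)", txt)

# Sign inside the group: the sign becomes part of each result.
# (?:...) is a non-capturing group, used only to scope the alternation.
sign_inside = re.findall(r"([-+]?(?:[0-9]*\.[0-9]+|[0-9]+))", txt)

print(sign_outside)  # ['67', '83.0', '101.0', '34']
print(sign_inside)   # ['+67', '83.0', '101.0', '-34']
```

When a pattern contains exactly one capturing group, re.findall returns the text of that group rather than the whole match, which is why the sign disappears in the first case.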

I prefer using \d for matching digits. So if you're just trying to capture any floating-point numbers, I'd probably use this instead:

result = re.findall(r"([-+]?\d*\.?\d+)", txt)
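For reference, this is roughly what that pattern gives on the sample string from the question. The conversion to float is an extra step added here for illustration, not something the original code did:

```python
import re

txt = "MCV L +67 83.0 - 101.0 -34 fL something"

# With a single capturing group, findall returns that group's text.
tokens = re.findall(r"([-+]?\d*\.?\d+)", txt)
print(tokens)  # ['+67', '83.0', '101.0', '-34']

# float() accepts a leading sign, so the strings convert directly.
values = [float(t) for t in tokens]
print(values)  # [67.0, 83.0, 101.0, -34.0]
```

Note that the lone "-" between 83.0 and 101.0 is skipped, because the pattern requires at least one digit after the optional sign.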
Thanks for the help and suggestions.

After a little more searching, I found that the following expression does the trick perfectly:
result = re.findall(r"[-+]?[0-9]*.?[0-9]+", txt)
Beware the dot. It's unescaped in your last version so it can match other characters (including spaces).

>>> re.findall(r"[-+]?[0-9]*.?[0-9]+", "5.72 +45x843 hi 32")
['5.72', '+45x843', ' 32']
>>> re.findall(r"[-+]?\d*\.?\d+", "5.72 +45x843 hi 32")
['5.72', '+45', '843', '32']
A helpful document for composing regular expressions is Python 3 reference: re — Regular expression operations.

The special characters are listed and discussed in the section on Regular Expression Syntax. This is useful for composing a tentative regular expression, and for refining it after identifying characters that need to be escaped, such as the dot in the current case.
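The same concern applies to the unit pattern in the original post, where "milj./µL" also contains an unescaped dot. One possible approach, sketched here as a suggestion rather than as part of the answers above, is to let re.escape handle the escaping. The names units and unit_re and the second test string are made up for this example:

```python
import re

# Unit strings taken from the original post; the "." in "milj./µL"
# is a metacharacter unless it is escaped.
units = ["mm", "milj./µL", "g/dL", "%", "fL", "pg", "/µL"]

# re.escape backslash-escapes regex special characters in each literal.
unit_pattern = "|".join(re.escape(u) for u in units)
unit_re = re.compile(r"\s(" + unit_pattern + r")")

fl_units = unit_re.findall("MCV L +67 83.0 - 101.0 -34 fL")
milj_units = unit_re.findall("RBC 4.5 milj./µL")  # hypothetical input

print(fl_units)    # ['fL']
print(milj_units)  # ['milj./µL']
```

Building the alternation from a list also makes it easier to add units later without having to remember which characters need a backslash.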
(Apr-04-2022, 12:03 AM) bowlofred wrote: Beware the dot. It's unescaped in your last version so it can match other characters (including spaces).


Great tip, thank you!