Python Forum

Full Version: Help fix Regular expression
I'm attempting to extract the two-digit value that may appear on either side of the term 'm_m_s_e'.

I tested the following expression on regex101.com, where it works as expected and returns the numerical value I want for the vast majority of my records. For a few records, though, I get no match where I expect one, and I can't work out why.

It looks like I can match the value if it appears after m_m_s_e, but not if it appears before.

The expression finds my score for this string
String1 = clear skin lesions noted mental status, cognition, and cortical functions:m_m_s_e 25/30 · mental status (including orientation to person, place, and time; recent
I get 25


but no match for this one
String2 = examination, she was alert and cooperative she scored 29/30 on the m_m_s_e language was fluent pupils were equal and reacted directly and
I'd expect 29

nor this one

String5 = the hour hand where of the same size she scored 25/30 m_m_s_e she had a cautious gait she has mild postural instability
I'd expect 25

# A function to get the MMSE score from Provider notes.
import re
def get_score(notes):
    # note_search = re.search(' ([A-Za-z]+)\.', name)
    score_search = re.search(r".*?(?:\b(\d\d)\b.*m_m_s_e|m_m_s_e.*?\b(\d\d)\b).*", notes)

    # If the title exists, extract and return it.
    if score_search:
        return score_search.group(2)
    return ""
Your function always returns group #2. When the score appears before m_m_s_e, it is captured in group #1 instead, and group #2 is None, so the function returns None for those records.
You can accomplish the same task using str.split() and the in operator:
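A minimal sketch of that fix, assuming the pattern from the question is kept unchanged: return whichever of the two groups took part in the match. The name PATTERN and the shortened sample strings are only for illustration.

```python
import re

# Same pattern as in the question: group 1 captures a two-digit score that
# appears before m_m_s_e, group 2 captures one that appears after it.
PATTERN = re.compile(r".*?(?:\b(\d\d)\b.*m_m_s_e|m_m_s_e.*?\b(\d\d)\b).*")

def get_score(notes):
    """Return the two-digit score next to m_m_s_e, or "" if there is none."""
    match = PATTERN.search(notes)
    if match:
        # Only one alternative matches, so the other group is None.
        return match.group(1) or match.group(2)
    return ""

# Shortened versions of the strings from the question (illustrative only).
print(get_score("cortical functions:m_m_s_e 25/30 mental status"))       # 25
print(get_score("she scored 29/30 on the m_m_s_e language was fluent"))  # 29
print(get_score("she scored 25/30 m_m_s_e she had a cautious gait"))     # 25
```

Using `or` is safe here because a captured score is always a non-empty two-digit string.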
data = '''clear skin lesions noted mental status, cognition, and cortical functions:m_m_s_e 25/30 · mental status (including orientation to person, place, and time; recent
20/30 blah
examination, she was alert and cooperative she scored 29/30 on the m_m_s_e language was fluent pupils were equal and reacted directly and
the hour hand where of the same size she scored 25/30 m_m_s_e she had a cautious gait she has mild postural instability 
'''

for line in data.split('\n'):
    if 'm_m_s_e' in line:
        for word in line.split():
            if '/' in word:
                print(word.split('/')[0])
Output:
25
29
25
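If the split approach needs to slot into the original get_score() workflow, one possible sketch is to return the value instead of printing it and to check that both sides of the slash are digits. The function name get_score_by_split and the punctuation stripped from each word are assumptions, not part of the answer above.

```python
def get_score_by_split(notes):
    """Return the numerator of the first 'NN/NN' token, or "" if none is found."""
    if 'm_m_s_e' not in notes:
        return ""
    for word in notes.split():
        # Drop surrounding punctuation so "25/30," is still recognised.
        numerator, slash, denominator = word.strip('.,;:()').partition('/')
        # Require digits on both sides so tokens such as "and/or" are skipped.
        if slash and numerator.isdigit() and denominator.isdigit():
            return numerator
    return ""

print(get_score_by_split("she scored 29/30 on the m_m_s_e language was fluent"))
```

Like the loop above, this returns the first score-like token anywhere in the text, not necessarily the one closest to m_m_s_e.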