Python Forum
Find numbers using Regex - Printable Version


Find numbers using Regex - giddyhead - Jul-24-2022

Hello,

I have a regex issue: I am trying to find only the numbers that are attached to words.
For example:
1the
5one
5529care
30over

The following regex
\d+([A-Za-z])
finds the number but also includes the first letter of the word. What modifications do I need so that it finds only the numbers attached to the words? Thanks.

A sample text for reference:

25 - not this number

the cow just over the moon and the sun is in 1the sky

26 - not this number

5one day is soon and soon is near take 5529care, 30over and out
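
For reference, one possible way to get only the digits is a lookahead: `(?=[A-Za-z])` checks that a letter follows the number without making that letter part of the match. This is a minimal sketch, assuming Python's re module. The name find_attached_numbers and the leading \b (which assumes the number sits at the start of the word) are illustrative choices, not something taken from the thread.

```python
import re

# Digits at the start of a word that are immediately followed by a letter.
# The lookahead (?=...) tests for the letter but leaves it out of the match.
ATTACHED_NUMBER = re.compile(r'\b\d+(?=[A-Za-z])')

def find_attached_numbers(text):
    """Return the digit runs that are glued to the front of a word."""
    return ATTACHED_NUMBER.findall(text)

sample = (
    "25 - not this number\n"
    "the cow just over the moon and the sun is in 1the sky\n"
    "26 - not this number\n"
    "5one day is soon and soon is near take 5529care, 30over and out\n"
)
print(find_attached_numbers(sample))  # ['1', '5', '5529', '30']
```

The standalone numbers 25 and 26 are skipped because no letter follows them directly.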


RE: Find numbers using Regex - rob101 - Jul-24-2022

Hi.

Does this do what you want?

import re

string = '1the 5one 5529care 30over'

for i in range(len(string)):
    digit = re.search('[0-9]', string[i])
    if digit:
        print(f'Found digit {string[i]} at position {i}')
Output:
Found digit 1 at position 0
Found digit 5 at position 5
Found digit 5 at position 10
Found digit 5 at position 11
Found digit 2 at position 12
Found digit 9 at position 13
Found digit 3 at position 19
Found digit 0 at position 20
{edit to remove my debug line of code}


RE: Find numbers using Regex - Pedroski55 - Jul-24-2022

Maybe like this?

import re
mylist = ['1the', '5one', '5529care', '30over', '55more66']
# re.findall returns a list
for s in mylist:
    result = re.findall('[0-9]+', s)
    print(s)
    print(result)



RE: Find numbers using Regex - giddyhead - Jul-24-2022

(Jul-24-2022, 06:26 AM)rob101 Wrote: Hi.

Does this do what you want?

Hi,

Thanks for the reply and the help. Unfortunately, it finds all the numbers in the text. The numbers attached to words are the ones I am looking to get rid of. The following sample shows the format of the text in one of the lists, which contain numbers throughout.
I hope it clarifies. Thanks in advance.

25 - not this number

the cow just over the moon and the sun is in 1the sky

26 - not this number

5one day is soon and soon is near take 5529care, 30over and out

59 - not this number

The covers at near the back of the 59closet, and when found have them place on the each of the beds. However you see the pillow cases use the ones on the 9second shelve.
(Jul-24-2022, 07:29 AM)Pedroski55 Wrote: Maybe like this?

Hey,
Thanks for the help. Unfortunately, because numbers are spread throughout each list, it finds all the numbers. I am looking to find only the numbers attached to words. I have included a sample list for reference.
Thanks

25 - not this number

the cow just over the moon and the sun is in 1the sky

26 - not this number

5one day is soon and soon is near take 5529care, 30over and out

59 - not this number

The covers at near the back of the 59closet, and when found have them place on the each of the beds. However you see the pillow cases use the ones on the 9second shelve.


RE: Find numbers using Regex - rob101 - Jul-24-2022

No worries. The point of my post (sorry that I wasn't clear on this) is that once you know the position of the digits, you can use that as a first step towards building the rest of your script and producing whatever output you like, so it was more a proof of concept really.

What code do you have so far?

Maybe a better way would be to have two functions (one to find the digits and one to find everything else) and have them work together, in a 'for loop' or a 'while loop', with 'if/else' branches to process the results.

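import re

def fdigit(d):
    # Return 1 if d contains at least one digit, otherwise None.
    digit = re.search(r'\d', d)
    if digit:
        return 1

def fchar(c):
    # Return 1 if c contains at least one non-digit character, otherwise None.
    char = re.search(r'\D', c)
    if char:
        return 1
I've no idea what your skill level is. Is this something that you're going to be able to do, or will you need guidance?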

edit: in fact one function will suffice: if the digit test fails, then there's no need for the other test.
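
As a rough illustration of the single-function idea, the check can also be written as a plain loop with no regex at all. This is only a sketch: the name attached_number is hypothetical, and it assumes the number sits at the very start of the word. Note that str.isdigit() and str.isalpha() also accept non-ASCII digits and letters, which is broader than [0-9] and [A-Za-z].

```python
def attached_number(word):
    """Return the leading digits of word if a letter follows them, else None."""
    i = 0
    # Walk over the run of digits at the start of the word.
    while i < len(word) and word[i].isdigit():
        i += 1
    # Digits were found, and the next character is a letter.
    if 0 < i < len(word) and word[i].isalpha():
        return word[:i]
    # No leading digits, or the digits stand alone (e.g. '25').
    return None

for word in ['1the', '5529care', '25', 'sky']:
    print(word, attached_number(word))
```

If the digit test fails (i stays at 0), the function returns straight away, so no second test is needed.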


RE: Find numbers using Regex - giddyhead - Jul-24-2022

Ahhh, I see. Got it, thanks for the information. I am still learning and in need of guidance. The following code is what I have; it searches for the numbers attached to the words and joins the text back together without them:
for cf in soup.findAll('div', {'class':'flex flex-auto flex-col bg-white shadow-md'}):
    txt = re.sub(r'\[(.*?)\]', '', cf.text)  # Remove anything between brackets
    txt2 = re.split(r'\d+([A-Za-z])', txt)  # Split on digits attached to a letter; the captured letter is kept
    cmp = ''.join(txt2)  # Join back without the attached numbers
    print('cf text', txt)
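
A note on the snippet above: because the pattern has a capturing group, re.split keeps the captured letter in its result, so joining the pieces drops only the digits. The print call shows txt, though, so the joined result in cmp is never displayed. Below is a hedged sketch of the same idea done in one step with re.sub and a lookahead. The function name strip_attached_numbers is made up for illustration, and plain strings stand in for cf.text so the example runs without BeautifulSoup.

```python
import re

def strip_attached_numbers(text):
    """Drop bracketed text, then drop digits that are followed by a letter."""
    text = re.sub(r'\[.*?\]', '', text)           # remove anything between brackets
    return re.sub(r'\d+(?=[A-Za-z])', '', text)   # the letter itself is not consumed

line = 'take 5529care, 30over and out'
print(strip_attached_numbers(line))                    # take care, over and out
print(''.join(re.split(r'\d+([A-Za-z])', line)))       # same result via split + join
print(strip_attached_numbers('25 - not this number'))  # unchanged
```

Either form leaves standalone numbers such as 25 alone, since no letter follows them directly.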



RE: Find numbers using Regex - rob101 - Jul-24-2022

I think I may have cracked this for you. If not, then I'm sure you can make any adjustments. If not, then I'm more than happy to help you.

Try it by coding in your text.

#!/usr/bin/python3

import re

def fdigit(d):
    # If d contains a digit, return whatever follows its last digit.
    digit = re.search(r'\d', d)
    if digit:
        lst_string = re.split(r'\d', d)
        return lst_string[-1]

string = "" # put your text in this string object

lst_string = string.split(' ')
pstring = ''

for word in lst_string:
    check = fdigit(word)
    if not check:
        # No digit, or nothing after the last digit: keep the word as it is.
        pstring += word + ' '
    else:
        pstring += check + ' '

print(pstring)
I'm not 100% happy with my function name, now that it's doing a slightly different job to the one that it was conceived for, but it'll do.
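
A possible variation on the same word-by-word approach, in case both the numbers and the cleaned text are wanted. It is only a sketch: the names TOKEN and split_text are hypothetical, and it assumes words are separated by single spaces and that the number comes at the front of the word.

```python
import re

# A whole word made of digits, then a letter, then any non-space characters.
TOKEN = re.compile(r'(\d+)([A-Za-z]\S*)')

def split_text(text):
    """Return (numbers found, text with those numbers removed)."""
    numbers, words = [], []
    for word in text.split(' '):
        m = TOKEN.fullmatch(word)
        if m:
            numbers.append(m.group(1))  # the attached number
            words.append(m.group(2))    # the word without it
        else:
            words.append(word)          # plain words and standalone numbers
    return numbers, ' '.join(words)

print(split_text('take 5529care, 30over and out'))
print(split_text('59 - not this number'))
```

Using fullmatch means a word such as 25 is left untouched, because the pattern requires a letter right after the digits.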

{edit}

I think we posted at almost the same time there. If this is any use to you, then cool; if not, maybe someone else can learn something from it.


RE: Find numbers using Regex - giddyhead - Jul-24-2022

I see. Thanks for the information and help. Unfortunately, I will not be able to modify it to suit; however, I appreciate what you have done. May I ask for help with the existing code I posted? Thanks.



RE: Find numbers using Regex - rob101 - Jul-24-2022

Sure; ask whatever you want to.


RE: Find numbers using Regex - Pedroski55 - Jul-25-2022

Not too sure what you want to do now, maybe this??

import re

mylist = [15, '15', '1the', '5one', '5529care', '30over', '55more66', 25, '25']

pattern1 = re.compile(r'\D+') # matches one or more non-digit characters
# re.findall returns a list
for s in mylist:
    text = str(s) # the list mixes ints and strings
    match = pattern1.search(text)
    if match:
        result = re.findall('[0-9]+', text)
        print(s)
        print(result)