I have the following script
import re
w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'
k = re.findall("FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)
print(k)
When I run it, I'm getting
Quote:['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
but I only want the integers from it, so that it looks like this:
Quote:['022', '040', '001', '070', '080']
Once I get the integers, I can sort them to get the lowest number.
How would I do that?
You'll want something more like this:
regex = re.compile(r"(?:FEW|BKN|OVC|SCT)(\d{3})")
The digits will be in a capture group for each match. Review the Python documentation for regex groups to access them.
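As a minimal sketch of what "accessing the group" can look like, here is one way to do it with the compiled pattern and finditer(). The name `heights` is only an illustrative choice, not something from the posts above.

```python
import re

# Same METAR string as in the first post.
w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# Compiling only builds a pattern object; nothing is searched yet.
regex = re.compile(r"(?:FEW|BKN|OVC|SCT)(\d{3})")

# finditer() yields one match object per hit; group(1) is the captured digits.
heights = [m.group(1) for m in regex.finditer(w)]
print(heights)
```

Printing the compiled pattern itself only shows its repr, so the pattern has to be applied to the string with a method such as finditer() or findall().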
I tried
re.compile("(?:FEW|BKN|OVC|SCT)(\d{3})")
and I was getting
Quote:re.compile('(?:FEW|BKN|OVC|SCT)(\\d{3})')
Then I tried
re.findall("(FEW|SCT|BKN|OVC)(\d{3})", w)
and now I'm getting
Quote:[('BKN', '022'), ('BKN', '040'), ('OVC', '001'), ('FEW', '070'), ('SCT', '080')]
How can I get only the integers from this?
(Oct-09-2019, 06:57 PM)tantony Wrote: How can I get only the integers from this?
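import re
w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'
# Raw string, so that \d reaches the regex engine unchanged.
k = re.findall(r"(FEW|SCT|BKN|OVC)(\d{3})", w)
# Each match is a (layer, digits) tuple; keep the digits as an int.
lst = [int(i[1]) for i in k]
print(lst)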
Output:
[22, 40, 1, 70, 80]
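Since the stated goal was the lowest number, a short follow-up sketch may help. It assumes the same string and pattern as above; the names `pairs`, `ordered` and `lowest` are only illustrative.

```python
import re

w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# Two capture groups, so findall() returns (layer, digits) tuples.
pairs = re.findall(r"(FEW|SCT|BKN|OVC)(\d{3})", w)
lst = [int(digits) for _, digits in pairs]

# sorted() returns a new ascending list; min() gives the lowest value directly.
ordered = sorted(lst)
lowest = min(lst)
print(ordered)
print(lowest)
```

If only the lowest value is needed, min() alone is enough and the sort can be skipped.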
@snippsat, thanks that worked. So just to make sure, there's no way to get just the integers from my original regex?
k = re.findall("FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)
With my original regex, I was getting
Quote:['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
The "?:" at the start of a group changes it to a non-capturing group. So re.findall with "(?:FEW|SCT|BKN|OVC)(\d{3})" would return only the numbers, because the digits are then the only capture group left. From what you posted, it looks like you compiled the regex using my code but didn't use it to match anything.
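To illustrate the last point, here is a small sketch of actually using the compiled pattern, next to the version with a capturing prefix. The names `only_digits` and `with_prefix` are made up for this example.

```python
import re

w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# Non-capturing prefix: the digits are the only capture group.
regex = re.compile(r"(?:FEW|SCT|BKN|OVC)(\d{3})")
only_digits = regex.findall(w)

# Capturing prefix: two groups, so each result is a tuple.
with_prefix = re.compile(r"(FEW|SCT|BKN|OVC)(\d{3})").findall(w)

print(only_digits)
print(with_prefix[0])
```

With exactly one capture group, findall() returns a list of that group's strings; with two or more, it returns a list of tuples.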
Hi!
I think that sometimes newbies like myself don't immediately get what the experienced programmers here unselfishly and kindly provide as answers and advice.
Maybe, if you are also a newbie, you didn't realize that stullis was also pointing out another solution to you, although you had to make the necessary adjustments. Here I'll show you what I think he meant (regex1), compared with what you had before (k):
import re
w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'
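# Original pattern: no groups, so the whole match (prefix + digits) is returned.
k = re.findall(r"FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)
# Non-capturing prefix plus one capture group: only the digits are returned.
regex1 = re.findall(r"(?:FEW|BKN|OVC|SCT)(\d{3})", w)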
print(k)
print(regex1)
and that produces the following output:
Output:
['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
['022', '040', '001', '070', '080']
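As for the question about the original regex: a rough sketch of two other options, assuming every prefix is exactly three letters, as it is here. The names `sliced` and `lookbehind` are only illustrative, and neither variant comes from the posts above.

```python
import re

w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# Option 1: keep the original pattern and cut off the 3-letter prefix afterwards.
k = re.findall(r"FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)
sliced = [s[3:] for s in k]

# Option 2: a lookbehind checks for the prefix without making it part of the match.
# Python's re needs a fixed-width lookbehind, which holds because all
# alternatives are three characters long.
lookbehind = re.findall(r"(?<=FEW|SCT|BKN|OVC)\d+", w)

print(sliced)
print(lookbehind)
```

Both variants depend on the prefixes having the same length, so the capture-group version shown earlier is probably the more robust choice.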
All the best,