Posts: 61
Threads: 18
Joined: Aug 2019
I have the following script
import re
w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'
k = re.findall("FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)
print(k)
When I run it, I get
Quote:['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
but I only want the integers, so that it looks like this:
Quote:['022', '040', '001', '070', '080']
Once I have the integers, I can sort them to get the lowest number.
How would I do that?
Posts: 443
Threads: 1
Joined: Sep 2018
You'll want something more like this:
regex = re.compile(r"(?:FEW|BKN|OVC|SCT)(\d{3})")
The digits will be in a capture group for each match. Review the Python documentation for regex groups to see how to access them.
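A minimal sketch of how the compiled pattern could then be used, assuming the METAR string from the first post. The names digits and digits_via_groups are only illustrative. Compiling a pattern does not search anything by itself; you still have to call one of its methods on the string.

```python
import re

w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# Raw string so that \d reaches the regex engine unchanged.
regex = re.compile(r"(?:FEW|BKN|OVC|SCT)(\d{3})")

# With exactly one capturing group, findall returns that group's text for each match.
digits = regex.findall(w)

# Same result, but reading group 1 from each match object explicitly.
digits_via_groups = [m.group(1) for m in regex.finditer(w)]

print(digits)             # ['022', '040', '001', '070', '080']
print(digits_via_groups)  # ['022', '040', '001', '070', '080']
```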
Posts: 61
Threads: 18
Joined: Aug 2019
I tried re.compile("(?:FEW|BKN|OVC|SCT)(\d{3})") and got
Quote:re.compile('(?:FEW|BKN|OVC|SCT)(\\d{3})')
Then I tried re.findall("(FEW|SCT|BKN|OVC)(\d{3})", w), and now I get
Quote:[('BKN', '022'), ('BKN', '040'), ('OVC', '001'), ('FEW', '070'), ('SCT', '080')]
How can I get only the integers from this?
Posts: 7,312
Threads: 123
Joined: Sep 2016
(Oct-09-2019, 06:57 PM)tantony Wrote: How can I get only the integers from this?
import re
w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'
k = re.findall(r"(FEW|SCT|BKN|OVC)(\d{3})", w)
# Each match is a (cover, digits) tuple; keep the digits and convert them to int.
lst = [int(i[1]) for i in k]
print(lst)
Output: [22, 40, 1, 70, 80]
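Since the stated goal was the lowest number, here is a small sketch of that last step. The names layers, heights, lowest and lowest_layer are my own, not from the thread. Converting to int before comparing is the safer habit: plain string comparison only happens to agree here because every value is zero-padded to three digits.

```python
import re

w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# Two capturing groups, so each match is a (cover, digits) tuple.
layers = re.findall(r"(FEW|SCT|BKN|OVC)(\d{3})", w)

# Keep only the digits, as integers.
heights = [int(digits) for _cover, digits in layers]

# Lowest value on its own, and the full tuple it came from.
lowest = min(heights)
lowest_layer = min(layers, key=lambda layer: int(layer[1]))

print(sorted(heights))  # [1, 22, 40, 70, 80]
print(lowest)           # 1
print(lowest_layer)     # ('OVC', '001')
```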
Posts: 61
Threads: 18
Joined: Aug 2019
Oct-09-2019, 07:45 PM
(This post was last modified: Oct-09-2019, 07:45 PM by tantony.)
@snippsat, thanks, that worked. So just to make sure, is there no way to get just the integers from my original regex?
k = re.findall("FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)
With my original regex, I was getting
Quote:['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
Posts: 443
Threads: 1
Joined: Sep 2018
The "?:" in the capture group changes it to a non-capturing group. So "(?:FEW|SCT|BKN|OVC)(\d{3})" would result in only the numbers. From what you posted, it looks like you compiled the regex using my code but didn't use it to match anything.
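To make the difference concrete, here is a short comparison of what re.findall returns depending on how many capturing groups the pattern has. The variable names are just for illustration.

```python
import re

w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# No capturing groups: findall returns the whole match.
no_groups = re.findall(r"(?:FEW|SCT|BKN|OVC)\d{3}", w)

# One capturing group: findall returns only that group's text.
one_group = re.findall(r"(?:FEW|SCT|BKN|OVC)(\d{3})", w)

# Two capturing groups: findall returns one tuple per match.
two_groups = re.findall(r"(FEW|SCT|BKN|OVC)(\d{3})", w)

print(no_groups)   # ['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
print(one_group)   # ['022', '040', '001', '070', '080']
print(two_groups)  # [('BKN', '022'), ('BKN', '040'), ('OVC', '001'), ('FEW', '070'), ('SCT', '080')]
```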
Posts: 212
Threads: 25
Joined: Aug 2019
Oct-09-2019, 11:53 PM
(This post was last modified: Oct-09-2019, 11:53 PM by newbieAuggie2019.)
(Oct-09-2019, 07:45 PM)tantony Wrote: @snippsat, thanks that worked. So just to make sure, there's no way to get just the integers from my original regex? k = re.findall("FEW\d+|SCT\d+|BKN\d+|OVC\d+", w) With my original regex, I was getting Quote:['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
(Oct-09-2019, 10:27 PM)stullis Wrote: The "?:" in the capture group changes it to a non-capturing group. So "(?:FEW|SCT|BKN|OVC)(\d{3})" would result in only the numbers. From what you posted, it looks like you compiled the regex using my code but didn't use it to match anything.
Hi!
I think that sometimes newbies like myself don't immediately understand the answers and advice that the experienced programmers here kindly and unselfishly provide.
Maybe, if you are also a newbie, you didn't realize that Stullis was pointing out another solution to you, although you had to make the necessary adjustments. Here I'll show you what I think he meant (regex1), comparing it with what you had before (k):
import re
w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'
k = re.findall(r"FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)
regex1 = re.findall(r"(?:FEW|BKN|OVC|SCT)(\d{3})", w)
print(k)
print(regex1)
and that produces the following output:
Output: ['BKN022', 'BKN040', 'OVC001', 'FEW070', 'SCT080']
['022', '040', '001', '070', '080']
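As for the question about keeping the original alternation, here are two possibilities as a rough sketch, not necessarily what anyone above had in mind. The names digits_by_slicing and digits_by_lookbehind are made up for this example. The first strips the three-letter prefix after matching; the second uses a lookbehind, which Python's re accepts here because every alternative is exactly three characters wide.

```python
import re

w = 'KTPA 081653Z 00000KT 10SM BKN022TCU BKN040 OVC001RMK 28/23 A2990 FEW070 RMK AO2 SLP125 TCU SCT080 NW-NE T02830228'

# The original pattern, written as a raw string.
k = re.findall(r"FEW\d+|SCT\d+|BKN\d+|OVC\d+", w)

# Option 1: every prefix is three letters, so slice them off.
digits_by_slicing = [match[3:] for match in k]

# Option 2: require the prefix just before the digits without including it in the match.
digits_by_lookbehind = re.findall(r"(?<=FEW|SCT|BKN|OVC)\d+", w)

print(digits_by_slicing)     # ['022', '040', '001', '070', '080']
print(digits_by_lookbehind)  # ['022', '040', '001', '070', '080']
```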
All the best,
newbieAuggie2019
"That's been one of my mantras - focus and simplicity. Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it's worth it in the end because once you get there, you can move mountains."
Steve Jobs