Python Forum
Extract continuous numeric characters from a string in Python
#1
I want to extract a number that appears after a fixed set of characters ('AA='). The problem is that (i) I don't know how long the number is, and (ii) I don't know what appears right after the number. It could be a blank space or any other character; I can't say which characters to expect, only that they are definitely not 0-9.

Below are a few of the many inputs I can have.

Line 1: 123 NUBA AA=1.2345 $BB=1234.55
Line 2: 123 NUBA MM AA=1.2345678&BB=1234.55
Line 3: 123 NUBA RRNJH AA=1.2#ALPHA
...
The result should be 1.2345, 1.2345678, and 1.2 for the respective lines above.

PS: I know how to use .find to get the starting location of AA=, but that alone does not help with the two conditions above. I also understand that one way would be to loop through each character after AA= and break when a blank space or anything other than 0-9 is seen, but that is clumsy and takes up unnecessary space in my code. I am looking for a neater way of doing this.
#2
What you want to do is pick up each character after 'AA=' for as long as it is a digit or a decimal point, combine those characters into a string, and then convert that string to a float. Here is one way to go about it:

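data = ['Line 1: 123 NUBA AA=1.2345 $BB=1234.55',
        'Line 2: 123 NUBA MM AA=1.2345678&BB=1234.55',
        'Line 3: 123 NUBA RRNJH AA=1.2#ALPHA']

# Characters that may belong to the number (note that '0' is included).
ACCEPTABLE = '0123456789.'
aa_numbers = []

for line in data:
    temp_number_string = ''
    # Position of the first character after 'AA='.
    marker = line.index('AA=') + 3
    # Stop at the first non-numeric character or at the end of the line.
    while marker < len(line) and line[marker] in ACCEPTABLE:
        temp_number_string += line[marker]
        marker += 1
    aa_numbers.append(float(temp_number_string))

print(aa_numbers)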
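
For anyone who wants the same character scan in a more compact form, here is a rough sketch using itertools.takewhile. The helper name number_after and the choice to return None when the marker is missing or no valid number follows it are assumptions made for illustration, not part of the original question.

```python
from itertools import takewhile

# Characters that may belong to the number.
ACCEPTABLE = set("0123456789.")


def number_after(line, marker="AA="):
    """Return the float that follows marker, or None if there is none."""
    start = line.find(marker)
    if start == -1:
        return None  # marker not present in this line
    rest = line[start + len(marker):]
    # takewhile stops at the first character that is not a digit or '.',
    # and also stops cleanly at the end of the string.
    digits = "".join(takewhile(ACCEPTABLE.__contains__, rest))
    try:
        return float(digits)
    except ValueError:
        return None  # e.g. an empty string or a lone '.'


print(number_after("Line 3: 123 NUBA RRNJH AA=1.2#ALPHA"))
```

Because takewhile ends when the input runs out, a number at the very end of a line needs no separate bounds check.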
#3
I would usually think of regex for a description like that; nice way of doing it without regex by BashBedlam.
So something like this. Combining compile with finditer can make it faster when iterating over a large amount of data.
import re

data = '''\
Line 1: 123 NUBA AA=1.2345 $BB=1234.55
Line 2: 123 NUBA MM AA=1.2345678&BB=1234.55
Line 3: 123 NUBA RRNJH AA=1.2#ALPHA'''

pattern = re.compile(r"AA=([+-]?([0-9]*[.])?[0-9]+)")
for match in pattern.finditer(data):
    print(float(match.group(1)))

Output:
1.2345
1.2345678
1.2
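
If the input arrives line by line (for example from a file) and some lines may not contain 'AA=' at all, a per-line search keeps each result aligned with its line. This is only a sketch: the names AA_PATTERN and extract_aa are made up for illustration, and yielding None for a line without a match is an assumption about what the caller wants.

```python
import re

# Same pattern as above, with the inner group made non-capturing.
AA_PATTERN = re.compile(r"AA=([+-]?(?:[0-9]*[.])?[0-9]+)")


def extract_aa(lines):
    """Yield the float after 'AA=' for each line, or None if absent."""
    for line in lines:
        match = AA_PATTERN.search(line)
        yield float(match.group(1)) if match else None


lines = [
    "Line 1: 123 NUBA AA=1.2345 $BB=1234.55",
    "Line 2: 123 NUBA MM AA=1.2345678&BB=1234.55",
    "Line 3: 123 NUBA RRNJH AA=1.2#ALPHA",
    "Line 4: no value here",
]
print(list(extract_aa(lines)))
```

Since extract_aa is a generator, it can be fed an open file object directly without loading the whole file into memory.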