"for loop" not indexing correctly?

DeaD_EyE · (This post was last modified: Jan-24-2020, 02:31 PM by DeaD_EyE.)

The problem is strange. I haven't reproduced it; instead, I'll suggest a different approach.

First of all, your regex is wrong.

{LON=-\d\d.\d\d\d\d\d\d}{LAT=\d\d.\d\d\d\d\d\d}

The corrected version:

{LON=([+-]?\d{2}\.\d{6})}{LAT=([+-]?\d{2}\.\d{6})}

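[+-]? allows a +, a -, or no sign in front of the number. The ? means zero or one occurrence.
Quantifiers for \d: \d{2} and \d{6} instead of repeating \d.
Escaped the dot (\.); an unescaped . matches any character.
Grouped longitude and latitude, so each value can be extracted separately.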

Use regex101 to check it.
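The same check can be done locally in Python. The sketch below runs the original and the corrected pattern against three lines; the malformed line and the line with other signs are made up for illustration and are not taken from your file.

```python
import re

# Original pattern from the question: unescaped dots, hard-coded minus sign, no groups.
old = re.compile(r"{LON=-\d\d.\d\d\d\d\d\d}{LAT=\d\d.\d\d\d\d\d\d}")
# Corrected pattern: optional sign, escaped dots, one group per coordinate.
new = re.compile(r"{LON=([+-]?\d{2}\.\d{6})}{LAT=([+-]?\d{2}\.\d{6})}")

samples = {
    "well-formed": "{LON=-78.555550}{LAT=39.111222}",
    "malformed": "{LON=-78X555550}{LAT=39Y111222}",     # hypothetical bad line
    "other signs": "{LON=+78.555550}{LAT=-39.111222}",  # hypothetical line
}

results = {}
for name, text in samples.items():
    old_match = old.search(text)
    new_match = new.search(text)
    # Store whether the old pattern matched, and what the new pattern captured.
    results[name] = (bool(old_match), new_match.groups() if new_match else None)
    print(name, results[name])
```

The unescaped dot lets the original pattern accept the malformed line, and the hard-coded minus sign makes it miss the line with other signs.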

Instead of loading the whole file into memory, you could use an iterative solution that processes it line by line.
You could yield a dict or a list for each result.

import re


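def parse(file, regex, *, to_float=False):
    """Yield one {'lon': ..., 'lat': ...} dict per matching line of file."""
    with open(file) as fd:
        # Iterating over the file object reads one line at a time.
        for line in fd:
            match = regex.search(line)
            if match:
                lon, lat = match.group(1), match.group(2)
                if to_float:
                    lon, lat = float(lon), float(lat)
                yield {'lon': lon, 'lat': lat}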


filename = "bugs_bunny2.txt"
pattern = re.compile(r"{LON=([+-]?\d{2}\.\d{6})}{LAT=([+-]?\d{2}\.\d{6})}")
for data in parse(filename, pattern):
    print(data)

Since Python 3.8, you can save one line with an assignment expression (the := operator):

import re


def parse(file, regex, *, to_float=False):
    with open(file) as fd:
        for line in fd:
            if match := regex.search(line):
                lon, lat = match.group(1), match.group(2)
                if to_float:
                    lon, lat = float(lon), float(lat)
                yield {'lon': lon, 'lat': lat}


filename = "bugs_bunny2.txt"
pattern = re.compile(r"{LON=([+-]?\d{2}\.\d{6})}{LAT=([+-]?\d{2}\.\d{6})}")
for data in parse(filename, pattern):
    print(data)

Output:
{'lon': '-78.555550', 'lat': '39.111222'}
{'lon': '-78.555551', 'lat': '39.111223'}
{'lon': '-78.456432', 'lat': '38.999999'}
{'lon': '-78.555593', 'lat': '39.111199'}
{'lon': '-78.555594', 'lat': '39.111190'}
{'lon': '-78.555565', 'lat': '39.111191'}
{'lon': '-78.555516', 'lat': '38.111065'}

I also tried this with files containing one line and zero lines, and it works as expected.
The output above is without converting the str values to float.
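For comparison, here is a small sketch of the same function called with to_float=True. The temporary file and its two lines are assumptions made only so the example runs on its own; your real file may look different.

```python
import os
import re
import tempfile


def parse(file, regex, *, to_float=False):
    with open(file) as fd:
        for line in fd:
            match = regex.search(line)
            if match:
                lon, lat = match.group(1), match.group(2)
                if to_float:
                    lon, lat = float(lon), float(lat)
                yield {'lon': lon, 'lat': lat}


pattern = re.compile(r"{LON=([+-]?\d{2}\.\d{6})}{LAT=([+-]?\d{2}\.\d{6})}")

# Assumed layout: one {LON=...}{LAT=...} record per line.
sample = "{LON=-78.555550}{LAT=39.111222}\n{LON=-78.456432}{LAT=38.999999}\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

try:
    # list() drains the generator while the temporary file still exists.
    as_float = list(parse(tmp.name, pattern, to_float=True))
finally:
    os.remove(tmp.name)

for row in as_float:
    print(row)
```

Note that float drops the trailing zero, so '-78.555550' is printed as -78.55555. If you need the exact text from the file, keep the strings.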

Even if the file is 100 TiB in size, you can still use this code,
because it doesn't load the whole content of the file into memory.
That's a nice side effect of an iterative solution.

Using re.findall on the whole file requires the entire content to be in memory.
For toy applications, that's OK.

With medium-sized data (fits on disk, but not in memory), you need an iterative solution.
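To illustrate the difference, here is a rough sketch. iter_matches and fake_big_file are hypothetical helpers written for this example: the first works on any iterable of lines, the second stands in for a file too large to read at once.

```python
import itertools
import re

pattern = re.compile(r"{LON=([+-]?\d{2}\.\d{6})}{LAT=([+-]?\d{2}\.\d{6})}")


def iter_matches(lines, regex):
    """Hypothetical helper: yield (lon, lat) for each matching line, one line at a time."""
    for line in lines:
        match = regex.search(line)
        if match:
            yield match.groups()


def fake_big_file():
    """Stand-in for a huge file: produces lines forever, so it can never be read in full."""
    for i in itertools.count():
        n = i % 1000000  # keep the fractional part at six digits
        yield "{LON=-78.%06d}{LAT=39.%06d}\n" % (n, n)


# Lazy: only as many lines as needed are ever produced and scanned.
first_three = list(itertools.islice(iter_matches(fake_big_file(), pattern), 3))
print(first_three)

# re.findall needs the complete text as one string, which is fine for small inputs only.
small_text = "".join(itertools.islice(fake_big_file(), 3))
print(pattern.findall(small_text) == first_three)
```

Both approaches return the same tuples on small input. Only the lazy one can stop early or handle input that does not fit into memory.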
