Let's say we have a string like this
'a b c'
(simplified example)
so we can do
spam = 'a b c'
for ch in spam.split(' '):
    print(ch)
in this case str.split() will produce a list (i.e. it will generate the whole list in memory)
for the same result we can also do
def chars(eggs):
    for ch in eggs.split(' '):
        yield ch

spam = 'a b c'
for ch in chars(spam):
    print(ch)
in this case the generator function chars will not produce the list from eggs.split() in memory, right? It will be evaluated lazily? I started to doubt myself today while answering an SO question...

This is not an answer, but in the corner case where there are no spaces (or you don't mind them), one can yield directly from the string:
>>> def chars(word):
...     yield from word
...
>>> for char in chars('abc'):
...     print(char)
...
a
b
c
>>> for char in chars('a b c'):
...     if char != ' ':
...         print(char)
...
a
b
c
As I said, this was a simplified example. Here is a link to my answer on SO. Note also my comment under it.
If it were simply a str, I would iterate directly over it (i.e. it's in memory anyway, so there is generally no benefit to creating a generator).
I think re.finditer() will work for this.
>>> import re
>>>
>>> spam = 'a b c'
>>> for match in re.finditer(r'\S+', spam):
...     print(match.group())
...
a
b
c
re.finditer() returns a true iterator, so it does not build a list of all matches in memory; each match is produced only when it is requested.
It should also work for more complicated cases, since a regex pattern can be written for a lot of things.
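As a minimal sketch of how this could replace str.split() in a generator, the helper below yields whitespace-separated tokens one at a time. The name lazy_split is made up for this example. Note it behaves like split() with no argument (runs of whitespace count as one separator), not like split(' ').

```python
import re


def lazy_split(text):
    """Yield whitespace-separated tokens one at a time (hypothetical helper)."""
    # re.finditer() produces match objects on demand, so no list of tokens is built.
    for match in re.finditer(r"\S+", text):
        yield match.group()


tokens = lazy_split("a b c")
first_token = next(tokens)   # only the first match has been searched for so far
remaining = list(tokens)     # consume the rest
print(first_token, remaining)
```

The input string itself is still fully in memory, so the saving is only the intermediate list of tokens.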
So next() or __next__ has to be called before any value is produced.
>>> r = re.finditer(r'a', 'a')
>>> r
<callable_iterator object at 0x04C5FFB0>
>>> next(r)
<re.Match object; span=(0, 1), match='a'>
>>> r = re.finditer(r'a', 'a')
>>> r.__next__()
<re.Match object; span=(0, 1), match='a'>
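A small sketch of what happens once the iterator runs dry: the built-in next() calls the iterator's __next__ method, and a second call on the one-match iterator above would raise StopIteration unless a default is passed. The variable names here are only for illustration.

```python
import re

matches = re.finditer(r"a", "a")

first = next(matches)           # same as matches.__next__()
leftover = next(matches, None)  # iterator is exhausted; the default avoids StopIteration

print(first.group())
print(leftover)
```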
@snippsat, thanks, but my question is more or less theoretical (please also check the SO question).
Basically, I am asking: if we have (pseudocode)
def spam():
    for egg in <SOME LIST/TUPLE OBJECT HERE, e.g. returned by some function or method like str.split()>:
        yield egg
does Python evaluate the list/tuple when the generator is created, i.e. build the whole list in memory, or is it evaluated lazily, only when the next value is yielded? I think it's the latter.
Only the yield will be lazy; str.split() will still return a full list.
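One way to see this for yourself, as a rough sketch: wrap the split in a function that records when it is called. The names tracked_split and calls are hypothetical and exist only for this demonstration.

```python
calls = []


def tracked_split(text):
    """Stand-in for str.split(' ') that records when it runs (hypothetical helper)."""
    calls.append("split called")
    return text.split(" ")  # the complete list is built here


def chars(eggs):
    for ch in tracked_split(eggs):
        yield ch


gen = chars("a b c")
calls_before = list(calls)  # creating the generator runs none of its body

first = next(gen)           # the body starts: the split runs once, in full
calls_after = list(calls)

print(calls_before)
print(calls_after)
print(first)
```

So creating the generator is lazy, but on the first next() the whole list is built and kept alive until the loop finishes.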
(Jun-04-2019, 08:57 PM) Yoriz wrote: Only the yield will be lazy the str.split() will return a full list.
so, if that is the case, it doesn't make sense to create the generator function in this particular case
This would be lazy, no extra list created.
def chars(eggs):
    for ch in eggs:
        if ch != " ":
            yield ch

spam = "a b c"
for ch in chars(spam):
    print(ch)
Output:
a
b
c
This was my actual case/answer on SO:
def get_addresses(input_string):
    for address in input_string.split(' BEG ')[-1].split(' END ')[0].split(' '):
        yield address

foo = "70D76320 BEG 701D135D 702D72FC END EAR0 00000000 0000000"
for idx, address in enumerate(get_addresses(foo)):
    print(f'[{idx}]0x{address}')
They wanted an alternative to using regex to extract the address values between BEG and END and format them in a particular way, and the user asked whether there is a performance benefit in using a generator function compared to iterating directly over foo.split(' BEG ')[-1].split(' END ')[0].split(' '). My comment was that
Quote: in this particular case (assuming you will not have many addresses) there is no practical difference. In general case split() will produce list in memory, while get_addresses is generator and it will not produce the whole list in the memory. In addition it makes the code more structured and allows to test the generator function separately.
Then I had second thoughts and asked here... I should have posted the actual code from the start...
:-) Anyway, thanks a lot
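For comparison, here is a sketch of how get_addresses could avoid the intermediate lists altogether. This is not the code from the SO answer: the name get_addresses_lazy and the tag parameters are made up for the example, and it assumes addresses are separated by whitespace. It locates the markers with str.rfind()/str.find() and then scans only that region with a compiled pattern's finditer(), which accepts start and end positions.

```python
import re

_TOKEN = re.compile(r"\S+")


def get_addresses_lazy(input_string, start_tag=" BEG ", end_tag=" END "):
    """Yield the tokens between the last start_tag and the next end_tag (sketch)."""
    # Mirror split(' BEG ')[-1]: start after the last start tag, or at 0 if absent.
    start = input_string.rfind(start_tag)
    start = 0 if start == -1 else start + len(start_tag)
    # Mirror split(' END ')[0]: stop at the first end tag after start, if any.
    end = input_string.find(end_tag, start)
    if end == -1:
        end = len(input_string)
    # Scan only input_string[start:end], without slicing or building a list.
    for match in _TOKEN.finditer(input_string, start, end):
        yield match.group()


foo = "70D76320 BEG 701D135D 702D72FC END EAR0 00000000 0000000"
for idx, address in enumerate(get_addresses_lazy(foo)):
    print(f"[{idx}]0x{address}")
```

For a handful of addresses this makes no practical difference; it would only start to matter for very long input strings.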
Here it is with re.finditer(), so it's still regex, but an alternative way of using regex.
addresses = btInfo.group().split()
for idx in range(len(addresses)):
So here @r0ng uses split() and range(len(addresses)) together with regex.
import re

foo = "70D76320 BEG 701D135D 702D72FC END EAR0 00000000 0000000"
for match in re.finditer(r'BEG\s(.*?)\s(.*?)\s', foo):
    # match.groups() is a tuple holding the two captured addresses
    for idx, address in enumerate(match.groups()):
        print(f'[{idx}]0x{address}')
Output:
[0]0x701D135D
[1]0x702D72FC