Python Forum

maximum length of a regex-match
Below are three snippets: first, part of my Python code; second, part of the text the code works on; and third, some printed results. After that I describe my problem.

<snippet>
else:
    print('Tak1')
    RegString = Rol + r'[:]?' + '[ \t\n]*' + RuweNaam + '[\n\t ]*' + '([A-z]*[:]?[\n\t ]*[A-z, ]*\n)*' + 'Rol[:]?' + '[\n\t ]*' + '(?P<R>[A-Z][a-z]*[ ]?)*\n'
    # How do you find the value in content?
    # print(RegString)
    Reg = re.compile( RegString )
    # print(Reg)
    Match = Reg.search( content )
    print(Match)
    Role = Match.group('R').strip()
    print(Role)
    #
<\snippet>

<snippet>
Personen

Naam: Mulmaker, Helena
Rol: Getuige

Naam: Lutke, Dorothea
Geslacht: v
Rol: Moeder

Naam: Mulmaker, David
Geslacht: m
Rol: Vader

<\snippet>

Some results from print:
<snippet>
<_sre.SRE_Match object; span=(159, 398), match='Naam: \tMulmaker, Helena\nRol: \tGetuige\n\nNaam:>
<\snippet>
<snippet>
<_sre.SRE_Match object; span=(198, 398), match='Naam: \tLutke, Dorothea\nGeslacht: \tv\nRol: \tMo>
<\snippet>

Problem:
As you can see, the match is cut off once a certain length has been reached, although the search should produce a longer match. I would like to increase the maximum length of a match.


Questions:
Can this maximum length be set?
Or is it a system value that cannot be changed by users?
If so, do you have suggestions for working around this problem?

Thanks, Maashoeven
My problem can be shown much more simply:

import re
import os
content = "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"

RegString = '[0-9]*'
Reg = re.compile(RegString)
Match = Reg.search(content)
print(Match)
The result of this program is:
Output:
<_sre.SRE_Match object; span=(0, 90), match='1234567890123456789012345678901234567890123456789>
As you can see, the span says the match should be 90 characters long, but the match text that is printed is only 49 characters long.

The problem I am working on can require matches of around 250-300 characters. Is it possible to change the configuration of Python so that it can handle this?
What you get is the representation of the Match object, not the text that was matched.
If you need the text, use the group() method of the Match object:
 
import re
content = "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"

reg_string = '[0-9]*'
reg = re.compile(reg_string)
my_match = reg.search(content)
print(my_match.group())
By the way, the representation already shows span=(0, 90), so the match itself covers all 90 characters; only the printed preview of the text is shortened.
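To make the difference concrete, here is a minimal sketch that compares what print(Match) shows with what the match actually contains. It assumes CPython, where the representation of a Match object abbreviates the matched text; the 300-character string and the names shown, text, long_content and long_match are made up for illustration and are not from the original posts.

```python
import re

# Same 90-digit string as in the posts above.
content = "1234567890" * 9

match = re.search(r"[0-9]*", content)

# print(match) displays repr(match), which abbreviates the matched text.
shown = repr(match)

# The match itself is complete: group() returns all of it.
text = match.group()
start, end = match.span()

print(shown)
print(len(text), end - start)    # 90 90

# Hypothetical test data in the 250-300 character range mentioned above.
long_content = "x" * 300
long_match = re.search(r"x*", long_content)
print(len(long_match.group()))   # 300
```

So there should be nothing to configure: the shortening only affects how the Match object is displayed, and group() or span() give the complete result.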