Python Forum
Regex text file to store data in list
#1
Hi all!

I have the following sample text, which contains Finite Element Analysis results:

The first column holds the X, Y and Z normal stresses; XY, YZ and ZX are the shear stresses; and A, B and C are the principal stresses.

I would like to use regex to search a file for this pattern. The file is unstructured, with only parts of it in this format; to my knowledge, regex is the best option when the overall data is unstructured.

text = '''
X-1.30779E+01    XY-1.26471E+01      A 3.00940E+01
Y 2.63890E+01    YZ 7.83649E-04      B 5.98331E-01      
Z 5.96212E-01    ZX 2.04834E-01      C-1.67851E+01    
 
X-1.53833E+01    XY-4.23500E+00      A 2.50320E+01    
Y 2.45882E+01    YZ-1.64653E-02      B-4.95968E-01
Z-4.96026E-01    ZX 3.55515E-02      C-1.58271E+01   
'''
I have written the following regex:

import re

normalPattern = r'\s[XYZ].\d.\d{5}E.\d{2}\s'

shearPattern = r'(\s(XY|YZ|ZX).\d.\d{5}E.\d{2}\s)'

principalPattern = r'\s[ABC].\d.\d{5}E.\d{2}\s'

reg1 = re.findall(normalPattern,text)
 
reg2 = re.findall(shearPattern,text)

reg3 = re.findall(principalPattern,text)
Which produces the following results:

Output:
reg1
Out[173]: ['\nX-1.30779E+01 ', '\nY 2.63890E+01 ', '\nZ 5.96212E-01 ', '\nX-1.53833E+01 ', '\nY 2.45882E+01 ', '\nZ-4.96026E-01 ']

reg2
Out[174]: [(' XY-1.26471E+01 ', 'XY'), (' YZ 7.83649E-04 ', 'YZ'), (' ZX 2.04834E-01 ', 'ZX'), (' XY-4.23500E+00 ', 'XY'), (' YZ-1.64653E-02 ', 'YZ'), (' ZX 3.55515E-02 ', 'ZX')]

reg3
Out[175]: [' A 3.00940E+01\n', ' B 5.98331E-01 ', ' C-1.67851E+01 ', ' A 2.50320E+01 ', ' B-4.95968E-01\n', ' C-1.58271E+01 ']
My questions:

1) For reg1 and reg3 I get "\n" in some cases. How can I exclude it?

2) For reg2, for some reason I get an extra "XY", "YZ" and "ZX". How can I exclude them?

Thank you!

Regards
Siggi
#2
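1 happens because your pattern requires whitespace (\s) at both ends of the match, so that whitespace is part of what findall returns. Even if you do want to require it, you probably don't want it in the returned data. So you could either put capturing parentheses around the part you want, which leaves the whitespace out of the result, or use lookbehind (and lookahead for the trailing whitespace).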
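
As a rough sketch of those two options (the shortened sample string and the variable names `captured` and `lookaround` are made up for illustration, not taken from your script):

```python
import re

# Shortened sample in the same layout as the original data.
text = (
    "\nX-1.30779E+01    XY-1.26471E+01      A 3.00940E+01"
    "\nY 2.63890E+01    YZ 7.83649E-04      B 5.98331E-01\n"
)

# Option 1: keep \s in the pattern but capture only the part you want.
# With exactly one group, findall returns just that group's text.
captured = re.findall(r'\s([XYZ].\d.\d{5}E.\d{2})\s', text)

# Option 2: assert the whitespace with lookbehind/lookahead, so it is
# checked but never becomes part of the match.
lookaround = re.findall(r'(?<=\s)[XYZ].\d.\d{5}E.\d{2}(?=\s)', text)

print(captured)    # ['X-1.30779E+01', 'Y 2.63890E+01']
print(lookaround)  # ['X-1.30779E+01', 'Y 2.63890E+01']
```

Both give the same strings here. The lookaround version does not consume the whitespace, which matters if two values are ever separated by a single whitespace character.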

But here I'd prefer specifying a word boundary. If the first string in the file was one of these but had no whitespace before it, a pattern that starts with \s would fail to match it. So maybe something like:

normalPattern = r'\b([XYZ])(.\d.\d{5}E.\d{2})\b'
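\b is a zero-width assertion: it matches at the position between a word character and a non-word character, or at the very start or end of the string. So it succeeds whether the element is next to whitespace or next to nothing at all, and it never pulls the whitespace into the match. The two sets of parentheses give you easy access to the identifier and the number string, which can then be fed into float() to give you a number.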
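
A minimal sketch of that idea, using the first block of your sample data (the list name `normal` is just an illustrative choice):

```python
import re

text = '''
X-1.30779E+01    XY-1.26471E+01      A 3.00940E+01
Y 2.63890E+01    YZ 7.83649E-04      B 5.98331E-01
Z 5.96212E-01    ZX 2.04834E-01      C-1.67851E+01
'''

normalPattern = r'\b([XYZ])(.\d.\d{5}E.\d{2})\b'

# With two groups, findall returns an (identifier, number string) tuple per match.
# float() ignores the leading space in strings such as ' 2.63890E+01'.
normal = [(name, float(value)) for name, value in re.findall(normalPattern, text)]

print(normal)  # [('X', -13.0779), ('Y', 26.389), ('Z', 0.596212)]
```

Note that the unescaped `.` matches any character. If you want to be stricter, something like `[ -]` for the sign position, `\.` for the decimal point and `[+-]` for the exponent sign should work on this data.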

2 is not extra data: you're getting a tuple back instead of a string, because shearPattern has capturing parentheses. When a pattern contains more than one group, re.findall returns a tuple of the groups for each match. You have one set around the whole pattern (that's the first element), and one set just around the identifier at the front (that's the second element).
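
If you only want the strings, one way (a sketch, with made-up variable names and a one-line sample) is to drop the outer parentheses and make the inner group non-capturing with (?:...):

```python
import re

text = "X-1.30779E+01    XY-1.26471E+01      A 3.00940E+01\n"

# Two capturing groups: findall returns one tuple per match.
with_groups = re.findall(r'(\s(XY|YZ|ZX).\d.\d{5}E.\d{2}\s)', text)

# (?:...) groups the alternatives without capturing. With no capturing
# groups left, findall returns the whole match as a plain string.
no_groups = re.findall(r'\s(?:XY|YZ|ZX).\d.\d{5}E.\d{2}\s', text)

print(with_groups)  # [(' XY-1.26471E+01 ', 'XY')]
print(no_groups)    # [' XY-1.26471E+01 ']
```

The surrounding spaces are still there because this keeps your original \s at both ends; swapping them for \b removes them as described for 1.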