Python Forum
How to properly format rows and columns in excel data from parsed .txt blocks
#4
I think your pattern for finding blocks is wrong. It should look something like this. Note that I left out some of the fields to shorten this post.
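import re
import pandas as pd

# Each block is an item name line followed by the fields in this exact order.
# MULTILINE lets ^ match at the start of every line, not only at the start of the text.
pattern = re.compile(
    r"^(.*)\n"
    r"^Status: (.*)\n"
    r"^Category: (.*)\n"
    r"^Description: (.*)\n",
    flags=re.MULTILINE)

# Read the whole file into one string so the pattern can span several lines.
with open("data.txt", "r") as file:
    text = file.read()

# findall returns one tuple of captured groups per matching block.
columns = ["Item", "Status", "Category", "Description"]
print(pd.DataFrame.from_records(pattern.findall(text), columns=columns))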
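To show why the flag matters, here is a minimal sketch using a made-up three-line string (not the real data, and with only the Status field to keep it short). Without re.MULTILINE, ^ only matches at the very start of the text, so the second ^ in the pattern can never match and nothing is found.

```python
import re

# Hypothetical sample text, made up for this illustration only.
sample = "fluff\nItemName 1\nStatus: ok\n"

# Shortened version of the block pattern: a name line, then a Status line.
block = r"^(.*)\n^Status: (.*)\n"

# Without MULTILINE, ^ matches only at the start of the whole string,
# so the second ^ (which sits after a newline) can never match.
no_flag = re.findall(block, sample)

# With MULTILINE, ^ also matches immediately after each newline.
with_flag = re.findall(block, sample, flags=re.MULTILINE)

print(no_flag)    # []
print(with_flag)  # [('ItemName 1', 'ok')]
```

Also note that "." does not match a newline by default, which is what keeps each (.*) group confined to a single line.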
I made a dummy file with some valid and invalid blocks and extra fluff to ignore.
Output:
ItemName 1
Status: Status Item 1
Category: Category Item 1
Description: Description Text 1
extra stuff
ItemName 2
Category: Order is wrong
Status: Status Item 2
Description: Description Text 2
extra stuff
ItemName 3
Status: Status Item 3
Category: Category Item 3
Sub-Category: Extra field
Description: Description Text 3
extra stuff
ItemName 4
Statis: Spelling error
Category: Category Item 4
Description: Description Text 4
extra stuff
ItemName 5
Status: Status Item 5
Category: Category Item 5
Description: Description Text 5
extra stuff
When I run the program it finds the two valid blocks.
Output:
         Item         Status         Category         Description
0  ItemName 1  Status Item 1  Category Item 1  Description Text 1
1  ItemName 5  Status Item 5  Category Item 5  Description Text 5
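If you want to try the pattern without creating data.txt, something like the following sketch should work. The inline text is a shortened stand-in I made up (two valid blocks and one with the fields out of order), and the name records is just for illustration. It uses only the re module, so it runs without pandas.

```python
import re

pattern = re.compile(
    r"^(.*)\n"
    r"^Status: (.*)\n"
    r"^Category: (.*)\n"
    r"^Description: (.*)\n",
    flags=re.MULTILINE)

# Made-up stand-in for data.txt: blocks 1 and 5 are valid,
# block 2 has Status and Category in the wrong order.
text = (
    "ItemName 1\n"
    "Status: Status Item 1\n"
    "Category: Category Item 1\n"
    "Description: Description Text 1\n"
    "extra stuff\n"
    "ItemName 2\n"
    "Category: Order is wrong\n"
    "Status: Status Item 2\n"
    "Description: Description Text 2\n"
    "extra stuff\n"
    "ItemName 5\n"
    "Status: Status Item 5\n"
    "Category: Category Item 5\n"
    "Description: Description Text 5\n"
    "extra stuff\n"
)

columns = ["Item", "Status", "Category", "Description"]

# Pair each captured tuple with the column names to get one dict per block.
records = [dict(zip(columns, match)) for match in pattern.findall(text)]

for record in records:
    print(record)
```

Only the blocks whose fields appear in exactly the expected order end up in records. The list of dicts can be passed to pd.DataFrame(records) if you want it as a table.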


Messages In This Thread
RE: How to properly format rows and columns in excel data from parsed .txt blocks - by deanhystad - Dec-10-2022, 08:51 PM
