Nov-05-2023, 02:54 PM
Hello
I am just starting to work with Python and process mining. I am trying to create a file from a text with 4 columns (case id, name, process and time), but my problem is that my code puts everything into the same column of the CSV/Excel file, which I don't want. I want the values in 4 separate columns, and the same for the titles.
import re
import pandas as pd

# Sample text paragraph (replace with your actual text)
text_paragraph = """
Character: Maria
Case1 - 2023-11-01 09:00 AM: Started the process
Character: George
Case2 - 2023-11-01 10:30 AM: Joined the project
Character: Maria
Case1 - 2023-11-01 11:45 AM: Continued working
Character: George
Case2 - 2023-11-01 12:15 PM: Left for a meeting
"""

# Initialize variables to store event data
event_data = {
    'Case ID': [],
    'Character': [],
    'Process': [],
    'Time': []
}

# Use regular expressions to extract character, case ID, process, and time information
event_pattern = r"(Character: (.+)|Case(\d+) - (\d{4}-\d{2}-\d{2} \d{2}:\d{2} [APM]{2}): (.+))"
matches = re.findall(event_pattern, text_paragraph)

current_character = None
for match in matches:
    character, case_id, timestamp, process = match[1], match[2], match[3], match[4]
    if character:
        current_character = character
    else:
        event_data['Character'].append(current_character)
        event_data['Case ID'].append(case_id)
        event_data['Time'].append(timestamp)
        event_data['Process'].append(process)

# Create a DataFrame from the event data
df = pd.DataFrame(event_data)

# Save the DataFrame as a CSV file
df.to_csv('process_mining_data_4_columns.csv', index=False)
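
To narrow down where the single column comes from, here is a minimal sketch that uses only the standard library. It is built on an assumption that has not been confirmed for this machine: that the file itself is fine and Excel is simply expecting a different list separator (often a semicolon in some regional settings). The rows and the helper name `to_delimited` are made up for illustration and only mirror the shape of the DataFrame in the post.

```python
import csv
import io

# Hypothetical rows in the same shape as the DataFrame built in the post.
HEADER = ["Case ID", "Character", "Process", "Time"]
ROWS = [
    ["1", "Maria", "Started the process", "2023-11-01 09:00 AM"],
    ["2", "George", "Joined the project", "2023-11-01 10:30 AM"],
]


def to_delimited(header, rows, delimiter=","):
    """Return header and rows as delimited text using the given separator."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Comma-separated text, like the default output of a CSV writer.
comma_text = to_delimited(HEADER, ROWS)

# Semicolon-separated text, for an Excel that is ASSUMED to expect ';'.
semicolon_text = to_delimited(HEADER, ROWS, delimiter=";")

# Reading the comma version back gives one list per line; if every list
# has four entries, the data is already split into four fields.
parsed = list(csv.reader(io.StringIO(comma_text)))
print([len(row) for row in parsed])
print(semicolon_text.splitlines()[0])
```

If the field counts come out as four per line, the columns are probably being merged only when Excel opens the file. In that case, passing `sep=';'` to `df.to_csv(...)` might be worth trying, or writing a real workbook with `df.to_excel(...)` (which needs an Excel engine such as openpyxl installed).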