Nov-05-2023, 02:54 PM
Hello
I am just starting to work with Python and process mining. I am trying to create a file from a text with four columns (case ID, name, process and time), but my code puts everything into the same column of the CSV/Excel file, which I don't want. I want the values in four separate columns, and the same for the titles.
import re
import pandas as pd

# Sample text paragraph (replace with your actual text)
text_paragraph = """
Character: Maria
Case1 - 2023-11-01 09:00 AM: Started the process
Character: George
Case2 - 2023-11-01 10:30 AM: Joined the project
Character: Maria
Case1 - 2023-11-01 11:45 AM: Continued working
Character: George
Case2 - 2023-11-01 12:15 PM: Left for a meeting
"""

# Initialize variables to store event data
event_data = {
    'Case ID': [],
    'Character': [],
    'Process': [],
    'Time': []
}

# Use regular expressions to extract character, case ID, process, and time information.
# Group 1 is the character name; groups 2-4 are case ID, timestamp and process.
event_pattern = r"(Character: (.+)|Case(\d+) - (\d{4}-\d{2}-\d{2} \d{2}:\d{2} [AP]M): (.+))"

matches = re.findall(event_pattern, text_paragraph)

current_character = None
for match in matches:
    character, case_id, timestamp, process = match[1], match[2], match[3], match[4]
    if character:
        # A "Character:" line sets the name for the events that follow it
        current_character = character
    else:
        event_data['Character'].append(current_character)
        event_data['Case ID'].append(case_id)
        event_data['Time'].append(timestamp)
        event_data['Process'].append(process)

# Create a DataFrame from the event data
df = pd.DataFrame(event_data)

# Save the DataFrame as a CSV file
df.to_csv('process_mining_data_4_columns.csv', index=False)
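One thing that might be worth checking is whether the file itself really has one column, or whether only Excel displays it that way. This is an assumption, not something confirmed in this post: some regional settings make Excel expect a semicolon rather than a comma as the list separator, and a comma-separated file then shows up in a single column. The sketch below uses only the standard library and hypothetical sample rows in the same shape as the event data, writes them with both separators, and reads them back to count the fields.

```python
import csv
import io

# Hypothetical sample rows in the same shape as the event data in the post
header = ["Case ID", "Character", "Process", "Time"]
rows = [
    ["1", "Maria", "Started the process", "2023-11-01 09:00 AM"],
    ["2", "George", "Joined the project", "2023-11-01 10:30 AM"],
]


def to_delimited(header, rows, delimiter=","):
    """Return the table as delimited text using the given separator."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


comma_text = to_delimited(header, rows)
semicolon_text = to_delimited(header, rows, delimiter=";")

# Reading both variants back shows how many fields each row really has
comma_rows = list(csv.reader(io.StringIO(comma_text)))
semicolon_rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))

print([len(row) for row in comma_rows])
print(semicolon_text)
```

If the field counts come out as four, the data is fine and only the separator is in question. With pandas the same idea would be `df.to_csv('process_mining_data_4_columns.csv', index=False, sep=';')`, since `to_csv` accepts a `sep` argument.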