Python Forum
Regular Expression - Printable Version


Regular Expression - rzbddm - Oct-30-2018

I am learning Python 3 and regular expressions on my own, for work. Much of my day is spent on data, and I can no longer work in Excel because my files can be very large (100 MB or more). Could someone recommend a Python 3 regular expression that does three things? (1) Delete everything from < through >, plus any comma that immediately follows the >. (2) Delete the word Done and all the commas that follow it. (3) Delete the word Skipped and all the commas that follow it.

Change From:
<UUT><H s='19' v='3.0'/>,<V t='s' s='2'/>Profile,Production,<V t='s' s='2'/>Cycle,Normal,<V t='s' s='2'/>PMVer,14.0.1.103,<V t='s' s='2'/>SeqFileVer,2.0.0.1,<V t='s' s='2'/>User,0000010011688,<V t='s' s='2'/>Station,TS-0421A,<V t='s' s='2'/>Socket,0,<V t='s' s='2'/>Date,09-27-2018,<V t='s' s='2'/>Time,06:52:00,<V t='n' s='2'/>CycleTime,1100.366,<V t='s' s='2'/>Status,Passed,<V t='s' s='2'/>WorkOrder,70524831,<V t='s' s='2'/>MRPConfigurationString,R3A1200SS132113100000000000000,<V t='s' s='2'/>BedModelNumber,P7900B000011,<V t='s' s='2'/>BedSerialNumber,01008877619851621118092621T269PF9594,<V t='s' s='2'/>TestControl_TestType,Production,<V t='s' s='2'/>TestControl_TestCycle,Initial,<V t='s' s='2'/>TestControl_RepairCode,,<V t='s' s='2'/>TestControl_RepairStr,,<R s='517'/>,<S t='a' s='3'/>SetRTEConfig,Done,,<S t='a' s='3'/>LogTestertName,Passed,,<S t='s' c='IgnoreCase' s='5'/>{LogTestertName}LogTesterName,"TS-0421A",Passed,"TS-0421A",,<S t='a' s='3'/>SetVoltage0,Done,,<S t='a' s='3'/>OutputOff,Done,,<S t='a' s='3'/>UsbExtensionStartUp,Done,,<S t='a' s='3'/>StoreScannedInforForReport,Passed,,<S t='a' s='3'/>{StoreScannedInforForReport}Record Bed Order Number,Done,70524831,<S t='a' s='3'/>{StoreScannedInforForReport}Record Bed Configuration String,Done,R3A1200SS132113100000000000000,<S t='a' s='3'/>{StoreScannedInforForReport}Record Bed Model Number,Done,P7900B000011,<S t='a' s='3'/>{StoreScannedInforForReport}Record Bed Serial Number,Done,01008877619851621118092621T269PF9594,<S t='a' s='3'/>{StoreScannedInforForReport}SetBedConfigAllInfo,Done,,<S t='a' s='3'/>InitializeSystemVariables,Done,,<S t='a' s='3'/>VerifyValidBedConfigurationString,Passed,,<S t='a' s='3'/>{VerifyValidBedConfigurationString}SetResult,Done,,<S t='n' c='EQ' s='7'/>{VerifyValidBedConfigurationString}ConfigLength,30,Passed,30,,,,<S t='a' s='3'/>{VerifyValidBedConfigurationString}InsertDefaultNodes,Done,,<S t='a' s='3'/>{VerifyValidBedConfigurationString}InsertDCB,Skipped,,<S t='a' 
s='3'/>{VerifyValidBedConfigurationString}InsertACB,Done,,

Change To:
Profile,Production,Cycle,Normal,PMVer,14.0.1.103,SeqFileVer,2.0.0.1,User,10011688,Station,TS-0421A,Socket,0,Date,9/27/2018,Time,6:52:00,CycleTime,1100.366,Status,Passed,WorkOrder,70524831,MRPConfigurationString,R3A1200SS132113100000000000000,BedModelNumber,P7900B000011,BedSerialNumber,01008877619851621118092621T269PF9594,TestControl_TestType,Production,TestControl_TestCycle,Initial,TestControl_RepairCode,,TestControl_RepairStr,,,SetRTEConfig,LogTestertName,Passed,,{LogTestertName}LogTesterName,TS-0421A,Passed,TS-0421A,,SetVoltage0,OutputOff,UsbExtensionStartUp,StoreScannedInforForReport,Passed,,{StoreScannedInforForReport}Record Bed Order Number,70524831,{StoreScannedInforForReport}Record Bed Configuration String,R3A1200SS132113100000000000000,{StoreScannedInforForReport}Record Bed Model Number,P7900B000011,{StoreScannedInforForReport}Record Bed Serial Number,01008877619851621118092621T269PF9594,{StoreScannedInforForReport}SetBedConfigAllInfo,InitializeSystemVariables,VerifyValidBedConfigurationString,Passed,,{VerifyValidBedConfigurationString}SetResult,{VerifyValidBedConfigurationString}ConfigLength,30,Passed,30,{VerifyValidBedConfigurationString}InsertDefaultNodes,{VerifyValidBedConfigurationString}InsertDCB,{VerifyValidBedConfigurationString}InsertACB,

Tony


RE: Regular Expression - nilamo - Oct-30-2018

What have you tried so far?
We won't just write the code for you; we're here to help people learn Python. So show us what you've got, and we'll help along the way.

For trying out regexes and seeing what works, I recommend https://www.regexpal.com/. You can paste in some sample text and get immediate feedback on what your regex would match.
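
Once a pattern behaves in the tester, moving it into Python usually comes down to re.sub. Below is a minimal sketch of the three rules exactly as stated in the first post, not a finished solution. The names TAG, DONE_SKIPPED and clean_line are made up for illustration, and the sample is a shortened piece of the posted data. It assumes a tag never spans more than the text passed in, and it does not try to reproduce the other differences visible in the posted "Change To" line (removed quotes, reformatted date, and so on).

```python
import re

# Rule 1: anything from "<" through ">", plus one comma right after the ">".
TAG = re.compile(r"<[^>]*>,?")
# Rules 2 and 3: the whole word Done or Skipped, plus every comma that follows.
DONE_SKIPPED = re.compile(r"\b(?:Done|Skipped)\b,*")


def clean_line(line):
    """Apply the three rules to one line of text (illustrative helper)."""
    line = TAG.sub("", line)
    return DONE_SKIPPED.sub("", line)


# Shortened sample taken from the data in the first post.
sample = (
    "<UUT><H s='19' v='3.0'/>,<V t='s' s='2'/>Profile,Production,"
    "<S t='a' s='3'/>SetRTEConfig,Done,,"
    "<S t='a' s='3'/>InsertDCB,Skipped,,"
)
print(clean_line(sample))
# prints: Profile,Production,SetRTEConfig,InsertDCB,
```

For a 100 MB file, the same function could be applied line by line while reading, so the whole file never has to sit in memory, provided each tag stays on a single line.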


RE: Regular Expression - rzbddm - Oct-30-2018

Sorry to violate your rules. Thank you for the URL where I can experiment. I'll be back. Thanks.


RE: Regular Expression - nilamo - Oct-30-2018

You didn't violate anything :)
We'd just rather help you learn to fish than hand you a basket of fish.


RE: Regular Expression - stranac - Oct-30-2018

Looks like you just want to strip some XML tags.
Maybe an XML parser such as lxml would be a better fit?
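
To show the parser idea in runnable form, here is a rough sketch that uses html.parser from the standard library as a stand-in for lxml. That is a swap, not the suggestion above: lxml is a third-party package, and the sample as posted has no closing </UUT>, so a strict XML parser may reject it. The class TextOnly and the function strip_tags are hypothetical names. Note that this only drops the tags; the comma after a tag and the Done/Skipped entries would still need separate handling.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text that sits between tags (illustrative helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for text outside of tags; tags themselves are ignored.
        self.chunks.append(data)


def strip_tags(text):
    parser = TextOnly()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


# Shortened sample taken from the data in the first post.
sample = "<V t='s' s='2'/>Profile,Production,<V t='s' s='2'/>Cycle,Normal,"
print(strip_tags(sample))
# prints: Profile,Production,Cycle,Normal,
```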