Python Forum

Full Version: How to split at specified delimiter
Hi,
I have the DataFrame below:
input:

VA004L200E2T99
VG002K500E4T99
VG002K500E4R45
VN009K30E6T00
VA007K100E2T40

I want to split at "E" and extract the numerical part to the right & left side of "E"
Desired output:

200E2
500E4
500E4
30E6
100E2

I tried the code below, but it removes the specified character.
import re
out=re.sub(r'E', '', input)
print(out)
You can split at the E, but then you would have to insert the E again at some point:
>>> 'VA004L200E2T99'.split('E')
['VA004L200', '2T99']
If the letter before the number is always K or L, you could split the first part again at that letter to parse out the 200, and do the same with T and R on the second part. Or you could use regex, which I am horrible at.
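The split idea can also be sketched without regex and without assuming which letters surround the number. This is only an illustration: the helper name extract_around_e is made up here, and it assumes the first "E" in the string is the one of interest.

```python
def extract_around_e(code):
    """Return the digits just left of the first 'E', the 'E', and the digits just right of it."""
    left, sep, right = code.partition("E")
    if not sep:
        return None  # no 'E' in the string

    # Walk backwards over the trailing digits of the left part.
    i = len(left)
    while i > 0 and left[i - 1].isdigit():
        i -= 1

    # Walk forwards over the leading digits of the right part.
    j = 0
    while j < len(right) and right[j].isdigit():
        j += 1

    return left[i:] + sep + right[:j]


codes = ["VA004L200E2T99", "VG002K500E4T99", "VG002K500E4R45",
         "VN009K30E6T00", "VA007K100E2T40"]
results = [extract_around_e(c) for c in codes]
print(results)
# ['200E2', '500E4', '500E4', '30E6', '100E2']
```

Using partition keeps the separator, so the E does not have to be glued back in by hand.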
But I need to extract my output as:
200E2
metulburr Wrote: Or you could use regex, which I am horrible at.
Regex is not difficult to learn, and it is very powerful for some tasks.
>>> import re
>>> regex = re.compile(
... r"\d*" # zero or more digits
... r"E"   # the letter E
... r"\d*" # again zero or more digits
... )
>>> match = regex.search('VA004L200E2T99')
>>> match.group()
'200E2'
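One caveat with \d*: because it allows zero digits, the pattern also matches a bare "E" with no number around it. A slightly stricter sketch, assuming every value has at least one digit on each side of the E, uses \d+ and runs over all five inputs from the question. The variable names and the "NOEL" test string are only for illustration.

```python
import re

# \d+ requires at least one digit on each side of the E.
pattern = re.compile(r"\d+E\d+")

codes = ["VA004L200E2T99", "VG002K500E4T99", "VG002K500E4R45",
         "VN009K30E6T00", "VA007K100E2T40"]

extracted = []
for code in codes:
    match = pattern.search(code)
    # search() returns None when nothing matches, so guard before calling group().
    extracted.append(match.group() if match else None)

print(extracted)
# ['200E2', '500E4', '500E4', '30E6', '100E2']

# The looser pattern accepts an E with no digits around it; the stricter one does not.
print(re.search(r"\d*E\d*", "NOEL").group())  # E
print(pattern.search("NOEL"))                 # None
```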
I tried as below:

import re
regex = re.compile(r"\d*" r"\d*")
match = regex.search('VA004L200E2T99')
aa=match.group()
print(aa)
but it does not print any output.
(Dec-27-2018, 02:41 PM)SriRajesh Wrote: but it does not print any output.
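He ran it in the interactive shell, where the result of match.group() is echoed automatically. Note also that your pattern dropped the E.
As a script, without the >>> prompts, it looks like this: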
import re

regex = re.compile(r"\d*E\d*")
match = regex.search('VN009K30E6T00')
aa = match.group()
print(aa)
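To see why the earlier attempt seemed to print nothing, here is a small sketch. Adjacent string literals are concatenated, so r"\d*" r"\d*" is just r"\d*\d*", which is allowed to match zero characters. The search therefore succeeds right at the start of the string with an empty match. The variable name without_e is only illustrative.

```python
import re

# Same as re.compile(r"\d*\d*"): the E is missing from the pattern.
without_e = re.compile(r"\d*" r"\d*")
match = without_e.search("VA004L200E2T99")

# The string starts with 'V', so zero digits are matched at position 0.
print(repr(match.group()))  # ''
print(match.span())         # (0, 0)

# With the E in place, the search has to move on to the real target.
with_e = re.compile(r"\d*E\d*")
print(with_e.search("VA004L200E2T99").group())  # 200E2
```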
SriRajesh Wrote: I have the DataFrame below:
When using pandas, you should not take the values out of the DataFrame to run a regex on them.
pandas has built-in regex support through the .str accessor for working on DataFrame columns.
Example:
import pandas as pd

df = pd.DataFrame(
    [['VA004L200E2T99',77, 999],
     ['VG002K500E4T99',55, 100],
     ['VN009K30E6T00',33, 9]],
     columns=['val','foo', 'bar']
)

df['val'] = df['val'].str.extract(r'(\d*E\d*)', expand=False)
print(df)
Output:
     val  foo  bar
0  200E2   77  999
1  500E4   55  100
2   30E6   33    9