May-01-2023, 06:43 PM
Hi There,
I know this may seem like a silly question, since there are probably thousands of tutorials, but I just can't quite figure it out.
I have a DataFrame with a column (columnA) whose dtype is object, but each value is just a string of text like this:
'columnA': ['680 PACKAGES FOR SOAP NET WEIGHT: 17, 000. 00 KGS']
I want to extract NET WEIGHT: 17, 000. 00 KGS
This is what I've tried so far:
```python
import re

# Each attempt below overwrites the same 'Net Weight' column.
df['Net Weight'] = df['columnA'].str.extract('NET WEIGHT: (\d+ KGS)')
df['Net Weight'] = df['columnA'].str.extract('NET WEIGHT:? (\d+ KGS)')
df['Net Weight'] = df['columnA'].str.extract('NET\sWEIGHT:\s?(\d+\.?\d*\sKGS)')
df['Net Weight'] = df['columnA'].apply(
    lambda x: re.search(r'NET\sWEIGHT:\s([\d,]+\.\d+\sKGS)', x).group(1)
    if re.search(r'NET\sWEIGHT:\s([\d,]+\.\d+\sKGS)', x) else None
)
```
Nothing works; the new column still shows NaN values. I've been reading the docs for re and the pandas.Series.str methods, and I can't find anything that suits my needs.
Also, sometimes the net weight comes in other forms, like N.W.: or Net Weight; or Net Weight or Net WT:, and the endings vary, like KGS or KG.
I'm really not sure how to explore this further.
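One likely reason the attempts return NaN is that the number in the sample contains spaces after the comma and the period ("17, 000. 00"), which patterns like `\d+ KGS` or `[\d,]+\.\d+` do not allow. Below is a minimal sketch of one way to handle that, not a definitive answer. The extra sample rows, the column name 'Net Weight (kg)', and the helper names LABEL, SEPARATOR, NUMBER, UNIT, and PATTERN are made up for illustration, and the numeric conversion assumes the comma is a thousands separator and the period is the decimal point.

```python
import re

import pandas as pd

# Sample data: the first row is from the question, the others are
# hypothetical rows covering the label and unit variants mentioned.
df = pd.DataFrame({
    'columnA': [
        '680 PACKAGES FOR SOAP NET WEIGHT: 17, 000. 00 KGS',
        '12 CARTONS N.W.: 1,250.5 KG',
        '40 DRUMS Net WT: 980 KGS',
        '5 PALLETS NO WEIGHT GIVEN',
    ]
})

# Pattern pieces, kept separate so each variant is easy to adjust.
LABEL = r"(?:NET\s*(?:WEIGHT|WT)|N\.\s*W\.?)"  # NET WEIGHT, NET WT, N.W.
SEPARATOR = r"\s*[:;]?\s*"                     # optional ':' or ';'
NUMBER = r"\d[\d,.\s]*?"                       # digits with stray commas, dots, spaces
UNIT = r"\s*KGS?\b"                            # KG or KGS

# Outer group captures the whole phrase, inner group only the number.
PATTERN = rf"(?P<net_weight>{LABEL}{SEPARATOR}(?P<amount>{NUMBER}){UNIT})"

# With named groups, str.extract returns one column per group.
extracted = df['columnA'].str.extract(PATTERN, flags=re.IGNORECASE)

df['Net Weight'] = extracted['net_weight']

# Assumption: ',' is a thousands separator and '.' is the decimal point.
df['Net Weight (kg)'] = pd.to_numeric(
    extracted['amount'].str.replace(r"[,\s]", "", regex=True)
)

print(df[['Net Weight', 'Net Weight (kg)']])
```

Rows with no recognizable label stay NaN in both columns, so they are easy to filter out and inspect when a new label variant turns up.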