Bottom Page

Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
 How to extract data between two strings
#1
Hi,
I have input data as below,
input.xlsx:

Group	 Name	                  Rank
Group1	 ABC_YJK_02_S_2019-08-01	2
	     ABC_YMK_5_S_2019-08-01	    5
	     ABC_JKL_S_2019-08-04	    10
Group2	 BCA_POL_S_2019-08-01	    4
	     BCA_PAL_S_2019-08-01	    5
	     BCA_TYP_S_2019-08-01	    50
	     BCA_DIST_S_2019-08-01	    23
	     BCA_STA_S_2019-08-01	    3

I name column, I only want to delete everything before the first _(underline) including underline, and remove everything after _S including _S

Then I want to write the output into out.xlsx

I use the below code, but it is not working. giving below error:
TypeError: expected string or bytes-like object



import pandas as pd
import re

df = pd.read_excel('D:\pivotdata2.xlsx',sheetname='merge')
#df['Group']=df['Group'].fillna(method='ffill')

df.to_excel('D:\writepivotdata.xlsx',index=False) 

result=[]
for index, row in df.iterrows():
    result_tmp=re.search('_ (.*?)_',row)
    result.append(result_tmp)
Quote
#2
you can use Regular Expressions (regex), but I'm not as skilled as I would like to be.

Alternatively you can use the following
line = "Group1   ABC_YJK_02_S_2019-08-01    2"
beginning, end = line.find('_'), line.find('_S_')
result = line[beginning+1 : end]
print(result)
or

table = [
"Group1   ABC_YJK_02_S_2019-08-01    2",
"         ABC_YMK_5_S_2019-08-01     5",
"         ABC_JKL_S_2019-08-04       10",
"Group2   BCA_POL_S_2019-08-01       4",
"         BCA_PAL_S_2019-08-01       5",
"         BCA_TYP_S_2019-08-01       50",
"         BCA_DIST_S_2019-08-01      23",
"         BCA_STA_S_2019-08-01       3"]

n = len(table)

resultsTable = []

for i in range(n):
    beginning, end = table[i].find('_'), table[i].find('_S_')
    result = table[i][beginning+1 : end]
    resultsTable.append(result)
Paul
Quote
#3
I want to replace resultsTable with

Group    Name      Rank
Group1   YJK_02    2
         YMK_5     5
         JKL       10
Group2   POL       4
         PAL       5
         TYP       50
         DIST      23
         STA       3
I use below code:
table = [
"Group1   ABC_YJK_02_S_2019-08-01    2",
"         ABC_YMK_5_S_2019-08-01     5",
"         ABC_JKL_S_2019-08-04       10",
"Group2   BCA_POL_S_2019-08-01       4",
"         BCA_PAL_S_2019-08-01       5",
"         BCA_TYP_S_2019-08-01       50",
"         BCA_DIST_S_2019-08-01      23",
"         BCA_STA_S_2019-08-01       3"]
 
n = len(table)
 
resultsTable = []
 
for i in range(n):
    beginning, end = table[i].find('_'), table[i].find('_S_')
    result = table[i][beginning+1 : end]
    resultsTable.append(result)
import pandas as pd
resultsTable['final_name']=pd.DataFrame(resultsTable)
writer = pd.ExcelWriter('output.xlsx')
df.to_excel(writer,'Sheet1')
df.to_excel(writer,'Sheet2')
writer.save()
Getting below error:
TypeError: list indices must be integers or slices, not str
Quote

Top Page

Possibly Related Threads...
Thread Author Replies Views Last Post
  fit each group and extract coefficients Progressive 1 264 Jul-20-2019, 08:20 AM
Last Post: scidam
  How to extract different data groups from multiple CSV files using python Rafiz 3 279 Jun-04-2019, 05:20 PM
Last Post: jefsummers
  Extract of matrix subpart using a deep copy paul18fr 2 229 May-02-2019, 06:49 AM
Last Post: paul18fr
  Creating matrix counting words in list of strings jazmad 2 487 Dec-23-2018, 05:47 PM
Last Post: jazmad
  OpenCV - extract 1st frame out of a video file kerzol81 2 1,377 Nov-12-2018, 09:12 AM
Last Post: kerzol81
  Result is doutfull- fuzzywuzzy process.extract method klllmmm 0 1,040 Jul-13-2018, 01:38 AM
Last Post: klllmmm
  How to extract only the last result from a code Python_TLS 1 501 Jun-27-2018, 10:01 AM
Last Post: gontajones
  How to extract previous and current rows if their value is equal SriRajesh 5 838 Jun-18-2018, 08:09 PM
Last Post: volcano63
  How to extract only time from the date_time? Jack_Sparrow 1 770 May-11-2018, 01:42 PM
Last Post: Larz60+
  Upload csv file as numbers (floating?) and extract element, row, and column bentaz 7 1,127 Mar-19-2018, 05:34 PM
Last Post: bentaz

Forum Jump:


Users browsing this thread: 1 Guest(s)