Python Forum
Identify two specific words next to each
#1
Hi,

I'm trying to identify two specific words next to each other and remove them from a string. I thought of using replace(), but that seems to work on only one word.
E.g. I would receive text like this:
message = '''
Good Morning

We need your input please.


Vriendelike groete/ Kind regards

Badu Thusong

Direct tel: 021 974 7313 | Email:
[email protected]


Dear Boss


How are you

today

Your number

Branch Agency: Meme

Branch Agency Code: 0329271

Thank you for contacting us


Kind regards

Agriculture Contact Centre

'''

So I want to go through it, and whenever I find
'Kind regards'
or
'Vriendelike groete/ Kind regards'
I want to remove it from the message.

Any advice?
#2
What's wrong with str.replace?

In [1]: for phrase in ['Kind regards', 'Vriendelike groete/ Kind regards']: 
   ...:     message = message.replace(phrase, '')
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
#3
It removes "Kind Regards" from the first email in the text, but everything below it still contains "Kind Regards".
#4
My mistake - the order should be reversed. Otherwise 'Kind regards' is replaced on the first pass of the loop and the second, longer phrase can no longer match (its 'Kind regards' part has already been removed).

In [1]: message = '\nGood Morning\n\nWe need your input please.\n\n\nVriendelike groete/ Kind regards\n\nBadu Thusong\n\nDirect tel: 021 974 7313 | Email:\nBadu@thusong.com\n\n\nDear Boss\n\n\nHow are you\n\ntoday\n\nYour number\n\nBranch Agency: Meme\n\nBranch Agency Code: 0329271\n\nThank you for contacting us\n\n\nKind regards\n\nAgriculture Contact Centre'

In [2]: for phrase in ['Vriendelike groete/ Kind regards', 'Kind regards']:
   ...:     message = message.replace(phrase, '')
   ...:

In [3]: message
Out[3]: '\nGood Morning\n\nWe need your input please.\n\n\n\n\nBadu Thusong\n\nDirect tel: 021 974 7313 | Email:\nBadu@thusong.com\n\n\nDear Boss\n\n\nHow are you\n\ntoday\n\nYour number\n\nBranch Agency: Meme\n\nBranch Agency Code: 0329271\n\nThank you for contacting us\n\n\n\n\nAgriculture Contact Centre'
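
If the list of phrases grows, getting the order right by hand becomes error-prone. One possible way around that is to sort the phrases by length so the longest is always tried first. This is only a sketch: the helper name remove_phrases and the shortened sample text are made up for illustration.

```python
def remove_phrases(text, phrases):
    """Remove every phrase from text, trying longer phrases first."""
    # Longest first, so a short phrase cannot break up a longer one that contains it.
    for phrase in sorted(phrases, key=len, reverse=True):
        text = text.replace(phrase, '')
    return text


# Shortened, made-up sample; the phrases are deliberately given in the "wrong" order.
sample = 'Vriendelike groete/ Kind regards\nBadu\n\nKind regards\nContact Centre'
cleaned = remove_phrases(sample, ['Kind regards', 'Vriendelike groete/ Kind regards'])
print(repr(cleaned))
```

With this, the caller no longer has to care about the order in which the phrases are listed.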
#5
I also have the same problem.
#6
My code doesn't recognize "Kind Regards" at all on its own; it only removes it when it is next to "Vriendelike groete":

import re
from bs4 import BeautifulSoup
import string


message = '''
Dear Sir

What do you think


Kind regards

Badu Thusong

Direct tel: 021 974 7313 | Email:
[email protected]

Good Morning

We need your input please.


Vriendelike groete/ Kind regards

Badu Thusong

Direct tel: 021 974 7313 | Email:
[email protected]


Dear Boss


How are you

today

Your number

Branch Agency: Meme

Branch Agency Code: 0329271

Thank you for contacting us


Kind regards

Agriculture Contact Centre
'''

for phrase in ['Vriendelike groete/Kind regards', 'Vriendelike groete/ Kind regards']:
    text = message.replace(phrase, '')
print(text)
Output:
Dear Sir What do you think Kind regards Badu Thusong Direct tel: 021 974 7313 | Email: [email protected] Good Morning We need your input please. Badu Thusong Direct tel: 021 974 7313 | Email: [email protected] Dear Boss How are you today Your number Branch Agency: Meme Branch Agency Code: 0329271 Thank you for contacting us Kind regards Agriculture Contact Centre
#7
If you do the replacement with str.replace, you'll run into problems with spelling and upper/lower case.
You can use regular expressions for this task. (Don't use them to parse HTML.)

Here is a simple example:

andre@andre-GP70-2PE:~$ ipython
Python 3.7.3 (default, Apr 15 2019, 14:17:18) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

   ...:   
   ...: Vriendelike groete/ Kind regards 
   ...:   
   ...: Badu Thusong 
   ...:   
   ...: Direct tel: 021 974 7313 | Email: 
   ...: [email protected] 
   ...:   
   ...:   
   ...: Dear Boss 
   ...:   
   ...:   
   ...: How are you 
   ...:   
   ...: today 
   ...:   
   ...: Your number 
   ...:   
   ...: Branch Agency: Meme 
   ...:   
   ...: Branch Agency Code: 0329271 
   ...:   
   ...: Thank you for contacting us 
   ...:   
   ...:   
   ...: Kind regards 
   ...:   
   ...: Agriculture Contact Centre 
   ...: '''                                                                                                                                   

In [2]: import re                                                                                                                             

In [3]: re.sub(r'[kK]ind [rR]egards', 'Best greetings', message)                                                                              
Out[3]: '\nDear Sir\n \nWhat do you think\n \n \nBest greetings\n \nBadu Thusong\n \nDirect tel: 021 974 7313 | Email:\[email protected]\n \nGood Morning\n \nWe need your input please.\n \n \nVriendelike groete/ Best greetings\n \nBadu Thusong\n \nDirect tel: 021 974 7313 | Email:\[email protected]\n \n \nDear Boss\n \n \nHow are you\n \ntoday\n \nYour number\n \nBranch Agency: Meme\n \nBranch Agency Code: 0329271\n \nThank you for contacting us\n \n \nBest greetings\n \nAgriculture Contact Centre\n'

In [4]: print(re.sub(r'[kK]ind [rR]egards', 'Best greetings', message))                                                                       

Dear Sir
 
What do you think
 
 
Best greetings
 
Badu Thusong
 
Direct tel: 021 974 7313 | Email:
[email protected]
 
Good Morning
 
We need your input please.
 
 
Vriendelike groete/ Best greetings
 
Badu Thusong
 
Direct tel: 021 974 7313 | Email:
[email protected]
 
 
Dear Boss
 
 
How are you
 
today
 
Your number
 
Branch Agency: Meme
 
Branch Agency Code: 0329271
 
Thank you for contacting us
 
 
Best greetings
 
Agriculture Contact Centre
The regex matches on:
  • kind regards
  • kind Regards
  • Kind regards
  • Kind Regards

I always point people to regex101.com, because you can test your regex there.
There are also offline tools for checking a regex. Regex isn't easy at the beginning, but the more you use it, the more you'll like it for text processing.
But never forget: regex is not a good solution for everything.
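
To remove the sign-off instead of swapping in another greeting, the same idea can be taken a bit further. The following is a rough sketch, not a definitive solution: it assumes the only variants are "Kind regards" with an optional "Vriendelike groete/" in front, and the names SIGN_OFF and strip_sign_offs and the short sample text are made up for illustration. re.IGNORECASE covers every upper/lower case combination, not just the first letters.

```python
import re

# Optional "Vriendelike groete/" prefix, then "kind regards" in any letter case.
SIGN_OFF = re.compile(r'(?:vriendelike\s+groete\s*/\s*)?kind\s+regards', re.IGNORECASE)


def strip_sign_offs(text):
    """Return text with every matching sign-off phrase removed."""
    return SIGN_OFF.sub('', text)


# Shortened, made-up sample text.
sample = 'Hello\nVriendelike groete/ Kind regards\nBadu\n\nKIND REGARDS\nCentre'
stripped = strip_sign_offs(sample)
print(repr(stripped))
```

Because the prefix is optional inside a single pattern, the order problem from the str.replace loop does not come up here.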
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#8
(Apr-26-2019, 07:50 AM)stahorse Wrote: My code doesn't recognize "Kind Regards" at all on its own, it only removes it when it is next to "Vriendelike groete"

This is expected behaviour: your code replaces only instances of 'Vriendelike groete/Kind regards' and 'Vriendelike groete/ Kind regards'. It can't and shouldn't replace the standalone phrase 'Kind regards'.
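
A side note on the loop itself: a line like text = message.replace(phrase, '') starts again from the untouched message on every pass, so only the replacement from the last phrase in the list survives. The sketch below contrasts that with a loop that keeps building on its own result. The sample text is shortened and made up, and the names overwritten and accumulated exist only for this illustration.

```python
message = 'Vriendelike groete/ Kind regards\nBadu\n\nKind regards\nCentre'
phrases = ['Vriendelike groete/ Kind regards', 'Kind regards']

# Reading from `message` on every pass: only the last phrase's replacement survives.
for phrase in phrases:
    overwritten = message.replace(phrase, '')

# Reading from and writing to the same name: each pass builds on the previous one.
accumulated = message
for phrase in phrases:
    accumulated = accumulated.replace(phrase, '')

print(repr(overwritten))
print(repr(accumulated))
```

In the first loop 'Vriendelike groete/ ' is still present at the end, because the result of the first pass was thrown away.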

I suspect that this is homework. If so, .replace is probably what your teachers want you to learn. If it's a real-life scenario, go with the solution provided by DeaD_EyE. However, it beats me why someone would need to replace some (insignificant) part of a string for real. Usually one has to retrieve data from a string, not replace it. The resulting string is no better for retrieving/parsing/structuring the data it contains. But maybe that's just me :)
#9
lol I'm not a student; this is my 4th week learning Python. I've now moved on to looking at other people's code here at work and I'm trying to play around with it. Otherwise I'm a BI Developer, trying to learn a new skill.

#10
(Apr-26-2019, 09:05 AM)stahorse Wrote: lol I'm not a student, this is my 4th week learning Python. I've now upgraded to looking at other people's code here at work and I'm trying to play around it. Otherwise I'm a BI Developer, trying to learn a new skill.

Enjoy your journey in the wonderful world of Python!

My comment was on practical grounds - only homework tends to accomplish something that is useless. And this is a kind of homework you assigned to yourself ;)

You can try a 'practical' application as well:

if 'regards' in e_mail_body.lower():
    print('Polite person')
else:
    print('Impolite or foreigner')