remove vowels in word with conditional

ambrozote · May-01-2021, 05:55 PM

Hello,

I am trying to shorten long strings by applying some rules to them, but I have not been able to figure it out.

Here is the code I have tried so far w/o success.
The string in endString is my end goal using the startString sample.
The rules are added as comments below.

Thank you in advance for your help

startString = "Masonry - Concrete Block (Small) - Misc Air Layer - Insulation - Aluminium"
endString = "Msnr-CncrBlck(Smll)-MiscAirLyr-Insltn-Almnm"

#rules
#words => 4 chars are left untouched
#words > 4 chars have all lowercase vowels removed
#special chars are kept, such as parenthesis and hyphens

words = startString.split()

for index in range(len(words)):
    if len(words[index]) > 4:
        vowels = ('a', 'e', 'i', 'o', 'u')
        for x in words[index]:
            if x in vowels:
                shorter = words[index].replace(x, "")

endStringByCode = "".join(shorter)
print (endStringByCode)

supuflounder · May-01-2021, 06:04 PM

Don't you mean "<= 4" in line 5?

The problem is that you are doing

shorter = words[index].replace(x, "")

So let's look at what happens

Let words[index] be the string "I don't want any lower-case vowels".
The first time through the loop starting on line 15, your first match is the "o" in "don't". replace() removes every "o", so shorter is set to
"I dn't want any lwer-case vwels"
The next time there is a match it matches the "a" in "want", but replace() starts again from the original words[index], so shorter is set to
"I don't wnt ny lower-cse vowels"
Do you see the problem?

What you want to do is always test shorter, not words[index]. Which is a bit tricky, because you have removed the letter, so you have to manage the range. I'm not going to write the code for your homework, but this is as much help as I will give.
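
To make the difference concrete, here is a minimal sketch on a single word. The names `word`, `lost`, and `kept` are made up for illustration and are not from the original post; it only contrasts "replace on the original" with "replace on the running result".

```python
# Minimal sketch of the bug; `word`, `lost`, and `kept` are illustrative names.
vowels = ('a', 'e', 'i', 'o', 'u')
word = "Concrete"

# Buggy pattern: each replace starts again from the untouched original,
# so only the most recent removal survives.
lost = word
for x in word:
    if x in vowels:
        lost = word.replace(x, "")

# Accumulating pattern: each replace works on the previous result.
kept = word
for x in word:
    if x in vowels:
        kept = kept.replace(x, "")

print(lost)  # Concrt
print(kept)  # Cncrt
```

Note that replace() removes every occurrence of the letter at once, so the second "e" in "Concrete" is a harmless no-op in the accumulating loop.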

perfringo · (This post was last modified: May-01-2021, 08:55 PM by perfringo.)

There is no rule about spaces, but they are still removed. Why?

Also:

Masonry -> Msnr but Layer -> Lyr. Why is y removed in the first case but not in the second? Both words are longer than 4 chars.

Concrete -> Cncr. Why is t removed?

"words => 4 chars are left untouched" should probably be <=, otherwise the rules don't make sense.

jefsummers · May-01-2021, 08:27 PM

OK, I am trusting this is not homework, which means I may be duped.
There are a number of issues. See this code, which gives what you want:

startString = "Masonry - Concrete Block (Small) - Misc Air Layer - Insulation - Aluminium"
endString = "Msnr-CncrBlck(Smll)-MiscAirLyr-Insltn-Almnm"

#rules
#words <= 4 chars are left untouched
#words > 4 chars have all lowercase vowels removed
#special chars are kept, such as parentheses and hyphens

words = startString.split()
vowels = ('a', 'e', 'i', 'o', 'u', 'y')  # 'y' is treated as a vowel here
out_string = ''

for word in words:
    if len(word) > 4:
        # build the shortened word from the characters that are not vowels
        shorter = ''
        for x in word:
            if x not in vowels:
                shorter = shorter + x
    else:
        # short words (and lone hyphens) pass through unchanged
        shorter = word

    out_string = out_string + shorter

print(out_string)

Any time you have range(len(... there is usually a better way. In this case, for word in words, so you don't have to mess with an index at all. I moved the definition of vowels outside the loop, as redefining it on every pass is wasteful. Rather than removing characters from the start string (btw - camelCase is frowned upon for variable names, better to use underscores like in out_string), I build the shorter string from the acceptable characters if the word is long enough. I added 'y' as a vowel because in your example it is removed. Perfringo is right, by using split you lose the spaces, but I assume that is ok?
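
The same idea can be packed into a small function. This is only a sketch: `shorten`, `vowels`, and `min_len` are illustrative names, and it assumes the stated rule that only words longer than 4 characters are shortened, with 'y' counted as a vowel.

```python
# Sketch only: `shorten` and its parameters are illustrative names,
# not part of the original posts.
def shorten(text, vowels="aeiouy", min_len=5):
    """Drop lowercase vowels from every whitespace-separated word of at
    least min_len characters, then join the words without spaces."""
    pieces = []
    for word in text.split():
        if len(word) >= min_len:
            word = "".join(ch for ch in word if ch not in vowels)
        pieces.append(word)
    return "".join(pieces)

start_string = "Masonry - Concrete Block (Small) - Misc Air Layer - Insulation - Aluminium"
print(shorten(start_string))
# Msnr-CncrtBlck(Smll)-MiscAirLr-Insltn-Almnm
```

The result differs from endString in two places ("Cncrt" vs "Cncr", "Lr" vs "Lyr"). No single consistent rule produces both "Msnr" and "Lyr", so the target string itself would need adjusting.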

menator01 · May-01-2021, 09:21 PM

I'm going to put my version as well.

newwords = []
vowels = ('a', 'e', 'i', 'o', 'u')
for word in words:
    if len(word) > 4:
        for letter in word:
            if letter in vowels:
                word = word.replace(letter, '')
        newwords.append(word)
    else:
        newwords.append(word)
print(' '.join(newwords))

Output:
Msnry - Cncrt Blck (Smll) - Misc Air Lyr - Insltn - Almnm

perfringo · (This post was last modified: May-01-2021, 09:28 PM by perfringo.)

(May-01-2021, 08:27 PM)jefsummers Wrote: See this code, which gives what you want

As the conditions are ambiguous, it's hard to say what the wanted output is. For example: should (oye) -> (oye) or (oye) -> ()? Your code does the latter, but I personally think that the former should be correct (a three-letter word in parentheses, i.e. an 'untouched word').
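
One possible way to get the former behaviour, as a sketch: measure a word by its letters only, so that parentheses do not push a short word over the limit. `shorten_word` is a hypothetical helper, and counting only alphabetic characters is an assumption, not something the original rules state.

```python
# Sketch of one possible reading of the rules; `shorten_word` is a
# hypothetical helper, and counting only letters is an assumption.
VOWELS = "aeiouy"

def shorten_word(word):
    letters = sum(ch.isalpha() for ch in word)
    if letters <= 4:
        return word  # short word: left untouched, punctuation and all
    return "".join(ch for ch in word if ch not in VOWELS)

print(shorten_word("(oye)"))    # (oye)
print(shorten_word("(Small)"))  # (Smll)
```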

jefsummers · May-01-2021, 09:36 PM

By gives you what you want I meant that the output matches his example.

I am concerned about some other examples such as menator's that modify the item being iterated upon.

menator01 · May-01-2021, 09:52 PM

I do not understand. I modified the item being iterated? I'm still learning. Could you explain please? Thanks.

jefsummers · May-02-2021, 12:43 PM

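Menator - in line 5 you are iterating over word. In line 7 you reassign word. With a mutable sequence such as a list, that pattern bites: say you are on the 6th item of 8 and you remove it. Python then moves to the 7th position, which now holds what was the 8th item, skipping the original 7th. Strings are immutable, so your loop keeps walking the original string and your output is correct, but I would still avoid reassigning the name you are looping over inside the loop.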
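
A quick check of both cases, as a sketch with illustrative variable names. With a str, the for loop holds on to the original string object, so rebinding the name does not disturb the iteration. With a list changed in place, items really are skipped.

```python
# Sketch comparing the two cases; the variable names are illustrative.
vowels = ('a', 'e', 'i', 'o', 'u')

# Case 1: str. The for loop keeps iterating over the original string
# object, even after the name `word` is rebound to a new string.
word = "Aluminium"
for letter in word:
    if letter in vowels:
        word = word.replace(letter, '')
print(word)  # Almnm

# Case 2: list. Removing items in place shifts the remaining ones,
# so the element right after each removal is skipped.
letters = list("Aluminium")
for letter in letters:
    if letter in vowels:
        letters.remove(letter)
print(''.join(letters))  # Almnum (one 'u' was skipped)
```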

ambrozote · May-02-2021, 02:43 PM

(May-01-2021, 08:27 PM)jefsummers Wrote: See this code, which gives what you want: [...]

I like this approach and I can understand your explanation.
Also, thank you for your advice on best practice.
It is also ok to lose the spaces, as my goal here is to make the whole string shorter while still keeping it human readable.

Thank you very much for your help :)
