Posts: 5
Threads: 2
Joined: May 2021
Hello,
I am trying to shorten long strings by applying some rules to them, but I can't figure it out.
Here is the code I have tried so far, without success.
The string in endString is my end goal, using the startString sample.
The rules are added as comments below.
Thank you in advance for your help.
startString = "Masonry - Concrete Block (Small) - Misc Air Layer - Insulation - Aluminium"
endString = "Msnr-CncrBlck(Smll)-MiscAirLyr-Insltn-Almnm"
#rules
#words => 4 chars are left untouched
#words > 4 chars have all lowercase vowels removed
#special chars are kept, such as parenthesis and hyphens
words = startString.split()
for index in range(len(words)):
    if len(words[index]) > 4:
        vowels = ('a', 'e', 'i', 'o', 'u')
        for x in words[index]:
            if x in vowels:
                shorter = words[index].replace(x, "")
                endStringByCode = "".join(shorter)
print (endStringByCode)
Posts: 56
Threads: 2
Joined: Jan 2021
Don't you mean "<= 4" in line 5?
The problem is that you are doing
shorter = words[index].replace(x, "")
So let's look at what happens.
Let words[index] be the string "I don't want any lower-case vowels".
The first time through the loop starting on line 15, your first match is the "o" in "don't". Since replace removes every "o" in the string, shorter is set to
"I dn't want any lwer-case vwels"
The next time there is a match, it matches the "a" in "want", so shorter is set to
"I don't wnt ny lower-cse vowels"
Do you see the problem?
What you want to do is always test shorter, not words[index]. Which is a bit tricky, because you have removed the letter, so you have to manage the range. I'm not going to write the code for your homework, but this is as much help as I will give.
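One way to act on this, as a minimal sketch rather than the only fix: str.replace returns a new string with every occurrence removed, so if shorter starts as a copy of the word and replace is called on shorter itself, the removals accumulate and no index management is needed. The helper name strip_vowels is made up for illustration.

```python
VOWELS = ('a', 'e', 'i', 'o', 'u')

def strip_vowels(text):
    """Remove every lowercase vowel from text (hypothetical helper)."""
    shorter = text
    for vowel in VOWELS:
        # replace() returns a new string; reassign so each removal is kept
        shorter = shorter.replace(vowel, "")
    return shorter

print(strip_vowels("I don't want any lower-case vowels"))
# I dn't wnt ny lwr-cs vwls
```

Looping over the vowels instead of over the characters of the word means each vowel is replaced at most once, and the word being tested is always the partially shortened one.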
Posts: 1,950
Threads: 8
Joined: Jun 2018
May-01-2021, 06:30 PM
(This post was last modified: May-01-2021, 08:55 PM by perfringo.)
There is no rule about spaces, but they are still removed. Why?
Also:
Masonry -> Msnr but Layer -> Lyr. Why is y removed in the first case but not in the second? Both words are longer than 4 chars.
Concrete -> Cncr. Why is t removed?
"words => 4 chars are left untouched" should probably be <=, otherwise the rules don't make sense.
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 1,358
Threads: 2
Joined: May 2019
OK, I am trusting this is not homework, which means I may be duped.
There are a number of issues. See this code, which gives what you want:
startString = "Masonry - Concrete Block (Small) - Misc Air Layer - Insulation - Aluminium"
endString = "Msnr-CncrBlck(Smll)-MiscAirLyr-Insltn-Almnm"
#rules
#words => 4 chars are left untouched
#words > 4 chars have all lowercase vowels removed
#special chars are kept, such as parenthesis and hyphens
words = startString.split()
vowels = ('a', 'e', 'i', 'o', 'u', 'y')
out_string = ''
for word in words:
    if len(word) > 4:  # only words longer than 4 chars are shortened
        shorter = ''
        for x in word:
            if x not in vowels:
                shorter = shorter + x
    else:
        shorter = word
    out_string = out_string + shorter
print (out_string)
Any time you have range(len(...)) there is usually a better way. In this case, for word in words. You don't have to mess with index at all. I moved the definition of vowels outside the loop, as doing that every time is inefficient. Rather than removing characters from the start string (btw, camelCase is frowned upon; better to use underscores like in out_string), I build the shorter string from acceptable characters if the word is long enough. I added 'y' as a vowel, as in your example it is to be removed. Perfringo is right: by using split you lose the spaces, but I assume that is ok?
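The same idea can be written more compactly with str.join and a generator expression. This is only a sketch: the function name shorten and the min_len parameter are illustrative, and 'y' is treated as a vowel as in the code above. Because the example's rules are not fully consistent, the result comes close to endString without matching it exactly ("Cncrt" and "Lr" instead of "Cncr" and "Lyr").

```python
VOWELS = set("aeiouy")  # 'y' included, as in the code above

def shorten(text, min_len=4):
    """Drop lowercase vowels from words longer than min_len, then join without spaces."""
    parts = []
    for word in text.split():
        if len(word) > min_len:
            # keep only the characters that are not lowercase vowels
            word = "".join(ch for ch in word if ch not in VOWELS)
        parts.append(word)
    return "".join(parts)

start = "Masonry - Concrete Block (Small) - Misc Air Layer - Insulation - Aluminium"
print(shorten(start))
# Msnr-CncrtBlck(Smll)-MiscAirLr-Insltn-Almnm
```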
Posts: 1,145
Threads: 114
Joined: Sep 2019
I'm going to put my version as well.
newwords = []
vowels = ('a', 'e', 'i', 'o', 'u')
for word in words:
    if len(word) > 4:
        for letter in word:
            if letter in vowels:
                word = word.replace(letter, '')
        newwords.append(word)
    else:
        newwords.append(word)
print(' '.join(newwords))
Output: Msnry - Cncrt Blck (Smll) - Misc Air Lyr - Insltn - Almnm
Posts: 1,950
Threads: 8
Joined: Jun 2018
May-01-2021, 09:28 PM
(This post was last modified: May-01-2021, 09:28 PM by perfringo.)
(May-01-2021, 08:27 PM)jefsummers Wrote: See this code, which gives what you want
As the conditions are ambiguous, it's hard to say what the wanted output is. For example: should (oye) -> (oye) or (oye) -> ()? Your code does the latter, but I personally think the former is correct (a three-letter word in parentheses, i.e. an 'untouched word').
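If the intended rule is that only the letters count toward a word's length, one possible sketch is to let a regular expression pick out runs of letters and shorten each run on its own, so parentheses and hyphens never enter the length check. The names shorten_word and shorten are hypothetical, and this variant keeps the spaces and does not treat 'y' as a vowel; both are assumptions, not something the original rules settle.

```python
import re

VOWELS = "aeiou"

def shorten_word(match):
    """Shorten a single run of letters found by re.sub."""
    word = match.group(0)
    if len(word) <= 4:
        return word  # short words are left untouched
    return "".join(ch for ch in word if ch not in VOWELS)

def shorten(text):
    # [A-Za-z]+ matches runs of letters only, so '(' and ')' are never counted
    return re.sub(r"[A-Za-z]+", shorten_word, text)

print(shorten("(oye)"))    # (oye)
print(shorten("(Small)"))  # (Smll)
```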
Posts: 1,358
Threads: 2
Joined: May 2019
By "gives what you want" I meant that the output matches his example.
I am concerned about some other examples such as menator's that modify the item being iterated upon.
Posts: 1,145
Threads: 114
Joined: Sep 2019
I do not understand. I modified the item being iterated? I'm still learning. Could you explain please? Thanks.
Posts: 1,358
Threads: 2
Joined: May 2019
Menator - in line 5 you are iterating over word. In line 7 you modify word. So, let's say you are on the 6th letter of an 8 letter word, and you eliminate that letter. Python then moves to the 7th letter, which was the 8th letter, skipping the 7th.
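This concern is easy to check with a small experiment. As far as the sketch below shows, a for loop over a string keeps walking the original string object, and word = word.replace(...) only rebinds the name to a new string, so in this particular case no letters are skipped. The skipping described above does happen when a mutable sequence such as a list is changed in place during iteration. The sample values are illustrative.

```python
word = "Aluminium"
seen = []
for letter in word:
    seen.append(letter)
    if letter in "aeiou":
        # rebinds the name; the loop still walks the original string
        word = word.replace(letter, "")

print("".join(seen))  # Aluminium
print(word)           # Almnm

# Contrast: removing items from a list while iterating over it skips elements
letters = list("aabb")
for letter in letters:
    if letter == "a":
        letters.remove(letter)
print(letters)  # ['a', 'b', 'b']
```

Even so, building a new string from the characters to keep avoids the question entirely and is easier to reason about.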
Posts: 5
Threads: 2
Joined: May 2021
(May-01-2021, 08:27 PM)jefsummers Wrote: OK, I am trusting this is not homework, which means I may be duped.
There are a number of issues. See this code, which gives what you want:
I like this approach and I can understand your explanation.
Also, thank you for your advice on best practice.
It is also OK to lose the spaces, as my goal here is to make the whole string shorter while still keeping it human-readable.
Thank you very much for your help :)