Problem processing items in list

PythonNewbee · (This post was last modified: Nov-07-2020, 01:05 PM by Larz60+.)

I'm trying to write code that analyzes tickets logged with our IT service desk.
I want to find the reasons users log incidents. To do this I have an Excel sheet containing records of the incidents and requests logged, which I read into a pandas DataFrame. I then use the "Title" field of the DataFrame to analyse the tickets: I clean the text and tokenize it with nltk.
Once I have a cleaned-up list of ticket titles, I iterate through it looking for specific keywords. Every keyword has a counter, which is incremented when the keyword is found.
The code runs without any errors, but for some reason it does not process the entire list. I cannot find why it misses some keywords in the list, and I hope someone with a fresh eye can look at it and spot the mistake.

Here is the list I analyze:

Output:
vdescription_tokens = ['unable attach pdf word doc via facebook twitter', 'digital agent unable send new email', 'ivr routing concern', 'call genesys tool', 'help lockdown version', 'knighten mukuru unable login genesys pure cloud', 'social interaction routing iw ison sdh', 'static silent call genesys', 'call center agent line issue', 'genesys concern', 'pure cloud report scheduler', 'genesys cloud gotv email query routing', 'call comming', 'africa multichoice africa callcabinet alert smtp', 'genesys workspace error call destination invalid', 'query stuck twitter queue', 'genesys user able logon purecloud', 'investigate edge device', 'genesys purecloud interaction routing cloud', 'genesys purecloud interaction stuck queue routing cloud', 'genesys pure cloud query queue counting shift openning october', 'genesys concern', 'genesys email intercactions routing', 'genesys genesysout service', 'randburg pure cloud interaction search', 'interaction routing genesys', 'parameter', 'gotv', 'genesys service', 'genesysy cloud call center service level dropping agent idle status', 'purecloud login failure', 'eb primary gen server', 'eb south africa genesys unable answer call', 'purecloud login failure', 'unable extract interaction genesys', 'genesys social interaction routing', 'ivr audio upload', 'purecloud login failure', 'genesys silent call affecting inbound', 'genesys interaction routing agent', 'purecloud login failure', 'genesys dstv', 'email interaction coming opening hour', 'pure cloud issue', 'lusaka ebrahim kayabwe contact center closed urgent', 'genesys cloud agent mapping', 'genesys pure cloud query get stuck agent que power go october', 'genesys cloud email stuck', 'genesys call centre line', 'genesys auto answer', 'africa multichoice africa callcabinet alert smtp', 'query stuck twitter queue', 'genesys user able logon purecloud', 'genesys purecloud interaction stuck queue routing cloud', 'genesys pure cloud query queue counting shift openning october', 'call cabinet computer updating call call cabinet', 'genesys email intercactions routing', 'interaction routing genesys', 'genesys interaction routing', 'purecloud login failure', 'purecloud login failure', 'unable extract interaction genesys', 'call answer user chanilda vilanculos', 'genesys social interaction routing', 'genesys pure cloud automatic call handling unavailable', 'purecloud login failure', 'genesys interaction routing agent', 'purecloud login failure', 'genesys dstv', 'email interaction coming opening hour', 'pure cloud issue', 'lusaka ebrahim kayabwe contact center closed urgent', 'genesys pure cloud query get stuck agent que power go october', 'genesys cloud email stuck']

I process the list with the following code:

for i in range(len(vdescription_tokens)):
    if i < len(vdescription_tokens):
        if (vdescription_tokens[i].find('login') > -1):
            login_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('logon') > -1:
            login_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('spam') > -1:
            spam_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('unable answer call') > -1:
            call_qlty_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('static') > -1:
            call_qlty_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('silent') > -1:
            call_qlty_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('report') > -1:
            report_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('service level') > -1:
            report_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('tool') > -1:
            tools_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('email') > -1:
            email_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('stuck') > -1:
            routing_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('routing') > -1:
            routing_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('edge device') > -1:
            hw_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('facebook') > -1:
            social_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('twitter') > -1:
            social_counter+=1
            vdescription_tokens.pop(i)
        elif vdescription_tokens[i].find('social') > -1:
            social_counter+=1
            vdescription_tokens.pop(i)
        else:
            other_counter+=1

call_reason_dict.update({'Hardwre':hw_counter,'Interaction Routing':routing_counter,'Social Media':social_counter,'Email':email_counter,'Spam':spam_counter,'Passwords':login_counter,'Call Quality':call_qlty_counter,'Tools':tools_counter,'Reports':report_counter,'Other':other_counter})

As you can see, I remove an item from the list when a match is found. I do this so that I can see what is left and add more keywords to search for.

However, when the code has finished executing, the following list remains:

Output:
vdescription_tokens = ['digital agent unable send new email', 'call genesys tool', 'help lockdown version', 'social interaction routing iw ison sdh', 'call center agent line issue', 'genesys concern', 'genesys cloud gotv email query routing', 'call comming', 'africa multichoice africa callcabinet alert smtp', 'genesys workspace error call destination invalid', 'genesys user able logon purecloud', 'genesys purecloud interaction routing cloud', 'genesys pure cloud query queue counting shift openning october', 'genesys concern', 'genesys genesysout service', 'randburg pure cloud interaction search', 'parameter', 'gotv', 'genesys service', 'purecloud login failure', 'eb primary gen server', 'purecloud login failure', 'unable extract interaction genesys', 'ivr audio upload', 'genesys silent call affecting inbound', 'purecloud login failure', 'genesys dstv', 'pure cloud issue', 'lusaka ebrahim kayabwe contact center closed urgent', 'genesys cloud agent mapping', 'genesys cloud email stuck', 'genesys call centre line', 'genesys auto answer', 'africa multichoice africa callcabinet alert smtp', 'genesys user able logon purecloud', 'genesys pure cloud query queue counting shift openning october', 'call cabinet computer updating call call cabinet', 'interaction routing genesys', 'purecloud login failure', 'unable extract interaction genesys', 'call answer user chanilda vilanculos', 'genesys pure cloud automatic call handling unavailable', 'genesys interaction routing agent', 'genesys dstv', 'pure cloud issue', 'lusaka ebrahim kayabwe contact center closed urgent', 'genesys cloud email stuck']

This remainder of the list still contains keywords I was searching for. So it looks like the list is not properly processed.

Can anyone point out my mistake for me?

Larz60+ wrote Nov-07-2020, 01:05 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.

I added for you this time. Please use bbcode tags in future posts. Thank You

**Gribouillis** · Nov-07-2020, 04:24 PM

There are many issues with this code. The problem that you describe comes from modifying the list while iterating over it. Let us simplify things by supposing that the list is

['foo bar', 'baz qux', 'login quux', 'static corge', 'grault garply']

When i=2, 'login quux' is found and removed from the list, so every later item shifts down by one position. On the next iteration i=3, but 'static corge' now sits at index 2 instead of 3, so it is never examined. Your algorithm could work if you started from the end of the list, because popping an item then only shifts items that have already been processed:

for i in range(len(vdescription_tokens) - 1, -1, -1): ...
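To make the skipping concrete, here is a minimal sketch that runs both loop directions on the simplified list above. The function names `count_and_remove_forward` and `count_and_remove_backward` are made up for this illustration, and both work on a copy so the input list is left untouched.

```python
def count_and_remove_forward(titles, keywords):
    """Pop matches while walking forward by index (the pattern from the question)."""
    titles = list(titles)  # work on a copy
    count = 0
    for i in range(len(titles)):
        # Same guard as in the question: the list may have shrunk.
        if i < len(titles) and any(k in titles[i] for k in keywords):
            count += 1
            titles.pop(i)  # items after i shift left; the next one is skipped
    return count, titles


def count_and_remove_backward(titles, keywords):
    """Pop matches while walking from the last index down to 0."""
    titles = list(titles)
    count = 0
    for i in range(len(titles) - 1, -1, -1):
        if any(k in titles[i] for k in keywords):
            count += 1
            titles.pop(i)  # only already-visited items shift
    return count, titles


sample = ['foo bar', 'baz qux', 'login quux', 'static corge', 'grault garply']
keywords = ('login', 'static')

print(count_and_remove_forward(sample, keywords))
# (1, ['foo bar', 'baz qux', 'static corge', 'grault garply'])
print(count_and_remove_backward(sample, keywords))
# (2, ['foo bar', 'baz qux', 'grault garply'])
```

The forward version leaves 'static corge' behind even though 'static' is one of the keywords, which is the same symptom as in the leftover list from the question.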

Another obvious issue is that the code repeats itself a lot and it needs to be refactored to avoid this litany of if, elif, elif...

**Gribouillis** · Nov-07-2020, 08:41 PM

Here is an example of such a refactoring:

sieves = [
    ('login', 'login'),
    ('logon', 'login'),
    ('spam', 'spam'),
    ('unable answer call', 'call_qlty'),
    ('static', 'call_qlty'),
    ('silent', 'call_qlty'),
    ('report', 'report'),
    ('service level', 'report'),
    ('tool', 'tools'),
    ('email', 'email'),
    ('stuck', 'routing'),
    ('routing', 'routing'),
    ('edge device', 'hw'),
    ('facebook', 'social'),
    ('twitter', 'social'),
    ('social', 'social'),
    # 'other'
]

score = { counter: 0 for (_, counter) in sieves }
score['other'] = 0
rest = []
for x in vdescription_tokens:
    for keyword, counter in sieves:
        if keyword in x:
            score[counter] += 1
            break
    else:
        score['other'] += 1
        rest.append(x)
print(score)
print(rest)

Output:
{'social': 1, 'email': 8, 'login': 11, 'spam': 0, 'report': 2, 'other': 31, 'routing': 16, 'tools': 1, 'hw': 1, 'call_qlty': 3}
['help lockdown version', 'call center agent line issue', 'genesys concern', 'call comming', 'africa multichoice africa callcabinet alert smtp', 'genesys workspace error call destination invalid', 'genesys pure cloud query queue counting shift openning october', 'genesys concern', 'genesys genesysout service', 'randburg pure cloud interaction search', 'parameter', 'gotv', 'genesys service', 'eb primary gen server', 'unable extract interaction genesys', 'ivr audio upload', 'genesys dstv', 'pure cloud issue', 'lusaka ebrahim kayabwe contact center closed urgent', 'genesys cloud agent mapping', 'genesys call centre line', 'genesys auto answer', 'africa multichoice africa callcabinet alert smtp', 'genesys pure cloud query queue counting shift openning october', 'call cabinet computer updating call call cabinet', 'unable extract interaction genesys', 'call answer user chanilda vilanculos', 'genesys pure cloud automatic call handling unavailable', 'genesys dstv', 'pure cloud issue', 'lusaka ebrahim kayabwe contact center closed urgent']
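One thing worth keeping in mind with this approach: because of the `break`, each title is counted under the first keyword that matches, so the order of the `sieves` list decides where a title with several keywords ends up. The sketch below illustrates this on a few titles taken from the question. The helper name `classify` and the use of `collections.Counter` are choices made for this illustration, not part of the code above.

```python
from collections import Counter

# Same (keyword, counter) pairs and order as the sieves list above.
sieves = [
    ('login', 'login'), ('logon', 'login'), ('spam', 'spam'),
    ('unable answer call', 'call_qlty'), ('static', 'call_qlty'),
    ('silent', 'call_qlty'), ('report', 'report'),
    ('service level', 'report'), ('tool', 'tools'), ('email', 'email'),
    ('stuck', 'routing'), ('routing', 'routing'), ('edge device', 'hw'),
    ('facebook', 'social'), ('twitter', 'social'), ('social', 'social'),
]


def classify(title, sieves=sieves):
    """Return the counter name of the first matching keyword, else 'other'."""
    for keyword, counter in sieves:
        if keyword in title:
            return counter
    return 'other'


titles = [
    'purecloud login failure',
    'query stuck twitter queue',    # 'stuck' is listed before 'twitter'
    'genesys cloud email stuck',    # 'email' is listed before 'stuck'
    'unable attach pdf word doc via facebook twitter',
    'genesys concern',
]

counts = Counter(classify(t) for t in titles)
for t in titles:
    print(f'{classify(t):8} <- {t}')
```

Here 'query stuck twitter queue' lands in 'routing' rather than 'social', and 'genesys cloud email stuck' lands in 'email' rather than 'routing'. If a different priority is wanted, reordering the pairs in `sieves` is enough.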
