Python Forum
HELP: String indices must be integers
#1
Good evening. I have recently been studying chatbots that use the Multinomial Naive Bayes approach, and I ran into this problem in my code when collecting data from a JSON file. What am I missing?

[
    {
        "main_category": "News & Events", 
        "question": "Why did the U.S Invade Iraq ?",
        "answer": "A small group of politicians believed strongly that the fact that Saddam Hussien remained in power after the first Gulf War was a signal of weakness to the rest of the world, one that invited attacks and terrorism. Shortly after taking power with George Bush in 2000 and after the attack on 9/11, they were able to use the terrorist attacks to justify war with Iraq on this basis and exaggerated threats of the development of weapons of mass destruction. The military strength of the U.S. and the brutality of Saddam's regime led them to imagine that the military and political victory would be relatively easy."
    }, 
    {
        "main_category": "Education & Reference", 
        "question": "How to get rid of a beehive?",  
        "answer": "Call an area apiarist.  They should be able to help you and would most likely remove them at no charge in exchange for the hive.  The bees have value and they now belong to you."
    }
]
import nltk
from nltk.corpus import stopwords
from nltk.stem.lancaster import LancasterStemmer
import json

stemmer = LancasterStemmer()

intents = json.loads(open('data/intents.json', 'r').read())

training_data = []

for k, row in enumerate(intents):
    training_data.append(row['main_category'])
    training_data.append(row['question'])

# capture unique stemmed words in the training corpus
corpus_words = {}
class_words = {}

classes = list(set([a['main_category'] for a in training_data]))

for c in classes:
    class_words[c] = []
    
for data in training_data:
    # tokenize each sentence into words
    for word in nltk.word_tokenize(data['question']):
        # ignore a few things
        if word not in ["?", "'s"]:
            # stem and lowercase each word
            stemmed_word = stemmer.stem(word.lower())
            if stemmed_word not in corpus_words:
                corpus_words[stemmed_word] = 1
            else:
                corpus_words[stemmed_word] += 1
                
            class_words[data['question']].extend([stemmed_word])

# we now have each word and the number of occurrences of the word in our training corpus (the word's commonality)
print ("Corpus words and counts: %s" % corpus_words)
# also we have all words in each class
print ("Class words: %s" % class_words)
Error:
Traceback (most recent call last):
  File "main.py", line 22, in <module>
    classes = list(set([a['main_category'] for a in training_data]))
  File "main.py", line 22, in <listcomp>
    classes = list(set([a['main_category'] for a in training_data]))
TypeError: string indices must be integers
#2
a['main_category']
Here a is indexed as if it were a dictionary, but the error says that a is actually a string, and strings can only be indexed by integers.
Either a should be a dictionary, or the index should be an integer rather than 'main_category'.
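
A minimal sketch of the difference, in case it helps. The variable names row and text are made up for illustration and are not from the original script; the exact wording of the error message can vary slightly between Python versions.

```python
# A dictionary can be indexed by a string key.
row = {"main_category": "News & Events", "question": "Why did the U.S Invade Iraq ?"}
print(row["main_category"])

# A string can only be indexed by an integer position (or a slice).
text = row["main_category"]
print(text[0])

# Indexing a string with a string key raises the error from the traceback.
try:
    text["main_category"]
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```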
#3
Sorry sir, but I still don't get it. How do I change that line of code?
#4
training_data is an iterable of values, and those values are currently strings. Are they meant to be strings?
If they are, you can't index them with a['main_category']; the index would have to be an integer, for example a[1] for the second character of the string.
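
One possible fix, sketched under the assumption that each element of training_data is meant to be a whole record from the JSON file. The raw string below is a stand-in for the contents of data/intents.json, with the answer fields left out to keep it short. The loop appends the row itself instead of two of its string values, so a['main_category'] has a dictionary to index.

```python
import json

# Stand-in for the text read from data/intents.json (answer fields omitted).
raw = """[
    {"main_category": "News & Events", "question": "Why did the U.S Invade Iraq ?"},
    {"main_category": "Education & Reference", "question": "How to get rid of a beehive?"}
]"""

# json.loads turns the JSON array of objects into a list of dictionaries.
intents = json.loads(raw)

training_data = []
for row in intents:
    # Keep the whole dictionary, not row['main_category'] and row['question'] separately.
    training_data.append(row)

# Every element is a dict now, so indexing by key works.
classes = sorted({a["main_category"] for a in training_data})
print(classes)
```

Since intents is already a list of dictionaries, training_data = intents would have the same effect as the loop.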
#5
training_data is supposed to hold strings, because I'm using it to fetch the data from the JSON file. I wonder why that isn't working, yet it works if I do this instead:

training_data.append({"main_category":"News & Events", "question":"Why did the U.S Invade Iraq ?", "answer":"A small group of politicians believed strongly that the fact that Saddam Hussien remained in power after the first Gulf War was a signal of weakness to the rest of the world, one that invited attacks and terrorism. Shortly after taking power with George Bush in 2000 and after the attack on 9/11, they were able to use the terrorist attacks to justify war with Iraq on this basis and exaggerated threats of the development of weapons of mass destruction. The military strength of the U.S. and the brutality of Saddam's regime led them to imagine that the military and political victory would be relatively easy."})
#6
Because now it contains a dictionary, which can be indexed as a['main_category'].
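
Once training_data holds dictionaries, the counting loop from the first post can run as well. Below is a rough sketch of that loop with two assumptions labeled in the comments: plain str.split() and lower() stand in for nltk.word_tokenize and the Lancaster stemmer so the example runs without NLTK, and the words are collected under data['main_category'], because class_words is keyed by category and indexing it with data['question'] would raise a KeyError.

```python
# Two records shaped like the ones in the JSON file (answer fields omitted).
training_data = [
    {"main_category": "News & Events", "question": "Why did the U.S Invade Iraq ?"},
    {"main_category": "Education & Reference", "question": "How to get rid of a beehive?"},
]

corpus_words = {}
# One empty word list per category.
class_words = {data["main_category"]: [] for data in training_data}

for data in training_data:
    # Assumption: whitespace splitting replaces nltk.word_tokenize here.
    for token in data["question"].lower().split():
        # Assumption: stripping "?" replaces the original ignore list and stemming.
        word = token.strip("?")
        if not word:
            continue
        corpus_words[word] = corpus_words.get(word, 0) + 1
        # Key by category, since class_words was built from the categories.
        class_words[data["main_category"]].append(word)

print("Corpus words and counts: %s" % corpus_words)
print("Class words: %s" % class_words)
```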
#7
So, I will use that part of the code instead of the data from the JSON file.