Python Forum
Watson Personality Insight: minimum number of words
Hello dear forum members,

I currently need to run about 9,000 files through the IBM Watson Personality Insights API. To do this, my colleague and I wrote the code below. I realize it is far from perfect, but it does the job. However, for small files of fewer than 100 words we get an error from the API (also shown below). Could you please help me fix this by improving the code, as my programming skills are insufficient :(( Basically, I'd like the code to skip any file with fewer than 100 words and move on with the rest of the batch. Thank you in advance for your help :)

import re
import json
from os.path import join, dirname
from watson_developer_cloud import PersonalityInsightsV2
import csv
import sys
import glob


digits=(glob.glob('/Users/.../Desktop/Watson Analysis/2013/*.txt'))

personality_insights = PersonalityInsightsV2(
    username='......',
    password='......')


with open('/Users/.../Desktop/Watson Analysis/2013/watson_twitter_2013.csv', 'wt') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    with open(digits[0]) as \
            personality_text:
                parse=json.dumps(personality_insights.profile(
                    text=personality_text.read()), indent=2)
    newp=re.split("}|{", parse)
    v=[ x for x in newp if "id" in x and "percentage" in x ]
    d=(re.findall(r'id": (.*?),', str(v)))
    w=(re.findall(r'percentage": (.*?),', str(v)))
    len(d)
    len(w)
    print(d)
    print(w)
    spamwriter.writerow(['doc']+['word_count']+d)
    for number in digits:
            with open(str(number)) as \
                personality_text:
                    parse=json.dumps(personality_insights.profile(
                        text=personality_text.read()), indent=2)
            newp=re.split("}|{", parse)
            v=[ x for x in newp if "id" in x and "percentage" in x ]
            d=(re.findall(r'id": (.*?),', str(v)))
            w=(re.findall(r'percentage": (.*?),', str(v)))
            len(d)
            len(w)
            print(d)
            print(w)
            wordcount=(re.findall(r'word_count": (.*?),', parse))
            spamwriter.writerow([number]+wordcount+w)
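
As an aside on the parsing step: the traceback below shows that `profile()` returns `response.json()`, i.e. a plain Python dict, so the `id` and `percentage` values could probably be collected by walking that dict instead of dumping it to a string and applying regular expressions. The following is only a minimal sketch. The helper `collect_traits` and the trimmed `sample_profile` are hypothetical and made up for illustration; the real response layout is not shown in this post and may differ.

```python
def collect_traits(node, found=None):
    """Recursively collect (id, percentage) pairs from a nested dict/list.

    Mirrors the regex filter in the post: only nodes that carry both an
    "id" and a "percentage" key are kept.
    """
    if found is None:
        found = []
    if isinstance(node, dict):
        if "id" in node and "percentage" in node:
            found.append((node["id"], node["percentage"]))
        for value in node.values():
            collect_traits(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_traits(item, found)
    return found


# Hypothetical, heavily trimmed response; an assumption for illustration only.
sample_profile = {
    "word_count": 1500,
    "tree": {
        "id": "r",
        "children": [
            {"id": "Openness", "percentage": 0.9,
             "children": [{"id": "Adventurousness", "percentage": 0.7}]},
        ],
    },
}

traits = collect_traits(sample_profile)
print(traits)
```

With a dict in hand, the word count would be read with something like `profile.get("word_count")` rather than a regex, assuming that key sits at the top level of the response.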


Error:
---------------------------------------------------------------------------
WatsonApiException                        Traceback (most recent call last)
<ipython-input-3-26e015583735> in <module>()
     33             with open(str(number)) as personality_text:
     34                     parse=json.dumps(personality_insights.profile(
---> 35                         text=personality_text.read()), indent=2)
     36             newp=re.split("}|{", parse)
     37             v=[ x for x in newp if "id" in x and "percentage" in x ]

/anaconda3/lib/python3.6/site-packages/watson_developer_cloud/personality_insights_v2.py in profile(self, text, content_type, accept, language, csv_headers)
     59         response = self.request(
     60             method='POST', url='/v2/profile', data=text, params=params,
---> 61             headers=headers)
     62         if accept == 'application/json':
     63             return response.json()

/anaconda3/lib/python3.6/site-packages/watson_developer_cloud/watson_service.py in request(self, method, url, accept_json, headers, params, json, data, files, **kwargs)
    446             error_info = self._get_error_info(response)
    447             raise WatsonApiException(response.status_code, error_message,
--> 448                                      info=error_info, httpResponse=response)

WatsonApiException: Error: The number of words 94 is less than the minimum number of words required for analysis: 100, Code: 400 , X-dp-watson-tran-id: gateway01-2094016729 , X-global-transaction-id: 7ecac92c5ae406a47cd028d9
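
Since the failure arrives as an ordinary Python exception, one way to let the batch continue might be to wrap the per-file call in `try`/`except` and move on when it raises. The sketch below is an assumption-laden illustration, not tested against the real service: `profile_or_skip`, `profile_fn` and `error_type` are hypothetical names, and `profile_fn` stands in for a call such as `personality_insights.profile(text=...)`.

```python
def profile_or_skip(path, profile_fn, error_type=Exception):
    """Return profile_fn(text) for the file at `path`, or None if it raises.

    profile_fn  -- any callable taking the file's text (hypothetical stand-in
                   for the Watson call in the post)
    error_type  -- exception class to treat as "skip this file"; defaults to
                   Exception, which is broad, so narrowing it is advisable
    """
    with open(path) as handle:
        text = handle.read()
    try:
        return profile_fn(text)
    except error_type as err:
        # Report the file and carry on with the rest of the batch.
        print("Skipping {}: {}".format(path, err))
        return None
```

In the loop, a `None` result would mean "write no row for this file". Passing the `WatsonApiException` class named in the traceback as `error_type` would keep unrelated errors (a missing file, say) from being silently swallowed; how that class is imported depends on the installed package version.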


We then changed the code to skip any file with fewer than 100 words (see below); however, it still gives the same error.

import re
import json
from os.path import join, dirname
from watson_developer_cloud import PersonalityInsightsV2
import csv
import sys
import glob


digits=(glob.glob('/Users/.../Desktop/Test/*.txt'))

personality_insights = PersonalityInsightsV2(
    username='...',
    password='...')


with open('/Users/.../Desktop/Test/Test.csv', 'wt') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    with open(digits[0]) as \
            personality_text:
                parse=json.dumps(personality_insights.profile(
                    text=personality_text.read()), indent=2)
    newp=re.split("}|{", parse)
    v=[ x for x in newp if "id" in x and "percentage" in x ]
    d=(re.findall(r'id": (.*?),', str(v)))
    w=(re.findall(r'percentage": (.*?),', str(v)))
    len(d)
    len(w)
    print(d)
    print(w)
    spamwriter.writerow(['doc']+['word_count']+d)
    for number in digits:
            with open(str(number)) as \
                personality_text:
                    f = open(str(number),"r")
                    string = f.read()
                    s=string.split(" ")
                    if len(s)<100:
                            pass
                    else:
                            parse=json.dumps(personality_insights.profile(
                                text=personality_text.read()), indent=2)
            newp=re.split("}|{", parse)
            v=[ x for x in newp if "id" in x and "percentage" in x ]
            d=(re.findall(r'id": (.*?),', str(v)))
            w=(re.findall(r'percentage": (.*?),', str(v)))
            len(d)
            len(w)
            print(d)
            print(w)
            wordcount=(re.findall(r'word_count": (.*?),', parse))
            spamwriter.writerow([number]+wordcount+w)
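
A few things in this second attempt may explain why the error persists, though none of this is confirmed: the call on `digits[0]` before the loop is not guarded at all; `string.split(" ")` splits on single spaces only, so newlines are not treated as separators and runs of spaces produce empty strings, which can push the count away from what the service reports; and after `pass` the code still falls through to the parsing lines with the previous file's `parse`. A minimal sketch of a pre-check, assuming whitespace-separated tokens are a reasonable approximation of the service's word count (`MIN_WORDS`, `has_enough_words` and `iter_profilable` are hypothetical names):

```python
MIN_WORDS = 100  # minimum quoted in the API error message


def has_enough_words(text, minimum=MIN_WORDS):
    """True if `text` has at least `minimum` whitespace-separated tokens.

    str.split() with no argument splits on any run of whitespace
    (spaces, tabs, newlines) and never yields empty strings.
    """
    return len(text.split()) >= minimum


def iter_profilable(paths, minimum=MIN_WORDS):
    """Yield (path, text) only for files long enough to send to the API."""
    for path in paths:
        with open(path) as handle:
            text = handle.read()
        if not has_enough_words(text, minimum):
            print("Skipping {} (fewer than {} words)".format(path, minimum))
            continue  # go straight to the next file; nothing is written
        yield path, text
```

The loop would then become `for number, text in iter_profilable(digits):`, with `text` passed to `profile()` directly so each file is read only once. Because the service may count words differently from `split()`, a file near the threshold could still be rejected.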