Python Forum
Help with my code! Pls and thank you its due soon
#1
My professor gave us an all-tweets.zip with 12 files, but I couldn't import or open it correctly. It says 'not found', so I opened the files individually, but my code is still incorrect. Can you help fix my code with simple code, nothing advanced, because my professor won't accept anything we haven't learned? Can you fix it and copy and paste the whole code? I'm new and I still get confused when people give me instructions for changing my original code.

My code:

def cleanedup(s):
    alphabet='abcdefghijklmnopqrstuvwxyz@_0123456789'
    cleantext= ''
    for character in s.lower():
           if character in alphabet:
                 cleantext+= character
           else:
                 cleantext+=''
    return cleantext

counts=()
with open('amyschumer.tweets',encoding='utf=8') as lines:
    with open('aoc.tweets',encoding='utf=8') as lines:
        with open('BarackObama.tweets',encoding='utf=8') as lines:
            with open('BillGates.tweets',encoding='utf=8') as lines:
                with open('doctorow.tweets',encoding='utf=8') as lines:
                    with open('espn.tweets',encoding='utf=8') as lines:
                        with open('ID_AA_Carmack.tweets',encoding='utf=8') as lines:
                            with open('justinbieber.tweets',encoding='utf=8') as lines:
                                with open('Kaepernick7.tweets',encoding='utf=8') as lines:
                                    with open('ladygaga.tweets',encoding='utf=8') as lines:
                                        with open('nytimes.tweets',encoding='utf=8') as lines:
                                            with open('rihanna.tweets',encoding='utf=8') as lines:
                                                for line in lines:
                                                    for word in cleanedup(line).split():
                                                             if word in counts:
                                                                 counts[word]+=1
                                                             else:
                                                                 counts[word]=1

mentionedWords=[]

for word in counts:
    if word[-3:]=='@':
        mentionedWords.append([counts[word],word])

mentionedWords.sort()
print(mentionedWords[-5:])
The task:

We will write a program for finding the most frequently mentioned usernames in each of the provided files. To correctly identify mentions, we have to clean up each tweet, keeping only letters, digits, and the symbols @ and _. After each tweet is cleaned up, we have to go through its words, and if a word starts with @, it is a mention.

Step-by-step:
Modify function cleanedup so that it keeps not only letters, but also digits 0123456789 and symbols @ and _

Write a new function findMentions that takes a filename as a parameter and reports the 3 usernames most frequently mentioned in that file. The function should create a dictionary of counts for all username mentions (words starting with @). After reading through the file and accumulating the counts for all mentioned usernames, use the dictionary to create a list like this:

[[15, '@alice'], [20, '@bob'], [7, '@carol'], ... ]
Use sort to sort the above list and print out the 3 most frequently mentioned usernames.

Check each file in the current folder (using os.listdir('.')). If the file name ends with .tweets, call findMentions on the file to find its most frequent mentions.
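
The list-building and sorting step described above can be sketched roughly as follows. The counts dictionary here is made up for illustration (it is not taken from any of the provided files), and the names pairs and top3 are just placeholders. Because each inner list starts with the count, a plain sort orders the pairs from least to most mentioned, so the last three entries are the most frequent.

```python
# Hypothetical counts, only for illustration (not real data from the files).
counts = {'@alice': 15, '@bob': 20, '@carol': 7, '@dave': 2}

# Build a list of [count, username] pairs from the dictionary.
pairs = []
for word in counts:
    pairs.append([counts[word], word])

# Lists are compared element by element, so this sorts by count first.
pairs.sort()

# The last three pairs are the three largest counts.
top3 = pairs[-3:]
for pair in top3:
    print(pair[1], pair[0])
```

Printing pair[1] before pair[0] gives the "username count" order used in the example output.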

Example output
If you copy all provided .tweets files in the folder with your script, its output will look as follows:

nytimes.tweets
@caityweaver 3
@nytmag 5
@nytparenting 5

justinbieber.tweets
@applemusic 15
@theellenshow 15
@skrillex 20

aoc.tweets
@rashidatlaib 5
@ayannapressley 6
@ilhanmn 9

espn.tweets
@nba 21
@thecheckdown 29
@kingjames 32

rihanna.tweets
@rihanna 21
@savagexfenty 29
@fentybeauty 48

amyschumer.tweets
@bridgeteverett 14
@rachelfeinstein 15
@comedycentral 49

ladygaga.tweets
@ahsfx 10
@btwfoundation 11
@applemusic 13

BillGates.tweets
@theeconomist 11
@warrenbuffett 15
@melindagates 18

BarackObama.tweets
@ofa 5
@vp 5
@michelleobama 9

ID_AA_Carmack.tweets
@boztank 3
@JoeRogan 3
@elonmusk 5

Kaepernick7.tweets
@mikailsprice 26
@darthkaepernick 28
@kaepernick7 138

doctorow.tweets
@cbc 3
@doctorow 3
@sensanders 7
#2
No one is going to do the assignment for you. That's not how this forum works.

Why don't you follow the step-by-step instructions provided with the assignment? You need to write a function findMentions as described. You need to unzip the files you were provided into a single folder, and use os.listdir() to work with those files. And so on.
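
As a rough sketch of the os.listdir() part only, assuming the .zip has already been extracted into the folder the script runs from: the helper name tweet_files is made up for this example and is not part of the assignment. Note that a .zip archive cannot be read line by line with open(); the files inside it have to be extracted first.

```python
import os

def tweet_files(folder):
    """Return the names in folder that end with '.tweets', sorted."""
    names = []
    for filename in os.listdir(folder):
        if filename.endswith('.tweets'):
            names.append(filename)
    names.sort()
    return names

# Assumes the archive was already unzipped into the current folder.
for filename in tweet_files('.'):
    print(filename)
```

Each name printed here is what would be handed to findMentions, one call per file.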

Your current code reuses the variable name lines for every file you open, so by the time your loop starts on line 24 (for line in lines:) you are only working with the last file opened ('rihanna.tweets'). If you create a function as instructed, you can pass it one filename at a time and get the results you need.
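
To illustrate the name reuse, here is a small self-contained demo. The two file names and their contents are stand-ins written to a temporary folder so the demo runs anywhere, and read_lines is a hypothetical helper, not the findMentions function from the assignment.

```python
import os
import tempfile

# Two tiny stand-in files (hypothetical names) so the demo runs anywhere.
folder = tempfile.mkdtemp()
first = os.path.join(folder, 'first.tweets')
second = os.path.join(folder, 'second.tweets')
with open(first, 'w', encoding='utf-8') as f:
    f.write('from the first file\n')
with open(second, 'w', encoding='utf-8') as f:
    f.write('from the second file\n')

# Nested style: both files are bound to the same name, so the inner
# open() replaces the outer one and only the second file gets read.
seen_nested = []
with open(first, encoding='utf-8') as lines:
    with open(second, encoding='utf-8') as lines:
        for line in lines:
            seen_nested.append(line.strip())

def read_lines(filename):
    """Read one file and return its lines without surrounding whitespace."""
    result = []
    with open(filename, encoding='utf-8') as lines:
        for line in lines:
            result.append(line.strip())
    return result

# One call per file: every file is read.
seen_each = []
for filename in [first, second]:
    seen_each += read_lines(filename)

print(seen_nested)  # only the second file's line
print(seen_each)    # lines from both files
```

The same idea applies to twelve files: a function that handles one filename, called once per file, avoids the deep nesting entirely.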
#3
My code so far:
def cleanedup(s):
    alphabet='abcdefghijklmnopqrstuvwxyz@_0123456789'
    cleantext= ''
    for character in s.lower():
           if character in alphabet:
                 cleantext+= character
           else:
                 cleantext+=''
    return cleantext

counts={}
with open('all-tweets.zip') as lines:
    for line in lines:
        for word in cleanedup(line).split():
            if word in counts:
                counts[word]+=1
            else:
                counts[word]=1


import os

def contains(filename, pattern):
    with open(filename) as file:
        for line in file:
            if pattern in line:
                return True
    return False

for filename in os.listdir('.'):
    if contains(filename,'@'):
        print(filename,'contains @')


mentionedWords=[]

for word in counts:
    if word[-3:]=='@':
        mentionedWords.append([counts[word],word])

mentionedWords.sort()
print(mentionedWords[-5:])


import os

path='.'
for filename in os.listdir(path):
    print(filename)
My professor gave us an all-tweets.zip file to download. It contains 12 tweet files.
Please help me with my errors using simple code, nothing too advanced, because my professor doesn't want us to skip ahead to things he hasn't taught. Please suggest ways to fix my code and let me know where I went wrong.
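
Two details in the code above may be worth a closer look, sketched here on a made-up tweet (the text is invented for illustration). First, cleanedup adds '' for characters it drops, which also removes the spaces, so split() has nothing to split on; adding a space instead keeps the words apart. Second, word[-3:] is the last three characters of a word, so it equals '@' only when the whole word is '@'; the task asks about the first character.

```python
def cleanedup(s):
    """Keep letters, digits, @ and _; turn everything else into a space."""
    alphabet = 'abcdefghijklmnopqrstuvwxyz@_0123456789'
    cleantext = ''
    for character in s.lower():
        if character in alphabet:
            cleantext += character
        else:
            cleantext += ' '   # a space keeps neighbouring words apart
    return cleantext

# Made-up tweet text, only for illustration.
tweet = 'Thanks @Alice_1 and @bob!'

mentions = []
for word in cleanedup(tweet).split():
    # split() never returns empty strings, so word[0] is always safe here.
    if word[0] == '@':
        mentions.append(word)

print(mentions)
```

This only shows the cleanup and the mention check; counting the mentions and reading the files are separate steps.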
