Python Forum
How to find unique and common words per line from a txt file?
#1
I have just started with Python and ran into this task:

1. Find words that exist on both even and odd lines.
2. Find words that exist only on even lines.
3. Find words that exist only on odd lines.

All punctuation and uppercase letters have been removed, so we don't need to worry about that. However, there are several words on each line.

The output should look something like this

Common words on both lines: ['I', 'the', 'am', 'all', 'as', ...]
Only even lines : ['yellow', 'christmas', 'smell', ...]
Only odd lines: ['yours', 'war', 'may', 'remote', ...]

I started like

evens, odds = set(), set()
with open('textfile.txt') as f:
    for index, row in enumerate(f):
        if index % 2 == 0:
            evens.add(row.strip())
        else:
            odds.add(row.strip())
How should I continue from here? Is this part correct?

Would love if someone could finish the task so I could use it as a template when I do other tasks.
Reply
#2
I think I made some progress: I added split() to break each line into words and sorted() to order the result.

evens, odds = set(), set()
with open('textfile.txt') as f:
    for index, row in enumerate(f):
        if index % 2 == 0:
            evens.update(row.split())
        else:
            odds.update(row.split())
commons = sorted(evens & odds)
Any tips on how I can find the words that appear only on the even lines and only on the odd lines?
Reply
#3
Look at sets; that would be the easiest way.
You can also loop over one set and check whether each word is or is not in the other one.
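A minimal sketch of both suggestions, using small made-up sets in place of the words read from the file (the sample words are assumptions for illustration only):

```python
# Hypothetical sample data standing in for the words collected from the file.
evens = {"the", "am", "yellow", "smell"}
odds = {"the", "am", "war", "remote"}

# Set operators: & is intersection, - is difference.
commons = evens & odds
unique_evens = evens - odds
unique_odds = odds - evens

# The same "only in evens" result with an explicit membership check.
unique_evens_loop = {word for word in evens if word not in odds}

print(sorted(commons))
print(sorted(unique_evens))
print(sorted(unique_odds))
```

The operator form is shorter; the loop form may be easier to follow when first learning how membership tests work.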
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#4
Thanks a lot.

Right now it is like this
evens, odds = set(), set()
with open('shakespeare.txt') as f:
    for index, row in enumerate(f):
        # index starts at 0, so an odd index is an even line when counting from 1
        if index % 2:
            evens.update(row.split())
        else:
            odds.update(row.split())

# In[2]:
commons = sorted(evens & odds)
unique_odds = odds - evens
unique_evens = evens - odds

# In[4]:
commons

# In[5]:
unique_odds

# In[6]:
unique_evens
How can I transpose my list to rows? How can I return the values when I run my query?

The output should look something like this

commons: ['I', 'the', 'am', 'all', 'as', ...]
unique_evens : ['yellow', 'christmas', 'smell', ...]
unique_odds: ['yours', 'war', 'may', 'remote', ...]
Reply
#5
print('commons: {}'.format(list(commons)))
That is if you want square brackets. If curly brackets are OK, you can skip converting the set to a list.
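One possible way to print all three results, each on its own line, is sketched here. The helper name format_result and the sample words are assumptions for illustration. sorted() always returns a list, so the output uses square brackets whether the input is a set or a list.

```python
# Hypothetical sample results; in the thread these come from the file.
commons = sorted({"the", "am"})
unique_evens = {"yellow", "smell"}
unique_odds = {"war", "remote"}


def format_result(label, words):
    """Return 'label: [...]' with the words sorted and shown as a list."""
    return "{}: {}".format(label, sorted(words))


results = [
    ("commons", commons),
    ("unique_evens", unique_evens),
    ("unique_odds", unique_odds),
]
for label, words in results:
    print(format_result(label, words))
```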

Reply
#6
Thanks a lot.
Reply
#7
You can make this a little shorter
odds, evens = eo = [set(), set()]
with open('shakespeare.txt') as f:
    for index, row in enumerate(f):
        eo[index % 2].update(row.strip().split())
Note that enumerate() starts counting at 0, whereas line numbers in most text editors and programming literature start at 1. So index 0, the first line of the file, goes into odds here.
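If 1-based line numbers are preferred, enumerate() accepts a start argument. A sketch under stated assumptions: the helper name split_words_by_parity is hypothetical, and io.StringIO with made-up text stands in for the real file so the example runs on its own.

```python
import io


def split_words_by_parity(lines):
    """Collect words from odd- and even-numbered lines, counting from 1."""
    odds, evens = set(), set()
    for line_number, row in enumerate(lines, start=1):
        target = evens if line_number % 2 == 0 else odds
        target.update(row.split())
    return odds, evens


# Made-up text standing in for the file contents.
sample = io.StringIO("i am the war\ni am yellow\nthe remote\n")
odds, evens = split_words_by_parity(sample)

print("commons:", sorted(odds & evens))
print("unique_evens:", sorted(evens - odds))
print("unique_odds:", sorted(odds - evens))
```

An open file object can be passed in place of the StringIO, since both yield one line per iteration.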
Reply

