You need to research what file.read() does, and what a lot of other Python commands do. An easy way to do this is to run interactive Python.
I made a file that looks like this:
Output:
one
two
three
In a terminal I ran python without specifying a .py file, which starts interactive Python. Then I typed some commands and looked at what happened.
Output:
>>> file = open("test.txt", "r")
>>> lines = file.read()
>>> print(type(lines))
<class 'str'>
>>> print(lines)
one
two
three
>>> print(repr(lines))
'one\ntwo\nthree\n'
file.read() returned the contents of the file as a single string.
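The same experiment can be run as a script instead of at the interactive prompt. This is only a sketch: it writes its own copy of test.txt into a temporary folder so it runs anywhere, and the with statement (my addition, not used in the session above) closes the file automatically.

```python
import os
import tempfile

# Create a throwaway copy of the three-line test file.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "test.txt")
with open(path, "w") as file:
    file.write("one\ntwo\nthree\n")

# read() returns the whole file as one string, newlines included.
with open(path, "r") as file:
    text = file.read()

print(type(text))
print(repr(text))
```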
I could try replacing part of the string.
Output:
>>> new_lines = lines.replace("two", "")
>>> print(new_lines)
one
three
>>> print(repr(new_lines))
'one\n\nthree\n'
That didn't work the way I wanted because I left in the newline after the word.
Output:
>>> new_lines = lines.replace("two\n", "")
>>> print(new_lines)
one
three
>>> print(repr(new_lines))
'one\nthree\n'
That worked better. I don't like it, but it did work.
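One reason to be cautious with this approach is that str.replace() matches the text wherever it appears, not just on whole lines. The sample text below is made up to show the effect; it is not the original test.txt.

```python
# Hypothetical file contents, chosen so "two" also ends another line.
text = "two\nforty-two\nthree\n"

# replace() removes every occurrence of the substring,
# so the end of "forty-two" is removed as well.
result = text.replace("two\n", "")
print(repr(result))  # 'forty-three\n'
```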
Next I am going to try file.readlines().
Output:
>>> file.close()
>>> file = open("test.txt", "r")
>>> lines = file.readlines()
>>> file.close()
>>> print(lines)
['one\n', 'two\n', 'three\n']
Now I get a list. I can remove an item from a list.
Output:
>>> new_lines = lines.remove("two")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list
The problem is I entered "two", but the item in the list is "two\n".
Output:
>>> new_lines = lines.remove("two\n")
>>> print(new_lines)
None
>>> print(lines)
['one\n', 'three\n']
What? Why None? I looked up the list.remove() function and saw that it removes the item from the list, modifying the list itself. In Python this is often referred to as "in place", meaning that a mutable object is modified by the function instead of a new object being created with different values. In-place functions almost always return None.
Notice that "two\n" is missing from lines when I print lines.
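The same in-place pattern shows up with sorting. Here is a small sketch, using a made-up list, that compares sorted(), which builds a new list, with list.sort(), which works in place and returns None.

```python
words = ["two", "one", "three"]

# sorted() leaves the original alone and returns a new list.
new_words = sorted(words)
print(new_words)  # ['one', 'three', 'two']
print(words)      # ['two', 'one', 'three']

# list.sort() modifies the list in place and returns None.
result = words.sort()
print(result)     # None
print(words)      # ['one', 'three', 'two']
```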
But you probably don't want to type in "two\n" when you want to remove "two". To fix that problem, you need to remove the newline character from each of the strings.
Output:
>>> file = open("test.txt", "r")
>>> lines = file.readlines()
>>> file.close()
>>> print(lines)
['one\n', 'two\n', 'three\n']
>>> lines = [line.strip() for line in lines]
>>> print(lines)
['one', 'two', 'three']
>>> lines.remove("two")
>>> lines
['one', 'three']
I keep forgetting that interactive Python will print a value without my having to use a print command.
To remove the newline characters from each of the "lines", I used a list comprehension. Think of it as a shorthand way of writing a for loop. These are roughly equivalent.
for index, line in enumerate(lines):
    lines[index] = line.strip()

lines = [line.strip() for line in lines]
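To check that the two versions give the same result, they can be run side by side. This sketch uses a hard-coded list in place of the file. It also shows why I said "roughly": the loop changes the existing list, while the comprehension builds a new one and leaves the original untouched.

```python
lines = ["one\n", "two\n", "three\n"]

# Version 1: a for loop that replaces each item in place.
looped = list(lines)  # work on a copy so both versions start the same
for index, line in enumerate(looped):
    looped[index] = line.strip()

# Version 2: a list comprehension that builds a new list.
stripped = [line.strip() for line in lines]

print(looped == stripped)  # True
print(lines)               # the original list still has its newlines
```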
You might also want to remove a word regardless of its case. For this example I changed test.txt so that each word is capitalized.
Output:
>>> file = open("test.txt", "r")
>>> lines = [line.strip() for line in file.readlines()]
>>> lines
['One', 'Two', 'Three']
>>> lines = [line for line in lines if line.lower() != "two"]
>>> lines
['One', 'Three']
Here I use a list comprehension again, this time with a condition. I only include lines that don't match "two". To make the comparison case-insensitive, I convert the line to lowercase before comparing.
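If you need this in more than one place, the condition could be wrapped in a small function. This is only a sketch: remove_word is a name I made up, and the sample list stands in for what file.readlines() would return. It strips each line only for the comparison, so the lines it keeps still have their newlines.

```python
def remove_word(lines, word):
    """Return a new list without the lines that match word, ignoring case."""
    target = word.strip().lower()
    return [line for line in lines if line.strip().lower() != target]


lines = ["One\n", "Two\n", "Three\n", "TWO\n"]
kept = remove_word(lines, "two")
print(kept)  # ['One\n', 'Three\n']
```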