https://github.com/the-machine-preacher/...iles.ipynb
In Cell 8:
with open('Examples/how_many_lines.txt') as lines_doc:
    for line in lines_doc.readlines():
        print(line)
It should be:
with open('Examples/how_many_lines.txt') as lines_doc:
    for line in lines_doc:
        print(line)
The example from Cell 8 opens the file and then iterates over all lines with readlines(). This creates a list object in memory that holds the complete content of the file.
The correction is simply not to use readlines(). Iterating over the file object yields one line at a time without loading the whole content into memory.
The first version cannot process a file that is bigger than your RAM + swap.
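To make the difference visible, here is a minimal sketch. The helper name count_lines and the temporary demo file are only assumptions for illustration; they are not part of the notebook.

```python
import os
import tempfile


def count_lines(path):
    """Count lines by iterating over the file object, one line at a time."""
    with open(path) as fd:
        # The generator expression never holds more than one line in memory.
        return sum(1 for _ in fd)


# Hypothetical demo file, created only for this example.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    demo_path = tmp.name

with open(demo_path) as fd:
    print(type(fd.readlines()))  # <class 'list'>: the whole file is in memory

print(count_lines(demo_path))  # 3
os.remove(demo_path)
```

For a three-line file it does not matter, but the same count_lines also works on a file of many gigabytes.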
Another cool trick is the use of itertools.islice.
If you want to print the first 10 lines:
from itertools import islice

def head(file, lines):
    with open(file) as fd:
        for line in islice(fd, 0, lines):
            print(line.strip())

head(r'C:\Windows\system.ini', 10)
Tail has to be implemented differently:
from collections import deque

def tail(file, lines):
    result = deque(maxlen=lines)
    with open(file) as fd:
        for line in fd:
            result.append(line.strip())
    for line in result:
        print(line)

tail(r'C:\Windows\system.ini', 10)
This is not optimal, because it has to read the whole file content.
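One way to avoid reading everything is to open the file in binary mode, seek to the end and read blocks backwards until enough newlines have been collected. The following is only a sketch under a few assumptions: the function name tail_from_end and the block_size parameter are made up for this example, and the file is assumed to be UTF-8 encoded (undecodable bytes are replaced).

```python
import os


def tail_from_end(path, lines, block_size=4096):
    """Return the last `lines` lines of `path`, reading blocks from the end."""
    if lines <= 0:
        return []
    with open(path, "rb") as fd:
        fd.seek(0, os.SEEK_END)
        position = fd.tell()
        data = b""
        # Collect more than `lines` newlines, so that the first line we keep
        # is complete and not cut off at a block boundary.
        while position > 0 and data.count(b"\n") <= lines:
            step = min(block_size, position)
            position -= step
            fd.seek(position)
            data = fd.read(step) + data
    # Assumption: UTF-8 text. Decoding happens per complete line.
    return [raw.decode("utf-8", errors="replace")
            for raw in data.splitlines()[-lines:]]
```

It returns a list instead of printing, so usage would look like: for line in tail_from_end(r'C:\Windows\system.ini', 10): print(line). For a huge log file this only touches the last few blocks, while the deque version still has to walk through every line.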