Hi,
It's a general (funny?) question: how can we delete the first 10 lines of an ASCII file and save it, using a minimum of memory (readlines is therefore excluded)?
I was looking at vi/vim, but I'm not sure it can be used in console mode (my attempts failed). Any suggestions?
My code:
import os, subprocess, sys
Path = os.getcwd()
AsciiFile = 'Ascii.txt'
DeleteLines = subprocess.Popen(':1,10d\nwq\n',
                               shell = True,
                               stdin = None,
                               stdout = subprocess.PIPE)
VimRun = subprocess.Popen(['vi ', Path + '/' + AsciiFile],
                          stdin = DeleteLines.stdout)
Error:
Vim: Warning: Output is not to a terminal
Vim: Warning: Input is not from a terminal
...
;mVim: Error reading input, exiting...\nVim: Finished
...
Thanks
Paul
Here is some untested code, using only Python tools:
from collections import deque
import itertools as itt
import os
import shutil
def remove_nlines(filename, nlines):
    backup = filename + '.bak'
    # move original file
    os.rename(filename, backup)
    with open(backup) as src, open(filename, 'w') as dst:
        # consume nlines lines in source file
        deque(itt.islice(src, None, nlines), maxlen=0)
        # copy the rest of the file
        shutil.copyfileobj(src, dst)
    # remove backup file
    os.remove(backup)
Hi Gribouilli,
Thanks for your hint. I just found a way using vi/vim:
import os, subprocess, sys
Path = os.getcwd()
AsciiFile = 'Ascii.txt'
vimRun = subprocess.Popen('vi ' + Path + '/' + AsciiFile + ' +1,10d -c wq ', shell = True, stdin = None, stdout = None)
# +1,10d => equivalent to ":1,10d"
# "-c" <command> => the "c" of command to be executed
# 'stdout = None' avoid echoes
Paul
from pathlib import Path
from itertools import islice
def delete_first_lines(input_file, strip_lines=10):
    source = Path(input_file)
    # append .new to the full file name, keeping all existing suffixes
    target = source.with_name(source.name + ".new")
    # open source in binary read mode (no hassle with decoding errors)
    # open target in binary write mode.
    with source.open("rb") as fd_in, target.open("wb") as fd_out:
        for line in islice(fd_in, strip_lines, None):
            fd_out.write(line)
    # will delete the original file
    source.unlink()
    # rename the new file to the old filename (without .new)
    target.rename(source)

# example with test.f1.txt
delete_first_lines("test.f1.txt")
Tested on Windows. On Linux, it needs one line of code fewer: on Windows you can't overwrite an existing file by renaming another file onto it, so the original file must be deleted first.
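For what it's worth, the delete-then-rename step can probably be avoided on both platforms, since os.replace overwrites an existing destination on Windows as well as on Linux. Below is a minimal sketch of that idea, not a drop-in replacement for the code above: the function name drop_first_lines and the ".new" temporary name are my own assumptions, and it discards the first lines before handing the rest to shutil.copyfileobj.

```python
import os
import shutil
from itertools import islice


def drop_first_lines(path, n=10):
    """Remove the first n lines of the file at path (sketch)."""
    tmp_path = f"{path}.new"  # hypothetical temporary name, next to the source
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        # Read and discard the first n lines, one line at a time.
        for _ in islice(src, n):
            pass
        # Copy the remainder in fixed-size chunks, so memory use stays small.
        shutil.copyfileobj(src, dst)
    # os.replace overwrites an existing destination on Windows and POSIX alike.
    os.replace(tmp_path, path)
```

If the file has fewer than n lines, the result is simply an empty file.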
So I rewrote my code from here. Instead of showing the first and last lines of your choice, it now deletes those lines.
Example:
λ python del_lines.py contry.txt --head 5
Removed the first 5 lines and the last 0 lines from <contry.txt>
λ python del_lines.py contry.txt --tail 5
Removed the first 0 lines and the last 5 lines from <contry.txt>
Using together:
G:\div_code\reader_env\delete_lines
λ python del_lines.py contry.txt --head 2 --tail 7
Removed the first 2 lines and the last 7 lines from <contry.txt>
It's a CLI application using Typer.
# del_lines.py
import typer
from collections import deque
app = typer.Typer()
@app.command()
def headtail(filename: str, head: int = 0, tail: int = 0):
    try:
        with open(filename, 'r', encoding='utf-8', errors='ignore') as file:
            lines = file.readlines()
        # Calculate the remaining lines after removing head and tail
        remaining_lines = lines[head:len(lines) - tail]
        # Write the remaining lines back to the file
        with open(filename, 'w', encoding='utf-8', errors='ignore') as file:
            file.writelines(remaining_lines)
        typer.echo(f"Removed the first {head} lines and the last {tail} lines from <{filename}>")
    except FileNotFoundError:
        typer.echo(f"Error: The file '{filename}' does not exist.", err=True)
    except Exception as e:
        typer.echo(f"An error occurred: {e}", err=True)

if __name__ == "__main__":
    app()
![[Image: 4yYTMP.png]](https://imagizer.imageshack.com/v2/xq70/923/4yYTMP.png)
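Since the original question ruled out readlines, here is a possible streaming variant of the same head/tail idea. It is only a sketch under my own assumptions: the name strip_head_tail and the ".new" temporary file are hypothetical, and the Typer wrapper is left out. It keeps at most `tail` lines in memory by holding them back in a deque until it is clear they are not among the last `tail` lines.

```python
import os
from collections import deque
from itertools import islice


def strip_head_tail(filename, head=0, tail=0):
    """Drop the first `head` and last `tail` lines without loading the file."""
    tmp_name = f"{filename}.new"  # hypothetical temporary name
    pending = deque()  # lines held back because they might be in the tail
    with open(filename, "rb") as src, open(tmp_name, "wb") as dst:
        # islice skips the first `head` lines lazily.
        for line in islice(src, head, None):
            pending.append(line)
            # Once more than `tail` lines are waiting, the oldest one
            # cannot be part of the tail, so it is safe to write it.
            if len(pending) > tail:
                dst.write(pending.popleft())
    # Whatever is still in `pending` is the tail and is discarded.
    os.replace(tmp_name, filename)
```

Calling strip_head_tail("contry.txt", head=2, tail=7) would correspond to the `--head 2 --tail 7` run shown above.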
All seems awfully complicated!
You could try like this: skip 10 lines then write to another file
path2text = 'temp/brown_fox.txt'
savepath = 'temp/short_brown_fox.txt'
# skip the first 10 lines
with open(path2text, 'r') as infile, open(savepath, 'w') as outfile:
    count = 0
    for line in infile:
        count += 1
        # lines 1-10 are skipped; everything from line 11 on is copied
        if count >= 11:
            outfile.write(line)
Even if using vi is not a 100% pythonic way, it remains the simplest way in my mind, and it is fast (no loop). It can be inserted directly in my Python code without using a bash script or another tool.
Everyone has used cp or mv in their code at some point; I've also used other common basic commands (grep, sed, vi, wc, etc.) to deal with huge files.

(Aug-07-2024, 06:51 PM) Pedroski55 wrote: All seems awfully complicated!
I made every effort to avoid writing a for loop in my solution, for performance. That's why obvious solutions are not necessarily the best ones. Simple code does not mean efficient code.
That said, we'd need measurements to compare the performance of the various solutions.
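As a starting point for such a comparison, here is a rough timing sketch using timeit from the standard library. Everything in it is an assumption on my side: the function names, the synthetic sample file and its size. It only contrasts a plain line loop with the shutil.copyfileobj approach; results will depend on the machine, the disk and the real file, so I give no numbers.

```python
import os
import shutil
import tempfile
import timeit
from itertools import islice


def copy_with_loop(src_path, dst_path, n):
    # Skip n lines, then write the rest line by line.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in islice(src, n, None):
            dst.write(line)


def copy_with_copyfileobj(src_path, dst_path, n):
    # Skip n lines, then copy the rest in fixed-size chunks.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for _ in islice(src, n):
            pass
        shutil.copyfileobj(src, dst)


def benchmark(n_lines=200_000, skip=10, repeat=3):
    """Return the best wall-clock time in seconds for each variant."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "sample.txt")
        dst_path = os.path.join(tmp, "out.txt")
        # Build a synthetic sample file (hypothetical content).
        with open(src_path, "wb") as f:
            f.writelines(b"line %d of the sample file\n" % i
                         for i in range(n_lines))
        results = {}
        for func in (copy_with_loop, copy_with_copyfileobj):
            timings = timeit.repeat(
                lambda: func(src_path, dst_path, skip),
                number=1, repeat=repeat)
            results[func.__name__] = min(timings)
        return results


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```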
Using an external Linux command is probably the most efficient on large files because these tools are highly optimized (but it is less portable than a pure Python script using the standard library).
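To illustrate the external-command route without opening an editor, one option is `tail -n +K`, which prints a file starting at line K. The sketch below is untested on huge files and rests on assumptions: a POSIX `tail` is available on the PATH, and the function name and ".new" temporary file are my own. As noted, it is not portable to systems without that command.

```python
import os
import subprocess


def drop_first_lines_with_tail(path, n=10):
    """Remove the first n lines of path by delegating the copy to tail."""
    tmp_path = f"{path}.new"  # hypothetical temporary name
    with open(tmp_path, "wb") as dst:
        # "tail -n +11 file" outputs everything from line 11 onward;
        # check=True raises CalledProcessError if tail fails.
        subprocess.run(["tail", "-n", f"+{n + 1}", path],
                       stdout=dst, check=True)
    # Swap the shortened copy in place of the original.
    os.replace(tmp_path, path)
```

Passing the arguments as a list avoids shell=True, so file names with spaces need no quoting.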