Python Forum

Full Version: [solved] how to delete the 10 first lines of an ascii file
Hi,

It's a general (funny?) question: how can we delete the first 10 lines of an ASCII file and save it, using a minimum of memory (readlines is excluded accordingly)?

I was looking at vi/vim, but I'm not sure it can be used in console mode (my attempts failed). Any suggestions?

My code:
import os, subprocess, sys

Path = os.getcwd()
AsciiFile = 'Ascii.txt'

DeleteLines = subprocess.Popen(':1,10d\nwq\n', 
                               shell = True, 
                               stdin = None,
                               stdout = subprocess.PIPE)

VimRun = subprocess.Popen(['vi ', Path + '/' + AsciiFile], 
                          stdin = DeleteLines.stdout)
Error:
Vim: Warning: Output is not to a terminal Vim: Warning: Input is not from a terminal ... ;mVim: Error reading input, exiting...\nVim: Finished ...
Thanks

Paul
Here is some untested code, using only Python tools:
from collections import deque
import itertools as itt
import os
import shutil
def remove_nlines(filename, nlines):
    backup = filename + '.bak'
    # move original file
    os.rename(filename, backup)
    with open(backup) as src, open(filename, 'w') as dst:
        # consume nlines lines in source file
        deque(itt.islice(src, None, nlines), maxlen=0)
        # copy the rest of the file
        shutil.copyfileobj(src, dst)
    # remove backup file
    os.remove(backup)
Hi Gribouilli,

Thanks for your hint. I just found a way using vi/vim:

import os, subprocess, sys
 
Path = os.getcwd()
AsciiFile = 'Ascii.txt'
 
vimRun = subprocess.Popen('vi ' + Path + '/' + AsciiFile + ' +1,10d -c wq ', shell = True, stdin = None, stdout = None)
# +1,10d => equivalent to ":1,10d"
# -c <command> => run <command> (here "wq") once the file is loaded
# stdin=None and stdout=None are the defaults: vi inherits the parent's streams
Paul
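The same vi invocation can also be passed as an argument list, which avoids shell=True and any quoting trouble when the path contains spaces. This is only a minimal sketch, assuming vi/vim is on PATH; the helper names vi_delete_command and delete_lines_with_vi are hypothetical, and only the "+1,10d" and "-c wq" arguments come from the post above.

```python
import subprocess
from pathlib import Path


def vi_delete_command(filename, first=1, last=10):
    """Build the argument list for: vi +<first>,<last>d -c wq <filename>."""
    return ["vi", f"+{first},{last}d", "-c", "wq", str(Path(filename))]


def delete_lines_with_vi(filename, first=1, last=10):
    """Run the command; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(vi_delete_command(filename, first, last), check=True)
```

Calling delete_lines_with_vi("Ascii.txt") would run the same command as the one-liner above, with each argument passed to vi unmodified by a shell.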
from pathlib import Path
from itertools import islice


def delete_first_lines(input_file, strip_lines=10):
    source = Path(input_file)
    # adding .new to the full file name, keeping all existing suffixes
    # (with_suffix would only replace the last suffix and duplicate the others)
    target = source.with_name(source.name + ".new")

    # open source in binary read mode (no hassle with decoding errors)
    # open target in binary write mode.
    with source.open("rb") as fd_in, target.open("wb") as fd_out:
        for line in islice(fd_in, strip_lines, None):
            fd_out.write(line)

    # will delete the original file
    source.unlink()

    # rename the new file to the old filename (without .new)
    target.rename(source)


# example with test.f1.txt
delete_first_lines("test.f1.txt")
Tested on Windows. On Linux, it needs one line fewer. On Windows, you can't overwrite an existing file by renaming another file onto it, so the original file must be deleted first.
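The platform difference can be sidestepped with Path.replace (a wrapper around os.replace), which overwrites an existing destination on both Windows and Linux, so no separate unlink step is needed. A minimal sketch along the lines of the function above; the name delete_first_lines_replace and the ".new" temporary suffix are illustrative choices, not part of the original post.

```python
from itertools import islice
from pathlib import Path


def delete_first_lines_replace(input_file, strip_lines=10):
    source = Path(input_file)
    # temporary file next to the original, e.g. "test.f1.txt.new"
    target = source.with_name(source.name + ".new")

    # binary mode: lines are copied byte for byte, no decoding involved
    with source.open("rb") as fd_in, target.open("wb") as fd_out:
        fd_out.writelines(islice(fd_in, strip_lines, None))

    # os.replace semantics: overwrites the destination if it already exists
    target.replace(source)
```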
So I rewrote my code from here.
Instead of showing the first and last lines of your choice, it now deletes those lines.

Example:
λ python del_lines.py contry.txt --head 5
Removed the first 5 lines and the last 0 lines from <contry.txt>

λ python del_lines.py contry.txt --tail 5
Removed the first 0 lines and the last 5 lines from <contry.txt>
Using together:
G:\div_code\reader_env\delete_lines
λ python del_lines.py contry.txt --head 2 --tail 7
Removed the first 2 lines and the last 7 lines from <contry.txt>
It's a CLI application using Typer.
# del_lines.py
import typer
from collections import deque

app = typer.Typer()

@app.command()
def headtail(filename: str, head: int = 0, tail: int = 0):
    try:
        with open(filename, 'r', encoding='utf-8', errors='ignore') as file:
            lines = file.readlines()
        # Calculate the remaining lines after removing head and tail
        remaining_lines = lines[head:len(lines) - tail]
        # Write the remaining lines back to the file
        with open(filename, 'w', encoding='utf-8', errors='ignore') as file:
            file.writelines(remaining_lines)
        typer.echo(f"Removed the first {head} lines and the last {tail} lines from <{filename}>")
    except FileNotFoundError:
        typer.echo(f"Error: The file '{filename}' does not exist.", err=True)
    except Exception as e:
        typer.echo(f"An error occurred: {e}", err=True)

if __name__ == "__main__":
    app()
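Since the original question asked to avoid readlines, the same head/tail trimming can also be done as a stream. The sketch below is one possible approach, not part of the script above: it skips the first head lines with islice and holds back the most recent tail lines in a deque, so only about tail lines are in memory at any time. The function name trim_lines and the ".tmp" suffix are hypothetical.

```python
import os
from collections import deque
from itertools import islice


def trim_lines(filename, head=0, tail=0):
    tmp = filename + ".tmp"
    with open(filename, "rb") as src, open(tmp, "wb") as dst:
        body = islice(src, head, None)  # drop the first `head` lines
        if tail <= 0:
            dst.writelines(body)
        else:
            held = deque()  # the most recent lines, not yet written
            for line in body:
                held.append(line)
                if len(held) > tail:
                    # the oldest held line cannot be among the last `tail`
                    dst.write(held.popleft())
            # whatever is still held is the tail and is discarded
    os.replace(tmp, filename)
```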
All seems awfully complicated!

You could try it like this: skip 10 lines, then write the rest to another file.

path2text = 'temp/brown_fox.txt'
savepath = 'temp/short_brown_fox.txt'

# skip the first 10 lines, copy the rest
# 'w' (not 'a') so that running the script twice does not append a second copy
with open(path2text, 'r') as infile, open(savepath, 'w') as outfile:
    for count, line in enumerate(infile, start=1):
        if count >= 11:
            outfile.write(line)
Even if using vi is not a 100% Pythonic way, it remains the simplest one in my mind, and it is fast (no loop). It can be inserted directly into my Python code without using a bash script or another tool.

Everyone has used cp or mv in their code at some point; I've also used other common basic commands (grep, sed, vi, wc, etc.) to deal with huge files ;)
(Aug-07-2024, 06:51 PM) Pedroski55 wrote: All seems awfully complicated!
I made every effort to avoid writing a for loop in my solution, for performance. That's why obvious solutions are not necessarily the best ones. Simple code does not mean efficient code.

That said, we'd need measurements to compare the performance of the various solutions.
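One way to get such measurements is timeit from the standard library. The sketch below is only an illustration: the two functions are simplified stand-ins for the loop-free and the for-loop approaches in this thread, the sample file size is arbitrary, and no timings are claimed here because they depend on the machine and the file.

```python
import os
import shutil
import tempfile
import timeit
from itertools import islice


def skip_with_copyfileobj(src_path, dst_path, nlines=10):
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for _ in islice(src, nlines):  # consume the first nlines lines
            pass
        shutil.copyfileobj(src, dst)   # bulk-copy the remainder


def skip_with_loop(src_path, dst_path, nlines=10):
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for count, line in enumerate(src, start=1):
            if count > nlines:
                dst.write(line)


def compare(total_lines=200_000, repeat=3):
    """Return the best wall-clock time in seconds for each approach."""
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "sample.txt")
    dst = os.path.join(workdir, "out.txt")
    with open(src, "wb") as f:
        f.writelines(b"line %d\n" % i for i in range(total_lines))
    results = {}
    for func in (skip_with_copyfileobj, skip_with_loop):
        results[func.__name__] = min(
            timeit.repeat(lambda: func(src, dst), number=1, repeat=repeat)
        )
    shutil.rmtree(workdir)
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes; for a fair comparison the sample file should be comparable in size to the real data.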

Using an external Linux command is probably the most efficient on large files because these tools are highly optimized (but it is less portable than a pure Python script using the standard library).
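As an illustration of the external-command route, tail -n +11 prints a file starting at line 11, which amounts to dropping the first 10 lines. This is a minimal sketch, assuming a POSIX tail is on PATH (so it will not run on a stock Windows install); the function name drop_head_with_tail and the ".new" suffix are hypothetical.

```python
import os
import subprocess


def drop_head_with_tail(filename, nlines=10):
    tmp = filename + ".new"
    with open(tmp, "wb") as out:
        # tail -n +K outputs from line K onward, so K = nlines + 1
        subprocess.run(
            ["tail", "-n", f"+{nlines + 1}", filename],
            stdout=out,
            check=True,
        )
    # swap the trimmed copy in place of the original
    os.replace(tmp, filename)
```

The output goes to a temporary file first because redirecting tail straight onto its own input file would truncate the input before it is read.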