Python Forum
Help with python code to search string in one file & replace with line in other file
#21
Yes, that is unfortunate.
Are you aware of the fact that you can install more than one version of python?
#22
I'll take a quick look at making this work in python 2.7. No promises, and I can't spend much more time on it.
#23
OK, re-do everything in post 15.
This will work in python 2.7, but you have to run from the command line like:
c:\Python27\python.exe SwapHeaders.py -i 'File1.txt' -b 'File2.txt' -o 'Fileout.txt'
replacing the python command to point to your python 2.7 directory
and changing file names as appropriate.

Running it with the -h flag:
c:\Python27\python.exe SwapHeaders.py -h
prints the usage summary as an aid:
Output:
usage: SwapHeaders.py [-h] [-i ORIGINAL_FILENAME]
                      [-b REPLACE_ORIGINAL_FILENAME] [-o OUT_FILENAME]

optional arguments:
  -h, --help            show this help message and exit
  -i ORIGINAL_FILENAME, --ifile ORIGINAL_FILENAME
                        Filename where headers are to be replaced
  -b REPLACE_ORIGINAL_FILENAME, --bfile REPLACE_ORIGINAL_FILENAME
                        Filename containing body
  -o OUT_FILENAME, --ofile OUT_FILENAME
                        Output filename
#24
I was getting some "No such directory" errors and figured out how to modify the code so it would work (I'm on macOS). Code below. Running it replaces only some of the targeted headers; specifically, it seems to replace only those containing Clavigralla and Anoplocnemis. I think I see why and will play with the script some more.

#!/usr/bin/env python

# Replace headers in the original file with headers from the header file, writing output to the output file
# Larz60+
# from pathlib import Path
import os
import sys
import argparse


class SwapHeaders:
    def __init__(self, origfile=None, headerfile=None, outfile=None):
        # Note Modern pathlib objects removed because they won't work in
        # outdated python 2.7
        # self.home = Path('.')
        # self.data = self.home / 'data'
        # self.original_file = self.data / origfile
        # self.header_file = self.data / headerfile
        # self.out_file = self.data / outfi

        # with self.header_file.open() as fh:
        #     self.header_data = fh.readlines()

        # self.orig = self.original_file.open()
        # self.fo = self.out_file.open('w')

        self.home = os.getcwd()
        self.data = self.home + '/data/'
        self.original_file = self.data + origfile
        self.header_file = self.data + headerfile
        self.out_file = self.data + outfile

        with open(self.header_file, 'r') as fh:
            self.header_data = fh.readlines()

        self.orig = open(self.original_file, 'r')
        self.fo = None

    def close_files(self):
        self.orig.close()

    def get_replacement_header(self, match):
        retrec = None
        for line in self.header_data:
            if not line.startswith('>'):
                continue
            if match in line:
                retrec = line
                break
        return retrec

    def read_orig_record(self):
        """
        Generator over the original file.
        :return: yields one line at a time until end of file
        """
        while True:
            data = self.orig.readline()
            if not data:
                break
            yield data

    def make_new_file(self):
        # with self.out_file.open('w') as fo:
        with open(self.out_file, 'w') as fo:
            for orig in self.read_orig_record():
                match = None
                if orig.startswith('>'):
                    match = orig[1:].rstrip()
                    x = match.rfind('.')
                    if x > 0:  # rfind returns -1 (truthy) when '.' is absent
                        match = match[:x]
                    new = self.get_replacement_header(match)
                    if new is not None:
                        fo.write(new)
                    else:
                        fo.write(orig)
                else:
                    fo.write(orig)


def main():
    # Typical command line call python SwapHeaders.py -i 'File1.txt' -b 'File2.txt' -o 'Fileout.txt'
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--ifile",
                        dest='original_filename',
                        help="Filename where headers are to be replaced",
                        action="store")

    parser.add_argument("-b", "--bfile",
                        dest='replace_original_filename',
                        help="Filename containing body",
                        action="store")

    parser.add_argument("-o", "--ofile",
                        dest='out_filename',
                        help="Output filename",
                        action="store")

    args = parser.parse_args()
    original_filename = args.original_filename

    replace_original_filename = args.replace_original_filename

    out_filename = args.out_filename

    sh = SwapHeaders(origfile=original_filename, headerfile=replace_original_filename, outfile=out_filename)
    sh.make_new_file()
    sh.close_files()


if __name__ == '__main__':
    main()

If I change line 70 'x = match.rfind('.')' to 'x = match.rfind('seq1')', that certainly will select the other targeted headers, but it will include headers that have, e.g., 'seq1_A_' and 'seq1_B_' which I do not want to include. Is there a way to get the match.rfind search term to exclude these instances or to just include seq1_[some numerical digits]?
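One possible way to do this, as a rough sketch only: str.rfind takes a literal substring, so it cannot express "seq1_ followed by digits", but the re module can. The helper name header_key and the sample headers in the comments are hypothetical, since the real header layout in file1.txt is not shown in this thread. Note also that re.search finds the first occurrence, whereas rfind finds the last; that only matters if 'seq1' can appear more than once in a header.

```python
import re

# Assumption: a targeted header contains 'seq1_' immediately followed by
# at least one digit. 'seq1_A_' and 'seq1_B_' do not match this pattern.
SEQ_PATTERN = re.compile(r'seq1_\d+')


def header_key(header_line):
    """Return the text before 'seq1_<digits>' in a '>' header line,
    mirroring match[:x] in the script, or None if the pattern is absent."""
    text = header_line.lstrip('>').rstrip()
    found = SEQ_PATTERN.search(text)
    if found is None:
        return None
    return text[:found.start()]


# Hypothetical headers, for illustration only:
# header_key('>Genus_species_seq1_42\n')   returns 'Genus_species_'
# header_key('>Genus_species_seq1_A_42\n') returns None
```

In make_new_file, the rfind lines could then be swapped for a call to header_key(orig), writing the original header unchanged whenever it returns None. This works the same way in Python 2.7 and 3.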
#25
I've got to get some sleep for a few hours, please during that time, isolate the items that aren't being replaced.
Thanks.
#26
(Dec-19-2017, 03:51 PM)Larz60+ Wrote: I've got to get some sleep for a few hours, please during that time, isolate the items that aren't being replaced.
Thanks.

I just modified the file1.txt to insert a '.' after 'seq1', which now gets picked up by the script. I appreciate all of your help.
#27
Great!