Python Forum
Help with python code to search string in one file & replace with line in other file
#6
OK, check this out and get back to me. I think it's what you are looking for. It replaces everything from the match up to the next '>' record.
It expects the files to be in a directory named data, which is a sub-directory of the directory you run the script from (the code uses Path('.'), i.e. the current working directory). You may want to change this.
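If you would rather have the data directory follow the script instead of the directory you launch it from, something like the following may help. This is only a sketch: resolve_data_dir is a hypothetical helper name that is not part of the posted script, and it assumes you pass it the script's own path (normally __file__).

```python
from pathlib import Path


def resolve_data_dir(script_file, subdir="data"):
    """Return <directory containing script_file>/<subdir> as an absolute path.

    Hypothetical helper, not part of the posted script.
    """
    return Path(script_file).resolve().parent / subdir


# Path('.') / 'data' is relative to the current working directory,
# so it changes depending on where the script is launched from.
cwd_based = Path('.') / 'data'

# Inside the script you could instead write (assumption: __file__ is defined):
#     self.data = resolve_data_dir(__file__)
```

With this, running the script from any directory would still look for data next to the .py file.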
You can run it from the command line with a command like:
python WhateverYouCallIt.py -i File1.txt -b File2.txt -o Fileout.txt > data/results.txt
code:
# Replace header in bodyfile with header in header file, writing output to outputfile Larz60+
#
from pathlib import Path
import argparse

class SwapHeaders:
    def __init__(self, origfile=None, headerfile=None, outfile=None):
        self.home = Path('.')
        self.data = self.home / 'data'
        self.original_file = self.data / origfile
        self.header_file = self.data / headerfile
        self.out_file = self.data / outfile

        with self.header_file.open() as fh:
            self.new_data = fh.readlines()

        self.make_new_file()

    def get_orig_rec(self):
        with self.original_file.open() as forig:
            for line in forig:
                yield line

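    def get_match(self, match_this, fo):
        # Write the first record in the header file whose '>' line contains
        # match_this: the '>' line itself plus every line up to the next '>'.
        # Note: nothing is returned, so the caller's skip flag stays falsy and
        # the original record is still written after the inserted one.
        found = False
        for line in self.new_data:
            if line.startswith('>'):
                if found:
                    break
                if match_this in line:
                    found = True
            if found:
                fo.write(line)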

    def make_new_file(self):
        with self.out_file.open('w') as fo:
            skip = False
            for line in self.get_orig_rec():
                if line.startswith('>'):
                    if skip:
                        skip = False
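                    # Key = header text without '>' and the trailing newline
                    match = line[1:].rstrip()
                    # Drop the suffix after the last '.'; rfind() returns -1
                    # (which is truthy) when there is no '.', so test explicitly
                    x = match.rfind('.')
                    if x > 0:
                        match = match[:x]
                    skip = self.get_match(match, fo)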
                if skip:
                    continue
                fo.write(line)


def debug_main():
    SwapHeaders(origfile='File1.txt', headerfile='File2.txt', outfile='Fileout.txt')

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--ifile",
                        dest='original_filename',
                        help="Filename where headers are to be replaced",
                        action="store")

    parser.add_argument("-b", "--bfile",
                        dest='replace_original_filename',
                        help="Filename containing the replacement headers",
                        action="store")

    parser.add_argument("-o", "--ofile",
                        dest='out_filename',
                        help="Output filename",
                        action="store")

    args = parser.parse_args()
    original_filename = args.original_filename

    replace_original_filename = args.replace_original_filename

    out_filename = args.out_filename

    SwapHeaders(origfile=original_filename, headerfile=replace_original_filename, outfile=out_filename)

if __name__ == '__main__':
    main()
    # debug_main()
partial results:
Output:
>OFAS009268-RA-EXON07 |design:coreoidea-v1,designer:forthman,probes-locus:OFAS009268-RA-EXON07,probes-probe:,probes-source:Clavigralla_tomentosicollis_gi_512427643_gb_GAJX01006991.1
TTCTACACAAACTGCTTTGCACTGAGCACCATTAAAATCATCTGTTGACCTTGCAAGTTCTTCAAAATTTACATCAACGCTAATATTCATTTTCCGAGAATGTATTTGCATAATTCGAGCACGGGCATCTTCATTTGGATGAGGAAATTCAATTTTTCTGTCTAGCCTGCCTGATCGGAGAAGGGCTGGATCTAATATATCAACTCTGTTAGTTGCTGCAATG
>Clavigralla_tomentosicollis_gi_512427643_gb_GAJX01006991.1_0_rc
GCTCGAATTATGCAAATACATTCTCGGAAAATGAATATTAGCGTTGATGTAAATTTTGAAGAACTTGCAAGGTCAACAGATGATTTTAATGGTGCTCAGTGCAAAGCAGTTTGTGTAGAA
>OFAS009268-RA-EXON07 |design:coreoidea-v1,designer:forthman,probes-locus:OFAS009268-RA-EXON07,probes-probe:,probes-source:Clavigralla_tomentosicollis_gi_512427643_gb_GAJX01006991.1
TTCTACACAAACTGCTTTGCACTGAGCACCATTAAAATCATCTGTTGACCTTGCAAGTTCTTCAAAATTTACATCAACGCTAATATTCATTTTCCGAGAATGTATTTGCATAATTCGAGCACGGGCATCTTCATTTGGATGAGGAAATTCAATTTTTCTGTCTAGCCTGCCTGATCGGAGAAGGGCTGGATCTAATATATCAACTCTGTTAGTTGCTGCAATG
>Clavigralla_tomentosicollis_gi_512427643_gb_GAJX01006991.1_35_rc
AAATTGAATTTCCTCATCCAAATGAAGATGCCCGTGCTCGAATTATGCAAATACATTCTCGGAAAATGAATATTAGCGTTGATGTAAATTTTGAAGAACTTGCAAGGTCAACAGATGATT
>Anasa_tristis_comp3229_c0_seq1_136_rc
TCAGCCAATCATAGTGGAACCGATTTCCAGTGGAGACGAACTCCGAACTGATATTCATGGAATGGAAACACAAATAAACACTTTAGGTTCTAATAACATTGTATGTGTTCTTTCAACAAC
>uce-3225_p7 |design:hemiptera-v1,designer:faircloth,probes-locus:uce-3225,probes-probe:7,probes-source:halhal1,probes-global-chromo:Scaffold629,probes-global-start:410155,probes-global-end:410275,probes-local-start:0,probes-local-end:120
AAATCCATCAAGAAATACCAACAACAACTTAAGGATGTCCAGACCGCACTCGAGGAAGAACAAAGAGCTAGGGATGATGCCCGAGAACAACTTGGTATTGCCGAAAGGCGAGCCAACGCT
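
If it helps to see the matching step on its own, here is a rough standalone sketch of how the lookup key is derived from a '>' line and how the matching record is picked out of the header file. The function names header_key and find_record are made up for this illustration and do not appear in the script above, and the sequence lines in the sample list are shortened placeholders, not real data.

```python
def header_key(line):
    """Derive the lookup key from a '>' line: drop '>' and trailing
    whitespace, then cut everything from the last '.' onward (if any)."""
    key = line[1:].rstrip()
    dot = key.rfind('.')          # -1 when there is no '.'
    return key[:dot] if dot > 0 else key


def find_record(key, lines):
    """Return the lines of the first record whose '>' line contains key:
    the '>' line itself plus everything up to the next '>' line."""
    record = []
    found = False
    for line in lines:
        if line.startswith('>'):
            if found:
                break             # reached the record after the match
            found = key in line
        if found:
            record.append(line)
    return record


# Shortened placeholder records (sequence lines are not real data)
header_lines = [
    ">OFAS009268-RA-EXON07 |probes-source:Clavigralla_tomentosicollis_gi_512427643_gb_GAJX01006991.1\n",
    "TTCTACACAAAC\n",
    ">uce-3225_p7 |probes-source:halhal1\n",
    "AAATCCATCAAG\n",
]

key = header_key(">Clavigralla_tomentosicollis_gi_512427643_gb_GAJX01006991.1_0_rc\n")
record = find_record(key, header_lines)
```

Here key ends up as everything before the last '.', and record holds the OFAS009268 header line and the line that follows it. A header with no '.' in it is used whole as the key.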


Messages In This Thread
RE: Help with python code to search string in one file & replace with line in other file - by Larz60+ - Dec-16-2017, 12:58 AM
