Python Forum
Help with python code to search string in one file & replace with line in other file
#11
I'm not going to get to this tonight, but will pick it up first thing in the AM.
I'll post the code if you want to play with it before then.
It's now replacing all of the headers, and not the data, but it's reusing the same header lines from file2.
I think I know why, but need to spend some time with my wife.
I'll pick it up later tonight or in the A.M. (EST)

I named the program MyParse.py. You can call it whatever you like; just change the run command to match, which is:
python MyParse.py -i 'File1.txt' -b 'File2.txt' -o 'Fileout.txt'
Here's the code:
# Replace headers in the original file with headers from the header file, writing output to the output file
# Larz60+
from pathlib import Path
import argparse

class SwapHeaders:
    def __init__(self, origfile=None, headerfile=None, outfile=None):
        self.home = Path('.')
        self.data = self.home / 'data'
        self.original_file = self.data / origfile
        self.header_file = self.data / headerfile
        self.out_file = self.data / outfile

        with self.header_file.open() as fh:
            self.header_data = fh.readlines()

        self.orig = self.original_file.open()

    def close_files(self):
        # The output file is opened and closed inside make_new_file()
        self.orig.close()

    def get_replacement_header(self, match):
        retrec = None
        for line in self.header_data:
            if not line.startswith('>'):
                continue
            if match in line:
                retrec = line
                break
        return retrec

    def read_orig_record(self):
        """
        Read the original file one line at a time.
        :return: generator yielding each line until end of file
        """
        while True:
            data = self.orig.readline()
            if not data:
                break
            yield data

    def make_new_file(self):
        with self.out_file.open('w') as fo:
            for orig in self.read_orig_record():
                if orig.startswith('>'):
                    match = orig[1:].strip()
                    x = match.rfind('.')
                    if x > 0:  # rfind returns -1 when there is no '.'
                        match = match[:x]
                    new = self.get_replacement_header(match)
                    if new is not None:
                        fo.write(new)
                    else:
                        fo.write(orig)
                else:
                    fo.write(orig)

def main():
    # Typical command line call python MyParse.py -i 'File1.txt' -b 'File2.txt' -o 'Fileout.txt'
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--ifile",
                        dest='original_filename',
                        help="Filename where headers are to be replaced",
                        action="store")

    parser.add_argument("-b", "--bfile",
                        dest='replace_original_filename',
                        help="Filename containing replacement headers",
                        action="store")

    parser.add_argument("-o", "--ofile",
                        dest='out_filename',
                        help="Output filename",
                        action="store")

    args = parser.parse_args()
    original_filename = args.original_filename

    replace_original_filename = args.replace_original_filename

    out_filename = args.out_filename

    sh = SwapHeaders(origfile=original_filename, headerfile=replace_original_filename, outfile=out_filename)
    sh.make_new_file()
    sh.close_files()

if __name__ == '__main__':
    main()
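One detail in the matching step is easy to trip over: str.rfind returns -1 when the character is not found, and -1 is truthy, so a bare `if x:` does not mean "a dot was found". Here is a minimal sketch of the key extraction on its own. The function name header_key and the sample headers are made up for illustration; the real files may look different.

```python
def header_key(header_line):
    """Return the text after '>' with any trailing '.suffix' removed."""
    key = header_line[1:].strip()
    dot = key.rfind('.')
    # rfind returns -1 (a truthy value) when '.' is absent,
    # so compare explicitly instead of writing `if dot:`
    if dot > 0:
        key = key[:dot]
    return key


# Hypothetical header lines, with and without a '.suffix'
print(header_key('>seq1.2\n'))
print(header_key('>seq1\n'))
```

Both calls should give the same key, so a header with a version suffix and one without can be matched against the same line in the header file.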
Reply
#12
I actually believe this is working properly now.
I renamed the script to match the class name: SwapHeaders.py.
Please try it and see if the results are correct.
You can call it from another file like:
import SwapHeaders

def testit(file1name, file2name, outfilename):
    sh = SwapHeaders.SwapHeaders(origfile=file1name, headerfile=file2name, outfile=outfilename)
    sh.make_new_file()
    sh.close_files()

if __name__ == '__main__':
    testit(file1name='File1.txt', file2name='File2.txt', outfilename='Newfile.txt')
Remember, the files need to be in the data directory (a sub-directory of the program directory), and make backups first.
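If the files can't be found, it may help to check the data directory before running. Note that the class builds its paths from Path('.'), which is the current working directory, so this only lines up with the program directory when you run the script from there. A rough sketch of such a check follows; data_dir_for and missing_inputs are hypothetical helper names, not part of the posted script.

```python
from pathlib import Path


def data_dir_for(script_path):
    """Return the 'data' directory that sits next to the given script."""
    return Path(script_path).resolve().parent / 'data'


def missing_inputs(data_dir, filenames):
    """Return the names from filenames that are not regular files in data_dir."""
    return [name for name in filenames
            if not (Path(data_dir) / name).is_file()]
```

For example, missing_inputs(data_dir_for('MyParse.py'), ['File1.txt', 'File2.txt']) returns an empty list when both input files are in place.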
Reply
#13
(Dec-18-2017, 01:09 AM)Larz60+ Wrote: I actually believe this is working properly now.

I'm getting this error:

Quote:File "./SwapHeaders.py", line 6, in testit
sh = SwapHeaders.SwapHeaders(origfile=file1name, headerfile=file2name, outfile=outfilename)
TypeError: 'module' object is not callable

I put the files in a subdirectory that I called 'files'. File names match what is given on line 11 of your code.

In case you need my python version info:

Quote:Owners-MacBook-Air:test2 Forthman$ python
Python 2.7.10 (default, Oct 23 2015, 18:05:06)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
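For reference, "'module' object is not callable" generally means the name being called refers to a module rather than to the class inside it. The traceback alone doesn't show how that happened here, so the following is only an illustration of the message, using a stand-in module object built with types.ModuleType instead of the real files.

```python
import types

# Hypothetical stand-in for a module named SwapHeaders that holds a class of the same name
fake_module = types.ModuleType('SwapHeaders')


class SwapHeaders(object):
    pass


fake_module.SwapHeaders = SwapHeaders

try:
    fake_module()                      # calling the module itself raises TypeError
except TypeError as err:
    message = str(err)
    print(message)

instance = fake_module.SwapHeaders()   # calling the class inside the module works
```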
Reply
#14
So that Python knows where the import is located, you need an __init__.py file in the directory above the source directory,
and an empty one in the source directory itself.

Just to check: you did name it (case sensitive) SwapHeaders.py, right?
The source directory should look like:
src/
    __init__.py
    SwapHeaders.py
    TestAsImport.py
The programs should be named accordingly
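A quick way to check whether Python can actually find a module from a given directory is importlib.util.find_spec. This is a small sketch for Python 3.4 or later; can_import is a made-up helper name, and the directory you pass in is whatever your source directory is.

```python
import importlib
import importlib.util
import sys


def can_import(module_name, source_dir):
    """Return True if module_name can be found once source_dir is on sys.path."""
    sys.path.insert(0, str(source_dir))
    try:
        importlib.invalidate_caches()  # pick up files created after startup
        return importlib.util.find_spec(module_name) is not None
    finally:
        sys.path.remove(str(source_dir))
```

Calling can_import('SwapHeaders', 'src') should return True if src/SwapHeaders.py exists and is spelled with exactly that case.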

If you wish, I'll move the entire thing to github, then you can download all in one step
Reply
#15
It's on GitHub now. If you have git installed, you can clone the repository to get the file structure, etc.

If you don't have git:
  • navigate to: https://github.com/Larz60p/SwapHeaders
  • Click on clone or download
  • Click on download zip, saving in new directory
  • Extract the files, making sure your extraction program is set to retain the directory structure
If you want to install git, see: https://git-scm.com/book/en/v2/Getting-S...alling-Git

That should set everything up, including the data directory with the two test files.
Reply
#16
This seems to lead me to a different issue now, regarding pathlib. I was getting:
Quote:ImportError: No module named pathlib

Then I tried using pip to install pathlib and got:

Quote:pip install pathlib
Collecting pathlib
Downloading pathlib-1.0.1.tar.gz (49kB)
100% |████████████████████████████████| 51kB 1.2MB/s
Installing collected packages: pathlib
Running setup.py install for pathlib ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/tc/s83_bztx34783jsqlt75bb6r0000gn/T/pip-build-F8mpfz/pathlib/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /var/folders/tc/s83_bztx34783jsqlt75bb6r0000gn/T/pip-BNm0zc-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib
copying pathlib.py -> build/lib
running install_lib
copying build/lib/pathlib.py -> /Library/Python/2.7/site-packages
error: [Errno 13] Permission denied: '/Library/Python/2.7/site-packages/pathlib.py'

----------------------------------------
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/tc/s83_bztx34783jsqlt75bb6r0000gn/T/pip-build-F8mpfz/pathlib/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /var/folders/tc/s83_bztx34783jsqlt75bb6r0000gn/T/pip-BNm0zc-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/tc/s83_bztx34783jsqlt75bb6r0000gn/T/pip-build-F8mpfz/pathlib/

I can't seem to win on this lol
Reply
#17
What version of Python are you running?
pathlib is part of the standard library in Python 3 (3.4 and later; 3.6.3 is the latest version).

Trying to install pathlib separately will cause problems.
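To see whether the interpreter you are running ships pathlib, a version check is enough, since pathlib joined the standard library in Python 3.4. A minimal sketch (pathlib_in_stdlib is a made-up name):

```python
import sys


def pathlib_in_stdlib(version_info=sys.version_info):
    """pathlib has been part of the standard library since Python 3.4."""
    return tuple(version_info[:2]) >= (3, 4)


print(pathlib_in_stdlib())
```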
Reply
#18
Python 2.7.10
Reply
#19
Sorry, this won't work with antique Python, and neither do I.
Can you upgrade?
Reply
#20
Unfortunately, I need to keep this python version for software that hasn't yet accommodated newer versions of python.
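For anyone stuck on 2.7, the same idea can be written without pathlib, using only os.path and io.open, which exist in both Python 2.7 and 3. This is a rough sketch, not the code from the GitHub repository. The function names, the UTF-8 encoding, and the "first header line containing the key wins" rule are my assumptions based on the script posted earlier in the thread.

```python
import io
import os


def header_key(header_line):
    """Text after '>' with any trailing '.suffix' removed."""
    key = header_line[1:].strip()
    dot = key.rfind('.')
    if dot > 0:  # rfind returns -1 when there is no '.'
        key = key[:dot]
    return key


def swap_headers(orig_path, header_path, out_path):
    """Copy orig_path to out_path, replacing each '>' header line with the
    first '>' line from header_path that contains the same key."""
    with io.open(header_path, encoding='utf-8') as fh:
        headers = [line for line in fh if line.startswith('>')]

    with io.open(orig_path, encoding='utf-8') as src:
        with io.open(out_path, 'w', encoding='utf-8') as dst:
            for line in src:
                if line.startswith('>'):
                    key = header_key(line)
                    # Keep the original header when nothing matches
                    line = next((h for h in headers if key in h), line)
                dst.write(line)
```

It could be called as swap_headers(os.path.join('data', 'File1.txt'), os.path.join('data', 'File2.txt'), os.path.join('data', 'Fileout.txt')). Because the match is a plain substring test, a short key can also match a longer, unrelated header, so check the output against a backup.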
Reply

