Python Forum
How to link two python scripts
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to link two python scripts
#21
Actually a few more hours, forgot I had a doctor's preventative maintenance appointment, and just got back.
Reply
#22
Here is the final program. It Sorts and Selects in one module.

It is written so that you can run it from the command line using:
python SelsectAndSortFasta infilename outfilename min_seq_size
Or, you can import it into another program and call from within similar to:
# Add following import at top of your program
import SelectAndSortFasta

# In your initialization routine, add:
# If a class:
    self.sasf = SelectAndSortFasta.SelectAndSortFasta()
# If just a function:
    sasf = SelectAndSortFasta.SelectAndSortFasta()

# Then when you want to run a file
    sasf.sort_fasta_file(infile, outfile, minlen)
Here's the code:
from operator import itemgetter
import sys


class SelectAndSortFasta:
    """
    Sort in decreasing size order
    """
    def __init__(self):
        self.headers = []
        self.infile = None
        self.outfile = None
        self.minlen = 0

    def ExtractHeader(self):
        """
        Reads all headers, saves starting file position, header, and size in self.headers
        :return: None
        """
        seqlen = 0
        this_header = []
        firstseq = True

        f = open(self.infile, 'r')
        f.seek(0, 2)
        file_size = f.tell()
        f.seek(0)
        while True:
            fptr = f.tell()
            if fptr == file_size:
                break
            line = None
            line = f.readline()
            line = line.strip()

            # skip empty lines
            if len(line) > 0:
                # print(f'line len: {len(line)}')
                if line.startswith('>'):
                    if firstseq:
                        firstseq = False
                    else:
                        this_header.append(seqlen)
                        self.headers.append(this_header)

                    seqlen = 0
                    this_header = []
                    this_header.append(fptr)
                    this_header.append(line)
                else:
                    seqlen += len(line)
        this_header.append(seqlen)
        self.headers.append(this_header)
        f.close()
        # This is the sort routine, sorts on column 2 (size) of self.headers, in reverse order
        self.headers.sort(key=itemgetter(2), reverse=True)
        self.show_header()

    def show_header(self):
        """
        display self.headers list
        :return: None
        """
        for item in self.headers:
            print(f'File ptr: {item[0]}, Size: {item[2]}, Header: {item[1]}')

    def write_outfile(self):
        """
        Reads self.headers (which is sorted in reverse size order,
        seeks to that record in the inout file (from file pointer which i in column 0 of self.headers)
        and writes the output file
        :return:
        """
        with open(self.infile) as f, open(self.outfile, 'w') as fo:
            for item in self.headers:
                # Ignore entry if too small
                if item[2] < minlen:
                    continue
                f.seek(item[0], 0)
                fo.write(f.readline())
                while True:
                    buf = f.readline()
                    if buf.startswith('>'):
                        break
                    fo.write(buf)

    def sort_fasta_file(self, infile, outfile, minlen):
        """
        Launch pad for sort and select program
        :param infile: Name of input fasta file
        :param outfile: Name of output fasta file
        :param minlen: Minimun size of output record. (smaller inut records are ignored)
        :return: None
        """
        self.infile = infile
        self.outfile = outfile
        self.minlen = minlen
        self.ExtractHeader()
        self.write_outfile()


if __name__ == '__main__':
    # Test routine
    srtf = SelectAndSortFasta()
    numargs = len(sys.argv)
    if numargs > 1:
        infile = sys.argv[1]
        outfile = sys.argv[2]
        minlen = sys.argv[3]
    else:
        infile = 'data/fasta/AINZ01/AINZ01.1.fsa_nt'
        outfile = 'data/fasta/AINZ01/AINZ01sorted.1.fsa_nt'
        minlen = 1000

    srtf.sort_fasta_file(infile, outfile, minlen)
You may want to supress the printout on line 57 (self.show_header())
Reply
#23
Thanks !

I get a similar error than before:

Error:
File "SelectAndSortFasta.py", line 65 print(f'File ptr: {item[0]}, Size: {item[2]}, Header: {item[1]}') ^ SyntaxError: invalid syntax
I am assuming it's again because of my version of python. I'll ask the admin from the cluster I work on if they can install python 3 instead of the 2.6 I am working with at the moment, because it looks like I am going to get lots of these errors.

In the meantime, can you help me correcting the print syntax so that I can check if the script is doing what I wanted?

Thanks again!
Reply
#24
I'm sorry, I've got in the habit of using f-string because I really like it.
It was introduced in python 3.6
change to:
print('File ptr: {}, Size: {}, Header: {}'.format(item[0], item[2], item[1]))
or install python 3.6.4

Python 2.6 is so old!
maintenance for 2.7 and below is scheduled to stop in less than 2 years.
F-string won't work with anything less than 3.6, but 3.7 is coming out soon, maybe the admin can wait for that,
It's going to have some fantastic async additions.
Reply
#25
I contacter the admin, to see if I can get access to a more recent version of python, but I don't know how fast they are to respond.

I fixed this line now, so it's running.

However my output file is created, but is completely empty. I have this on the screen:

Output:
$ python SelectAndSortFasta.py 2013_1056H.contigs.fa 2013_1056HTestOutput 1000 File ptr: 0, Size: 550068, Header: >NODE_1_length_550068_cov_488.448 File ptr: 559270, Size: 190085, Header: >NODE_2_length_190085_cov_532.242 File ptr: 752558, Size: 153922, Header: >NODE_3_length_153922_cov_529.57 File ptr: 909079, Size: 114954, Header: >NODE_4_length_114954_cov_494.41 File ptr: 1025982, Size: 106564, Header: >NODE_5_length_106564_cov_477.255 File ptr: 1134357, Size: 103116, Header: >NODE_6_length_103116_cov_541.733 File ptr: 1239226, Size: 102698, Header: >NODE_7_length_102698_cov_564.458
I didn't copy and paste the whole thing, but it goes till the end of my original fasta, all the contigs including those less than 1000bp.

I am not sure what goes wrong, as I don't have an error showing.
Reply
#26
How are you running it?
I just ran it and it worked fine

Try to modify the input and output in the else part
of the code at the bottom:
if __name__ == '__main__':
    sf = SplitFasta()
    numargs = len(sys.argv)
    if numargs > 1:
        infile = sys.argv[1]
        outfile = sys.argv[2]
        minlen = sys.argv[3]
    else:
        infile = '2013_1056H.contigs.fa'
        outfile = '2013_1056HTestOutput.fa'
        minlen = 1000
    sf.split_file(infile, outfile, int(minlen))
    sf.show_seq_out()
Reply
#27
And run like:
python SelectAndSortFasta.py
without arguments. Perhaps there's an issue with command line arguments
Reply
#28
I'm going to have to call it quits soon. Getting pretty sleepy.
I'll wait a few minutes to see if you have success.
Reply
#29
Okay I've done the modifications in the infile and outfile part, and run it without arguments. It does work now.

So I think it is safe to assume the issue comes from the command line arguments.
Reply
#30
It worked like a charm with new file
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Trying to us python.exe from our network to run scripts cubangt 3 875 Aug-17-2023, 07:53 PM
Last Post: deanhystad
  Link scripts from a different folder Extra 3 1,431 May-11-2022, 08:34 PM
Last Post: snippsat
  How do I link the virtual environment of that project to the 3.9.2 version of python? Bryant11 1 1,377 Feb-26-2022, 11:15 AM
Last Post: Larz60+
  I can't open a link with Selenium in Python jao 0 1,406 Jan-30-2022, 04:21 AM
Last Post: jao
  Parsing link from html tags with Python Melcu54 0 1,616 Jun-14-2021, 09:25 AM
Last Post: Melcu54
  How to link Sublime Text 3 Build system to Python 3.9 Using Windows 10 Fanman001 2 4,616 Mar-04-2021, 03:09 PM
Last Post: martpogs
  Running python scripts from github etc pacmyc 7 3,730 Mar-03-2021, 10:26 PM
Last Post: pacmyc
  How to skip LinkedIn signup link using python script? Mangesh121 0 1,799 Aug-26-2020, 01:22 PM
Last Post: Mangesh121
  Reading SQL scripts from excel file and run it using python saravanatn 2 2,581 Aug-23-2020, 04:49 PM
Last Post: saravanatn
  No Scripts File present after python installation ag2207 5 4,925 Jul-30-2020, 11:11 AM
Last Post: buran

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020