Python Forum
extract information from a file
#1
Dear friends:

I need to extract information from an SDF file containing molecule data:

.
.
.
43 46 1 0 0 0 0
46 47 1 0 0 0 0
46 49 2 0 0 0 0
47 48 1 0 0 0 0
50 51 3 0 0 0 0
51 52 1 0 0 0 0
M END
> <Name>
4_pentynoic_acid

> <activity>
non


$$$$
CHAPS
MOE2011 3D
Structure written by Hyleos SD API
100103 0 0 1 0 0 0 0 0999 V2000
13.3220 1.0470 0.2750 S 0 0 0 0 0 0 0 0 0 0 0 0
.
.
.

For each molecule, I would like to write the following information to a new file (result.txt):
Name: 4_pentynoic_acid, activity: non.

This SDF file contains 400 other molecules, each with its own name and activity.
Could you help me find a way to write this code?
Thank you very much!!

This is what I have been trying:

infile = open('Substrates.sdf', 'r')
outfile = open('result.txt', 'w')
copy = False
tmpLines = []
for line in infile:
	if line == '<Name>':
	
		copy = True
		tmpLines = []
	elif line == '$$$$':
		copy = False
		for tmpLine in tmpLines:
			outfile.write(tmpLine)
	elif copy:
		tmpLines.append(line)
#2
Is there a location where I can download a complete sample file?
#3
https://www.dropbox.com/s/zpjo590qusn4bx...s.sdf?dl=0
#4
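import os


def get_data():
    # Work from the directory that holds this script, so the relative
    # file names below resolve no matter where the script is launched.
    os.chdir(os.path.abspath(os.path.dirname(__file__)))

    # Read the whole file, stripping the newline and padding from each line.
    indata = []
    with open('Substrates.sdf', 'r') as fp:
        for line in fp:
            indata.append(line.strip())

    with open('result.txt', 'w') as fp_out:
        for n, line in enumerate(indata):
            # The value of a tag sits on the line right after the tag line.
            if 'Name' in line:
                fp_out.write(f'Name: {indata[n+1]}')
            if 'activity' in line:
                fp_out.write(f', activity: {indata[n+1]}\n')


if __name__ == '__main__':
    get_data()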
Partial results:
Output:
(try_stuff_venv) > cat src/result.txt
Name: 4_pentynoic_acid, activity: non
Name: CHAPS, activity: non
Name: D_24851, activity: non
Name: NSC109350_got, activity: non
Name: NSC118742_got, activity: non
Name: NSC122301_got, activity: non
Name: NSC132791_got, activity: non
Name: NSC139490_got, activity: non
Name: NSC144153_got, activity: non
Name: NSC145150_got, activity: non
Name: NSC152731_got, activity: non
Name: NSC161128_got, activity: non
Name: NSC167780_got, activity: non
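
A note on why the attempt in post #1 wrote nothing: lines read from a file keep their trailing newline, and the tag line in the file is `> <Name>`, not `<Name>`, so `line == '<Name>'` is never true. The same goes for `line == '$$$$'`. If you want something stricter than a substring test like `'Name' in line`, here is a minimal sketch. It assumes each record ends with a `$$$$` line and each tag line looks like `> <Tag>` with the value on the following line, as in the excerpt above. The names `parse_sdf_fields` and `write_summary` are my own, not from any library.

```python
import io
import re

# Matches data-header lines such as "> <Name>" or ">  <activity>" (assumed layout).
TAG_RE = re.compile(r'^>\s*<([^>]+)>')


def parse_sdf_fields(lines):
    """Yield one {tag: value} dict per record; a record ends at '$$$$'."""
    record = {}
    tag = None
    for raw in lines:
        line = raw.strip()            # drop the trailing newline before comparing
        if line == '$$$$':            # end of the current molecule
            if record:
                yield record
            record, tag = {}, None
            continue
        match = TAG_RE.match(line)
        if match:
            tag = match.group(1)      # the value is expected on the next line
        elif tag is not None:
            if line:
                record[tag] = line
            tag = None
    if record:                        # last record without a closing '$$$$'
        yield record


def write_summary(lines, out):
    """Write one 'Name: ..., activity: ...' line per record to out."""
    for rec in parse_sdf_fields(lines):
        name = rec.get('Name', '?')
        activity = rec.get('activity', '?')
        out.write(f'Name: {name}, activity: {activity}\n')


if __name__ == '__main__':
    # Small in-memory demo so the sketch runs without the real file.
    sample = 'M END\n> <Name>\n4_pentynoic_acid\n\n> <activity>\nnon\n\n\n$$$$\n'
    buffer = io.StringIO()
    write_summary(io.StringIO(sample), buffer)
    print(buffer.getvalue(), end='')
```

For the real file you would pass open file objects instead of the in-memory ones, for example `with open('Substrates.sdf') as fp, open('result.txt', 'w') as out: write_summary(fp, out)`. Because it reads line by line, it does not need to hold the whole file in memory, and a molecule with a missing tag gets a `?` rather than shifting the columns.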
#5
Thank you very much!
It's perfect!!!