Python Forum
extract information from a file
#1
Dear friends:

I need to extract information from an SDF file containing molecule data:

.
.
.
43 46 1 0 0 0 0
46 47 1 0 0 0 0
46 49 2 0 0 0 0
47 48 1 0 0 0 0
50 51 3 0 0 0 0
51 52 1 0 0 0 0
M END
> <Name>
4_pentynoic_acid

> <activity>
non


$$$$
CHAPS
MOE2011 3D
Structure written by Hyleos SD API
100103 0 0 1 0 0 0 0 0999 V2000
13.3220 1.0470 0.2750 S 0 0 0 0 0 0 0 0 0 0 0 0
.
.
.

I would like to obtain the following information for each molecule in a new file (result.txt):
Name: 4_pentynoic_acid, activity: non.

This SDF file contains 400 other molecules, each with its own name and activity.
Could you help me write code that does this?
Thank you very much!!

This is what I have been trying:

infile = open('Substrates.sdf', 'r')
outfile = open('result.txt', 'w')
copy = False
tmpLines = []
for line in infile:
	if line == '<Name>':
	
		copy = True
		tmpLines = []
	elif line == '$$$$':
		copy = False
		for tmpLine in tmpLines:
			outfile.write(tmpLine)
	elif copy:
		tmpLines.append(line)
Reply
#2
Is there a location where I can download a complete sample file?
Reply
#3
https://www.dropbox.com/s/zpjo590qusn4bx...s.sdf?dl=0
Reply
#4
import os


def get_data():
    # set starting directory
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    indata = []
    with open('Substrates.sdf', 'r') as fp:
        for line in fp:
            indata.append(line.strip())

    with open('result.txt', 'w') as fp_out:
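        for n, line in enumerate(indata):
            # SD data headers start with '>' and carry the field name in <...>;
            # the value sits on the line that follows the header.
            if line.startswith('>') and '<Name>' in line:
                fp_out.write(f'Name: {indata[n+1]}')
            if line.startswith('>') and '<activity>' in line:
                fp_out.write(f', activity: {indata[n+1]}\n')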

if __name__ == '__main__':
    get_data()
partial results:
Output:
(try_stuff_venv) > cat src/result.txt
Name: 4_pentynoic_acid, activity: non
Name: CHAPS, activity: non
Name: D_24851, activity: non
Name: NSC109350_got, activity: non
Name: NSC118742_got, activity: non
Name: NSC122301_got, activity: non
Name: NSC132791_got, activity: non
Name: NSC139490_got, activity: non
Name: NSC144153_got, activity: non
Name: NSC145150_got, activity: non
Name: NSC152731_got, activity: non
Name: NSC161128_got, activity: non
Name: NSC167780_got, activity: non
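A side note on the attempt in post #1: lines read from a file keep their trailing newline, and the tag line is "> <Name>" rather than "<Name>", so comparisons such as line == '<Name>' and line == '$$$$' never succeed and nothing gets written. If you want something a little stricter than scanning for the tag text, here is a minimal sketch that collects the fields record by record, treating "$$$$" as the end of each molecule. The names parse_sdf_fields and format_record are made up for this example, and it assumes each value is followed by a blank line as in the excerpt above.

```python
import re

# Matches an SD data header such as "> <Name>" and captures the field name.
TAG = re.compile(r'^>.*<([^>]+)>')


def parse_sdf_fields(text):
    """Return one {field: value} dict per record found in SDF-formatted text."""
    records = []
    fields = {}
    current = None  # field whose value lines are currently being collected
    for raw in text.splitlines():
        line = raw.strip()
        if line == '$$$$':          # record separator: store and start over
            records.append(fields)
            fields, current = {}, None
            continue
        match = TAG.match(line)
        if match:                   # a new data header begins
            current = match.group(1)
            fields[current] = ''
        elif current is not None:
            if line:                # value line belonging to the open field
                fields[current] = (fields[current] + ' ' + line).strip()
            else:                   # blank line closes the field
                current = None
    if fields:                      # last record without a trailing "$$$$"
        records.append(fields)
    return records


def format_record(fields):
    """Build the 'Name: ..., activity: ...' line requested in post #1."""
    name = fields.get('Name', '?')
    activity = fields.get('activity', '?')
    return f'Name: {name}, activity: {activity}'


# Small excerpt modelled on the data in post #1, used only for illustration.
sample = """\
 51 52 1 0 0 0 0
M  END
> <Name>
4_pentynoic_acid

> <activity>
non

$$$$
CHAPS
"""

for record in parse_sdf_fields(sample):
    print(format_record(record))
```

To run it on the real file, read Substrates.sdf into a string with open(...).read(), pass it to parse_sdf_fields, and write each formatted line plus a newline to result.txt. Molecule blocks before the first tag are skipped, so a title line that happens to contain the word "Name" is not mistaken for a tag.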
Reply
#5
Thank you very much!
It's perfect!!!
Reply

