##### [Solved] Reading every nth line into a column from txt file
Laplace12 (Jun-28-2021, 09:47 AM; last modified Jun-29-2021, 09:18 AM)

Hey! I have a text file that I want to sort out. I've coded this and tried a DataFrame, but that only prints the last line. The code I have now is this, producing the txt file:

```
# Note: `found` and `deviations()` are not defined in this excerpt
with open(output) as file, open(out, 'w') as file_out:
    for line in file:
        if '2101' in line and found:
            a = line.split()
            print(a[1], file=file_out)
        elif 'Lifetimes' in line and found:
            b = line.split()
            print(b[3], b[4], b[5], file=file_out)
        elif 'Std deviations' in line and found:
            # print(c[3:6])
            c = line
            print(deviations(c), file=file_out)
        elif 'Intensities' in line and found:
            d = line.split()
            print(d[3], d[4], d[5], file=file_out)
        elif 'Time-zero' in line and found:
            e = line.split()
            print(e[4], file=file_out)
        else:
            found = True

# This is what I tried so far
with open(out) as a:
    cpt = 0
    for line in a:
        cpt += 1
        if cpt == 8:
            print(line)
            cpt = 0
```

The 'out' file is like this:

```
Number
Value1
Deviation
Value2
Deviation
Value3
Deviation
Number
Value1
Deviation
...
```

So basically the file is now a list I want to sort so that Lifetimes are all in one column, Value1 in the next, then Deviation, Value2, its deviation, and so on. I want every 8th value in the same column, and I'm guessing this could somehow be done by creating a loop that prints, skips 7 values and prints the next, so that the start number could be changed from 1-7.

I need to save the results in another file, so perhaps it'd be easier to code the columns in the 'out' file already without creating so many files, but for now it's enough to get the data sorted properly, so even the simplest code to produce columns from the txt file works!

snippsat (Jun-28-2021, 10:54 AM)

What does the original file look like? There is probably a better way, but I cannot advise anything without a sample of the original file.

Laplace12 (Jun-28-2021, 12:18 PM), in reply to snippsat:

Hey, it looks like this:

```
#0
0.4000 0.1250 2.0446
['Fixed', 'Fixed', '0.0339']
69.2721 9.6726 21.0553
['1.0359', '0.8128', '0.4063']
41.5603
['0.0588', ' ', ' ']
#1
0.4000 0.1250 2.0714
['Fixed', 'Fixed', '0.0344']
70.0338 9.0952 20.8710
['1.0308', '0.8135', '0.4009']
41.5853
['0.0593', ' ', ' ']
#2
0.4000 0.1250 2.0568
['Fixed', 'Fixed', '0.0333']
69.5963 8.7445 21.6592
['1.0411', '0.8177', '0.4072']
41.5541
['0.0603', ' ', ' ']
#3
0.4000 0.1250 2.0321
['Fixed', 'Fixed', '0.0329']
...
```

With 490 lines in total.

snippsat (Jun-28-2021, 01:20 PM)

What output do you want from this? I cannot see why you look for Lifetimes, Std deviations, etc. in this. Are you creating this file yourself?

When you put a Python list such as `['Fixed', 'Fixed', '0.0339']` in a text file, it loses all its meaning: it is just text and has to be parsed back. You could do something else instead, such as writing out the values themselves (e.g. the CSV way) rather than saving a list to a text file. If you have no control over the text file, then you have to parse it into what you want.

Laplace12 (Jun-28-2021, 06:47 PM), in reply to snippsat:

Alright, I must've explained this quite badly, let me try again! The first part of the code (picking Lifetimes etc.) is just sorting out a file (called output) that looks like this:

```
CA50_40_ref_data2101_E04_spec0-70 #0
Lifetimes (ns) : 0.4000 0.1250 2.0446
Std deviations : Fixed Fixed 0.0339
Intensities (%) : 69.2721 9.6726 21.0553
Std deviations : 1.0359 0.8128 0.4063
Time-zero Channel number : 41.5603
Std deviations : 0.0588
CA50_40_ref_data2101_E04_spec0-70 #1
Lifetimes (ns) : 0.4000 0.1250 2.0714
Std deviations : Fixed Fixed 0.0344
Intensities (%) : 70.0338 9.0952 20.8710
Std deviations : 1.0308 0.8135 0.4009
Time-zero Channel number : 41.5853
Std deviations : 0.0593
CA50_40_ref_data2101_E04_spec0-70 #2
Lifetimes (ns) : 0.4000 0.1250 2.0568
Std deviations : Fixed Fixed 0.0333
Intensities (%) : 69.5963 8.7445 21.6592
Std deviations : 1.0411 0.8177 0.4072
Time-zero Channel number : 41.5541
Std deviations : 0.0603
CA50_40_ref_data2101_E04_spec0-70 #3
Lifetimes (ns) : 0.4000 0.1250 2.0321
Std deviations : Fixed Fixed 0.0329
Intensities (%) : 70.4228 8.0614 21.5158
Std deviations : 1.0497 0.8219 0.4105
Time-zero Channel number : 41.4507
Std deviations : 0.0604
CA50_40_ref_data2101_E04_spec0-70 #4
Lifetimes (ns) : 0.4000 0.1250 2.0513
Std deviations : Fixed Fixed 0.0331
Intensities (%) : 67.2025 11.0731 21.7244
Std deviations : 1.0204 0.7976 0.4057
Time-zero Channel number : 41.6253
Std deviations : 0.0579
CA50_40_ref_data2101_E04_spec0-70 #5
...
```

into this (file called out):

```
#0
0.4000 0.1250 2.0446
['Fixed', 'Fixed', '0.0339']
69.2721 9.6726 21.0553
['1.0359', '0.8128', '0.4063']
41.5603
['0.0588', ' ', ' ']
#1
0.4000 0.1250 2.0714
['Fixed', 'Fixed', '0.0344']
70.0338 9.0952 20.8710
['1.0308', '0.8135', '0.4009']
41.5853
['0.0593', ' ', ' ']
#2
0.4000 0.1250 2.0568
['Fixed', 'Fixed', '0.0333']
69.5963 8.7445 21.6592
['1.0411', '0.8177', '0.4072']
41.5541
['0.0603', ' ', ' ']
#3
0.4000 0.1250 2.0321
['Fixed', 'Fixed', '0.0329']
70.4228 8.0614 21.5158
['1.0497', '0.8219', '0.4105']
41.4507
['0.0604', ' ', ' ']
#4
0.4000 0.1250 2.0513
['Fixed', 'Fixed', '0.0331']
67.2025 11.0731 21.7244
['1.0204', '0.7976', '0.4057']
```

So the first loop was needed to extract the necessary information from the first file, and now I am trying to get the 'out' file above in this form for easier comparison:

```
Dataset Lifetimes Std deviations Intensities Std deviations Time-zero Std deviation
#0 0.4000 0.1250 2.0446 ['Fixed', 'Fixed', '0.0339'] 69.2721 9.6726 21.0553 ['1.0359', '0.8128', '0.4063'] 41.5603 ['0.0588', ' ', ' ']
#1 0.4000 0.1250 2.0714 ['Fixed', 'Fixed', '0.0344'] 70.0338 9.0952 20.8710 ['1.0308', '0.8135', '0.4009'] 41.5853 ['0.0593', ' ', ' ']
...
```

So basically I'm just trying to sort out the 'out' file into columns with every eighth value in the same column.

Yoriz (Jun-28-2021, 11:20 PM; last modified Jun-28-2021, 11:32 PM)

File `output`:

```
CA50_40_ref_data2101_E04_spec0-70 #0
Lifetimes (ns) : 0.4000 0.1250 2.0446
Std deviations : Fixed Fixed 0.0339
Intensities (%) : 69.2721 9.6726 21.0553
Std deviations : 1.0359 0.8128 0.4063
Time-zero Channel number : 41.5603
Std deviations : 0.0588
CA50_40_ref_data2101_E04_spec0-70 #1
Lifetimes (ns) : 0.4000 0.1250 2.0714
Std deviations : Fixed Fixed 0.0344
Intensities (%) : 70.0338 9.0952 20.8710
Std deviations : 1.0308 0.8135 0.4009
Time-zero Channel number : 41.5853
Std deviations : 0.0593
CA50_40_ref_data2101_E04_spec0-70 #2
Lifetimes (ns) : 0.4000 0.1250 2.0568
Std deviations : Fixed Fixed 0.0333
Intensities (%) : 69.5963 8.7445 21.6592
Std deviations : 1.0411 0.8177 0.4072
Time-zero Channel number : 41.5541
Std deviations : 0.0603
CA50_40_ref_data2101_E04_spec0-70 #3
Lifetimes (ns) : 0.4000 0.1250 2.0321
Std deviations : Fixed Fixed 0.0329
Intensities (%) : 70.4228 8.0614 21.5158
Std deviations : 1.0497 0.8219 0.4105
Time-zero Channel number : 41.4507
Std deviations : 0.0604
CA50_40_ref_data2101_E04_spec0-70 #4
Lifetimes (ns) : 0.4000 0.1250 2.0513
Std deviations : Fixed Fixed 0.0331
Intensities (%) : 67.2025 11.0731 21.7244
Std deviations : 1.0204 0.7976 0.4057
Time-zero Channel number : 41.6253
Std deviations : 0.0579
```

You may need to add some error checking.

```
from itertools import zip_longest

HEADER = ('Dataset Lifetimes Std deviations '
          'Intensities Std deviations '
          'Time-zero Std deviation\n')


def grouper(iterable, n, fillvalue=''):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Each class parses one line of a 7-line record and formats it for output.
class Dataset:
    def __init__(self, line: str) -> None:
        self.data = line.strip().split()[1]

    def __repr__(self) -> str:
        return f'{self.data:7}'


class LifeTimes:
    def __init__(self, line: str) -> None:
        self.data = line.strip().split()[3:6]

    def __repr__(self) -> str:
        return ' '.join(self.data)


class StdDeviations:
    def __init__(self, line: str) -> None:
        split_line = line.strip().split()
        # Missing values (e.g. the single Time-zero deviation) become ''
        self.data = [split_line[num] if split_line[num:] else ''
                     for num in range(3, 6)]

    def __repr__(self) -> str:
        return f"['{self.data[0]:^7}', '{self.data[1]:^7}', '{self.data[2]:^7}']"


class Intensities:
    def __init__(self, line: str) -> None:
        self.data = line.strip().split()[3:6]

    def __repr__(self) -> str:
        return f'{self.data[0]:>7} {self.data[1]:>7} {self.data[2]:>7}'


class TimeZero:
    def __init__(self, line: str) -> None:
        self.data = line.strip().split()[4]

    def __repr__(self) -> str:
        return f'{self.data:9}'


class Block:
    def __init__(self, dataset, lifetimes, std_deviations, intensities,
                 std_deviations2, time_zero, std_deviations3) -> None:
        self.dataset = Dataset(dataset)
        self.lifetimes = LifeTimes(lifetimes)
        self.std_deviations = StdDeviations(std_deviations)
        self.intensities = Intensities(intensities)
        self.std_deviations2 = StdDeviations(std_deviations2)
        self.time_zero = TimeZero(time_zero)
        self.std_deviations3 = StdDeviations(std_deviations3)

    def __repr__(self) -> str:
        return (f'{self.dataset} {self.lifetimes} {self.std_deviations}'
                f' {self.intensities} {self.std_deviations2}'
                f' {self.time_zero} {self.std_deviations3}\n')


# Read the input seven lines at a time and write one row per record.
with open('output') as file_in, open('out', 'w') as file_out:
    file_out.write(HEADER)
    for group in grouper(file_in, 7):
        file_out.write(str(Block(*group)))
```

File `out`:

```
Dataset Lifetimes Std deviations Intensities Std deviations Time-zero Std deviation
#0      0.4000 0.1250 2.0446 [' Fixed ', ' Fixed ', '0.0339 '] 69.2721  9.6726 21.0553 ['1.0359 ', '0.8128 ', '0.4063 '] 41.5603   ['0.0588 ', '       ', '       ']
#1      0.4000 0.1250 2.0714 [' Fixed ', ' Fixed ', '0.0344 '] 70.0338  9.0952 20.8710 ['1.0308 ', '0.8135 ', '0.4009 '] 41.5853   ['0.0593 ', '       ', '       ']
#2      0.4000 0.1250 2.0568 [' Fixed ', ' Fixed ', '0.0333 '] 69.5963  8.7445 21.6592 ['1.0411 ', '0.8177 ', '0.4072 '] 41.5541   ['0.0603 ', '       ', '       ']
#3      0.4000 0.1250 2.0321 [' Fixed ', ' Fixed ', '0.0329 '] 70.4228  8.0614 21.5158 ['1.0497 ', '0.8219 ', '0.4105 '] 41.4507   ['0.0604 ', '       ', '       ']
#4      0.4000 0.1250 2.0513 [' Fixed ', ' Fixed ', '0.0331 '] 67.2025 11.0731 21.7244 ['1.0204 ', '0.7976 ', '0.4057 '] 41.6253   ['0.0579 ', '       ', '       ']
```

snippsat (Jun-29-2021, 12:39 AM)

Looking at the data, it should be turned around (this is called `transpose()`) if you want to get the data into pandas for calculations, plots, etc. If you just want to display the data, then Yoriz's method can work. To give an example, using just the first record:

```
import pandas as pd

record = {}
with open('ca_data.txt') as f:
    header = next(f)  # skip the dataset name line
    for line in f:
        line = line.strip()
        line = line.replace('Time-zero ', '')
        line = line.split(':')
        line_1 = line[0].strip()
        line_2 = ''.join(line[1:])
        # Repeated 'Std deviations' keys overwrite each other,
        # so only the last one is kept
        record[line_1] = line_2.split()

# Read like this so it fills in empty values with None
df = pd.DataFrame.from_dict(record, orient='index')
print(df)
```

Output:

```
Lifetimes (ns)    0.4000  0.1250   2.0446
Std deviations    0.0588    None     None
Intensities (%)  69.2721  9.6726  21.0553
Channel number   41.5603    None     None
```

Now you can use `transpose()`; then it will be a useful DataFrame.

```
>>> df = df.transpose()
>>> df
  Lifetimes (ns) Std deviations Intensities (%) Channel number
0         0.4000         0.0588         69.2721        41.5603
1         0.1250           None          9.6726           None
2         2.0446           None         21.0553           None
```

Laplace12 (Jun-29-2021, 09:17 AM)

In reply to Yoriz (Jun-28-2021, 11:20 PM):

Brilliant, this works perfectly! Using def/class commands is not familiar to me at all but I have to take a much closer look at that, seems very useful for my purposes - and it's a huge plus having reduced the amount of files the code creates. Definitely something to try and learn better. Thank you!

In reply to snippsat (Jun-29-2021, 12:39 AM):

Big thanks to you as well, I'll take a look at this method!
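
For later readers: the idea from the original question (putting every nth line into the same column) can also be sketched with plain list slicing, combined with snippsat's suggestion of parsing the list text back and writing CSV. This is only a minimal sketch under stated assumptions, not code from the thread. The function names `lines_to_rows`, `flatten` and `write_rows_csv` are made up for illustration, the CSV layout is an assumption, and `n=7` is chosen because the 'out' file shown above has seven lines per dataset. The sample text is the first two datasets from the thread.

```python
import ast
import csv
import io


def lines_to_rows(lines, n=7):
    """Group a flat list of lines into rows of n consecutive values."""
    values = [line.strip() for line in lines if line.strip()]
    if len(values) % n:
        raise ValueError(f"expected a multiple of {n} lines, got {len(values)}")
    # Column k holds every nth value starting at offset k (values[k::n]);
    # zipping the n columns together yields one row per dataset.
    columns = [values[k::n] for k in range(n)]
    return [list(row) for row in zip(*columns)]


def flatten(cell):
    """Split one line into fields; parse list text back into a real list."""
    if cell.startswith('['):
        # ast.literal_eval safely evaluates "['Fixed', ...]" to a list
        return [item.strip() for item in ast.literal_eval(cell)]
    return cell.split()


def write_rows_csv(rows, file_out):
    """Write each row as one CSV line with one field per value."""
    writer = csv.writer(file_out)
    for row in rows:
        writer.writerow([field for cell in row for field in flatten(cell)])


# Hypothetical in-memory sample: the first two datasets of the 'out' file
sample = """#0
0.4000 0.1250 2.0446
['Fixed', 'Fixed', '0.0339']
69.2721 9.6726 21.0553
['1.0359', '0.8128', '0.4063']
41.5603
['0.0588', ' ', ' ']
#1
0.4000 0.1250 2.0714
['Fixed', 'Fixed', '0.0344']
70.0338 9.0952 20.8710
['1.0308', '0.8135', '0.4009']
41.5853
['0.0593', ' ', ' ']
"""

rows = lines_to_rows(sample.splitlines(), n=7)
buffer = io.StringIO()
write_rows_csv(rows, buffer)
print(buffer.getvalue())
```

With a real file, `sample.splitlines()` could be replaced by the lines read from the file and `buffer` by a file opened with `newline=''`. Writing plain values instead of list text keeps the result easy to load into pandas or a spreadsheet later.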
