Python Forum
read a text file, find all integers, append to list - Printable Version


read a text file, find all integers, append to list - oldtrafford - Aug-07-2022

Hello everyone,

I have multiple text files, each with a lot of lines.
Each line contains numbers separated by one or more spaces.
For example :

1 2 3 4 5 9
7 10 15 8 87
14 58 69 10 100

To simplify, let's say we have 3 files. I would like to do the following steps :
  1. Open the first file
  2. Read line by line and find all numbers.
  3. Append these numbers to a list of integers, but keep the same order.
  4. Then attach this list to a dictionary

  5. Do the same for the 2 other files, then save the dictionary, which contains the 3 lists, to a text file.

Thank you for your help


RE: read a text file, find all integers, append to list - Yoriz - Aug-07-2022

What have you tried so far? The forum will help you with your code, but will not write it for you.
See the Homework and No Effort Questions link in my signature.


RE: read a text file, find all integers, append to list - rob101 - Aug-07-2022

Reading a file, line by line, is not too hard:

with open('file_one.txt', 'r') as reader:
    for line in reader:
        print(line, end='')

Why don't you try to code something up and post it back? That way we can see what kind of skills you have and advise you based on what your skill level appears to be.


RE: read a text file, find all integers, append to list - oldtrafford - Aug-07-2022

import os, inquirer, glob, shutil, datetime, pandas, re
from subprocess import *
from typing import List
from inquirer.themes import GreenPassion
from pathlib import Path

Odb_File_Path = str(os.getcwd())

Path = {}
Inp_File_Selected_WOext = ['File_001.inp', 'File_002.inp', 'File_003.inp']
Inp_Short_Names_File = ["Path_1.inp","Path_2.inp","Path_2.inp" ]

for i in range(len(Inp_Short_Names_File)):
    Numbers = []
    inFile = open(Inp_File_Selected_WOext[i])
    
    outFile = open(Inp_Short_Names_File[i], "w")
    
    keepCurrentSet = False
    for line in inFile:
        if line.startswith("*"):
            keepCurrentSet = False

        if keepCurrentSet:
            outFile.write(line)

        if line.startswith("*Nset, nset=PATH, unsorted"):
            keepCurrentSet = True
    inFile.close()
    outFile.close()

    with open(Inp_Short_Names_File[i]) as f:
        lines = f.read()
        
    with open(Inp_Short_Names_File[i], "w") as f:
        for line in lines:
            f.write(re.sub(',', '', line))    

    with open(Inp_Short_Names_File[i]) as f:
        lines = f.read()
        for z in lines.split():
           if z.isdigit():
              Numbers.append(int(z))
          
    with open(Inp_Short_Names_File[i], "w") as f:
        for line in lines:
            f.write(str(Numbers))             
   
    with open(Inp_Short_Names_File[i],"r") as f:
        lines = f.readlines()            
        Path[i] = lines

          

     
OutputFile = open(r'output.inp',"w")
OutputFile.write(str(Path))
OutputFile.close

It seems that when I extract the numbers, the list is created correctly, but it is written 3 times to each file.
The "File_001.inp" files in my code look like this, and I want to copy only the numbers between the two lines that start with *:

Text
..
..
...
..
..
*Nset, nset=PATH, unsorted
13, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735
736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751
752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767
3434, 3435, 3436, 3437, 128, 128, 3357, 3358, 3359, 3360, 3361, 3362, 3363, 122, 122, 3243
3244, 3245, 3246, 3247, 121, 121, 112, 112, 3099, 3100, 3101, 3102, 99, 99, 2831, 2832
2833, 2834, 2835, 2836, 2837, 2838, 2839, 2840, 2841, 2842, 2843, 2844, 2845, 2846, 2847, 2848
2849, 2850, 2851, 2852, 2853, 2854, 2855, 2856, 2857, 2858, 2859, 2860, 2861, 2862, 2863, 2864
2865, 2866, 2867, 2868, 2869, 2870, 2871, 2872, 2873, 2874, 2875, 2876, 2877, 2878, 2879, 2880
2881, 2882, 2883, 2884, 2885, 2886, 2887, 2888, 2889, 88, 88, 2506, 2507, 2508, 2509, 2510
2511, 2512, 2513, 2514, 2515, 2516, 2517, 2518, 2519, 2520, 2521, 2522, 2523, 2524, 2525, 2526
2527, 2528, 2529, 2530, 2531, 2532, 2533, 2534, 2535, 2536, 2537, 2538, 2539, 2540, 2541, 2542
2543, 2544, 2545, 2546, 2547, 2548, 2549, 2550, 2551, 2552, 2553, 2554, 2555, 2556, 2557, 2558
2559, 2560, 2561, 2562, 2563, 2564, 72
*Text
....
...
...
..
Text


RE: read a text file, find all integers, append to list - deanhystad - Aug-07-2022

I don't understand what you mean by "save the dictionary in a text file". What file format do you want to use? What are the keys in the dictionary?

This code saves the dictionary as a JSON file. For keys I use the filename of the input file.

import json
import re

integer_pattern = re.compile("[+-]?[0-9]+")

def get_numberes_from_file(filename):
    numbers = []
    with open(filename, "r") as file:
        for line in file:
            if line.startswith("*Nset"):
                break
        for line in file:
            if line.startswith("*Text"):
                break
            numbers += map(int, re.findall(integer_pattern, line))
    print(numbers)
    return numbers

input_files = ["test.txt", "test2.txt", "test3.txt"]
numbers = {}
for filename in input_files:
    numbers[filename] = get_numberes_from_file(filename)

with open("output.inp", "w") as file:
    json.dump(numbers, file, indent=4)

I don't understand what you were doing with the short name output files.
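
To check the result, a file written this way can be loaded back with json.load. This is a minimal sketch: the dictionary contents and the temporary path are made up for illustration, and only the standard json, os and tempfile modules are used.

```python
import json
import os
import tempfile

# Hypothetical data in the same shape as the dictionary built above.
numbers = {"test.txt": [13, 721, 722], "test2.txt": [736, 737]}

# Write to a temporary file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "output.inp")
with open(path, "w") as file:
    json.dump(numbers, file, indent=4)

# json.load rebuilds the dictionary; lists keep their original order.
with open(path, "r") as file:
    loaded = json.load(file)

print(loaded["test.txt"])
```

Because the keys are filenames (strings), the dictionary survives the round trip unchanged. Integer keys would come back as strings, since JSON object keys are always strings.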


RE: read a text file, find all integers, append to list - Pedroski55 - Aug-07-2022

@deanhystad I often copy stuff from you experts here and try it out at home. It's a good way to learn.

I made a text file with some text. Added some numbers on each line, then copied all the numbers from above in as well.

But when I try your code, it returns an empty list. Obviously, I'm doing something wrong, but I can't see what. Could you help?

Quote:
>>> for filename in input_files:
...     numbers[filename] = get_numberes_from_file(filename)
...
[]

import json
import re

path2text = '/home/pedro/temp/'
myfile = 'test_number_finder.txt'
 
integer_pattern = re.compile("[+-]?[0-9]+")
 
def get_numberes_from_file(filename):
    numbers = []
    with open(path2text + filename, "r") as file:
        for line in file:
            if line.startswith("*Nset"):
                break
        for line in file:
            if line.startswith("*Text"):
                break
            numbers += map(int, re.findall(integer_pattern, line))
    print(numbers)
    return numbers
 
input_files = ['test_number_finder.txt']
numbers = {}
for filename in input_files:
    numbers[filename] = get_numberes_from_file(filename)



RE: read a text file, find all integers, append to list - deanhystad - Aug-08-2022

The file needs to look like the example posted by oldtrafford.
Output:
Text
..
..
...
..
..
*Nset, nset=PATH, unsorted
13, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735
...
2559, 2560, 2561, 2562, 2563, 2564, 72
*Text
....
...
...
..
Text

I wouldn't be surprised if the starting mark should be Nset, not *Nset, and the ending mark Text, not *Text. Looking at it again, I think the numbers may start after the Nset line and continue until a line that starts with text rather than numbers. Maybe this is a better fit:

import json
import re
 
integer_pattern = re.compile("[+-]?[0-9]+")
 
def get_numberes_from_file(filename):
    numbers = []
    with open(filename, "r") as file:
        for line in file:
            # Marks the beginning of int data
            if line.startswith("Nset"):
                break
        for line in file:
            # Read lines until one without numbers is encountered.
            # Build a list: a bare map object is always truthy, even when empty.
            matches = [int(m) for m in integer_pattern.findall(line)]
            if matches:
                numbers += matches
            else:
                break
    return numbers
 
input_files = ["test.txt", "test2.txt", "test3.txt"]
numbers = {}
for filename in input_files:
    numbers[filename] = get_numberes_from_file(filename)
 
with open("output.inp", "w") as file:
    json.dump(numbers, file, indent=4)



RE: read a text file, find all integers, append to list - oldtrafford - Aug-08-2022

(Aug-07-2022, 09:45 PM)deanhystad Wrote: This code saves the dictionary as a json format file. For keys I use the filename of the input file.

Thank you very much :) It's exactly what I needed to make my program work. You saved my day :)


RE: read a text file, find all integers, append to list - Pedroski55 - Aug-10-2022

Figured it out!
My file did not contain the *Nset or *Text lines.

When you read the lines like this:

with open(path2text + filename, "r") as file:
    for line in file:
        #print(line)
        if line.startswith("*Nset"):
            break

you get one of those one-time-use things, like csv.reader(): use it, then lose it. (I don't know exactly why that happens; maybe someone could explain?)

Because *Nset was never found, the first loop read the whole file, and after that the file had nothing left to give.

The next loop had nothing to read.
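
On the "use it then lose it" behaviour: a file object is its own iterator and remembers its read position, so a loop that runs to the end leaves nothing for a later loop over the same object. A minimal sketch, using a made-up temporary file with no *Nset line:

```python
import os
import tempfile

# Small temporary file that has no "*Nset" line (contents are made up).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("1 2 3\n4 5 6\n")

with open(path, "r") as file:
    # First loop: the marker is never found, so it runs to the end of the file.
    for line in file:
        if line.startswith("*Nset"):
            break

    # Second loop: the read position is already at the end, so nothing is left.
    leftover = [line for line in file]

    # seek(0) moves the read position back to the start of the file.
    file.seek(0)
    reread = [line for line in file]

print(leftover)
print(reread)
```

This is also why the two-loop version works when the marker is present: the first loop stops at the marker and the second continues from the line after it.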

This reduced function found all the numbers:

def get_numberes_from_file(filename):
    numbers = []
    with open(path2text + filename, "r") as file:        
        for line in file:
            print(line)
            if line.startswith("*Text"):
                break
            numbers += map(int, re.findall(integer_pattern, line))
    print(numbers)
    return numbers



RE: read a text file, find all integers, append to list - deanhystad - Aug-10-2022

I don't know what you mean by
Quote:you have one of those 1 time use things, like csv.reader(), use it then lose it
Are you talking about the context manager?
with open(filename, "r") as file:
If so, you can read about context managers online.

https://book.pythontips.com/en/latest/context_managers.html

Essentially this:
with open(filename, "r") as file:
    # do stuff with file
is the same as
file = open(filename, "r")
try:
    # do stuff with file
finally:
    file.close()
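
A small sketch of that equivalence, assuming a throwaway temporary file (its name and contents are made up): both forms leave the file closed afterwards, and the try/finally form does so even when an error is raised inside the block.

```python
import os
import tempfile

# Temporary file for the demonstration (name and contents are made up).
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("1 2 3\n")

# Form 1: the with statement closes the file when the block ends.
with open(path, "r") as file:
    first = file.read()
closed_after_with = file.closed

# Form 2: try/finally closes the file even though an error is raised.
file = open(path, "r")
try:
    raise ValueError("simulated failure while using the file")
except ValueError:
    pass
finally:
    file.close()
closed_after_finally = file.closed

print(closed_after_with, closed_after_finally)
```

Note that closing is separate from the read position: inside a single with block, the file can still be read only as far as its end unless it is rewound.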