Python Forum
Read multiple text files and get specific lines based on criteria
#1
How can I read multiple text files, get specific lines by a criterion, and save them into a master CSV file?

I tried this:
import glob
import pandas as pd

lista = []

try:
    for f_name in glob.glob(r'C:\Users\zinho\Desktop\Projet_Txt'):
        if f_name.endswith('.txt'):
            with open(f_name, 'r') as f:
                for i in f:
                    if i[:6] == '|D100|':
                        lista.append(i)

except UnicodeDecodeError:
    pass

df = pd.DataFrame(lista)
df.to_csv('Master.csv', index=False, header=False)
Note: I can't use pandas to read the files, because everything I tried failed.
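One thing that may be worth checking in the snippet above: `glob.glob` only expands wildcards, so a pattern that is just a folder path matches at most the folder itself, and the `.endswith('.txt')` test then never passes. A minimal sketch, assuming the files sit directly inside one folder; `find_txt_files` is a hypothetical helper name, not something from the thread:

```python
import glob
import os


def find_txt_files(folder):
    """Return the .txt files directly inside `folder`, sorted by name."""
    # The '*.txt' wildcard is what makes glob look inside the folder;
    # without it, glob returns the folder path itself.
    pattern = os.path.join(folder, "*.txt")
    return sorted(glob.glob(pattern))
```

It could be called as `find_txt_files(r'C:\Users\zinho\Desktop\Projet_Txt')` and the result looped over in place of the `glob.glob(...)` call.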
#2
Look into os.walk, readlines, and split.
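A rough sketch of how those pieces might fit together, assuming the records are pipe-delimited and the record type is the first field after the leading `|`. The name `iter_matching_lines` and the `errors="replace"` setting are assumptions for illustration, and the file is iterated line by line rather than with `readlines`:

```python
import os


def iter_matching_lines(top, tag="D100"):
    """Yield (path, fields) for every line whose first field equals `tag`."""
    for root, _dirs, files in os.walk(top):
        for name in files:
            if not name.lower().endswith(".txt"):
                continue
            # Join with root so files in subfolders are opened correctly.
            path = os.path.join(root, name)
            # Assumption: errors="replace" keeps reading past undecodable bytes.
            with open(path, "r", errors="replace") as handle:
                for line in handle:
                    # '|D100|0|1|' splits into ['', 'D100', '0', '1', '']
                    fields = line.rstrip("\n").split("|")
                    if len(fields) > 1 and fields[1] == tag:
                        yield path, fields
```

Comparing the split field instead of a substring avoids matching `D100` when it happens to appear in the middle of another record.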
I welcome all feedback.
The only dumb question, is one that doesn't get asked.
#3
I don't understand. At home I found a way to solve my problem, but when I tested this code at work it didn't work.

After running this code (with the file paths changed for my work computer), it doesn't show the D100 lines; it shows nothing.

import glob
import pandas as pd

filepaths = glob.glob("/home/zinho/Downloads/*.txt")
lista = []

'''
Reads several txt files.
Copies the lines matching the |D100| criterion and saves them to a CSV.

'''

try:
    for fp in filepaths:
        with open(fp, 'r') as f:
            lin = f.readlines()
            for cnt in lin:
                if cnt[:6] == '|D100|':
                    lista.append(cnt)

# This line handles strange characters at the end of the file
except UnicodeDecodeError:
    pass

df = pd.DataFrame(lista)
df.to_csv('/home/zinho/Downloads/Master.csv', index=False, header=False)
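A possible explanation for the empty result, offered as a guess: `f.readlines()` decodes the whole file before the loop sees any line, so a single undecodable byte anywhere in a file raises `UnicodeDecodeError` before any `|D100|` line is collected, and because the `try` wraps the whole loop, the remaining files are skipped too. A minimal sketch of one way around that, assuming the files are mostly UTF-8; the function name, the `encoding` default, and `errors="replace"` are illustrative choices, not something confirmed in the thread:

```python
def collect_tagged_lines(paths, prefix="|D100|", encoding="utf-8"):
    """Return every line that starts with `prefix` from the given files."""
    matches = []
    for path in paths:
        # errors="replace" swaps undecodable bytes for U+FFFD instead of
        # raising, so one bad byte does not discard the whole file.
        with open(path, "r", encoding=encoding, errors="replace") as handle:
            for line in handle:
                if line.startswith(prefix):
                    matches.append(line.rstrip("\n"))
    return matches
```

With this, `lista = collect_tagged_lines(filepaths)` would take the place of the `try` block. If the files turn out to use another encoding (for example `cp1252`), passing it as `encoding` may be a better fit.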
#4
This works for me:
#!/usr/bin/env python3

import os

for root, dirs, files in os.walk('./', topdown=True):

    for name in files:
        if name.endswith('.txt'):
            # Join with root so files in subfolders open correctly
            with open(os.path.join(root, name), 'r') as lines:
                for line in lines:
                    if '|D100|' in line:
                        print(line)
Output:
this is some text with |D100|
text with |D100|
still more |D100|
one more text with |D100|

Directory structure
Output:
├── another.txt
├── my.txt
└── walk.py

Contents of my.txt
this is some text with |D100|
this text does not have it
more text
text with |D100|

Contents of another.txt
This is another file text
More here
still more |D100|
one more text with |D100|
#5
I tried your code, but the result is the same as mine: nothing is shown.

My example text file has 70K rows; here is a piece of it:
Output:
|C590|090|1253|0|710,44|0|0|0|0|0||
|C500|0|1|1-600281|06|00||||25878154|23042019|25042019|15309,16|0|15309,16|0|0|0|0|0|0|0||252,60|1163,50|||
|C590|090|1253|0|15309,16|0|0|0|0|0||
|C990|885509|
|D001|0|
|D100|0|1|1-500145|57|00|001||2097819|32190405593147000156570010020978191053226735|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3205309|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097820|32190405593147000156570010020978201053226752|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3204609|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097821|32190405593147000156570010020978211053226768|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3200607|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097822|32190405593147000156570010020978221053226773|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3202801|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097823|32190405593147000156570010020978231053226789|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3201902|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097824|32190405593147000156570010020978241053226794|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3202405|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097825|32190405593147000156570010020978251053226805|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3205200|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097826|32190405593147000156570010020978261053226810|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3201308|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097827|32190405593147000156570010020978271053226826|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3201407|
|D190|000|1353|12|313,46|313,46|37,62|0||
|D100|0|1|1-500145|57|00|001||2097828|32190405593147000156570010020978281053226831|01042019|08042019|0||313,46|0|0|313,46|313,46|37,62|0|||3205002|3202207|


After running your code:
Output:
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 22:45:29) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>>
= RESTART: C:/Users/zinho/Documents/Projetos_Python/gettext_V2.py
>>>
My original file is here
https://santacruzdistribuidora-my.sharep...g?e=GWdqxr
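When a script like the one in #4 prints nothing at all, it may help to confirm which folder `os.walk('./')` is actually scanning, since `./` is resolved against the current working directory rather than against the folder that holds the data. A small diagnostic sketch; `describe_search` is a hypothetical helper, not part of either script above:

```python
import os


def describe_search(top="./"):
    """Return the absolute folder being walked and the .txt files under it."""
    start = os.path.abspath(top)
    found = []
    for root, _dirs, files in os.walk(start):
        for name in files:
            if name.lower().endswith(".txt"):
                found.append(os.path.join(root, name))
    return start, sorted(found)
```

Printing the two returned values shows whether any `.txt` files are being picked up before looking for problems in the line matching.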
#6
Hi,

I finally solved this.
import errno
import glob
import pandas as pd

path = 'C:\\Users\\zinho\\Desktop\\Projet_Txt\\*.txt'
f_names = glob.glob(path)
lista = []


for file in f_names:
    try:
        with open(file, 'r') as f:

            try:
                for line in f:
                    if line[:6] == '|D100|':
                        lista.append(line)

            except UnicodeDecodeError:
                # Stop reading this file at the first undecodable character
                pass

    except IOError as exc:
        # Ignore directories matched by the pattern; re-raise other I/O errors
        if exc.errno != errno.EISDIR:
            raise


df = pd.DataFrame(lista)
df.to_csv('C:\\Users\\zinho\\Downloads\\Master.csv', index=False, header=False)
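Since each record is already pipe-delimited, one option is to split it into columns before writing, using the standard `csv` module instead of a one-column DataFrame. This is only a sketch under two assumptions: every record starts and ends with `|`, and a semicolon is an acceptable delimiter (chosen here because values such as `313,46` contain commas). `write_records` is a hypothetical name:

```python
import csv


def write_records(lines, out_path, delimiter=";"):
    """Split pipe-delimited records into columns and write them to a CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=delimiter)
        for line in lines:
            # Splitting '|a|b||' gives ['', 'a', 'b', '', '']; dropping the
            # first and last items removes the outer pipes but keeps empty
            # fields in between.
            fields = line.strip().split("|")[1:-1]
            writer.writerow(fields)
```

For example, `write_records(lista, 'C:\\Users\\zinho\\Downloads\\Master.csv')` would write one column per field.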