Python Forum

Full Version: Loop files - Extract List Data To Individual Columns in CSV
Hi friends,

I am trying to extract some data to a CSV file.
I have a list of values that should be extracted from each text file.
If there are 3 values in the list, there should be 3 columns in the CSV.
I wasn't able to work out the correct way to output the results to my CSV file.

#!/usr/bin/python3
import os
import csv
import pandas

 
def extract_lines_from_files(filename, dirpath):
    filename  = os.path.join(dirpath, filename)
    
    search_keywords = ['Apple','Pear','Cherry']                # 3 Columns              

    with open(filename, 'r',errors='ignore') as Text_file:     

         for line in Text_file:
            found = False
            for word in search_keywords:
                if word in line:
                    found = True
            if found:
                lines = []
                lines.append(line)
    
                with open("a.csv",'a') as csv_file:
                    writer = csv.writer(csv_file)
                    writer.writerow([line])
 
#main
directory = 'C:/Users/home/Desktop/files/'
for root, dirs, files in os.walk(directory):
    for f in files:
        extract_lines_from_files(f, directory)
#done
Result should be
Output:
Col 1 - Apple | Col 2 - Pear | Col 3 - Cherry
Apple 1 | Pear 1 | Cherry 1
Apple 2 | Pear 2 | Cherry 2
etc
At the moment it's all output to 1 column.

Please could someone have a look at this? I have been going around in circles trying to work it out.
I would appreciate that.

Thank you
hi, could you upload the files?
Hello friends,

yes

1.txt
Output:
Apple 1
Apricot
Avocado
Banana
Pear 1
Bilberry
Blackberry
Cherry 1
2.txt
Output:
Apple 2
Apricot
Avocado
Pear 2
Banana
Bilberry
Blackberry
Blackcurrant
Cherry 2
I wasn't able to upload the files, but these are the contents.

thank you
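The single column in the first script comes from calling writer.writerow([line]) once per matching line: each call writes a new row holding one value. Collecting the matches first and passing the whole list to a single writerow() call gives one row with several columns. A small sketch of the difference, using in-memory buffers instead of real files (the variable names are only for illustration):

```python
import csv
import io

matched = ["Apple 1", "Pear 1", "Cherry 1"]

# One writerow() call per matched line, each wrapped in its own list:
# every call produces a separate row with a single column.
one_column = io.StringIO()
writer = csv.writer(one_column)
for line in matched:
    writer.writerow([line])

# One writerow() call with the whole list: a single row, three columns.
three_columns = io.StringIO()
csv.writer(three_columns).writerow(matched)

print(one_column.getvalue())
print(three_columns.getvalue())
```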
#!/usr/bin/python3
import os
import csv
 
  
def extract_lines_from_files(filename, dirpath):
    filename  = os.path.join(dirpath, filename)
     
    search_keywords = ['Apple','Pear','Cherry']                              
 
    with open(filename, 'r',errors='ignore') as f:
        lines = []
        
        # iterating over the file object yields one line at a time
        for line in f:

            # (word in line) expression returns either True or False
            if any((word in line) for word in search_keywords):

                # strip() removes surrounding whitespace, including the trailing "\n"
                lines.append(line.strip())
              
        with open("a.csv",'a', newline='') as csv_file:

            # delimiter is what separates the values in the csv file
            # btw csv = "comma separated values"
            writer = csv.writer(csv_file, delimiter='|')
            writer.writerow(lines)
            
directory = 'C:/Users/home/Desktop/files/'
for root, dirs, files in os.walk(directory):

    # using only the files that end with .txt
    # this way the files could be in the same directory with the .py file
    for f in filter(lambda x: x.endswith('.txt'), files):
        # pass root (not directory) so files inside subfolders are opened correctly
        extract_lines_from_files(f, root)
Contents of the "a.csv" file:
Output:
Apple 1|Pear 1|Cherry 1
Apple 2|Pear 2|Cherry 2
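The script above puts the matching lines into the row in the order they appear in each file, so a file that is missing one keyword would shift the later values one column to the left. If the columns should always line up with the keywords and the file should start with a header row, something like the following might work. It is only a sketch under a few assumptions: each keyword matches at most one line per file (only the first match is kept), the header is simply the keyword list, the default comma delimiter is used, and the names extract_row and write_csv are made up for this example.

```python
import csv
import os

SEARCH_KEYWORDS = ['Apple', 'Pear', 'Cherry']


def extract_row(path, keywords=SEARCH_KEYWORDS):
    """Return one value per keyword: the first line containing it, or ''."""
    row = {word: '' for word in keywords}
    with open(path, 'r', errors='ignore') as f:
        for line in f:
            for word in keywords:
                # keep only the first match for each keyword
                if word in line and not row[word]:
                    row[word] = line.strip()
    # emit the values in keyword order so the columns always line up
    return [row[word] for word in keywords]


def write_csv(directory, out_path, keywords=SEARCH_KEYWORDS):
    """Write a header row, then one row per .txt file found under directory."""
    # 'w' starts a fresh file, so the header is written exactly once
    with open(out_path, 'w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(keywords)
        for root, dirs, files in os.walk(directory):
            for name in sorted(files):
                if name.endswith('.txt'):
                    writer.writerow(extract_row(os.path.join(root, name), keywords))
```

It could be called as write_csv('C:/Users/home/Desktop/files/', 'a.csv'). A keyword that does not occur in a file leaves an empty cell in that row instead of shifting the other values.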
Thank you dear friend, let me do some testing :)

Thank you for your generous help
Hi michalmonday,

Thank you for helping me revise the code and for showing me some new Python syntax.

I have tested it on my text files - and yay it works so amaaaaaazing!!!

I am very happy I can use python to do the work of extracting the data from my text files. I have a lot of text files sitting on my desktop waiting for me.

I hope you will have the best of weekends!

Thank you again for helping Python newbies.