Python Forum
read multiple .xlsx files and text files in a directory
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
read multiple .xlsx files and text files in a directory
#1
I have a list of .xlsx files (names 1.xlsx, 2.xlsx etc), each with several worksheet, in a directory. I need to insert data from its corresponding text file (named 1.txt, 2.txt etc) on the second worksheet named 'Filtered' and save it along with its original contents. The code I am trying is below. I end up with a blank worksheet in 'Filtered'. I am not sure how to read multiple text files with corresponding excel files and paste data. Any suggestions/edits on code ? Thank you

#!/usr/bin/python

import os
from openpyxl.reader.excel import load_workbook
import csv

directoryPath = r'/REPORTS/'
os.chdir(directoryPath)
folder_list = os.listdir(directoryPath)
for folders, sub_folders, file in os.walk(directoryPath):  
   for name in file:
       if name.endswith(".xlsx"):
           filename = os.path.join(folders, name)
           wb = load_workbook(filename, data_only=True)
           ws = wb.get_sheet_by_name('Filtered')
           directory = os.path.join("/REPORTS/temp", "path")
           for root,dirs,files in os.walk(directory):
               for file in files:
                   if file.endswith(".txt"):
                       f = open(textfile, 'r')
                       reader = csv.reader(f, delimiter='\t')
                       for row in reader:
                           ws.append(row)
                           wb.save(filename)
Reply
#2
Question: Are you converting to text (outside of your python program) for the
sole reason of converting them into a format that can be imported into Python?
If so, there are multiple libraries available to read and/or write directly to Excel,
Pandas being one of the most dominant. see: http://pandas.pydata.org/
Reply
#3
Hello,
the text files are a result of some calculation done using awk and I need to paste the results back to the excel files.
Reply
#4
OK, I'm not familiar with openpyxl, but it is capable of reading and writing (according to the docs)
I would still look into pandas.
Assuming (tell me if I am wrong):
several input spreadsheets to be joined together into one output sheet.

so:
  • open output sheet as a writable sheet
  • for each input sheet, read workseheet, write to new sheet
  • when all read, close output
  • should be as simple as that

I would still look at pandas. I am not saying that there is anything wrong with the package
you are using, but I had problems with a few when I needed to manipulate excel.

Pandas is rock solid, and offers many more features
Reply
#5
(Jun-02-2017, 11:14 AM)BNB Wrote:
#!/usr/bin/python

import os
from openpyxl.reader.excel import load_workbook
import csv

I am not sure why are you using openpyxl.reader.excel, I just checked a code I've written several week ago
from openpyxl import Workbook, load_workbook
Works just fine.  

wb.get_sheet_by_name causes exception if the sheet with that name does not exist - but I guess that is not your problem.

PS I am not sure why, but it is recommended to open csv files with newline='' argument

Another thing - you are collecting the same data for each Excel file you find. It is sort of inefficient. What exactly are you trying to do?
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you an advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you an advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
Reply
#6
(Jun-05-2017, 11:58 AM)Larz60+ Wrote: ......
I would still look at pandas. I am not saying that there is anything wrong with the package you are using, but I had problems with a few when I needed to manipulate excel.
At the risk of facing fire and brimstone  Tongue , I would like to point out that pandas  - great package, BTW Thumbs Up - may be intimidating for beginners.

And its functioning against Excel is not without flaws. When I saved a huge dataframe that contained a column of string digits, it (the column) was saved as numbers in Excel. Lucky for me, they were saved as integers - otherwise, my data would have been corrupted (like when it was converted to scientific form Cry ).

pandas will not provide you with formatting options either.

(Jun-05-2017, 11:58 AM)Larz60+ Wrote: Pandas is rock solid, and offers many more features
About "rock solid" - they have been making some changes that are not backward-compatible lately, I believe?! And occasionally you don't need "more features" Wink

(Jun-05-2017, 11:58 AM)Larz60+ Wrote: ......
I would still look at pandas. I am not saying that there is anything wrong with the package you are using, but I had problems with a few when I needed to manipulate excel.
At the risk of facing fire and brimstone  Tongue , I would like to point out that pandas  - great package, BTW Thumbs Up - may be intimidating for beginners.

And its functioning against Excel is not without flaws. When I saved a huge dataframe that contained a column of string digits, it (the column) was saved as numbers in Excel. Lucky for me, they were saved as integers - otherwise, my data would have been corrupted (like when it was converted to scientific form Cry ).

pandas will not provide you with formatting options either.

(Jun-05-2017, 11:58 AM)Larz60+ Wrote: Pandas is rock solid, and offers many more features
About "rock solid" - they have been making some changes that are not backward-compatible lately, I believe?! And occasionally you don't need "more features" Wink
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you an advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you an advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
Reply
#7
Oh well, it's your option
Reply
#8
(Jun-06-2017, 08:22 PM)Larz60+ Wrote: Oh well, it's your option
When I have nothing to say - I prefer to pledge the fifth. Well, it's not the first time moderators try to shame me.

I think I am starting to understand Thaos Confused .

Let me repeat myself: pandas is not easy, not for beginners - and not always necessary...
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you an advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you an advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
Reply
#9
(Jun-06-2017, 08:28 PM)volcano63 Wrote: Well, it's not the first time moderators try to shame me.
I'm a moderator.  And I...
1) don't recommend pandas, but mostly because...
2) I'm not even really sure what it is Tongue

From what I've seen it's basically numpy, but more complicated and without any additional benefits.
Reply
#10
Not trying to shame, only stating that it's ok to do as you wish.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Import multiple CSV files into pandas Krayna 0 1,716 May-20-2021, 04:56 PM
Last Post: Krayna
  Python - Pandas writing blank files to file tdunphy 0 1,998 Jan-14-2021, 12:11 AM
Last Post: tdunphy
  Most Compatible Text Editor to Handle Large Files? Robotguy 2 2,379 Aug-18-2020, 03:51 PM
Last Post: FortyTwo
  Creating many csv files from Pandas EMA 0 1,629 Jul-26-2020, 06:39 PM
Last Post: EMA
  Loading multiple JSON files to create a csv 0LI5A3A 0 2,102 Jun-28-2020, 10:35 PM
Last Post: 0LI5A3A
  Binning data to files Kappel 4 2,404 Jun-22-2020, 06:25 PM
Last Post: Kappel
  Filter rows by multiple text conditions in another data frame i.e contains strings an Pan 0 2,157 Jun-09-2020, 06:05 AM
Last Post: Pan
  Can't read text file with pandas zinho 6 12,108 May-24-2020, 06:13 AM
Last Post: azajali43
  XLSX file with multiple sheets to josn file ovidius 2 2,230 Apr-05-2020, 09:22 AM
Last Post: ovidius
  Can python read Marathi text files and summarize them? mcp111 0 1,826 Mar-18-2020, 08:58 AM
Last Post: mcp111

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020