Python Forum
Searching for URLs and printing the corresponding row from an Excel spreadsheet
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Searching for URLs and printing the corresponding row from an Excel spreadsheet
#1
Hi,

So far I have created a script that identifies the last modified folder in a list, appends all the spreadsheets in it and then outputs a new appended Spreadsheet.

Now I want to search the new spreadsheet I have created for a set list of URLs that I have and where a match is found, print the row that contains the URL.

Here ismy script so far:
import os
import pandas as pd
import xlsxwriter
exportroot = r'\\atlas\\Knowhow\\ExternalLinkValidationReports\\UK'  # change this to your folder path
folderlist = [] # blank list of folders - we'll put the paths to the subfolders in this
# check that the folder exists and is actually a folder
if os.path.exists(exportroot) and os.path.isdir(exportroot):
    exportfolders = next(os.walk(exportroot))[1]  # get list of children - this will include folders *and* files
    for exportfolder in exportfolders:  # for each of those children………
        exppath = os.path.join(exportroot, exportfolder)  # define the full path………
        if os.path.isdir(exppath):  # and if that path defines a directory………
            folderlist.append(exppath)    # add it to our list
sortedlist = sorted(folderlist, key=lambda t: -os.stat(t).st_mtime)  # sort the list, most recent first
mostrecent = sortedlist[0]  # get the full path to the most recently modified subfolder from the sorted list
print(mostrecent)
os.chdir(mostrecent)  # Changing current working directory to "mostrecent" folder
path = os.getcwd() # Defining path as new cwd
print(path)
cwd = path # Defining cwd as path
files = os.listdir(cwd)
df = pd.DataFrame() #Importing data frame
for file in files:
        if file.endswith('.xlsx'):
            df = df.append(pd.read_excel(file), ignore_index=True) #Identifying Excel files in "mostrecent" folder
df.head() #Printing first 5 rows to test data
    
writer = pd.ExcelWriter(r'Appended_Reports.xlsx', engine='xlsxwriter',options={'strings_to_urls': False})
df.to_excel(writer)
writer.close() #Creates report - changes options to avoid 255 character URL limit in excel 
Has anyone done something similar and might be able to help? I've looked through openpyxl guidance but as I'm a newbie I'm struggling to find the function. It's the Appended_Reports.xlsx spreadsheet that I create at the end of the process that I want to search.

Any help greatly appreciated!

Thanks,

John
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Returning Column and Row Data From Spreadsheet knight2000 0 481 Oct-22-2023, 07:07 AM
Last Post: knight2000
  Editing spreadsheet/csv BSDevo 6 1,112 Sep-01-2023, 05:47 PM
Last Post: BSDevo
  Looking to automate updating a spreadsheet with image from email cubangt 2 1,024 Feb-14-2023, 03:43 PM
Last Post: cubangt
  Import XML file directly into Excel spreadsheet demdej 0 891 Jan-24-2023, 02:48 PM
Last Post: demdej
  updating Google spreadsheet with gspread mgallotti 0 1,164 Sep-30-2022, 11:26 PM
Last Post: mgallotti
  Modify LibreOffice's ods spreadsheet Pavel_47 0 1,231 Jul-13-2022, 11:28 AM
Last Post: Pavel_47
  Python create a spreadsheet with column and row header ouruslife 4 1,722 Jul-09-2022, 11:01 AM
Last Post: Pedroski55
  Importing text file into excel spreadsheet with formatting david_dsmn 1 3,666 Apr-05-2021, 10:21 PM
Last Post: david_dsmn
  "Switch-to-spreadsheet" entry. Feasible in Python? whatspython 2 2,050 Sep-30-2020, 01:12 PM
Last Post: buran
  PYTHON - GOOGLE... create new spreadsheet? danclark81 3 2,746 Feb-02-2020, 08:57 PM
Last Post: danclark81

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020