Read a folder with a multiple files
Python Forum (https://python-forum.io)
Forum: Python Coding (https://python-forum.io/forum-7.html)
Forum: General Coding Help (https://python-forum.io/forum-8.html)
Thread: Read a folder with a multiple files (/thread-18110.html)

Function returning elements in a list - NewBeie - May-03-2019

Hi, I have a list

Quote:
    Names = ['John', 'Tom']

If I say

    def ReadNames():
        for name in Names:
            print(name)

I get both names

Quote:
    John

However, when I say

    def ReadNames():
        for name in Names:
            return name

I only get one name back. How do I write a function that iterates through a list and gives all values back?

RE: Function returning elements in a list - scidam - May-03-2019

(May-03-2019, 11:17 AM)NewBeie Wrote: I only get one name back, how do I write a function that iterate through a list and give all values back?

I'm not sure I understood you exactly, but you are likely talking about generators?

    def read_names():
        for name in Names:  # Names should be defined somewhere above
            yield name

    names = read_names()
    print(next(names))  # returns John
    print(next(names))  # returns Tom

RE: Function returning elements in a list - perfringo - May-03-2019

(May-03-2019, 11:17 AM)NewBeie Wrote: How ever when I say

    def ReadNames():
        for name in Names:
            return name

I only get one name back, how do I write a function that iterate through a list and give all values back?

After a return statement, the function finishes and returns control to the caller. So the first element is returned and that's all that will happen. Scidam provided code to overcome this problem. However, it's unclear to me why you want a function to iterate over the elements. Isn't it easier to iterate over the list directly? Especially when the function takes no parameters.

RE: Function returning elements in a list - NewBeie - May-06-2019

I have a folder with two xml files (there could be more than two). I want to step into that folder and read each file, so I have this code:

    path = os.getcwd() + '/Emails/'
    files = os.listdir(path)

So now "files" is a list. I want to loop over the files and read their contents, so I tried this:

    def Readfiles():
        for file in files:
            with open(file, 'r') as f:
                message = f.read()
            return message

But this is not giving me what I want.
For the contents of each file I get, I want to clean them, and I have a step for that:

    soup = BeautifulSoup(message, 'lxml')

So what I want is code that will go through the folder and read each file, then pass the contents to that step for cleaning, then give me the output thereof. Hope I made this clear.

(May-03-2019, 11:34 AM)scidam Wrote: (May-03-2019, 11:17 AM)NewBeie Wrote: I only get one name back, how do I write a function that iterate through a list and give all values back?

Read a folder with a multiple files - NewBeie - May-06-2019

I have a folder with two xml files (there could be more than two). I want to step into that folder and read each file, so I have this code:

    path = os.getcwd() + '/XmlFiles/'
    files = os.listdir(path)

So now "files" is a list:

    print(files)

I want to loop over the files and read their contents, so I tried this:

    def Readfiles():
        for file in files:
            with open(file, 'r') as f:
                message = f.read()
            return message

But this is not giving me what I want. For the contents of each file, I want to clean them, and I have a step for that:

    soup = BeautifulSoup(message, 'lxml')

So what I want is code that will go through the folder and read each file, then pass the contents to the BeautifulSoup function for cleaning, then give me the output thereof, with results for each file.

RE: Function returning elements in a list - perfringo - May-06-2019

Please read the previous answers once again. You have got an answer for why you get only one name back and how to deal with it.

From the lxml FAQ:

Quote: Take a look at the XML specification, it's all about byte sequences and how to map them to text and structure. That leads to rule number one: do not decode your XML data yourself. That's a part of the work of an XML parser, and it does it very well. Just pass it your data as a plain byte stream, it will always do the right thing, by specification.

RE: Read a folder with a multiple files - DeaD_EyE - May-06-2019

The return statement is wrong.
First, the high-level solution:

    from pathlib import Path

    def read_files():
        root = Path.cwd() / 'XmlFiles'
        for file in root.glob('*.xml'):
            yield file.read_text()
            # yield file.read_bytes()  # to get bytes

This function is a generator and only does its work if you iterate over it, or use a function/type which iterates implicitly over the generator. To get the data of all *.xml files:

    file_data_as_list = list(read_files())

If you change the function a little bit, you can store the path as a key in a dict together with the text as the value.

    def read_files():
        root = Path.cwd() / 'XmlFiles'
        for file in root.glob('*.xml'):
            # yield (key, value)
            yield (file, file.read_text())

    xml_content = dict(read_files())

Path.cwd() returns an absolute path, and the objects produced during iteration are also pathlib objects. A pathlib object itself is not mutable. You can compare it to strings: changing a path results in a new path.

Your old version, corrected:

    def read_files():
        result = []
        for file in files:
            with open(file, 'r') as f:
                message = f.read()
            result.append(message)
        return result

To get rid of the list inside the function, you can convert it to a generator:

    def read_files():
        for file in files:
            with open(file, 'r') as f:
                yield f.read()

The object files should not be accessed from global scope. Use arguments for your functions. In this case the root directory should be one argument of your function:

    def read_files(files):
        for file in files:
            with open(file, 'r') as f:
                yield f.read()

I often use generators to explain things. Often less code is needed and it looks like what it does. If you use a return statement somewhere in your function, you leave the function.

RE: Function returning elements in a list - NewBeie - May-06-2019

The answer above doesn't really help my situation. From the for loop, I want to read each element in a loop. I could get 2 files or more, so

    names = read_names()
    print(next(names))  # returns John
    print(next(names))  # returns Tom

won't really help, as there might be 40+ files.
I want to iterate over a list, for as many elements as are in the list, then for each element, read its contents (files in a folder). As for XML, I use

    soup = BeautifulSoup(message, 'lxml')

to clean all the garbage, so there's not an issue here.

(May-06-2019, 07:42 AM)perfringo Wrote: Please read previous answers once again. You have got answer why you get only one name back and how to deal with it.

This is what I've done so far:

    path = os.getcwd() + '/XmlFiles/'
    files = os.listdir(path)

    def Readfiles():
        for file in files:
            # print(file)
            with open(path+file, 'r') as f:
                message = f.read()
            return (message)

    message = Readfiles()
    soup = BeautifulSoup(message, 'lxml')
    print(soup.text.strip())

What this does is go to my folder, get a file, read it and print it. However, when I put a second file in my folder, I only get the results of the first file. I would like to get results for each file in the folder.

Thank you for the reply, it is helping a lot. I'm now close to getting what I want. I used this code below:

    path = os.getcwd() + '/XmlFiles/'
    files = os.listdir(path)
    # print(files)

    def read_files():
        result = []
        for file in files:
            with open(path+file, 'r') as f:
                message = f.read()
            result.append(message)
        return(result)

    message = (read_files())
    print(message)

With this I do get my two files returned in a list. I only copied part of the output, but I do get the whole 2 files, in list format. However, when I try to apply

Quote: To get rid of the list inside the function, you can convert it to an generator:

I'm getting this output. I would like to get rid of the list inside the function, to get the results of my 2 files separately, but not in a list. For instance, if I do this

    def Readfiles():
        for file in files:
            print(file)

I'm getting output that is not a list, so I would like to get the same for the contents of the files that are read.

(May-06-2019, 07:53 AM)DeaD_EyE Wrote: The return statement is wrong.
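
A minimal sketch of how the generator approach from this thread could be consumed with a plain for loop, so that each file's contents arrive one at a time instead of in a list, however many files the folder holds. The names `read_files`, `clean` and `clean_all`, the `root` and `pattern` parameters, and the demo file names are assumptions made for illustration. `clean` is only a stand-in for the BeautifulSoup step used in the thread, so that the sketch runs with the standard library alone.

```python
from pathlib import Path
import tempfile


def read_files(root, pattern="*.xml"):
    """Yield (file name, text) for each matching file in root, one at a time."""
    for file in sorted(Path(root).glob(pattern)):
        yield file.name, file.read_text()


def clean(text):
    # Hypothetical stand-in for the cleaning step; in the thread this would be
    # something like BeautifulSoup(text, 'lxml').text.strip()
    return text.strip()


def clean_all(root):
    """Return a list of (file name, cleaned text) pairs, one per file."""
    return [(name, clean(text)) for name, text in read_files(root)]


if __name__ == "__main__":
    # Demo on a throwaway folder so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "a.xml").write_text("  <msg>first</msg>\n")
        (Path(tmp) / "b.xml").write_text("  <msg>second</msg>\n")
        # The for loop drives the generator; no next() calls are needed.
        for name, text in read_files(tmp):
            print(name, "->", clean(text))
```

Printing the generator object itself only shows its representation, not the file contents; it is the for loop (or `list()`, `dict()`) that actually pulls the values out. Passing the folder as an argument also avoids relying on a global `files` list.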