How to store the resulting Doc objects into a list named A
Python Forum (https://python-forum.io), Python Coding, General Coding Help, thread /thread-36593.html
How to store the resulting Doc objects into a list named A - xinyulon - Mar-08-2022

Hey folks. I am a complete beginner with Python and NLP, and I am stuck on this instruction: "Store the resulting Doc objects into a list named A". Is it possible to do this with "A=list(doc)"? How would the items still remain Doc objects inside the list? Thank you so much!

This is my exercise:

1. Load text files from a directory and read their contents

The directory data contains a subdirectory named sotu with five State of the Union speeches from various presidents of the United States, which are stored as UTF-8 encoded plain text files. Import the pathlib module and use the module to read the contents of each text file into string objects. Then import the spacy library and load a small language model for English. Assign the model under the variable nlp. Process the texts using the language model and store the resulting Doc objects into a list named speeches.

And my answer:

    from pathlib import Path
    corpus_dir = Path("data/sotu")
    files = list(corpus_dir.glob(pattern='*.txt'))
    for file in files:
        text = file.read_text(encoding='utf-8')
        import spacy
        nlp=spacy.load('en_core_web_sm')
        doc=nlp(text)
        speeches=list(doc)

RE: How to store the resulting Doc objects into a list named A - bowlofred - Mar-08-2022

Generally you should put imports at the start of your program. There's no reason to put the import inside a loop, where it is run multiple times.

To create a list of things, you can generally append() each of the things to your list. So something like:

    speeches = []  # list is created outside the loop
    for file in files:
        # create the doc
        speeches.append(doc)  # items added to the list inside the loop
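
To spell out the append pattern from the reply: iterating over a spaCy Doc yields its Token objects, so list(doc) produces a list of tokens from one text rather than a list of Doc objects. A minimal sketch of one way to collect one processed object per file follows. The function name process_files and its parameters are hypothetical, not something from the thread. It takes the language model as an argument so that the model is loaded once, outside the loop.

```python
from pathlib import Path


def process_files(corpus_dir, nlp):
    """Return one processed object per .txt file in corpus_dir.

    `nlp` is any callable that takes a string. With spaCy it would be the
    pipeline returned by spacy.load(...), so each list item is a Doc.
    (Hypothetical helper, written for illustration only.)
    """
    results = []  # the list is created once, before the loop
    # sorted() gives a stable order; glob() does not guarantee one
    for file in sorted(Path(corpus_dir).glob("*.txt")):
        text = file.read_text(encoding="utf-8")
        # append the whole returned object; list(obj) would split it up
        results.append(nlp(text))
    return results
```

Assuming spaCy and the en_core_web_sm model are installed, and with "import spacy" at the top of the script, the exercise would then come down to: speeches = process_files("data/sotu", spacy.load("en_core_web_sm")).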