Hey folks. I am a complete beginner in Python and NLP.
I am stuck on this question: "Store the resulting Doc objects into a list named A."
Is it possible to store them using "A = list(doc)"? How can they remain Doc objects inside the list? Thank you so much!
This is my exercise:
1. Load text files from a directory and read their contents
The directory data contains a subdirectory named sotu with five State of the Union speeches from various presidents of the United States, which are stored as UTF-8 encoded plain text files.
Import the pathlib module and use the module to read the contents of each text file into string objects.
Then import the spacy library and load a small language model for English. Assign the model under the variable nlp.
Process the texts using the language model and store the resulting Doc objects into a list named speeches.
And my answer:
from pathlib import Path
corpus_dir = Path("data/sotu")
files = list(corpus_dir.glob(pattern='*.txt'))
for file in files:
    text = file.read_text(encoding='utf-8')
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp(text)
speeches = list(doc)
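For comparison, here is a minimal sketch of one way the exercise could be approached. Iterating over a Doc yields its Token objects, so list(doc) gives a list of tokens from a single text rather than a list of Docs. The sketch instead appends one processed result per file to a list inside the loop. The helper name load_docs is hypothetical, and nlp is assumed to be any callable that turns a string into a Doc, such as a loaded spaCy pipeline.

```python
from pathlib import Path


def load_docs(corpus_dir, nlp):
    """Read every .txt file in corpus_dir and return one processed object per file.

    `load_docs` is a hypothetical helper name; `nlp` is assumed to be a callable
    that maps a string to a Doc (for example, a loaded spaCy pipeline).
    """
    docs = []
    # Sort the paths so the order of the results is predictable.
    for path in sorted(Path(corpus_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Process each text inside the loop and keep the whole result,
        # instead of overwriting `text` and processing only the last file.
        docs.append(nlp(text))
    return docs
```

Assuming spaCy and the en_core_web_sm model are installed, it could be used as speeches = load_docs("data/sotu", spacy.load("en_core_web_sm")), after which each element of speeches should be a Doc rather than a Token.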