Python Forum
How to use the re library to remove irrelevant words?
#4
Ah, right, so it's not the import that fails; it's the pattern you pass to re.sub() that isn't matching.
Is there not a way to extract just the text from a tweet, rather than including the URL only to have to filter it out again? As it stands, it seems you're creating a problem that then needs solving, but the code snippet is a little too short to fully understand your methodology.

There's a saying that goes something like, when you use regex to solve a problem, you now have two problems to solve.

edit to add:

If there's no other way, I'd try something like this:

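import re

# 'https?' also catches plain http links; '\S+' runs up to the next whitespace.
# Assumes tweets is a single string.
pattern = r'https?://\S+'
repl = ''
result = re.sub(pattern, repl, tweets)

print(result)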
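
For what it's worth, here is a slightly fuller sketch of the same idea, wrapped in a function so it can be applied to each tweet before the text goes into the word cloud. The name strip_urls and the sample text are made up for illustration, and the pattern is only my assumption about what counts as a URL in your data (anything starting with http:// or https:// up to the next whitespace), so adjust it to whatever your tweets actually contain.

```python
import re

# Assumed URL shape: http:// or https:// followed by non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")


def strip_urls(text: str) -> str:
    """Remove http(s) URLs, then collapse the whitespace they leave behind."""
    without_urls = URL_PATTERN.sub("", text)
    # split() with no arguments drops runs of whitespace, including
    # the double spaces left where a URL used to be.
    return " ".join(without_urls.split())


# Hypothetical sample tweet, only for demonstration.
sample = "Loving the new release https://t.co/abc123 and http://example.com/x too"
print(strip_urls(sample))
```

Compiling the pattern once keeps things tidy if you loop over many tweets, and the whitespace clean-up means the removed links don't leave gaps behind.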