Do you only want the first line from your text?
with open(urltext, 'r') as infile:
    text = infile.readline()  # first line only; readlines() would return a list of all lines
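In case the difference between the read methods is the sticking point, here is a small sketch. It uses io.StringIO as a stand-in for an open file, and the sample text is made up for illustration, so nothing here depends on your actual file.

```python
import io

# Hypothetical sample standing in for the contents of the text file
sample = "first line\nsecond line\nthird line\n"

# io.StringIO behaves like an open text file, so the same methods apply
first = io.StringIO(sample).readline()   # one line, trailing newline included
lines = io.StringIO(sample).readlines()  # list with one string per line
whole = io.StringIO(sample).read()       # the entire text as a single string

print(repr(first))
print(lines)
print(len(whole))
```

For pulling URLs out with a regular expression, read() is usually the convenient choice because findall can then scan the whole text in one go.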
Lately, I've been looking at re. You can do what you want very simply with it:
import re
urltext = '/home/pedro/tmp/some_urls.txt'  # urls mixed with text
with open(urltext, 'r') as infile:
    text = infile.read()
# the thing about urls: they can't/shouldn't contain spaces
# spaces cause problems
# \S matches anything that is not whitespace
e = re.compile(r'(https?://\S+)')  # https? matches http or https
res = e.findall(text)
for r in res:
    print(r)
And you can change the regular expression to specialise it for certain words.
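As a rough sketch of what specialising could look like: the first pattern below only accepts URLs on one host, and the second accepts any URL that contains a chosen word somewhere. The sample string, the variable names, and the choice of "github" and "arxiv" are just assumptions for illustration.

```python
import re

# Hypothetical sample text: a few urls mixed with ordinary words
sample = (
    "see https://github.com/Nike-Inc/koheesio and "
    "https://arxiv.org/abs/2405.20233 or "
    "https://github.com/Jana-Marie/ligra for details"
)

# Only urls whose host is github.com (the dot is escaped so it matches literally)
github_only = re.compile(r'https?://github\.com/\S+')
github_urls = github_only.findall(sample)

# Only urls that contain the word 'arxiv' anywhere after the scheme
with_word = re.compile(r'https?://\S*arxiv\S*')
arxiv_urls = with_word.findall(sample)

print(github_urls)
print(arxiv_urls)
```

Because \S never crosses whitespace, the word has to appear inside the URL itself, not in the text around it.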
(The URLs in my file come from a previous question on getting URLs from a webpage.)
Output for that file:
https://tree-diffusion.github.io/
https://github.com/Nike-Inc/koheesio
https://phys.org/news/2024-05-glimpses-volcanic-world-telescope-images.html
https://chipsandcheese.com/2024/06/03/intels-lion-cove-architecture-preview/
https://www.anandtech.com/show/21425/intel-lunar-lake-architecture-deep-dive-lion-cove-xe2-and-npu4
https://www.merriam-webster.com/wordplay/top-10-rare-and-amusing-insults-vol-2
https://asteriskmag.com/issues/06/how-to-make-a-great-government-website
https://www.ycombinator.com/blog/why-yc-went-to-dc/
https://toaster.llc/photon/
https://samcurry.net/hacking-millions-of-modems
https://arxiv.org/abs/2405.20233
https://physics.stackexchange.com/questions/816698/how-many-photons-are-received-per-bit-transmitted-from-voyager-1
https://www.belfercenter.org/publication/seeing-data-structure
https://kk.org/thetechnium/files/2023/12/howtowalkandtalk.pdf
https://danlark.org/2020/06/14/128-bit-division/
https://rootsofprogress.org/robert-allen-british-industrial-revolution
https://www.youtube.com/watch?v=EKWGGDXe5MA
https://github.com/fiddyschmitt/File-Tunnel
https://spectrum.ieee.org/geothermal-energy-gyrotron-quaise
https://technicalwriting.dev/a11y/skip.html
https://tridao.me/blog/2024/mamba2-part1-model/
https://careers.snapmagic.com/o/technical-project-manager
https://zompist.com/yingzi/yingzi.htm
https://github.com/Dicklesworthstone/grassman_article
https://spritely.institute/news/cirkoban-sokoban-meets-cellular-automata-written-in-scheme.html
https://awesomekling.substack.com/p/forking-ladybird-and-stepping-down-serenityos
https://bgr.com/science/new-theory-suggests-time-is-an-illusion-created-by-quantum-entanglement/
https://github.com/Jana-Marie/ligra
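One caveat with \S+: it takes everything up to the next whitespace, so if a URL sits at the end of a sentence or inside brackets, the trailing punctuation comes along with it. A possible workaround, sketched here with a made-up sample string and an assumed set of characters to trim, is to strip that punctuation after matching.

```python
import re

# Hypothetical sample where urls are followed directly by punctuation
sample = "Read https://toaster.llc/photon/, then (https://arxiv.org/abs/2405.20233)."

raw = re.findall(r'https?://\S+', sample)

# Assumed set of characters that are unlikely to end a url on purpose
trailing = '.,;:!?)]}\'"'
cleaned = [u.rstrip(trailing) for u in raw]

print(raw)
print(cleaned)
```

This is a heuristic: a URL that really does end in one of those characters, such as a closing bracket, would be trimmed too, so adjust the set to suit your text.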