Python Forum
Different Output of findall and search in re module
link = '<a href="">Google</a>'
re.search('<a[^>]+href=["\'](.*?)["\']',link,re.IGNORECASE).group()
This code gives the output '<a href=""'

But the same pattern passed to re.findall() gives the output ['']

Why are the two outputs different? I expected findall() to work like search(), except that findall() gives a list of matches and search() gives only a single match.
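When the pattern contains a capturing group, re.findall returns a list of what that group captured, in this case what's inside group(1) --> (.*?). re.search(...).group() with no argument returns the whole match (group 0), so the two calls are reporting different things. Calling .group(1) on the search result returns the first match of group(1) --> (.*?).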
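import re

link = '''\
 <a href="">Google</a>
 <a href="">Microsoft</a>'''

# findall returns group(1), the href value, for every match
print(re.findall(r'<a[^>]+href=["\'](.*?)["\']', link, re.IGNORECASE))

Output: ['', '']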
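To make the difference easier to see, here is a minimal sketch that runs the same pattern through search(), findall(), and finditer(). The href values are placeholder URLs assumed for illustration (the hrefs in the post are empty), and the variable names are hypothetical.

```python
import re

# Hypothetical input: these href values are placeholders, not from the original post
html = '<a href="https://example.com/a">A</a> <a href="https://example.com/b">B</a>'
pattern = re.compile(r'<a[^>]+href=["\'](.*?)["\']', re.IGNORECASE)

first = pattern.search(html)
whole_match = first.group()   # group(0): the entire matched text
captured = first.group(1)     # group(1): only what (.*?) captured

# findall returns the group(1) value of every match
all_captured = pattern.findall(html)
# finditer yields match objects, so group(0) is still available for every match
all_whole = [m.group() for m in pattern.finditer(html)]

print(whole_match)    # <a href="https://example.com/a"
print(captured)       # https://example.com/a
print(all_captured)   # ['https://example.com/a', 'https://example.com/b']
print(all_whole)      # ['<a href="https://example.com/a"', '<a href="https://example.com/b"']
```

So search().group(1) is the single-match counterpart of findall() here, and finditer() with .group() is the multi-match counterpart of search().group().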
Both solutions can be seen as the wrong approach, because HTML should not be parsed with regex. Read "You can't parse [X]HTML with regex". A parser such as BeautifulSoup is the better tool:
from bs4 import BeautifulSoup

link = '''\
 <a href="">Google</a>
 <a href="">Microsoft</a>'''

soup = BeautifulSoup(link, 'lxml')
for a in soup.find_all('a'):
    # Loop body assumed (it is missing from the post): print each href value
    print(a.get('href'))
