Oct-25-2022, 05:26 PM
I can't translate HTML tags that contain other tags (such as <a href="...">...</a> or <em>...</em>).
In the example below, the paragraph <p class="JAGAAA">...</p> is the problem: it does not get translated. All the other p classes are translated fine. Only this one fails, because it contains <a href="...">...</a> or <em>...</em> tags.
I have tried many things and I don't know why my code isn't working. I don't get any error; this class just isn't translated.
<p class="JAGAAA">Intr-un articol precedent, <a href="https://neculaifantanaru.com/dupa-toate-regulile-artei.html"> <em>Dupa toate regulile artei</em> </a>, v-am povestit despre tanarul Hamlet, care voia sa razbune moartea tatalui sau</p>
This is the relevant part of the code:
import os
from bs4 import BeautifulSoup, NavigableString
import re
import textwrap
from googletrans import Translator
import pprint
...
    with open(f"{base_path}/{file}", "r", encoding='utf8', errors='ignore') as open_file:
        data = open_file.read()
    if data == "":
        print("{} este gol".format(file))  # "este gol" = "is empty"
        continue
    # Serialized copy of the page; translations are applied to it with str.replace
    lxml1 = str(BeautifulSoup(data, 'lxml'))
    #lxml1 = data
    lxml1 = lxml1.replace("\ufeff", " ")
    #lxml1 = lxml1.replace("\n" , " ")
    #lxml1 = re.sub(' +', ' ', lxml1)
    if read_tags:
        soup = BeautifulSoup(data, 'lxml')
        title_tag = soup.find("title")
        ist_p_tag = soup.find("p", class_="text_obisnuit2")
        ist3_p_tag = soup.find("p", class_="JAGAAA")
        second_p_tag = soup.find("p", class_="donoo")
        meta_tag = soup.find("meta")
        if title_tag is None:
            print("Title tag not found")
        else:
            translated_title = translator.translate(title_tag.text, dest=input_lang)
            lxml1 = lxml1.replace(title_tag.text, translated_title.text)
        if meta_tag is None:
            print("meta tag not found")
        else:
            translated_meta = translator.translate(meta_tag["content"], dest=input_lang)
            lxml1 = lxml1.replace(meta_tag["content"], translated_meta.text)
        if ist_p_tag is None:
            print("<p class='text_obisnuit2' /> not found")
        else:
            translated_p = translator.translate(ist_p_tag.text, dest=input_lang)
            lxml1 = lxml1.replace(ist_p_tag.text, translated_p.text)
        if ist3_p_tag is None:
            print("<p class='JAGAAA' /> not found")
        else:
            # This is the replace that silently does nothing for the nested paragraph
            translated_p = translator.translate(ist3_p_tag.text, dest=input_lang)
            lxml1 = lxml1.replace(ist3_p_tag.text, translated_p.text)
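A possible explanation, as a minimal sketch rather than a confirmed diagnosis: `.text` returns the paragraph's text with the inner `<a>` and `<em>` markup stripped, so that string never appears verbatim in the serialized HTML and `str.replace` finds nothing to replace. The sketch below reproduces this on the sample paragraph (shortened) and shows one alternative, translating each text node in place. `fake_translate` is a hypothetical stand-in for `translator.translate(...).text` so the example runs offline, and the `html.parser` builder is used here only to avoid depending on lxml.

```python
from bs4 import BeautifulSoup

html = (
    '<p class="JAGAAA">Intr-un articol precedent, '
    '<a href="https://neculaifantanaru.com/dupa-toate-regulile-artei.html"> '
    '<em>Dupa toate regulile artei</em> </a>, v-am povestit</p>'
)


def fake_translate(text):
    # Hypothetical stand-in for translator.translate(text, dest=...).text;
    # it only upper-cases the text so the sketch needs no network access.
    return text.upper()


soup = BeautifulSoup(html, "html.parser")
p_tag = soup.find("p", class_="JAGAAA")

# .text concatenates all descendant text and drops the <a>/<em> markup,
# so it is not a substring of the HTML and replace() leaves it unchanged.
text_found = p_tag.text in html
unchanged = html.replace(p_tag.text, fake_translate(p_tag.text)) == html

# Alternative: replace each text node inside the paragraph, keeping the tags.
for node in p_tag.find_all(string=True):
    if node.strip():  # skip whitespace-only nodes
        node.replace_with(fake_translate(node))

result = str(p_tag)
print(text_found, unchanged)
print(result)
```

With this approach the result would be taken from `str(soup)` instead of patching `lxml1` with `replace`. One trade-off to keep in mind: each fragment is translated on its own, so the translator sees less sentence context than it would for the whole paragraph.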