Updating column name with translation
Python Forum (https://python-forum.io), Forum: Python Coding (https://python-forum.io/forum-7.html), Forum: General Coding Help (https://python-forum.io/forum-8.html), Thread: Updating column name with translation (/thread-43229.html)
Updating column name with translation - bobbydave - Sep-17-2024

I have a column of countries. However, mixed into this country column are Spanish country names. I have found the Spanish names and translated them to English, but now I want to either create another column that shows the translated name next to the original, or just replace the entry.

Quote: Original column of countries

I have my translated list:

Quote: Mexico

Now I want to add another column and, wherever I have found an English translation, insert it there (or, if that fails, just update the original column with the translation), so that I end up with a list of all-English country names.

```python
import pycountry
import pandas as pd
from pathlib import Path
from langdetect import detect
from googletrans import Translator

FILENAME = r"POS.xlsx"
COUNTRYNAME = 'Country'

df = pd.read_excel(FILENAME)


def all_names() -> set[str]:
    # all Country objects have a "name" attribute
    return {country.name for country in pycountry.countries}  # type: ignore


def all_official_names() -> set[str]:
    s: set[str] = set()
    for country in pycountry.countries:
        # not all Country objects have an "official_name" attribute
        try:
            s.add(country.official_name)  # type: ignore
        except AttributeError:
            pass
    return s


def get_df_countries(filename: Path) -> set[str]:
    # construct a set because country names may be duplicated in the spreadsheet column
    # this potentially improves runtime performance when parsing the country names later
    return set(pd.read_excel(filename)[COUNTRYNAME])


# Function to detect language of a word
def detect_language(word):
    try:
        return detect(word)
    except Exception:
        return 'unknown'


translator = Translator()

if __name__ == "__main__":
    names = all_names() | all_official_names()
    for country in get_df_countries(Path(FILENAME)):
        status = "valid" if country in names else "invalid"
        if status == "invalid":
            # print(f"{country} is {status}")
            # Translate country back into English
            translated_country = translator.translate(country).text
            print(translated_country)
```

Any ideas how I can show the new translations in place of the Spanish names while still showing all the other countries that are already in English?
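
One possible way to get there, as a minimal sketch rather than a definitive answer: collect the translations into a dict keyed by the original value, then look every row up in that dict and fall back to the original name when there is no entry. The helper names `build_translation_map` and `english_names`, the sample values, and the stand-in `fake_translations` dict are all hypothetical; in the real script the `translate` argument would be whatever wraps the translator call from the post.

```python
from typing import Callable, Iterable


def build_translation_map(
    countries: Iterable[object],
    valid_names: set[str],
    translate: Callable[[str], str],
) -> dict[str, str]:
    """Return {original: translated} for every name not found in valid_names.

    Hypothetical helper: `translate` is any callable that turns one name
    into its English form (assumption: it stands in for the translator).
    """
    mapping: dict[str, str] = {}
    for name in set(countries):  # de-duplicate so each name is translated once
        if not isinstance(name, str):
            continue  # skip empty cells (NaN / None)
        if name not in valid_names:
            mapping[name] = translate(name)
    return mapping


def english_names(countries: Iterable[object], mapping: dict[str, str]) -> list[object]:
    """Replace translated names and keep everything else unchanged."""
    return [mapping.get(name, name) for name in countries]


# Made-up sample data standing in for the spreadsheet column and the translator.
valid = {"Mexico", "Spain", "Germany"}
fake_translations = {"México": "Mexico", "España": "Spain", "Alemania": "Germany"}
column = ["Mexico", "México", "España", "Germany", "Alemania", "México"]

mapping = build_translation_map(column, valid, fake_translations.__getitem__)
print(english_names(column, mapping))
# ['Mexico', 'Mexico', 'Spain', 'Germany', 'Germany', 'Mexico']
```

With the DataFrame from the post, the same lookup could go into a new column (the column name `Country_EN` is made up here) with `df["Country_EN"] = df[COUNTRYNAME].map(lambda c: mapping.get(c, c))`, or overwrite the original with `df[COUNTRYNAME] = df[COUNTRYNAME].replace(mapping)`. Building the dict first means each distinct name is sent to the translator only once, however often it appears in the column.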