Python Forum
URL DECODING
This script reads a text file containing an HTML table, uses pandas to parse that table into a dataframe, and then writes the dataframe as CSV to another text file.

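import pandas as pd

# Read the HTML file that contains the table
with open('input_table.html', 'r') as f:
    html = f.read()

# read_html returns a list of dataframes, one per table found
df = pd.read_html(html)
# Write the first table as CSV, without the index or header row
df[0].to_csv('output.csv', index=False, header=False)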
The problem is that some cells of the table, seemingly at random, contain URL-encoded HTML. For example, <h1>My Heading</h1> appears as %3Ch1%3EMy%20Heading%3C%2Fh1%3E

I've tried

import pandas as pd
import urllib

f=open('input_table.html','r')
html = f.read()
html=urllib.unquote(html).decode('utf8')
df = pd.read_html(html)
df[0].to_csv('output.csv',index=False,header=False)
But that produces scrambled output in the CSV.
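
One possible cause of the scrambling: decoding the whole file before parsing turns the encoded cell contents back into real tags, which the HTML parser may then treat as part of the page markup. A minimal sketch of the opposite order (parse first, then decode each cell) follows. It assumes Python 3, where unquote lives in urllib.parse, and the helper names decode_cell, decode_table and html_table_to_csv are made up for illustration; they are not part of pandas.

```python
from io import StringIO
from urllib.parse import unquote

import pandas as pd


def decode_cell(value):
    """Percent-decode one cell; leave non-string values (NaN, numbers) alone."""
    return unquote(value) if isinstance(value, str) else value


def decode_table(df):
    """Return a new dataframe with every string cell percent-decoded."""
    return df.apply(lambda column: column.map(decode_cell))


def html_table_to_csv(html_path, csv_path):
    # Parse the HTML while the cell contents are still percent-encoded,
    # so the encoded markup cannot be mistaken for real table markup.
    with open(html_path, encoding="utf-8") as f:
        tables = pd.read_html(StringIO(f.read()))
    # Decode only after parsing, then write the first table as before.
    decode_table(tables[0]).to_csv(csv_path, index=False, header=False)
```

With the file names from the post, this would be called as html_table_to_csv('input_table.html', 'output.csv'). Note that unquote leaves "+" untouched; if the cells use "+" for spaces, urllib.parse.unquote_plus may be the better fit.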
Messages In This Thread
URL DECODING - by UnionSystems - Jan-01-2019, 11:04 PM
RE: URL DECODING - by snippsat - Jan-02-2019, 12:56 AM
RE: URL DECODING - by UnionSystems - Jan-02-2019, 01:36 AM
RE: URL DECODING - by snippsat - Jan-02-2019, 01:49 AM
RE: URL DECODING - by UnionSystems - Jan-02-2019, 03:55 AM
RE: URL DECODING - by UnionSystems - Jan-02-2019, 05:28 PM
