URL DECODING

UnionSystems
Jan-01-2019, 11:04 PM

This script reads a text file containing an HTML table, uses pandas to parse that table into a DataFrame, then writes that DataFrame as CSV to another text file.

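import pandas as pd

# Read the whole HTML file; the with-block closes the file afterwards
with open('input_table.html', 'r') as f:
    html = f.read()
# read_html returns a list of DataFrames, one per table found
df = pd.read_html(html)
df[0].to_csv('output.csv', index=False, header=False)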

The problem is that some cells of the table, seemingly at random, contain URL-encoded HTML. For example, <h1>My Heading</h1> appears as %3Ch1%3EMy%20Heading%3C%2Fh1%3E
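For illustration, the encoding in that example can be reversed with the standard library. This is a minimal sketch assuming Python 3, where the function lives in urllib.parse; the variable names are only for the example.

```python
from urllib.parse import unquote

# The encoded cell value quoted in the post above
encoded = '%3Ch1%3EMy%20Heading%3C%2Fh1%3E'

# unquote replaces each %XX escape with the character it stands for
decoded = unquote(encoded)
print(decoded)
```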

I've tried:

import pandas as pd
import urllib

with open('input_table.html', 'r') as f:
    html = f.read()
# Python 2 call: URL-decode the entire document before parsing it
html = urllib.unquote(html).decode('utf8')
df = pd.read_html(html)
df[0].to_csv('output.csv', index=False, header=False)

But that produces scrambled results in the CSV.
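One likely cause, offered as an assumption rather than a diagnosis: decoding the whole document first turns the encoded cell contents into real tags, which the HTML parser then treats as markup instead of cell text. A possible alternative is to parse the table first and URL-decode each cell afterwards. The sketch below assumes Python 3 and an installed HTML parser for pandas (such as lxml); decode_cell, decode_frame and convert_file are hypothetical helper names, not part of pandas.

```python
from io import StringIO
from urllib.parse import unquote

import pandas as pd


def decode_cell(value):
    """URL-decode string cells; leave numbers and NaN untouched."""
    return unquote(value) if isinstance(value, str) else value


def decode_frame(frame):
    """Return a copy of frame with every string cell URL-decoded."""
    # Series.map applies decode_cell to each value of a column
    return frame.apply(lambda column: column.map(decode_cell))


def convert_file(src, dst):
    """Parse the first table in src, decode its cells, write CSV to dst."""
    with open(src, 'r', encoding='utf8') as f:
        html = f.read()
    # Parse while the encoded cells are still plain text
    table = pd.read_html(StringIO(html))[0]
    decode_frame(table).to_csv(dst, index=False, header=False)
```

With the file names from the post, the call would be convert_file('input_table.html', 'output.csv'). Decoding after parsing keeps the table structure exactly as the parser saw it, so only the cell text changes.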
