Python Forum
Comparing 2 Files - Step 1, import and remove tags
#1
I have a project I completed (mostly) in VBA, but I don't think it's great for larger data sets. I'm thinking that maybe I should try to use Python for the 'engine' and just keep the VBA side for the UI and distribution. However, as seems to be the case with everything I try in Python, I just can't make it work. Maybe if I can get past the first step, I'll be able to move forward on my own. So if any of you can assist, I'd certainly appreciate it.

For the first step, all I'm trying to do is import 2 Word documents and remove the HTML/XML tags. I've tried https://www.tutorialspoint.com/python/py...cument.htm but can't get past PIP INSTALL DOCX. I've tried Beautiful Soup 4 but get errors like " looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup." Then '"%s" looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup.' % markup) and on and on it goes. I've been through at least half a dozen "this is easy and should work" suggestions.

All I want to do is open two files and remove the HTML/XML. Surely there has to be something out there that I can just plug the file paths and names into and see the results?

Thanks for any assistance.
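
One possible starting point, offered as a sketch rather than a definitive answer: a .docx file is a ZIP archive whose main text lives in word/document.xml, so the standard library alone can pull the text out with no pip install. The Beautiful Soup message quoted above is the warning it gives when it is handed a file name instead of the file's contents, so the HTML helper below takes the markup text itself. The function names docx_to_text and strip_tags, and the file names in the usage comments, are made up for illustration. If you would rather use the third-party library, note that it is installed with "pip install python-docx" and imported as "docx".

```python
import zipfile
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# XML namespace used by WordprocessingML elements inside a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(path):
    """Return the text of a .docx file, one line per paragraph.

    Simplification (assumption): headers, footers, footnotes and
    paragraphs nested inside text boxes are not handled specially.
    """
    # A .docx is a ZIP archive; the body text is in word/document.xml.
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Each <w:t> element holds one run of visible text.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


class _TextOnly(HTMLParser):
    """Collect only the text between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(markup):
    """Return markup (a string of HTML, not a file name) with tags removed."""
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


# Usage (hypothetical file names):
#   text_a = docx_to_text("first.docx")
#   text_b = docx_to_text("second.docx")
#   with open("page.html", encoding="utf-8") as handle:
#       plain = strip_tags(handle.read())
```

With both documents reduced to plain strings, the comparison step can work on text_a and text_b directly.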


Messages In This Thread
Comparing 2 Files - Step 1, import and remove tags - by JP_ROMANO - Oct-03-2018, 01:16 AM
