Python Forum
Copy xml content from webpage and save to locally without special characters
#8
Your code works absolutely fine with the w3schools link.

But for my .xml, only the first line is read, and I get the warning below:

MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.
  soup = BeautifulSoup(response.content, 'xml')
<?xml version="1.0" encoding="utf-8"?>


A few points:

1. What is markup? How do I open the file and resolve this warning?
2. Am I not pointing to the root element? When I compare 'w3schools.xml' with my XML, the first line differs: in my XML the declaration is followed on the same line by a root element with a namespace (<?xml version='1.0' encoding='UTF-8'?><IM415 xmlns="http://www.ros.ie/schemas/customs/IM415">). See below.
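
To make both questions concrete, here is a minimal sketch using only the standard library (`xml.etree.ElementTree` is swapped in for Beautiful Soup purely for illustration; this is not my actual script). As far as I understand it, "markup" is the tag-based text itself, as opposed to a file name or URL pointing to it. The helper `looks_like_markup` is a made-up rough check, and the sample is a shortened version of my XML with closing tags added only so that it parses.

```python
import xml.etree.ElementTree as ET

# Shortened sample based on the XML in this post. The closing tags are an
# assumption added so the snippet parses; the real file is longer.
SAMPLE = (
    "<?xml version='1.0' encoding='UTF-8'?>"
    '<IM415 xmlns="http://www.ros.ie/schemas/customs/IM415">'
    "<Declaration><MsgType>H1</MsgType></Declaration>"
    "</IM415>"
).encode("utf-8")

# Prefix-to-namespace mapping used in the find() call below.
NS = {"im": "http://www.ros.ie/schemas/customs/IM415"}


def looks_like_markup(data: bytes) -> bool:
    """Rough, hypothetical check: XML text starts with '<' after whitespace."""
    return data.lstrip().startswith(b"<")


root = ET.fromstring(SAMPLE)

print(looks_like_markup(SAMPLE))       # True
print(looks_like_markup(b"test.xml"))  # False: a file name, not markup
print(root.tag)  # {http://www.ros.ie/schemas/customs/IM415}IM415
print(root.find("im:Declaration/im:MsgType", NS).text)  # H1
```

So the root element here is IM415 and the declaration line is not an element at all. Because of the default namespace, child elements have to be looked up with the namespace mapping.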

Here's my code:
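import requests
from bs4 import BeautifulSoup

# driver is the browser driver created earlier (not shown here)
response = requests.get(driver.current_url)
soup = BeautifulSoup(response.content, 'xml')
print(soup)

# Save to disk
with open('test.xml', 'w') as fp:
    fp.write(soup.prettify())

My XML looks like below: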

<?xml version='1.0' encoding='UTF-8'?><IM415 xmlns="http://www.ros.ie/schemas/customs/IM415">
<Declaration>
<MsgType>H1</MsgType>
<DeclarationType_1_1>IM</DeclarationType_1_1>
<AdditionalDeclarationType_1_2>A</AdditionalDeclarationType_1_2>
<LRN_2_5>NIK243104_16nUlp</LRN_2_5>
<ValuationInformation>
<InvoiceCurrency_4_10>AFA</InvoiceCurrency_4_10>
<InvoiceAmount_4_11>5000</InvoiceAmount_4_11>
<InternalCurrency_4_12>AFA</InternalCurrency_4_12>
</ValuationInformation>
<GoodsInformation>
<GrossMass_6_5>33300</GrossMass_6_5>
<TotalPackageNumber_6_18>1665</TotalPackageNumber_6_18>
</GoodsInformation>


I'm expecting the below to be printed and copied to my test.xml:
<?xml version='1.0' encoding='UTF-8'?><IM415 xmlns="http://www.ros.ie/schemas/customs/IM415">
<Declaration>
<MsgType>H1</MsgType>
<DeclarationType_1_1>IM</DeclarationType_1_1>
<AdditionalDeclarationType_1_2>A</AdditionalDeclarationType_1_2>
<LRN_2_5>NIK243104_16nUlp</LRN_2_5>
<ValuationInformation>
<InvoiceCurrency_4_10>AFA</InvoiceCurrency_4_10>
<InvoiceAmount_4_11>5000</InvoiceAmount_4_11>
<InternalCurrency_4_12>AFA</InternalCurrency_4_12>
</ValuationInformation>
<GoodsInformation>
<GrossMass_6_5>33300</GrossMass_6_5>
<TotalPackageNumber_6_18>1665</TotalPackageNumber_6_18>
</GoodsInformation>
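
What I am after could be sketched roughly as below: keep the bytes exactly as they were downloaded and write them in binary mode, so nothing gets re-encoded or re-indented on the way to disk. The function name `save_xml_bytes` and the sample payload are made up for illustration; in my script the payload would be `response.content`. The parse step is there only as a sanity check and uses the standard library rather than Beautiful Soup.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def save_xml_bytes(payload: bytes, path: Path) -> None:
    """Write payload to path unchanged, but only if it parses as XML."""
    # ET.fromstring raises ET.ParseError when the input is not well-formed
    # XML, for example a plain-text error message.
    ET.fromstring(payload)
    path.write_bytes(payload)


# Hypothetical payload standing in for response.content.
payload = (
    b"<?xml version='1.0' encoding='UTF-8'?>"
    b'<IM415 xmlns="http://www.ros.ie/schemas/customs/IM415">'
    b"<Declaration><MsgType>H1</MsgType></Declaration></IM415>"
)

target = Path(tempfile.mkdtemp()) / "test.xml"
save_xml_bytes(payload, target)
print(target.read_bytes() == payload)  # True
```

If the sanity check fails, that would suggest the request did not return the XML document in the first place.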


(Mar-21-2024, 10:13 PM)snippsat Wrote:
(Mar-21-2024, 06:27 PM)Nik1811 Wrote: I get ''https://www.********/aep2/xml/sad/NIK243179-AI.xml as my generated xml
Post your code. What are you generating this .xml from?
Then you just need to save the content of the .xml, and not use a URL as in my demos.
If you run my code (no changes), does that work?


Messages In This Thread
RE: Copy xml content from webpage and save to locally without special characters - by Nik1811 - Mar-22-2024, 11:41 AM
