Python Forum
[SOLVED] Sub string not found in string ?
#1
#!/usr/bin/python

# Creating an output file in writing mode
output_file = open("newfile.txt", "w")

# write 3 header records
output_file.write('<?xml version="1.0" encoding="utf-8"?>\n')
output_file.write("<!DOCTYPE KMYMONEY-FILE>\n")
output_file.write("<KMYMONEY-FILE>\n")

write_flag = 0

# Open the file in read mode
with open('Australian-2024-11-30.xml', 'r') as file:
    # Read each line in the file
    for line in file:
        string = line
        sub_str1 = "<TRANSACTIONS"
        sub_str2 = " <SCHEDULES count"

        if sub_str1 in string:
            print("YES")
            write_flag = 1      #commence writing to newfile.txt
        elif sub_str2 in string:
            write_flag = 0      #stop writing when this string found
            print("schedules found")

        if write_flag:
            output_file.write(file.read())

# Close the output file
output_file.close()
The output has the "<TRANSACTIONS" tag and all its children, BUT it also has the "<SCHEDULES" tag, plus all the data after that. The variable "write_flag" is not being turned off, despite the fact that the "schedules" tag is present. Why?

In the data, there is only one occurrence each of "sub_str1" and "sub_str2". So the writes to the output should get turned ON at sub_str1 and then turned OFF at sub_str2. But once that flag is on, it stays on, which suggests that the

elif sub_str2 in string:
is not being tested, or is being tested but returns False.
#2
Your code does not look for "<SCHEDULES"; it looks for " <SCHEDULES count", with a leading blank. Maybe remove the leading blank and "count" from sub_str2.
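A quick sketch of why the leading blank matters: `in` is an exact substring test, so the longer search string only matches lines that contain that exact space. The sample lines below are made up for illustration; the real file's indentation is not shown in this thread.

```python
# Hypothetical sample lines (no indent, one space, one tab).
sample_lines = [
    '<SCHEDULES count="2">',
    ' <SCHEDULES count="2">',
    '\t<SCHEDULES count="2">',
]

# The original search string, including the leading blank and "count".
strict = [" <SCHEDULES count" in line for line in sample_lines]

# The shorter search string without the leading blank.
loose = ["<SCHEDULES" in line for line in sample_lines]

print(strict)
print(loose)
```

Only the line indented with a single space matches the original string, while the shorter string matches all three.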

But the real problem is using read(). output_file.write(file.read()) is the last statement executed in the loop. It reads the remainder of the file and writes all of it to the output file. That also moves the file pointer to the end of the file, which ends the loop, so the "<SCHEDULES" line is never tested. I think you might want to do this:
with open("input.txt", "r") as file, open("output.txt", "w") as output_file:
    writing = False
    for line in file:
        if "<TRANSACTIONS" in line:
            writing = True
        elif "<SCHEDULES" in line:
            writing = False
        elif writing:
            output_file.write(line)
When I run it using this as the input.txt file:
A
<TRANSACTIONS
C
D
<SCHEDULES
F
I get this in the output.txt file:
C
D
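To see the read() behaviour in isolation, here is a minimal sketch that uses io.StringIO as a stand-in for the real file. The function name broken_copy and the sample text (one item per line) are assumptions for illustration only.

```python
import io

# Stand-in for the input file, assuming one item per line.
SAMPLE = "A\n<TRANSACTIONS\nC\nD\n<SCHEDULES\nF\n"


def broken_copy(source):
    """Mimic the original loop: call read() once the start tag is seen."""
    chunks = []
    lines_seen = 0
    for line in source:
        lines_seen += 1
        if "<TRANSACTIONS" in line:
            # read() returns everything after the current line and
            # leaves the position at end of file, so the loop stops here.
            chunks.append(source.read())
    return "".join(chunks), lines_seen


copied, lines_seen = broken_copy(io.StringIO(SAMPLE))
print(repr(copied))
print(lines_seen)
```

The loop only ever sees two lines ("A" and "<TRANSACTIONS"); everything else, including "<SCHEDULES" and "F", is copied by the single read() call.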
#3
Thanks @deanhystad, that code works just fine. I only needed to add a few extra lines, because BeautifulSoup requires the XML header records. I have used the output file as input to other Python code, and the accounts now balance. They didn't before, because the 'transactions' within schedules were altering the totals.

#!/usr/bin/python

# Re-write the XML file - issues with BeautifulSoup finding "TRANSACTIONS" within schedules

with open("Australian-2024-11-30.xml", "r") as file, open("output.txt", "w") as output_file:

    # write 3 header records, otherwise BeautifulSoup doesn't recognise the output file as XML
    output_file.write('<?xml version="1.0" encoding="utf-8"?>\n')
    output_file.write("<!DOCTYPE KMYMONEY-FILE>\n")
    output_file.write("<KMYMONEY-FILE>\n")
    writing = False

    for line in file:
        if "<TRANSACTIONS" in line:     # required
            writing = True
        elif "<SCHEDULES" in line:      # not required
            writing = False
        elif writing:
            output_file.write(line)
#4
Quote: 'transactions' within schedules was altering totals
I think an XML parser would be a better choice for filtering out scheduled transactions.
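As a rough sketch of what that could look like with the standard library's xml.etree.ElementTree: select only the TRANSACTION elements whose path from the root goes through TRANSACTIONS, so anything nested under SCHEDULES is skipped. The sample document, the SCHEDULED_TX element name, and the id attributes are assumptions about the file layout, not taken from the real data.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the real file.
SAMPLE_XML = """<KMYMONEY-FILE>
  <TRANSACTIONS count="2">
    <TRANSACTION id="T1"/>
    <TRANSACTION id="T2"/>
  </TRANSACTIONS>
  <SCHEDULES count="1">
    <SCHEDULED_TX id="S1">
      <TRANSACTION id="T3"/>
    </SCHEDULED_TX>
  </SCHEDULES>
</KMYMONEY-FILE>"""


def ledger_transactions(xml_text):
    """Return TRANSACTION elements that are direct children of the
    top-level TRANSACTIONS element, ignoring scheduled ones."""
    root = ET.fromstring(xml_text)
    # The path is anchored at the root, so the TRANSACTION nested
    # under SCHEDULES/SCHEDULED_TX is not selected.
    return root.findall("./TRANSACTIONS/TRANSACTION")


ledger_ids = [tx.get("id") for tx in ledger_transactions(SAMPLE_XML)]
print(ledger_ids)
```

Anchoring the path at the root avoids having to walk up to each element's parents, at the cost of depending on the exact nesting of the file.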
#5
(Dec-03-2024, 03:55 PM)deanhystad Wrote: I think an xml parser would be a better choice for filtering out scheduled transactions.

Using a parser for this part of the project was the reason I needed to rewrite the file. The problem was a limiting one: to 'filter' effectively, I needed to 'chase' the parents, but the parent levels in the two sets of data were very different. The KIS (keep it simple) method was to first rewrite the file as per the code above, and then use BeautifulSoup on the second pass.