Python Forum
exception during iteration loop
#1
i open a file for reading and read it like:
i=open(ifn)
for line in i:
    ...
    ...
it reads over 352000 lines, then raises a UnicodeDecodeError. i just want to skip the offending line. if it were some statement in the loop body, i would wrap it in try: and use except: pass. but this is the loop control itself. how can i skip the exception there?
#2
i = open(ifn, errors='replace') # this replaces each undecodable byte with U+FFFD, the Unicode replacement character
More here: https://docs.python.org/3/library/functions.html#open

You may prefer 'backslashreplace', I presume. ;)
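To make the difference between the error handlers concrete, here is a small sketch that decodes the same bytes three ways. The sample bytes are made up for illustration (UTF-8 text with one stray Latin-1 byte); `open()` passes the `errors` argument on to the same decoding machinery, so the results should match what you see when reading a file.

```python
# Hypothetical sample: ASCII text with one stray Latin-1 byte (0xE9) that is not valid UTF-8.
raw = b"caf\xe9 au lait\n"

# 'replace' substitutes U+FFFD for the bad byte.
replaced = raw.decode("utf-8", errors="replace")

# 'backslashreplace' keeps the byte value visible as an escape sequence.
escaped = raw.decode("utf-8", errors="backslashreplace")

# 'ignore' silently drops the bad byte and keeps the rest of the line.
ignored = raw.decode("utf-8", errors="ignore")

print(repr(replaced))
print(repr(escaped))
print(repr(ignored))
```

'backslashreplace' is handy when you want to find the damaged lines afterwards, since the escape shows exactly which byte was wrong.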
#3
Either replace as wavic suggests, or use the proper codec.

You can use chardet (https://github.com/chardet/chardet) to detect the proper file codec. It gets it right most of the time.
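If installing a third-party package is not an option, a rough standard-library stand-in is to try a short list of candidate codecs in order. This is not what chardet does (chardet guesses statistically); it is only a sketch, and the function name `first_working_encoding` and the candidate list are assumptions made for the example.

```python
def first_working_encoding(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate codec that decodes data without error, else None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this codec rejects the bytes, try the next one
        return name
    return None


# Hypothetical sample bytes: 0xE9 is invalid as UTF-8 here but valid in cp1252.
sample = b"caf\xe9\n"
print(first_working_encoding(sample))
```

Keep in mind that latin-1 accepts every possible byte, so it works as a last-resort fallback rather than as real detection. A codec that decodes without error is not necessarily the right one.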
#4
(Oct-23-2018, 07:37 AM)wavic Wrote:
i = open(ifn, errors='replace') # this replaces each undecodable byte with U+FFFD, the Unicode replacement character
More here: https://docs.python.org/3/library/functions.html#open

You may prefer 'backslashreplace', I presume. ;)
'backslashreplace' worked. now i want to add some code to detect those backslashes to skip those lines. i suspect the file is not properly encoded in UTF-8.

(Oct-23-2018, 08:10 AM)Larz60+ Wrote: either replace as wavic suggests, or use proper codec

You can use: https://github.com/chardet/chardet
to detect (most of the time) the proper file codec
it is supposed to be encoded in UTF-8. apparently it isn't. i just want to skip the lines that are not valid UTF-8.
#5
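Quote: i just want to skip the lines that are not valid UTF-8.
You can change how decoding errors are handled. This should work, although note that it drops only the undecodable bytes and keeps the rest of each line: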
i = open(ifn, encoding="utf-8", errors="ignore")
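If the goal is to skip whole lines rather than individual bytes, one possible approach is to open the file in binary mode, so the loop itself never decodes anything, and decode each line inside a try block. This is a minimal sketch; the helper name `valid_utf8_lines` and the temporary demo file are assumptions for illustration.

```python
import os
import tempfile


def valid_utf8_lines(path):
    """Yield only the lines of the file at path that decode cleanly as UTF-8."""
    with open(path, "rb") as f:      # binary mode: iterating does no decoding
        for raw in f:                # lines are split on b"\n"
            try:
                yield raw.decode("utf-8")
            except UnicodeDecodeError:
                continue             # skip the whole offending line


# Demo with a throwaway file containing one invalid line (0xFF is never valid UTF-8).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"good line\nbad \xff line\nanother good line\n")

kept = list(valid_utf8_lines(demo_path))
os.remove(demo_path)
print(kept)
```

Because the file is read in binary mode, no newline translation happens, so a file with Windows line endings would keep its "\r\n" in each decoded line.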
#6
You could also write a wrapper class that implements the iterator protocol that just ignores any error except StopIteration. That's probably the dirty ugly way to do it, though.

class LineSkipper:
    def __init__(self, iterable):
        self.iterable = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                return next(self.iterable)
            except StopIteration:
                # re-raise the stopiteration, so the caller knows we've reached the end of the iterable
                raise
            except Exception:
                # ignore any error reading the line and skip it entirely
                # (Exception rather than a bare except, so Ctrl-C still works)
                pass

with open("spam.txt") as f:
    for line in LineSkipper(f):
        print(line.strip())
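The same idea can be written more compactly as a generator function. This is only a sketch of an alternative to the class above: `skip_errors` and the `Flaky` test iterator are hypothetical names invented for the example. Whether a text-mode file object resumes cleanly at the next line after a decode error is something worth testing on real data before relying on either version.

```python
def skip_errors(iterable):
    """Yield items from iterable, skipping any item whose retrieval raises an error."""
    it = iter(iterable)
    while True:
        try:
            item = next(it)
        except StopIteration:
            return          # the underlying iterator is exhausted
        except Exception:
            continue        # skip this item and try the next one
        yield item


class Flaky:
    """Hypothetical iterator over range(n) that raises ValueError on every odd value."""

    def __init__(self, n):
        self.n = n
        self.i = -1

    def __iter__(self):
        return self

    def __next__(self):
        self.i += 1
        if self.i >= self.n:
            raise StopIteration
        if self.i % 2:
            raise ValueError(self.i)
        return self.i


result = list(skip_errors(Flaky(6)))
print(result)
```

As with the class version, an iterator that keeps raising the same error forever would make this loop spin without end, so it is best used only where errors are tied to individual items.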
#7
i just want to keep this simple. the file is a list of every file (full path) that could be installed for every package in the repositories i have configured for my ubuntu system along with the package name it comes in. i populated a database with it so i can search by file name.