Python Forum
Searching a text file to find words matching a pattern
#3
(Nov-07-2017, 06:37 PM)Micael Wrote: It's in Swedish so there is of course åäö in the file and that's a problem as well.
A lot has changed regarding Unicode; it was one of the biggest changes in the move to Python 3 (as mentioned by @heiner55, you should use Python 3).
In Python 3, open() has a built-in encoding parameter.
So the simple rule is to keep it UTF-8 in and out when reading or writing a file.
Inside Python 3, all strings are sequences of Unicode characters; data that has not been decoded, or whose encoding Python 3 does not know, stays as bytes (b'hello').
Python 3 will not guess the encoding the way Python 2 does.
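To make the str/bytes split concrete, here is a minimal Python 3 sketch. The word "smörgås" is just an illustrative sample, not something from the original file.

```python
# str holds Unicode characters; bytes holds the encoded form.
text = "smörgås"               # sample word, chosen only for illustration
data = text.encode("utf-8")    # explicit encode: str -> bytes

print(type(text).__name__, len(text))   # counts characters
print(type(data).__name__, len(data))   # counts bytes (ö and å take 2 bytes each in UTF-8)

# Decoding with the right encoding gives back the original string.
round_trip = data.decode("utf-8")

# Decoding with the wrong encoding fails loudly instead of guessing.
try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
print("ASCII decode failed:", ascii_failed)
```

This is the same thing open(..., encoding='utf-8') does for you: it decodes the bytes on disk into str as you read.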

So, borrowing the code from @heiner55, it looks like this:
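import re

# Decode the file as UTF-8 so that å, ä and ö arrive as single characters.
with open('ss.txt', encoding='utf-8') as f:
    for line in f:
        line = line.strip()  # drop the trailing newline and surrounding whitespace
        # re.match only anchors at the start, so the length check rejects longer words.
        if re.match(r"h...g..", line) and len(line)==7:
            print(line)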
There is no need for # -*- coding: utf-8 -*- in Python 3, because UTF-8 is the default source encoding.
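As a variation for Python 3.4+, re.fullmatch() can replace the re.match() plus len() combination, since it only succeeds when the whole string matches. The sketch below is one possible way to write it; the helper name matching_words and the sample words are my own assumptions, and a list stands in for the file so it runs on its own.

```python
import re

# Same pattern as above: h, any 3 characters, g, any 2 characters.
PATTERN = re.compile(r"h...g..")

def matching_words(lines, pattern=PATTERN):
    """Yield stripped lines that the pattern matches in full (hypothetical helper)."""
    for line in lines:
        word = line.strip()
        if pattern.fullmatch(word):   # whole word must match, so no len() check
            yield word

# Stand-in for the lines of a UTF-8 text file; sample words are made up.
sample = ["höstgul\n", "havsgud\n", "hundgård\n", "smörgås\n"]
found = list(matching_words(sample))
print(found)
```

With a real file, you would pass the file object instead: matching_words(open('ss.txt', encoding='utf-8')), ideally inside a with block.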

In Python 2 it would look like this; the same rule applies: UTF-8 in and out.
But you have to use a library, io or codecs, and add # -*- coding: utf-8 -*- because Python 2 has ASCII as its default encoding.
# -*- coding: utf-8 -*-
import re
import io

with io.open('ss.txt', encoding='utf-8') as f:
    for line in f:
        line = line.strip()
        if re.match(r"h...g..", line) and len(line)==7:
            print(line)


Messages In This Thread
RE: Searching a text file to find words matching a pattern - by snippsat - Nov-07-2017, 07:28 PM
