Python Forum
The number of occurrences of statistical characters
#1
Can anyone help me resolve this problem? Thanks.

# -*- coding:utf-8 -*-
# python 3.x
import re

patter = [chr(i) for i in range(33,126)]

with open("a.txt","r") as file:
    content = file.read()
    for i in patter:
        result = len(re.findall(r"[%s]" % i,content))
        if result != 0:
            print("%s:%d" % (i, result))
Error:
Traceback (most recent call last):
  File "D:\robot\desk\Script\login_Discuz.py", line 13, in <module>
    result = len(re.findall(r"[%s]" % i,content))
  File "D:\Python\Python36-32\lib\re.py", line 222, in findall
    return _compile(pattern, flags).findall(string)
  File "D:\Python\Python36-32\lib\re.py", line 301, in _compile
    p = sre_compile.compile(pattern, flags)
  File "D:\Python\Python36-32\lib\sre_compile.py", line 562, in compile
    p = sre_parse.parse(p, flags)
  File "D:\Python\Python36-32\lib\sre_parse.py", line 855, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "D:\Python\Python36-32\lib\sre_parse.py", line 416, in _parse_sub
    not nested and not items))
  File "D:\Python\Python36-32\lib\sre_parse.py", line 523, in _parse
    source.tell() - here)
sre_constants.error: unterminated character set at position 0
#2
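Some characters are special in a regular expression and must be escaped if you want to match them literally. The error is raised for \, which is the regex escape character: in [\] it escapes the closing bracket, so the character set is never terminated and the pattern is invalid. ^ fails too, because as the first character it negates the set and the ] that follows is then taken literally.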

import re, sre_constants
 
patter = [chr(i) for i in range(33,126)]
 
with open("a.txt","r") as file:
    content = file.read()
    for i in patter:
        try:
            result = len(re.findall(r"[%s]" % i,content))
        except sre_constants.error:
            print('error with {}'.format(i))
Output:
error with \
error with ^
Change the body of the for loop like this:
        try:
            result = len(re.findall(r"[%s]" % i,content))
        except sre_constants.error:
            result = len(re.findall(r"[\%s]" % i,content))
        if result != 0:
            print("%s:%d" % (i, result))  
and it works.

That said, note that outside a character set you also need to escape chars like *, ? or . in order to search for them literally; inside [...] they are already literal. I will leave this to you.
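
To see where escaping actually matters for *, ? and ., here is a minimal sketch. It assumes Python 3's re module, and the sample string is made up for illustration, not taken from a.txt. It counts each of the three characters inside a character set, then compares an unescaped and an escaped dot outside of one.

```python
import re

sample = "a*b?c.d"  # made-up sample text, not from the thread

# Inside a character set, * ? and . are ordinary characters,
# so each pattern matches exactly one character of the sample.
in_class = {ch: len(re.findall("[%s]" % ch, sample)) for ch in "*?."}

# Outside a set, an unescaped dot matches any character except a newline,
# while an escaped dot matches only a literal ".".
unescaped_dot = len(re.findall(r".", sample))
escaped_dot = len(re.findall(r"\.", sample))

print(in_class)
print(unescaped_dot, escaped_dot)
```

So for the "[%s]" pattern used in this thread, only characters that are special inside a set (such as \ and ^) cause trouble.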
#3
Thanks a lot! You are right, I'll be careful about escaping chars like *, ? or .
#4
You can also use the re.escape function to escape each string before building a regex from it.

Before:
>>> import re
>>> [re.compile(r"[{0}]".format(chr(i))) for i in range(33, 126)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\re.py", line 224, in compile
    return _compile(pattern, flags)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\re.py", line 293, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_compile.py", line 536, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_parse.py", line 829, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_parse.py", line 545, in _parse
    source.tell() - here)
sre_constants.error: unterminated character set at position 0
After:
>>> [re.compile(r"[{0}]".format(re.escape(chr(i)))) for i in range(33, 126)]
[re.compile('[\\!]'), re.compile('[\\"]'), re.compile('[\\#]'), re.compile('[\\$]'), re.compile('[\\%]'), re.compile('[\\&]'), re.compile("[\\']"), re.compile('[\\(]'), re.compile('[\\)]'), re.compile('[\\*]'), re.compile('[\\+]'), re.compile('[\\,]'), 
#snipped
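
Applied to the original counting task, the re.escape approach might look like the sketch below. The function name count_printable and the sample string are hypothetical, introduced only for illustration; the character range is the same range(33, 126) used earlier in the thread.

```python
import re


def count_printable(content):
    """Count each character from chr(33) to chr(125) that occurs in content."""
    counts = {}
    for code in range(33, 126):  # same range as in the thread; chr(126) is not included
        ch = chr(code)
        # re.escape makes characters such as \ and ^ safe inside [...]
        n = len(re.findall("[%s]" % re.escape(ch), content))
        if n:
            counts[ch] = n
    return counts


# Made-up sample text standing in for the contents of a.txt.
sample = r"a\b^c^ [x]"
for ch, n in count_printable(sample).items():
    print("%s:%d" % (ch, n))
```

With this version no try/except is needed, because every pattern compiles. For single characters, content.count(ch) would give the same counts without a regex at all.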

