Counting the number of occurrences of characters
#1
Who can help me resolve this problem? Thanks.

# -*- coding:utf-8 -*-
# python 3.x
import re

patter = [chr(i) for i in range(33,126)]

with open("a.txt","r") as file:
    content = file.read()
    for i in patter:
        result = len(re.findall(r"[%s]" % i,content))
        if result != 0:
            print("%s:%d" % (i, result))
Error:
Traceback (most recent call last):
  File "D:\robot\desk\Script\login_Discuz.py", line 13, in <module>
    result = len(re.findall(r"[%s]" % i,content))
  File "D:\Python\Python36-32\lib\re.py", line 222, in findall
    return _compile(pattern, flags).findall(string)
  File "D:\Python\Python36-32\lib\re.py", line 301, in _compile
    p = sre_compile.compile(pattern, flags)
  File "D:\Python\Python36-32\lib\sre_compile.py", line 562, in compile
    p = sre_parse.parse(p, flags)
  File "D:\Python\Python36-32\lib\sre_parse.py", line 855, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "D:\Python\Python36-32\lib\sre_parse.py", line 416, in _parse_sub
    not nested and not items))
  File "D:\Python\Python36-32\lib\sre_parse.py", line 523, in _parse
    source.tell() - here)
sre_constants.error: unterminated character set at position 0
#2
There are special characters that need to be escaped if you want to use them literally. The error comes from \, which is the regex escape character: left unescaped, it escapes the closing bracket, so the result is an invalid pattern.

import re, sre_constants
 
patter = [chr(i) for i in range(33,126)]
 
with open("a.txt","r") as file:
    content = file.read()
    for i in patter:
        try:
            result = len(re.findall(r"[%s]" % i,content))
        except sre_constants.error:
            print('error with {}'.format(i))
Output:
error with \
error with ^
Change the for body like this:
        try:
            result = len(re.findall(r"[%s]" % i,content))
        except sre_constants.error:
            result = len(re.findall(r"[\%s]" % i,content))
        if result != 0:
            print("%s:%d" % (i, result))  
and it works.

That said, note that outside a character class you also need to escape characters like *, ? or . in order to search for them literally. I will leave this to you.
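
A small sketch of the point about *, ? and ., in case it helps. Inside a character class [...] these characters already match themselves, while outside one they need a backslash. The sample string below is made up for illustration and is not from the original a.txt.

```python
import re

text = "a.b*c?d"  # made-up sample text (assumption, not the poster's file)

# Inside a character class, *, ? and . are literal and match themselves.
inside = {ch: len(re.findall("[%s]" % ch, text)) for ch in "*?."}

# Outside a class, an unescaped "." matches any character except a newline,
# so it has to be escaped to count literal dots only.
unescaped_dot = len(re.findall(".", text))
escaped_dot = len(re.findall(r"\.", text))

print(inside, unescaped_dot, escaped_dot)
```

So for the bracketed pattern used in this thread, only characters that are special inside the brackets (such as \ and ^) cause trouble.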
#3
Thanks a lot! You are right, I'll be careful about escaping chars like *, ? or .
#4
You can also use the re.escape function to escape the strings before building a regex with them.

Before:
>>> import re
>>> [re.compile(r"[{0}]".format(chr(i))) for i in range(33, 126)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\re.py", line 224, in compile
    return _compile(pattern, flags)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\re.py", line 293, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_compile.py", line 536, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_parse.py", line 829, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Users\_\AppData\Local\Programs\Python\Python35-32\lib\sre_parse.py", line 545, in _parse
    source.tell() - here)
sre_constants.error: unterminated character set at position 0
After:
>>> [re.compile(r"[{0}]".format(re.escape(chr(i)))) for i in range(33, 126)]
[re.compile('[\\!]'), re.compile('[\\"]'), re.compile('[\\#]'), re.compile('[\\$]'), re.compile('[\\%]'), re.compile('[\\&]'), re.compile("[\\']"), re.compile('[\\(]'), re.compile('[\\)]'), re.compile('[\\*]'), re.compile('[\\+]'), re.compile('[\\,]'), 
#snipped
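
Putting that together with the original loop, one possible version of the whole count could look like the sketch below. The function name count_chars and the sample string are made up for illustration; the character range is the one from the first post. It reads from a string rather than a.txt so it can be run as-is.

```python
import re

def count_chars(content):
    """Return {char: count} for the characters chr(33)..chr(125) found in content."""
    counts = {}
    for code in range(33, 126):  # same range as in the original post
        ch = chr(code)
        # re.escape makes ch safe to place inside the character class,
        # including the troublesome \ and ^
        n = len(re.findall("[%s]" % re.escape(ch), content))
        if n:
            counts[ch] = n
    return counts

sample = r"a\b^c]d-a"  # made-up text containing the problem characters
for ch, n in count_chars(sample).items():
    print("%s:%d" % (ch, n))
```

To use it on the file from the first post, read the content with open("a.txt", "r") as before and pass it to count_chars.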