Python Forum
Using re to find only uppercase letters
(May-28-2021, 06:58 PM)nilamo Wrote: Something still seems off, as that regex won't match the string.
>>> import re
>>> test = 'ChrX\t74226540\tT\tt\t50\t.'
>>> test
'ChrX\t74226540\tT\tt\t50\t.'
>>> print(test)
ChrX    74226540        T       t       50      .
>>> raw_regex = r"^[Cc]hr(?:0?[1-9]|[1-9][0-9]|[MXY])\t0*[1-9][0-9]*\t[^\t]*\t[ATGC]{2}"
>>> regex = re.compile(raw_regex)
>>> regex.match(test)
>>> regex
re.compile('^[Cc]hr(?:0?[1-9]|[1-9][0-9]|[MXY])\\t0*[1-9][0-9]*\\t[^\\t]*\\t[ATGC]{2}')

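I kind of figured it out. The problem was not case sensitivity: [ATGC]{2} requires two uppercase letters right next to each other, but in my file the two letters sit in separate tab-separated columns, so it could never match. I separated it into two character classes with a \t between them: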
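import re

def isVCF(file):
    # Chromosome, position, one arbitrary column, then a single uppercase base in each of the next two columns.
    num_format = re.compile(r"^[Cc]hr(?:0?[1-9]|[1-9][0-9]|[MXY])\t0*[1-9][0-9]*\t[^\t]*\t[ATGC]\t[ATGC]")
    with open(file, "r") as my_file:
        for line in my_file:
            # Skip header lines, which start with "#".
            if line.startswith("#"):
                continue
            # Decide based on the first data line only.
            return bool(num_format.match(line))
    # The file had no data lines at all.
    return False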
I used line.startswith("#") to skip the header lines.
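For anyone who wants to see the difference between the two patterns side by side, here is a minimal sketch. The two sample lines are made up for illustration (they are not from my file), and PREFIX is just a name I chose for the shared start of the regex.

```python
import re

# Shared start of the pattern from this thread: chromosome, position, one arbitrary column.
PREFIX = r"^[Cc]hr(?:0?[1-9]|[1-9][0-9]|[MXY])\t0*[1-9][0-9]*\t[^\t]*\t"

# Two uppercase bases directly next to each other, e.g. "AG".
adjacent = re.compile(PREFIX + r"[ATGC]{2}")
# One uppercase base per column, with a tab between them.
separate = re.compile(PREFIX + r"[ATGC]\t[ATGC]")

# Hypothetical data lines, invented for this example.
upper_line = "chr1\t12345\trs1\tA\tG\t50\t."
lower_line = "chr1\t12345\trs1\tA\tg\t50\t."

print(bool(adjacent.match(upper_line)))  # False: a tab sits between A and G
print(bool(separate.match(upper_line)))  # True: both columns hold an uppercase base
print(bool(separate.match(lower_line)))  # False: "g" is lowercase
```

Neither pattern uses re.IGNORECASE, so [ATGC] only ever accepts uppercase letters; the quantifier does not change that.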


Messages In This Thread
RE: Using re to find only uppercase letters - by ranbarr - May-31-2021, 03:19 PM
