May-20-2021, 02:21 PM
Hi!
I'm learning the re module and having some difficulty writing patterns, so I'd be happy to get some explanations.
I have a file with a lot of columns, and I'm trying to check whether the first column starts the same way in every row but ends with a different number from a range.
Example of the first column:
    chr10 chr10 chr10 chr11 chr11 chr12 chr15 chr17 chr18 chrX chrX chr10 chr10 chr10 chr11 chr11 chr12 chr15 chr17 chr18 chrX chrX chr10 chr10 chr10 chr11 chr11 chr12 chr15 chr17 chr18 chrX chrX chr10 chr10 chr10 chr11 chr11 chr12 chr15 chr17 chr18 chrX chrX

What I'm trying to do is check whether every string starts with "chr" and ends with a number in the range 1-99, or with "X", "Y" or "M".
So I tried something like this:

    import re

    with open("vcf1.vcf", "r+") as my_file:
        lines = my_file.readlines()
        for line in lines:
            columns = line.split("\t")
            if re.match(r"^chr([1-9][0-9]|[X]|[Y]|[M]?)$", columns[0]):
                print(True)

which returns this output:

    True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True

That's the output I should get, but I don't feel like I did it the right way, and it seems to consider only the "chr" and not the rest.
I'd appreciate any kind of help!
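For reference, here is a minimal sketch of how the check described above might be written. It assumes that "1-99" means chr1 through chr99 with no leading zero, and that lines starting with "#" are header lines to skip. The names CHROM_RE, is_valid_chrom and all_chroms_valid are made up for illustration. The pattern from the post is kept as ORIGINAL_RE so the two can be compared: `[1-9][0-9]` requires exactly two digits, so "chr1" is rejected, and `[M]?` makes the suffix optional, so a bare "chr" is accepted.

```python
import re

# Assumed reading of the requirement: "chr" followed by 1-99
# (no leading zero), or by exactly one of X, Y, M.
CHROM_RE = re.compile(r"chr(?:[1-9][0-9]?|[XYM])")

# The pattern from the post, unchanged, for comparison.
ORIGINAL_RE = re.compile(r"^chr([1-9][0-9]|[X]|[Y]|[M]?)$")


def is_valid_chrom(name):
    """Return True if the whole string matches CHROM_RE."""
    return CHROM_RE.fullmatch(name) is not None


def all_chroms_valid(lines):
    """Check the first tab-separated column of every non-header line."""
    return all(
        is_valid_chrom(line.rstrip("\n").split("\t")[0])
        for line in lines
        if not line.startswith("#")  # assumption: '#' marks header lines
    )


if __name__ == "__main__":
    # Print: name, result of the original pattern, result of the sketch.
    for name in ["chr1", "chr10", "chrX", "chr", "chr100", "chr0"]:
        print(name, bool(ORIGINAL_RE.match(name)), is_valid_chrom(name))
```

With a file, this could be used as `with open("vcf1.vcf") as f: print(all_chroms_valid(f))`, which gives a single True or False for the whole file instead of one line of output per row.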