Python Forum
Regex help for newbie
#1
Hi,
New to this site and to python. As a first project, I am trying to extract entire line entries via regex from a text file (hosts.txt). Based on matches, I want to include those matches into ini-type sections of a flat text file, 'final_host'. So I am trying to put 'cars' matches under [cars], 'trucks' matches under [trucks], etc.

Here is an example hosts.txt file:

car1
car2
truck1
truck12
truck13

Here is my code:

import re

source = open("hosts", "r") 
car = re.compile(".*car\S+")
truck = re.compile(".*truck\S+")

for line in source:
  car_result = car.findall(line)
  truck_result = truck.findall(line)
source.close()

with open('final_host', 'w') as y:
  y.write("[car]\n")

  for x in car_result:
    print(x)

  y.write("[truck]\n")

  for x in truck_result:
    print(x)

y.close()
But all I'm getting for results are the ini headers. Is there an easier way to regex?

[car]
[truck]

OS: Ubuntu
Python: 2.7.15

Thank you in advance for any pointers.
Reply
#2
You need to add from __future__ import print_function and pass the file keyword to the print function, e.g. print(x, file=y), or use y.write('{}\n'.format(x)). This should work, but I have not tested it.
Why are you still using Python 2.x? If the words car or truck are always followed by digits, it would be better to use a regexp such as re.compile("car\d+").
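
A minimal sketch of the two write options mentioned above, assuming a hypothetical list `car_result` and a throwaway temporary file (neither comes from the original post). It only shows that both calls put text into the file rather than on the console; it does not address how the matches are collected.

```python
# On Python 2.7, this must be the first statement in the file:
#   from __future__ import print_function
import os
import tempfile

# Hypothetical matches, standing in for the results of the regex search.
car_result = ["car1", "car2"]

# Write to a temporary file so the sketch does not touch real files.
fd, path = tempfile.mkstemp(suffix=".ini")
os.close(fd)

with open(path, "w") as y:
    y.write("[car]\n")
    for x in car_result:
        print(x, file=y)            # option 1: print into the file
    for x in car_result:
        y.write("{}\n".format(x))   # option 2: explicit write, same effect

# Read the file back to see what was actually written.
with open(path) as y:
    written = y.read()
os.remove(path)
```

Each entry appears twice in `written` because both options are run one after the other; in real code you would pick one.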
Reply
#3
Hi scidam,
Thanks for the quick response. As to why I'm using python 2.x... Mostly my impetus is that I use Ansible a lot so I wanted to be able to take advantage of some of the advanced features like writing my own module, Jinja2, etc. And as of now, python 3 is not fully supported for most of the features.

I did try your suggestions and each combination but am getting the same results as before my post. Also, I used
re.compile("car\S+")
because my hosts file was just an example and the characters following the match may be alpha, digit, etc.
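
A sketch of one likely reason the output did not change, assuming the loop from the first post was left as it was: `car_result` and `truck_result` are reassigned on every pass, so after the loop they hold only the matches from the last line read. Collecting matches in lists that persist across iterations avoids this. The sample lines and the list names `cars` and `trucks` are illustrative, not from the post.

```python
import re

# Sample input standing in for the lines of the hosts file (illustrative).
lines = ["car1\n", "car2\n", "truck1\n", "truck12\n", "truck13\n"]

# Raw strings keep the backslash in \S from being read as a string escape.
car = re.compile(r"car\S+")
truck = re.compile(r"truck\S+")

cars = []
trucks = []
for line in lines:
    # extend() adds to the running lists instead of replacing them
    cars.extend(car.findall(line))
    trucks.extend(truck.findall(line))

print(cars)
print(trucks)
```

Because `\S+` stops at whitespace, the trailing newline is not part of each match, so it has to be added back when writing the results out.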
Reply
#4
It seems to me that if you are looking for certain strings it will be easier without regex

I am not familiar with Python 2 syntax, so I will use Python 3 code:

with open('cars.txt', 'r') as source:
    cars = ['[cars]\n']
    trucks = ['[trucks]\n']
    for row in source:
        if 'car' in row:
            cars.append(row)
        elif 'truck' in row:
            trucks.append(row)

with open('cars_result.txt', 'w') as filtered:
    # Each row already ends with a newline, so turn off print's
    # default space separator and its extra trailing newline.
    print(*cars, *trucks, sep='', end='', file=filtered)

cars_result.txt will look like:

Output:
[cars]
car1
car2
[trucks]
truck1
truck12
truck13
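
Since the original poster is on Python 2.7, where passing more than one `*` unpacking to `print` is not valid syntax (that requires Python 3.5 or later), here is a hedged sketch of the same idea using `writelines`, which should behave the same on 2.7 and 3.x. The function name `group_hosts` and the temporary files are assumptions made for the example.

```python
import os
import shutil
import tempfile


def group_hosts(source_path, result_path):
    """Copy lines containing 'car' or 'truck' into ini-style sections."""
    cars = ["[cars]\n"]
    trucks = ["[trucks]\n"]
    with open(source_path) as source:
        for row in source:
            if "car" in row:
                cars.append(row)
            elif "truck" in row:
                trucks.append(row)
    with open(result_path, "w") as filtered:
        # writelines adds no separators; each row already ends in a newline
        filtered.writelines(cars + trucks)


# Usage with throwaway files (paths are created only for this demo).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "hosts.txt")
dst = os.path.join(workdir, "final_host")
with open(src, "w") as f:
    f.write("car1\ncar2\ntruck1\ntruck12\ntruck13\n")

group_hosts(src, dst)
with open(dst) as f:
    result = f.read()
shutil.rmtree(workdir)
print(result)
```

Like the code above, this uses a plain substring test, so any line that merely contains "car" (for example "cargo1") would also land in the first section.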
Ansible is not my cup of tea, but a quick check of its documentation turned up statements about Python 3 support.

In your code you use both the old-style open() with an explicit close() and 'with open(...)'. What is the reason for mixing them?
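
To illustrate the two styles being contrasted: with an explicit open and close, the file has to be closed by hand, while a `with` block closes it when the block ends, so a trailing `y.close()` after the block is redundant. A small sketch using a temporary file (the path is made up for the demo):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# Style 1: explicit open and close
source = open(path, "w")
source.write("car1\n")
source.close()

# Style 2: the with statement closes the file automatically
with open(path, "a") as y:
    y.write("truck1\n")

# The file object is already closed here; no y.close() is needed.
closed_after_with = y.closed

with open(path) as check:
    contents = check.read()
os.remove(path)
```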
Reply
#5
Thanks perfringo, this works for my purposes!

There is support for Python 3 in Ansible but not entirely and it's a work in progress.

As for why I'm using 'with open', again I'm a newbie. This is really my first project, so I am just cobbling together what I can. I am happy to hear about an alternative to 'with open'.

Thanks again!
Reply
#6
Note that there are only six months of support left for Python 2.7.
Craig "Ichabod" O'Brien - xenomind.com
Reply
#7
Thank you, yes, I will be branching out and learning 3.x in parallel.
Reply

