Python Forum

Hi,
I'm new to this site and to Python. As a first project, I am trying to extract entire lines from a text file (hosts.txt) via regex. Based on the matches, I want to place those lines into ini-type sections of a flat text file, 'final_host'. So I am trying to put 'cars' matches under [cars], 'trucks' matches under [trucks], etc.

Here is an example hosts.txt file:

car1
car2
truck1
truck12
truck13

Here is my code:

import re

source = open("hosts", "r") 
car = re.compile(".*car\S+")
truck = re.compile(".*truck\S+")

for line in source:
  car_result = car.findall(line)
  truck_result = truck.findall(line)
source.close()

with open('final_host', 'w') as y:
  y.write("[car]\n")

  for x in car_result:
    print(x)

  y.write("[truck]\n")

  for x in truck_result:
    print(x)

y.close()

But all I'm getting for results are the ini headers. Is there an easier way to regex?

[car]
[truck]

OS: Ubuntu
Python: 2.7.15

Thank you in advance for any pointers.

You need to add from __future__ import print_function and pass the file keyword to the print function, e.g. print(x, file=y), or use y.write('{}\n'.format(x)). This should work, but I have not tested it.
Why are you still using Python 2.x? If the words car or truck are always followed by digits, it would be better to use a regexp such as re.compile("car\d+").
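
To make the difference between the two patterns concrete, here is a small sketch comparing car\d+ with the car\S+ part of the pattern from the original code. The sample host names are made up for illustration and do not come from the hosts file.

```python
import re

# Pattern suggested above: "car" followed by one or more digits.
digits_only = re.compile(r"car\d+")
# Pattern from the original code (without the leading .*): "car" followed
# by one or more non-whitespace characters.
non_space = re.compile(r"car\S+")

# Hypothetical host names, chosen only to show where the patterns differ.
samples = ["car1", "car-a", "carpool2", "car"]


def matches(pattern, names):
    """Return the names in which the pattern is found anywhere."""
    return [name for name in names if pattern.search(name)]


print(matches(digits_only, samples))  # ['car1']
print(matches(non_space, samples))    # ['car1', 'car-a', 'carpool2']
```

Raw strings (r"...") keep the backslashes intact for the regex engine. Which pattern fits depends on what can follow the keyword in the real host names.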

Hi scidam,
Thanks for the quick response. As to why I'm using Python 2.x: I use Ansible a lot, and I want to be able to take advantage of some of its advanced features, like writing my own modules, Jinja2, etc. As of now, Python 3 is not fully supported for most of those features.

I did try your suggestions and each combination but am getting the same results as before my post. Also, I used

re.compile("car\S+")

because my hosts file was just an example and the characters following the match may be alpha, digit, etc.
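
One possible reason the output does not change: in the original loop, car_result and truck_result are reassigned on every pass, so after the loop they hold only the matches from the last line read (an empty list if that line is blank or does not match). Below is a minimal sketch that accumulates the matches instead and writes them with y.write, which behaves the same in Python 2.7 and 3. The function names group_hosts and write_sections are made up for this example; the patterns are the ones from the original post, written as raw strings.

```python
import re


def group_hosts(lines):
    """Collect matches from every line, not just the last one."""
    car = re.compile(r".*car\S+")
    truck = re.compile(r".*truck\S+")
    car_result = []
    truck_result = []
    for line in lines:
        # extend() appends this line's matches to the running lists;
        # plain assignment would discard the earlier ones.
        car_result.extend(car.findall(line))
        truck_result.extend(truck.findall(line))
    return car_result, truck_result


def write_sections(path, car_result, truck_result):
    """Write the matches under ini-style headers."""
    with open(path, "w") as y:
        y.write("[car]\n")
        for x in car_result:
            y.write("{}\n".format(x))
        y.write("[truck]\n")
        for x in truck_result:
            y.write("{}\n".format(x))
```

With the files from the first post, this could be used by opening "hosts" in a with block, passing the file object to group_hosts, and then calling write_sections("final_host", car_result, truck_result).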

It seems to me that if you are only looking for certain strings, it will be easier without regex.

I am not familiar with Python 2 syntax, so I am using Python 3 code:

with open('cars.txt', 'r') as source:
    cars = ['[cars]\n']
    trucks = ['[trucks]\n']
    for row in source:
        if 'car' in row:
            cars.append(row)
        elif 'truck' in row:
            trucks.append(row)

with open('cars_result.txt', 'w') as filtered:
    print(*cars, *trucks, file=filtered)

cars_result.txt will look like:

Output:
[cars]
 car1
 car2
 [trucks]
 truck1
 truck12
 truck13
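
The leading space on every line after the first comes from print, which puts its default separator (a single space) between its arguments, while each row already ends in a newline. If that space is unwanted, one option is to write the lists with writelines, sketched below; passing sep='' to print would be another. The helper names build_sections and write_result are hypothetical.

```python
def build_sections(rows):
    """Group rows under [cars] and [trucks] headers, keeping their newlines."""
    cars = ["[cars]\n"]
    trucks = ["[trucks]\n"]
    for row in rows:
        if "car" in row:
            cars.append(row)
        elif "truck" in row:
            trucks.append(row)
    return cars + trucks


def write_result(rows, path):
    """Write the grouped rows without any separator between them."""
    with open(path, "w") as filtered:
        # writelines() writes each string as-is, so no spaces are added.
        filtered.writelines(build_sections(rows))
```

Calling write_result(source, 'cars_result.txt') inside the first with block would then produce the same sections without the leading spaces.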

Ansible is not my cup of tea, but a quick check of the documentation turned up statements about Python 3 support.

In your code you use both old-style open/close and 'with open(...)'. What is the reason for this mix?
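
For comparison, here is a short sketch of the two styles. A with block closes the file when the block ends, so a separate close() call after it (like the y.close() in the original code) is redundant. The temporary path is only there to keep the example self-contained.

```python
import os
import tempfile

# Throwaway location so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "final_host")

# Old style: open, use, then close by hand.
old_style = open(path, "w")
old_style.write("[car]\n")
old_style.close()

# Context manager: the file is closed automatically when the block ends,
# even if an exception is raised inside it.
with open(path, "a") as with_style:
    with_style.write("car1\n")

print(old_style.closed, with_style.closed)  # True True
```

Both styles work in Python 2.7 and 3; the with form is usually preferred because the close cannot be forgotten.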

Thanks perfringo, this works for my purposes!

Ansible does support Python 3, but not completely; it is still a work in progress.

As for why I'm using 'with open', again I'm a newbie. This is really my first project, so I am just cobbling together what I can. I am happy to hear about an alternative to 'with open'.

Thanks again!

Note that there are only six months of support left for Python 2.7.

Thank you, yes, I will be branching out and learning 3.x in parallel.

mcmpdx

scidam

mcmpdx

perfringo

mcmpdx

ichabod801

mcmpdx