Python Forum
Morning Removing last character
#1
I need to turn a list of IPs into a regular expression, then copy and paste the result elsewhere.

The starting list
10.10.10.10 host1
10.10.10.11 host2
10.10.10.12 host3
10.10.10.13 host4
10.10.10.14 host5

The desired output
^10.10.10.10$|^10.10.10.11$|^10.10.10.12$|^10.10.10.13$|^10.10.10.14$

The current output .. notice the last "|", I want that removed.
^10.10.10.10$|^10.10.10.11$|^10.10.10.12$|^10.10.10.13$|^10.10.10.14$|

My cheap code
#!/usr/bin/python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys, os, re
def cls():
    os.system('clear')

def main():
       cls()
       try:
          #olist = []
          for line in open (sys.argv[1], 'r' ):
              word_list = line.split()
              word_list[0] = re.sub("^", "^", word_list[0], flags=re.M)
              word_list[0] = re.sub("$", "$|", word_list[0], flags=re.M)
              print(word_list[0],end='')
          print('\n\n')



       except IOError as e :
           print("File Open Error")
           print("Error :", str(e))
       except IndexError as i :
           print("Usage: argv[0] <file having ip as the first field, hostname as the second>\nExample : 10.10.10.10 host1\n          10.10.10.10 host2\n          10.10.10.12 host3")


main()
Working on a Linux vm

[localhost etc]$ cat system-release
CentOS Linux release 7.6.1810 (Core)

[localhost etc]$ python -V
Python 2.7.5

I know .. the Python version is old and crusty considering 3.8 is in beta ... but they are still using 2.7 at work.

My Question
The only way I can think of to get rid of the trailing pipe is to count the lines in the file, keep a separate counter as I run through the file, compare it to the line count and, if they are equal, do something like print word_list[0][:-1]

Is there a better way to do this? As a side question, is there a way to combine the two re.sub calls into a single line?

Thanks for any help provided !!!

Regards
Sum
#2
Use str.join

My output:
Output:
deadeye@nexus ~ $ python2.7 parse_ips.py
Without piping to program, you have to use --input-file
deadeye@nexus ~ $ python2.7 parse_ips.py --input-file
usage: parse_ips.py [-h] [--input-file INPUT_FILE]
parse_ips.py: error: argument --input-file: expected one argument
deadeye@nexus ~ $ python2.7 parse_ips.py --input-file hosts.txt
^10\.10\.10\.10$|^10\.10\.10\.11$|^10\.10\.10\.12$|^10\.10\.10\.13$|^10\.10\.10\.14$
deadeye@nexus ~ $ cat hosts.txt | python2.7 parse_ips.py
^10\.10\.10\.10$|^10\.10\.10\.11$|^10\.10\.10\.12$|^10\.10\.10\.13$|^10\.10\.10\.14$
Code:
#!/usr/bin/env python2.7
from __future__ import print_function
import sys
import argparse


def ip2regex(text):
    ips = []
    for row in text.splitlines():
        try:
            ip, hostname = row.split()
        except ValueError:
            # skip errors
            continue
        ip = '^' + ip.replace('.', r'\.') + '$'
        ips.append(ip)
    return '|'.join(ips)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--input-file', required=False, help='Input file to generate regex output.')
    args = parser.parse_args()
    if args.input_file is None and not sys.stdin.isatty():
        print(ip2regex(sys.stdin.read()))
    elif args.input_file and sys.stdin.isatty():
        with open(args.input_file) as fd:
            print(ip2regex(fd.read()))
    else:
        print('Without piping to program, you have to use --input-file', file=sys.stderr)
Lines 15-17 of the script (from ip = '^' + ip.replace('.', r'\.') + '$' to return '|'.join(ips)) prepare each IP address and join the results. By the way, a dot is a metacharacter in regex: it stands for any character.

If you use the dot without escaping it, the regex ^10.10.10.10$ will also match: 10510710310

PS: split is the opposite of join.
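
To make the point about the unescaped dot concrete, here is a small sketch. It uses re.escape instead of the ip.replace('.', r'\.') call from the script above, which is a swap of my own and not part of the posted code; the variable names are made up for the illustration.

```python
from __future__ import print_function
import re

# Pattern with raw dots versus a pattern whose dots are escaped.
unescaped = r'^10.10.10.10$'
escaped = '^' + re.escape('10.10.10.10') + '$'

# An unescaped dot matches any character, so this digit string slips through.
matches_wrong_unescaped = re.match(unescaped, '10510710310') is not None
# With escaped dots only a literal "." is accepted in those positions.
matches_wrong_escaped = re.match(escaped, '10510710310') is not None
matches_right_escaped = re.match(escaped, '10.10.10.10') is not None

print(matches_wrong_unescaped, matches_wrong_escaped, matches_right_escaped)
```

Whether the escaping matters depends on the consumer of the regex; if the target application only ever sees real IP addresses, the unescaped form may be good enough.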
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
Why complicate things that much? Simple string methods and formatting would do.

import sys

infile = sys.argv[1]
with open(infile, 'r') as f:
    # Python 3 only: star-unpacking keeps the first field and discards the rest
    ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]
print('|'.join('^{}$'.format(ip_addr) for ip_addr in ips))
and these few lines can be shortened to two
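
The post does not show the shortened version, so the following is only a guess at what it might look like. The function name ips_to_regex and the sample lines are hypothetical; the function wrapper is there only so the idea can be tried without a file. It uses '{0}' and indexing rather than star-unpacking so that it also runs on Python 2.

```python
from __future__ import print_function


def ips_to_regex(lines):
    # Take the first field of every non-blank line and wrap it in ^...$,
    # then let str.join put the "|" between items (never after the last one).
    return '|'.join('^{0}$'.format(line.split()[0]) for line in lines if line.strip())


# Hypothetical sample input standing in for the lines of the hosts file.
sample = ['10.10.10.10 host1\n', '10.10.10.11 host2\n', '\n']
result = ips_to_regex(sample)
print(result)
```

With a real file, the body collapses to a with open(sys.argv[1]) as f: line followed by a single print of the join.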
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

#4
(Sep-06-2019, 03:17 PM)DeaD_EyE Wrote: Use str.join

Yep, thanks. I understand that the "." means any char. The app I'm pasting into recognizes an IP as long as it's wrapped in ^$.

Thanks.

I've seen this construct in some example code, but not in any tutorial, probably because I'm just starting out.

What is it called and where can I learn about it?
ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]
#5
(Sep-06-2019, 04:59 PM)sumncguy Wrote: What is it called and where can I learn about it ..
This is a list comprehension. You can also write a generator expression, e.g. (ip_addr for line in f for ip_addr, *_ in [line.split()]), which does not build the full list in memory, or a dict comprehension.
Note that it can be expanded into a normal for loop:
import sys

infile = sys.argv[1]
ips = []
with open(infile, 'r') as f:
    for line in f:
        # wrap the split in a list so the inner loop runs once per line
        for ip_addr, *_ in [line.split()]:
            ips.append(ip_addr)
print('|'.join('^{}$'.format(ip_addr) for ip_addr in ips))
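
As a rough side-by-side of the three forms mentioned above, here is a small sketch. The sample lines and variable names are made up for the illustration, and it uses plain indexing instead of star-unpacking so it also runs on Python 2.7.

```python
# Hypothetical sample lines in the "ip hostname" layout from the thread.
lines = ['10.10.10.10 host1', '10.10.10.11 host2']

# List comprehension: the whole list is built immediately.
as_list = [line.split()[0] for line in lines]

# Generator expression: items are produced one at a time, on demand.
as_gen = (line.split()[0] for line in lines)
first = next(as_gen)   # consumes the first item
rest = list(as_gen)    # whatever has not been consumed yet

# Dict comprehension (2.7+): map each IP to its hostname.
by_ip = {ip: host for ip, host in (line.split() for line in lines)}
```

For a hosts file of a few lines the difference is negligible; the generator form mainly pays off when the input is large and only needs to be walked once.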
#6
list comprehension .. thank you

I don't like to just copy and paste solutions that I don't understand. Main reason: if I did, next month when I look at the code again I'd be thinking "What the heck does that do again?" So if I get even a high-level understanding and annotate my script, it will be easier to jar this crusty 54-year-old memory! :)

Thanks
Sum
#7
The _ is a valid name in Python.
In interactive mode it holds the last result, if it was not assigned to a name.

>>> 5+5
10
>>> print(_)
10
And this is the effect of the star in front of one of the names in an assignment (Python 3 only):
start, *middle, end = 'start 1 2 3 4 5 end'.split()
print(start)
print(middle)
print(end)
Output:
start
['1', '2', '3', '4', '5']
end
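ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]
is the same as:
ips = []
for line in f:
    for ip_addr, *_ in [line.split()]:
        ips.append(ip_addr)
The inner loop runs exactly once per line, because line.split() is wrapped in a one-element list; without the brackets it would iterate over the individual words and unpack their characters.
The name f should point to an open file. Iterating over a file object yields it line by line.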
But I think this example is overcomplicated. You can write this as:

ips = []
for line in f:
    ip_addr, *rest = line.split()
    ips.append(ip_addr) 
Then you get rid of the nested loop.
Turning this into a list comprehension:

ips = [line.split()[0] for line in f]
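
Since the original poster is on Python 2.7, where star-unpacking is a syntax error, here is a sketch of the same idea using only indexing and slicing. The helper names first_field and head_and_rest are made up for this example, and returning None for blank lines is my own assumption about what would be convenient.

```python
def first_field(line):
    parts = line.split()
    # Blank lines give an empty list; return None instead of raising IndexError.
    return parts[0] if parts else None


def head_and_rest(line):
    parts = line.split()
    # Python 2 compatible equivalent of: head, *rest = parts
    # (like the starred form, this fails on a blank line)
    return parts[0], parts[1:]


# Hypothetical sample lines, including a blank one.
lines = ['10.10.10.10 host1\n', '\n', '10.10.10.11 host2 alias2\n']
ips = [first_field(line) for line in lines if first_field(line) is not None]
```

The plain ips = [line.split()[0] for line in f] above is shorter, but it raises IndexError on an empty line, so which form to use depends on how clean the input file is.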
#8

I found that a few VMs are using 2.6.6.

It seems that format wasn't introduced until 2.7, so the print solution doesn't work in some cases.

Can anyone point me to a place where I can find out how to truncate the last "|" in 2.6.6?

I wish they would
1. standardize the Linux and Python versions they are using.
2. upgrade to at least Python 3.x, especially since 3.8 is in beta 2!

I work for a big company (can't say which), but I find it incredible that they aren't really doing any admin on their VMs.

Thanks for the help
Sum
#9
It works in 2.6; you just need to number the placeholder(s), i.e. explicitly specify the order in which values are placed into the placeholders.
print('|'.join('^{0}$'.format(ip_addr) for ip_addr in ips))
This will also work in 2.7 and 3.x.
Or, to put it the other way around: in 2.7 and 3.1+ you can skip the number.
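
For reference, a short sketch of a few spellings that should all behave the same on 2.6, 2.7 and 3.x, plus a way to trim a trailing "|" from a string that was already built with one. The variable names and sample data are invented for the illustration; only the numbered-placeholder form comes from the post above.

```python
ips = ['10.10.10.10', '10.10.10.11']

# Numbered placeholder, as suggested above (works from 2.6 on).
numbered = '|'.join('^{0}$'.format(ip) for ip in ips)
# Old-style % formatting, which predates str.format.
percent = '|'.join('^%s$' % ip for ip in ips)
# Plain concatenation, no formatting at all.
concat = '|'.join('^' + ip + '$' for ip in ips)

# If the string already ends in a pipe, rstrip removes it after the fact.
with_trailing = '^10.10.10.10$|^10.10.10.11$|'
trimmed = with_trailing.rstrip('|')
```

Note that rstrip('|') removes every trailing pipe, not just one; with str.join the trailing separator never appears in the first place, so no trimming is needed.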