Morning Removing last character - Python Forum (https://python-forum.io), General Coding Help, thread /thread-20922.html
Morning Removing last character - sumncguy - Sep-06-2019

I need to turn a list of IPs into a regular expression, then copy and paste the result elsewhere.

The starting list
Quote:10.10.10.10 host1

The desired output
Quote:^10.10.10.10$|^10.10.10.11$|^10.10.10.12$|^10.10.10.13$|^10.10.10.14$

The current output .. notice the last "|", I want that removed.
Quote:^10.10.10.10$|^10.10.10.11$|^10.10.10.12$|^10.10.10.13$|^10.10.10.14$|

My cheap code

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    from __future__ import print_function
    import sys, os, re

    def cls():
        os.system('clear')

    def main():
        cls()
        try:
            #olist = []
            for line in open (sys.argv[1], 'r' ):
                word_list = line.split()
                word_list[0] = re.sub("^", "^", word_list[0], flags=re.M)
                word_list[0] = re.sub("$", "$|", word_list[0], flags=re.M)
                print(word_list[0],end='')
            print('\n\n')
        except IOError as e :
            print("File Open Error")
            print("Error :", str(e))
        except IndexError as i :
            print("Usage: argv[0] <file having ip as the first field, hostname as the second>\nExample : 10.10.10.10 host1\n 10.10.10.10 host2\n 10.10.10.12 host3")

    main()

Working on a Linux VM

    [localhost etc]$ cat system-release
    CentOS Linux release 7.6.1810 (Core)
    [localhost etc]$ python -V
    Python 2.7.5

I know .. the Python version is old and crusty considering 3.8 is in beta ... but they are still using 2.7 at work.

My Question
The only way I can think of to get rid of the trailing pipe is to count the lines in the file, increment a separate counter as I run through the file, compare the total to the line counter, and if they are equal do something like print word_list[0][:-1]. Is there a better way to do this? As a side question, is there a way to combine the 2 re's into a single line?

Thanks for any help provided !!!
Regards
Sum

RE: Morning Removing last character - DeaD_EyE - Sep-06-2019

Use str.join

My output:
Code:
    #!/usr/bin/env python2.7
    from __future__ import print_function
    import sys
    import argparse


    def ip2regex(text):
        ips = []
        for row in text.splitlines():
            try:
                ip, hostname = row.split()
            except ValueError:
                # skip errors
                continue
            ip = '^' + ip.replace('.', r'\.') + '$'
            ips.append(ip)
        return '|'.join(ips)


    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('--input-file', required=False, help='Input file to generate regex output.')
        args = parser.parse_args()
        if args.input_file is None and not sys.stdin.isatty():
            print(ip2regex(sys.stdin.read()))
        elif args.input_file and sys.stdin.isatty():
            with open(args.input_file) as fd:
                print(ip2regex(fd.read()))
        else:
            print('Without piping to program, you have to use --input-file', file=sys.stderr)

Lines 15-17 prepare the IP addresses (escape the dots, add the anchors, join with "|"). By the way, a dot is a metacharacter in regex: it stands for any character. If you use the dot without escaping it, the regex ^10.10.10.10$ will also match: 10510710310

PS: split is the opposite of join.
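To illustrate the point about unescaped dots, here is a minimal sketch using only the standard re module. The variable names are illustrative, and re.escape is used as an alternative to the manual replace('.', r'\.') in the post above; it is not what the poster wrote.

```python
import re

ip = '10.10.10.10'

# Unescaped: every "." matches any single character.
unescaped = '^' + ip + '$'
# Escaped: re.escape turns "." into "\." so it only matches a literal dot.
escaped = '^' + re.escape(ip) + '$'

print(bool(re.match(unescaped, '10510710310')))  # True
print(bool(re.match(escaped, '10510710310')))    # False
print(bool(re.match(escaped, '10.10.10.10')))    # True
```

Whether the false match matters depends on where the pattern is pasted, but escaping costs nothing and keeps the pattern exact.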
RE: Morning Removing last character - buran - Sep-06-2019

Why complicate things that much? Simple string methods and formatting would do:

    infile = sys.argv[1]
    with open(infile, 'r') as f:
        ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]
    print('|'.join('^{}$'.format(ip_addr) for ip_addr in ips))

and these 4 lines can be shortened to 2.

RE: Morning Removing last character - sumncguy - Sep-06-2019

(Sep-06-2019, 03:17 PM)DeaD_EyE Wrote: Use str.join

I've seen this construct in some example code .. but not in any instruction .... probably because I'm just starting out. What is it called and where can I learn about it ..

    ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]

RE: Morning Removing last character - buran - Sep-06-2019

(Sep-06-2019, 04:59 PM)sumncguy Wrote: What is it called and where can I learn about it ..

This is a list comprehension. You can also have a generator expression, e.g.

    (ip_addr for line in f for ip_addr, *_ in [line.split()])

in which case it will not create the full list in memory, or a dict comprehension. Note that it can be expanded into a normal for loop:

    infile = sys.argv[1]
    ips = []
    with open(infile, 'r') as f:
        for line in f:
            for ip_addr, *_ in [line.split()]:
                ips.append(ip_addr)
    print('|'.join('^{}$'.format(ip_addr) for ip_addr in ips))

RE: Morning Removing last character - sumncguy - Sep-06-2019

list comprehension .. thank you

I don't like to just copy and paste solutions that I don't understand. Main reason: if I did, next month when I look at the code again ... I'd be thinking 'What the heck does that do again?' So if I get even a high-level understanding and annotate my script .. it will be easier to jar this crusty 54 year old memory ! :)

Thanks
Sum

RE: Morning Removing last character - DeaD_EyE - Sep-06-2019

The _ is a valid name in Python. In interactive mode it holds the last result, if it was not assigned to a name.
    >>> 5+5
    10
    >>> print(_)
    10

And the effect of the star (*) in front of one of the names in an assignment:

    start, *middle, end = 'start 1 2 3 4 5 end'.split()
    print(start)
    print(middle)
    print(end)
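For reference, a small sketch of what that starred assignment produces. Note that this syntax (extended iterable unpacking) needs Python 3; on Python 2.7 it is a SyntaxError, so a slicing equivalent is included as a stand-in. The names words, start2, middle2 and end2 are only for illustration.

```python
words = 'start 1 2 3 4 5 end'.split()

# Python 3 only: the starred name collects whatever the other names leave over.
start, *middle, end = words
print(start)   # start
print(middle)  # ['1', '2', '3', '4', '5']
print(end)     # end

# Equivalent slicing that also runs on Python 2.
start2, middle2, end2 = words[0], words[1:-1], words[-1]
```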
    ips = [ip_addr for line in f for ip_addr, *_ in [line.split()]]

is the same as:

    ips = []
    for line in f:
        for ip_addr, *_ in [line.split()]:
            ips.append(ip_addr)

The name f should point to an open file. Iterating over a file object yields it line by line. But I think this example is overcomplicated. You can write this as:

    ips = []
    for line in f:
        ip_addr, *rest = line.split()
        ips.append(ip_addr)

Then you get rid of the nested loop. Turning this into a list comprehension:

    ips = [line.split()[0] for line in f]

RE: Morning Removing last character - sumncguy - Sep-16-2019

I found that a few VMs are using 2.6.6. It seems that format wasn't introduced until 2.7, so the print solution doesn't work in some cases. Can anyone point me to a place where I can find out how to truncate the last "|" in 2.6.6?

I wish they would
1. standardize the Linux and Python versions they are using.
2. upgrade at least to Python 3.x .. especially being that 3.8 is in beta 2 !!

I work for a big company .. can't say which .. but I find it incredible that they aren't really doing any admin on their VMs.

Thanks for the help
Sum

RE: Morning Removing last character - buran - Sep-16-2019

It works in 2.6; you just need to number the placeholder(s) (i.e. explicitly specify the order in which to place values in the placeholders).

    print('|'.join('^{0}$'.format(ip_addr) for ip_addr in ips))

This will also work in 2.7 and 3.x. Or, to say it the other way around: in 2.7 and 3.x you can skip the number.
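Putting the thread together, here is one possible sketch that should run unchanged on Python 2.6, 2.7 and 3.x. It takes the first field of each line, escapes it, and lets str.join place the separators, so a trailing "|" never appears in the first place. The function name ips_to_regex and the sample data are made up for illustration, and using re.escape is an assumption on top of what the posters wrote.

```python
import re


def ips_to_regex(lines):
    """Build ^ip$|^ip$|... from lines of the form '<ip> <hostname>'."""
    patterns = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # The numbered placeholder {0} keeps this compatible with Python 2.6.
        patterns.append('^{0}$'.format(re.escape(fields[0])))
    # join only puts "|" between items, never after the last one.
    return '|'.join(patterns)


sample = ['10.10.10.10 host1', '10.10.10.11 host2', '', '10.10.10.12 host3']
print(ips_to_regex(sample))
```

Since iterating over a file object yields lines, something like ips_to_regex(open(sys.argv[1])) should work the same way on a real input file.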