Python Forum
n00b needs help
#1
hello guys\gals,
i'm sure this is simple but i'm a n00b and have spent a few hours trying to figure this out.  basically what i need is this;

read file1 (this file has a list of numbers, one on each new line, so line1 is 12345, line2 is 678910, etc..)
open file2 (this file contains line after line of text and includes the numbers from file1) and insert a # at the beginning of each line that matches the regex values from file1.
i'm thinking i need an array or list to house the values from file1?  then call them when searching file2
below is basic pseudo code;
with open("/path/to/file1", 'r') as s:
  list = s.read().splitlines()
with open("/path/to/file2", 'w') as f:
  for line in f:
      line = line.rstrip()
      if list in lines:
          append hashtag at beginning of line matching regex
i think i need to do some type of join('|') on the list or something?  just kind of stuck.  i've googled for days, if someone doesn't mind assisting me i'd greatly appreciate it.
Reply
#2
The best way to figure something out when you're stuck is to simplify the problem until you're only working with the part that doesn't work.  For example, there's no reason to be messing with files yet, so cut all that out until you have something that works without files.

Maybe...
>>> regexes = [
... "12345",
... "678910",
... "47283"
... ]
>>> search = [
... "line1",
... "47283",
... "don't touch me",
... "me neither",
... "12345 - haha, just kidding, don't touch me",
... "12345"
... ]
>>> for ndx, line in enumerate(search):
...   if line in regexes:
...     search[ndx] = "#" + line
...
>>> search
['line1', '#47283', "don't touch me", 'me neither', "12345 - haha, just kidding, don't touch me", '#12345']
Reply
#3
thanks nilamo.

i cut it down to very basic, just trying to print the lines in file2 that match my regex.

with open("/path/to/file1", 'r') as s:
  number = '12345'

with open("/path/to/file2", 'r') as f:
  for line in f:
      if number in line:
        print line
this works, but now if i try to add a second number to number it doesn't work.  do i need an array or something for the values?
Reply
#4
Yep! :)
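A minimal sketch of what that could look like, with plain strings standing in for the two files (file1_text, file2_text and find_matches are made-up names for illustration, not anything from the original code). The numbers go into a list, and each line is checked against every entry:

```python
# Hypothetical stand-ins for the contents of file1 and file2.
file1_text = "12345\n678910\n"
file2_text = "foo 12345 bar\nleave me alone\nbaz 678910 qux\n"

# One number per line becomes one list entry.
numbers = file1_text.splitlines()


def find_matches(lines, numbers):
    """Return the lines that contain at least one of the numbers."""
    matches = []
    for line in lines:
        for number in numbers:
            if number in line:
                matches.append(line)
                break  # one hit is enough, move on to the next line
    return matches


matches = find_matches(file2_text.splitlines(), numbers)
print(matches)
```

With real files, the same splitlines() call would be applied to whatever s.read() returns.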
Reply
#5
(Jun-06-2017, 09:11 PM)theturd Wrote:  do i need an array or something for the values?
Yes, and in Python an array is called a list.
You can use any() to avoid writing a second loop or a regex.
>>> s = '''\
... a12345gfdg
... 111111111
... bbb789llll
... 777777777'''

>>> lst = ['12345', '789']
>>> for line in s.split('\n'):
...     if any(item in line for item in lst):
...         print(line)
...         
a12345gfdg
bbb789llll
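For comparison, the regex route (the join('|') idea from the first post) might look roughly like this. It is only a sketch using the same sample data; the names text, pattern and regex_matches are made up here, and re.escape is used on the assumption that the values should be matched literally:

```python
import re

lst = ['12345', '789']
text = "a12345gfdg\n111111111\nbbb789llll\n777777777"

# Build one alternation pattern: 12345|789
# re.escape guards against values that contain regex metacharacters.
pattern = re.compile("|".join(re.escape(item) for item in lst))

# Keep every line in which the pattern matches anywhere.
regex_matches = [line for line in text.split("\n") if pattern.search(line)]
print(regex_matches)
```

Both versions do substring matching, so '12345' would also hit a line containing '9123456'. If only whole numbers should match, wrapping the alternation in \b word boundaries is one option, though that would no longer match a line like a12345gfdg.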
Reply
#6
thanks snippsat, appreciate the info. i now have the code that will read the file and create the list, then search the second file and print the lines matching the values from the list. now i need to insert the # at the beginning of each matching line. i've read that once i open a file using 'w' the original file is deleted? not sure how i go about editing the existing file?
Reply
#7
(Jun-07-2017, 06:56 PM)theturd Wrote: not sure how i go about editing the existing file?
"Editing" is not easy. You can read, and you can write. If you don't want to lose the whole file, then write back what you want to keep.

If you want to try "editing", then open the file using "r+", and use seek() to move to where in the file you want to write.
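A rough sketch of the "r+" idea, assuming Python 3 and a file small enough to read into memory. Writing at a seek() position overwrites bytes rather than inserting them, so this version reads all the lines first, seeks back to the start, and rewrites the whole file. The function name comment_matching_lines and the demo file contents are made up for illustration:

```python
import os
import tempfile


def comment_matching_lines(path, values):
    """Prefix '#' to every line of `path` that contains any of `values`."""
    with open(path, "r+") as fh:
        lines = fh.readlines()   # read everything before changing anything
        fh.seek(0)               # jump back to the start of the file
        for line in lines:
            if any(v in line for v in values):
                line = "#" + line
            fh.write(line)
        fh.truncate()            # drop leftovers if the new content is shorter


# Demo on a throwaway file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as fh:
    fh.write("a12345gfdg\n111111111\nbbb789llll\n777777777\n")

comment_matching_lines(demo_path, ["12345", "789"])

with open(demo_path) as fh:
    result = fh.read()
os.remove(demo_path)
print(result)
```

If the script is interrupted halfway through the rewrite, the file can be left in a broken state, which is one reason writing to a separate file first is often preferred.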
Reply
#8
(Jun-07-2017, 06:56 PM)theturd Wrote: now i need to insert the # at the beginning of each matching line. i've read that once i open a file using 'w' the original file is deleted? not sure how i go about editing the existing file?
As mentioned by nilamo, it can be a little tricky.
You can seek and edit,
or write to a new file, delete the old file, and rename the new file to the old name.
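The write-then-rename route could be sketched like this, assuming Python 3.3+ for os.replace. The function name comment_lines_via_temp and the demo data are invented for the example:

```python
import os
import shutil
import tempfile


def comment_lines_via_temp(path, values):
    """Write a commented copy of `path` to a temp file, then swap it in."""
    # Create the temp file next to the original so the final rename
    # stays on the same filesystem.
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "w") as dst, open(path) as src:
        for line in src:
            if any(v in line for v in values):
                line = "#" + line
            dst.write(line)
    # Replaces the old file with the new one in a single step.
    os.replace(tmp_path, path)


# Demo in a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "foo.txt")
with open(demo_file, "w") as fh:
    fh.write("a12345gfdg\n111111111\nbbb789llll\n777777777\n")

comment_lines_via_temp(demo_file, ["12345", "789"])

with open(demo_file) as fh:
    new_text = fh.read()
remaining = os.listdir(demo_dir)
shutil.rmtree(demo_dir)
print(new_text)
```

One caveat with this sketch: the replacement file gets the temp file's permissions, not those of the original.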

There is fileinput in the standard library that can help with this.
It does what I described above when you pass inplace=True.

foo.txt:
a12345gfdg
111111111
bbb789llll
777777777

import fileinput
import sys

def insert_to_line(f_name, lst):
    # With inplace=True, standard output is redirected into the file,
    # so whatever is written here replaces the original line.
    for line in fileinput.input(f_name, inplace=True):
        if any(item in line for item in lst):
            sys.stdout.write('# {}'.format(line))
        else:
            sys.stdout.write(line)

if __name__ == '__main__':
    lst = ['12345', '789']
    f_name = 'foo.txt'
    insert_to_line(f_name, lst)
foo.txt after running the script:
# a12345gfdg
111111111
# bbb789llll
777777777
Reply
#9
thanks snippsat, i was able to make that work. i appreciate all the help, was a great learning experience.
Reply

