Python Forum
Help me with python read file and save file - Printable Version


Help me with python read file and save file - wereak - Apr-02-2019

Each line holds these fields: ID, email, random letter, password, random letter, title, random letter, and the domain the email account is registered with.

Line 7 has the value null, so if a null value is found, skip that line and continue.

----------------------------------------------------------------------------------------
Filename input.txt

"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"

"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"

----------------------------------------------------------------------------------------

Expected output, saved to another file, output.txt:
Email : Password : Email Account Register with



[email protected]:password1:yahoo.com
[email protected]:password2:mail.bg
[email protected]:password3:ya.ru
[email protected]:password4:yandex.ua
[email protected]:password5:yandex.by
[email protected]:password6:yahoo.com
[email protected]:password8:gmail.com
demo9_email@Tut.by:password9:tut.by
[email protected]:password10:t-online.de

Use the function findall to match the email:
re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)
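
One thing to keep in mind with this pattern: when a pattern contains capturing groups, re.findall returns a tuple of groups per match rather than the matched text. A minimal sketch, assuming a made-up address (the real ones are masked in the sample data) and using non-capturing (?:...) groups as one possible way around it:

```python
import re

# Hypothetical sample line; the address is invented for illustration.
line = '"1":"demo1.email@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"'

# With capturing groups, findall returns one tuple of groups per match.
grouped = re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)

# With non-capturing groups (?:...), findall returns the whole match as a string.
plain = re.findall(r'[a-z0-9]+(?:\.[a-z0-9]+)*@[a-z]+(?:\.[a-z]+)+', line)

print(grouped)  # list of tuples; the first element of each tuple is the address
print(plain)    # list of plain strings
```

The character classes are lowercase-only and exclude the underscore, so an address such as demo9_email@Tut.by would not match as written.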


RE: Help me with python read file and save file - Yoriz - Apr-02-2019

reading-and-writing-files
When you get stuck, post the specific part you're stuck on, with code in python code tags and any errors received in error tags.


RE: Help me with python read file and save file - wereak - Apr-02-2019

I know some basics of reading with readlines and readline, but I am confused about how to write code that takes what is needed and saves it to the other file.

Open the file input.txt
Read the file input.txt
Use findall to search for the email, password and Email Account Register with.
Write the result to another file, output.txt

I am stuck here:
file = open("input.txt", "r")
print("Name of the file is : ", file.name)
print(file.read())

#re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)



RE: Help me with python read file and save file - chisox721 - Apr-02-2019

I'm not exactly sure what you're trying to do but seems like you'd be best served using Pandas to get the data cleaned up. Could be overkill but that's what I'd do.


RE: Help me with python read file and save file - perfringo - Apr-02-2019

Little mental exercise: data cleaning with list comprehension, string methods and indexing:

In [1]: lst = [ 
   ...: '"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"',
   ...: '"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"',
   ...: '"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"'
   ...: ]

In [2]: [row.split(':')[i].split(',')[0].strip('"') for row in lst for i in [1, 2, 4]]
Out[2]: 
['[email protected]',
 'password1',
 'yahoo.com',
 '[email protected]',
 'password2',
 'mail.bg',
 '[email protected]',
 'password3',
 'ya.ru'] 
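
The flat list above can be regrouped into one email:password:domain line per row. A minimal sketch using the same split-and-index idea; the addresses are made up, since the real ones are masked, and the names fields_per_row and output_lines are only illustrative:

```python
# Hypothetical rows in the same layout as the sample data.
lst = [
    '"1":"demo1@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"',
    '"2":"demo2@mail.bg","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"',
]

# Build one inner list per row instead of one flat list for all rows.
fields_per_row = [
    [row.split(':')[i].split(',')[0].strip('"') for i in (1, 2, 4)]
    for row in lst
]

# Join each row's three fields with ':' to get the desired output format.
output_lines = [':'.join(fields) for fields in fields_per_row]

print('\n'.join(output_lines))
```

A row with fewer fields, such as the one ending in null, would raise IndexError here and would need to be skipped separately.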



RE: Help me with python read file and save file - wereak - Apr-02-2019

import json

file = open("input.txt", "r")
print ("Name of the file is : ",file.name)
print(file.read())

with open('input.txt') as f:
    read_data = f.read()
file.closed

#a for loop will be created to look for emails, password and email associate with
#re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)

#trying to read the file from input.txt and safe the output to output.txt with the help of json.dumps
file2 = open("output.txt", "rb+")
json.dumps(file)


file2.closed
Traceback (most recent call last):
  File "p.py", line 16, in <module>
    json.dumps(file)
  File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'TextIOWrapper' is not JSON serializable
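
The TypeError comes from passing the open file object itself to json.dumps, which only serializes plain data such as strings, lists and dicts. A small sketch, with a hypothetical string standing in for what f.read() would return:

```python
import json

# Stand-in for the text returned by f.read(); the content is made up.
read_data = '"1":"a","p":"b"\n"2":"c","p":"d"\n'

# A list of strings is JSON serializable, an open file object is not.
lines = read_data.splitlines()
encoded = json.dumps(lines)

print(encoded)
```

Since the desired output.txt is plain text rather than JSON, the json module may not be needed at all. Note also that file.closed is only an attribute reporting whether the file is closed (file.close() is what closes it), and that mode "rb+" opens an existing file in binary mode, while "w" creates or truncates a text file.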

This is the file called input.txt
Quote:"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"

By default, output.txt is an empty file. I want to grab the email using the function
Quote:findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)
and also grab the password and the last field of each line. After the scraping, the output should look like this:

Quote:[email protected]:password1:yahoo.com
[email protected]:password2:mail.bg
[email protected]:password3:ya.ru
[email protected]:password4:yandex.ua
[email protected]:password5:yandex.by
[email protected]:password6:yahoo.com
[email protected]:password8:gmail.com
demo9_email@Tut.by:password9:tut.by
[email protected]:password10:t-online.de

Line 7 has the value null, so if a null value is found, skip the line and continue outputting the other lines.
Line 7 has been removed because it contains the null value:
Quote:"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null



RE: Help me with python read file and save file - DeaD_EyE - Apr-02-2019

Just using Pandas solves the current problem, but does not improve the knowledge about Python.
You should know how to iterate over lines, how to work with split and replace etc.

Edit: I made a mistake here. I took the wrong field. Try to correct this. It's not difficult.
import io

def reader(file):
    try:
        next(file) # skip header
    except StopIteration:
        print('File is empty')
        return # Return inside a generator stops the iteration of the generator
    for row in file:
        try:
            row = [
                value.replace('"', '').strip()
                for item in row.split(',')
                for value in item.split(':')
                ]
            email, domain, password = row[1], row[3], row[-1]
            yield email, domain, password
        except IndexError:
            continue


def data_printer(file):
    for email, domain, password in reader(file):
        print(f'{email}@{domain}:{password}')


input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"

"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''


line_reader = io.StringIO(input_data)
# using the string as a file-like object
data_printer(line_reader)

# but it can also be used with normal files
with open('somefile.txt') as fd:
    data_printer(fd)

Most of the work happens here:

            row = [
                value.replace('"', '').strip()
                for item in row.split(',')
                for value in item.split(':')
                ]
Simplified as a nested loop:


# row is one line of text; collect the cleaned values in a separate list
values = []
for item in row.split(','):
    for value in item.split(':'):
        value = value.replace('"', '').strip()
        values.append(value)

Combined together with iteration over the lines:


for line in input_data.splitlines():
    row = []
    for item in line.split(','):
        for value in item.split(':'):
            value = value.replace('"', '').strip()
            row.append(value)
    print(row)

# or if input_data is a file-object opened in text mode


for line in input_data:
    row = []
    # row seems to have , and : as delimiter for data
    # first level, split by , 
    for item in line.split(','):
        # each item is a string, which could contain a :
        # split by :
        for value in item.split(':'):
            # remove the quoting and then white space at the beginning and end of the str
            value = value.replace('"', '').strip()
            # append the result to the list row
            row.append(value)
    # print the current result of the line
    # pay attention to indentation:
    # this kind of nested loop easily leads to indentation errors
    print(row)
I hope I haven't done your homework.
By the way, it's a little bit strange that the input data is delimited by both , and :.
Using regex is not always the best solution.


RE: Help me with python read file and save file - wereak - Apr-02-2019

(Apr-02-2019, 08:17 AM)DeaD_EyE Wrote: Just using Pandas solves the current problem, but does not improve the knowledge about Python.

Thank you, sir. It gave me some new ideas, but it does not solve the current issue: I have to read the text from the file, and the solution needs to use the built-in re module (regex), because an email ID might also look like [email protected]


This is what I have come up with:
import re
# multiple files at a time
with open('input.txt', 'r') as rf:  # rf: file to read from
    # read from rf and save the output to wf
    with open('output.txt', 'w') as wf:  # wf: file to write to
        for line in rf:
            # find all the lines containing an email
            if re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", line):
                print(line.strip())
Now I need to figure out how to extract the password, delimited by ":", and the domain from "d":"mail.com".

input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
 
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''
Can we remove this and read the "input.txt" file instead, which will contain all the details?

And output the result at
# but it can also be used with normal files
with open('somefile.txt') as fd:
    data_printer(fd)



RE: Help me with python read file and save file - perfringo - Apr-02-2019

(Apr-02-2019, 09:28 AM)wereak Wrote:
re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')",line)

Does this regex deliver the expected results? A casual check on regex101.com shows that it matches "r'[email protected]'".

re.compile("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", re.DEBUG)
SUBPATTERN 1 0 0
  LITERAL 114     # r 
  LITERAL 39      # '
  SUBPATTERN 2 0 0
    MAX_REPEAT 1 MAXREPEAT
      IN
        RANGE (97, 122)
        RANGE (48, 57)
    MAX_REPEAT 0 MAXREPEAT
      SUBPATTERN 3 0 0
        LITERAL 46
        MAX_REPEAT 1 MAXREPEAT
          IN
            RANGE (97, 122)
            RANGE (48, 57)
/../
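
The DEBUG output shows the cause: the r and the quote are inside the pattern string, so they are matched as literal characters. A minimal sketch of the difference, assuming a made-up address because the real ones are masked:

```python
import re

# Hypothetical sample line; the address is invented for illustration.
line = '"1":"demo1@yahoo.com","p":"password1"'

# Here r' sits inside the double-quoted string, so the regex engine looks for
# a literal r followed by a literal ' in the text. Backslashes are doubled
# only so that this ordinary string holds the same pattern as the quoted call.
wrong = re.findall("(r'([a-z0-9]+(\\.[a-z0-9]+)*@[a-z]+(\\.[a-z]+)+)')", line)

# Here r is the raw-string prefix of the Python literal, not part of the pattern.
right = re.search(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)

print(wrong)           # nothing found, the line contains no r'...' text
print(right.group(0))  # the matched address
```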



RE: Help me with python read file and save file - wereak - Apr-02-2019

https://regex101.com

([a-z0-9-]+[a-z0-9_]+[a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+))
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
"11":"[email protected]","p":"password11","r":"PYTHON DEMO OUTPUT,"d":"mail.co.uk"
"12":"[email protected]","p":"password12","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
"13":"[email protected]","p":"password13","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
Now I am trying to match [email protected] and grab the password and the last field, which is mail.ru etc.
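
One possible way to approach this, as a sketch rather than a definitive answer: search each line separately for the address, the "p" field and the "d" field, and skip any line where one of them is missing (which also covers the null line). The patterns EMAIL_RE, PASSWORD_RE and DOMAIN_RE and the function extract are hypothetical names, and the sample addresses are made up because the real ones are masked:

```python
import io
import re

# Assumed patterns: a permissive address match plus the quoted "p" and "d" fields.
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')
PASSWORD_RE = re.compile(r'"p":"([^"]*)"')
DOMAIN_RE = re.compile(r'"d":"([^"]*)"')


def extract(lines):
    """Yield 'email:password:domain' for every line that has all three parts."""
    for line in lines:
        email = EMAIL_RE.search(line)
        password = PASSWORD_RE.search(line)
        domain = DOMAIN_RE.search(line)
        # The header, blank lines and the "d":null line lack at least one part.
        if email and password and domain:
            yield f'{email.group(0)}:{password.group(1)}:{domain.group(1)}'


# io.StringIO stands in for an open text file so the sketch runs on its own.
sample = io.StringIO(
    '"ID":"Email","random letter":"Password","random letter":"Title",'
    '"random letter":"Email Account Register with"\n'
    '\n'
    '"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null\n'
    '"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"\n'
    '"13":"demo13.email@mail.ru","p":"password13","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"\n'
)
result = list(extract(sample))
print('\n'.join(result))

# With real files, the same generator could be used like this:
# with open('input.txt') as rf, open('output.txt', 'w') as wf:
#     for record in extract(rf):
#         wf.write(record + '\n')
```

Matching each field on its own keeps the patterns short and tolerates the missing quote after PYTHON DEMO OUTPUT in some rows.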