Python Forum
Help me with python read file and save file
#1
Each line of input.txt holds these fields: ID, email, random letter, password, random letter, title, random letter, and the domain the email account is registered with.

Line 7 has the value null, so if a null value is found, skip that line and continue.

----------------------------------------------------------------------------------------
Filename input.txt

"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"

"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"

----------------------------------------------------------------------------------------

Output, saved to another file (output.txt), in the form:
Email : Password : Email Account Register with



[email protected]:password1:yahoo.com
[email protected]:password2:mail.bg
[email protected]:password3:ya.ru
[email protected]:password4:yandex.ua
[email protected]:password5:yandex.by
[email protected]:password6:yahoo.com
[email protected]:password8:gmail.com
demo9_email@Tut.by:password9:tut.by
[email protected]:password10:t-online.de

Use the function findall to match the email:
re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)
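One thing to watch with `re.findall`: when the pattern contains capture groups, it returns tuples of groups rather than whole matches. The sketch below uses an invented sample line, and the wider character class plus `re.IGNORECASE` are assumptions about what the data needs, not part of the original pattern.

```python
import re

# Hypothetical sample line; the address is invented for illustration.
line = '"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"'

# Original pattern: [a-z0-9]+ matches neither "_" nor the upper-case "T",
# and the capture groups would make findall return tuples of groups.
original = r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)'
print(re.findall(original, line))

# Assumed variant: non-capturing groups (?:...) so findall returns whole
# matches, a wider character class, and case-insensitive matching.
EMAIL_RE = re.compile(r'[a-z0-9_.-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+', re.IGNORECASE)
emails = EMAIL_RE.findall(line)
print(emails)
```

With this sample the original pattern finds nothing, while the assumed variant returns the full address as a plain string.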
Reply
#2
reading-and-writing-files
When you get stuck, post the specific part you're stuck on, with code in python code tags and any errors received in error tags.
Reply
#3
I know some basics of readline and readlines, but I am confused about how to write code that takes what is needed and saves it to the other file.

Open the file input.txt
Read the file input.txt
Use findall to search for the email, the password, and the domain the email account is registered with.
Write the output to another file, output.txt

I am stuck here:
file = open("input.txt", "r")
print("Name of the file is : ", file.name)
print(file.read())

#re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)
Reply
#4
I'm not exactly sure what you're trying to do but seems like you'd be best served using Pandas to get the data cleaned up. Could be overkill but that's what I'd do.
Reply
#5
Little mental exercise: data cleaning with list comprehension, string methods and indexing:

In [1]: lst = [ 
   ...: '"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"',
   ...: '"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"',
   ...: '"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"'
   ...: ]

In [2]: [row.split(':')[i].split(',')[0].strip('"') for row in lst for i in [1, 2, 4]]
Out[2]: 
['[email protected]',
 'password1',
 'yahoo.com',
 '[email protected]',
 'password2',
 'mail.bg',
 '[email protected]',
 'password3',
 'ya.ru'] 
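The same split-and-strip idea can produce one colon-joined string per row instead of a flat list. This is only a sketch with invented sample addresses; it assumes every row has a password field, so a row like line 7 of the input would raise IndexError here.

```python
# Hypothetical sample rows; the addresses are invented for illustration.
lst = [
    '"1":"demo1@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"',
    '"3":"demo3@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"',
]

# Pick fields 1, 2 and 4 of each row and join them with ":".
rows = [
    ':'.join(row.split(':')[i].split(',')[0].strip('"') for i in (1, 2, 4))
    for row in lst
]
print(rows)
```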
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Reply
#6
import json

file = open("input.txt", "r")
print ("Name of the file is : ",file.name)
print(file.read())

with open('input.txt') as f:
    read_data = f.read()
file.closed

#a for loop will be created to look for emails, password and email associate with
#re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)

#trying to read the file from input.txt and save the output to output.txt with the help of json.dumps
file2 = open("output.txt", "rb+")
json.dumps(file)


file2.closed
Traceback (most recent call last):
  File "p.py", line 16, in <module>
    json.dumps(file)
  File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'TextIOWrapper' is not JSON serializable
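
The traceback above arises because `json.dumps` was handed the open file object rather than data: it serializes Python objects such as dicts, lists and strings. As a rough sketch, a well-formed line can be wrapped in braces and parsed with `json.loads`, but lines with a missing quote (like lines 3 and 10) are not valid JSON. The helper name `parse` and the sample addresses are invented for illustration.

```python
import json

# Hypothetical sample lines; the addresses are invented for illustration.
good = '"1":"demo1@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"'
bad = '"3":"demo3@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"'

def parse(line):
    """Return the line as a dict, or None if it is not valid JSON."""
    try:
        return json.loads('{' + line + '}')
    except json.JSONDecodeError:
        return None

record = parse(good)
print(record["p"], record["d"])
print(parse(bad))  # the missing quote after OUTPUT makes this line invalid

# json.dumps works on data (dict, list, str, ...), not on file objects.
print(json.dumps(record))
```

Because the broken lines cannot be parsed this way, a JSON-based approach alone would drop lines that the expected output keeps.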

This is the file called input.txt
Quote:
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"

By default, output.txt is an empty file. I want to grab the email using the function
Quote:findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)
and also grab the password and the last field. After the scraping, the output should look like this:

Quote:
[email protected]:password1:yahoo.com
[email protected]:password2:mail.bg
[email protected]:password3:ya.ru
[email protected]:password4:yandex.ua
[email protected]:password5:yandex.by
[email protected]:password6:yahoo.com
[email protected]:password8:gmail.com
demo9_email@Tut.by:password9:tut.by
[email protected]:password10:t-online.de

Line 7 has the value null, so when a null value is found, skip the line and continue outputting the other lines.
Line 7 has been removed because it contains the null value:
Quote:
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
Reply
#7
Just using Pandas solves the current problem, but does not improve your knowledge of Python.
You should know how to iterate over lines, how to work with split and replace, etc.

Edit: I made a mistake here: I took the wrong field. Try to correct it. It's not difficult.
import io

def reader(file):
    try:
        next(file) # skip header
    except StopIteration:
        print('File is empty')
        return # Return inside a generator stops the iteration of the generator
    for row in file:
        try:
            row = [
                value.replace('"', '').strip()
                for item in row.split(',')
                for value in item.split(':')
                ]
            email, domain, password = row[1], row[3], row[-1]
            yield email, domain, password
        except IndexError:
            continue


def data_printer(file):
    for email, domain, password in reader(file):
        print(f'{email}@{domain}:{password}')


input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"

"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''


line_reader = io.StringIO(input_data)
# using the string as a file-like object
data_printer(line_reader)

# but it can also be used with normal files
with open('somefile.txt') as fd:
    data_printer(fd)
Most of the work happens here:

            row = [
                value.replace('"', '').strip()
                for item in row.split(',')
                for value in item.split(':')
                ]
Simplified as a nested loop:


# row is one line of text; collect the cleaned values in a separate list
values = []
for item in row.split(','):
    for value in item.split(':'):
        value = value.replace('"', '').strip()
        values.append(value)
Combined together with iteration over the lines:


for line in input_data.splitlines():
    row = []
    for item in line.split(','):
        for value in item.split(':'):
            value = value.replace('"', '').strip()
            row.append(value)
    print(row)

# or if input_data is a file-object opened in text mode


for line in input_data:
    row = []
    # row seems to have , and : as delimiter for data
    # first level, split by , 
    for item in line.split(','):
        # each item is a string, which could contain a :
        # split by :
        for value in item.split(':'):
            # remove the quoting and then white space at the beginning and end of the str
            value = value.replace('"', '').strip()
            # append the result to the list row
            row.append(value)
    # print the result for the current line
    # pay attention to indentation:
    # nested loops like this easily lead to indentation errors
    print(row)
I hope I haven't done your homework.
By the way, it's a little bit strange that the input data is delimited by both , and :.
Using regex is not always the best solution.
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
Reply
#8
(Apr-02-2019, 08:17 AM)DeaD_EyE Wrote: Just using Pandas solves the current problem, but does not improve the knowledge about Python.

Thank you, Sir. It did give me some new ideas, but it does not solve the current issue: I have to read the text from a file, and it needs to use the built-in re module (regex), since the email ID may also look like [email protected]


This is what I have come up with:
import re
#multiple files at a time
with open('input.txt','r') as rf: #rf read from
        #read file from rf and safe the output to wf write file
        with open('output.txt','w') as wf: #wf write file
                for line in rf:
#find all the lines containing the keyword This
                        if re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')",line):
                                print(line.strip())
Now I need to figure out how to extract the password (the ":"Password part) and the domain (the "d":"mail.com part).

input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
 
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''
Can we remove this and read the "input.txt" file instead, which will contain all the details?

And write the result to a file at this point:
# but it can used also with normal files
with open('somefile.txt') as fd:
    data_printer(fd)
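
One way this could look, as a sketch rather than a definitive answer: the inline `input_data` string is dropped and both files are opened together. The function name `convert` is invented here, the fields are taken in the order email, password, domain (`row[1]`, `row[3]`, `row[-1]`), and the demo writes invented sample data to a temporary directory so that no real file is overwritten.

```python
import os
import tempfile

def convert(src_path, dst_path):
    """Copy email:password:domain lines from src_path to dst_path."""
    written = 0
    with open(src_path) as src, open(dst_path, 'w') as dst:
        next(src, None)  # skip the header line
        for line in src:
            row = [
                value.replace('"', '').strip()
                for item in line.split(',')
                for value in item.split(':')
            ]
            if len(row) < 8 or row[-1] == 'null':
                continue  # blank line, missing password, or null domain
            dst.write(f'{row[1]}:{row[3]}:{row[-1]}\n')
            written += 1
    return written

# Demo in a temporary directory; the sample addresses are invented.
# With real data this would simply be convert('input.txt', 'output.txt').
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'input.txt')
    dst_path = os.path.join(tmp, 'output.txt')
    with open(src_path, 'w') as f:
        f.write('"ID":"Email","random letter":"Password"\n'
                '"1":"demo1@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"\n'
                '"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null\n')
    count = convert(src_path, dst_path)
    with open(dst_path) as f:
        output_text = f.read()
print(count, output_text)
```

This split-based parsing breaks if a password itself contains a comma or colon; a regex anchored on the `"p":` and `"d":` keys would be more robust in that case.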
Reply
#9
(Apr-02-2019, 09:28 AM)wereak Wrote:
re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')",line)

Does this regex deliver the expected results? A quick check on regex101.com shows that it matches "r'[email protected]'", that is, only text that literally starts with r' and ends with '.

re.compile("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", re.DEBUG)
SUBPATTERN 1 0 0
  LITERAL 114     # r 
  LITERAL 39      # '
  SUBPATTERN 2 0 0
    MAX_REPEAT 1 MAXREPEAT
      IN
        RANGE (97, 122)
        RANGE (48, 57)
    MAX_REPEAT 0 MAXREPEAT
      SUBPATTERN 3 0 0
        LITERAL 46
        MAX_REPEAT 1 MAXREPEAT
          IN
            RANGE (97, 122)
            RANGE (48, 57)
/../
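The DEBUG output shows `r` and `'` parsed as literal characters, because the raw-string prefix was typed inside the quotes. A small sketch of the difference, using an invented address; the backslashes in the first pattern are doubled only to keep it a normal (non-raw) string without escape warnings.

```python
import re

# Hypothetical sample text; the address is invented for illustration.
text = '"1":"demo1@yahoo.com","p":"password1"'

# Prefix inside the string: the pattern demands a literal r' before the address.
inside = "(r'([a-z0-9]+(\\.[a-z0-9]+)*@[a-z]+(\\.[a-z]+)+)')"
# Prefix outside the string: r marks a raw string and is not part of the pattern.
outside = r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)'

print(re.search(inside, text))             # no literal r' in the data
print(re.search(outside, text).group(0))   # the address itself
print(re.search(inside, "r'demo1@yahoo.com'").group(2))
```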
Reply
#10
https://regex101.com

([a-z0-9-]+[a-z0-9_]+[a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+))
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
"11":"[email protected]","p":"password11","r":"PYTHON DEMO OUTPUT,"d":"mail.co.uk"
"12":"[email protected]","p":"password12","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
"13":"[email protected]","p":"password13","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
Now I am trying to match [email protected] and grab the password and the last field, which is mail.ru, etc.
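
One possible way to pull all three fields with `re`, sketched under assumptions: the email is the first value on the line, the password follows `"p":"`, and the domain is the quoted value after `"d":`. The pattern name `LINE_RE` and the sample addresses are invented for illustration.

```python
import re

# Assumed layout: "<id>":"<email>","p":"<password>", ... ,"d":"<domain>"
LINE_RE = re.compile(
    r'^"\d+":"(?P<email>[^"@]+@[^"]+)"'   # first value must contain an @
    r',"p":"(?P<password>[^"]*)"'          # password follows "p":
    r'.*"d":"(?P<domain>[^"]+)"\s*$'       # domain is the quoted value after "d":
)

# Hypothetical sample lines; the addresses are invented for illustration.
lines = [
    '"11":"demo.11-email@mail.co.uk","p":"password11","r":"PYTHON DEMO OUTPUT,"d":"mail.co.uk"',
    '"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null',
    '"12":"demo_12@mail.ru","p":"password12","r":"PYTHON DEMO OUTPUT","d":"mail.ru"',
]

found = []
for line in lines:
    match = LINE_RE.match(line)
    if match:  # lines such as number 7 simply do not match
        found.append(':'.join(match.group('email', 'password', 'domain')))
print(found)
```

Matching on the keys rather than on a strict email grammar means dots, dashes and underscores in the address need no special handling, and the missing quote after OUTPUT does not matter.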
Reply

