Posts: 5
Threads: 1
Joined: Apr 2019
Each line of the input holds these fields: ID, email, random letter, password, random letter, title, random letter, and the domain the email account is registered with.
Line 7 has the value null, so if a null value is found, skip that line and continue.
----------------------------------------------------------------------------------------
Filename input.txt
"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
"1":"demo1_email@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"demo2_email@mail.bg","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"demo3_email@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"demo4_email@yandex.ua","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"demo5_email@yandex.by","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"demo6_email@yahoo.com","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"demo8_email@gmail.com","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@ Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"demo10_email@t-online.de","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
----------------------------------------------------------------------------------------
Output, saved to another file, output.txt
Email : Password : Email Account Register with
demo1_email@yahoo.com:password1:yahoo.com
demo2_email@mail.bg:password2:mail.bg
demo3_email@ya.ru:password3:ya.ru
demo4_email@yandex.ua:password4:yandex.ua
demo5_email@yandex.by:password5:yandex.by
demo6_email@yahoo.com:password6:yahoo.com
demo8_email@gmail.com:password8:gmail.com
demo9_email@ Tut.by:password9:tut.by
demo10_email@t-online.de:password10:t-online.de
Use the function findall to match the email:
re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)
Posts: 2,168
Threads: 35
Joined: Sep 2016
reading-and-writing-files
When you get stuck, post the specific part you're stuck on, with the code in python code tags and any errors received in error tags.
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 05:43 AM
(This post was last modified: Apr-02-2019, 06:01 AM by wereak.)
I know some basics about readlines and readline, but I am confused about how to write code that takes what is needed and saves it to the other file.
Open the file input.txt
Read the file input.txt
Use findall to search for the email, the password and the domain the email account is registered with.
Write the result to another file, output.txt
I am stuck here:
file = open("input.txt", "r")
print("Name of the file is:", file.name)
print(file.read())
Posts: 28
Threads: 8
Joined: Oct 2017
I'm not exactly sure what you're trying to do but seems like you'd be best served using Pandas to get the data cleaned up. Could be overkill but that's what I'd do.
Posts: 1,950
Threads: 8
Joined: Jun 2018
Little mental exercise: data cleaning with list comprehension, string methods and indexing:
In [1]: lst = [
   ...:     '"1":"demo1_email@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"',
   ...:     '"2":"demo2_email@mail.bg","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"',
   ...:     '"3":"demo3_email@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"'
   ...: ]
In [2]: [row.split(':')[i].split(',')[0].strip('"') for row in lst for i in [1, 2, 4]]
Out[2]:
['demo1_email@yahoo.com',
 'password1',
 'yahoo.com',
 'demo2_email@mail.bg',
 'password2',
 'mail.bg',
 'demo3_email@ya.ru',
 'password3',
 'ya.ru']
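The same split-and-index idea can be stretched to a whole file. This is only a minimal sketch, assuming the layout shown in input.txt (a header line starting with "ID", and null marking rows to skip); extract_fields and convert are hypothetical names, not something from the posts above.

```python
def extract_fields(row):
    """Return (email, password, domain) from one data line, or None to skip it."""
    parts = row.strip().split(':')
    # A complete row splits into 5 parts on ':'; the null row has only 4.
    if len(parts) != 5 or parts[-1] == 'null':
        return None
    return tuple(parts[i].split(',')[0].strip('"') for i in (1, 2, 4))


def convert(lines):
    """Yield 'email:password:domain' strings, skipping the header and bad rows."""
    for row in lines:
        if row.startswith('"ID"'):  # assumed header marker
            continue
        fields = extract_fields(row)
        if fields is not None:
            yield ':'.join(fields)
```

With real files, something like `with open('input.txt') as src, open('output.txt', 'w') as dst: dst.writelines(s + '\n' for s in convert(src))` should do the writing part. Note that this breaks if a password or email itself contains a colon.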
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 07:02 AM
(This post was last modified: Apr-02-2019, 07:09 AM by wereak.)
import json

file = open("input.txt", "r")
print("Name of the file is : ", file.name)
print(file.read())

with open('input.txt') as f:
    read_data = f.read()
file.closed

file2 = open("output.txt", "rb+")
json.dumps(file)
file2.closed
Traceback (most recent call last):
  File "p.py", line 16, in <module>
    json.dumps(file)
  File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'TextIOWrapper' is not JSON serializable
This is the file called input.txt
Quote:"1":"demo1_email@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"demo2_email@mail.bg","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"demo3_email@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"demo4_email@yandex.ua","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"demo5_email@yandex.by","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"demo6_email@yahoo.com","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"demo8_email@gmail.com","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"demo10_email@t-online.de","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
By default output.txt is an empty file. I want to grab the email using the function
Quote:findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)
and also grab the password and the last field. So after the scraping the output should look like this:
Quote:demo1_email@yahoo.com:password1:yahoo.com
demo2_email@mail.bg:password2:mail.bg
demo3_email@ya.ru:password3:ya.ru
demo4_email@yandex.ua:password4:yandex.ua
demo5_email@yandex.by:password5:yandex.by
demo6_email@yahoo.com:password6:yahoo.com
demo8_email@gmail.com:password8:gmail.com
demo9_email@Tut.by:password9:tut.by
demo10_email@t-online.de:password10:t-online.de
Line 7 has the value null, so if a null value is found, skip that line and continue outputting the other lines.
Line 7 has been removed from the output because it contains the null value:
Quote:"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
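About the traceback: json.dumps turns a Python object into a JSON string, and an open file object is not something it can serialize, which is what the TypeError says. Since every data line looks like the inside of a JSON object, one option is to wrap each line in braces and parse it with json.loads. A rough sketch under that assumption follows; parse_line is a hypothetical helper name. Be aware that lines with a missing quote (rows 3 and 10 as pasted above) are not valid JSON and would be skipped by this approach, so it only fits if the real file is well formed.

```python
import json


def parse_line(line):
    """Parse one line as a JSON object; return 'email:password:domain' or None."""
    try:
        record = json.loads('{' + line.strip() + '}')
    except json.JSONDecodeError:
        return None  # not valid JSON, e.g. a missing quote
    domain = record.get('d')
    password = record.get('p')
    if domain is None or password is None:
        return None  # the null row, the header, or a row without a password
    # The email is the value stored under the numeric ID key.
    email = next((v for k, v in record.items() if k.isdigit()), None)
    if email is None:
        return None
    return f'{email}:{password}:{domain}'
```

Each non-None result could then be written to output.txt followed by a newline.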
Posts: 2,125
Threads: 11
Joined: May 2017
Apr-02-2019, 08:17 AM
(This post was last modified: Apr-02-2019, 08:17 AM by DeaD_EyE.)
Just using Pandas solves the current problem, but it does not improve your knowledge of Python.
You should know how to iterate over lines, how to work with split and replace, etc.
Edit: I made a mistake here. I took the wrong field. Try to correct this. It's not difficult.
import io


def reader(file):
    # Skip the header line; stop early if the file is empty.
    try:
        next(file)
    except StopIteration:
        print('File is empty')
        return
    for row in file:
        try:
            row = [
                value.replace('"', '').strip()
                for item in row.split(',')
                for value in item.split(':')
            ]
            email, domain, password = row[1], row[3], row[-1]
            yield email, domain, password
        except IndexError:
            continue


def data_printer(file):
    for email, domain, password in reader(file):
        print(f'{email}@{domain}:{password}')


# Sample lines from input.txt (header first); paste the full data here.
input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
"1":"demo1_email@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"demo2_email@mail.bg","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
'''
line_reader = io.StringIO(input_data)
data_printer(line_reader)

with open('somefile.txt') as fd:
    data_printer(fd)
Most of the work happens here:
row = [
    value.replace('"', '').strip()
    for item in row.split(',')
    for value in item.split(':')
]
Simplified as a nested loop:
values = []
for item in row.split(','):
    for value in item.split(':'):
        value = value.replace('"', '').strip()
        values.append(value)
row = values
Combined together with iteration over the lines:
for line in input_data.splitlines():
    row = []
    for item in line.split(','):
        for value in item.split(':'):
            value = value.replace('"', '').strip()
            row.append(value)
    print(row)
I hope I haven't done your homework.
By the way, it's a little bit strange that the input data is delimited by both , and : .
Using regex is not always the best solution.
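For anyone who wants to check their answer to the field mix-up mentioned in the edit note, here is one possible correction, as a sketch only. After the double split the values alternate key, value, so the email sits at index 1, the password at index 3 and the domain at the end. parse_row and convert are hypothetical names, and the null check is an assumption based on the sample data.

```python
def parse_row(row):
    """Split one data line into (email, password, domain), or None to skip it."""
    values = [
        value.replace('"', '').strip()
        for item in row.split(',')
        for value in item.split(':')
    ]
    # A full row flattens to 8 values:
    # id, email, 'p', password, 'r', title, 'd', domain.
    if len(values) != 8 or values[-1] == 'null':
        return None
    return values[1], values[3], values[-1]


def convert(lines):
    """Yield 'email:password:domain' strings; the first line is treated as a header."""
    rows = iter(lines)
    next(rows, None)  # skip the header line, if any
    for row in rows:
        parsed = parse_row(row)
        if parsed is not None:
            yield ':'.join(parsed)
```

Writing to a file would then be a matter of opening output.txt with mode 'w' and writing each yielded string plus a newline. The caveat from above still applies: a comma or colon inside an email or password would throw the indexes off.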
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 09:28 AM
(This post was last modified: Apr-02-2019, 09:47 AM by wereak.)
Thank you, Sir. It gave me some new ideas, but it does not solve the current issue, since I have to read the text from the file, and it needs to use the built-in re module (RegEx): what if the email ID is something like Dead_eye@mail.co.uk?
This is what I have come up with:
import re

with open('input.txt', 'r') as rf:
    with open('output.txt', 'w') as wf:
        for line in rf:
            if re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", line):
                print(line.strip())
Now I need to figure out how to get the password, which follows the delimiter ":", and the domain in "d":"mail.com".
Can we remove the input_data string from your code and read the "input.txt" file instead, which will contain all the details?
And output the result at this point:
with open('somefile.txt') as fd:
    data_printer(fd)
Posts: 1,950
Threads: 8
Joined: Jun 2018
(Apr-02-2019, 09:28 AM)wereak Wrote:
re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", line)
Does this regex deliver the expected results? A quick check on regex101.com shows that it matches "r'a1.b2@c.d'".
re.compile("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", re.DEBUG)
SUBPATTERN 1 0 0
  LITERAL 114
  LITERAL 39
  SUBPATTERN 2 0 0
    MAX_REPEAT 1 MAXREPEAT
      IN
        RANGE (97, 122)
        RANGE (48, 57)
    MAX_REPEAT 0 MAXREPEAT
      SUBPATTERN 3 0 0
        LITERAL 46
        MAX_REPEAT 1 MAXREPEAT
          IN
            RANGE (97, 122)
            RANGE (48, 57)
/../
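To spell out what the debug output shows: LITERAL 114 and LITERAL 39 are the characters r and ', so the pattern only matches text that literally contains r' in front of the address. The raw-string prefix has to sit outside the quotes. A small comparison, as a sketch (the variable names are made up for illustration, and the non-capturing groups are my change so that findall returns whole matches instead of tuples):

```python
import re

line = '"1":"demo1_email@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"'

# As posted: the r and the quote are part of the pattern text itself.
quoted = re.compile("(r'([a-z0-9]+(\\.[a-z0-9]+)*@[a-z]+(\\.[a-z]+)+)')")

# Raw-string prefix outside the quotes, groups made non-capturing.
fixed = re.compile(r'[a-z0-9]+(?:\.[a-z0-9]+)*@[a-z]+(?:\.[a-z]+)+')

print(quoted.findall(line))  # [] because the line has no literal r'
print(fixed.findall(line))   # ['email@yahoo.com']
```

Note that even the fixed pattern only returns email@yahoo.com, because the underscore is not in [a-z0-9], so the character class still needs widening.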
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 04:17 PM
(This post was last modified: Apr-02-2019, 04:17 PM by wereak.)
https://regex101.com
([a-z0-9-]+[a-z0-9_]+[a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+))
"1":"demo1_email@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"demo2_email@mail.bg","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"demo3_email@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"demo4_email@yandex.ua","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"demo5_email@yandex.by","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"demo6_email@yahoo.com","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"demo8_email@gmail.com","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"demo10_email@t-online.de","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
"11":"demo11_email@mail.co.uk","p":"password11","r":"PYTHON DEMO OUTPUT,"d":"mail.co.uk"
"12":"demo-12@mail.ru","p":"password12","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
"13":"ABCDEFGHIGKLMNOPQUSTUVWXYZ@mail.ru","p":"password13","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
Now I am trying to match ABCDEFGHIGKLMNOPQUSTUVWXYZ@mail.ru and grab the password and the last field, which is mail.ru, etc.
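One way to cover uppercase letters, hyphens and multi-part domains is to stop describing what an email looks like and instead match the structure of the line, using named groups for the three wanted fields. This is only a sketch, assuming every wanted row has the shape "<id>":"<email>","p":"<password>", ... ,"d":"<domain>"; ROW and extract are hypothetical names.

```python
import re

# Assumed row layout: "<id>":"<email>","p":"<password>", ... ,"d":"<domain>"
ROW = re.compile(
    r'^"\d+":"(?P<email>[^"@\s]+@[^"@\s]+)",'   # anything with one @, no quotes or spaces
    r'"p":"(?P<password>[^"]*)",'
    r'.*"d":"(?P<domain>[^"]+)"\s*$'            # a null domain has no quotes, so it won't match
)


def extract(line):
    """Return 'email:password:domain' for a matching line, else None."""
    m = ROW.match(line)
    if m is None:
        return None
    return '{email}:{password}:{domain}'.format(**m.groupdict())
```

Rows without an @ in the second field or with a null domain (row 7) simply do not match and can be skipped; the header line is skipped too, because its first key is not a number.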