Posts: 5
Threads: 1
Joined: Apr 2019
Each line has these fields: ID, email, random letter, password, random letter, title, random letter, and the domain the email account is registered with.
Line 7 has the value null, so if a null value is found, skip that line and continue.
----------------------------------------------------------------------------------------
File name: input.txt
"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
"1":" [email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":" [email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":" [email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":" [email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":" [email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":" [email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":" [email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@ Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":" [email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
----------------------------------------------------------------------------------------
Expected output, saved to another file, output.txt:
Email : Password : Email Account Register with
[email protected]:password1:yahoo.com
[email protected]:password2:ymail.bg
[email protected]:password3:ya.ru
[email protected]:password4:yandex.ua
[email protected]:password5:yandex.by
[email protected]:password6:yahoo.com
[email protected]:password8:gmail.com
demo9_email@ Tut.by:password9:tut.by
[email protected]:password10:t-online.de
Use the function findall to match the email:
re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)', line)
Posts: 2,168
Threads: 35
Joined: Sep 2016
reading-and-writing-files
When you get stuck, post the specific part you're stuck on, with code in python code tags and any errors received in error tags.
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 05:43 AM
(This post was last modified: Apr-02-2019, 06:01 AM by wereak.)
I know some basics about reading with readlines and readline, but I am confused about how to write code that takes what is needed and saves it to the other file.
Open the file input.txt
Read the file input.txt
Use findall to search for the email, the password, and the "Email Account Register with" value.
Write the result to another file, output.txt
I am stuck here:
file = open("input.txt", "r")
print("Name of the file is : ", file.name)
print(file.read())
#re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)
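The four steps listed above (open, read, filter, write) can be sketched roughly as follows. This is only a minimal illustration: the helper name copy_matching_lines, the throwaway temp files, and the example.com address are made up for the demo, and the filter here simply drops lines containing "null" rather than doing any regex matching.

```python
import os
import tempfile


def copy_matching_lines(src_path, dst_path, keep):
    """Write every line of src_path for which keep(line) is true to dst_path."""
    kept = 0
    # Both files are closed automatically when the with-block ends.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:  # iterating a text file yields one line at a time
            if keep(line):
                dst.write(line)
                kept += 1
    return kept


# Demo with throwaway files (made-up sample data, not the real input.txt)
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.txt")
dst_path = os.path.join(workdir, "output.txt")
with open(src_path, "w") as f:
    f.write('"1":"a@example.com","d":"example.com"\n"7":"Demo_V","d":null\n')

count = copy_matching_lines(src_path, dst_path, lambda line: "null" not in line)
with open(dst_path) as f:
    result = f.read()
print(count, repr(result))
```

The keep argument is where a regex test could go later; the file handling stays the same.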
Posts: 28
Threads: 8
Joined: Oct 2017
I'm not exactly sure what you're trying to do, but it seems like you'd be best served using Pandas to get the data cleaned up. It could be overkill, but that's what I'd do.
Posts: 1,950
Threads: 8
Joined: Jun 2018
Little mental exercise: data cleaning with list comprehension, string methods and indexing:
In [1]: lst = [
...: '"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"',
...: '"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"',
...: '"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"'
...: ]
In [2]: [row.split(':')[i].split(',')[0].strip('"') for row in lst for i in [1, 2, 4]]
Out[2]:
['[email protected]',
'password1',
'yahoo.com',
'[email protected]',
'password2',
'mail.bg',
'[email protected]',
'password3',
'ya.ru']
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 07:02 AM
(This post was last modified: Apr-02-2019, 07:09 AM by wereak.)
import json
file = open("input.txt", "r")
print ("Name of the file is : ",file.name)
print(file.read())
with open('input.txt') as f:
    read_data = f.read()
file.closed
#a for loop will be created to look for emails, password and email associate with
#re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)
#trying to read the file from input.txt and save the output to output.txt with the help of json.dumps
file2 = open("output.txt", "rb+")
json.dumps(file)
file2.closed

Traceback (most recent call last):
  File "p.py", line 16, in <module>
    json.dumps(file)
  File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'TextIOWrapper' is not JSON serializable
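The TypeError in this traceback comes from passing an open file object to json.dumps, which serializes Python values such as str, dict and list, not file handles. A small sketch of the difference, using io.StringIO as a stand-in for the opened file and made-up example.com data; wrapping a line in braces to parse it with json.loads is an assumption about the format, and it fails on lines with a missing quote, like lines 3 and 10 of the sample:

```python
import io
import json

# Stand-in for open("input.txt"); the content is made-up sample data.
handle = io.StringIO('"1":"a@example.com"\n')

try:
    json.dumps(handle)  # a file-like object is not serializable
    error = None
except TypeError as exc:
    error = str(exc)

text = handle.read()         # the file's *content* is a plain str
encoded = json.dumps(text)   # a str can be serialized

# Assumption: a well-formed line becomes a JSON object once wrapped in braces.
record = json.loads("{" + text.strip() + "}")

# A line with a missing closing quote is not valid JSON.
bad = '"3":"c@example.com","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"'
try:
    json.loads("{" + bad + "}")
    bad_ok = True
except json.JSONDecodeError:
    bad_ok = False

print(error, record, bad_ok)
```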
This is the file called input.txt
Quote:
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
By default, output.txt is an empty file. I want to grab the email using findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)') and also grab the password and the last field. After the scraping, the output should look like this:
Quote:
[email protected]:password1:yahoo.com
[email protected]:password2:ymail.bg
[email protected]:password3:ya.ru
[email protected]:password4:yandex.ua
[email protected]:password5:yandex.by
[email protected]:password6:yahoo.com
[email protected]:password8:gmail.com
demo9_email@Tut.by:password9:tut.by
[email protected]:password10:t-online.de
Line 7 has the value null, so if a null value is found, skip that line and continue outputting the other lines.
Line 7 has been removed because it contains the null value:
Quote:
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
Posts: 2,121
Threads: 10
Joined: May 2017
Apr-02-2019, 08:17 AM
(This post was last modified: Apr-02-2019, 08:17 AM by DeaD_EyE.)
Just using Pandas solves the current problem, but does not improve your knowledge of Python.
You should know how to iterate over lines, how to work with split and replace, etc.
Edit: I made a mistake here. I took the wrong field. Try to correct this. It's not difficult.
import io
def reader(file):
    try:
        next(file)  # skip header
    except StopIteration:
        print('File is empty')
        return  # Return inside a generator stops the iteration of the generator
    for row in file:
        try:
            row = [
                value.replace('"', '').strip()
                for item in row.split(',')
                for value in item.split(':')
            ]
            email, domain, password = row[1], row[3], row[-1]
            yield email, domain, password
        except IndexError:
            continue

def data_printer(file):
    for email, domain, password in reader(file):
        print(f'{email}@{domain}:{password}')
input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''
line_reader = io.StringIO(input_data)
# using the string as a file-like object
data_printer(line_reader)
# but it can also be used with normal files
with open('somefile.txt') as fd:
    data_printer(fd)

Most of the work happens here:
row = [
    value.replace('"', '').strip()
    for item in row.split(',')
    for value in item.split(':')
]

Simplified as a nested loop:
row = []
for item in line.split(','):  # line is the raw text line
    for value in item.split(':'):
        value = value.replace('"', '').strip()
        row.append(value)

Combined with iteration over the lines:
for line in input_data.splitlines():
    row = []
    for item in line.split(','):
        for value in item.split(':'):
            value = value.replace('"', '').strip()
            row.append(value)
    print(row)

# or if input_data is a file-object opened in text mode
for line in input_data:
    row = []
    # row seems to have , and : as delimiter for data
    # first level, split by ,
    for item in line.split(','):
        # each item is a string, which could contain a :
        # split by :
        for value in item.split(':'):
            # remove the quoting and then white space at the beginning and end of the str
            value = value.replace('"', '').strip()
            # append the result to the list row
            row.append(value)
    # print the current result of the line
    # pay attention to indentation
    # this kind of nested loop easily leads to indentation errors
    print(row)

I hope I haven't done your homework.
By the way, it's a little bit strange that the input data is delimited by , and : .
Using regex is not always the best solution.
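For readers who want to check their answer to the "wrong field" exercise, here is one possible correction of the split-based approach. It is a sketch, not the original poster's code: the name parse_rows, the length check, and the demo addresses are assumptions. After splitting on , and : a complete row has eight values, with the email at index 1, the password at index 3 and the domain last.

```python
import io


def parse_rows(lines):
    """Yield (email, password, domain) for each usable data line."""
    lines = iter(lines)
    next(lines, None)  # skip the header; tolerates empty input
    for line in lines:
        values = [
            value.replace('"', '').strip()
            for item in line.split(',')
            for value in item.split(':')
        ]
        # A complete row has 8 values; the "null" row has only 6.
        if len(values) < 8 or values[-1] == 'null':
            continue
        email, password, domain = values[1], values[3], values[-1]
        yield email, password, domain


# Made-up sample rows in the same shape as input.txt
sample = io.StringIO(
    '"ID":"Email","random letter":"Password","random letter":"Title",'
    '"random letter":"Email Account Register with"\n'
    '"1":"demo1@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"\n'
    '"3":"demo3@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"\n'
    '"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null\n'
)
rows = list(parse_rows(sample))
for email, password, domain in rows:
    print(f'{email}:{password}:{domain}')
```

Note that this breaks if a password itself contains a , or a : because the split would then produce extra values.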
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 09:28 AM
(This post was last modified: Apr-02-2019, 09:47 AM by wereak.)
(Apr-02-2019, 08:17 AM)DeaD_EyE Wrote: Just using Pandas solves the current problem, but does not improve the knowledge about Python.
Thank you, sir. It gave me some new ideas, but it does not solve the current issue: I have to read the text from the file, and it needs to use the built-in re module (regex), because the email ID may also look like [email protected]
This is what I have come up with:
import re
#multiple files at a time
with open('input.txt','r') as rf: #rf read from
    #read file from rf and save the output to wf write file
    with open('output.txt','w') as wf: #wf write file
        for line in rf:
            #find all the lines containing the keyword This
            if re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')",line):
                print(line.strip())

Now I need to figure out how to grab the password, delimited by ":"Password, and the "d":"mail.com" field.
input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''

Can we remove this and read the "input.txt" file instead, which will contain all the details, and output the result at this point?

# but it can also be used with normal files
with open('somefile.txt') as fd:
    data_printer(fd)
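One way to read from a file and write the three fields to output.txt with the re module might look like the sketch below. Everything in it is an assumption for illustration: the pattern LINE_RE, the function name convert, the temp files and the demo addresses. Instead of a strict email pattern, it treats the quoted value after the ID as an email if it contains an @, then captures the "p" value and the quoted "d" value; lines without those pieces (the header, the null row) are skipped.

```python
import os
import re
import tempfile

# Assumed line shape: "<id>":"<email>","p":"<password>", ... ,"d":"<domain>"
LINE_RE = re.compile(
    r'^"\d+":"\s*(?P<email>[^"]*@[^"]*)"'  # value after the ID must contain "@"
    r',"p":"(?P<password>[^"]*)"'          # password is the quoted "p" value
    r'.*"d":"(?P<domain>[^"]*)"'           # domain is the quoted "d" value
)


def convert(src_path, dst_path):
    """Write email:password:domain for every matching line; return the count."""
    written = 0
    with open(src_path) as rf, open(dst_path, 'w') as wf:
        for line in rf:
            match = LINE_RE.search(line)
            if match is None:  # header, null rows, anything malformed
                continue
            wf.write('{email}:{password}:{domain}\n'.format(**match.groupdict()))
            written += 1
    return written


# Demo with throwaway files and made-up rows
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'input.txt')
dst_path = os.path.join(workdir, 'output.txt')
with open(src_path, 'w') as f:
    f.write(
        '"ID":"Email","random letter":"Password","random letter":"Title",'
        '"random letter":"Email Account Register with"\n'
        '"1":" demo1@yahoo.com","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"\n'
        '"3":"demo3@ya.ru","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"\n'
        '"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null\n'
    )
written = convert(src_path, dst_path)
with open(dst_path) as f:
    output = f.read()
print(output, end='')
```

Because the null row has neither an @ nor a quoted "d" value, it drops out without a separate check.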
Posts: 1,950
Threads: 8
Joined: Jun 2018
(Apr-02-2019, 09:28 AM)wereak Wrote: re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')",line)
Does this regex deliver the expected results? In my casual test on regex101.com, it matches "r'[email protected]'".
re.compile("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", re.DEBUG)
SUBPATTERN 1 0 0
LITERAL 114 # r
LITERAL 39 # '
SUBPATTERN 2 0 0
MAX_REPEAT 1 MAXREPEAT
IN
RANGE (97, 122)
RANGE (48, 57)
MAX_REPEAT 0 MAXREPEAT
SUBPATTERN 3 0 0
LITERAL 46
MAX_REPEAT 1 MAXREPEAT
IN
RANGE (97, 122)
RANGE (48, 57)
/../
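The two LITERAL entries at the top of this debug output show that the r and the ' ended up inside the pattern. A short sketch of the difference, using a made-up address: the r prefix belongs outside the quotes, and because the pattern has capturing groups, findall returns tuples unless the inner groups are made non-capturing with (?:...).

```python
import re

# Made-up sample line in the same shape as input.txt
line = '"1":"demo1@yahoo.com","p":"password1"'

# The r' and ' are part of the pattern text here, so they must appear literally.
wrong = r"(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')"
# Raw-string prefix outside the quotes: only the email pattern remains.
right = r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)'
# Same pattern with non-capturing groups, so findall returns whole matches.
flat = r'[a-z0-9]+(?:\.[a-z0-9]+)*@[a-z]+(?:\.[a-z]+)+'

no_match = re.search(wrong, line)
literal_match = re.search(wrong, "r'demo1@yahoo.com'")
grouped = re.findall(right, line)
matches = re.findall(flat, line)

print(no_match, bool(literal_match), grouped, matches)
```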
Posts: 5
Threads: 1
Joined: Apr 2019
Apr-02-2019, 04:17 PM
(This post was last modified: Apr-02-2019, 04:17 PM by wereak.)
https://regex101.com
([a-z0-9-]+[a-z0-9_]+[a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+))
"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
"4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
"5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
"6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
"7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
"9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
"10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
"11":"[email protected]","p":"password11","r":"PYTHON DEMO OUTPUT,"d":"mail.co.uk"
"12":"[email protected]","p":"password12","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
"13":"[email protected]","p":"password13","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"

Now I am trying to match [email protected] and grab the password and the last field, which is mail.ru, etc.
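The pattern above appears to allow only letters and exactly one dot-separated part after the @, so domains such as t-online.de or mail.co.uk would not match in full. A hedged sketch of a broader pattern follows; the character classes are an assumption about which addresses should be accepted, the sample addresses are made up, and this is not a complete email validator.

```python
import re

EMAIL_RE = re.compile(
    r'[a-z0-9_-]+(?:\.[a-z0-9_-]+)*'  # local part: letters, digits, "_", "-", dot-separated
    r'@'
    r'[a-z0-9-]+(?:\.[a-z0-9-]+)+',   # domain: two or more labels, "-" allowed
    re.IGNORECASE,
)

# Made-up addresses covering underscores, hyphens and multi-part domains
samples = [
    'demo_user@mail.ru',
    'demo-user.name@mail.co.uk',
    'demo10@t-online.de',
    'Demo_V;7829837',
]
# No capturing groups, so findall returns the full matches as strings.
found = {s: EMAIL_RE.findall(s) for s in samples}
for sample, hits in found.items():
    print(sample, '->', hits)
```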