Help me with python read file and save file
Python Forum (https://python-forum.io), General Coding Help (https://python-forum.io/forum-8.html), thread /thread-17205.html
Help me with python read file and save file - wereak - Apr-02-2019

Each line holds these fields: ID, email, random letter, Password, random letter, Title, random letter, Email Account Register with. Line 7 has the value "null", so if a null value is found, skip the line and continue.

----------------------------------------------------------------------------------------
Filename input.txt

    "ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
    "1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
    "3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
    "4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
    "5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
    "6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
    "8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
    "9":"demo9_email@Tut.by","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
    "10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"

----------------------------------------------------------------------------------------
Output, saved in another file, output.txt

    Email : Password : Email Account Register with
    [email protected]:password1:yahoo.com
    [email protected]:password2:ymail.bg
    [email protected]:password3:ya.ru
    [email protected]:password4:yandex.ua
    [email protected]:password5:yandex.by
    [email protected]:password6:yahoo.com
    [email protected]:password8:gmail.com
    demo9_email@Tut.by:password9:tut.by
    [email protected]:password10:t-online.de

Use the function findall to match the email:

    re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)

RE: Help me with python read file and save file - Yoriz - Apr-02-2019

reading-and-writing-files

When you get stuck, post the specific part you're stuck on, with code in python code tags and any errors received in error tags.

RE: Help me with python read file and save file - wereak - Apr-02-2019

I know some basics of readlines and readline, but I am confused about how to write code that takes what is needed and saves it to the other file.

Open a file input.txt
Read a file input.txt
Use findall to search for email and password and Email Account Register with.
Output the result to another file, output.txt

I am stuck here:

    file = open("input.txt", "r")
    print "Name of the file is : ",file.name
    print(file.read())
    #re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)

RE: Help me with python read file and save file - chisox721 - Apr-02-2019

I'm not exactly sure what you're trying to do, but it seems like you'd be best served using Pandas to get the data cleaned up. Could be overkill but that's what I'd do.

RE: Help me with python read file and save file - perfringo - Apr-02-2019

Little mental exercise: data cleaning with list comprehension, string methods and indexing:

    In [1]: lst = [
       ...: '"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"',
       ...: '"2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"',
       ...: '"3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"'
       ...: ]

    In [2]: [row.split(':')[i].split(',')[0].strip('"') for row in lst for i in [1, 2, 4]]
    Out[2]: ['[email protected]', 'password1', 'yahoo.com', '[email protected]', 'password2', 'mail.bg', '[email protected]', 'password3', 'ya.ru']

RE: Help me with python read file and save file - wereak - Apr-02-2019

    import json
    file = open("input.txt", "r")
    print ("Name of the file is : ",file.name)
    print(file.read())
    with open('input.txt') as f:
        read_data = f.read()
    file.closed
    #a for loop will be created to look for emails, password and email associate with
    #re.findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)
    #trying to read the file from input.txt and save the output to output.txt with the help of json.dumps
    file2 = open("output.txt", "rb+")
    json.dumps(file)
    file2.closed

    Traceback (most recent call last):
      File "p.py", line 16, in <module>
        json.dumps(file)
      File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
        return _default_encoder.encode(obj)
      File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
        chunks = self.iterencode(o, _one_shot=True)
      File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
        return _iterencode(o, 0)
      File "/usr/lib/python3.6/json/encoder.py", line 180, in default
        o.__class__.__name__)
    TypeError: Object of type 'TextIOWrapper' is not JSON serializable

This is the file called input.txt

Quote: "1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"

By default output.txt is an empty file. I want to grab the email using the function

Quote: findall(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)

and also grab the password and the last field. So after the scraping the output should look like this:

Quote: [email protected]:password1:yahoo.com

Line 7 has a value "null", so if the null value is found, skip the line and continue outputting the other lines. Line 7 has been removed because it contains the null value:

Quote: "7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null

RE: Help me with python read file and save file - DeaD_EyE - Apr-02-2019

Just using Pandas solves the current problem, but does not improve the knowledge about Python. You should know how to iterate over lines, how to work with split and replace etc.

Edit: I made a mistake here. I took the wrong field. Try to correct this. It's not difficult.

    import io

    def reader(file):
        try:
            next(file) # skip header
        except StopIteration:
            print('File is empty')
            return # Return inside a generator stops the iteration of the generator
        for row in file:
            try:
                row = [
                    value.replace('"', '').strip()
                    for item in row.split(',')
                    for value in item.split(':')
                ]
                email, domain, password = row[1], row[3], row[-1]
                yield email, domain, password
            except IndexError:
                continue

    def data_printer(file):
        for email, domain, password in reader(file):
            print(f'{email}@{domain}:{password}')

    input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
    "1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
    "3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
    "4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
    "5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
    "6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
    "8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
    "9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
    "10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''

    line_reader = io.StringIO(input_data) # using the string as a file-like object
    data_printer(line_reader)

    # but it can also be used with normal files
    with open('somefile.txt') as fd:
        data_printer(fd)

Most of the work is going on here:

    row = [
        value.replace('"', '').strip()
        for item in row.split(',')
        for value in item.split(':')
    ]

Simplified as a nested loop:

    # line is the current line of text; the cleaned values are collected in row
    row = []
    for item in line.split(','):
        for value in item.split(':'):
            value = value.replace('"', '').strip()
            row.append(value)

Combined together with iteration over the lines:

    for line in input_data.splitlines():
        row = []
        for item in line.split(','):
            for value in item.split(':'):
                value = value.replace('"', '').strip()
                row.append(value)
        print(row)

    # or if input_data is a file-object opened in text mode
    for line in input_data:
        row = []
        # row seems to have , and : as delimiter for data
        # first level, split by ,
        for item in line.split(','):
            # each item is a string, which could contain a :
            # split by :
            for value in item.split(':'):
                # remove the quoting and then white space at the beginning and end of the str
                value = value.replace('"', '').strip()
                # append the result to the list row
                row.append(value)
        # print the current result of the line
        # pay attention to indentation
        # this kind of nested loop easily leads to indentation errors
        print(row)

I hope I haven't done your homework. By the way, it's a little bit strange that the input data is delimited by , and : . Using regex is not always the best solution.

RE: Help me with python read file and save file - wereak - Apr-02-2019

(Apr-02-2019, 08:17 AM)DeaD_EyE Wrote: Just using Pandas solves the current problem, but does not improve the knowledge about Python.

Thank you Sir, it did give me some new ideas but it does not solve the current issue, since I have to import the text from the file and it needs to use "import re", the built-in RegEx module. What if the email id also includes [email protected]

This is what I have come up with:

    import re
    #multiple files at a time
    with open('input.txt','r') as rf: #rf read from
        #read file from rf and save the output to wf write file
        with open('output.txt','w') as wf: #wf write file
            for line in rf:
                #find all the lines containing the keyword This
                if re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')",line):
                    print(line.strip())

Now I need to figure out the password with the delimited ":"Password and "d":"mail.com

    input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
    "1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
    "3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
    "4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
    "5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
    "6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
    "8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
    "9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
    "10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''

Can we remove this and import the "input.txt" file, which will contain all the details, and output the result at

    # but it can used also with normal files
    with open('somefile.txt') as fd:
        data_printer(fd)

RE: Help me with python read file and save file - perfringo - Apr-02-2019

(Apr-02-2019, 09:28 AM)wereak Wrote: re.findall("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')",line)

Does this regex deliver the expected results? My casual observation with regex101.com is that it matches "r'[email protected]'".

    re.compile("(r'([a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+)+)')", re.DEBUG)
    SUBPATTERN 1 0 0
      LITERAL 114 # r
      LITERAL 39 # '
      SUBPATTERN 2 0 0
        MAX_REPEAT 1 MAXREPEAT
          IN
            RANGE (97, 122)
            RANGE (48, 57)
        MAX_REPEAT 0 MAXREPEAT
          SUBPATTERN 3 0 0
            LITERAL 46
            MAX_REPEAT 1 MAXREPEAT
              IN
                RANGE (97, 122)
                RANGE (48, 57)
    /../

RE: Help me with python read file and save file - wereak - Apr-02-2019

https://regex101.com

    ([a-z0-9-]+[a-z0-9_]+[a-z0-9]+(\.[a-z0-9]+)*@[a-z]+(\.[a-z]+))

    "1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
    "3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
    "4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
    "5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
    "6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
    "8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
    "9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
    "10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"
    "11":"[email protected]","p":"password11","r":"PYTHON DEMO OUTPUT,"d":"mail.co.uk"
    "12":"[email protected]","p":"password12","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"
    "13":"[email protected]","p":"password13","r":"PYTHON DEMO OUTPUT,"d":"mail.ru"

Now I am trying to find [email protected] and grab the password and the last field, which is mail.ru etc.
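
One possible way to tie the pieces of this thread together is sketched below. It is only a sketch under assumptions, not the solution anyone in the thread posted: the names RECORD_RE, EMAIL_RE, parse_line and convert are made up for illustration, the keys "p" and "d" are taken from the sample data, and the email pattern is a deliberately loose check rather than real address validation. Each line is matched with one regex that captures the email, the password and the domain; the header line, line 7 (no "p" field, "d":null) and anything else that does not fit are skipped.

```python
import re

# Hypothetical names; the keys "p" and "d" come from the sample data in the thread.
RECORD_RE = re.compile(
    r'^"\d+":"(?P<email>[^"]+)"'       # numeric ID, then the email value
    r'.*?"p":"(?P<password>[^"]*)"'    # password stored under key "p"
    r'.*"d":"(?P<domain>[^"]+)"\s*$'   # quoted value under key "d" (null never matches)
)
# Deliberately loose email check, not full address validation.
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')


def parse_line(line):
    """Return (email, password, domain), or None if the line should be skipped."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None  # header line, missing password, or "d":null
    if EMAIL_RE.fullmatch(match.group('email')) is None:
        return None  # first value does not look like an email address
    return match.group('email'), match.group('password'), match.group('domain')


def convert(src_path, dst_path):
    """Write email:password:domain for every usable line; return the number written."""
    written = 0
    with open(src_path, encoding='utf-8') as src, \
            open(dst_path, 'w', encoding='utf-8') as dst:
        for line in src:
            record = parse_line(line)
            if record is not None:
                dst.write(':'.join(record) + '\n')
                written += 1
    return written
```

Calling convert('input.txt', 'output.txt') would then read the input file and write one email:password:domain line per usable record. Because the regex keys on "p" and "d" instead of counting fields, the lines with the missing quote after PYTHON DEMO OUTPUT are still picked up.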