Just using Pandas solves the current problem, but it does not improve your knowledge of Python.
You should know how to iterate over lines and how to work with split, replace, etc.
Edit: I made a mistake here. I took the wrong field. Try to correct it. It's not difficult.
    import io


    def reader(file):
        try:
            next(file)  # skip header
        except StopIteration:
            print('File is empty')
            return  # a return inside a generator stops the iteration of the generator
        for row in file:
            try:
                row = [
                    value.replace('"', '').strip()
                    for item in row.split(',')
                    for value in item.split(':')
                ]
                email, domain, password = row[1], row[3], row[-1]
                yield email, domain, password
            except IndexError:
                continue


    def data_printer(file):
        for email, domain, password in reader(file):
            print(f'{email}@{domain}:{password}')


    input_data = '''"ID":"Email","random letter":"Password","random letter":"Title","random letter":"Email Account Register with"
    "1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "2":"[email protected]","p":"password2","r":"PYTHON DEMO OUTPUT","d":"mail.bg"
    "3":"[email protected]","p":"password3","r":"PYTHON DEMO OUTPUT,"d":"ya.ru"
    "4":"[email protected]","p":"password4","r":"PYTHON DEMO OUTPUT","d":"yandex.ua"
    "5":"[email protected]","p":"password5","r":"PYTHON DEMO OUTPUT","d":"yandex.by"
    "6":"[email protected]","p":"password6","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"
    "7":"Demo_V;7829837","r":"PYTHON DEMO OUTPUT","d":null
    "8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"
    "9":"[email protected]","p":"password9","r":"PYTHON DEMO OUTPUT","d":"tut.by"
    "10":"[email protected]","p":"password10","r":"PYTHON DEMO OUTPUT,"d":"t-online.de"'''

    line_reader = io.StringIO(input_data)  # using the string as a file-like object
    data_printer(line_reader)

    # but it can also be used with normal files
    with open('somefile.txt') as fd:
        data_printer(fd)

Most of the work happens here:
    row = [
        value.replace('"', '').strip()
        for item in row.split(',')
        for value in item.split(':')
    ]

Simplified as a nested loop:
    # line is one line of the input (a str)
    row = []
    for item in line.split(','):
        for value in item.split(':'):
            value = value.replace('"', '').strip()
            row.append(value)

Combined with the iteration over the lines:
    for line in input_data.splitlines():
        row = []
        for item in line.split(','):
            for value in item.split(':'):
                value = value.replace('"', '').strip()
                row.append(value)
        print(row)

    # or if input_data is a file object opened in text mode
    for line in input_data:
        row = []
        # a row seems to use both , and : as delimiters for the data
        # first level: split by ,
        for item in line.split(','):
            # each item is a string which could contain a :
            # second level: split by :
            for value in item.split(':'):
                # remove the quoting and then the white space at the beginning and end of the str
                value = value.replace('"', '').strip()
                # append the result to the list row
                row.append(value)
        # print the result for the current line
        # pay attention to the indentation:
        # this kind of nested loop easily leads to indentation errors
        print(row)

I hope I haven't done your homework.
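If you are not sure which index holds which field, it can help to print every value next to its index. This is only a small sketch: the helper name split_fields is my own invention, and the sample is the first data line from the input above.

```python
def split_fields(line):
    """Split one line by ',' and then by ':', dropping quotes and white space."""
    return [
        value.replace('"', '').strip()
        for item in line.split(',')
        for value in item.split(':')
    ]


# first data line of the example input
sample = '"1":"[email protected]","p":"password1","r":"PYTHON DEMO OUTPUT","d":"yahoo.com"'
fields = split_fields(sample)

# show every index together with its value
for index, value in enumerate(fields):
    print(index, repr(value))
```

For this line, index 3 holds password1 and the last index holds yahoo.com. That should be enough of a hint for the exercise.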
By the way, it's a little bit strange that the input data is delimited by both "," and ":". Using regex is not always the best solution.
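To illustrate the point about regex, here is a rough comparison of a plain str version and a re.split version. Both function names are made up for this sketch, and it assumes that no value itself contains a "," or a ":", which may not hold for real data.

```python
import re


def split_plain(line):
    # turn every ':' into ',' so that one split is enough
    values = line.replace(':', ',').split(',')
    return [value.replace('"', '').strip() for value in values]


def split_regex(line):
    # the same split, done with a character class
    values = re.split(r'[,:]', line)
    return [value.replace('"', '').strip() for value in values]


# one data line of the example input
sample = '"8":"[email protected]","p":"password8","r":"PYTHON DEMO OUTPUT","d":"gmail.com"'

print(split_plain(sample))
print(split_plain(sample) == split_regex(sample))
```

Both functions return the same list for this line, so the regex does not buy anything here. The str methods are arguably easier to read.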
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!