Python Forum
create an array of each line of text - Printable Version

create an array of each line of text - macieju1974 - Jun-07-2020

Hi,
I have the following text file:
... here's a fragment
I wrote some code in Python that reads the data from this file.
file = open('raport.txt', 'r').read()
lines = file.split('\n')
for line in lines:
    print(line)
How can I create an array from each line of the file, without the whitespace, "###" and "RANGE"? I am only interested in the columns with the date, time and numbers.

Output:
2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415
2020-05-22##19:00:00###RANGE ###RANGE ###RANGE ##201833##190185##96138##2 ##0 ###RANGE ###RANGE ##532387##644##313421
2020-05-22##20:00:00###RANGE ###RANGE ###RANGE ##201839##190191##96140##2 ##0 ###RANGE ###RANGE ##532387##644##313427
2020-05-22##21:00:00###RANGE ###RANGE ###RANGE ##201844##190195##96142##2 ##0 ###RANGE ###RANGE ##532387##644##313433
2020-05-22##22:00:00###RANGE ###RANGE ###RANGE ##201850##190201##96144##2 ##0 ###RANGE ###RANGE ##532387##644##313441
2020-05-22##23:00:01###RANGE ###RANGE ###RANGE ##201858##190207##96146##2 ##0 ###RANGE ###RANGE ##532387##644##313452
2020-05-23##00:00:00###RANGE ###RANGE ###RANGE ##201864##190212##96148##2 ##0 ###RANGE ###RANGE ##532387##644##313458
2020-05-23##01:00:00###RANGE ###RANGE ###RANGE ##201866##190215##96150##2 ##0 ###RANGE ###RANGE ##532387##644##313464
2020-05-23##02:00:00###RANGE ###RANGE ###RANGE ##201870##190217##96152##2 ##0 ###RANGE ###RANGE ##532387##644##313469
2020-05-23##03:00:00###RANGE ###RANGE ###RANGE ##201872##190220##96154##2 ##0 ###RANGE ###RANGE ##532387##644##313474
2020-05-23##04:00:00###RANGE ###RANGE ###RANGE ##201873##190221##96156##2 ##0 ###RANGE ###RANGE ##532387##644##313477
2020-05-23##05:00:00###RANGE ###RANGE ###RANGE ##201876##190223##96156##2 ##0 ###RANGE ###RANGE ##532387##644##313480
2020-05-23##06:00:00###RANGE ###RANGE ###RANGE ##201877##190224##96158##2 ##0 ###RANGE ###RANGE ##532387##644##313483
2020-05-23##07:00:01###RANGE ###RANGE ###RANGE ##201879##190226##96160##2 ##0 ###RANGE ###RANGE ##532387##644##313486
2020-05-23##08:00:00###RANGE ###RANGE ###RANGE ##201881##190228##96162##2 ##0 ###RANGE ###RANGE ##532387##644##313490
2020-05-23##09:00:01###RANGE ###RANGE ###RANGE ##201886##190233##96164##2 ##0 ###RANGE ###RANGE ##532387##644##313496
2020-05-23##10:00:00###RANGE ###RANGE ###RANGE ##201893##190238##96166##2 ##0 ###RANGE ###RANGE ##532387##644##313503
2020-05-23##11:00:00###RANGE ###RANGE ###RANGE ##201901##190246##96168##2 ##0 ###RANGE ###RANGE ##532387##644##313517
2020-05-23##12:00:00###RANGE ###RANGE ###RANGE ##201908##190253##96170##2 ##0 ###RANGE ###RANGE ##532387##644##313528
2020-05-23##13:00:00###RANGE ###RANGE ###RANGE ##201916##190259##96172##2 ##0 ###RANGE ###RANGE ##532387##644##313541
2020-05-23##14:00:00###RANGE ###RANGE ###RANGE ##201921##190265##96174##2 ##0 ###RANGE ###RANGE ##532387##644##313550
2020-05-23##15:00:00###RANGE ###RANGE ###RANGE ##201929##190272##96176##2 ##0 ###RANGE ###RANGE ##532387##644##313560
2020-05-23##16:00:00###RANGE ###RANGE ###RANGE ##201934##190278##96178##2 ##0 ###RANGE ###RANGE ##532387##644##313566
2020-05-23##17:00:00###RANGE ###RANGE ###RANGE ##201941##190283##96180##2 ##0 ###RANGE ###RANGE ##532387##644##313571
2020-05-23##18:00:00###RANGE ###RANGE ###RANGE ##201947##190289##96182##2 ##0 ###RANGE ###RANGE ##532387##644##313579
2020-05-23##19:00:00###RANGE ###RANGE ###RANGE ##201955##190295##96184##2 ##0 ###RANGE ###RANGE ##532387##644##313586
2020-05-23##20:00:01###RANGE ###RANGE ###RANGE ##201962##190301##96186##2 ##0 ###RANGE ###RANGE ##532387##644##313597
2020-05-23##21:00:00###RANGE ###RANGE ###RANGE ##201975##190312##96188##2 ##0 ###RANGE ###RANGE ##532387##644##313613
2020-05-23##22:00:00###RANGE ###RANGE ###RANGE ##202006##190326##96191##2 ##0 ###RANGE ###RANGE ##532387##644##313644
2020-05-23##23:00:00###RANGE ###RANGE ###RANGE ##202021##190341##96193##2 ##0 ###RANGE ###RANGE ##532387##644##313671
2020-05-24##00:00:00###RANGE ###RANGE ###RANGE ##202039##190355##96195##2 ##0 ###RANGE ###RANGE ##532387##644##313693
2020-05-24##01:00:00###RANGE ###RANGE ###RANGE ##202054##190369##96197##2 ##0 ###RANGE ###RANGE ##532387##644##313711
2020-05-24##02:00:00###RANGE ###RANGE ###RANGE ##202065##190379##96199##2 ##0 ###RANGE ###RANGE ##532387##644##313731
2020-05-24##03:00:00###RANGE ###RANGE ###RANGE ##202072##190386##96201##2 ##0 ###RANGE ###RANGE ##532387##644##313744
2020-05-24##04:00:00###RANGE ###RANGE ###RANGE ##202077##190390##96203##2 ##0 ###RANGE ###RANGE ##532387##644##313749
2020-05-24##05:00:00###RANGE ###RANGE ###RANGE ##202079##190392##96205##2 ##0 ###RANGE ###RANGE ##532387##644##313754
2020-05-24##06:00:00###RANGE ###RANGE ###RANGE ##202082##190394##96207##2 ##0 ###RANGE ###RANGE ##532387##644##313757
2020-05-24##07:00:00###RANGE ###RANGE ###RANGE ##202083##190397##96209##2 ##0 ###RANGE ###RANGE ##532387##644##313761
2020-05-24##08:00:00###RANGE ###RANGE ###RANGE ##202087##190399##96211##2 ##0 ###RANGE ###RANGE ##532387##644##313766
2020-05-24##09:00:00###RANGE ###RANGE ###RANGE ##202092##190403##96213##2 ##0 ###RANGE ###RANGE ##532387##644##313772
2020-05-24##10:00:00###RANGE ###RANGE ###RANGE ##202098##190410##96215##2 ##0 ###RANGE ###RANGE ##532387##644##313780



RE: create an array of each line of text - DPaul - Jun-07-2020

Looks straightforward; there are many possibilities.
We do not have the input file, but you might:
- read the line
- use a regex (replace) to replace the '###RANGE' with '##'
- then split the line on '##'
You will get an array of columns; take your pick of the ones you want.
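
The steps above could be sketched roughly as follows. This is only a minimal illustration, assuming the sample line posted in this thread; the variable names `cleaned` and `columns` are made up for the example, and the extra `strip()`/filter step is an addition to cope with the spaces left between the separators.

```python
import re

# Sample line copied from the thread
line = '2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415'

# Step 1: replace every '###RANGE' placeholder with the plain '##' separator
cleaned = re.sub(r'###RANGE', '##', line)

# Step 2: split on '##', strip whitespace, and drop the items that are now empty
columns = [part.strip() for part in cleaned.split('##') if part.strip()]

print(columns)
```

From `columns` you can then pick the entries you need by index, e.g. `columns[0]` for the date and `columns[1]` for the time.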

Paul


RE: create an array of each line of text - Yoriz - Jun-07-2020

line = '2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415'
print(list(item.strip() for item in line.split('##') if item != '#RANGE '))
Output:
['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415']



RE: create an array of each line of text - macieju1974 - Jun-07-2020

Thank you for the answer. I must admit I am a complete beginner; could you give a simple example, please?


RE: create an array of each line of text - Yoriz - Jun-07-2020

That is the equivalent of
line = '2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415'

items = []
for item in line.split('##'): # split into items at '##'
    if item != '#RANGE ': # if the item is not '#RANGE '
        items.append(item.strip()) # remove the whitespace and add to the list

print(items)
Output:
['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415']
Does it make sense like this?


RE: create an array of each line of text - menator01 - Jun-07-2020

A lot shorter than what I came up with.
file = './play.txt'

array = []
with open(file, 'r') as lines:
    for line in lines:
        # Turn the '###', '##' and 'RANGE' markers into spaces, then split on whitespace
        # (split() with no arguments also discards the trailing newline)
        newlines = line.replace('###', ' ').replace('##', ' ').replace('RANGE', ' ').split()
        array.append(newlines)

for line in array:
    print(line)
Output:
['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415']
['2020-05-22', '19:00:00', '201833', '190185', '96138', '2', '0', '532387', '644', '313421']
['2020-05-22', '20:00:00', '201839', '190191', '96140', '2', '0', '532387', '644', '313427']
['2020-05-22', '21:00:00', '201844', '190195', '96142', '2', '0', '532387', '644', '313433']
['2020-05-22', '22:00:00', '201850', '190201', '96144', '2', '0', '532387', '644', '313441']
['2020-05-22', '23:00:01', '201858', '190207', '96146', '2', '0', '532387', '644', '313452']
['2020-05-23', '00:00:00', '201864', '190212', '96148', '2', '0', '532387', '644', '313458']
['2020-05-23', '01:00:00', '201866', '190215', '96150', '2', '0', '532387', '644', '313464']
['2020-05-23', '02:00:00', '201870', '190217', '96152', '2', '0', '532387', '644', '313469']
['2020-05-23', '03:00:00', '201872', '190220', '96154', '2', '0', '532387', '644', '313474']
['2020-05-23', '04:00:00', '201873', '190221', '96156', '2', '0', '532387', '644', '313477']
['2020-05-23', '05:00:00', '201876', '190223', '96156', '2', '0', '532387', '644', '313480']
['2020-05-23', '06:00:00', '201877', '190224', '96158', '2', '0', '532387', '644', '313483']
['2020-05-23', '07:00:01', '201879', '190226', '96160', '2', '0', '532387', '644', '313486']
['2020-05-23', '08:00:00', '201881', '190228', '96162', '2', '0', '532387', '644', '313490']
['2020-05-23', '09:00:01', '201886', '190233', '96164', '2', '0', '532387', '644', '313496']
['2020-05-23', '10:00:00', '201893', '190238', '96166', '2', '0', '532387', '644', '313503']
['2020-05-23', '11:00:00', '201901', '190246', '96168', '2', '0', '532387', '644', '313517']
['2020-05-23', '12:00:00', '201908', '190253', '96170', '2', '0', '532387', '644', '313528']
['2020-05-23', '13:00:00', '201916', '190259', '96172', '2', '0', '532387', '644', '313541']
['2020-05-23', '14:00:00', '201921', '190265', '96174', '2', '0', '532387', '644', '313550']
['2020-05-23', '15:00:00', '201929', '190272', '96176', '2', '0', '532387', '644', '313560']
['2020-05-23', '16:00:00', '201934', '190278', '96178', '2', '0', '532387', '644', '313566']
['2020-05-23', '17:00:00', '201941', '190283', '96180', '2', '0', '532387', '644', '313571']
['2020-05-23', '18:00:00', '201947', '190289', '96182', '2', '0', '532387', '644', '313579']
['2020-05-23', '19:00:00', '201955', '190295', '96184', '2', '0', '532387', '644', '313586']
['2020-05-23', '20:00:01', '201962', '190301', '96186', '2', '0', '532387', '644', '313597']
['2020-05-23', '21:00:00', '201975', '190312', '96188', '2', '0', '532387', '644', '313613']
['2020-05-23', '22:00:00', '202006', '190326', '96191', '2', '0', '532387', '644', '313644']
['2020-05-23', '23:00:00', '202021', '190341', '96193', '2', '0', '532387', '644', '313671']
['2020-05-24', '00:00:00', '202039', '190355', '96195', '2', '0', '532387', '644', '313693']
['2020-05-24', '01:00:00', '202054', '190369', '96197', '2', '0', '532387', '644', '313711']
['2020-05-24', '02:00:00', '202065', '190379', '96199', '2', '0', '532387', '644', '313731']
['2020-05-24', '03:00:00', '202072', '190386', '96201', '2', '0', '532387', '644', '313744']
['2020-05-24', '04:00:00', '202077', '190390', '96203', '2', '0', '532387', '644', '313749']
['2020-05-24', '05:00:00', '202079', '190392', '96205', '2', '0', '532387', '644', '313754']
['2020-05-24', '06:00:00', '202082', '190394', '96207', '2', '0', '532387', '644', '313757']
['2020-05-24', '07:00:00', '202083', '190397', '96209', '2', '0', '532387', '644', '313761']
['2020-05-24', '08:00:00', '202087', '190399', '96211', '2', '0', '532387', '644', '313766']
['2020-05-24', '09:00:00', '202092', '190403', '96213', '2', '0', '532387', '644', '313772']
['2020-05-24', '10:00:00', '202098', '190410', '96215', '2', '0', '532387', '644', '313780']



RE: create an array of each line of text - macieju1974 - Jun-07-2020

items = []
file = open('raport.txt', 'r').read()

for item in file.split('##'):
    if item != '#RANGE ':
        items.append(item.strip())

print(items)
I receive the following...

0-05-23', '10:00:00', '201893', '190238', '96166', '2', '0', '532387', '644', '313503\n2020-05-23', '11:00:00', '201901', '190246', '96168', '2', '0', '532387', '644', '313517\n2020-05-23', '12:00:00', '201908', '190253', '96170', '2', '0', '532387', '644', '313528\n2020-05-23', '13:00:00', '201916', '190259', '96172', '2', '0', '532387', '644', '313541\n2020-05-23', '14:00:00', '201921', '190265', '96174', '2', '0', '532387', '644', '313550\n2020-05-23', '15:00:00', '201929', '190272', '96176', '2', '0', '532387', '644', '313560\n2020-05-23', '16:00:00', '201934', '190278', '96178', '2', '0', '532387', '644', '313566\n2020-05-23', '17:00:00', '201941', '190283', '96180', '2', '0', '532387', '644', '313571\n2020-05-23', '18:00:00', '201947', '190289', '96182', '2', '0', '532387', '644', '313579\n2020-05-23', '19:00:00', '201955', '190295', '96184', '2', '0', '532387', '644', '313586\n2020-05-23', '20:00:01', '201962', '190301', '96186', '2', '0', '532387', '644', '313597\n2020-05-23', '21:00:00', '201975', '190312', '96188', '2', '0', '532387', '644', '313613\n2020-05-23', '22:00:00', '202006', '190326', '96191', '2', '0', '532387', '644', '313644\n2020-05-23', '23:00:00', '202021', '190341', '96193', '2', '0', '532387', '644', '313671\n2020-05-24', '00:00:00', '202039', '190355', '96195', '2', '0', '532387', '644', '313693\n2020-05-24', '01:00:00', '202054', '190369', '96197', '2', '0', '532387', '644', '313711\n2020-05-24', '02:00:00', '202065', '190379', '96199', '2', '0', '532387', '644', '313731\n2020-05-24', '03:00:00', '202072', '190386', '96201', '2', '0', '532387', '644', '313744\n2020-05-24', '04:00:00', '202077', '190390', '96203', '2', '0', '532387', '644', '313749\n2020-05-24', '05:00:00', '202079', '190392', '96205', '2', '0', '532387', '644', '313754\n2020-05-24', '06:00:00', '202082', '190394', '96207', '2', '0', '532387', '644', '313757\n2020-05-24', '07:00:00', '202083', '190397', '96209', '2', '0', '532387', '644', '313761\n2020-05-24', 
'08:00:00', '202087', '190399', '96211', '2', '0', '532387', '644', '313766\n2020-05-24', '09:00:00', '202092', '190403', '96213', '2', '0', '532387', '644', '313772\n2020-05-24', '10:00:00', '202098', '190410', '96215', '2', '0', '532387', '644', '313780']

It looks good, but when I try to load the file line by line, nothing is displayed:

item = file.split('\n')

(Jun-07-2020, 06:07 PM)menator01 Wrote: A Lot shorter than what I came up with.

That's exactly what I meant. You're great, and so are the others. Now I have to figure out how to calculate the difference in the third column, e.g. from 2020-05-21 to 2020-05-23.
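
One possible way to approach the difference, as a rough sketch only. It assumes the rows have already been parsed into lists like the output above, that they are in chronological order, and that "third column" means index 2 (the first counter after the date and time). The function name `column_difference` and the three-row `rows` sample (taken from the data in this thread) are hypothetical.

```python
from datetime import datetime

# A small subset of the parsed rows shown earlier in the thread
rows = [
    ['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415'],
    ['2020-05-23', '18:00:00', '201947', '190289', '96182', '2', '0', '532387', '644', '313579'],
    ['2020-05-24', '10:00:00', '202098', '190410', '96215', '2', '0', '532387', '644', '313780'],
]

FMT = '%Y-%m-%d %H:%M:%S'


def column_difference(rows, start, end, column=2):
    """Return the last minus the first value of `column` within [start, end]."""
    start_dt = datetime.strptime(start, FMT)
    end_dt = datetime.strptime(end, FMT)
    # Keep only the values whose date + time fall inside the requested range
    selected = [
        int(row[column])
        for row in rows
        if start_dt <= datetime.strptime(row[0] + ' ' + row[1], FMT) <= end_dt
    ]
    if len(selected) < 2:
        raise ValueError('need at least two rows in the requested range')
    return selected[-1] - selected[0]


print(column_difference(rows, '2020-05-22 18:00:00', '2020-05-23 18:00:00'))  # 119
```

With the full file you would pass in the complete list of parsed rows instead of the three-row sample.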


RE: create an array of each line of text - Yoriz - Jun-07-2020

def split_lines(line):
    items = []
    for item in line.split('##'):  # split the line into items at '##'
        if item != '#RANGE ':
            items.append(item.strip())
    return items


line_items = []
with open('raport.txt', 'r') as read_file:
    for line in read_file:
        line_items.append(split_lines(line))

for line in line_items:
    print(line)