Python Forum
create an array of each line of text
#1
Hi,
I have the following text file:
... here's a fragment
I wrote some Python code that reads data from this file:
file = open('raport.txt', 'r').read()
lines = file.split('\n')
for line in lines:
    print (line)
How can I create an array from each line of the file, without the whitespace, "###" and "RANGE"? I am only interested in the columns with the date, the time and the numbers.

Output:
2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415 2020-05-22##19:00:00###RANGE ###RANGE ###RANGE ##201833##190185##96138##2 ##0 ###RANGE ###RANGE ##532387##644##313421 2020-05-22##20:00:00###RANGE ###RANGE ###RANGE ##201839##190191##96140##2 ##0 ###RANGE ###RANGE ##532387##644##313427 2020-05-22##21:00:00###RANGE ###RANGE ###RANGE ##201844##190195##96142##2 ##0 ###RANGE ###RANGE ##532387##644##313433 2020-05-22##22:00:00###RANGE ###RANGE ###RANGE ##201850##190201##96144##2 ##0 ###RANGE ###RANGE ##532387##644##313441 2020-05-22##23:00:01###RANGE ###RANGE ###RANGE ##201858##190207##96146##2 ##0 ###RANGE ###RANGE ##532387##644##313452 2020-05-23##00:00:00###RANGE ###RANGE ###RANGE ##201864##190212##96148##2 ##0 ###RANGE ###RANGE ##532387##644##313458 2020-05-23##01:00:00###RANGE ###RANGE ###RANGE ##201866##190215##96150##2 ##0 ###RANGE ###RANGE ##532387##644##313464 2020-05-23##02:00:00###RANGE ###RANGE ###RANGE ##201870##190217##96152##2 ##0 ###RANGE ###RANGE ##532387##644##313469 2020-05-23##03:00:00###RANGE ###RANGE ###RANGE ##201872##190220##96154##2 ##0 ###RANGE ###RANGE ##532387##644##313474 2020-05-23##04:00:00###RANGE ###RANGE ###RANGE ##201873##190221##96156##2 ##0 ###RANGE ###RANGE ##532387##644##313477 2020-05-23##05:00:00###RANGE ###RANGE ###RANGE ##201876##190223##96156##2 ##0 ###RANGE ###RANGE ##532387##644##313480 2020-05-23##06:00:00###RANGE ###RANGE ###RANGE ##201877##190224##96158##2 ##0 ###RANGE ###RANGE ##532387##644##313483 2020-05-23##07:00:01###RANGE ###RANGE ###RANGE ##201879##190226##96160##2 ##0 ###RANGE ###RANGE ##532387##644##313486 2020-05-23##08:00:00###RANGE ###RANGE ###RANGE ##201881##190228##96162##2 ##0 ###RANGE ###RANGE ##532387##644##313490 2020-05-23##09:00:01###RANGE ###RANGE ###RANGE ##201886##190233##96164##2 ##0 ###RANGE ###RANGE ##532387##644##313496 2020-05-23##10:00:00###RANGE ###RANGE ###RANGE ##201893##190238##96166##2 ##0 ###RANGE ###RANGE 
##532387##644##313503 2020-05-23##11:00:00###RANGE ###RANGE ###RANGE ##201901##190246##96168##2 ##0 ###RANGE ###RANGE ##532387##644##313517 2020-05-23##12:00:00###RANGE ###RANGE ###RANGE ##201908##190253##96170##2 ##0 ###RANGE ###RANGE ##532387##644##313528 2020-05-23##13:00:00###RANGE ###RANGE ###RANGE ##201916##190259##96172##2 ##0 ###RANGE ###RANGE ##532387##644##313541 2020-05-23##14:00:00###RANGE ###RANGE ###RANGE ##201921##190265##96174##2 ##0 ###RANGE ###RANGE ##532387##644##313550 2020-05-23##15:00:00###RANGE ###RANGE ###RANGE ##201929##190272##96176##2 ##0 ###RANGE ###RANGE ##532387##644##313560 2020-05-23##16:00:00###RANGE ###RANGE ###RANGE ##201934##190278##96178##2 ##0 ###RANGE ###RANGE ##532387##644##313566 2020-05-23##17:00:00###RANGE ###RANGE ###RANGE ##201941##190283##96180##2 ##0 ###RANGE ###RANGE ##532387##644##313571 2020-05-23##18:00:00###RANGE ###RANGE ###RANGE ##201947##190289##96182##2 ##0 ###RANGE ###RANGE ##532387##644##313579 2020-05-23##19:00:00###RANGE ###RANGE ###RANGE ##201955##190295##96184##2 ##0 ###RANGE ###RANGE ##532387##644##313586 2020-05-23##20:00:01###RANGE ###RANGE ###RANGE ##201962##190301##96186##2 ##0 ###RANGE ###RANGE ##532387##644##313597 2020-05-23##21:00:00###RANGE ###RANGE ###RANGE ##201975##190312##96188##2 ##0 ###RANGE ###RANGE ##532387##644##313613 2020-05-23##22:00:00###RANGE ###RANGE ###RANGE ##202006##190326##96191##2 ##0 ###RANGE ###RANGE ##532387##644##313644 2020-05-23##23:00:00###RANGE ###RANGE ###RANGE ##202021##190341##96193##2 ##0 ###RANGE ###RANGE ##532387##644##313671 2020-05-24##00:00:00###RANGE ###RANGE ###RANGE ##202039##190355##96195##2 ##0 ###RANGE ###RANGE ##532387##644##313693 2020-05-24##01:00:00###RANGE ###RANGE ###RANGE ##202054##190369##96197##2 ##0 ###RANGE ###RANGE ##532387##644##313711 2020-05-24##02:00:00###RANGE ###RANGE ###RANGE ##202065##190379##96199##2 ##0 ###RANGE ###RANGE ##532387##644##313731 2020-05-24##03:00:00###RANGE ###RANGE ###RANGE ##202072##190386##96201##2 ##0 ###RANGE 
###RANGE ##532387##644##313744 2020-05-24##04:00:00###RANGE ###RANGE ###RANGE ##202077##190390##96203##2 ##0 ###RANGE ###RANGE ##532387##644##313749 2020-05-24##05:00:00###RANGE ###RANGE ###RANGE ##202079##190392##96205##2 ##0 ###RANGE ###RANGE ##532387##644##313754 2020-05-24##06:00:00###RANGE ###RANGE ###RANGE ##202082##190394##96207##2 ##0 ###RANGE ###RANGE ##532387##644##313757 2020-05-24##07:00:00###RANGE ###RANGE ###RANGE ##202083##190397##96209##2 ##0 ###RANGE ###RANGE ##532387##644##313761 2020-05-24##08:00:00###RANGE ###RANGE ###RANGE ##202087##190399##96211##2 ##0 ###RANGE ###RANGE ##532387##644##313766 2020-05-24##09:00:00###RANGE ###RANGE ###RANGE ##202092##190403##96213##2 ##0 ###RANGE ###RANGE ##532387##644##313772 2020-05-24##10:00:00###RANGE ###RANGE ###RANGE ##202098##190410##96215##2 ##0 ###RANGE ###RANGE ##532387##644##313780
Reply
#2
Looks straightforward, many possibilities.
We do not have the input file, but...
you might:
- read the line
- use regex (replace) to replace the '###RANGE' with '##'
- Then split the line on '##'
You will get an array of columns, take your pick of the ones you want.
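The steps above could be sketched roughly as follows. This is only an illustration on the sample line quoted in this thread, not a tested solution for the full file; the names `cleaned` and `columns` are made up for the example, and the extra strip/filter step is an assumption needed because the replacement leaves blank fields behind.

```python
import re

line = '2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415'

# Step 1: replace every '###RANGE' marker with the plain '##' separator.
cleaned = re.sub(r'###RANGE', '##', line)

# Step 2: split on '##' and strip the whitespace around each field.
columns = [part.strip() for part in cleaned.split('##')]

# Step 3 (assumption): drop the empty fields left where 'RANGE' used to be.
columns = [part for part in columns if part]

print(columns)
```

Since the pattern contains no special characters, `line.replace('###RANGE', '##')` would do the same job without the `re` module.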

Paul
Reply
#3
line = '2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415'
print([item.strip() for item in line.split('##') if item != '#RANGE '])
Output:
['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415']
Reply
#4
Thank you for the answer. I must admit I am a complete beginner. Could you give a simple example, please?
Reply
#5
That is the equivalent of
line = '2020-05-22##18:00:00###RANGE ###RANGE ###RANGE ##201828##190182##96136##2 ##0 ###RANGE ###RANGE ##532387##644##313415'

items = []
for item in line.split('##'): # split into items at '##'
    if item != '#RANGE ': # if the item is not '#RANGE '
        items.append(item.strip()) # remove the whitespace and add to the list

print(items)
Output:
['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415']
Does it make sense like this?
Reply
#6
A lot shorter than what I came up with.
file = './play.txt'

array = []
with open(file, 'r') as lines:
    for line in lines:
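        # Turn every separator into a space, then split on whitespace.
        # split() with no argument also discards the trailing newline.
        newlines = line.replace('###', ' ').replace('##', ' ').replace('RANGE', ' ').split()
        array.append(newlines)

# Print once the file has been read and closed.
for line in array:
    print(line)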
Output:
['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415'] ['2020-05-22', '19:00:00', '201833', '190185', '96138', '2', '0', '532387', '644', '313421'] ['2020-05-22', '20:00:00', '201839', '190191', '96140', '2', '0', '532387', '644', '313427'] ['2020-05-22', '21:00:00', '201844', '190195', '96142', '2', '0', '532387', '644', '313433'] ['2020-05-22', '22:00:00', '201850', '190201', '96144', '2', '0', '532387', '644', '313441'] ['2020-05-22', '23:00:01', '201858', '190207', '96146', '2', '0', '532387', '644', '313452'] ['2020-05-23', '00:00:00', '201864', '190212', '96148', '2', '0', '532387', '644', '313458'] ['2020-05-23', '01:00:00', '201866', '190215', '96150', '2', '0', '532387', '644', '313464'] ['2020-05-23', '02:00:00', '201870', '190217', '96152', '2', '0', '532387', '644', '313469'] ['2020-05-23', '03:00:00', '201872', '190220', '96154', '2', '0', '532387', '644', '313474'] ['2020-05-23', '04:00:00', '201873', '190221', '96156', '2', '0', '532387', '644', '313477'] ['2020-05-23', '05:00:00', '201876', '190223', '96156', '2', '0', '532387', '644', '313480'] ['2020-05-23', '06:00:00', '201877', '190224', '96158', '2', '0', '532387', '644', '313483'] ['2020-05-23', '07:00:01', '201879', '190226', '96160', '2', '0', '532387', '644', '313486'] ['2020-05-23', '08:00:00', '201881', '190228', '96162', '2', '0', '532387', '644', '313490'] ['2020-05-23', '09:00:01', '201886', '190233', '96164', '2', '0', '532387', '644', '313496'] ['2020-05-23', '10:00:00', '201893', '190238', '96166', '2', '0', '532387', '644', '313503'] ['2020-05-23', '11:00:00', '201901', '190246', '96168', '2', '0', '532387', '644', '313517'] ['2020-05-23', '12:00:00', '201908', '190253', '96170', '2', '0', '532387', '644', '313528'] ['2020-05-23', '13:00:00', '201916', '190259', '96172', '2', '0', '532387', '644', '313541'] ['2020-05-23', '14:00:00', '201921', '190265', '96174', '2', '0', '532387', '644', '313550'] ['2020-05-23', '15:00:00', '201929', '190272', 
'96176', '2', '0', '532387', '644', '313560'] ['2020-05-23', '16:00:00', '201934', '190278', '96178', '2', '0', '532387', '644', '313566'] ['2020-05-23', '17:00:00', '201941', '190283', '96180', '2', '0', '532387', '644', '313571'] ['2020-05-23', '18:00:00', '201947', '190289', '96182', '2', '0', '532387', '644', '313579'] ['2020-05-23', '19:00:00', '201955', '190295', '96184', '2', '0', '532387', '644', '313586'] ['2020-05-23', '20:00:01', '201962', '190301', '96186', '2', '0', '532387', '644', '313597'] ['2020-05-23', '21:00:00', '201975', '190312', '96188', '2', '0', '532387', '644', '313613'] ['2020-05-23', '22:00:00', '202006', '190326', '96191', '2', '0', '532387', '644', '313644'] ['2020-05-23', '23:00:00', '202021', '190341', '96193', '2', '0', '532387', '644', '313671'] ['2020-05-24', '00:00:00', '202039', '190355', '96195', '2', '0', '532387', '644', '313693'] ['2020-05-24', '01:00:00', '202054', '190369', '96197', '2', '0', '532387', '644', '313711'] ['2020-05-24', '02:00:00', '202065', '190379', '96199', '2', '0', '532387', '644', '313731'] ['2020-05-24', '03:00:00', '202072', '190386', '96201', '2', '0', '532387', '644', '313744'] ['2020-05-24', '04:00:00', '202077', '190390', '96203', '2', '0', '532387', '644', '313749'] ['2020-05-24', '05:00:00', '202079', '190392', '96205', '2', '0', '532387', '644', '313754'] ['2020-05-24', '06:00:00', '202082', '190394', '96207', '2', '0', '532387', '644', '313757'] ['2020-05-24', '07:00:00', '202083', '190397', '96209', '2', '0', '532387', '644', '313761'] ['2020-05-24', '08:00:00', '202087', '190399', '96211', '2', '0', '532387', '644', '313766'] ['2020-05-24', '09:00:00', '202092', '190403', '96213', '2', '0', '532387', '644', '313772'] ['2020-05-24', '10:00:00', '202098', '190410', '96215', '2', '0', '532387', '644', '313780']
Reply
#7
items = []
file = open('raport.txt', 'r').read()

for item in file.split('##'): 
    
    if item != '#RANGE ': 
        items.append(item.strip()) 
 
print(items)
I receive the following...

0-05-23', '10:00:00', '201893', '190238', '96166', '2', '0', '532387', '644', '313503\n2020-05-23', '11:00:00', '201901', '190246', '96168', '2', '0', '532387', '644', '313517\n2020-05-23', '12:00:00', '201908', '190253', '96170', '2', '0', '532387', '644', '313528\n2020-05-23', '13:00:00', '201916', '190259', '96172', '2', '0', '532387', '644', '313541\n2020-05-23', '14:00:00', '201921', '190265', '96174', '2', '0', '532387', '644', '313550\n2020-05-23', '15:00:00', '201929', '190272', '96176', '2', '0', '532387', '644', '313560\n2020-05-23', '16:00:00', '201934', '190278', '96178', '2', '0', '532387', '644', '313566\n2020-05-23', '17:00:00', '201941', '190283', '96180', '2', '0', '532387', '644', '313571\n2020-05-23', '18:00:00', '201947', '190289', '96182', '2', '0', '532387', '644', '313579\n2020-05-23', '19:00:00', '201955', '190295', '96184', '2', '0', '532387', '644', '313586\n2020-05-23', '20:00:01', '201962', '190301', '96186', '2', '0', '532387', '644', '313597\n2020-05-23', '21:00:00', '201975', '190312', '96188', '2', '0', '532387', '644', '313613\n2020-05-23', '22:00:00', '202006', '190326', '96191', '2', '0', '532387', '644', '313644\n2020-05-23', '23:00:00', '202021', '190341', '96193', '2', '0', '532387', '644', '313671\n2020-05-24', '00:00:00', '202039', '190355', '96195', '2', '0', '532387', '644', '313693\n2020-05-24', '01:00:00', '202054', '190369', '96197', '2', '0', '532387', '644', '313711\n2020-05-24', '02:00:00', '202065', '190379', '96199', '2', '0', '532387', '644', '313731\n2020-05-24', '03:00:00', '202072', '190386', '96201', '2', '0', '532387', '644', '313744\n2020-05-24', '04:00:00', '202077', '190390', '96203', '2', '0', '532387', '644', '313749\n2020-05-24', '05:00:00', '202079', '190392', '96205', '2', '0', '532387', '644', '313754\n2020-05-24', '06:00:00', '202082', '190394', '96207', '2', '0', '532387', '644', '313757\n2020-05-24', '07:00:00', '202083', '190397', '96209', '2', '0', '532387', '644', '313761\n2020-05-24', 
'08:00:00', '202087', '190399', '96211', '2', '0', '532387', '644', '313766\n2020-05-24', '09:00:00', '202092', '190403', '96213', '2', '0', '532387', '644', '313772\n2020-05-24', '10:00:00', '202098', '190410', '96215', '2', '0', '532387', '644', '313780']

That looks good, but when I try to load the file line by line, nothing is displayed:

item = file.split('\n')

(Jun-07-2020, 06:07 PM)menator01 Wrote: A Lot shorter than what I came up with.

That's exactly what I meant. You're great, and so are the others. Now I have to figure out how to calculate the difference in the third column, e.g. from 2020-05-21 to 2020-05-23.
Reply
#8
def split_lines(line):
    items = []
    for item in line.split('##'):  # split into items at '##'
        if item != '#RANGE ':
            items.append(item.strip())
    return items


line_items = []
with open('raport.txt', 'r') as read_file:
    for line in read_file:
        line_items.append(split_lines(line))

for line in line_items:
    print(line)
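Once each line is a list like this, the difference in the third column asked about in post #7 might be computed along these lines. This is a hedged sketch, not part of the original answer: `column_difference` is a hypothetical helper, it assumes every row starts with a date and a time, and it uses three rows copied from the output shown earlier in the thread because the fragment posted does not include 2020-05-21.

```python
def column_difference(rows, start, end, column=2):
    """Subtract the value at `start` from the value at `end` in one column.

    `rows` is a list of parsed lines; `start` and `end` are (date, time) tuples.
    The default column index 2 is the third column.
    """
    # Map (date, time) -> integer value of the chosen column.
    values = {(row[0], row[1]): int(row[column]) for row in rows}
    return values[end] - values[start]


# Three rows copied from the output shown earlier in the thread.
line_items = [
    ['2020-05-22', '18:00:00', '201828', '190182', '96136', '2', '0', '532387', '644', '313415'],
    ['2020-05-22', '19:00:00', '201833', '190185', '96138', '2', '0', '532387', '644', '313421'],
    ['2020-05-23', '18:00:00', '201947', '190289', '96182', '2', '0', '532387', '644', '313579'],
]

diff = column_difference(line_items, ('2020-05-22', '18:00:00'), ('2020-05-23', '18:00:00'))
print(diff)  # 201947 - 201828 = 119
```

A timestamp that is missing from the file raises a `KeyError` here, which may or may not be the behaviour you want.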
Reply

