Posts: 118
Threads: 39
Joined: Nov 2021
Hi, sorry for my bad English.
I have code that removes lines that do not start with '[' and end with ']':
with open('input.raw') as f:
    lines = ""
    for line in f:
        lina = line.strip()
        #print(line)
        if lina.startswith("["):
            if lina.endswith("]"):
                lines = lines + lina + "\n"
writing = open('output.raw', 'w')
writing.write(lines)
writing.close()
But my file is over 200 MB, and it takes hours to finish.
How can I make my code faster?
Posts: 4,600
Threads: 72
Joined: Jan 2018
Aug-07-2022, 07:27 AM
(This post was last modified: Aug-07-2022, 07:29 AM by Gribouillis.)
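The slow part is the accumulative string concatenation at line 8 (the + operator): every pass through the loop copies the whole string built so far, so the total work grows roughly with the square of the output size. Try this code:
def good_lines(file):
    # Yield each line that starts with '[' and ends with ']', one at a time
    for line in file:
        line = line.strip()
        if line[:1] == '[' and line[-1] == ']':
            yield line + '\n'

with open('input.raw') as ifh, open('output.raw', 'w') as ofh:
    ofh.writelines(good_lines(ifh))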
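If the filtered output is small enough to hold in memory, another common fix is to collect the kept lines in a list and join them once at the end. The sketch below is only an illustration of that idea: the function name filter_bracket_lines and the sample text are made up here, and io.StringIO stands in for the real input.raw.

```python
import io

def filter_bracket_lines(file):
    """Return the kept lines as one string, built with a single join."""
    kept = []  # appending to a list does not re-copy earlier lines
    for line in file:
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            kept.append(stripped)
    # One join at the end copies each character once.
    return "".join(s + "\n" for s in kept)

# Made-up sample data standing in for input.raw
sample = io.StringIO("[keep me]\ndrop me\n  [also kept]  \n[broken\n\n")
result = filter_bracket_lines(sample)
print(result, end="")
```

For a 200 MB file the generator version above is still the lighter choice, since it never holds more than one line at a time.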
Posts: 118
Threads: 39
Joined: Nov 2021
(Aug-07-2022, 07:27 AM)Gribouillis Wrote: The slow part is the accumulative string concatenation at line 8 (the + operator). Try this code
def good_lines(file):
    for line in file:
        line = line.strip()
        if line[:1] == '[' and line[-1] == ']':
            yield line + '\n'

with open('input.raw') as ifh, open('output.raw', 'w') as ofh:
    ofh.writelines(good_lines(ifh))
Can you believe it, it went from hours to under a minute.
Thank you, I will give you a reputation point.
Posts: 1,823
Threads: 2
Joined: Apr 2017
Aug-07-2022, 07:51 AM
(This post was last modified: Aug-07-2022, 07:56 AM by ndc85430.)
Is there a reason you're writing this yourself and not using existing tools like, say, grep?
Posts: 118
Threads: 39
Joined: Nov 2021
(Aug-07-2022, 07:51 AM)ndc85430 Wrote: Is there a reason you're writing this yourself and not using existing tools like, say, grep?
What is grep? Is there a link for it?
Posts: 453
Threads: 16
Joined: Jun 2022
(Aug-07-2022, 10:44 AM)kucingkembar Wrote: what is grep? any link of it?
https://www.man7.org/linux/man-pages/man1/grep.1.html
Sig:
>>> import this
The UNIX philosophy: "Do one thing, and do it well."
"The danger of computers becoming like humans is not as great as the danger of humans becoming like computers." :~ Konrad Zuse
"Everything should be made as simple as possible, but not simpler." :~ Albert Einstein
Posts: 118
Threads: 39
Joined: Nov 2021
(Aug-07-2022, 12:00 PM)rob101 Wrote: (Aug-07-2022, 10:44 AM)kucingkembar Wrote: what is grep? any link of it?
https://www.man7.org/linux/man-pages/man1/grep.1.html
Something went wrong. I tried this:
C:\Users\GIGABYTE>pip install grep
Collecting grep
Downloading grep-0.3.2.tar.gz (3.2 kB)
Preparing metadata (setup.py) ... done
Using legacy 'setup.py install' for grep, since package 'wheel' is not installed.
Installing collected packages: grep
Running setup.py install for grep ... done
Successfully installed grep-0.3.2
C:\Users\GIGABYTE>grep
'grep' is not recognized as an internal or external command,
operable program or batch file.
I will study this.
Posts: 453
Threads: 16
Joined: Jun 2022
Aug-07-2022, 01:17 PM
(This post was last modified: Aug-07-2022, 01:23 PM by rob101.)
(Aug-07-2022, 01:07 PM)kucingkembar Wrote: something error, I tried this:
I think that you're confused. grep, as I know it, is a command-line utility for searching plain-text data for lines that match a regular expression, and it is part of Unix-type operating systems (I can't speak for MS Windows, as I don't use it). So far as I know, there is no Python library of the same name; I stand to be corrected on that if I'm wrong.
For a regular expression pattern match, I'd use regex:
import re
# re.search(pattern, string) returns a match object, or None when nothing matches
match = re.search(r"^\[.*\]$", "[example line]")
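Applied to the task in this thread, a regex check could look roughly like the sketch below. The pattern and the names BRACKET_LINE and is_bracket_line are assumptions made for illustration, not something posted earlier in the thread. One difference from the startswith/endswith test: this pattern needs at least "[]", so a line holding only "[" is rejected.

```python
import re

# Hypothetical pattern: optional whitespace, '[', anything, ']',
# optional whitespace, and nothing else on the line.
BRACKET_LINE = re.compile(r"\s*\[.*\]\s*")

def is_bracket_line(line):
    """True when the whole line is a bracketed entry."""
    # fullmatch requires the pattern to cover the entire string,
    # including the trailing newline (absorbed by the final \s*).
    return BRACKET_LINE.fullmatch(line) is not None

# Made-up sample lines
sample = ["[ok]\n", "  [ok too]  \n", "[broken\n", "text [no]\n", "\n"]
kept = [line for line in sample if is_bracket_line(line)]
print(kept)
```

For a plain prefix and suffix test like this one, startswith and endswith are simpler; a regex becomes more useful once the rule for a "good" line gets more detailed.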
Posts: 118
Threads: 39
Joined: Nov 2021
(Aug-07-2022, 01:17 PM)rob101 Wrote: (Aug-07-2022, 01:07 PM)kucingkembar Wrote: something error, I tried this:
I think that you're confused. grep as I know it, is a command-line utility for searching plain-text data sets for lines that match a regular expression and is a part of Unix type computer operating systems (I can't speak for MS Windows OS, as I don't use it), but so far as I know, there is no Python library of the same name: I stand to be corrected on that, if I'm wrong.
For a regular expression pattern match, I'd use regex:
import re
re.search(<regex>, <string>)
That is a Linux thing, which is why I don't use it.
Anyway, is there a regex tutorial with lots of examples?