Python Forum

Full Version: Removing the unwanted data from a file
Just a small change to read the file into an array; works fine.

import re
 
# read file into an array 
f = open('https.txt', 'r')
data = f.read()
f.close()

pattern = re.compile(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')
for match in pattern.finditer(data):
    print(match.group())
    
Thanks everyone for your help.
(Nov-14-2021, 09:11 PM) jehoshua wrote: Just a small change to read the file into an array; works fine.
Just a small correction: you are not reading the file into an array (we call it a list in Python).
f.read() reads the whole file into a single string, which is okay and works fine.
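
To make the string-versus-list distinction concrete, here is a minimal sketch. The temporary file and its sample content are made up for illustration and only stand in for https.txt.

```python
import os
import tempfile

# Hypothetical sample content standing in for https.txt
sample = "first https://example.com/a\nsecond line\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "https.txt")
    with open(path, "w") as f:
        f.write(sample)

    with open(path) as f:
        as_string = f.read()       # one str holding the whole file
    with open(path) as f:
        as_list = f.readlines()    # a list with one str per line

print(type(as_string).__name__)            # str
print(type(as_list).__name__, len(as_list))  # list 2
```

So f.read() gives you a string, while f.readlines() (or list(f)) is what would give you a list of lines.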

Say you don't want to read all the content into memory, but process it line by line, and use with open so the file gets closed automatically without calling close().
Then it will look like this.
import re

pattern = re.compile(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')
with open('https.txt', 'r') as f:
    for line in f:
        for match in pattern.finditer(line):
            print(match.group())
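
If you want the matches in an actual list instead of printing them, something like the following should work. This is only a sketch: find_urls is a made-up helper name, and io.StringIO with sample text stands in for the open file so the example runs on its own.

```python
import io
import re

URL_PATTERN = re.compile(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')

def find_urls(lines):
    """Hypothetical helper: collect every URL match from an iterable of lines."""
    urls = []
    for line in lines:
        urls.extend(m.group() for m in URL_PATTERN.finditer(line))
    return urls

# Hypothetical sample text; io.StringIO behaves like an open text file here
sample = (
    "see https://example.com/a?x=1 and http://test.org/page\n"
    "no links here\n"
    "https://python.org\n"
)

line_by_line = find_urls(io.StringIO(sample))
whole_text = [m.group() for m in URL_PATTERN.finditer(sample)]

print(line_by_line)
print(line_by_line == whole_text)  # True
```

The pattern never matches a newline, so a URL cannot span two lines, which is why scanning line by line finds the same matches as scanning the whole string. With a real file you would pass the file object itself: find_urls(f) inside the with block.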
Thanks, that produced exactly the same output.
I think this article can help you, have a look: https://www.scaler.com/topics/find-function-in-python/
@codinglearner - thanks