Python Forum
Python script - search Apache access_log.txt for all of the JavaScript (.js)
The regex r"(.*).js" has a mistake.
An unescaped dot matches any character, so the pattern also matches foojs.
You have to escape the dot with a backslash: r"(.*)\.js"
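
To see the difference, here is a small sketch. The test strings are made up for illustration and are not from your log.

```python
import re

unescaped = re.compile(r"(.*).js")  # the dot matches any character
escaped = re.compile(r"(.*)\.js")   # the dot matches only a literal "."

print(bool(unescaped.match("foojs")))  # True
print(bool(escaped.match("foojs")))    # False
print(bool(escaped.match("foo.js")))   # True
```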

You should put your code in a function. If you use yield instead of return, the function becomes a generator.
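import re


def log_reader(file):
    with open(file) as fd:
        for line in fd:
            # raw string, so the backslash reaches the regex engine unchanged
            if re.match(r"(.*)\.js", line):
                # field 6 is the request path; this assumes paths like /dir/file.js
                yield line.split()[6].split('/')[2]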


my_reader = log_reader('/home/kali/Desktop/access_log.txt')
# nothing has happened yet:
# a generator is evaluated lazily
# and only runs when it is consumed

paths = set(my_reader) # unique elements
# paths now has elements and my_reader is exhausted / empty

print(paths)
# sort unique paths
print(sorted(paths))
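
A minimal sketch of the lazy evaluation and the exhaustion. The numbers generator is a made-up stand-in for log_reader, so it runs without a log file.

```python
def numbers():
    # hypothetical stand-in for log_reader
    for n in (3, 1, 3, 2):
        yield n


gen = numbers()       # no code inside numbers() has run yet
unique = set(gen)     # consuming the generator runs it
leftover = list(gen)  # the generator is exhausted now

print(sorted(unique))  # [1, 2, 3]
print(leftover)        # []
```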
You can also solve it without regex:
def read_log(file, allowed_method=None):
    # use a context manager
    with open(file) as fd:
        # fd is an iterator that yields lines
        # the line ending is not stripped
        for line in fd:
            # splitting the log line at " gives a good result:
            # the request is the second field
            # _ is a placeholder for a throwaway object
            # *_ consumes the rest of the elements
            _, request, *_ = line.split('"')
            # a request consists of: method, path, protocol version
            meth, path, proto = request.split()
            # allowed_method is evaluated first;
            # if it is None, the part after the "and" is not evaluated,
            # so setting allowed_method to None skips this check
            if allowed_method and meth.upper() != allowed_method:
                # skip the line if the method is a different one
                continue
            if path.endswith(".js"):
                yield path.rsplit("/", 1)[-1]
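
Here is roughly what the splitting does with a single line. The log line is a made-up example in the usual Apache format, not one from your file.

```python
# hypothetical sample line, for illustration only
line = (
    '203.0.113.7 - - [04/May/2020:04:52:00 +0000] '
    '"GET /static/js/app.js HTTP/1.1" 200 1234 "-" "Mozilla/5.0"\n'
)

# the first quoted field is the request
_, request, *_ = line.split('"')
meth, path, proto = request.split()
name = path.rsplit("/", 1)[-1]

print(request)  # GET /static/js/app.js HTTP/1.1
print(name)     # app.js
```

Note that a line without quotes, or a request without exactly three parts, raises a ValueError in the unpacking.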
Accessing the generator:
log_file = "access.log"
js_files = sorted(set(read_log(log_file)))

# first set consumes the generator read_log
# then sorted consumes set
# sorted returns a sorted list
And if you need to do something with your data for each file:
for js_file in js_files:
    print(js_file)
    # code
    ...
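
If you want only the GET requests, pass allowed_method. A sketch with made-up log lines written to a temporary file, so it runs on its own. read_log is repeated here without the comments.

```python
import os
import tempfile


def read_log(file, allowed_method=None):
    with open(file) as fd:
        for line in fd:
            _, request, *_ = line.split('"')
            meth, path, proto = request.split()
            if allowed_method and meth.upper() != allowed_method:
                continue
            if path.endswith(".js"):
                yield path.rsplit("/", 1)[-1]


# made-up log lines, for illustration only
sample = (
    '203.0.113.7 - - [04/May/2020:04:52:00 +0000] "GET /js/app.js HTTP/1.1" 200 512\n'
    '203.0.113.7 - - [04/May/2020:04:52:01 +0000] "POST /js/upload.js HTTP/1.1" 403 0\n'
    '203.0.113.8 - - [04/May/2020:04:52:02 +0000] "GET /index.html HTTP/1.1" 200 2048\n'
    '203.0.113.8 - - [04/May/2020:04:52:03 +0000] "GET /js/app.js HTTP/1.1" 304 0\n'
)

with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write(sample)

all_js = sorted(set(read_log(tmp.name)))
get_js = sorted(set(read_log(tmp.name, allowed_method="GET")))
os.remove(tmp.name)

print(all_js)  # ['app.js', 'upload.js']
print(get_js)  # ['app.js']
```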


If you don't want a generator, you need two more lines:
def log_reader(file):
    results = set()
    with open(file) as fd:
        for line in fd:
            if re.match(r"(.*)\.js", line):
                results.add(line.split()[6].split('/')[2])
    return sorted(results)
In this case I return a sorted list of unique names instead of a generator.
To add an element to a set, you have to use the add method.
A list has append to add an element.
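
A short sketch of the difference, with a made-up file name:

```python
results = set()
results.add("app.js")
results.add("app.js")   # the duplicate is ignored

names = []
names.append("app.js")
names.append("app.js")  # the duplicate is kept

print(results)  # {'app.js'}
print(names)    # ['app.js', 'app.js']
```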


Messages In This Thread
RE: Python script - search Apache access_log.txt for all of the JavaScript (.js) - by DeaD_EyE - May-04-2020, 04:52 AM
