Here is a snippet of the file.
Here is the script I need to modify in order to get this to happen. I did not write it, but I'm being asked to learn how to support and improve it. This is my first experience with Python, so I need a good kick start.

Note that the script will not produce output unless raw_data is delimited with pipes (|). The reason is that the actual descriptions (err) can contain any number of commas; generally the descriptions are IO errors written into the tickets. This also explains why there is so much clean-up of the data in the script.

Also, if you see ways to improve this script, I'm open to suggestions. At this point I can only hope to someday be able to write code in a truly Pythonic way.
Output:

number|assigned_to|business_duration|description|priority|sys_created_by|opened_at|u_pending_closed_on|resolved_by|sys_created_on
INC3270235|Most of these are blank|0|Error messages go here|3|700000120|5/6/17 3:20|5/6/17 15:20
INC3271723|Most of these are blank|0|Error messages go here|3|700000120|5/7/17 9:43|5/7/17 9:43
INC3279067|Most of these are blank|0|Error messages go here|3|700000120|5/8/17 9:03|5/8/17 9:03
INC3319147|Most of these are blank|0|Error messages go here|3|700000120|5/11/17 1:24|5/11/17 1:24
INC3331818|Most of these are blank|0|Error messages go here|3|700000120|5/11/17 3:21|5/11/17 15:21
INC3331966|Most of these are blank|0|Error messages go here|3|700000120|5/11/17 3:31|5/11/17 15:31
INC3335862|Most of these are blank|0|Error messages go here|3|700000120|5/12/17 6:17|5/12/17 6:17
INC3344128|Most of these are blank|0|Error messages go here|3|700000120|5/12/17 3:11|5/12/17 15:11
INC3368733|Most of these are blank|0|Error messages go here|3|700000120|5/16/17 1:24|5/16/17 1:24
It is the "sys_created_on" date that I need to use as the filter.
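For the date filter itself, here is a minimal sketch of how sys_created_on could be compared against a cutoff, assuming the pipe-delimited layout shown above with sys_created_on as the tenth column. The sample rows and the cutoff date are made up for illustration, not taken from your real file:

```python
import csv
import io
from datetime import datetime

# Hypothetical rows in the same pipe-delimited shape as the output above
# (the real rows would come from ESC_Incidents.csv).
raw = io.StringIO(
    "INC3270235|bob|0|err|3|700000120|5/6/17 3:20|||5/6/17 15:20\n"
    "INC3368733|sue|0|err|3|700000120|5/16/17 1:24|||5/16/17 1:24\n"
)

cutoff = datetime(2017, 5, 10)  # assumed cutoff; keep rows created on/after this date

kept = []
for row in csv.reader(raw, delimiter="|"):
    # sys_created_on is the tenth column; parse it into a real datetime
    created_on = datetime.strptime(row[9], "%m/%d/%y %H:%M")
    if created_on >= cutoff:
        kept.append(row)

print([r[0] for r in kept])  # -> ['INC3368733']
```

Comparing datetime objects instead of the raw strings means "5/16/17" sorts after "5/6/17" correctly, which a plain string comparison would get wrong.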
#!/usr/bin/python

#IMPORTED LIBRARIES
import csv  # note: imported but currently unused
import re

#FILE PATHS
raw_data = "ESC_Incidents.csv"                      # path to the raw data
servicenow_analysis_output = "analysis_output.txt"  # path to the new text file

#OPEN FILES
data = open(raw_data, "r")                      # opens the raw data file read-only
output = open(servicenow_analysis_output, "w")  # opens/creates the output text file (currently never written to)

# CREATING DATA STRUCTURE
# the data is stored in a dictionary such that the column title is the key and
# the value is the list of all values within that column
parsedData = {}  # instantiates the dictionary
#creating a list for each column
parsedData['number'] = []
parsedData['assigned_to'] = []
parsedData['business_duration'] = []
parsedData['description'] = []
parsedData['priority'] = []
parsedData['sys_created'] = []
parsedData['opened_at'] = []
parsedData['u_pending_closed_on'] = []
parsedData['resolved_by'] = []
parsedData['sys_created_on'] = []

#FUNCTIONS
#this function parses the file and stores the data into the data structure constructed above
def parseTheData(fd, parsedData):  # fd stands for file descriptor and will be `data` for our purposes
    entireFile = fd.read()  # reads the entire file in
    entireFile = re.sub(r"\|\|", "|NULL|", entireFile)  # adds NULL in place of an empty value
    # adds another pipe before every incident number to allow parsing the file correctly
    # (bug fix: the original class [a-zA-z] also matches [, \, ], ^, _ and `)
    entireFile = re.sub(r"INC[0-9]*\|[a-zA-Z]+", r"|\g<0>", entireFile)
    entireFile = entireFile.split("|")  # separates out the columns
    cleanedData = list()
    for item in entireFile:
        # note: cleanedData is built here but never used afterwards (dead code)
        cleanedData.append(re.sub(r',+|\r|\n|""+', '', item))
    #stores the parsed data into the data structure
    count = 0
    for value in entireFile:
        value = value.lower()  # converts all strings to lowercase before storing them
        count += 1
        if count == 1:
            parsedData['number'].append(value)
        if count == 2:
            parsedData['assigned_to'].append(value)
        if count == 3:
            parsedData['business_duration'].append(value)
        if count == 4:
            parsedData['description'].append(value)
        if count == 5:
            parsedData['priority'].append(value)
        if count == 6:
            parsedData['sys_created'].append(value)
        if count == 7:
            parsedData['opened_at'].append(value)
        if count == 8:
            parsedData['u_pending_closed_on'].append(value)
        if count == 9:
            parsedData['resolved_by'].append(value)
        if count == 10:
            parsedData['sys_created_on'].append(value)
            count = 0
    return parsedData

#this function removes excess characters from the sys_created_on values
def cleanTheData(ds):
    cleanedData = list()  # clears the list
    for values in ds['sys_created_on']:
        # bug fix: the original pattern ended in '|', an empty alternative that
        # matches at every position and spaces out the whole string
        cleanedData.append(re.sub(r',+|\r|\n', ' ', values))
    ds['sys_created_on'] = cleanedData
    return ds

def findContent(ds):
    dbErrors = dict()
    #iterates through all descriptions and pulls out the db and the error message
    for message in ds["description"]:
        #searches for the error messages using regular expressions
        e = re.search(r'""responsemessages"":\[{"+message"+:"+[\w+ ,.:;-]+"+', message)
        print(e)  # debug output
        if e:
            #when an error is found, remove excess words and punctuation, leaving just the message
            errMsg = ' '.join(re.sub(r'\W+', ' ', e.group(0)).split(' ')[3:])
            #searches for the database platform
            db = re.search(r'"+resources"+:\["+[\w+ ]+"+\]', message)
            if db:  # when the db platform is found, remove excess punctuation and words
                dbPlatform = ' '.join(re.sub(r'\W+', ' ', db.group(0)).split(' ')[2:])
                #stores the count of errors found per database in a dictionary of dictionaries:
                #dbErrors[database platform][error message] = count
                if dbPlatform in dbErrors:
                    if errMsg in dbErrors[dbPlatform]:
                        dbErrors[dbPlatform][errMsg] += 1
                    else:
                        dbErrors[dbPlatform][errMsg] = 1
                else:
                    # bug fix: this was `dbErrors[dbPlatform] = 1`, an int that the
                    # following line then tried to index, raising a TypeError
                    dbErrors[dbPlatform] = {errMsg: 1}
    return dbErrors

#MAIN LOGIC
parsedData = parseTheData(data, parsedData)
parsedData = cleanTheData(parsedData)
dbErrors = findContent(parsedData)
print(dbErrors)

# CLOSE THE FILES
data.close()
output.close()
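Since suggestions were invited: the whole parse/clean/count pipeline can be expressed with csv.DictReader (which handles the pipe delimiter directly, so the pipe-inserting and NULL-padding regexes become unnecessary) and collections.Counter. This is only a sketch, not a drop-in replacement: the field names match the header above, but the sample description text and the simplified regexes are assumptions about what the real tickets contain, so they would need adjusting against actual data.

```python
import csv
import io
import re
from collections import Counter, defaultdict

# Hypothetical input; the real file would be opened with open("ESC_Incidents.csv").
# The JSON-ish description format here is an assumption for illustration.
raw = io.StringIO(
    'INC1|a|0|{"responsemessages":[{"message":"io error on write"}],"resources":["oracle"]}|3|u|d1|d2|r|5/6/17 3:20\n'
    'INC2|a|0|{"responsemessages":[{"message":"io error on write"}],"resources":["oracle"]}|3|u|d1|d2|r|5/7/17 3:20\n'
)

fields = ["number", "assigned_to", "business_duration", "description", "priority",
          "sys_created_by", "opened_at", "u_pending_closed_on", "resolved_by", "sys_created_on"]

# one Counter of error messages per database platform:
# db_errors[database platform][error message] = count
db_errors = defaultdict(Counter)

# QUOTE_NONE because descriptions may contain stray quote characters
for row in csv.DictReader(raw, fieldnames=fields, delimiter="|", quoting=csv.QUOTE_NONE):
    desc = row["description"].lower()
    msg = re.search(r'"message":"([^"]+)"', desc)      # simplified stand-in for the message regex
    db = re.search(r'"resources":\["([^"]+)"\]', desc)  # simplified stand-in for the platform regex
    if msg and db:
        db_errors[db.group(1)][msg.group(1)] += 1

print(dict(db_errors))
```

The defaultdict(Counter) removes the whole "is the key already there?" ladder: missing platforms and missing messages both start at zero automatically, which is exactly the branch where the original script breaks.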