Multiple conditions, one is null - moralear27 - Sep-12-2020

I am not very good at Python, but I am trying to write a fairly basic program that concatenates a large number of files, filters the rows by some criteria, and exports the result to a file. This is what I have so far (I still need to add a lot of input validation):

from glob import glob
import numpy as np
import pandas as pd
pd.options.display.max_columns = 100

files = glob('C:/Personal/data*.log')
datos = pd.concat([pd.read_csv(f, header=None, names=range(100), low_memory=False) for f in files])

ip = '10.1.1.5'
user = 'john'
ini_date = '07/14/2020'
fin_date = '07/18/2020'
ini_hour = '01:42:00'
fin_hour = '16:15:20'

result = datos[(datos[0] == ip) & (datos[1] == user) & (datos[2] >= ini_date) & (datos[2] <= fin_date) & (datos[3] >= ini_hour) & (datos[3] <= fin_hour)] 

result.to_csv('C:/Personal/result.csv', index=False)
I would like to know the best way to make my program ignore the variables/conditions that are not set. For example, if I run it with the following values, the result is empty:
ip = ''
user = 'john'
ini_date = '07/14/2020'
fin_date = '07/18/2020'
ini_hour = '01:42:00'
fin_hour = '16:15:20'
I need these empty variables to be ignored, or treated as "any". Is this possible? Thanks!


RE: Multiple conditions, one is null - scidam - Sep-13-2020

I would do something like the following:

from operator import ge, le, eq, and_
from functools import reduce

ip = '10.1.1.5'
user = 'john'
ini_date = '07/14/2020'
fin_date = '07/18/2020'
ini_hour = '01:42:00'
fin_hour = '16:15:20'

# Each condition holds the value to compare against, the comparison operator,
# and the position of the column it applies to.
# ge/le keep the bounds inclusive, like the >= and <= in the original filter.
conditions = {
    'ip': {'value': ip, 'op': eq, 'index': 0},
    'user': {'value': user, 'op': eq, 'index': 1},
    'ini_date': {'value': ini_date, 'op': ge, 'index': 2},
    'fin_date': {'value': fin_date, 'op': le, 'index': 2},
    'ini_hour': {'value': ini_hour, 'op': ge, 'index': 3},
    'fin_hour': {'value': fin_hour, 'op': le, 'index': 3},
}

# Skip conditions whose value is empty, then AND the remaining masks together.
# Note: reduce() raises TypeError if every value is empty.
masks = [c['op'](datos.iloc[:, c['index']], c['value'])
         for c in conditions.values() if c['value']]
result = datos[reduce(and_, masks)]
However, it is still a bit ugly. You could combine the date and time columns into a single datetime column and use the corresponding .between method to select the rows that fall between two dates. Hope that helps.
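
The datetime idea could be sketched roughly as below. This is only an illustration under some assumptions: filter_log and its parameters (start, end) are made-up names, column 2 is assumed to hold MM/DD/YYYY dates and column 3 HH:MM:SS times as in the question, and an empty string means "no filter". Note that this selects one continuous interval from start to end, which is not the same as the original filter that checks the date range and the hour range separately.

```python
import pandas as pd


def filter_log(datos, ip="", user="", start="", end=""):
    """Return the rows matching every filter that is set; '' means 'any'.

    Hypothetical helper: assumes column 0 = ip, 1 = user,
    2 = date (MM/DD/YYYY) and 3 = time (HH:MM:SS).
    """
    # Combine the date and time strings into a single datetime Series.
    when = pd.to_datetime(datos[2] + " " + datos[3],
                          format="%m/%d/%Y %H:%M:%S")

    # Start with "keep everything" and narrow it down filter by filter.
    mask = pd.Series(True, index=datos.index)
    if ip:
        mask &= datos[0] == ip
    if user:
        mask &= datos[1] == user

    # An unset bound falls back to the widest Timestamp pandas can represent.
    lower = pd.Timestamp(start) if start else pd.Timestamp.min
    upper = pd.Timestamp(end) if end else pd.Timestamp.max
    mask &= when.between(lower, upper)  # both bounds are inclusive by default

    return datos[mask]
```

With this, a call such as filter_log(datos, user='john', start='07/14/2020 01:42:00', end='07/18/2020 16:15:20') leaves ip unset, so rows for any ip are kept.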