Posts: 84
Threads: 23
Joined: Sep 2018
i need to compare/diff two lists, one comes from an csv file the other from a mysql query. problem is, they have slightly different formatting.
csv output:
[u'SAG01127_ZPIPDB_HA_LPM', u'AIXHEMADBPRC2_HA', u'AIXJBOSS8C2_HA', u'AIXJBOSS7C2_HA', u'AIXSTP16T1', u'AIXSAPP01C2_HA', u'AIXSTP11S1', u'AIXHYDDBT', u'AIXARVOPROD_LPM', u'AIXDEVOPSTEST02', u'AIXARVOT', u'AIXDX4PRODC2_HA', u'AIXSTP16T2', u'AIXSTP16PRC2_HA', u'SAG01128_ZPIPWEB_HA_LPM', u'AIXAPPL1LSHI', u'AIXEGIZENTW', u'AIXHVVIAB', u'AIXAPPL2ITU5', u'AIXSTP13PRC2_HA', u'AIXDDA3C2_HA', u'AIXSAPP03C2_HA', u'AIXEGIZTEST2', u'AIXZPVEDUC'] mysql output:
[(u'AIXDDA3C1_HA',), (u'SAG0168_AMTSSIGN_HA',), (u'KUG01143_STP14PR_HA',), (u'SAG0183_PFIFET',), (u'AIXSAGRU3',), (u'KUG01154_JBOSS5_HA',), (u'AIXSTP14S5',), (u'SAG01135_STP18PR_HA',), (u'AIXAMTEST2DB',), (u'AIXSTP13T1',), (u'AIXTESTHA2C1_HA_LPM',), (u'KUG01110_WPORTAL1',), (u'AIXDBITU5',), (u'SAG0133_ZPVTTM24',), (u'KUG01175_DABATEST',), (u'AIXSIGHTLINET2',), (u'SAG0194_WNOTES',), (u'KUG01177_WEBPROXY_STP_HA_LPM',), (u'KUG0126_JBOSS-E2',), (u'SAG0137_APACHE_HA',)] python compare routine
csv_data = unicodecsv.reader(file('ansible_load_file.txt'), encoding='utf-8')
sql = """SELECT DISTINCT NODE FROM ansible_hosts_view;"""
c2.execute(sql)
lresult = c2.fetchall()
csv_hosts_nim = []
for idx,row in enumerate(csv_data):
if not idx:#skip header if any
continue
csv_hosts_nim.append(row[0])
print(csv_hosts_nim)
diff_to_del = (list(set(lresult).difference(csv_hosts_nim))) so how do i format the mysql output to "look" more like the csv format?
Posts: 4,220
Threads: 97
Joined: Sep 2016
Use a list comprehension to pull out the tuple elements:
lresult = [row[0] for row in lresult]
Posts: 8,154
Threads: 160
Joined: Sep 2016
Oct-03-2018, 12:22 PM
(This post was last modified: Oct-03-2018, 12:22 PM by buran.)
not tested, but something like
# DB data
sql = """SELECT DISTINCT NODE FROM ansible_hosts_view;"""
c2.execute(sql)
lresult = {item[0] for item in c2.fetchall()} # already set object
# csv data
csv_reader = unicodecsv.DictReader(file('ansible_load_file.txt'), encoding='utf-8')
csv_hosts_nim = {row['id'] for row in csv_reader} # change 'id' to whatever respective column header is
#difference
diff_to_del = list(lresult.difference(csv_hosts_nim))
Posts: 84
Threads: 23
Joined: Sep 2018
Oct-03-2018, 12:26 PM
(This post was last modified: Oct-03-2018, 12:26 PM by wardancer84.)
hmm...not sure what that means. there are no headers in the csv file.
csv format:
SAG01105_SAAAP1_HA_LPM,210,Running,172.17.10.105
SAG01127_ZPIPDB_HA_LPM,209,Running,172.17.10.127
AIXHEMADBPRC2_HA,208,Running,172.17.10.208
AIXJBOSS8C2_HA,206,Running,172.17.10.206
AIXJBOSS7C2_HA,205,Running,172.17.10.205
...
Posts: 8,154
Threads: 160
Joined: Sep 2016
Oct-03-2018, 12:26 PM
(This post was last modified: Oct-03-2018, 12:27 PM by buran.)
(Oct-03-2018, 12:22 PM)wardancer84 Wrote: hmm...not sure what that means? lresult will be a set object, constructed using set comprehension.
instead of using reader object to read from the csv file - better use DictReader (no need to exclude the header row, it's used as fieldnames) - and then read respective column (in my example id, but in your file the header may be different).
you can even iterate over the cursor
lresult = {item[0] for item in c2} # already set object
Posts: 8,154
Threads: 160
Joined: Sep 2016
Oct-03-2018, 12:30 PM
(This post was last modified: Oct-03-2018, 12:30 PM by buran.)
based on this
for idx,row in enumerate(csv_data):
if not idx:#skip header if any
continue you have header row or if not then you omit the first row by mistake. In any case if idx==0 you don't process that line
if you are certain you don't have header row then this would also do
# DB data
sql = """SELECT DISTINCT NODE FROM ansible_hosts_view;"""
c2.execute(sql)
lresult = {item[0] for item in c2.fetchall()} # already set object
# csv data
csv_reader = unicodecsv.reader(file('ansible_load_file.txt'), encoding='utf-8')
csv_hosts_nim = {row[0] for row in csv_reader} # change 'id' to whatever respective column header is
#difference
diff_to_del = list(lresult.difference(csv_hosts_nim))
Posts: 84
Threads: 23
Joined: Sep 2018
ok, changed the code accordingly, looks nearly perfect except one thing.
now a "set" and two braces are smeared in which seems to prevent sucessful diff.
set(
[u'AIXSIGHTLINET1', u'AIXSIGHTLINET2', u'SAG01111_SPORTAL2_HA', u'SAG0147_DDSTT', u'KUG01148_JBOSS-T6', u'KUG01146_ALVA_HA', u'AIXSAPP02C2_HA', u'SAG0185_KTNGWI_HA', u'KUG0147_SAPDB03']
...
)
Posts: 8,154
Threads: 160
Joined: Sep 2016
Oct-03-2018, 01:18 PM
(This post was last modified: Oct-03-2018, 01:18 PM by buran.)
post your full code
the two braces ([ and the closing ]) are just representation of set object, i.e. that is because you print it
csv_data = set([u'SAG01127_ZPIPDB_HA_LPM', u'AIXHEMADBPRC2_HA', u'AIXJBOSS8C2_HA', u'AIXJBOSS7C2_HA', u'AIXSTP16T1', u'AIXSAPP01C2_HA', u'AIXSTP11S1', u'AIXHYDDBT', u'AIXARVOPROD_LPM', u'AIXDEVOPSTEST02', u'AIXARVOT', u'AIXDX4PRODC2_HA', u'AIXSTP16T2', u'AIXSTP16PRC2_HA', u'SAG01128_ZPIPWEB_HA_LPM', u'AIXAPPL1LSHI', u'AIXEGIZENTW', u'AIXHVVIAB', u'AIXAPPL2ITU5', u'AIXSTP13PRC2_HA', u'AIXDDA3C2_HA', u'AIXSAPP03C2_HA', u'AIXEGIZTEST2', u'AIXZPVEDUC'])
print(csv_data)
print('\n------------------------\n')
sql_data = [(u'AIXDDA3C1_HA',), (u'SAG0168_AMTSSIGN_HA',), (u'KUG01143_STP14PR_HA',), (u'SAG0183_PFIFET',), (u'AIXSAGRU3',), (u'KUG01154_JBOSS5_HA',), (u'AIXSTP14S5',), (u'SAG01135_STP18PR_HA',), (u'AIXAMTEST2DB',), (u'AIXSTP13T1',), (u'AIXTESTHA2C1_HA_LPM',), (u'KUG01110_WPORTAL1',), (u'AIXDBITU5',), (u'SAG0133_ZPVTTM24',), (u'KUG01175_DABATEST',), (u'AIXSIGHTLINET2',), (u'SAG0194_WNOTES',), (u'KUG01177_WEBPROXY_STP_HA_LPM',), (u'KUG0126_JBOSS-E2',), (u'SAG0137_APACHE_HA',)]
sql_set = {item[0] for item in sql_data}
print(sql_set)
print('\n------------------------\n')
print(list(sql_set.difference(csv_data))) Output: set([u'AIXDDA3C2_HA', u'AIXEGIZENTW', u'SAG01128_ZPIPWEB_HA_LPM', u'AIXHYDDBT', u'AIXSAPP03C2_HA', u'AIXDX4PRODC2_HA', u'AIXAPPL2ITU5', u'AIXSTP13PRC2_HA', u'AIXJBOSS7C2_HA', u'AIXSTP16PRC2_HA', u'AIXSTP11S1', u'SAG01127_ZPIPDB_HA_LPM', u'AIXEGIZTEST2', u'AIXHVVIAB', u'AIXJBOSS8C2_HA', u'AIXARVOT', u'AIXZPVEDUC', u'AIXSAPP01C2_HA', u'AIXAPPL1LSHI', u'AIXHEMADBPRC2_HA', u'AIXSTP16T2', u'AIXARVOPROD_LPM', u'AIXSTP16T1', u'AIXDEVOPSTEST02'])
------------------------
set([u'AIXDDA3C1_HA', u'SAG0194_WNOTES', u'SAG0183_PFIFET', u'KUG01154_JBOSS5_HA', u'AIXSTP14S5', u'SAG01135_STP18PR_HA', u'AIXSTP13T1', u'AIXSAGRU3', u'AIXTESTHA2C1_HA_LPM', u'KUG01175_DABATEST', u'AIXSIGHTLINET2', u'AIXAMTEST2DB', u'KUG01110_WPORTAL1', u'AIXDBITU5', u'SAG0168_AMTSSIGN_HA', u'KUG01177_WEBPROXY_STP_HA_LPM', u'SAG0133_ZPVTTM24', u'SAG0137_APACHE_HA', u'KUG01143_STP14PR_HA', u'KUG0126_JBOSS-E2'])
------------------------
[u'AIXDDA3C1_HA', u'SAG0194_WNOTES', u'SAG0183_PFIFET', u'KUG01154_JBOSS5_HA', u'AIXSTP14S5', u'SAG01135_STP18PR_HA', u'AIXSTP13T1', u'AIXSAGRU3', u'AIXTESTHA2C1_HA_LPM', u'KUG01175_DABATEST', u'AIXSIGHTLINET2', u'AIXAMTEST2DB', u'KUG01110_WPORTAL1', u'KUG0126_JBOSS-E2', u'AIXDBITU5', u'SAG0168_AMTSSIGN_HA', u'KUG01177_WEBPROXY_STP_HA_LPM', u'SAG0133_ZPVTTM24', u'KUG01143_STP14PR_HA', u'SAG0137_APACHE_HA']
>>>
Posts: 84
Threads: 23
Joined: Sep 2018
goal is to delete the hosts which are in "lresult" and not in "csv_hosts_nim"
then insert what is in "csv_data" row0 and row3.
'''
populates ansible inventory db from
aix registry db
'''
# -*- coding: utf-8 -*-
import os
import sys
import unicodecsv
import pymysql as db
try:
import json
from json import JSONDecodeError
except ImportError:
import simplejson as json
from simplejson import JSONDecodeError
import logging
logging.basicConfig(filename='/tmp/reg2inv.log',level=logging.DEBUG, format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
logger = logging.getLogger(__name__)
conn = db.connect(host='localhost', user='ansible', passwd='ansible', db='ansible_inv', charset='utf8')
c2 = conn.cursor()
try:
csv_data = unicodecsv.reader(file('ansible_load_file.txt'), encoding='utf-8')
csv_hosts_nim = {row[0] for row in csv_data}
sql = """SELECT DISTINCT NODE FROM ansible_hosts_view;"""
c2.execute(sql)
lresult = {item[0] for item in c2.fetchall()}
print("csv_hosts_nim contains:", csv_hosts_nim)
diff_to_del = list(lresult.difference(csv_hosts_nim))
print("diff_to_del contains:", diff_to_del)
idx = 0
for idx,row in enumerate(diff_to_del, start=1):
host = row
sql = """DELETE FROM host WHERE host='%s';""".format(host)
c2.execute(sql % (host))
logger.debug("hosts to delete: {}".format(diff_to_del))
logger.debug("delete:affected rows = {}".format(idx))
for idx,row in enumerate(csv_data, start=1):
host = row[0]
hostname = row[3]
var_data = {'inventory_lparname': host}
var_json = json.dumps(var_data)
ena = "1"
sql = """INSERT INTO host(host, hostname, variables, enabled) VALUES ('%s', '%s', '%s', '%s') ON DUPLICATE KEY UPDATE
host='{}', hostname='{}',variables='{}',enabled='{}';""".format(host, hostname, var_json, ena)
c2.execute(sql % (host, hostname, var_json, ena))
logger.debug("hosts to insert: {}".format(csv_hosts_nim))
logger.debug("insert:affected rows = {}".format(idx))
except db.Error,e:
print e[0], e[1]
c2.rollback()
c2.close()
conn.commit()
conn.close() stdout:
seems reasonable as db is empty so nothing to delete
('csv_hosts_nim contains:', set([u'AIXSIGHTLINET1', u'AIXSIGHTLINET2', u'SAG01111_SPORTAL2_HA', u'SAG0147_DDSTT', u'KUG01148_JBOSS-T6', u'KUG01146_ALVA_HA', u'AIXSAPP02C2_HA', u'SAG0185_KTNGWI_HA', u'KUG01171_TQLENTW03', u'KUG0175_ORACLE2_HA_LPM'
...
]))
('diff_to_del contains:', []) logfile output:
seems it, for whatever reason, ignores the insert part...
10/03/2018 03:57:55 PM hosts to delete: []
10/03/2018 03:57:55 PM delete:affected rows = 0
10/03/2018 03:57:55 PM hosts to insert: set([u'AIXSIGHTLINET1', u'AIXSIGHTLINET2', u'SAG01111_SPORTAL2_HA', u'SAG0147_DDSTT', u'KUG01148_JBOSS-T6', u'KUG01146_ALVA_HA', u'AIXSAPP02C2_HA', u'SAG0185_KTNGWI_HA', u'KUG01171_TQLENTW03'
...
])
10/03/2018 03:57:55 PM insert:affected rows = 0
Posts: 8,154
Threads: 160
Joined: Sep 2016
Oct-03-2018, 02:20 PM
(This post was last modified: Oct-03-2018, 02:20 PM by buran.)
csv_data is reader object. On line 29 you iterate over it so, after line 29 it is exhausted (i.e. you are at the end). Now your data are in csv_hosts_nim .
Then on line 43 you try to iterate over csv_data again, but at that time you have exhausted the reader (you are at the end of the file), so it just skip that part.
Instead over csv_data, iterate over csv_hosts_nim .
although I would do it differently altogether, you can do
rdr = unicodecsv.reader(file('ansible_load_file.txt'), encoding='utf-8')
csv_data = ((row[0], row[3]) for row in rdr)
csv_hosts_nim = {row[0] for row in csv_data} to minimize changes
|