Posts: 84
Threads: 23
Joined: Sep 2018
i need to compare/diff two lists, one comes from an csv file the other from a mysql query. problem is, they have slightly different formatting.
csv output:
1 |
[u 'SAG01127_ZPIPDB_HA_LPM' , u 'AIXHEMADBPRC2_HA' , u 'AIXJBOSS8C2_HA' , u 'AIXJBOSS7C2_HA' , u 'AIXSTP16T1' , u 'AIXSAPP01C2_HA' , u 'AIXSTP11S1' , u 'AIXHYDDBT' , u 'AIXARVOPROD_LPM' , u 'AIXDEVOPSTEST02' , u 'AIXARVOT' , u 'AIXDX4PRODC2_HA' , u 'AIXSTP16T2' , u 'AIXSTP16PRC2_HA' , u 'SAG01128_ZPIPWEB_HA_LPM' , u 'AIXAPPL1LSHI' , u 'AIXEGIZENTW' , u 'AIXHVVIAB' , u 'AIXAPPL2ITU5' , u 'AIXSTP13PRC2_HA' , u 'AIXDDA3C2_HA' , u 'AIXSAPP03C2_HA' , u 'AIXEGIZTEST2' , u 'AIXZPVEDUC' ]
|
mysql output:
1 |
[(u 'AIXDDA3C1_HA' ,), (u 'SAG0168_AMTSSIGN_HA' ,), (u 'KUG01143_STP14PR_HA' ,), (u 'SAG0183_PFIFET' ,), (u 'AIXSAGRU3' ,), (u 'KUG01154_JBOSS5_HA' ,), (u 'AIXSTP14S5' ,), (u 'SAG01135_STP18PR_HA' ,), (u 'AIXAMTEST2DB' ,), (u 'AIXSTP13T1' ,), (u 'AIXTESTHA2C1_HA_LPM' ,), (u 'KUG01110_WPORTAL1' ,), (u 'AIXDBITU5' ,), (u 'SAG0133_ZPVTTM24' ,), (u 'KUG01175_DABATEST' ,), (u 'AIXSIGHTLINET2' ,), (u 'SAG0194_WNOTES' ,), (u 'KUG01177_WEBPROXY_STP_HA_LPM' ,), (u 'KUG0126_JBOSS-E2' ,), (u 'SAG0137_APACHE_HA' ,)]
|
python compare routine
1 2 3 4 5 6 7 8 9 10 11 |
csv_data = unicodecsv.reader( file ( 'ansible_load_file.txt' ), encoding = 'utf-8' )
sql =
c2.execute(sql)
lresult = c2.fetchall()
csv_hosts_nim = []
for idx,row in enumerate (csv_data):
if not idx:
continue
csv_hosts_nim.append(row[ 0 ])
print (csv_hosts_nim)
diff_to_del = ( list ( set (lresult).difference(csv_hosts_nim)))
|
so how do i format the mysql output to "look" more like the csv format?
Posts: 4,220
Threads: 97
Joined: Sep 2016
Use a list comprehension to pull out the tuple elements:
1 |
lresult = [row[ 0 ] for row in lresult]
|
Posts: 8,165
Threads: 160
Joined: Sep 2016
Oct-03-2018, 12:22 PM
(This post was last modified: Oct-03-2018, 12:22 PM by buran.)
not tested, but something like
1 2 3 4 5 6 7 8 9 10 11 |
sql =
c2.execute(sql)
lresult = {item[ 0 ] for item in c2.fetchall()}
csv_reader = unicodecsv.DictReader( file ( 'ansible_load_file.txt' ), encoding = 'utf-8' )
csv_hosts_nim = {row[ 'id' ] for row in csv_reader}
diff_to_del = list (lresult.difference(csv_hosts_nim))
|
Posts: 84
Threads: 23
Joined: Sep 2018
Oct-03-2018, 12:26 PM
(This post was last modified: Oct-03-2018, 12:26 PM by wardancer84.)
hmm...not sure what that means. there are no headers in the csv file.
csv format:
1 2 3 4 5 6 |
SAG01105_SAAAP1_HA_LPM, 210 ,Running, 172.17 . 10.105
SAG01127_ZPIPDB_HA_LPM, 209 ,Running, 172.17 . 10.127
AIXHEMADBPRC2_HA, 208 ,Running, 172.17 . 10.208
AIXJBOSS8C2_HA, 206 ,Running, 172.17 . 10.206
AIXJBOSS7C2_HA, 205 ,Running, 172.17 . 10.205
...
|
Posts: 8,165
Threads: 160
Joined: Sep 2016
Oct-03-2018, 12:26 PM
(This post was last modified: Oct-03-2018, 12:27 PM by buran.)
(Oct-03-2018, 12:22 PM)wardancer84 Wrote: hmm...not sure what that means? lresult will be a set object, constructed using set comprehension.
instead of using reader object to read from the csv file - better use DictReader (no need to exclude the header row, it's used as fieldnames) - and then read respective column (in my example id, but in your file the header may be different).
you can even iterate over the cursor
1 |
lresult = {item[ 0 ] for item in c2}
|
Posts: 8,165
Threads: 160
Joined: Sep 2016
Oct-03-2018, 12:30 PM
(This post was last modified: Oct-03-2018, 12:30 PM by buran.)
based on this
1 2 3 |
for idx,row in enumerate (csv_data):
if not idx:
continue
|
you have header row or if not then you omit the first row by mistake. In any case if idx==0 you don't process that line
if you are certain you don't have header row then this would also do
1 2 3 4 5 6 7 8 9 10 11 |
sql =
c2.execute(sql)
lresult = {item[ 0 ] for item in c2.fetchall()}
csv_reader = unicodecsv.reader( file ( 'ansible_load_file.txt' ), encoding = 'utf-8' )
csv_hosts_nim = {row[ 0 ] for row in csv_reader}
diff_to_del = list (lresult.difference(csv_hosts_nim))
|
Posts: 84
Threads: 23
Joined: Sep 2018
ok, changed the code accordingly, looks nearly perfect except one thing.
now a "set" and two braces are smeared in which seems to prevent sucessful diff.
1 2 3 4 |
set (
[u 'AIXSIGHTLINET1' , u 'AIXSIGHTLINET2' , u 'SAG01111_SPORTAL2_HA' , u 'SAG0147_DDSTT' , u 'KUG01148_JBOSS-T6' , u 'KUG01146_ALVA_HA' , u 'AIXSAPP02C2_HA' , u 'SAG0185_KTNGWI_HA' , u 'KUG0147_SAPDB03' ]
...
)
|
Posts: 8,165
Threads: 160
Joined: Sep 2016
Oct-03-2018, 01:18 PM
(This post was last modified: Oct-03-2018, 01:18 PM by buran.)
post your full code
the two braces ([ and the closing ]) are just representation of set object, i.e. that is because you print it
1 2 3 4 5 6 7 8 9 10 11 |
csv_data = set ([u 'SAG01127_ZPIPDB_HA_LPM' , u 'AIXHEMADBPRC2_HA' , u 'AIXJBOSS8C2_HA' , u 'AIXJBOSS7C2_HA' , u 'AIXSTP16T1' , u 'AIXSAPP01C2_HA' , u 'AIXSTP11S1' , u 'AIXHYDDBT' , u 'AIXARVOPROD_LPM' , u 'AIXDEVOPSTEST02' , u 'AIXARVOT' , u 'AIXDX4PRODC2_HA' , u 'AIXSTP16T2' , u 'AIXSTP16PRC2_HA' , u 'SAG01128_ZPIPWEB_HA_LPM' , u 'AIXAPPL1LSHI' , u 'AIXEGIZENTW' , u 'AIXHVVIAB' , u 'AIXAPPL2ITU5' , u 'AIXSTP13PRC2_HA' , u 'AIXDDA3C2_HA' , u 'AIXSAPP03C2_HA' , u 'AIXEGIZTEST2' , u 'AIXZPVEDUC' ])
print (csv_data)
print ( '\n------------------------\n' )
sql_data = [(u 'AIXDDA3C1_HA' ,), (u 'SAG0168_AMTSSIGN_HA' ,), (u 'KUG01143_STP14PR_HA' ,), (u 'SAG0183_PFIFET' ,), (u 'AIXSAGRU3' ,), (u 'KUG01154_JBOSS5_HA' ,), (u 'AIXSTP14S5' ,), (u 'SAG01135_STP18PR_HA' ,), (u 'AIXAMTEST2DB' ,), (u 'AIXSTP13T1' ,), (u 'AIXTESTHA2C1_HA_LPM' ,), (u 'KUG01110_WPORTAL1' ,), (u 'AIXDBITU5' ,), (u 'SAG0133_ZPVTTM24' ,), (u 'KUG01175_DABATEST' ,), (u 'AIXSIGHTLINET2' ,), (u 'SAG0194_WNOTES' ,), (u 'KUG01177_WEBPROXY_STP_HA_LPM' ,), (u 'KUG0126_JBOSS-E2' ,), (u 'SAG0137_APACHE_HA' ,)]
sql_set = {item[ 0 ] for item in sql_data}
print (sql_set)
print ( '\n------------------------\n' )
print ( list (sql_set.difference(csv_data)))
|
Output: set([u'AIXDDA3C2_HA', u'AIXEGIZENTW', u'SAG01128_ZPIPWEB_HA_LPM', u'AIXHYDDBT', u'AIXSAPP03C2_HA', u'AIXDX4PRODC2_HA', u'AIXAPPL2ITU5', u'AIXSTP13PRC2_HA', u'AIXJBOSS7C2_HA', u'AIXSTP16PRC2_HA', u'AIXSTP11S1', u'SAG01127_ZPIPDB_HA_LPM', u'AIXEGIZTEST2', u'AIXHVVIAB', u'AIXJBOSS8C2_HA', u'AIXARVOT', u'AIXZPVEDUC', u'AIXSAPP01C2_HA', u'AIXAPPL1LSHI', u'AIXHEMADBPRC2_HA', u'AIXSTP16T2', u'AIXARVOPROD_LPM', u'AIXSTP16T1', u'AIXDEVOPSTEST02'])
------------------------
set([u'AIXDDA3C1_HA', u'SAG0194_WNOTES', u'SAG0183_PFIFET', u'KUG01154_JBOSS5_HA', u'AIXSTP14S5', u'SAG01135_STP18PR_HA', u'AIXSTP13T1', u'AIXSAGRU3', u'AIXTESTHA2C1_HA_LPM', u'KUG01175_DABATEST', u'AIXSIGHTLINET2', u'AIXAMTEST2DB', u'KUG01110_WPORTAL1', u'AIXDBITU5', u'SAG0168_AMTSSIGN_HA', u'KUG01177_WEBPROXY_STP_HA_LPM', u'SAG0133_ZPVTTM24', u'SAG0137_APACHE_HA', u'KUG01143_STP14PR_HA', u'KUG0126_JBOSS-E2'])
------------------------
[u'AIXDDA3C1_HA', u'SAG0194_WNOTES', u'SAG0183_PFIFET', u'KUG01154_JBOSS5_HA', u'AIXSTP14S5', u'SAG01135_STP18PR_HA', u'AIXSTP13T1', u'AIXSAGRU3', u'AIXTESTHA2C1_HA_LPM', u'KUG01175_DABATEST', u'AIXSIGHTLINET2', u'AIXAMTEST2DB', u'KUG01110_WPORTAL1', u'KUG0126_JBOSS-E2', u'AIXDBITU5', u'SAG0168_AMTSSIGN_HA', u'KUG01177_WEBPROXY_STP_HA_LPM', u'SAG0133_ZPVTTM24', u'KUG01143_STP14PR_HA', u'SAG0137_APACHE_HA']
>>>
Posts: 84
Threads: 23
Joined: Sep 2018
goal is to delete the hosts which are in "lresult" and not in "csv_hosts_nim"
then insert what is in "csv_data" row0 and row3.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 |
import os
import sys
import unicodecsv
import pymysql as db
try :
import json
from json import JSONDecodeError
except ImportError:
import simplejson as json
from simplejson import JSONDecodeError
import logging
logging.basicConfig(filename = '/tmp/reg2inv.log' ,level = logging.DEBUG, format = '%(asctime)s %(message)s' , datefmt = '%m/%d/%Y %I:%M:%S %p' )
logger = logging.getLogger(__name__)
conn = db.connect(host = 'localhost' , user = 'ansible' , passwd = 'ansible' , db = 'ansible_inv' , charset = 'utf8' )
c2 = conn.cursor()
try :
csv_data = unicodecsv.reader( file ( 'ansible_load_file.txt' ), encoding = 'utf-8' )
csv_hosts_nim = {row[ 0 ] for row in csv_data}
sql =
c2.execute(sql)
lresult = {item[ 0 ] for item in c2.fetchall()}
print ( "csv_hosts_nim contains:" , csv_hosts_nim)
diff_to_del = list (lresult.difference(csv_hosts_nim))
print ( "diff_to_del contains:" , diff_to_del)
idx = 0
for idx,row in enumerate (diff_to_del, start = 1 ):
host = row
sql = . format (host)
c2.execute(sql % (host))
logger.debug( "hosts to delete: {}" . format (diff_to_del))
logger.debug( "delete:affected rows = {}" . format (idx))
for idx,row in enumerate (csv_data, start = 1 ):
host = row[ 0 ]
hostname = row[ 3 ]
var_data = { 'inventory_lparname' : host}
var_json = json.dumps(var_data)
ena = "1"
sql =
. format (host, hostname, var_json, ena)
c2.execute(sql % (host, hostname, var_json, ena))
logger.debug( "hosts to insert: {}" . format (csv_hosts_nim))
logger.debug( "insert:affected rows = {}" . format (idx))
except db.Error,e:
print e[ 0 ], e[ 1 ]
c2.rollback()
c2.close()
conn.commit()
conn.close()
|
stdout:
seems reasonable as db is empty so nothing to delete
1 2 3 4 |
( 'csv_hosts_nim contains:' , set ([u 'AIXSIGHTLINET1' , u 'AIXSIGHTLINET2' , u 'SAG01111_SPORTAL2_HA' , u 'SAG0147_DDSTT' , u 'KUG01148_JBOSS-T6' , u 'KUG01146_ALVA_HA' , u 'AIXSAPP02C2_HA' , u 'SAG0185_KTNGWI_HA' , u 'KUG01171_TQLENTW03' , u 'KUG0175_ORACLE2_HA_LPM'
...
]))
( 'diff_to_del contains:' , [])
|
logfile output:
seems it, for whatever reason, ignores the insert part...
1 2 3 4 5 6 |
10 / 03 / 2018 03 : 57 : 55 PM hosts to delete: []
10 / 03 / 2018 03 : 57 : 55 PM delete:affected rows = 0
10 / 03 / 2018 03 : 57 : 55 PM hosts to insert: set ([u 'AIXSIGHTLINET1' , u 'AIXSIGHTLINET2' , u 'SAG01111_SPORTAL2_HA' , u 'SAG0147_DDSTT' , u 'KUG01148_JBOSS-T6' , u 'KUG01146_ALVA_HA' , u 'AIXSAPP02C2_HA' , u 'SAG0185_KTNGWI_HA' , u 'KUG01171_TQLENTW03'
...
])
10 / 03 / 2018 03 : 57 : 55 PM insert:affected rows = 0
|
Posts: 8,165
Threads: 160
Joined: Sep 2016
Oct-03-2018, 02:20 PM
(This post was last modified: Oct-03-2018, 02:20 PM by buran.)
csv_data is reader object. On line 29 you iterate over it so, after line 29 it is exhausted (i.e. you are at the end). Now your data are in csv_hosts_nim .
Then on line 43 you try to iterate over csv_data again, but at that time you have exhausted the reader (you are at the end of the file), so it just skip that part.
Instead over csv_data, iterate over csv_hosts_nim .
although I would do it differently altogether, you can do
1 2 3 |
rdr = unicodecsv.reader( file ( 'ansible_load_file.txt' ), encoding = 'utf-8' )
csv_data = ((row[ 0 ], row[ 3 ]) for row in rdr)
csv_hosts_nim = {row[ 0 ] for row in csv_data}
|
to minimize changes
|