Python Forum
script optimization
#1
First post to the forum! Looking forward to hearing your input.

I have a script I'm writing at work that will help us audit our Cisco ASA firewall rules. It parses the ACL output into objects and then inserts/updates the information in a MySQL database. The whole process takes longer than I think it should, and I need some help with a few things. Using multithreading I pull the firewall ACL output from eight firewalls at a time. This has cut the time for getting the ACL output from over an hour down to around 15 minutes (this is an enterprise-level organization, and it's being done across 52 firewall contexts). However, inserting/updating the rule information for all firewalls in the database still takes over an hour. I would really like to cut that time down the way I did with the ACL retrieval, but I'm not sure what the best approach is. Would it be best to use multithreading, multiprocessing, or MySQL connection pooling? Or a combination of the three? Thanks in advance to anyone who can help!
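For context, the threaded fetch is basically the pattern below (a simplified sketch; fetch_acl is just a stand-in for the SSH call that grabs one context's ACL):

from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_acl(host):
    # Stand-in: SSH to the firewall context and return its ACL output as text.
    ...

firewalls = ["fw-ctx-01", "fw-ctx-02"]  # 52 contexts in the real run

acl_output = {}
with ThreadPoolExecutor(max_workers=8) as pool:  # eight firewalls at a time
    futures = {pool.submit(fetch_acl, fw): fw for fw in firewalls}
    for future in as_completed(futures):
        fw = futures[future]
        try:
            acl_output[fw] = future.result()
        except Exception as exc:
            print(f"{fw} failed: {exc}")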
#2
How long does it take to get the ACL for a single firewall? I feel like 15 minutes is still a huge amount of time, unless you're spending most of that time just waiting for the device to respond.

For the MySQL tables, are there indexes in place? How many rows? Over an hour just to add some records is also a little ridiculous. How many ACL rules are you getting at once? Is it several thousand? When inserting, do you insert each one individually, or do you use a single insert statement for all of them?

Before talking multiprocessing, I think it'd be worthwhile to figure out why it's so slow to begin with...
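Even a couple of time.perf_counter() calls around the main phases would show whether the hour goes to the SSH fetch, the parsing, or the database work. A rough sketch (fetch_acl, parse_acl, and load_rules are placeholders for whatever your functions are actually called):

import time

def fetch_acl(fw): ...       # placeholder: SSH/download step
def parse_acl(text): ...     # placeholder: parse into rule objects
def load_rules(rules): ...   # placeholder: MySQL inserts/updates

start = time.perf_counter()
acl_text = fetch_acl("fw-ctx-01")
fetched = time.perf_counter()
rules = parse_acl(acl_text)
parsed = time.perf_counter()
load_rules(rules)
done = time.perf_counter()

print(f"fetch {fetched - start:.1f}s, parse {parsed - fetched:.1f}s, db {done - parsed:.1f}s")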
#3
Depends on the firewall, but at most 5 minutes. The ACLs are in clusters and the output has a side A and a side B. The largest output file is 5.1MB. The database does have multiple tables and they are indexed. I do some checks before inserting/updating the rules. Currently the table has over 110,000 rules in it. Most files are downloaded and parsed fairly quickly, but about 10 of the 52 contexts take longer than the others, with the 5.1MB file taking almost 30 minutes to get the output and parse into the database. I have a DB engineer on my team who helped me ensure the database design was optimized, as well as the queries. I've revised my code quite a few times thus far, optimizing functions and classes where I can.
#4
Have you done some real profiling of your application? No use optimizing the wrong bits.
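cProfile will show you where the time actually goes. Something along these lines (audit_one_context is a made-up name for whatever function drives one firewall end to end):

import cProfile
import pstats

def audit_one_context(name): ...   # made-up stand-in for fetch + parse + insert

# Profile one context end to end and dump the stats to a file.
cProfile.run("audit_one_context('fw-ctx-01')", "audit.prof")

stats = pstats.Stats("audit.prof")
stats.sort_stats("cumulative").print_stats(20)   # the 20 biggest time sinks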

The design of the DB can be fine, but if you insert data element by element and commit each time, that takes time. Bulk insertion can be your friend.
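With mysql-connector-python (or any DB-API driver) that means executemany() and a single commit at the end; the table and column names below are just placeholders:

import mysql.connector

conn = mysql.connector.connect(host="dbhost", user="audit",
                               password="secret", database="fw_audit")
cur = conn.cursor()

# One tuple per parsed ACL rule, built up front.
rows = [("fw-ctx-01", "outside_in", "permit tcp any any eq 443"),
        ("fw-ctx-01", "outside_in", "deny ip any any")]

# One batched statement instead of thousands of single-row INSERTs,
# and one commit once the whole batch is done.
cur.executemany(
    "INSERT INTO acl_rules (firewall, acl_name, rule_text) VALUES (%s, %s, %s)",
    rows,
)
conn.commit()
cur.close()
conn.close()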
#5
Also, if the table(s) have indexes, it can be beneficial to drop the indexes before inserting/updating many rows and re-add them when done. Otherwise the db has to update the index on every insert, which makes it very slow.
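Roughly like this (the table and index names are made up, and rows is the same batch of parsed rule tuples as in the insert example above):

import mysql.connector

conn = mysql.connector.connect(host="dbhost", user="audit",
                               password="secret", database="fw_audit")
cur = conn.cursor()

rows = []   # the batch of parsed rule tuples

# Drop the secondary index before the big load...
cur.execute("ALTER TABLE acl_rules DROP INDEX idx_fw_acl")

cur.executemany(
    "INSERT INTO acl_rules (firewall, acl_name, rule_text) VALUES (%s, %s, %s)",
    rows,
)
conn.commit()

# ...then rebuild it once, after all the rows are in.
cur.execute("ALTER TABLE acl_rules ADD INDEX idx_fw_acl (firewall, acl_name)")

conn.close()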
#6
Thanks @Ofnuts and @nilamo for the good suggestions. I will look at both of those today and report back.
#7
So, I've found an issue with pulling the configs via SSH using multithreading. Going to take that to the networking help area. BUT... I did add some functions for doing bulk updates and inserts into my database rather than one row at a time. Made a HUGE difference. Ran through all 52 firewall contexts in 20 minutes. I didn't know that was an option, but I guess if I had kept digging I would have found it eventually. Thanks for the help thus far (seriously, thank you), and I will come back to this to discuss some more once I figure out my SSH issue.