Python Forum
python script fails with a lot of data
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
python script fails with a lot of data
#1
I wrote a python script that seemed to work pretty well. It lists EC2 instances in AWS and then writes them to our Confluence wiki.

If it processes one environment with 10 servers it works and writes to Confluence. If it works against multiple AWS accounts or just one account with 100 or more servers it fails to write to Confluence with this stack trace:

Error:
Traceback (most recent call last): File ".\aws_ec2_list_instances_orig.py", line 550, in <module> main() File ".\aws_ec2_list_instances_orig.py", line 543, in main write_data_to_confluence(auth, html, pageid, title) File ".\aws_ec2_list_instances_orig.py", line 391, in write_data_to_confluence r.raise_for_status() File "C:\Users\tdunphy\AppData\Roaming\Python\Python37\site-packages\requests\models.py", line 940, in raise_for_status requests.exceptions.HTTPError: 400 Client Error: for url: https://wiki.us.cworld.company.com/rest/api/content/138317098
Please note I AM NOT ALLOWED TO POST THE COMPANY DOMAIN IN MY POSTS. I will substitute 'company.com' where my real company domain would be.

Here is the script:

#!/usr/bin/env python3

# Import modules
import boto3
import time
import objectpath
import csv
import os
import sys
import json
import requests
from requests_kerberos import HTTPKerberosAuth
import codecs
from datetime import datetime
from os.path import basename
from subprocess import check_output,CalledProcessError,PIPE

BASE_URL = "https://kiki.us.kworld.kpmg.com/rest/api/content"
VIEW_URL = "https://kiki.us.kworld.kpmg.com/pages/viewpage.action?pageId="


def banner(message, border='-'):
    line = border * len(message)
    print(line)
    print(message)
    print(line)


def initialize(interactive, aws_account):
    # Set the date
    today = datetime.today()
    today = today.strftime("%m-%d-%Y")
    # Set source files
    aws_env_list="../../source_files/aws_environments/aws_environments_all.txt"
    output_dir = "../../output_files/aws_instance_list/csv/"
    if interactive == 1:
        output_file = output_dir + 'aws-instance-master-list-' + aws_account + '-' + today +'.csv'
        output_file_name = 'aws-instance-master-list-' + aws_account + '-' + today +'.csv'
    else:
        output_file = output_dir + 'aws-instance-master-list-' + today +'.csv'
        output_file_name = 'aws-instance-master-list-' + today +'.csv'    
    return today, aws_env_list, output_file, output_file_name

def authenticate():
    #auth = get_login()
    auth = ('tdunphy', 'local4tl4nt1cNJ!')
    auth = str(auth).replace('(','').replace('\'','').replace(',',':').replace(')','').replace(' ','')
    kerberos_auth = HTTPKerberosAuth(mutual_authentication="DISABLED",principal=auth)
    auth = kerberos_auth
    return auth

## These are dummy AWS account numbers. I cannot post account number for my company.
def aws_accounts_to_account_numbers(aws_account):
    switcher = {
        'kpmg-lab': '123456789101',
        'kpmg-bill': '123456789102',
        'kpmg-stage': '123456789103',
        'kpmg-dlab': '123456789103',
    }
    return switcher.get(aws_account, "nothing")


def list_instances(aws_account,aws_account_number, interactive):
    today, aws_env_list, output_file, output_file_name = initialize(interactive, aws_account)
    engagement = None
    # Set the account
    session = boto3.Session(profile_name=aws_account)
    ec2 = session.client("ec2")
    fieldnames = [ 'AWS Account', 'Account Number', 'Name', 'Instance ID', 'VPC ID', 'Type', 'Platform', 'State', 'Key Name', 'Private IP', 'Public IP', 'Private DNS', 'Volumes', 'Availability Zone', 'Launch Date', 'Engagement Code']
    # Set the ec2 dictionary
    ec2info = {}
    public_ips_list = ''
    private_ips_list = ''
    private_dns = None
	with open(output_file, mode='w+') as csv_file:
		writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=',', lineterminator='\n')
		writer.writeheader()
    if 'gov' in aws_account and not 'admin' in aws_account:
        print("This is a gov account.")
        session = boto3.Session(profile_name=aws_account,region_name='us-gov-west-1')
    else:
        print("This is a commercial account.")
        session = boto3.Session(profile_name=aws_account,region_name='us-east-1')

    ec2 = session.client("ec2")
    # Loop through the instances
    instance_list = ec2.describe_instances()
    for reservation in instance_list["Reservations"]:
            for instance in reservation.get("Instances", []):
                launch_time = instance["LaunchTime"]
                launch_time_friendly = launch_time.strftime("%B %d %Y")
                tree = objectpath.Tree(instance)
                block_devices = set(tree.execute('$..BlockDeviceMappings[\'Ebs\'][\'VolumeId\']'))
                if len(block_devices) == 0:
                    block_devices_list = None
                else:
                    block_devices_list = list(block_devices)
                    block_devices_list = str(block_devices_list).replace('[','').replace(']','').replace('\'','').replace('{','').replace('}', '')
                private_ips =  set(tree.execute('$..PrivateIpAddress'))
                if len(private_ips) == 0:
                    private_ips_list = None
                else:
                    private_ips_list = list(private_ips)
                    private_ips_list = str(private_ips_list).replace('[','').replace(']','').replace('\'','')
                public_ips =  set(tree.execute('$..PublicIp'))
                if len(public_ips) == 0:
                    public_ips_list = None
                else:
                    public_ips_list = list(public_ips)
                    public_ips_list = str(public_ips_list).replace('[','').replace(']','').replace('\'','')
                if 'KeyName' in instance:
                    key_name = instance['KeyName']
                else:
                    key_name = None
                name = None
                if 'Tags' in instance:
                    try:
                        tags = instance['Tags']
                        name = None
                        for tag in tags:
                            if tag["Key"] == "Name":
                                name = tag["Value"]
                        for tag in tags:
                            if tag["Key"] == "Engagement":
                                engagement = tag["Value"]
                            else:
                                engagement = None
                    except ValueError:
                        print("Instance: %s has no tags" % instance_id)
                if 'VpcId' in instance:
                    vpc_id = instance['VpcId']
                else:
                    vpc_id = None
                if 'PrivateDnsName' in instance:
                    private_dns = instance['PrivateDnsName']
                else:
                    private_dns = None
                if 'Platform' in instance:
                    platform = instance['Platform']
                else:
                    platform = None
                ec2info[instance['InstanceId']] = {
                    'AWS Account': aws_account,
                    'Account Number': aws_account_number,
                    'Name': name,
                    'Instance ID': instance['InstanceId'],
                    'VPC ID': vpc_id,
                    'Type': instance['InstanceType'],
                    'Platform': platform,
                    'State': instance['State']['Name'],
                    'Key Name': key_name,
                    'Private IP': private_ips_list,
                    'Public IP': public_ips_list,
                    'Private DNS': private_dns,
                    'Volumes': block_devices_list,
                    'Availability Zone': instance['Placement']['AvailabilityZone'],
                    'Launch Date': launch_time_friendly,
                    'Engagement Code': engagement
                }
                with open(output_file,'a') as csv_file:
                    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=',', lineterminator='\n')
                    writer.writerow({'AWS Account': aws_account, "Account Number": aws_account_number, 'Name': name, 'Instance ID': instance["InstanceId"], 'VPC ID': vpc_id, 'Type': instance["InstanceType"], 'Platform': platform, 'State': instance["State"]["Name"], 'Key Name': key_name,  'Private IP': private_ips_list, 'Public IP': public_ips_list, 'Private DNS': private_dns, 'Volumes': block_devices, 'Availability Zone': instance['Placement']['AvailabilityZone'], 'Launch Date': launch_time_friendly, 'Engagement Code': engagement})
    for instance_id, instance in ec2info.items():
        print(Fore.RESET + "-------------------------------------")
        for key in [
            'AWS Account',
            'Account Number',
            'Name',
            'Instance ID',
            'VPC ID',
            'Type',
            'Platform',
            'Key Name',
            'State',
            'Private IP',
            'Public IP',
            'Private DNS',
            'Volumes',
            'Availability Zone',
            'Launch Date',
            'Engagement Code'
        ]:
            print(Fore.GREEN + "{0}: {1}".format(key, instance.get(key)))   
        time.sleep(2)
    print(Fore.RESET + "-------------------------------------")
    with open(output_file,'a') as csv_file:
        csv_file.close()
    return output_file


def convert_csv_to_html_table(output_file, today, interactive, aws_account):
    output_dir = "../../output_files/aws_instance_list/html/"
    if interactive == 1:
        htmlfile = output_dir + 'aws-instance-master-list-' + aws_account + '-' + today +'.html'
        htmlfile_name = 'aws-instance-master-list-' + aws_account + '-' + today +'.html'
    else:
        htmlfile = output_dir + 'aws-instance-master-list-' + today +'.html'
        htmlfile_name = 'aws-instance-master-list-' + today +'.html'
    count = 0
    html = ''
    with open(output_file,'r') as CSVFILE:
        reader = csv.reader(CSVFILE)
        with open(output_file,'r') as CSVFILE:
            reader = csv.reader(CSVFILE)
            html += "<table><tbody>"
            for row in reader:
                html += "<tr>"
                # Process the headers
                if count == 0:
                    for column in row:
                        html += "<th>%s</th>" % column
                else:
                    # Process the data
                    for column in row:
                        html += "<td>%s</td>" % column
                html += "</tr>"
                count += 1
            html += "</tbody></table>"
    with open(htmlfile,'w+') as HTMLFILE:
        HTMLFILE.write(html)
    return htmlfile, htmlfile_name


def get_page_ancestors(auth, pageid):
    # Get basic page information plus the ancestors property
    url = '{base}/{pageid}?expand=ancestors'.format(
        base = BASE_URL,
        pageid = pageid)
    r = requests.get(url, auth = auth)
    r.raise_for_status()
    return r.json()['ancestors']


def get_page_info(auth, pageid):
    url = '{base}/{pageid}'.format(
        base = BASE_URL,
        pageid = pageid)
    r = requests.get(url, auth = auth)
    r.raise_for_status()
    return r.json()

 
def write_data_to_confluence(auth, html, pageid, title = None):
    info = get_page_info(auth, pageid)
    ver = int(info['version']['number']) + 1
    ancestors = get_page_ancestors(auth, pageid)
    anc = ancestors[-1]
    del anc['_links']
    del anc['_expandable']
    del anc['extensions']
    if title is not None:
        info['title'] = title
    data = {
        'id' : str(pageid),
        'type' : 'page',
        'title' : info['title'],
        'version' : {'number' : ver},
        'ancestors' : [anc],
        'body'  : {
            'storage' :
            {
                'representation' : 'storage',
                'value' : str(html)
            }
        }
    }
    data = json.dumps(data)
    url = '{base}/{pageid}'.format(base = BASE_URL, pageid = pageid)
    r = requests.put(
        url,
        data = data,
        auth = auth,
        headers = { 'Content-Type' : 'application/json' }
    )
    r.raise_for_status()
    print("Wrote '%s' version %d" % (info['title'], ver))
    print("URL: %s%d" % (VIEW_URL, pageid))
 
def main():
    pageid = 138317098
    title = 'AWS EC2 Instance List'
	aws_account = input("Enter the name of the AWS account you'll be working in: ")
	aws_account_number = aws_accounts_to_account_numbers(aws_account)
	today, aws_env_list, output_file, output_file_name = initialize(interactive, aws_account)
	output_file = list_instances(aws_account,aws_account_number, interactive)
	htmlfile, htmlfile_name = convert_csv_to_html_table(output_file, today, interactive, aws_account)
	with open(htmlfile, 'r', encoding='utf-8') as htmlfile:
		html = htmlfile.read()
	auth = authenticate()
	write_data_to_confluence(auth, html, pageid, title)


if __name__ == "__main__":
    main()
Why does this script fail to write to confluence only when it processes a lot of servers?
Reply
#2
error is pretty clear:
Quote:requests.exceptions.HTTPError: 400 Client Error: for url: https://wiki.us.cworld.company.com/rest/.../138317098
400 error is bad request -- I tried the URL, does not exist!
Reply
#3
(Nov-11-2019, 12:12 AM)Larz60+ Wrote: error is pretty clear:
Quote:requests.exceptions.HTTPError: 400 Client Error: for url: https://wiki.us.cworld.company.com/rest/.../138317098
400 error is bad request -- I tried the URL, does not exist!

I am not allowed to post the company domain in my posts. I am substituting 'company.com' where my real company domain would be. I've updated the post to reflect that.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Anaconda 2.4.2: The JupyterLab 3.5.3 fails to run the python code of the Geographical jamalnuman 0 313 Aug-23-2023, 07:48 AM
Last Post: jamalnuman
  Is there a *.bat DOS batch script to *.py Python Script converter? pstein 3 3,008 Jun-29-2023, 11:57 AM
Last Post: gologica
  How to modify python script to append data on file using sql server 2019? ahmedbarbary 1 1,176 Aug-03-2022, 06:03 AM
Last Post: Pedroski55
  Refresh data in python script while running in Terminal frankenchrist 4 7,100 Feb-03-2021, 09:54 AM
Last Post: Larz60+
  Embedded python fails to compile on Raspberry Pi tryfon 2 3,398 Dec-22-2020, 02:06 PM
Last Post: tryfon
  pip3 install opencv-python fails on 'skbuild' Pedroski55 2 5,622 Sep-15-2020, 11:33 AM
Last Post: snippsat
  Runs perfect in Python but fails to print last statement when converted to .exe. Help sunil422 3 2,747 Aug-13-2020, 01:22 PM
Last Post: deanhystad
  Python 3.6.10 fails when running a localized script. Mikee 2 2,100 Jul-09-2020, 10:25 PM
Last Post: Mikee
  How to kill a bash script running as root from a python script? jc_lafleur 4 5,788 Jun-26-2020, 10:50 PM
Last Post: jc_lafleur
  crontab on RHEL7 not calling python script wrapped in shell script benthomson 1 2,248 May-28-2020, 05:27 PM
Last Post: micseydel

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020