Python Forum
Python - Pandas writing blank files to file
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Python - Pandas writing blank files to file
#1
I have a python script that writes to several file formats via Pandas. It can write to CSV/JSON/HTML/Excel.

I'm pulling the data from MongoDB and into a Pandas DF, then writing the contents to a file.

However for some reason the script is writing blank files. When I open the file this is what I see:

https://pasteboard.co/JJwc1ZK.png

Before printing the file I am printing the dataframe to the screen output so I can validate that the data is there. For example with CSV the output to the screen is this:

CSV data: ,AWS Account,Account Number,Name,Instance ID,AMI ID,Volumes,Private IP,Public IP,Private DNS,Availability Zone,VPC ID,Type,Key Pair Name,State,Launch Date
0,project-client-lab,123456789101,bastion001,i-xxxxxxxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,vol-xxxxxxxxxxxxxxx,10.238.2.166,3.214.15.175,ip-10-238-2-166.ec2.internal,us-east-1a,vpc-xxxxxxxxxxxxxxxxx,t3.small,project-client-int01,running,March 10 2020
1,project-client-lab,123456789101,logicmonitor001,i-xxxxxxxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,vol-0xxxxxxxxxxxxxx,10.238.2.52,,ip-10-238-2-52.ec2.internal,us-east-1a,vpc-xxxxxxxxxxxxxxxxx,m5.large,project-client-int01,running,September 02 2019
2,project-client-lab,123456789101,project-cassandra001,i-xxxxxxxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,"vol-xxxxxxxxxxxxxxxxxx, vol-xxxxxxxxxxxxxxxxx",10.238.2.221,,ip-10-238-2-221.ec2.internal,us-east-1a,vpc-xxxxxxxxxxxxxxxxx,m5.large,project-client-int01,running,January 14 2020
3,project-client-lab,123456789101,project-cassandra003,i-xxxxxxxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,"vol-xxxxxxxxxxxxxxxxxx, vol-xxxxxxxxxxxxxxxxx",10.238.2.207,,ip-10-238-2-207.ec2.internal,us-east-1a,vpc-xxxxxxxxxxxxxxxxx,m5.large,project-client-int01,running,January 14 2020
4,project-client-lab,123456789101,project-cassandra003,i-xxxxxxxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,"vol-xxxxxxxxxxxxxxxxxx, vol-xxxxxxxxxxxxxxxxx",10.238.2.203,,ip-10-238-2-203.ec2.internal,us-east-1a,vpc-xxxxxxxxxxxxxxxxx,c5.xlarge,project-client-int01,running,January 22 2020
5,project-client-lab,123456789101,project-cassandra001,i-xxxxxxxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,"vol-xxxxxxxxxxxxxxxxxx, vol-xxxxxxxxxxxxxxxxx",10.238.2.209,,ip-10-238-2-209.ec2.internal,us-east-1a,vpc-xxxxxxxxxxxxxxxxx,c5.xlarge,project-client-int01,running,January 22 2020
6,project-client-lab,123456789101,haproxy001,i-xxxxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,vol-xxxxxxxxxxxxxxxxxx,10.238.2.169,54.242.118.165,ip-10-238-2-169.ec2.internal,us-east-1a,vpc-xxxxxxxxxxxxxxxxx,m5.large,project-client-int01,running,February 20 2020
7,project-client-lab,123456789101,logicmonitor002,i-xxxxxxxxxxxxxxx,ami-xxxxxxxxxxxxxxxxxxx,vol-0c48ff6ebb031008a,10.238.2.69,,ip-10-238-2-69.ec2.internal,us-east-1b,vpc-xxxxxxxxxxxxxxxxx,m5.large,project-client-int01,running,September 13 2019
These are the functions that write to file:

def mongo_export_to_file(interactive, aws_account, aws_account_number,instance_col=None,date=None):
    create_directories()
    if date == None:
        format= "%m-%d-%Y"
        today = datetime.today()
        today = today.strftime(format)
        date = today
    else:
        format= "%m-%d-%Y"
        date = datetime.strptime(date,"%m%d%Y")
        date = date.strftime(format)
    if not instance_col:
        _, _, instance_col = set_db()
    # make an API call to the MongoDB server
    if interactive == 0:
        mongo_docs = instance_col.find({})
    else:
        mongo_docs = instance_col.find({"Account Number": aws_account_number})
    # Convert the mongo docs to a DataFrame
    docs = pandas.DataFrame(mongo_docs)
    # Discard the Mongo ID for the documents
    docs.pop("_id")
    if __name__ == "__main__":
        print("Choose a file format")
        print("1. CSV")
        print("2. JSON")
        print("3. HTML")
        print("4. Excel")
        choice = input("Enter a number 1-4: ")
        choice = int(choice)
    else:
        choice = 1
    if choice == 1:
        if __name__ == "__main__":
            # export MongoDB documents to CSV
            csv_export = docs.to_csv(sep=",") # CSV delimited by commas
            print ("\nCSV data:", csv_export)
        # Set the CSV output directory
        output_dir = os.path.join("..", "..", "output_files", "aws_instance_list", "csv", "")
        if interactive == 1:
            output_file = os.path.join(output_dir, "aws-instance-list-" + aws_account + "-" + date +".csv")
        else:
            output_file = os.path.join(output_dir, "aws-instance-master-list-" + date +".csv")

        # export MongoDB documents to a CSV file, leaving out the row "labels" (row numbers)
        docs.to_csv(output_file, ",", index=False) # CSV delimited by commas
    elif choice == 2:
        if __name__ == "__main__":
            json_export = docs.to_json() # return JSON data
            print ("\nJSON data:", json_export)
        # Set the JSON output directory
        output_dir = os.path.join("..", "..", "output_files", "aws_instance_list", "json", "")
        if interactive == 1:
            output_file = os.path.join(output_dir, "aws-instance-list-" + aws_account + "-" + date +".json")
        else:
            output_file = os.path.join(output_dir, "aws-instance-master-list-" + date +".json")
        # export MongoDB documents to a CSV file, leaving out the row "labels" (row numbers)
        docs.to_json(output_file)
    elif choice == 3:
        html_str = io.StringIO()
        # export as HTML
        docs.to_html(
        buf=html_str,
        classes="table table-striped"
        )
        if __name__ == "__main__":
            # print out the HTML table
            print (html_str.getvalue())
        # Set the HTML output directory
        output_dir = os.path.join("..", "..", "output_files", "aws_instance_list", "html", "")
        if interactive == 1:
            output_file = os.path.join(output_dir, "aws-instance-list-" + aws_account + "-" + date +".html")
        else:
            output_file = os.path.join(output_dir, "aws-instance-master-list-" + date + ".html")
        # save the MongoDB documents as an HTML table
        docs.to_html(output_file)
    elif choice == 4:
        # Set the Excel output directory
        output_dir = os.path.join("..", "..", "output_files", "aws_instance_list", "excel", "")
        time.sleep(5)
        if interactive == 1:
            output_file = os.path.join(output_dir, "aws-instance-list-" + aws_account + "-" + date + ".xlsx")
        else:
            output_file = os.path.join(output_dir, "aws-instance-master-list-" + date + ".xlsx")
        # export MongoDB documents to a Excel file, leaving out the row "labels" (row numbers)
        writer = ExcelWriter(output_file)
        docs.to_excel(writer,"EC2 List",index=False)
        writer.save()
        writer.close()
    if __name__ == "__main__":
        exit = input("Exit program (y/n): ")
        if exit.lower() == "y" or exit.lower() == "yes":
            exit_program()
        else:
            main()


def print_reports(interactive,aws_account,aws_account_number):
    set_db(instance_col=None)
    inputDate = input("Enter the date in format 'dd/mm/yyyy': ")
    day,month,year = inputDate.split('/')
    isValidDate = True
    try:
        datetime(int(year),int(month),int(day))
    except ValueError :
        isValidDate = False
        print_reports(interactive,aws_account,aws_account_number)

    if(isValidDate) :
        print(f"Input date is valid: {inputDate}")
        format= "%m%d%Y"
        inputDate = datetime.strptime(inputDate,"%m/%d/%Y")
        inputDate = inputDate.strftime(format)
    else:
        print(f"Input date is not valid: {inputDate}")
        print_reports(interactive,aws_account,aws_account_number)
    myclient = connect_db()
    mydb = myclient["aws_inventories"]
    instance_col = "ec2_list_" + inputDate
    instance_col = mydb[instance_col]
    mongo_export_to_file(interactive, aws_account, aws_account_number,instance_col,date=inputDate)
This is all my code in this script.

Why is this happening and how to I correct that?
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Errors if an xlsx file has blank rows in the beginning…. tester_V 1 790 Aug-28-2023, 06:22 PM
Last Post: deanhystad
  How to import an xml file to Pandas sjhazard 0 2,324 Jun-08-2021, 08:19 PM
Last Post: sjhazard
  Import multiple CSV files into pandas Krayna 0 1,693 May-20-2021, 04:56 PM
Last Post: Krayna
  Creating many csv files from Pandas EMA 0 1,609 Jul-26-2020, 06:39 PM
Last Post: EMA
  how to solve the 'NO SUCH DIRECTORY OR FILE' in pandas, python MohammedSohail 10 15,077 May-08-2020, 07:45 AM
Last Post: nnk
  isnull() in pandas not able to identify blank value (Python coding) darpInd 1 2,087 Mar-20-2020, 11:14 AM
Last Post: scidam
  Pandas merge csv files karlito 2 3,137 Dec-16-2019, 10:59 AM
Last Post: karlito
  Need Help With Filtering Data For Excel Files Using Pandas eddywinch82 9 5,980 Aug-06-2019, 03:44 PM
Last Post: eddywinch82
  pandas writing to excel .. help anna 0 2,072 Jun-20-2019, 06:34 AM
Last Post: anna
  loading a csv file into python from pandas. Variable is not recognized vijjumodi 2 2,877 Apr-19-2019, 04:09 AM
Last Post: kus

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020