Python Forum
Adding Columns to CSV using iterator
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Adding Columns to CSV using iterator
#1
Here is what I need to do.

1. I have an existing CSV file with 12 columns in it containing financial information. This CSV is sorted. I need to ADD two columns: 'product item number' (a 7 digit int) and 'product item number ID' (a 6 digit int).

2. Once I add the two columns, I need to write a number into the first row for that column and then add 'one' to the number in each subsequent row cell in that column so that each subsequent cell is one more than the previous, i.e. the number increases sequentially with each row in the sort. I need to do this for both 'product item number' and 'product item number ID'.

My main problem is putting the csv writer into a loop so I can increment the variable and write it into subsequent cells in the loop. I have looked at DictWriter, pandas concat (axis 1) and similar. Any help would make my week....thank you very much!
Reply
#2
csv module and enumerate() is all you need. Please, show us your code in code tags and ask specific questions.
csv module has an extensive documentation. also PMOTW article on csv module: https://pymotw.com/3/csv/
Reply
#3
Try

df[ 'product item number']=lambda x:x for x in range(start,end)
Replace start by the number you want to start with, and end by start+number of rows in your data set.

Should work for the other column too.
Reply
#4
Below is what I tried based on previous advice. Enumerate does not work with a writer (writer is not iterable). I am simply trying to open an existing CSV file and write into columns that are already present (or overwrite values that are in those columns). Below is the code I am using. Thank you!

ofile = open('PSL_Combined_Sorted.csv', 'w', newline='')
writer = csv.writer(ofile)
PO_ID = 1000000
PO_IDTrans_ID = 2000000
for row in enumerate(writer):
    writer.writerow({'Purchase Order Number' : PO_ID, 'Purchases Transaction ID Number' : PO_IDTrans_ID})
    PO_IDTrans_ID = PO_IDTrans_ID + 1
    PO_ID = PO_ID + 1
Reply
#5
import csv

in_file = 'InputFile.csv' # change this accordingly
out_file = 'PSL_Combined_Sorted.csv'
with open(in_file, 'r') as in_f, open(out_file, 'w', newline='') as out_f:
    rdr =  csv.DictReader(in_f)
    fieldnames =  ['Purchase Order Number', 'Purchases Transaction ID Number']
    fieldnames.extend(rdr.fieldnames)
    wrtr = csv.DictWriter(out_f, fieldnames=fieldnames)
    wrtr.writeheader()
    for row_id, row in enumerate(rdr, start=1):
        row['Purchase Order Number'] = '{:0>7}'.format(row_id)
        row['Purchases Transaction ID Number'] = '{:0>6}'.format(row_id)
        wrtr.writerow(row)
Reply
#6
I can't be much help since my python isn't great, but I can offer a few snippets of code that may be useful..

df.insert(12, 'Column Name:', "")                                 # insert new blank column
df.insert(13, 'Column Name:', "")                                 # insert new blank column

df.iloc[1, 12] = 'Data here'                           # write to cell
Not sure if it would work, but I'd try something like:

counter = 2
list_total = len(get total of items in column)

df.iloc[1, 12] = '98383'

while counter <= list_total:
    increment = counter
    df.iloc[counter, 12] = '98383' + increment 
    counter = counter + 1
Reply
#7
Thank you, Buran, that code works great. The only remaining issue that I cannot seem to resolve is that the enumerate winds up duplicating the two columns since it makes one pass through the loop for row_ID and one for row (each causes the two columns to be added). Once I have this resolved, I will post the final code here for others who review this post. Thank you again!
Reply
#8
The code in my post does not double the columns. Please, post your code in code tags to see what the problem is. Also, can you elaborate on the difference between the two ID numbers - do they both start at 1 and are virtually identical?
Reply
#9
Sorry, here is the code, in tags. It seems the for loop will run once for row_ID and once for row hence the double entry in my CSV file. Maybe I'm wrong about why this is happening but it is happening. I have tried variations to fix it but to no avail. You will note that the two numbers do start at different points (as below, one starts at 60000 and the other at 700000). Thank you again for your help!

with open(in_file, 'r') as in_f, open(out_file, 'w', newline='') as out_f:
    rdr =  csv.DictReader(in_f)
    fieldnames =  ['Purchase Order Number', 'Purchases Transaction ID Number']
    fieldnames.extend(rdr.fieldnames)
    wrtr = csv.DictWriter(out_f, fieldnames=fieldnames)
    wrtr.writeheader()
    for row_id, row in enumerate(rdr, start=1):
        row['Purchase Order Number'] = '{:0>7}'.format(row_id + 60000)
        row['Purchases Transaction ID Number'] = '{:0>6}'.format(row_id + 700000)
        wrtr.writerow(row)
Reply
#10
The only possible reason for duplicate columns that I can think of is that you already have empty 'Purchase Order Number' and 'Purchases Transaction ID Number' columns in your input file. Because you said you want to ADD my understanding was you need to add the column headers to one already in input file. That is lines#3-#4. If this is the case, try:
with open(in_file, 'r') as in_f, open(out_file, 'w', newline='') as out_f:
    rdr =  csv.DictReader(in_f)
    wrtr = csv.DictWriter(out_f, fieldnames=rdr.fieldnames)
    wrtr.writeheader()
    for row_id, row in enumerate(rdr, start=1):
        row['Purchase Order Number'] = row_id + 60000
        row['Purchases Transaction ID Number'] = row_id + 700000
        wrtr.writerow(row)
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Merging rows and adding columns based on matching index pythonnewbie78 3 748 Dec-24-2023, 11:51 AM
Last Post: Pedroski55
  numpy adding columns rwahdan 4 2,281 Sep-21-2021, 08:25 PM
Last Post: deanhystad
  Tesseract-ocr ->iterator.WordFontAttributes() does not work Maia07 0 3,373 Sep-01-2018, 02:43 PM
Last Post: Maia07

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020