Posts: 15
Threads: 5
Joined: Jan 2018
Hello Forum,
I'm using the csv package to build a .csv file in Python v3.6.
What I'm doing is converting and combining several .bib files into one CSV, which is like reading text files and then putting them all into one CSV file. The nature of the files I'm dealing with does not allow me to know the full CSV header in advance. Each file might add one or more header items, which means that the header keeps growing as I read the files.
The code that I wrote uses csv.writer.writerow() to build the "body" of the CSV first. After the CSV body is completely built, I want to go back and write the header.
So the question is: after using csv.writer.writerow(), how do I go back to the top of the CSV file and add a new first row (the header)?
Thank you all,
Posts: 87
Threads: 1
Joined: Dec 2017
As far as I know you cannot go back to the beginning of a file with the csv package. One approach could be to write the complete header to a new file and append the data rows (read from the "body" file) right after it.
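A minimal sketch of that approach, assuming the body has already been written to its own file. The function name prepend_header and the file names are hypothetical, and short rows are padded with empty strings so every row has as many columns as the final header:

```python
import csv
import os
import tempfile


def prepend_header(body_path, out_path, header):
    """Write header as the first row of out_path, then copy every row of body_path."""
    with open(body_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        for row in csv.reader(src):
            # Pad short rows so each row matches the header width.
            writer.writerow(row + [""] * (len(header) - len(row)))


# Small demonstration with throwaway files (names are made up).
workdir = tempfile.mkdtemp()
body_file = os.path.join(workdir, "body.csv")
final_file = os.path.join(workdir, "final.csv")

with open(body_file, "w", newline="") as f:
    csv.writer(f).writerows([["Smith, J.", "2017"], ["Lee, K.", "2016", "ICML"]])

prepend_header(body_file, final_file, ["author", "year", "booktitle"])

with open(final_file, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

Because both files go through csv.reader and csv.writer, fields that contain commas stay quoted correctly in the new file.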
Posts: 12,022
Threads: 484
Joined: Sep 2016
Jan-03-2018, 01:51 AM
(This post was last modified: Jan-03-2018, 01:52 AM by Larz60+.)
Quote:As far as I know you cannot go back to the beginning of a file with the csv package
It's a file, so you should be able to use seek to reposition to the beginning.
If for some obscure reason that didn't work, you can always close and then reopen the file.
CPython is written in C, so I assume seek works the same way as it does there, and you should be able to move around the file at will. Jumping around like this only makes sense when dealing with fixed-length records, or when you return to a previously saved position.
For example, use tell to get the current position, seek to 0 to reread the header, and then seek back to the position saved from tell.
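A rough illustration of the tell/seek round trip on a small throwaway file. The file name and contents are made up for the example, and this only shows reading, not rewriting the header:

```python
import os
import tempfile

# Create a small sample file (hypothetical contents).
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w", newline="") as f:
    f.write("name,year\nSmith,2017\nLee,2016\n")

with open(path, newline="") as f:
    f.readline()            # skip the header
    first = f.readline()    # read the first data row
    saved = f.tell()        # remember the current position
    f.seek(0)               # jump back to the start of the file
    header = f.readline()   # reread the header
    f.seek(saved)           # return to the saved position
    second = f.readline()   # continue with the next data row

print(header, first, second)
```

Note that tell is used here together with readline; mixing tell with a for loop over a text file is not supported in Python 3.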
Posts: 87
Threads: 1
Joined: Dec 2017
Quote:As far as I know you cannot go back to the beginning of a file with the csv package
Quote:It's a file, so you should be able to use seek command to position to beginning.
Correct, you can use file.seek(0) to position yourself at the beginning of a file, but it is not part of the csv package. Furthermore, the OP doesn't want to read from the beginning of the file but to write a longer header, and the csv package is definitely not meant for that.
Posts: 8,151
Threads: 160
Joined: Sep 2016
(Jan-03-2018, 09:10 AM)squenson Wrote: you can use file.seek(0) to position yourself at the beginning of a file but it is not part of the csv package
File opening is not handled by the csv module anyway. Otherwise I agree: seeking back to the start and writing a longer header would overwrite part of the existing data.
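To make the overwrite problem concrete, here is a small sketch with a made-up three-line file. Writing a header that is longer than the original one at position 0 does not push the rest of the file down; it replaces the bytes that were there:

```python
import os
import tempfile

# Hypothetical file with a short header and two data rows.
path = os.path.join(tempfile.mkdtemp(), "overwrite.csv")
with open(path, "w", newline="") as f:
    f.write("a,b\n1,2\n3,4\n")

# Reopen for in-place update and write a longer header at the start.
with open(path, "r+", newline="") as f:
    f.seek(0)
    f.write("a,b,c,d\n")  # 8 characters, the old header was only 4

with open(path, newline="") as f:
    content = f.read()

print(repr(content))  # 'a,b,c,d\n3,4\n' : the row "1,2" has been overwritten
```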
Posts: 12,022
Threads: 484
Joined: Sep 2016
Jan-03-2018, 10:42 AM
(This post was last modified: Jan-03-2018, 10:42 AM by Larz60+.)
You can do this:
class ChangeHeader:
    def __init__(self, file, newheader=None, newfile=None):
        self.file = file
        self.newfile = newfile
        self.nhead = newheader
        self.fix_header()

    def fix_header(self):
        addednew = False
        with open(self.file, 'r') as f, open(self.newfile, 'w') as f1:
            for line in f:
                if not addednew:
                    # Replace the original first line with the new header,
                    # making sure it ends with exactly one newline.
                    f1.write(self.nhead.rstrip('\n') + '\n')
                    addednew = True
                else:
                    # Copy every remaining line unchanged.
                    f1.write(line)


def testit():
    newhead = 'Product Name PartDesc Drawings Issues Documents'
    ChangeHeader(file='somedwgs.csv', newheader=newhead, newfile='somedwgs_new.csv')


if __name__ == '__main__':
    testit()

Results:
Before: (screenshot of the file with the original header)
After: (screenshot of the file with the new header)
Posts: 15
Threads: 5
Joined: Jan 2018
Thank you all for your very helpful and informative responses.
Unfortunately, I'm still having the problem.
I cannot use the regular text-writing function write() because some of the data contains commas, which leads to misinterpretation by the CSV reader. That is why I'm using the csv package.
I would appreciate it if you could kindly take a look at my code in the GitHub repository at this link:
GitHub repository for converting .bib 2 CSV
Please Open "ReadingBib1.py"
I have two problems with the resulting "ConferencePublication.csv" file:
- The file has no header. The header's content is in a list variable named CSVHeader, which is printed at the end of the code, on line 62.
- I do not know why I always get an empty line between the rows in the "ConferencePublication.csv" file!
Please note that the Python code writes each line of the CSV with the call CSVWriterPointer.writerow(CSVLineContent) on line 61.
Thanks again to you all.
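On the empty lines: a common cause on Windows is opening the output file without newline="", so the "\r\n" written by the csv writer gets translated into "\r\r\n". Whether that is what happens in the repository code is an assumption, since the open call is not shown in this thread. A minimal sketch with made-up file and variable names, which also shows that csv.writer quotes fields containing commas on its own:

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ConferencePublication_demo.csv")
header = ["author", "title"]  # hypothetical header
rows = [["Smith, J.", "A study, revisited"], ["Lee, K.", "Plain title"]]

# newline="" lets the csv module control line endings itself.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)

# Reading back also uses newline="" as the csv documentation recommends.
with open(path, newline="") as f:
    read_back = list(csv.reader(f))

with open(path, "rb") as f:
    raw = f.read()

print(read_back)
```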
Posts: 4,781
Threads: 76
Joined: Jan 2018
Can't you make a two-pass program? You could read the input files once and compute the CSV header without writing any output, then go back to the beginning of the input and use a csv writer on the second pass.
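A sketch of what the two passes could look like. The generator read_records is a hypothetical stand-in for the real .bib parsing and just yields one dict per entry. csv.DictWriter is used so that entries missing a field get an empty cell:

```python
import csv
import io


def read_records():
    """Hypothetical stand-in for parsing the .bib files; yields one dict per entry."""
    yield {"author": "Smith, J.", "year": "2017"}
    yield {"author": "Lee, K.", "year": "2016", "booktitle": "Proc. X"}
    yield {"title": "Untitled", "author": "Doe, A."}


# Pass 1: collect every field name, in order of first appearance.
fieldnames = []
for rec in read_records():
    for key in rec:
        if key not in fieldnames:
            fieldnames.append(key)

# Pass 2: write the header first, then the rows.
out = io.StringIO(newline="")  # a real file opened with newline="" works the same way
writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
writer.writeheader()
for rec in read_records():
    writer.writerow(rec)

result = out.getvalue()
print(result)
```

The cost is that the input is parsed twice, but the header is complete before the first row is written, so nothing has to be patched afterwards.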
Posts: 12,022
Threads: 484
Joined: Sep 2016
Jan-03-2018, 11:57 PM
(This post was last modified: Jan-03-2018, 11:58 PM by Larz60+.)
The class that I presented allows a CSV file to be read in and modified without compromising
file integrity. This is demonstrated in the two Excel displays in my post: the first with the original
header, the second after the header was modified.
A file is a file. So long as the separators are not messed with, things like this can be done.
At any rate, I cloned your GitHub repository and will look at it after supper.
Posts: 12,022
Threads: 484
Joined: Sep 2016
Question: Did the .bib file originate from a website?
If so, there may be an easier way to do this.