Python Forum
Creating a delimited file from DB Table - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: Creating a delimited file from DB Table (/thread-12821.html)



Creating a delimited file from DB Table - anubhav2020 - Sep-14-2018

Hi there,

I am new to python (SAS background).
I have this requirement to create a PIPE delimited file from Netezza tables. I tried the following approach. However, nothing is being created. The tables are pretty huge in size (like 2 billion rows in some).

Please can you point out what am I doing wrong here.

import pyodbc
import csv
import sys

dbconn = <db connection > /* works fine */

dbcursor = connection.cursor() /* works fine */

dbcursor.execute("select column1, column2 from schema.table")
result=dbcursor .fetchall()
c = csv.writer(open("C:\location\output.dat","w+"), delimiter='|')
c.writerow(result)
c.close()
dbconn.close()



RE: Creating a delimited file from DB Table - anubhav2020 - Sep-15-2018

The code runs for about 20 minutes and the file has 0 rows.
There is nothing printed to the screen (i am running it using Python IDLE (v3.6)


RE: Creating a delimited file from DB Table - Larz60+ - Sep-15-2018

Don't know much at all about Netezza, can you point to some docs?
There may be a better way to access other than ODBC


RE: Creating a delimited file from DB Table - volcano63 - Sep-15-2018

The file path is the source of your problem - C:\location\output.dat. Backslash is an escape symbol, giving a special meaning to to following character, e.g. \n.

Either prefix the path with r to make it a raw string - r'C:\location\output.dat', or use double backslash. Replacing backslash with a slash will work too.

PS IDLE is one of the worst Python environments, I would recommend switching to Spyder or Jupyter


RE: Creating a delimited file from DB Table - anubhav2020 - Sep-16-2018

(Sep-15-2018, 02:23 PM)volcano63 Wrote: The file path is the source of your problem - C:\location\output.dat. Backslash is an escape symbol, giving a special meaning to to following character, e.g. \n.

Either prefix the path with r to make it a raw string - r'C:\location\output.dat', or use double backslash. Replacing backslash with a slash will work too.

PS IDLE is one of the worst Python environments, I would recommend switching to Spyder or Jupyter

Thanks much for the solution. The irony is, my code works for smaller tables just fine (like 100000 rows). The issue (if at all it is an issue) crops up as the table size increases.

I will try the 'r' in my code and see if it gives my any result set.


RE: Creating a delimited file from DB Table - anubhav2020 - Sep-17-2018

I tried the 'r' for raw string and did not make any difference. So I changed my code as follows.

dbconn = "db connection" 
 
dbcursor = connection.cursor()
 
dbcursor.execute("select column1, column2 from schema.table limit 5000000")
result=dbcursor .fetchall()
c = csv.writer(open(r'C:\location\output.dat',"w"), delimiter='|')
for row in result:
	c.writerow(row)
The code works fine for tables upto 5,000,000 rows, beyond that, i see the following error. The table that i am working has 21 million rows and i need them all. I also failed to insert the header row.


Error:
========= RESTART: C:\Users\bkumarm\Desktop\2018\Python\nzunload.py ========= Traceback (most recent call last): File "C:\Users\bkumarm\Desktop\2018\Python\nzunload.py", line 11, in <module> result=cur.fetchall() MemoryError



RE: Creating a delimited file from DB Table - volcano63 - Sep-17-2018

(Sep-17-2018, 01:28 PM)anubhav2020 Wrote: I tried the 'r' for raw string and did not make any difference. So I changed my code as follows.
[/python]

Looks like handling of backslash in strings has changed lately - I was not aware of that. It only breaks when backslash with the following character creates a defined escape sequence, e.g.
  • \n - line feed
  • \t - tabulation
  • \a - beep
  • \b - backspace
(for extended list - see table in Wiki)

Nevertheless, while that was not your problem, backslashes in string may cause an unintended behaviour, so - be careful with those. Pay attention how r prefix or double backslash change the string behaviour
Output:
In [111]: print('C:\location\noutput.dat', 'C:\location\\noutput.dat') C:\location output.dat C:\location\noutput.dat In [112]: print('C:\location\toutput.dat', 'C:\location\\toutput.dat') C:\location output.dat C:\location\toutput.dat In [113]: print('C:\location\aoutput.dat', r'C:\location\aoutput.dat') C:\locationoutput.dat C:\location\aoutput.dat In [114]: print('C:\location\boutput.dat', r'C:\location\boutput.dat') C:\locatiooutput.dat C:\location\boutput.dat In [115]: print('C:\location\x25output.dat', r'C:\location\x25output.dat') C:\location%output.dat C:\location\x25output.dat



RE: Creating a delimited file from DB Table - anubhav2020 - Sep-19-2018

(Sep-17-2018, 03:19 PM)volcano63 Wrote:
(Sep-17-2018, 01:28 PM)anubhav2020 Wrote: I tried the 'r' for raw string and did not make any difference. So I changed my code as follows.
[/python]

Looks like handling of backslash in strings has changed lately - I was not aware of that. It only breaks when backslash with the following character creates a defined escape sequence, e.g.
  • \n - line feed
  • \t - tabulation
  • \a - beep
  • \b - backspace
(for extended list - see table in Wiki)

Nevertheless, while that was not your problem, backslashes in string may cause an unintended behaviour, so - be careful with those. Pay attention how r prefix or double backslash change the string behaviour
Output:
In [111]: print('C:\location\noutput.dat', 'C:\location\\noutput.dat') C:\location output.dat C:\location\noutput.dat In [112]: print('C:\location\toutput.dat', 'C:\location\\toutput.dat') C:\location output.dat C:\location\toutput.dat In [113]: print('C:\location\aoutput.dat', r'C:\location\aoutput.dat') C:\locationoutput.dat C:\location\aoutput.dat In [114]: print('C:\location\boutput.dat', r'C:\location\boutput.dat') C:\locatiooutput.dat C:\location\boutput.dat In [115]: print('C:\location\x25output.dat', r'C:\location\x25output.dat') C:\location%output.dat C:\location\x25output.dat

After multiple attempts, it finally worked.
But the issue with large files running out of memory still persists. Any suggestions as to what cn be changed in the code!!


RE: Creating a delimited file from DB Table - volcano63 - Sep-19-2018

why don't you use fetchmany instead of fetchall?


RE: Creating a delimited file from DB Table - Axel_Erfurt - Sep-19-2018

I have a DB Viewer at github
I use it to export Tables from DB to CSV (comma) or TSV (tab)

made with PyQt5