Python Forum
Issue with reading CSV file
The code is below.
I added print(row) and observed the following:
1. When the file is created on Windows and saved as CSV, the output is as expected, i.e.
[Column_1 Varchar(20)]
[Column_2 Number(10,2)]
[Column_3 Decimal(4,1)]
and each row is read correctly.

2. When the same file is copied to Unix, it is read as follows:
['Column_1', 'Varchar(20)']
['Column_2', 'Number(10', '2)']
['Column_3', 'Decimal(4', '1)']

The problem seems to be the , (comma) inside the type: it is treated as a field separator, so the type is split into separate strings.
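To illustrate the symptom, here is a minimal sketch of how csv.reader treats a comma inside a field with and without quoting. The sample lines are hypothetical; they assume the file is comma-delimited and that the Unix copy has lost the quotes around the type, which the post does not confirm.

```python
import csv
import io

# Hypothetical sample lines (assumed, not taken from the real file).
unquoted_line = 'Column_2,Number(10,2)\n'
quoted_line = 'Column_2,"Number(10,2)"\n'

def parse(text):
    """Parse text with the same 'excel' dialect used in the script."""
    return list(csv.reader(io.StringIO(text), dialect='excel'))

# Without quotes, the comma inside Number(10,2) is a delimiter.
rows_unquoted = parse(unquoted_line)

# With quotes, the comma is kept as part of the field.
rows_quoted = parse(quoted_line)

print(rows_unquoted)
print(rows_quoted)
```

If the two copies of the file differ in this way, comparing them byte for byte (for example with a hex dump) should show it.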

import csv,os,sys

# schemaname and datafile are both required; the name format is optional.
if len(sys.argv) < 3:
    print("\nUsage: csv2tbl.py schemaname path/datafile.csv (0,1,2,3 = column name format):")
    print("\nFormat: 0 = TitleCasedWords")
    print("        1 = Titlecased_Words_Underscored")
    print("        2 = lowercase_words_underscored")
    print("        3 = Words_underscored_only (leave case as in source)")
    sys.exit()
elif len(sys.argv) == 3:
    dummy, schemaname, datafile = sys.argv
    namefmt = '0'
else:
    dummy, schemaname, datafile, namefmt = sys.argv


#outfile = os.path.basename(datafile)
filename = os.path.basename(datafile).split('.')[0]
outfile = os.path.join(os.path.dirname(datafile), filename + '.sql')

tblname = schemaname + '.' + filename


partition_param_1 = 'ingestion_year  int '
partition_param_2 = 'ingestion_month  int'
partition_param_3 = 'ingestion_day int'
partition_string = partition_param_1 + ',' + partition_param_2 + ',' + partition_param_3

row_format='org.apache.hadoop.hive.serde2.avro.AvroSerDe'
stored_as=''
output_format=''
location='/HADOOP/RAW/' + schemaname + '/' + tblname + '/GOOD'
table_properties= '/HADOOP/RAW/' + schemaname + '/' + tblname + '/GOOD'

    


sql = 'CREATE EXTERNAL TABLE %s\n(' % (tblname)
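# Build one "name type" column definition per CSV row
with open(datafile) as csvfile:
    reader = csv.reader(csvfile, dialect='excel')
    row = next(reader)  # discard the first row (presumably a header)
    for row in reader:
        print(row)  # debug output
        sql = sql + " ".join(row) + ",\n"

# Drop the trailing ",\n" left after the last column
sql = sql[:-2]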

sql = sql + ') \n Partition By (' + partition_string +')'
sql = sql + ' \n ROW FORMAT SERDE (' + row_format +')'
sql = sql + ' \n STORED AS (' + stored_as +')'
sql = sql + ' \n OUTPUT FORMAT (' + output_format +')'
sql = sql + ' \n LOCATION (' + location +')'
sql = sql + ' \n TABLE PROPERTIES (' + table_properties +')'


# The with block closes the file automatically on exit
with open(outfile, 'w') as sqlfile:
    sqlfile.write(sql)

print ('%s created.' % (outfile))
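One possible workaround, if the input file cannot be fixed, is to treat the first field of each row as the column name and rejoin everything after it with commas. This is only a sketch: the helper name column_definition is hypothetical, and it assumes every row has the form name,type with no other comma-separated fields.

```python
def column_definition(row):
    """Rebuild 'name type' from a row whose type may have been split on commas.

    Assumes row[0] is the column name and all remaining fields are
    fragments of a single type such as Number(10,2).
    """
    name = row[0]
    col_type = ','.join(row[1:])
    return name + ' ' + col_type

# Rows as they were printed on Unix in the post
rows = [
    ['Column_1', 'Varchar(20)'],
    ['Column_2', 'Number(10', '2)'],
    ['Column_3', 'Decimal(4', '1)'],
]

definitions = [column_definition(row) for row in rows]
columns_sql = ',\n'.join(definitions)
print(columns_sql)
```

Joining the definitions with ',\n' also avoids having to strip the trailing separator afterwards.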


Messages In This Thread
Issue with reading CSV file - by nnsatpute - Dec-10-2018, 11:38 AM
RE: Issue with reading CSV file - by Gribouillis - Dec-10-2018, 12:46 PM
RE: Issue with reading CSV file - by nnsatpute - Dec-10-2018, 01:08 PM
RE: Issue with reading CSV file - by Gribouillis - Dec-10-2018, 03:19 PM
RE: Issue with reading CSV file - by woooee - Dec-10-2018, 04:34 PM
RE: Issue with reading CSV file - by nnsatpute - Dec-11-2018, 04:36 AM
RE: Issue with reading CSV file - by Gribouillis - Dec-11-2018, 06:01 AM
RE: Issue with reading CSV file - by nnsatpute - Dec-11-2018, 06:23 AM
RE: Issue with reading CSV file - by Gribouillis - Dec-11-2018, 06:35 AM
RE: Issue with reading CSV file - by nnsatpute - Dec-19-2018, 07:44 AM
RE: Issue with reading CSV file - by Gribouillis - Dec-19-2018, 08:39 AM
RE: Issue with reading CSV file - by nnsatpute - Dec-19-2018, 10:58 AM
RE: Issue with reading CSV file - by Gribouillis - Dec-19-2018, 12:08 PM

