Mar-30-2023, 03:07 PM
Hi Dean, I inserted the code into my existing code for reading the file and deleting empty lines. I altered it somewhat because the file is an Excel file, and I am now getting the following error:
TypeError: read_excel() got an unexpected keyword argument 'delimiter'
Here is the whole block of code; some names are changed for confidentiality. I also changed the column names after checking my data.
# Read the Directory file and delete the empty lines
project = "V2_L"
folder = "File exports"
filename = "DIRECTORY.xlsx"
filePath = project + "/" + folder + "/" + filename
print(filePath)
s3 = boto3.client('s3')
obj = s3.get_object(Bucket=bucket, Key=ennovfPath)
data = obj['Body'].read()
directory = pd.read_excel(io.BytesIO(data), delimiter="-",names=["SAMPLEID", "VIS_ISOLATE_NUMBER"])
directory.sort_values(by=["SAMPLEID", "VIS_ISOLATE_NUMBER"], inplace=True)
directory.columns = map(lambda x: str(x).upper(), directory.sort_values.columns)
directory=directory.sort_values.columns[directory['SAMPLEID'].isna()!=True]
directory['SAMPLEID']=directory['SAMPLEID'].astype(str)
#Create a variable SITEID based on the SUBJID (run)
directory['SITEID'] = directory['SUBJID'].str.split('_').str[0]
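A note on the error: pd.read_excel() does not accept a delimiter keyword, because an Excel sheet already stores each cell in its own column; delimiter belongs to pd.read_csv(). Separately, directory.sort_values is a method, so directory.sort_values.columns would raise an AttributeError once the read succeeds. Below is a minimal sketch of the read-and-clean step with both points addressed. The function names read_directory and clean_directory are hypothetical, and the sketch assumes the sheet's header row already contains columns that upper-case to SAMPLEID and VIS_ISOLATE_NUMBER; adjust to the real file.

```python
import io

import pandas as pd


def read_directory(data: bytes) -> pd.DataFrame:
    """Parse raw .xlsx bytes (e.g. obj['Body'].read()) into a DataFrame."""
    # No 'delimiter' here: read_excel() reads cells, not delimited text.
    return pd.read_excel(io.BytesIO(data))


def clean_directory(directory: pd.DataFrame) -> pd.DataFrame:
    """Upper-case column names, drop rows without a SAMPLEID, then sort."""
    directory = directory.copy()
    # Assumption: the column names upper-case to SAMPLEID / VIS_ISOLATE_NUMBER.
    directory.columns = [str(c).upper() for c in directory.columns]
    # Keep only rows where SAMPLEID is present (this drops the empty lines).
    directory = directory[directory["SAMPLEID"].notna()].copy()
    directory["SAMPLEID"] = directory["SAMPLEID"].astype(str)
    directory = directory.sort_values(by=["SAMPLEID", "VIS_ISOLATE_NUMBER"])
    return directory.reset_index(drop=True)


# Small in-memory stand-in for the sheet, so the cleaning step can be tried
# without S3 access or an Excel file.
sample = pd.DataFrame(
    {"sampleid": ["B2", None, "A1"], "vis_isolate_number": [2, 3, 1]}
)
cleaned = clean_directory(sample)
print(cleaned)
```

With the real file, the call would be clean_directory(read_directory(data)), where data is the bytes read from the S3 object. If the "-" was meant to split one combined column into two, that would need a separate str.split step after reading; whether the data looks like that is not clear from the post.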
buran wrote Mar-30-2023, 04:04 PM:
Please use proper tags when posting code, tracebacks, output, etc.
See BBcode help for more info.