Python Forum

Full Version: Errors to get information of multiple files into a single file csv
I'm trying to combine the contents of several files, in this order: 1.bateaulecalife.csv (https://imgur.com/a/831o2s3 - 201 lines), 2.boutary (https://imgur.com/a/aCE9VV9 - 251 lines), 3.epicure (https://imgur.com/a/aZGgVHO - 268 lines), 4.iletaitunsquare (https://imgur.com/a/xrd1cdL - 965 lines) and 5.le1114figourg.csv (https://imgur.com/a/aftXmC0 - 271 lines), into a single file with a total of 1956 lines, keeping the files in the order listed above. I have tried different code snippets suggested by different people on this site, but I always get errors. The first one was:
import csv
with open('2boutary.csv', 'r') as csvfile:
    original = csvfile.read()
with open('2epicure.csv', 'r') as csvfile1:
    original1 = csvfile1.read()
with open('2iletaitunsquare.csv', 'r') as csvfile2:
    original2 = csvfile2.read()
with open('2le1114figourg.csv', 'r') as csvfile3:
    original3 = csvfile3.read()
    
with open('2bateaulecalife.csv', 'a') as csvfile4:
    csvfile4.write('\n')
    csvfile4.write(original + original1 + original2 + original3)
It gives me an error like:
Quote:UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 52171: character maps to <undefined>

I tried:
import csv
reader = csv.reader(open('1bateaulecalife.csv', 'r'))
reader1 = csv.reader(open('1boutary.csv', 'r'))
reader2 = csv.reader(open('1epicure.csv', 'r'))
reader3 = csv.reader(open('1iletaitunsquare.csv', 'r'))
reader4 = csv.reader(open('1le1114figourg.csv', 'r'))
writer = csv.writer(open('tousrestosFR.csv', 'w' ))
for row in reader:
    row1 = next(reader1)
    row2 = next(reader2)
    row3 = next(reader3)
    row4 = next(reader4)
    writer.writerow(row + row1 + row2 + row3 + row4)
but it shows
Quote:UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1091: character maps to <undefined>

Could you please explain how I can combine the information from these files into a single CSV file? I would appreciate your help.
First we create a list of all your filenames:
filenames = ['1bateaulecalife.csv', '1boutary.csv', '1epicure.csv', '1iletaitunsquare.csv', '1le1114figourg.csv']
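Then we check which encoding Python uses when it opens your CSV files (with no encoding argument this is your platform's default, which may differ from the encoding the files were actually saved in):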
for filename in filenames:
    with open(filename) as fhandle:
        print(fhandle)
The output is something like this:
Output:
<_io.TextIOWrapper name='>your filenames<' mode='r' encoding='utf8'>
Hopefully encoding is the same for all files, here it is 'utf8'
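Since the value printed above only reflects how Python opened the file, a complementary check is to try decoding the raw bytes directly. The following is a minimal sketch, assuming the only candidates are UTF-8 and cp1252; the helper name `first_working_encoding` is made up for illustration and is not part of any library.

```python
def first_working_encoding(raw, candidates=("utf-8", "cp1252")):
    """Return the first candidate that decodes `raw` without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes, try the next one
        return name
    return None


# Demo on in-memory bytes; for a real file use open(filename, 'rb').read().
# \u201c and \u201d are curly quotes; the second one ends in byte 0x9d in UTF-8.
sample = "Le \u201cBateau\u201d, 5/5\n".encode("utf-8")
print(first_working_encoding(sample))
```

A successful decode is a hint rather than proof, because cp1252 accepts nearly every byte. UTF-8 is tried first for that reason. Bytes 0x81 and 0x9d are among the few that cp1252 leaves undefined, which is consistent with the 'charmap' errors quoted in the question.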

Then import pandas (my advice for handling CSV files), read in the files, and concatenate them:
import pandas as pd
df = pd.concat((pd.read_csv(filename, encoding='utf8') for filename in filenames))
Of course you need to put in your encoding instead of 'utf8'

Lastly, write the result to a new CSV file:
df.to_csv('tousrestosFR.csv', encoding='utf-8', index=False)  # index=False leaves out pandas' row-index column
Again, put your own encoding instead of 'utf8'
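For comparison, the same stacking of files one after another can be sketched with only the standard csv module, which the attempts in the question used. This is a sketch under assumptions, not a replacement for the pandas approach: it assumes all files share one encoding, and the function name `stack_csv_files`, its `has_header` parameter, and the demo file names and data are invented for the example.

```python
import csv
import os
import tempfile


def stack_csv_files(paths, out_path, encoding="utf-8", has_header=True):
    """Write the rows of each CSV in `paths`, in order, into `out_path`.

    With has_header=True the header is kept from the first file only.
    Returns the number of data rows written.
    """
    written = 0
    with open(out_path, "w", newline="", encoding=encoding) as out:
        writer = csv.writer(out)
        for i, path in enumerate(paths):
            with open(path, newline="", encoding=encoding) as src:
                reader = csv.reader(src)
                if has_header:
                    header = next(reader, None)
                    if i == 0 and header is not None:
                        writer.writerow(header)
                for row in reader:
                    writer.writerow(row)
                    written += 1
    return written


# Demo with two small throwaway files (hypothetical data).
tmp = tempfile.mkdtemp()
a = os.path.join(tmp, "a.csv")
b = os.path.join(tmp, "b.csv")
with open(a, "w", newline="", encoding="utf-8") as f:
    f.write("name,rating\nBateau,5\n")
with open(b, "w", newline="", encoding="utf-8") as f:
    f.write("name,rating\nBoutary,4\n\u00c9picure,3\n")

out = os.path.join(tmp, "all.csv")
total = stack_csv_files([a, b], out)
print(total)
```

Passing the encoding explicitly to every open() call is what avoids the platform default, and with it the 'charmap' errors from the question.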
Thank you for your response.

The output for me is <_io.TextIOWrapper name='1bateaulecalife.csv' mode='r' encoding='cp1252'>, but I need the encoding to be 'utf8'. How can I change the encoding?

If I pass encoding 'utf8', it shows me:
File "pandas/_libs/parsers.pyx", line 2172, in pandas._libs.parsers.raise_parser_error

ParserError: Error tokenizing data. C error: Expected 3 fields in line 4, saw 4

Can you explain this to me, please?
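The ParserError means that pandas inferred 3 columns from the first lines and then met a line with 4 fields. A small sketch for locating such lines with the standard csv module follows. The helper name `rows_with_unexpected_width` and the sample text are hypothetical, and a comma delimiter is assumed.

```python
import csv
import io


def rows_with_unexpected_width(lines, delimiter=","):
    """Yield (row_number, field_count) for rows whose width differs from the first row.

    `lines` is any iterable of text lines, for example an open file object.
    Row numbers count parsed rows, starting at 1.
    """
    expected = None
    for number, row in enumerate(csv.reader(lines, delimiter=delimiter), start=1):
        if expected is None:
            expected = len(row)  # the first row defines the expected width
        elif len(row) != expected:
            yield number, len(row)


# Demo on in-memory text; the fourth row has one field too many.
sample = io.StringIO(
    "name,city,rating\n"
    "Bateau,Paris,5\n"
    "Boutary,Paris,4\n"
    "Epicure,Paris,3,extra\n"
)
print(list(rows_with_unexpected_width(sample)))
```

Typical causes are an unquoted comma inside a text field, or a file that uses a different delimiter such as `;`. In the second case pd.read_csv accepts a `sep` argument.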
If you want to change the encoding of your individual files, read them one by one with pandas.read_csv() as shown, passing the encoding they currently have, and then immediately write them out with df.to_csv(), passing the encoding you want.
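The same re-encoding can also be sketched without pandas, by reading the text with the source encoding and writing it back with the target one. This assumes the source encoding is actually known. `reencode_file` and the demo paths are illustrative names, not an existing API.

```python
import os
import tempfile


def reencode_file(src_path, dst_path, src_encoding, dst_encoding="utf-8"):
    """Copy a text file, decoding with `src_encoding` and encoding with `dst_encoding`."""
    # newline="" leaves the original line endings untouched in both directions.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)


# Demo: write a small cp1252 file, convert it, and read the result back as bytes.
tmp = tempfile.mkdtemp()
old = os.path.join(tmp, "old.csv")
new = os.path.join(tmp, "new.csv")
with open(old, "w", encoding="cp1252", newline="") as f:
    f.write("name,rating\ncaf\u00e9,5\n")

reencode_file(old, new, src_encoding="cp1252")
with open(new, "rb") as f:
    converted = f.read()
print(converted)
```

If the source encoding is wrong, the read either fails with a UnicodeDecodeError or silently produces garbled characters, so it is worth checking a converted file by eye before replacing the original.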