Python Forum

Apply textual data cleaning to several CSV files
I need to perform a textual analysis of several speeches. The speeches were transcribed (using OCR) from several PDF files into CSV files. Each CSV file contains a column titled speech, with several speeches from different speakers (one speaker per row). I wrote a function to clean up the most common OCR errors. I applied this function to a single file and it does the job, so I am now trying to apply it to all the CSV files. However, I keep getting the error "TypeError: expected string or bytes-like object". Since the same code works on a single file, I am stuck. Can someone help me? Any suggestion is appreciated.
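
The cleaning function and the loop are not shown in the post, so the following is only a sketch under assumptions. The `re` functions raise exactly this TypeError when they receive something that is not a string, so one plausible cause is that a speech cell in one of the other files is empty or missing: a short row read with `csv.DictReader` gives `None`, and an empty cell read with pandas gives `NaN` (a float). The names `clean_text` and `clean_csv_folder`, the two regex rules, and the folder layout are all hypothetical stand-ins, not the poster's code.

```python
import csv
import re
from pathlib import Path


def clean_text(value):
    """Hypothetical stand-in for the poster's OCR cleaning function."""
    # Guard: re.sub raises TypeError on None, floats (e.g. NaN), etc.
    if not isinstance(value, str):
        return ""
    value = re.sub(r"-\s*\n\s*", "", value)  # rejoin words hyphenated across lines
    value = re.sub(r"\s+", " ", value)       # collapse runs of whitespace
    return value.strip()


def clean_csv_folder(in_dir, out_dir, column="speech"):
    """Clean `column` in every CSV in in_dir and write the copies to out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for csv_file in sorted(Path(in_dir).glob("*.csv")):
        with csv_file.open(newline="", encoding="utf-8") as src:
            reader = csv.DictReader(src)
            fieldnames = reader.fieldnames or []
            rows = list(reader)
        if column not in fieldnames:
            # Assumption: files without the column are skipped, not treated as errors.
            print(f"Skipping {csv_file.name}: no '{column}' column")
            continue
        for row in rows:
            # row.get() returns None when a row is shorter than the header.
            row[column] = clean_text(row.get(column))
        with (out_path / csv_file.name).open("w", newline="", encoding="utf-8") as dst:
            writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(rows)
```

Calling `clean_csv_folder("speeches_raw", "speeches_clean")` (both folder names are made up) would process every CSV in the first folder. If the guard makes the error disappear, printing the rows where the value is not a string should show which file and speaker have the empty cell.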