Nov-15-2021, 06:58 PM
I am working on inventory software which uses Optical Character Recognition (EasyOCR) to scan a document. Now I have the problem of cleaning/sorting the data (using pandas). I want a CSV file like this:
Quote:
Article Number: Name:
56451748468434 Shoe
24564165165145 Boat
...
The data I got looks like this:
Quote:
00605555
Retail Apple I2W USB Power Adapler
00605558
Lightning t0 USB Cable (2 m)
00605613
Apple Lightn t0
Smm Headphone
00805614
Apple iPhone Lightning Dock Black
0C605615
Apple EarPods wilh Lighining Con;
00605774
Huawei GigaCube LTE CPE Fenstera
00605806
Google Home
00605834
Belkin BOOSTUP Wireless Charger
00605872
Google Hame Mini
I tried:
data = pd.DataFrame(data['1'].values.reshape(-1, 2), columns=['Artikel', 'AN'])
But then my output looks like this:
Output:
   Artikel                             AN
0  Retail Apple I2W USB Power Adapler  00605558
1  Lightning t0 USB Cable (2 m)        00605613
2  Apple Lightn t0                     Smm Headphone
3  00805614                            Apple iPhone Lightning Dock Black
4  0C605615                            Apple EarPods wilh Lighining Con;
5
Any idea how I can clean/sort the data properly (only the numbers on one side and the name on the other side)?
Thank you! :)
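The reshape(-1, 2) attempt assumes the lines strictly alternate between number and name, so it goes out of step as soon as a name is split over two lines ("Apple Lightn t0" / "Smm Headphone"). One possible way around that, as a rough sketch only, is to decide for each line whether it looks like an article number and attach every other line to the most recent number. The rule used here (exactly 8 characters, at least 6 of them digits, with a few look-alike letters tolerated for OCR misreads such as "0C605615") is an assumption based on the sample above, and the names is_article_number and pair_numbers_with_names are made up for this example.

```python
import re
import pandas as pd

# Assumption: an article number is exactly 8 characters long and consists of
# digits, possibly with a few digits misread by OCR as look-alike letters.
NUMBER_LIKE = re.compile(r"[0-9OoCIlSB]{8}")


def is_article_number(line):
    """Return True if the line looks like an (OCR-damaged) article number."""
    token = line.strip()
    if not NUMBER_LIKE.fullmatch(token):
        return False
    # Require mostly digits so that an 8-letter word is not mistaken for a number.
    return sum(ch.isdigit() for ch in token) >= 6


def pair_numbers_with_names(lines):
    """Group OCR lines into rows of (article number, name)."""
    rows = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if is_article_number(text):
            # A new article starts here; its name is filled in by later lines.
            rows.append({"AN": text, "Artikel": ""})
        elif rows:
            # Any other line is part of the name of the most recent number.
            rows[-1]["Artikel"] = (rows[-1]["Artikel"] + " " + text).strip()
        # Text that appears before the first article number is ignored.
    return pd.DataFrame(rows, columns=["AN", "Artikel"])


# The OCR lines from the post, unchanged (misreads are kept as they are).
ocr_lines = [
    "00605555", "Retail Apple I2W USB Power Adapler",
    "00605558", "Lightning t0 USB Cable (2 m)",
    "00605613", "Apple Lightn t0", "Smm Headphone",
    "00805614", "Apple iPhone Lightning Dock Black",
    "0C605615", "Apple EarPods wilh Lighining Con;",
    "00605774", "Huawei GigaCube LTE CPE Fenstera",
    "00605806", "Google Home",
    "00605834", "Belkin BOOSTUP Wireless Charger",
    "00605872", "Google Hame Mini",
]

df = pair_numbers_with_names(ocr_lines)

# With no path argument, to_csv returns the CSV text instead of writing a file.
print(df.to_csv(index=False))
```

With the DataFrame from the post, the input list could presumably be built with data['1'].astype(str).tolist(). The numbers are kept as strings so the leading zeros survive; when reading the CSV back, pd.read_csv(..., dtype=str) keeps them from being parsed as integers. The sketch does not correct misread characters, it only sorts the lines into the two columns.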