Python Forum
remove gilberishs from a "string"




remove gilberishs from a "string" - kucingkembar - Mar-15-2024

Sorry for my bad English.
I have been stuck on this problem for 2 days.
I am trying to extract all of the dialogue from a PS1 game.
This is part of it:
Quote:
[v1s9jiue]
[a ;]
[a2b'v6]
[my father's been in a good mood since cliff became his son.]
[wsm]
[v e]
[e n]
[{|{9;<9z]
[i't]
[ryiv?1enns]
[yja'al]
[a ;]
[a2b'v6]
[hello peter. welcome.]
[yssu]
[ys u y8]
I tried multiple language detectors and translators to separate the real dialogue from the rest,
but they did not work.
Is there any clue to solve this?
Notes:
1. I could do it manually, but the data is about 5 MB of "string", so cleaning it would take days.
2. Some entries, like [stu] and [kai], are valid dialogue because they are names.
3. I prefer a solution that needs no online service or registration.

Thank you for reading.
Have a nice day.


RE: remove gilberishs from a "string" - deanhystad - Mar-15-2024

Is every line of dialogue preceded by a2b'v6?


RE: remove gilberishs from a "string" - kucingkembar - Mar-15-2024

(Mar-15-2024, 08:06 AM)deanhystad Wrote: Is every line of dialogue preceded by a2b'v6?
I am using
for i in f.readlines():
to check line by line.
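
One way to test that idea while reading line by line is sketched below. It is only a sketch, and it assumes that [a2b'v6] really does come directly before every dialogue line, which the thread has not confirmed. The name lines_after_marker and the short sample list are made up for illustration; the sample lines are copied from the quote in the first post.

```python
# Assumption: "[a2b'v6]" directly precedes each real dialogue line.
# This holds for the two dialogue lines in the posted sample, but it
# is not confirmed for the full 5 MB of data.
MARKER = "[a2b'v6]"


def lines_after_marker(lines, marker=MARKER):
    """Yield every line that directly follows a marker line."""
    previous = None
    for raw in lines:
        line = raw.strip()  # drop the trailing newline and stray spaces
        if previous == marker and line != marker:
            yield line
        previous = line


# A few lines copied from the sample in the first post.
sample = [
    "[a ;]",
    "[a2b'v6]",
    "[my father's been in a good mood since cliff became his son.]",
    "[wsm]",
    "[a ;]",
    "[a2b'v6]",
    "[hello peter. welcome.]",
    "[yssu]",
]

for dialogue in lines_after_marker(sample):
    print(dialogue)
```

A file object opened with open() can be passed in place of sample, because iterating over a file already yields one line at a time, so f.readlines() is not needed. If some dialogue lines turn out not to follow the marker, this approach will miss them and a different test is needed.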