Jul-21-2019, 07:14 PM
Hi
I have an XML file (350 MB) that is a dictionary of tagged Russian words. I want to use it to tag words, but searching the XML file directly is very slow.
My question is: what should I convert the XML into so that I can handle this data more efficiently? I am thinking of a trie: is it a good idea to create a "forest" of tries? Each word would look like this:
![[Image: ZwqzBHG__mU.jpg]](https://pp.userapi.com/c851436/v851436602/1702ac/ZwqzBHG__mU.jpg)
// "КОТ" is the Russian word for a male cat
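
To make the idea concrete, here is a minimal sketch of what I have in mind, not a finished implementation. The element name `word` and the attributes `form` and `tag` are hypothetical placeholders, because the real schema of the dictionary may differ. `TrieNode`, `Trie` and `load_xml` are also just illustrative names. The XML is read with `xml.etree.ElementTree.iterparse`, so the whole 350 MB file is not held in memory as a tree.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


class TrieNode:
    """One node per character; `tags` is non-empty only where a word ends."""

    __slots__ = ("children", "tags")

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.tags: List[str] = []


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str, tag: str) -> None:
        """Walk (and create) one node per character, then store the tag."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.tags.append(tag)

    def lookup(self, word: str) -> List[str]:
        """Return all tags stored for exactly this word, or [] if absent."""
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                return []
            node = nxt
        return list(node.tags)


def load_xml(source, trie: Trie) -> None:
    """Stream the XML and insert each entry into the trie.

    ASSUMPTION: entries look like <word form="кот" tag="NOUN"/>.
    These names are hypothetical; adapt them to the real schema.
    `source` may be a file path or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "word":
            form, tag = elem.get("form"), elem.get("tag")
            if form is not None and tag is not None:
                trie.insert(form.lower(), tag)
            elem.clear()  # drop the element's attributes and children once processed
```

After loading, `trie.lookup("кот")` returns the list of tags stored for that word. If only exact-match lookups are needed, a plain `dict` mapping each word to its list of tags would probably be simpler, with average constant-time lookup. A trie mainly pays off when prefix queries are needed.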