Jan-21-2018, 04:53 AM
the numeric string sort is a sort based on a numeric string compare. this compare works like a normal string compare until it encounters numeric digits. if it is comparing a digit to a non-digit, it still behaves like a normal string compare, comparing the individual characters of the two strings. if it encounters digits in both strings at the same point in the comparison, the behavior changes. it scans forward in both strings to get the number of digits in each run. if the two digit runs have different lengths, the extra leading digits of the longer run are scanned to see if any of them are non-zero. if so, the longer run is considered to be the higher value. otherwise, the remaining equal number of digits in both runs are compared to determine how the strings compare.
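to make the description above concrete, here is a minimal sketch of that compare in pure Python. it is not my C code, just an illustration: the names `numeric_string_compare` and `_digit_run_end` are made up for this example, only the ASCII digits 0-9 are treated as digits, and i am assuming that when two digit runs have equal value (such as "007" and "7") the compare simply continues after both runs.

```python
from functools import cmp_to_key

DIGITS = frozenset("0123456789")  # assumption: ASCII digits only


def _digit_run_end(s, start):
    """Return the index just past the run of digits that begins at start."""
    end = start
    while end < len(s) and s[end] in DIGITS:
        end += 1
    return end


def numeric_string_compare(a, b):
    """Return -1, 0, or 1, comparing digit runs by their numeric value."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] in DIGITS and b[j] in DIGITS:
            # digits in both strings: find the extent of each run
            end_a = _digit_run_end(a, i)
            end_b = _digit_run_end(b, j)
            run_a, run_b = a[i:end_a], b[j:end_b]
            extra = len(run_a) - len(run_b)
            if extra > 0:
                # a non-zero digit in the extra leading part makes a larger
                if run_a[:extra].strip("0"):
                    return 1
                run_a = run_a[extra:]
            elif extra < 0:
                if run_b[:-extra].strip("0"):
                    return -1
                run_b = run_b[-extra:]
            # the runs now have equal length, so a plain string
            # comparison orders them the same way as their values
            if run_a != run_b:
                return -1 if run_a < run_b else 1
            # assumption: equal-valued runs, keep going after both
            i, j = end_a, end_b
        else:
            # digit vs. non-digit, or two non-digits: normal compare
            if a[i] != b[j]:
                return -1 if a[i] < b[j] else 1
            i += 1
            j += 1
    # at least one string is exhausted; the one with text left is higher
    return (i < len(a)) - (j < len(b))


names = ["file10", "file9", "file1"]
print(sorted(names, key=cmp_to_key(numeric_string_compare)))
```

`functools.cmp_to_key` wraps the compare so `sorted()` can use it. the sort itself still runs in C, but every comparison calls back into Python code.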
i have done this in C, and it was somewhat slower than a normal compare, since it could not use the machine-specific optimizations in the library compare and the compiler could optimize it less as well. many CPU architectures, such as the IBM mainframe S/360, S/370, and S/390, have CPU instructions to do a whole-string compare, but none for a numeric string compare.
in Python, i expect the performance difference would be even more exaggerated, since everything from the comparison to the sorting would be implemented in Python code. to make an implementation perform well, i suspect it would need to be done in C. have you seen such a thing available?