Python Forum
Sort by the largest number of the same results (frequency) - Printable Version



Sort by the largest number of the same results (frequency) - inlovewiththedj - Mar-24-2020

Hello,

I have a sorting question.
I'm currently using Notepad++, but it is difficult to sort the way I want.

For example, I have a few entries in random order:

10AndrewJNVR
10Andrpt1Pf7
10Anuiot18g2
10Andrew1H54
10Andrew17yb
10Andre614uw
10Andr321EUb
10And48n1Q2K
10And1in15tA
10Andrew13gG
10An987ppKeP
10Andrew1Dom


How can I sort text, e.g. in Notepad++ (with the Python add-on), as shown below?
After sorting, the addresses whose names repeat most often, or are closest to each other, should be at the very top,
followed by the next most similar ones, and so on, as in the example below.

So, how do I sort it this way:



10Andrew1Dom
10Andrew1H54
10Andrew13gG
10Andrew17yb
10AndrewJNVR
10Andre614uw
10Andr321EUb
10Andrpt1Pf7
10And1in15tA
10And48n1Q2K
10Ana87ppKeP
10Anuiot18g2


RE: Sort by the largest number of the same results (frequency) - DPaul - Mar-25-2020

How do you know when to take all the letters and when not?
Should 10Ana87ppKeP come first?

DPaul


RE: Sort by the largest number of the same results (frequency) - inlovewiththedj - Mar-25-2020

(Mar-25-2020, 04:49 PM)DPaul Wrote: How do you know when to take all the letters and when not?
Should 10Ana87ppKeP come first?

DPaul

No. As you can see, I need a solution that finds the most similar strings, i.e. the strings that share the most characters,
like
10Andrew1Dom
10Andrew1H54
10Andrew13gG
10Andrew17yb

then

10AndrewJNVR
10Andre614uw

and so on.
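
To make "most same characters" concrete, here is a minimal sketch that reads it as "longest shared run of leading characters". That reading is an assumption, and `shared_prefix` is a hypothetical helper name; `os.path.commonprefix` is a standard-library function that compares strings character by character.

```python
import os.path


def shared_prefix(a, b):
    """Return the leading characters that both strings have in common."""
    # commonprefix works character by character, so it is fine for plain strings
    return os.path.commonprefix([a, b])


print(shared_prefix("10Andrew1Dom", "10Andrew1H54"))  # 10Andrew1
print(shared_prefix("10Andrew1Dom", "10AndrewJNVR"))  # 10Andrew
print(shared_prefix("10Andrew1Dom", "10Andre614uw"))  # 10Andre
```

The longer the shared prefix, the "closer" two entries are under this reading, which matches the grouping shown above.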


RE: Sort by the largest number of the same results (frequency) - DPaul - Apr-01-2020

Apart from the fact that I can think of better coding practices,
there is no simple way to do this.

In my opinion, the reason is that you should sort from the
longest matching string (andrew1) to the shortest (an),
except you can't, because they are hidden in the coded string.
If "an"... happens to come first, you are going to get a lot of
matches, but you don't want the andrew1... entries to be included.
Catch-22, if you ask me.
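
The trade-off can be seen by counting how many entries start with each possible prefix. This is only an illustrative sketch: the list is copied from the first post, and `prefix_counts` is a hypothetical helper name.

```python
from collections import Counter

# Entries copied from the first post, in their original random order.
codes = [
    "10AndrewJNVR", "10Andrpt1Pf7", "10Anuiot18g2", "10Andrew1H54",
    "10Andrew17yb", "10Andre614uw", "10Andr321EUb", "10And48n1Q2K",
    "10And1in15tA", "10Andrew13gG", "10An987ppKeP", "10Andrew1Dom",
]


def prefix_counts(items):
    """Count, for every prefix of every item, how many items start with it."""
    counts = Counter()
    for item in items:
        for end in range(1, len(item) + 1):
            counts[item[:end]] += 1
    return counts


counts = prefix_counts(codes)
print(counts["10An"])       # 12: the short prefix matches every entry
print(counts["10Andrew1"])  # 4: the long prefix matches only one group
```

A short prefix always wins on the number of matches, so counting matches alone cannot pick out the "andrew1" group.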

There is a possibility, but it is clumsy and elaborate. I have not worked it out.
What you could try:
For every entry in the list:
Match the whole code against the other codes, and write down the number of matches plus the length of the matched string.
For that same code, take away the last character and do the same thing. [:-1]
Then take away the last character again, etc., until the length is 0.
Then repeat the process for the second code, etc.
At the end you should be able to see which was the longest len() with the most matches.
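
The steps above could be sketched roughly as follows. This is one possible reading, not a worked-out solution: it stops trimming at the first prefix that matches any other entry, and the tie-breaking (more matches first, then alphabetical) is an assumption. The names `best_match` and `sort_by_similarity` are hypothetical.

```python
# Entries copied from the first post, in their original random order.
codes = [
    "10AndrewJNVR", "10Andrpt1Pf7", "10Anuiot18g2", "10Andrew1H54",
    "10Andrew17yb", "10Andre614uw", "10Andr321EUb", "10And48n1Q2K",
    "10And1in15tA", "10Andrew13gG", "10An987ppKeP", "10Andrew1Dom",
]


def best_match(code, others):
    """Trim code from the right until at least one other entry starts with it.

    Returns (length of that prefix, number of other entries matching it).
    """
    prefix = code
    while prefix:
        matches = sum(other.startswith(prefix) for other in others)
        if matches:
            return len(prefix), matches
        prefix = prefix[:-1]  # take away the last character and try again
    return 0, 0


def sort_by_similarity(items):
    """Put entries with the longest shared prefix first."""
    keyed = []
    for i, code in enumerate(items):
        others = items[:i] + items[i + 1:]
        length, matches = best_match(code, others)
        keyed.append((length, matches, code))
    # Assumed ordering: longest prefix, then most matches, then alphabetical.
    keyed.sort(key=lambda entry: (-entry[0], -entry[1], entry[2]))
    return [code for _, _, code in keyed]


ordered = sort_by_similarity(codes)
for code in ordered:
    print(code)
```

With the sample list this puts the four 10Andrew1... entries first, then 10AndrewJNVR, 10Andre614uw, and so on. The order inside each group is plain alphabetical, so it may differ from the hand-sorted example in the first post. Every entry is compared with every other one, which is fine for short lists but slow for very long ones.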

Good luck
Paul