Python Forum
HELP!!! string logical sorting
#1
Sorry for my bad English.
I have this problem:
thelist = ["t9","t8","t11","t10"]
print(thelist)
thelist.sort()
print(thelist)
Output:
['t9', 't8', 't11', 't10']
['t10', 't11', 't8', 't9']
The expected output is:
Output:
['t8', 't9', 't10', 't11']
I googled for [python string "logical" "sorting"] but found nothing.
Please help me!
#2
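What you are looking for is called natural sorting. When strings are compared, "t11" comes before "t8" because the character "1" sorts before "8". That is logical for strings, but it doesn't look natural.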
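A small sketch of why the plain sort behaves this way: strings are compared character by character, so the result is decided at the first position where they differ. The variable names here are only for illustration.

```python
# Strings are compared character by character, left to right.
# "t11" and "t8" share the "t", then "1" is compared with "8".
print("t11" < "t8")   # True, because "1" < "8"
print(11 < 8)         # False, integers compare by value

names = ["t9", "t8", "t11", "t10"]
plain = sorted(names)
print(plain)          # ['t10', 't11', 't8', 't9']
```

So the digits have to be turned into real integers before comparing.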
#3
Try this:

thelist = ["t9","t8","t11","t10"]
# not quite good: sorts by length only, so "t9" stays ahead of "t8"
sorted_list = sorted(thelist, key=len)
# ok: sorts by the number after the leading "t"
sorted_list = sorted(thelist, key=lambda x: int(x[1:]))
#4
Thank you for both answers.
The real problem is that the above is a simplified example.
The real case is listing files in folders that may contain hundreds or thousands of files.
#5
Fix for your example.

def by_number(text):
    return int(text[1:])


thelist = ["t9","t8","t11","t10"]
print(thelist)

# using the key function by_number, which returns int
# the int is then used for comparison

thelist.sort(key=by_number)
print(thelist)
If you work with filenames, it's similar. You have to know the structure of the name, decompose it until only the number is left, and then convert that to an int. If your filenames have an ISO 8601 date prefix, you can parse it with datetime. date and datetime objects are sortable.

Example with ISO8601 prefix and pathlib:
from datetime import date as Date
from pathlib import Path


def by_date(path):
    date_str, _ = path.name.split("_", maxsplit=1)
    return Date.fromisoformat(date_str)


def walk(root, pattern):
    for path in Path(root).glob(pattern):
        yield path


def main():
    root = "."

    pattern = "????-??-??_*.*"
    # or more explicit glob pattern: pattern = "[0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]_*.*"
    # 2022-10-12_bla1.txt matches the glob pattern

    for path in sorted(walk(root, pattern), key=by_date):
        print(path)


if __name__ == "__main__":
    main()
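For the other case mentioned above (a name that ends in a number), a minimal sketch could look like this. The file names and the by_trailing_number helper are made up for illustration, and it assumes the number is the last run of digits in the name without its extension.

```python
import re
from pathlib import Path


def by_trailing_number(path):
    """Sort key: the last run of digits in the file stem, as an int."""
    match = re.search(r"(\d+)$", path.stem)
    # Files without a trailing number sort first (an arbitrary choice).
    return int(match.group(1)) if match else -1


# Hypothetical file names, no files need to exist for this to run.
paths = [Path("page_10.png"), Path("page_9.png"), Path("page_2.png")]
ordered = sorted(paths, key=by_trailing_number)
print([p.name for p in ordered])  # ['page_2.png', 'page_9.png', 'page_10.png']
```

This only works when every name follows the same structure. For names with several numbers in them, a more general key is needed.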
#6
@kucingkembar

You may paste an example list with file names if you wish!
#7
This project translates raw manga/manhua/comics from non-alphabetic scripts to English.
The data is obtained by using Selenium to [save page as - web page complete],
so the image names may look random,
but if you open them in an image viewer, they are in the right order.
For example:
- 001.jpg, 002.jpg
- 1.jpg, 10.jpg
- 6183416182372182_86981783_1.webp, 6183416182372182_86981783_2.webp
- 1722013291_90649092356080.jpg, 1722013293_31683094936372.jpg
#8
I would use lists as sorting keys, in which the runs of digits have been converted to Python's int type:
examples = [
    ["002.jpg", "001.jpg"],
    ["10.jpg", "1.jpg"],
    ["6183416182372182_86981783_2.webp", "6183416182372182_86981783_1.webp"],
    ["1722013293_31683094936372.jpg", "1722013291_90649092356080.jpg"],
]

import re


def numed(s):
    L = re.split(r"(\d+)", s)
    for i in range(1, len(L), 2):
        L[i] = int(L[i])
    return L


print("Sorting keys:")
for names in examples:
    print([numed(x) for x in names])

print("Sorted names")
for names in examples:
    print("unsorted:", names)
    print("sorted  :", sorted(names, key=numed))
Output:
Sorting keys:
[['', 2, '.jpg'], ['', 1, '.jpg']]
[['', 10, '.jpg'], ['', 1, '.jpg']]
[['', 6183416182372182, '_', 86981783, '_', 2, '.webp'], ['', 6183416182372182, '_', 86981783, '_', 1, '.webp']]
[['', 1722013293, '_', 31683094936372, '.jpg'], ['', 1722013291, '_', 90649092356080, '.jpg']]
Sorted names
unsorted: ['002.jpg', '001.jpg']
sorted  : ['001.jpg', '002.jpg']
unsorted: ['10.jpg', '1.jpg']
sorted  : ['1.jpg', '10.jpg']
unsorted: ['6183416182372182_86981783_2.webp', '6183416182372182_86981783_1.webp']
sorted  : ['6183416182372182_86981783_1.webp', '6183416182372182_86981783_2.webp']
unsorted: ['1722013293_31683094936372.jpg', '1722013291_90649092356080.jpg']
sorted  : ['1722013291_90649092356080.jpg', '1722013293_31683094936372.jpg']
There are also packages on PyPI for this, such as natsort, but I don't know these modules. Install with care.
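Since the real case is a folder full of files, here is a rough sketch of how the same key could be applied to a directory listing with pathlib. The list_naturally helper and the temporary folder are only for illustration, and numed is repeated (slightly condensed) so the snippet runs on its own.

```python
import re
import tempfile
from pathlib import Path


def numed(s):
    # Same idea as the function above: runs of digits become ints.
    parts = re.split(r"(\d+)", s)
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts


def list_naturally(folder):
    """Return the files in folder, naturally sorted by file name."""
    files = (p for p in Path(folder).iterdir() if p.is_file())
    return sorted(files, key=lambda p: numed(p.name))


# Demo in a throwaway folder so nothing on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["10.jpg", "2.jpg", "1.jpg"]:
        (Path(tmp) / name).touch()
    listed = [p.name for p in list_naturally(tmp)]

print(listed)  # ['1.jpg', '2.jpg', '10.jpg']
```

The key is computed once per file, so a few thousand files should not be a problem. The comparison is case-sensitive; if that matters, lowercasing the name before passing it to numed is one option.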
#9
@Gribouillis
Your code works flawlessly,
but I don't understand this part:
key=numed
How does the sort pick only the [int] data and ignore the [str]?
#10
(Aug-09-2024, 09:03 AM)kucingkembar Wrote: how the sort only pick [int] data only, and ignore the [str]
The numed function takes a string and returns a list in which the (non-negative) integers have been converted to Python int.
For example:
print(numed('foo10bar035spam'))  # -> prints ['foo', 10, 'bar', 35, 'spam']
print(numed('foo2bar035spam'))    # -> ['foo', 2, 'bar', 35, 'spam']
The sorted function compares these lists instead of the original strings, which produces the correct result. Lists are compared in lexicographic order: element by element, and the first pair of elements that differs decides the result.
>>> ['foo', 2, 'bar', 35, 'spam'] < ['foo', 10, 'bar', 35, 'spam']
True
Strings are not ignored, but 'foo' and 'foo' compare equal.
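To make the comparison visible, this sketch walks two keys in parallel and shows the first pair that differs. It also illustrates a property of numed that follows from how re.split works with a capturing group: even positions always hold strings and odd positions always hold ints, so the sort never ends up comparing an int with a str. numed is repeated here so the snippet runs on its own.

```python
import re


def numed(s):
    parts = re.split(r"(\d+)", s)
    for i in range(1, len(parts), 2):
        parts[i] = int(parts[i])
    return parts


a = numed("foo2bar035spam")    # ['foo', 2, 'bar', 35, 'spam']
b = numed("foo10bar035spam")   # ['foo', 10, 'bar', 35, 'spam']

# Walk both keys in parallel and stop at the first pair that differs.
for x, y in zip(a, b):
    if x != y:
        first_difference = (x, y)
        break

print(first_difference)  # (2, 10)
print(a < b)             # True, decided by 2 < 10

# A name that starts with digits still gets a str at position 0
# (an empty one), so str meets str and int meets int.
c = numed("10.jpg")      # ['', 10, '.jpg']
d = numed("a10.jpg")     # ['a', 10, '.jpg']
print(c < d)             # True, decided by '' < 'a'
```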