 split by character class
#1
i want to split a string at the boundaries between different classes of characters. in my current case the string has a few runs of digits, so i want something like docs/3.4.10/python-3.4.10-pdf-letter.tar.bz2 to split up like ['docs/','3','.','4','.','10','/python-','3','.','4','.','10','-pdf-letter.tar.bz','2'].

way back in C i would have had to implement this the hard way: scan the string character by character, look up each character's class, and start a new result element whenever the class changes. the classes would be mapped by the caller, who might want everything that is not a digit grouped together. i don't want to do it that way in Python.
What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
I think you already asked this question in the past. Here is the solution for sequences of digits:
>>> import re
>>> re.split(r'(\d+)', "docs/3.4.10/python-3.4.10-pdf-letter.tar.bz2")
['docs/', '3', '.', '4', '.', '10', '/python-', '3', '.', '4', '.', '10', '-pdf-letter.tar.bz', '2', '']
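
The regex above covers the digit case. For the more general version of the question, where the caller decides what counts as a character class, one possible sketch uses itertools.groupby. The name split_by_class and its classify parameter are made up for illustration and are not from this thread; str.isdigit is only a default assumption.

```python
from itertools import groupby

def split_by_class(text, classify=str.isdigit):
    """Hypothetical helper: cut text wherever classify(char) changes value."""
    # groupby yields consecutive runs of characters that share the same key.
    return [''.join(run) for _, run in groupby(text, key=classify)]

parts = split_by_class("docs/3.4.10/python-3.4.10-pdf-letter.tar.bz2")
print(parts)
```

Unlike re.split with a capture group, this does not produce a trailing empty string when the input ends in a digit, and any one-argument function can be passed as classify.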
#3
thanks! yes, i could have asked this in the past. it has been on my mind in a number of forms for a while. in this case the intent is to next scan the list and find the length of the longest digit string, ignoring the elements that are not digits, then scan again and pad each digit string with leading zeros to make them all equal length. the list can then either be returned as the key as is, or joined back into one string and returned, to implement the numeric sort.
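
A rough sketch of that padding idea follows. The names padded_key and natural_sorted are hypothetical. One assumption differs from the description: the pad width is taken from the longest digit run across all the strings being sorted, not per string, because keys padded to different widths would not compare consistently with each other.

```python
import re

DIGITS = re.compile(r'(\d+)')

def padded_key(text, width):
    """Hypothetical helper: zero-pad every digit run in text to width."""
    parts = DIGITS.split(text)
    # With one capture group, odd indexes hold the digit runs
    # and even indexes hold the text between them.
    return ''.join(p.zfill(width) if i % 2 else p for i, p in enumerate(parts))

def natural_sorted(names):
    """Sort strings so that embedded numbers compare by value."""
    names = list(names)
    # Width of the longest digit run anywhere in the input (0 if there are none).
    width = max((len(run) for name in names for run in DIGITS.findall(name)), default=0)
    return sorted(names, key=lambda name: padded_key(name, width))

print(natural_sorted(['python-3.4.10', 'python-3.4.9', 'python-3.10.1']))
```

Joining the pieces back into one string keeps the key a plain str, which matches the second option mentioned above.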
#4
it also works as re.split('(\d+)',...), i.e. not raw. apparently there is no \d backslash sequence in string literals, so the backslash is passed through unchanged. but i think this is not a good way to code.
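
A small illustration of why the raw form is the safer habit, using only the standard re module. The pattern \bpdf\b is an extra example chosen here, not something from the posts above. The non-raw '(\d+)' only works because \d is not a recognized string escape, and since Python 3.6 such unrecognized escapes trigger a DeprecationWarning (a SyntaxWarning from 3.12).

```python
import re

# The raw string and the explicitly escaped string are the same pattern.
raw_pattern = r'(\d+)'
escaped_pattern = '(\\d+)'
same = raw_pattern == escaped_pattern

# '\b' IS a recognized string escape (backspace), so dropping the r prefix
# silently changes the pattern instead of passing the backslash through.
raw_hits = re.findall(r'\bpdf\b', 'x-pdf-y')    # word-boundary match
plain_hits = re.findall('\bpdf\b', 'x-pdf-y')   # looks for literal backspaces

print(same, raw_hits, plain_hits)
```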