Python Forum
function to untabify
#1
Is there a function that comes with Python that can untabify a line of UTF-8 encoded text? It should take a parameter for the tab size and/or default to 8.
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
Could you clarify what you expect such a function to do? My guess at the behaviour you want:

import unittest

# untabify() is the hypothetical function under test; it is not defined here.


class TestUntabify(unittest.TestCase):
    def test_it_removes_a_tab_of_the_given_size_from_a_line(self):
        tab_size = 4
        tab = " " * tab_size
        line = f"{tab}ぃかの"

        untabbed_line = untabify(line, tab_size)

        self.assertEqual(untabbed_line, "ぃかの")

    def test_it_does_not_remove_a_tab_that_is_smaller_than_the_given_size(self):
        tab_size = 4
        tab = " " * 2
        line = f"{tab}ぃかの"

        returned_line = untabify(line, tab_size)

        self.assertEqual(returned_line, line)

    def test_it_removes_tab_characters_from_the_line_if_their_count_matches_the_given_size(self):
        tab_size = 2
        tab = "\t" * tab_size
        line = f"{tab}abc"

        untabbed_line = untabify(line, tab_size)

        self.assertEqual(untabbed_line, "abc")


if __name__ == "__main__":
    unittest.main()
I didn't implement the function, but I tend to use tests to clarify understanding of a problem. These were just the cases I could think of. Are there others?
#3
The standard library has the textwrap module, which might have the required functionality.
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
#4
We obviously overlooked str.expandtabs(). bytes has an expandtabs() method too, but it counts columns in bytes, so each Unicode character counts as however many bytes UTF-8 encodes it as. The str version is probably the one to use.
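For anyone who wants to see the difference, here is a quick sketch comparing the two methods on the same line. The sample text and the untabify() wrapper are just my own illustration (the wrapper name is hypothetical, not something that ships with Python); only str.expandtabs() and bytes.expandtabs() are real built-ins.

```python
# Compare tab expansion on str versus UTF-8 encoded bytes.
line = "ぃ\tかの"  # sample text, chosen only for illustration

# str.expandtabs() counts one column per character.
as_str = line.expandtabs(8)

# bytes.expandtabs() counts one column per byte, so "ぃ" (3 bytes in UTF-8)
# already occupies 3 columns before the tab is reached.
as_bytes = line.encode("utf-8").expandtabs(8).decode("utf-8")

print(repr(as_str))
print(repr(as_bytes))


def untabify(line, tab_size=8):
    """Hypothetical wrapper: expand tabs per character for str or UTF-8 bytes."""
    if isinstance(line, bytes):
        # Decode first so columns are counted in characters, then re-encode.
        return line.decode("utf-8").expandtabs(tab_size).encode("utf-8")
    return line.expandtabs(tab_size)
```

The two results differ in the number of spaces that replace the tab, which is why decoding to str first seems like the safer route. Note that str.expandtabs() still treats every character as one column wide, so characters that a terminal draws two cells wide may not line up visually.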