Python Forum
reformatting text with comma separated numbers
#1
i have found that some programs (rsync is a big example) output numbers larger than 999 with commas separating the groups of digits. i want to make a program that fixes this. it will look for numbers with commas separating digits. where it finds such a number, it will remove the commas and pad the number with spaces to exactly the original width, so any other formatting is not misaligned. once the line is finished, it will be output. any nice functions i should know about?
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
To remove commas use replace(",","")
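As a quick illustration (the sample line below is made up), str.replace works on the whole string, so it removes every comma, not only the ones inside numbers, and the line gets shorter:

```python
# Hypothetical sample line; any text could stand in for it.
line = "sent 1,234,567 bytes, received 89 bytes"

# str.replace removes every comma, including the one after "bytes".
stripped = line.replace(",", "")

print(stripped)
print(len(line) - len(stripped))  # how many characters the line lost
```

So on its own it is only enough when every comma in the text belongs to a number and column alignment does not matter.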
#3
Do you mean output like this:
Output:
>f+++++++++ some/dir/new-file.txt
.f....og..x some/dir/existing-file-with-changed-owner-and-group.txt
.f........x some/dir/existing-file-with-changed-unnamed-attribute.txt
>f...p....x some/dir/existing-file-with-changed-permissions.txt
>f..t..g..x some/dir/existing-file-with-changed-time-and-group.txt
>f.s......x some/dir/existing-file-with-changed-size.txt
>f.st.....x some/dir/existing-file-with-changed-size-and-time-stamp.txt
cd+++++++++ some/dir/new-directory/
.d....og... some/dir/existing-directory-with-changed-owner-and-group/
.d..t...... some/dir/existing-directory-with-different-time-stamp/
"""
https://stackoverflow.com/questions/4493525/what-does-f-mean-in-rsync-logs


YXcstpoguax  path/to/file
|||||||||||
||||||||||╰- x: The extended attribute information changed
|||||||||╰-- a: The ACL information changed
||||||||╰--- u: The u slot is reserved for future use
|||||||╰---- g: Group is different
||||||╰----- o: Owner is different
|||||╰------ p: Permissions are different
||||╰------- t: Modification time is different
|||╰-------- s: Size is different
||╰--------- c: Different checksum (for regular files), or
||              changed value (for symlinks, devices, and special files)
|╰---------- the file type:
|            f: for a file,
|            d: for a directory,
|            L: for a symlink,
|            D: for a device,
|            S: for a special file (e.g. named sockets and fifos)
╰----------- the type of update being done:
             <: file is being transferred to the remote host (sent)
             >: file is being transferred to the local host (received)
             c: local change/creation for the item, such as:
                - the creation of a directory
                - the changing of a symlink,
                - etc.
             h: the item is a hard link to another item (requires 
                --hard-links).
             .: the item is not being updated (though it might have
                attributes that are being modified)
             *: means that the rest of the itemized-output area contains
                a message (e.g. "deleting")
"""
import sys
from pathlib import Path


import rich


def parse_flags(flags):
    """Decode an rsync itemized-changes string such as '>f.st.....x'."""
    update_types = {"<": "sent", ">": "received", "c": "local", "h": "hardlink", ".": "noop", "*": "message"}
    file_types = {"f": "file", "d": "directory", "L": "symlink", "D": "device", "S": "special file"}
    fields = ("checksum", "size", "modification_time", "permission", "owner", "group", "reserved", "acl", "attr")
    fields = tuple(f + "_different" for f in fields)
    update_type, file_type, *differences = flags
    update_type = update_types.get(update_type, "")
    file_type = file_types.get(file_type, "")
    # "." means the attribute is unchanged; any other character
    # (a letter, or "+" for a new item) marks a difference
    flags = {field: flag != "." for field, flag in zip(fields, differences)}
    flags.update({"file_type": file_type, "update_type": update_type})
    return flags


def parse_line(line):
    # strip the newline and split only once, so paths with spaces stay whole
    flags, path = line.strip().split(maxsplit=1)
    return {"path": Path(path), **parse_flags(flags)}


def parse_output(file):
    with open(file) as fd:
        for line in fd:
            try:
                yield parse_line(line)
            except Exception as e:
                print(e, file=sys.stderr)


rich.print(list(parse_output("task.txt")))
Output in console: (screenshot not preserved)

Rich does the formatting.
I guess there are also packages on PyPI for this.
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#4
(May-06-2020, 08:09 AM)anbu23 Wrote: To remove commas use replace(",","")

how do i make that only remove the commas that are part of numbers and leave other commas where they are? how do i add spaces in front of the number to keep other columns correctly aligned?

(May-06-2020, 08:46 AM)DeaD_EyE Wrote: Do you mean output like this:

Output in console: (screenshot not preserved)

Rich does the formatting.
I guess there are also packages on PyPI for this.

that appears to be focused on rsync. but, rsync is merely an example. when someone gives an example, don't focus the solution on that. that's why so many people don't like to give examples.

i generally need to do "foo 4,567,891 bar" -> "foo 4567891 bar". and this is just an example. any text can be in place of "foo" or "bar".
#5
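>>> import re
>>> # 1-3 digits, then one or more groups of a comma and exactly 3 digits
>>> pat_obj = re.compile(r'(?<![\d,])\d{1,3}(?:,\d{3})+(?!,?\d)')
>>> text = 'foo 4,566 bar 1,234'
>>>
>>> def fix(match):
...     num = match.group()
...     # drop the commas, then pad on the left to the original width
...     return num.replace(',', '').rjust(len(num))
...
>>> pat_obj.sub(fix, text)
'foo  4566 bar  1234'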
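
To apply this line by line, as asked in the first post, the same idea can be wrapped in a small filter. This is only a sketch: the names NUMBER_WITH_COMMAS, strip_number_commas and filter_lines are made up here, the sample lines are invented, and it assumes commas inside numbers are always thousands separators followed by exactly three digits.

```python
import re

# Assumption: a "number with commas" is 1-3 digits followed by one or more
# groups of a comma and exactly three digits, e.g. 4,567,891.
NUMBER_WITH_COMMAS = re.compile(r"(?<![\d,])\d{1,3}(?:,\d{3})+(?!,?\d)")


def strip_number_commas(line):
    """Remove commas inside numbers, left-padding to keep the width."""
    def fix(match):
        num = match.group()
        return num.replace(",", "").rjust(len(num))
    return NUMBER_WITH_COMMAS.sub(fix, line)


def filter_lines(lines):
    """Yield each line with the commas removed from its numbers."""
    for line in lines:
        yield strip_number_commas(line)


# Made-up sample lines; any iterable of lines works here.
sample = [
    "foo 4,567,891 bar\n",
    "sent 1,234 bytes, received 56 bytes\n",
]
for fixed in filter_lines(sample):
    print(fixed, end="")
```

Passing sys.stdin to filter_lines instead of the sample list should turn it into a pipe filter, for example behind rsync. Commas that are not part of a number, like the one after "bytes", are left alone, and every line keeps its original length.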