DEC pack, unpack and disk-images

Curbie · (This post was last modified: Jun-03-2024, 11:09 PM by Larz60+.)

Python newbie here, but I have been programming in one language or another for 50 years. Someone whose opinion I trust said I should look into Python because of its rapid-development characteristics, which I prize. I’ve taken a 12-hour online tutorial and a couple of ‘learn by doing’ project-type tutorials.

No tutorial covers anything close to what I’m trying to do, which I’ve been doing in C. The project is pretty big: reading and writing an SD disk-image for a vintage 70s/80s computer. But no matter how big the project is, it’s still composed of small functions, and it’s one of these small functions I chose to start with.

They are the DEC-type RAD50 routines “pack” and “unpack”, and in C they look like this:

#include <ctype.h>                                          // toupper()

//                      "0000000000111111111122222222223333333333" tens row
//                      "0123456789012345678901234567890123456789" ones row
static char     r50[] = " ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789"; // Rad50 character array

// *** "r50" is always a 40 cell array ***
int srch(char chr) {                                        // search r50 array for character and return character's position
    int i;                                                  // array index
    for(i=0; i<40; i++) {                                   // search all 40 characters of "r50" for "chr" (i<39 would miss '9')
        if(r50[i] == chr) {                                 // if "chr" = character @ "r50" position "i"
            return i;                                       // return found "r50" index
        }                                                   // end if "chr" = character @ "r50" position "i"
    }                                                       // end search 40 characters of "r50" for "chr"
    return 64001;                                           // return illegal character flag (the value pack() tests for)
}                                                           // end search r50 array for character and return character's position

// pack example
// char chrs[3] = "tom";                                    // setup three character array
// unsigned short packed;                                   // unsigned short to pack three "chrs" into
// packed = pack(chrs);                                     // pack "chrs" into "packed"
// *** pack first 3 characters of "chrs", returned as an unsigned short; "chrs" MUST be a three cell array ***
unsigned short pack(char *chrs) {                           // rad50 character pack routine
    int pacc = 0;                                           // accumulator for packed word
    int pac = 0;                                            // rad50 value of the current character
    pac = srch(toupper(chrs[0]));                           // get pack value of ascii character one
    if(pac==64001) {return 64001;}                          // found non-rad50 character
    pacc = pacc + (pac * (050 * 050));                      // scale first character by 40*40 (two rad50 places left)
    pac = srch(toupper(chrs[1]));                           // get pack value of ascii character two
    if(pac==64001) {return 64001;}                          // found non-rad50 character
    pacc = pacc + (pac * 050);                              // scale second character by 40 (one rad50 place left)
    pac = srch(toupper(chrs[2]));                           // get pack value of ascii character three
    if(pac==64001) {return 64001;}                          // found non-rad50 character
    pacc = pacc + pac;                                      // third character goes in the lowest place
    return pacc;                                            // return packed number
}                                                           // end rad50 character pack routine


// *** unpack - converts RAD50 packed "word" to three "chrs" of ASCII ***
void unpack(unsigned short word, char *chrs)                // pointer to character array[3]
{                                                           // start unpack word block
    if (word < 0175000) {                                   // if word is legal RAD50 word (< 64,000)
        chrs[0] = r50[word / (050*050)];                    // first character: packed num / (o50 x o50), parentheses required
        chrs[1] = r50[(word / 050) % 050];                  // second character: (packed num / o50) mod o50
        chrs[2] = r50[word % 050];                          // third character: packed num mod o50
    } else                                                  // word is NOT a legal RAD50 word (>= 64,000)
        chrs[0] = chrs[1] = chrs[2] = ' ';                  // set 3 characters of array to ' '
}                                                           // end unpack word block
// pack and unpack examples
//char chr[3]="tom";                                          // set RAD50 chars
//unsigned short packed;                                      // packed 3 character string
//packed = pack(chr);                                         // pack 3 character string "chr"
//printf("packed number is: %d\n", packed);                   // print packed word
//unpack(packed, chr);                                        // unpack "packed" into 3 character string "chr"
//printf("unpacked character is: %c\n", chr[0]);              // print first character
//printf("unpacked character is: %c\n", chr[1]);              // print second character
//printf("unpacked character is: %c\n", chr[2]);              // print third character

These routines are written out long-hand to make their operation easier to understand and debug.

There are endian, byte and word data-type issues, along with some bit twiddling, that none of the Python tutorials covered.

My questions are:
1. Is Python still a good choice for these types of projects?
2. Can anyone point to helpful tutorials for either the RAD50 routines or disk-image file reading/writing?
3. Does anyone have any other helpful pointers?

Larz60+ wrote Jun-03-2024, 11:09 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
This works for C code as well; I added the tags for you in this post.

**Gribouillis** · (This post was last modified: Jun-03-2024, 09:43 PM by Gribouillis.)

I can only answer the first question: yes, I think Python is an excellent choice for this task.
Here is my attempt to code the pack() function. Tell me if the result looks correct.

r50 = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789'
illegal = 127

def prepare_table():
    s = r50.decode('ascii')
    d = {ord(c): i for i, c in enumerate(s)}
    d.update({ord(c.lower()): i for i, c in enumerate(s)})
    L = [illegal for i in range(256)]
    for c, i in d.items():
        L[c] = i
    table = ''.join(chr(x) for x in L).encode('ascii')
    return table

table = prepare_table()

class IllegalCharacter(Exception):
    pass

def pack(c3):
    x = c3.translate(table)
    if illegal in x:
        raise IllegalCharacter
    return 1600 * x[0] + 40 * x[1] + x[2]

res = pack(b'tom')
print(res)

Output:λ python paillasse/pf/curbie.py
30972

Curbie

Gribouillis,

I’m sorry, I didn’t intend for you to write the functions for me: 1) because it annoyed me when people tried to get me to write their code, and 2) because I don’t learn anything, which was the point of my post.

At first glance, there is one small error in your routine:
r50 = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789'
should have a space character before the 'A'…
r50 = b' ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789'
As posted, r50 is assigned only 39 characters; it needs 40 (indices 0 to 39, or 0 to 047 in octal; 40 decimal is 050 octal).

After that small correction, the output is 77545 Octal or 32613 decimal, which is correct.
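
As a quick check of that arithmetic, here is a minimal sketch that packs "tom" with the corrected 40-character table and prints the result in decimal and octal. The names `R50` and `pack_rad50` are illustrative and not taken from the posts above.

```python
# Corrected table: 40 characters, with the space at index 0.
R50 = " ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789"

def pack_rad50(text):
    """Pack exactly three characters into one RAD50 word (illustrative helper)."""
    if len(text) != 3:
        raise ValueError("expected exactly three characters")
    value = 0
    for ch in text.upper():
        # str.index raises ValueError for characters outside the RAD50 set.
        value = value * 40 + R50.index(ch)
    return value

word = pack_rad50("tom")
print(word, oct(word))  # 32613 0o77545
```

Here T, O and M sit at indices 20, 15 and 13, so the word is 20 * 1600 + 15 * 40 + 13.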

I’m not familiar with the b preceding the initialize string (b'ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789') does that denote an initialize string of bytes or something else?

I’ll be going through this line by line tomorrow.

Thanks for all your effort.

Curbie

**Gribouillis** · (This post was last modified: Jun-04-2024, 06:09 AM by Gribouillis.)

(Jun-04-2024, 03:43 AM)Curbie Wrote: does that denote an initialize string of bytes or something else?

Yes, the b in front of a string literal denotes a string of bytes:

"ABC" # this is a string of unicode characters (type str)
'ABC'  # this is also a string of unicode characters (type str)
b'ABC' # this is a string of bytes (type bytes)

A string of bytes behaves like a read-only array of 8-bit integers:

>>> list( b'tom' )
[116, 111, 109]
>>> "tom".encode('ascii')
b'tom'
>>> b'tom'.decode('ascii')
'tom'
>>>
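
To make "read-only array of 8-bit integers" a bit more concrete, here is a small sketch: indexing a bytes object gives an int, slicing gives bytes, item assignment fails, and `bytearray` is the mutable counterpart. The variable names are only for illustration.

```python
data = b'tom'

first = data[0]        # indexing yields an int
head = data[:2]        # slicing yields a bytes object

try:
    data[0] = 84       # bytes objects are immutable, so this raises TypeError
    mutated = True
except TypeError:
    mutated = False

buf = bytearray(data)  # mutable copy of the same bytes
buf[0] = ord('T')

print(first, head, mutated, bytes(buf))  # 116 b'to' False b'Tom'
```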

(Jun-04-2024, 03:43 AM)Curbie Wrote: because I don’t learn anything, which was the point of my post.

Your main problem will be the time it takes to get familiar with Python's libraries. In this case I want to use the bytes.translate() method because then I don't have to iterate through the bytes of the string myself (more precisely, the iteration is done by the underlying C implementation of Python), so I first create the 256-byte translation table that this method needs.
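
For readers new to `bytes.translate()`, the sketch below shows one way such a 256-byte table could be built and what the translated result looks like. It assumes the corrected 40-character table (space first); `R50`, `ILLEGAL` and `table` here are illustrative names, not a replacement for the code posted earlier.

```python
R50 = b' ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789'  # 40 characters, space first
ILLEGAL = 127  # marker for bytes that are not RAD50 characters

# Start with every byte value mapped to ILLEGAL, then fill in the valid ones.
table = bytearray([ILLEGAL]) * 256
for index, byte in enumerate(R50):
    table[byte] = index
    table[ord(chr(byte).lower())] = index  # accept lowercase letters too
table = bytes(table)

# translate() looks every input byte up in the table in a single call.
print(list(b'tom'.translate(table)))  # [20, 15, 13]
print(list(b'a_1'.translate(table)))  # [1, 127, 31]
```

The second call shows the marker value turning up for `_`, which is what the `illegal in x` test in pack() relies on.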

Here is a version of unpack()

rev_table = r50 + b'\xff' * (256 - 40)

def unpack(n):
    if 0 <= n < 64000:
        a, c = divmod(n, 40)
        a, b = divmod(a, 40)
        return bytes((a, b, c)).translate(rev_table)
    raise ValueError(n)

In Python it is better to raise exceptions than to return error codes (error codes force the caller to add more if tests, while exceptions propagate by themselves).
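
A minimal sketch of that point, under the assumption of a simplified loop-based `pack()` and a hypothetical helper `pack_filename()` that packs a six-character name as two words: the helper contains no error checks at all, yet a bad character still reaches the outermost caller.

```python
R50 = " ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%0123456789"

class IllegalCharacter(Exception):
    """Raised when a character is outside the RAD50 set."""

def pack(text):
    """Simplified pack: no length check, for illustration only."""
    value = 0
    for ch in text.upper():
        index = R50.find(ch)
        if index < 0:
            raise IllegalCharacter(ch)
        value = value * 40 + index
    return value

def pack_filename(name):
    """Hypothetical helper: two RAD50 words, with no error handling of its own."""
    return pack(name[:3]), pack(name[3:6])

try:
    words = pack_filename("tom#01")
except IllegalCharacter as exc:
    words = None
    print("cannot pack, bad character:", exc)
```

With error codes, `pack_filename()` would need an `if` after each call to `pack()`; with exceptions only the code that can actually do something about the problem has to mention it.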

Curbie · Jun-04-2024, 04:06 PM

Quote:Your main problem will be the time it takes to get familiar with Python's libraries.

Agreed, but in my experience that is no different from learning other languages. I think my main issue is going to be that I have a 50-year library of helper programs already written, so almost anything I do now in Python is going to need a pretty thorough understanding of its libraries.

Any hints on the best way to go after that Python library familiarity?

I think it will also be better to read the SD disk-image data in as "chunks". Is there a library for that?

I also assume that there is an SQL library?

Quote:return 1600 * x[0] + 40 * x[1] + x[2]

It seems standard PEMDAS precedence applies?
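
A short sketch of how precedence plays out for this expression, using the codes 20, 15 and 13 for "TOM". Multiplication does bind tighter than addition, and operators of equal precedence group left to right, which matters when dividing a packed word back out.

```python
x = (20, 15, 13)  # RAD50 codes for "TOM"

# Multiplication binds tighter than addition, so no parentheses are needed.
packed = 1600 * x[0] + 40 * x[1] + x[2]
assert packed == (1600 * x[0]) + (40 * x[1]) + x[2] == 32613

# // and * have equal precedence and group left to right.
wrong = packed // 40 * 40    # evaluated as (packed // 40) * 40
right = packed // (40 * 40)  # index of the first character

print(packed, wrong, right)  # 32613 32600 20
```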

Thanks for all your help.

Curbie

Pedroski55 · Jun-06-2024, 08:47 AM

Quote:I think I'll also be better to read SD disk-image data in as "chunks" is there a library for that?

Following the Linux motto "Everything is a file" you can easily read your file in chunks:

# set chunksize much bigger than 1024, this is 1024 * 1024
def read_in_chunks(file_object, chunk_size=1048576):
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

path2data = '/home/pedro/myPython/books/Ashwin Pajankar - Practical Python Data Visualization_ A Fast Track Approach To Learning Data Visualization With Python (2021, Apress) - libgen.li.pdf'    

with open(path2data, 'rb') as f:
    for piece in read_in_chunks(f):
        print(len(piece))
        # do something

Looks like this:

Output:for piece in read_in_chunks(f):
    print(len(piece))

    
1048576
1048576
1048576
1048576
845377
0

That said, if I want to copy anything from somewhere to somewhere, I would just use rsync.

This copies from my laptop to a usb stick, but you can copy to anywhere on a network if you have write permission.
In bash:

Quote:rsync -av -e "ssh" --progress /home/pedro/myPython [email protected]:/media/pedro/295df732-017f-490a-b6cb-19061b2965e8/home/pedro/

rsync checks that data has changed before overwriting existing files, so saves a lot of time.

Curbie · Jun-06-2024, 04:40 PM

Pedroski55,

Thanks for the reply.

I fear that the foundation I've been thinking about building is weak: reading the whole SD image in and resolving endianness (byte swapping) for all 65536 sectors at one time.

Can Python read one targeted/indexed sector (i.e. sector #1 or #x) of the SD at a time?

I only need to resolve sector 1 (MFD, master file directory, think root), sectors 2 - 9 (BITMAP, think bitmap of sectors in use), a couple of UFD sectors (user file directory, think first and only user-file directory level), and however many sectors the target file uses. An average of fewer than 20 sectors vs 65536 seems better.

Without having to understand stone-age file structures, processing an average of 20 sectors vs 65536 seems the obvious choice. Is there a way to read sectors?

Curbie

Pedroski55 · (This post was last modified: Jun-07-2024, 06:35 AM by Pedroski55.)

I am not technical enough to answer that, sorry! I just tinker with Python.

I would ask that question on linuxquestions.org. Tell them what you want to do.

There is very little that they do NOT know there!

I would be surprised if you did not get a very helpful answer!

Probably a 3 line bash script!

Found this, maybe it will help!

Curbie · Jun-07-2024, 06:45 AM

Pedroski55,

Thanks for your help anyway, I have more stuff to chase now.

Pedroski55 · Jun-07-2024, 07:12 AM

Is this anything like what you want?

I plugged in an old usb stick. It shows up as the device /dev/sda.

#! /usr/bin/python3
"""Read a single sector of a physical disk. Tested on Mac OS 10.13.3 and Windows 8."""

import os

SECTOR_SIZE = 512  # bytes per sector


def read_sector(disk, sector_no=0):
    """Read a single sector of the specified disk.

    Keyword arguments:
    disk -- the device path (or disk-image file) to read.
    sector_no -- the sector number to read (default: 0).
    """
    # Open in binary mode; the `with` block closes the file automatically.
    with open(disk, 'rb') as fp:
        fp.seek(sector_no * SECTOR_SIZE)
        return fp.read(SECTOR_SIZE)


def main(usb):
    """Demo usage: print the first sector of a disk."""
    if os.name == "nt":
        # Windows normally uses '\\.\physicaldriveX' for disk drive identification.
        print(read_sector(r"\\.\physicaldrive0"))
    else:
        # Linux normally uses '/dev/sdX' (macOS uses '/dev/diskX').
        print(read_sector(usb))


if __name__ == "__main__":
    usb = '/dev/sda'  # the usb stick on this machine
    main(usb)

Make sure the Python script read_sectors.py is executable.
It also needs to be run with sudo, because a normal user has no access to /dev/sda.
Run this in bash:

Quote:pedro@pedro-HP:~/myPython$ sudo ./read_sectors.py

Gives this:

Output:
b'3\xc0\xfa\x8e\xd8\x8e\xd0\xbc\x00|\x89\xe6\x06W\x8e\xc0\xfb\xfc\xbf\x00\x06\xb9\x00\x01\xf3\xa5\xea\x1f\x06\x00\x00R\x89\xe5\x83\xec\x1cj\x1e\xc7F\xfa\x00\x02R\xb4A\xbb\xaaU1\xc90\xf6\xf9\xcd\x13Z\xb4\x08r\x17\x81\xfbU\xaau\x11\xd1\xe9s\rf\xc7\x06Y\x07\xb4B\xeb\x13\xb4H\x89\xe6\xcd\x13\x83\xe1?Q\x0f\xb6\xc6@\xf7\xe1RPf1\xc0f\x99@\xe8\xdc\x00\x8bNV\x8bFZPQ\xf7\xe1\xf7v\xfa\x91Af\x8bFNf\x8bVRS\xe8\xc4\x00\xe2\xfb1\xf6_YXf\x8b\x15f\x0bU\x04f\x0bU\x08f\x0bU\x0ct\x0c\xf6E0\x04t\x06!\xf6u\x19\x89\xfe\x01\xc7\xe2\xdf!\xf6u.\xe8\xe1\x00Missing OS\r\n\xe8\xd2\x00Multiple active partitions\r\n\x91\xbf\xbe\x07Wf1\xc0\xb0\x80f\xab\xb0\xedf\xabf\x8bD f\x8bT$\xe8@\x00f\x8bD(f\x8bT,f+D f\x1bT$\xe8p\x00\xe8*\x00f\x0f\xb7\xc1f\xab\xf3\xa4^f\x8bD4f\x8bT8\xe8"\x00\x81>\xfe}U\xaau\x85\x89\xecZ_\x07f\xb8!GPT\xfa\xff\xe4f!\xd2t\x04f\x83\xc8\xfff\xab\xc3\xbb\x00|f`fRfP\x06Sj\x01j\x10\x89\xe6f\xf7v\xdc\xc0\xe4\x06\x88\xe1\x88\xc5\x92\xf6v\xe0\x88\xc6\x08\xe1A\xb8\x01\x02\x8aV\x00\xcd\x13\x8dd\x10far\x0c\x02~\xfbf\x83\xc0\x01f\x83\xd2\x00\xc3\xe8\x0c\x00Disk error\r\n^\xac\xb4\x0e\x8a>b\x04\xb3\x07\xcd\x10<\nu\xf1\xcd\x18\xf4\xeb\xfd\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xee\xff\xff\xff\x01\x00\x00\x00\xff\xbf\xd4\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00U\xaa'
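
A sector read this way is just 512 raw bytes, so the endian question raised earlier still has to be dealt with. The sketch below is one possible approach, assuming the image stores 16-bit words low byte first (little-endian); if the target machine differs, the `<` in the format string would become `>`. The names `sector_words` and `fake_image` are illustrative, and an in-memory `io.BytesIO` stands in for a real disk image opened with `open(path, 'rb')`.

```python
import io
import struct

SECTOR_SIZE = 512  # bytes per sector, same value as in read_sector() above

def read_sector(image, sector_no):
    """Return one sector from an already open binary file object."""
    image.seek(sector_no * SECTOR_SIZE)
    return image.read(SECTOR_SIZE)

def sector_words(sector):
    """Interpret a 512-byte sector as 256 unsigned 16-bit little-endian words."""
    return struct.unpack('<256H', sector)

# Stand-in for a disk image: sector 0 is all zeros, sector 1 starts with the word 32613.
fake_image = io.BytesIO(
    bytes(SECTOR_SIZE) + struct.pack('<H', 32613) + bytes(SECTOR_SIZE - 2)
)

words = sector_words(read_sector(fake_image, 1))
print(words[0], len(words))  # 32613 256
```

Because `struct` does the byte-order conversion per sector, only the handful of sectors actually needed (MFD, BITMAP, UFD and the file's own sectors) ever have to be read and converted.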
