Python Forum

Full Version: what is the name?
for ages, i have been writing dabs and blobs of code in a couple unnamed languages of length 6 and 1, to encode and decode between byte sized octets and characters in the form of strings having lots of backslashes, quite a few tildes, numerous digits of many bases, and just about everything else, and have finally reached an impasse: i don't know the name of this encoding scheme. i need to be able to name the many variables my code will have but am finding more and more resistance to the effort.

does anyone know the name of the coding scheme that lets your carriage returns be known as ^M or \r and your tabs be known as ^I or \t and make many random octets be known in the form \123 or the like? sorry, ass-key is already taken.
control character
I think it originated with the teletypes, specifically the ASR 33.
^M I believe is ctrl-M which is a Carriage Return, or hex 0x0d, or decimal 13
the value was arrived at by adding the letter position value to 0, thus 'M' being the 13th character since A is 1,
so this gives ASCII code 13. Ctrl-@ would be a Nul, ctrl-A 'Start of heading' or SOH, and so on.

wiki search on control character gets: https://en.wikipedia.org/wiki/Control_character
control character was one of my early thoughts but it "feels" wrong because the numeric encoding in octal, decimal, and hexadecimal can encode the entire range of 256 values in an octet (why the decoded side of this is a bytes type in python). so it's more than just control characters although i would expect many uses of this to just do control characters. but it may be the right name since the default mode of encoding results in "ab^cd\r" when encoding "\141\142\003\144\015". lacking any other "genius" idea, "control character(s)" is going to be the name.
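A minimal sketch of the encoding direction described above (bytes in, printable string out). The rules are assumptions drawn from the one example given: named escapes for tab, newline and carriage return, `^` plus a lowercase letter for the other low control codes, printable ASCII unchanged, and three-digit octal for everything else. Backslash-escaping a literal `\` or `^` is an additional assumption, and the name `encodecc` is only a candidate name.

```python
# Assumed named escapes; the thread only confirms \r and \t.
NAMED = {0x09: "\\t", 0x0A: "\\n", 0x0D: "\\r"}

def encodecc(data: bytes) -> str:
    """Encode bytes into a printable control-character representation."""
    out = []
    for b in data:
        if b in NAMED:
            out.append(NAMED[b])              # e.g. 13 -> \r
        elif 1 <= b <= 26:
            out.append("^" + chr(b + 0x60))   # e.g. 3 -> ^c
        elif b in (0x5C, 0x5E):
            out.append("\\" + chr(b))         # literal backslash or caret (assumption)
        elif 0x20 <= b <= 0x7E:
            out.append(chr(b))                # printable ASCII stays as-is
        else:
            out.append(f"\\{b:03o}")          # anything else as 3-digit octal
    return "".join(out)

print(encodecc(b"\141\142\003\144\015"))  # ab^cd\r
```

Fixed-width octal keeps the output unambiguous when a digit follows an escaped byte.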
I think this is the ASCII representation of the non-printable control characters. About the names... Don't know. Perhaps they come from the old typewriters.
I used the ASR 33 in the early days of microprocessor development.
One of the first 'real' applications that I was assigned was to control the operation of a 30 channel inductively coupled plasma spectrometer. I chose an Intel 8080 s-100 bus single board computer which had (if I recall properly) four 2708 E-Prom slots on it.
At the time, my development was done on a DEC pdp-11, on a cross assembler, which had two outputs, an 8080 assembly language printout and a machine code image for the 8080. The machine code was fed to the ASR-33 and punched on paper tape. E-Prom burners were expensive back then, so every iteration of code had to be punched on paper tape, carried to a neighboring company who would read the tape back in, and burn a set of E-Proms for $25. We of course also kept a backup on 80 column punched cards. When floppy disks came out, life got a lot easier!
(Jun-24-2018, 02:07 AM)Skaperen Wrote: does anyone know the name of the coding scheme that lets your carriage returns be known as ^M or \r and your tabs be known as ^I or \t ...

From my ASR 33 days, I don't think there was a name to the scheme. The characters corresponded to the decimal position in the alphabet:
Output:
7 = Ctrl G = BELL (The 7th letter in the alphabet is 'G')
9 = Ctrl I = TAB
10 = Ctrl J = LF (Linefeed)
12 = Ctrl L = FF (Formfeed)
13 = Ctrl M = CR (Carriage Return)
For an ASCII Table see: https://www.asciitable.com/
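The values above can be reproduced with a short Python sketch. The helper name `ctrl_code` is hypothetical; it simply takes the letter's position in the alphabet, counting A as 1 (the same result as masking the letter's ASCII code with 0x1F).

```python
def ctrl_code(letter: str) -> int:
    """Return the ASCII code sent by Ctrl plus the given letter (A-Z)."""
    # Position in the alphabet, counting A as 1: G -> 7, M -> 13.
    return ord(letter.upper()) - ord("A") + 1

for letter in "GIJLM":
    print(f"{ctrl_code(letter):2d} = Ctrl {letter}")
```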

Lewis
ljmetzger Wrote: The characters corresponded to the decimal position in the alphabet:

correct, as stated earlier:

Larz60+ Wrote: the value was arrived at by adding the letter position value to 0, thus 'M' being the 13th character since A is 1,

which is also ASCII 13 (decimal)

I found an interesting pdf about the teletype history (invented in 1907) that was written for the 50th anniversary.

Even though I started programming in the 1960s, I never used one as a computer peripheral until the late 70s or early 80s. But everyone had one, which was usually put next to the poor receptionist or telephone operator because they were so noisy! But, I guess you could say that the telex was an early form of email.

One (perhaps last) comment, note page 37:

Quote: Since the stunt box can recognize sequences of control codes as well as single codes, the number of permutations available for a coding system runs into the thousands.

Thus the reason for the ctrl key on our keyboards today, literally: you held the control key down while typing the second letter.

Here's an image of the code wheel (the beginnings of the ASCII (coined in 1963) character set)

[attachment=430]
i am thinking of function names like decodecc() or decc() and encodecc() or encc(). the decode function will output bytes. a string output version could be made as well or a transparent bytes to string function can be used.

there are so many cases of using decodecc() i can think of where it would be desired to enter control characters as data, including on command line arguments, in GUI app fields, and in web forms. and it can be useful in many cases to print bytes where non-printable bytes are normally not shown but be shown in control character representation form (CCRF) where printable codes still print the same.
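A rough sketch of what a decodecc() along these lines might look like, returning bytes as described. The accepted syntax is an assumption: `^x` for control codes, `\t`, `\n`, `\r`, `\\` and `\^` as named escapes, and `\` followed by one to three octal digits for arbitrary octets. Decimal and hexadecimal forms are left out to keep it short.

```python
# Assumed named escapes after a backslash.
NAMED = {"t": 0x09, "n": 0x0A, "r": 0x0D, "\\": 0x5C, "^": 0x5E}
OCTAL = "01234567"

def decodecc(text: str) -> bytes:
    """Decode a control-character representation string into bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "^" and i + 1 < len(text):
            out.append(ord(text[i + 1]) & 0x1F)   # ^M or ^m -> 13
            i += 2
        elif ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt in NAMED:
                out.append(NAMED[nxt])
                i += 2
            elif nxt in OCTAL:
                j = i + 1
                # Consume up to three octal digits.
                while j < len(text) and j < i + 4 and text[j] in OCTAL:
                    j += 1
                value = int(text[i + 1:j], 8)
                if value > 255:
                    raise ValueError(f"octal escape out of range at index {i}")
                out.append(value)
                i = j
            else:
                raise ValueError(f"unknown escape at index {i}")
        else:
            out.append(ord(ch))   # raises ValueError for characters above 255
            i += 1
    return bytes(out)

print(decodecc("ab^cd\\r"))
```

A string-returning variant could wrap this with `decodecc(text).decode("latin-1")`, which maps each octet to the code point of the same value.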
Hi Larz,
Very useful link on Control character and found lots of new stuff here so thanks for sharing. Keep sharing!
wow, a 1000 mile current loop. must have been an awesome lightning catcher. at least, lightning tends to make its next jump to ground nearby, since the opposing charge is there. but charge movement in shifting storms can have interesting effects.

and their fixation on fixed length marks and spaces limited them for a long time. a variable code with alternating mark and space with the information in the length of time could have synchronized the sequence with no loss of time for information.