Python Forum
Correct way to change bytes in a file?
#1
I'm a beginner working on my first Python program.  Using Python 3.6.

I don't understand what is actually stored in memory for the following two scenarios:

Scenario-1:
blist = b'\x76\x12\x0B\x08'
blist[2] = 10
TypeError: 'bytes' object does not support item assignment
After a LOT of Googling, I saw a post that suggested using the list() method.  So I tried scenario-2.

Scenario-2:
blist = list(b'\x76\x12\x0B\x08')
blist[2] = 10
No error
After more Googling I learned that scenario-1 is a bytes type and that is immutable.  Scenario-2 is a list type and lists are mutable which allows scenario-2 to work without errors.

In trying to understand this, what is actually stored in memory for the two scenarios?
After line-1 is executed in both scenarios, aren't the bytes in memory identical?

For example:
In scenario-1, I assume blist[2] = hex 0B = decimal 11 is stored in memory.
In scenario-2, I assume blist[2] is also hex 0B = decimal 11 stored in memory.

Am I wrong?  Please help me to understand this.
#2
Python is not C. When you assign to an index of a list, you can't think of it as having a pointer to the actual location in memory where the data lives; that just isn't relevant in Python.

It is my understanding that, yes, byte strings and lists are both essentially implemented as C arrays at their bare bones, with lists carrying a lot of extra logic for resizing and the like. But this is neither here nor there in terms of Python programming. Lists are mutable; byte strings are not. That is just how Python is designed.
#3
You don't have to worry about how things are stored. There is an immutable one and a mutable one. They could be stored exactly the same way, with just an "immutable" flag somewhere. Or one could be stored as a group of contiguous bytes (or ints), while the other is a bunch of individual byte objects (control block + byte/int value). The immutable one can be used in sets and as a dictionary key because it is hashable; the mutable one cannot. The immutable one is iterable and indexable, but that doesn't make it a list, even though lists are also iterable and indexable.
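
To make the hashability point concrete, here is a minimal sketch. The names `raw`, `lookup`, and `error_message` are made up for illustration; only the byte values come from the original question.

```python
# Illustrative sketch; the variable names are hypothetical.
raw = b'\x76\x12\x0B\x08'

# bytes is immutable and hashable, so it can be a dict key or a set member.
lookup = {raw: "header"}
seen = {raw}

# list is mutable and unhashable: using one as a dict key raises TypeError.
try:
    bad = {list(raw): "header"}
except TypeError as exc:
    error_message = str(exc)

# Both are iterable and indexable, and indexing either one yields an int.
first_from_bytes = raw[0]
first_from_list = list(raw)[0]
print(lookup[raw], error_message, first_from_bytes, first_from_list)
```

So the two types behave alike for reading, and differ in whether they can be changed and hashed.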

A very informative video on the subject: https://www.youtube.com/watch?v=_AEJHKGk9ns
Unless noted otherwise, code in my posts should be understood as "coding suggestions", and its use may require more neurones than the two necessary for Ctrl-C/Ctrl-V.
Your one-stop place for all your GIMP needs: gimp-forum.net
#4
Well, you have a choice of which data type to use.

data_set = {'one', 'two'}
data_list = ['one', 'two']
data_tuple = ('one', 'two')
data_string = "one,two" #csv style
All of these hold the same data, and all of them are iterable.

In [16]: print("set: {}\nlist: {}\ntuple: {}\nstring: {}".format(
    ...:                       sys.getsizeof(data_set), 
    ...:                       sys.getsizeof(data_list), 
    ...:                       sys.getsizeof(data_tuple),
    ...:                       sys.getsizeof(data_string)))
Output:
set: 224
list: 80
tuple: 64
string: 56
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
#5
Quote:In scenario-1, I assume blist[2] = hex 0B = decimal 11 is stored in memory.
id() shows an object's identity (in CPython, its memory address).
Bytes strings are immutable, the same as the str type.
Nothing gets changed in memory; it is not allowed.
>>> b_string = b'\x76\x12\x0B\x08'
>>> [id(i) for i in b_string]
[1400293216, 1400291616, 1400291504, 1400291456]
>>> b_string[2] = 10
Traceback (most recent call last):
  File "<string>", line 301, in runcode
  File "<interactive input>", line 1, in <module>
TypeError: 'bytes' object does not support item assignment
>>> [id(i) for i in b_string]
[1400293216, 1400291616, 1400291504, 1400291456]
Quote:In scenario-2, I assume blist[2] is also hex 0B = decimal 11 stored in memory.
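Lists are mutable, so here item 2 gets changed: the slot that referred to the original 11 (memory location 1400291504)
now refers to 10 (memory location 1400291488). The 11 object itself is not modified. In CPython, small integers like these are cached and shared, so it is not actually garbage collected either.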
>>> blist = list(b'\x76\x12\x0B\x08')
>>> [id(i) for i in blist]
[1400293216, 1400291616, 1400291504, 1400291456]
>>> blist[2] = 10
>>> [id(i) for i in blist]
[1400293216, 1400291616, 1400291488, 1400291456]
As mentioned, you don't have to worry about how things are stored in memory.
Python cleans up (garbage collects) stuff in memory for you.
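
As a rough sketch of the same experiment without relying on the printed id() values, the snippet below compares object identity slot by slot. The names `old_items` and `changed_slots` are hypothetical helpers, not something from the posts above.

```python
b_string = b'\x76\x12\x0B\x08'
blist = list(b_string)      # a new list whose slots refer to int objects

old_items = list(blist)     # keep references to the objects each slot held
blist[2] = 10               # rebind slot 2 to the int object 10

# Find the slots that now refer to a different object than before.
changed_slots = [
    index
    for index, (old, new) in enumerate(zip(old_items, blist))
    if old is not new
]

print(changed_slots)
print(b_string[2], blist[2])
```

Only slot 2 of the list is rebound; the original bytes object still reads 11 at index 2, because the list is a separate container.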
#6
(Feb-22-2017, 02:11 PM)snippsat Wrote:
>>> b_string = b'\x76\x12\x0B\x08'
>>> [id(i) for i in b_string]
[1400293216, 1400291616, 1400291504, 1400291456]
The difference between ids and addresses makes me believe that the 'i' elements you get are already copies of the bytes in the sequence...
#7
(Feb-22-2017, 04:50 PM)Ofnuts Wrote: The difference between ids and addresses makes me believe that the 'i' elements you get are already copies of the bytes in the sequence...
No, they are not copies; id() only returns the object's memory address.
Everything in Python is stored as a reference.
>>> import sys

>>> blist = list(b'\x76\x12\x0B\x08')
>>> id(blist)
6114280
>>> sys.getsizeof(blist)
64

>>> blist[0]
118
>>> id(blist[0])
1351010144
>>> sys.getsizeof(blist[0])
14
What is actually stored (the "pointer" itself behind blist[0]) is something we cannot access or manipulate in any way.
blist[0] evaluates to a Python object (which has a memory address), and that's all we can rely on.
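
One indirect way to see the storage difference, as a sketch that assumes CPython (sys.getsizeof reports implementation-specific sizes, so the exact numbers are not portable): measure how much each container grows when one more element is added. The names `bytes_per_item` and `list_per_item` are made up for this example.

```python
import sys

n = 1000

# bytes(n) builds a zero-filled bytes object of length n.
bytes_per_item = sys.getsizeof(bytes(n + 1)) - sys.getsizeof(bytes(n))

# [0] * n builds a list of n references to the same int object.
list_per_item = sys.getsizeof([0] * (n + 1)) - sys.getsizeof([0] * n)

print(bytes_per_item, list_per_item)
```

In CPython a bytes object grows by one byte per element, while a list grows by one reference (pointer) per element, which fits the "stored as reference" description above.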
#8
I started a different thread named "What is actually strored in memory?" but after 93 views there were no answers, so I'll present my actual task to hopefully get some.
(EDIT: Oops, I see now that there are answers to that thread. Thanks for those replies. But please help me with this thread. Thanks from a newbie.)

I want to read a file into memory, change some bytes in the memory data, then write the memory data to a new file.  I will not be "inserting" bytes but just changing byte values.  Here's some sample code:

with open("test 18.vf", "rb") as bfile:     # Just opens the file. Does not read the file.
    bdata = bfile.read()

bdata[2] = 10
TypeError: 'bytes' object does not support item assignment
I learned that "bdata" is a "bytes" object and is therefore immutable.  If I convert "bdata" using list(), then I can change bytes without error.  Here's an example:

with open("test 18.vf", "rb") as bfile:     # Just opens the file. Does not read the file.
    bdata = bfile.read()

tmpdata = list(bdata)
tmpdata[2] = 10     

# This works with no errors
So after the bytes are changed in "tmpdata", is it correct that I need to convert it back to "bytes" data using this code:
.... bdata = bytes(tmpdata)
and then write bdata to a new file?

Or is my logic wrong or using unnecessary steps?
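
For what it's worth, the list() and bytes() round trip described above does work. A possibly simpler route is the built-in bytearray type, which is a mutable sequence of bytes and can be written to a file directly. The sketch below is one way to do it; the helper name `patch_bytes`, its `changes` parameter, and the temporary file names are all made up for the demo.

```python
import tempfile
from pathlib import Path


def patch_bytes(src, dst, changes):
    """Copy src to dst, overwriting bytes at the given offsets.

    changes maps a byte offset to its new value (0 to 255).
    """
    data = bytearray(Path(src).read_bytes())  # mutable copy of the file contents
    for offset, value in changes.items():
        data[offset] = value                  # in-place change, no insertion
    Path(dst).write_bytes(data)               # bytearray can be written as-is


# Small demo using temporary files instead of a real input file.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input.bin"
    dst = Path(tmp) / "output.bin"
    src.write_bytes(b'\x76\x12\x0B\x08')
    patch_bytes(src, dst, {2: 10})
    result = dst.read_bytes()

print(result)
```

The file length stays the same because item assignment on a bytearray only replaces existing values.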
#9
Well, as far as I know, variables in Python are names for references, which are pointers to memory addresses. We can only play with the names.
#10
Mekire, Ofnuts, wavic and snippsat:
... Thanks so much for your helpful replies.

Ofnuts,
Thanks for that youtube link.  I will watch it tonight when I have some free time.

To all,
I'll forget about trying to figure out how things are actually stored in memory.  I did open another thread named "Correct way to change bytes in a file?".  Hopefully that will be clearer for me to understand since it deals with my actual task.

Thanks again all for taking the time to help.
Raptor88