Hi, I want to optimize this function and make it faster (if possible). Please let me know if there is a better way of doing this. Thanks!
# Reading data from a YML file.
def read(value):
    raw = open(file)
    reader = yaml.safe_load(raw)
    separate = value.split(".")
    i = 0
    x = reader.get(separate[i])
    for i in enumerate(separate):
        if isinstance(x, dict):
            try:
                num = (i[0] + 1)
                x = x.get(separate[num])
            except IndexError:
                return x
    return x
def read(value):
    with open(yaml_source) as yaml_file:
        reader = yaml.safe_load(yaml_file)
    for key in value.split('.'):
        try:
            reader = reader[key]
        except (KeyError, TypeError):
            break
    return reader
i, x are terrible variable names - and your code is messy and inconsistent.
- enumerate is redundant
- You will never get IndexError with get
- Use the with statement to open the file
- PS When you get to a level where the value is not a dict, you may break out of the loop
IndexError is not being thrown by the .get() statement. It's being thrown by separate[num]. I agree that the variable names are not good. I had already changed them shortly after making this post. with seems unnecessary if my current version works just fine. If you can show me that using with improves the speed of my function, then I will use it. As for your function, I tried implementing it and it simply did not work. Here is my edited version which still works.
# Reading data from a file.
def read(value):
    raw = open(file)
    reader = yaml.safe_load(raw)
    separate = value.split(".")
    original = reader.get(separate[0])
    for v, value in enumerate(separate):
        if isinstance(original, dict):
            try:
                original = original.get(separate[(v+1)])
            except IndexError:
                return original
    return original
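The point about where the IndexError comes from can be checked in isolation. A minimal sketch, assuming a made-up dict and key list in place of the real YAML data (both are hypothetical, not taken from the file in this thread):

```python
# Hypothetical sample data standing in for one level of the parsed YAML.
settings = {"python-version": "3.6"}
separate = ["settings", "python-version"]

# dict.get() returns None for a missing key instead of raising.
missing = settings.get("no-such-key")

# Indexing a list one past its end is what raises IndexError.
try:
    separate[len(separate)]
    raised = False
except IndexError:
    raised = True

print(missing, raised)  # None True
```

So both statements are true at once: get never raises IndexError, and the list indexing inside the call does.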
For one thing, you're testing original within the loop. I would think it would be faster to check original once, and only loop if it's a dict. Also, is it common or rare that it's not a dict? If it's common, an if statement is generally faster; if it's rare, a try/except block is generally faster. And (not familiar with yaml) what is it if it's not a dict? Could testing for that be faster? The same applies for your index error. I would guess not creating a separate variable for num would be faster, but probably not significantly.
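One way to answer the if versus try/except question is to measure it rather than guess. The following is only a sketch: the sample dict, the key list, and the two function names are hypothetical, and the timings will depend on the machine and on how often a lookup actually fails.

```python
import timeit

# Hypothetical nested data standing in for a parsed YAML document.
data = {"settings": {"python-version": "3.6"}, "date": "2018-06-29"}
keys = ["settings", "python-version"]

def lookup_with_if(document, path):
    # Check the type before every step.
    current = document
    for key in path:
        if not isinstance(current, dict):
            break
        current = current.get(key)
    return current

def lookup_with_try(document, path):
    # Attempt every step and handle the failure afterwards.
    current = document
    for key in path:
        try:
            current = current[key]
        except (KeyError, TypeError):
            break
    return current

for func in (lookup_with_if, lookup_with_try):
    seconds = timeit.timeit(lambda: func(data, keys), number=100_000)
    print(func.__name__, round(seconds, 4))
```

Note that the two variants are not interchangeable for a missing key: the if version returns None from get, while the try version stops and returns the last dict it reached.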
(Jun-29-2018, 03:24 PM)Brennan Wrote: with seems unnecessary if my current version works just fine.
How about writing proper Python code?
You could have explained that you wanted to skip element 1, so for value in separate[2:]: instead of enumerate - and you'll eliminate try/except too.
Showing a data example is usually a good idea too.
(Jun-29-2018, 04:05 PM)volcano63 Wrote: How about writing proper Python code?
How about not being an arrogant [CENSORED BY MOD]? If you have nothing nice to say, get off of my thread. I don't need people like you spreading negativity. I started coding Python two days ago. I'm still learning.
(Jun-29-2018, 03:30 PM)ichabod801 Wrote: For one thing, you're testing original within the loop. I would think it would be faster to check original once, and only loop if it's a dict. Also, is it common or rare that it's not a dict? If it's common, an if statement is generally faster; if it's rare, a try/except block is generally faster. And (not familiar with yaml) what is it if it's not a dict? Could testing for that be faster? The same applies for your index error. I would guess not creating a separate variable for num would be faster, but probably not significantly.
Great questions. So, let me give some examples. Let's say we have a .YML with random data.
If someone uses the read function as such: read("date") then our job is pretty simple. All we need to do is return yaml.safe_load(raw).get("date")
However, let's say that I want to get python-version. I would need to do: yaml.safe_load(raw).get("settings.python-version"). Unfortunately, this does not work by default, because get treats the whole string as a single key. So, how can we get around this? I begin by splitting their input with value.split(".") and assigning the resulting list to the variable separate. We loop through the list. If the value we have reached so far is not of type dict, then we are finished and we can return. Otherwise, we must chain another get call, like such: yaml.safe_load(raw).get("settings").get("python-version")
Using your advice, I changed the try/catch to an if-statement. This now makes the code look much cleaner. So thanks. :)
# Reading data from a file.
def read(value):
    with open(file) as raw:
        reader = yaml.safe_load(raw)
    separate = value.split(".")
    original = reader.get(separate[0])
    for v, value in enumerate(separate, 1):
        if v == len(separate):
            return original
        original = original.get(separate[v])
    return original
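For comparison, here is a sketch of the same dotted-key lookup written against an already loaded dict, so it can be tried without a file. The name get_by_path, the default parameter, and the sample data are all hypothetical. Unlike the version above, it returns a default instead of failing when an intermediate value is not a dict (for example "date.year") or when a key is missing.

```python
def get_by_path(document, dotted_key, default=None):
    """Follow a dotted key such as "settings.python-version" through nested dicts."""
    current = document
    for key in dotted_key.split("."):
        # Stop as soon as the path leaves the nested dicts or a key is absent.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Hypothetical data, shaped like what yaml.safe_load might return.
sample = {"date": "2018-06-29", "settings": {"python-version": "3.6"}}

print(get_by_path(sample, "date"))                     # 2018-06-29
print(get_by_path(sample, "settings.python-version"))  # 3.6
print(get_by_path(sample, "date.year"))                # None
```

Inside read, this would be called on the result of yaml.safe_load, with the file still opened through with.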
Guys, chill out. Any more attitude or vulgarity and I am shutting this thread down.
Sorry about that ichabod801. I'm now working on my edit function. It looks like this:
# Editing data of a file.
def edit(key, value):
    with open(file) as loader:
        document = yaml.safe_load(loader)
    document[key] = value
    with open(file, "w") as writer:
        yaml.dump(document, writer, default_flow_style=False)
It works fine if I do: edit("date", "some value"). However, if I want to do something like this: edit("settings.python-version", "some value") it doesn't. Any suggestions?
How exactly does it not work? Are you getting an error? Is the output wrong (and how is it wrong)?
The function wasn't set up to edit values inside of other values. For reference, the picture I posted above is the YML file that I am manipulating/reading. If I pass in date, it would look like this: document['date'] = value. This currently works perfectly.
To edit the python version, I would need to do: document['settings']['python-version'] = value. I can easily do this by hard-coding it. However, what if I want to take key (the function argument), which would be "settings.python-version", and convert it to: document['settings']['python-version']
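One possible approach, sketched here on a plain dict: split the key, walk down through every part except the last, then assign to the last part. The helper name set_by_path and the sample data are hypothetical, and creating missing intermediate dicts with setdefault is an assumption about the desired behaviour.

```python
def set_by_path(document, dotted_key, value):
    """Assign value at a dotted key such as "settings.python-version"."""
    *parents, last = dotted_key.split(".")
    current = document
    for key in parents:
        # Assumption: create intermediate dicts that do not exist yet.
        current = current.setdefault(key, {})
    current[last] = value
    return document

# Hypothetical data, shaped like what yaml.safe_load might return.
document = {"date": "2018-06-29", "settings": {"python-version": "3.6"}}

set_by_path(document, "date", "some value")
set_by_path(document, "settings.python-version", "some value")
print(document)
# {'date': 'some value', 'settings': {'python-version': 'some value'}}
```

In edit, the call set_by_path(document, key, value) would take the place of document[key] = value, between the safe_load and the dump. If an intermediate key already holds something that is not a dict (for example "date.year" here), the final assignment raises TypeError, which may or may not be what you want.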