Python Forum
Efficiency with regard to nested conditionals or and statements
#1
My backtester program iterates down the rows of a .csv file. In order to backtest a strategy, over 99.5% of the rows correspond to unused options and should be skipped.

In order to choose the correct rows, I have a code block that is to be executed if three criteria are met: A, B, C. I'm trying to think about the most efficient way of doing this.

I have currently coded this as nested if statements:

if A:
    if B:
        if C:
            <instructions>
With this approach, it seems to me the most efficient ordering is from most to least restrictive. For example, suppose the data file has 1000 rows and five meet all three criteria (and will ultimately be processed). Consider two distributions of the criteria. In the first, 50 rows satisfy A, 100 satisfy B, and 200 satisfy C. In the second, 200 rows satisfy A, 100 satisfy B, and 50 satisfy C. With the code as shown, the first case minimizes the number of rows that reach the inner tests and is therefore the most efficient: the second line (if B) is evaluated for only 50 rows. In the second case, the second line has to evaluate 200 rows rather than 50, and more rows are likely to remain by the third line (if C). That amounts to lower efficiency.

Do you agree?

Finally, what are the implications of coding as above compared to a compound AND statement (e.g. if A and B and C)? I seem to recall running into problems with the latter because it doesn't work in Python the way I expect. I may be wrong about that.
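
A minimal sketch comparing the two forms. The tests a, b and c below are hypothetical stand-ins for A, B and C (simple divisibility checks chosen so that 50, 100 and 200 of 1000 rows pass), and the call counter exists only for illustration. Python's `and` short-circuits, so the compound form stops at the first false operand exactly as the nested ifs do; what changes the amount of work is the order of the tests.

```python
from collections import Counter

calls = Counter()  # number of times each test is evaluated

def a(row):
    calls["a"] += 1
    return row % 20 == 0  # hypothetical test, true for 50 of 1000 rows

def b(row):
    calls["b"] += 1
    return row % 10 == 0  # hypothetical test, true for 100 of 1000 rows

def c(row):
    calls["c"] += 1
    return row % 5 == 0   # hypothetical test, true for 200 of 1000 rows

def nested(row):
    if a(row):
        if b(row):
            if c(row):
                return True
    return False

def compound(row):
    # `and` stops at the first false operand, like the nested ifs
    return a(row) and b(row) and c(row)

def compound_reversed(row):
    # same logic, least restrictive test first
    return c(row) and b(row) and a(row)

def run(test, rows):
    """Return (number of matching rows, evaluations per test)."""
    calls.clear()
    matches = sum(1 for row in rows if test(row))
    return matches, dict(calls)

rows = range(1000)
nested_result = run(nested, rows)
compound_result = run(compound, rows)
reversed_result = run(compound_reversed, rows)

print(nested_result)
print(compound_result)
print(reversed_result)
```

In this sketch the nested and compound forms evaluate each test the same number of times, while the reversed order selects the same rows but runs the second and third tests more often.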
Reply
#2
Without knowing A, B and C it is difficult to answer your question. It might be possible to make a hashable result, and if you can do that, you could use a dictionary to select which function to call.

Maybe you can use a case statement. Again I cannot say without knowing more about A, B, and C.

It is also difficult to see if "and" will lead to confusion. Python's "and" is a little different. I think it is very useful, but it is not C's &&.

Several if statements can indicate that there are better ways to solve the problem.
Reply
#3
(Apr-28-2022, 02:21 PM)deanhystad Wrote: Without knowing A, B and C it is difficult to answer your question. It might be possible to make a hashable result, and if you can do that, you could use a dictionary to select which function to call.

Maybe you can use a case statement. Again I cannot say without knowing more about A, B, and C.

It is also difficult to see if "and" will lead to confusion. Python's "and" is a little different. I think it is very useful, but it is not C's &&.

Several if statements can indicate that there are better ways to solve the problem.

I'm trying to understand why the specific content of A, B, and C matters. They are logical statements to be evaluated. Maybe the complexity of the statements matters? That would make sense. That is, not only the number of rows for which A, B, and C are True but also how many elements go into the truth statements themselves?

I don't know where hashable or dictionaries enter into this.

This shows A, B, and C:

bf = open(file, "r")   
for line in bf:
    datalist = line.split(",") 
    if control_flag == 'fl':
        if int(float(datalist[2])) >= mte * 30 and int(float(datalist[2])) < (((mte + 1) * 30) + 5):
            if float(datalist[6]) % 10 == 0:
                if float(datalist[6]) > float(datalist[3]) and float(datalist[6]) % 10 == 0 and float(datalist[6]) - float(datalist[3]) < 11:
                    <instructions>
    if control_flag == 'fs':
        <...>
Looking closer, I realize B is included in C, so that's redundant. A and C are themselves AND statements. All of them must be true for execution of this branch to proceed.
Reply
#4
Python can use anything in an if statement, not just boolean values or expressions that have a boolean result. A Python "and" or "or" may result in a non-boolean result, like a list or a string. That is why the nature of A, B and C is important in knowing how they can be used.
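
A short illustration of that point, using made-up values: "and" returns the first falsy operand (or the last operand if none is falsy), and "or" returns the first truthy operand (or the last operand if none is truthy). Neither converts the result to a bool, although an if statement only looks at its truthiness.

```python
fields = []          # an empty list is falsy
label = "fallback"   # a non-empty string is truthy

result_and = fields and label   # first falsy operand: the empty list
result_or = fields or label     # first truthy operand: the string
result_num = 5 and 0.0          # 5 is truthy, so the second operand is returned

print(repr(result_and), repr(result_or), repr(result_num))

# Inside an if statement only truthiness matters, so these behave like booleans.
took_branch = False
if label and not fields:
    took_branch = True
```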

You could use the new (python 3.10) match statement
match control_flag:
    case "fl":
        ...  # more tests
    case "fs":
        ...  # more tests
If you are worried about efficiency you should not do lots of unnecessary conversions. Why int(float(datalist[2]))? You can compare float(datalist[2]) against the int mte * 30. And I don't know if Python's code generator is smart enough to avoid evaluating float(datalist[6]) three times in this expression.
if float(datalist[6]) > float(datalist[3]) and float(datalist[6]) % 10 == 0 and float(datalist[6]) - float(datalist[3]) < 11:
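
For what it's worth, CPython does not cache repeated calls such as float(datalist[6]); each occurrence is evaluated again. A minimal sketch of converting each field once and reusing the result follows. The function name is_candidate and the local names f2, f3 and f6 are hypothetical (the thread does not say what the columns mean), and the int() truncation is dropped on the assumption that the field is non-negative.

```python
def is_candidate(datalist, mte):
    """Apply the 'fl' branch tests, converting each field only once."""
    lower = mte * 30
    upper = (mte + 1) * 30 + 5

    f2 = float(datalist[2])
    if not (lower <= f2 < upper):   # cheap rejection before parsing more fields
        return False

    f3 = float(datalist[3])
    f6 = float(datalist[6])
    return f6 % 10 == 0 and f6 > f3 and f6 - f3 < 11


# Made-up rows for illustration only.
row_ok = ["x", "x", "65.0", "4001.5", "x", "x", "4010.0"]
row_not_multiple = ["x", "x", "65.0", "4001.5", "x", "x", "4015.0"]
row_out_of_range = ["x", "x", "95.0", "4001.5", "x", "x", "4010.0"]

print(is_candidate(row_ok, 2))
print(is_candidate(row_not_multiple, 2))
print(is_candidate(row_out_of_range, 2))
```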
Reply
#5
You could compute the bounds once, outside the loop:
mtelb, mteub = mte * 30, (mte + 1) * 30 + 5
then use
if mtelb <= int(float(datalist[2])) < mteub: ...
The
if float(datalist[6]) % 10 == 0
is questionable because one usually does not test floating-point values for exact equality with 0. Is it really what you want?
Reply
#6
(Apr-28-2022, 04:11 PM)deanhystad Wrote: Python can use anything in an if statement, not just boolean values or expressions that have a boolean result. A Python "and" or "or" may result in a non-boolean result, like a list or a string. That is why the nature of A, B and C is important in knowing how they can be used.

You could use the new (python 3.10) match statement
match control_flag:
    case "fl":
        ...  # more tests
    case "fs":
        ...  # more tests
If you are worried about efficiency you should not do lots of unnecessary conversions. Why int(float(datalist[2]))? You can compare float(datalist[2]) against the int mte * 30. And I don't know if Python's code generator is smart enough to avoid evaluating float(datalist[6]) three times in this expression.
if float(datalist[6]) > float(datalist[3]) and float(datalist[6]) % 10 == 0 and float(datalist[6]) - float(datalist[3]) < 11:

Thanks for the explanation.

My thought on those conversions was to eliminate the decimal portion of some of the output generated. Some fields of the .csv file couldn't be converted by int() directly, though. I therefore used int(float()).
Reply
#7
(Apr-28-2022, 04:28 PM)Gribouillis Wrote: You could compute the bounds once, outside the loop:
mtelb, mteub = mte * 30, (mte + 1) * 30 + 5
then use
if mtelb <= int(float(datalist[2])) < mteub: ...
The
if float(datalist[6]) % 10 == 0
is questionable because one usually does not test floating-point values for exact equality with 0. Is it really what you want?

Thanks for the suggestions.

In the last, I'm looking for multiples of 10. Is there a better way to evaluate that?
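
One possible way to do this, sketched under the assumption that the field is available as the original text from the .csv file: parse it with decimal.Decimal, which represents decimal text exactly, so the remainder test involves no binary rounding. The helper name is_multiple_of_10 is hypothetical. For values read straight from text that hold whole numbers, float(x) % 10 == 0 also gives the right answer, since such integers are exactly representable as floats; the exact-equality concern mainly applies to values produced by floating-point arithmetic.

```python
from decimal import Decimal

def is_multiple_of_10(text):
    """True if the decimal number written in `text` is an exact multiple of 10."""
    return Decimal(text.strip()) % 10 == 0

for sample in ("4010.0", "4010", "4012.5", "4010.01"):
    print(sample, is_multiple_of_10(sample))
```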
Reply
#8
Quote: My thought on those conversions was to eliminate the decimal portion of some of the output generated. Some fields of the .csv file couldn't be converted by int() directly, though. I therefore used int(float()).
Why do you care about the decimal portion?
mte = 2
lower_bounds = mte * 30
upper_bounds = (mte + 1) * 30 + 5
for d2 in ("59.9", "60.0", "60.1", "94.9", "95", "95.1"):
    print(d2, lower_bounds <= float(d2) < upper_bounds, lower_bounds <= int(float(d2)) < upper_bounds) 
Output:
59.9 False False
60.0 True True
60.1 True True
94.9 True True
95 False False
95.1 False False
The decimal portion has no effect on the result of the comparisons.
Reply
#9
Maybe you should use pandas or numpy. This does not look like the optimal way to do anything.
Reply
#10
(Apr-28-2022, 06:42 PM)deanhystad Wrote:
Quote: My thought on those conversions was to eliminate the decimal portion of some of the output generated. Some fields of the .csv file couldn't be converted by int() directly, though. I therefore used int(float()).
Why do you care about the decimal portion?
mte = 2
lower_bounds = mte * 30
upper_bounds = (mte + 1) * 30 + 5
for d2 in ("59.9", "60.0", "60.1", "94.9", "95", "95.1"):
    print(d2, lower_bounds <= float(d2) < upper_bounds, lower_bounds <= int(float(d2)) < upper_bounds) 
Output:
59.9 False False
60.0 True True
60.1 True True
94.9 True True
95 False False
95.1 False False
The decimal portion has no effect on the result of the comparisons.

The program generates a results file with different trade statistics and parameters. That is where I got a lot of unnecessary decimal output that I figured I could clean up by converting to int so there would be no decimals.
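
If the goal is only tidier output, one option is to keep the values as floats for the comparisons and drop the decimals when the results are written. A minimal sketch, assuming the statistics are plain numbers; the helper name format_stats and the sample values are made up. Note that the ".0f" format rounds to the nearest whole number, whereas int() truncates toward zero.

```python
def format_stats(values):
    """Join numeric values as comma-separated text with no decimal places."""
    return ",".join(f"{value:.0f}" for value in values)

stats = [61.0, 4010.0, 8.4, 12.75]   # made-up trade statistics
line = format_stats(stats)
print(line)

# int() truncates instead of rounding, so the last value differs.
truncated = ",".join(str(int(value)) for value in stats)
print(truncated)
```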
Reply

