Efficiency with regard to nested conditionals or and statements

Mark17 · Apr-28-2022, 02:01 PM

My backtester program iterates down the rows of a .csv file. In order to backtest a strategy, over 99.5% of the rows correspond to unused options and should be skipped.

In order to choose the correct rows, I have a code block that is to be executed if three criteria are met: A, B, C. I'm trying to think about the most efficient way of doing this.

I have currently coded this as nested if statements:

if A:
    if B:
        if C:
            <instructions>

With this approach, it seems to me the most efficient arrangement would be to order the tests from most to least restrictive. For example, let's say the data file has 1000 rows and five meet all three criteria (and will ultimately be processed). Consider two cases of criteria distribution (for lack of a better term). In the first, 50 rows have A, 100 rows have B, and 200 rows have C. In the second, 200 rows have A, 100 rows have B, and 50 rows have C. Coded as shown, the first case minimizes the number of rows left to evaluate and is therefore most efficient. In the second case, the second line (if B) has to evaluate 200 rows rather than 50, and likely more rows remain by the third line (if C) than in the first case. That amounts to lower efficiency.

Do you agree?

Finally, what are the implications of coding as above compared to a compound AND statement (e.g. if A and B and C)? I seem to think I've run into problems with the latter as it doesn't work in Python like I think it will. I may be wrong about that.
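On both questions (ordering, and nested if versus a compound and): Python's and short-circuits left to right, so "if A and B and C" stops at the first false test, just as the nested form does. Here is a minimal sketch that counts test evaluations for the two orderings described above. The row ranges in is_a, is_b and is_c are invented purely to reproduce the 50/100/200 counts with five rows meeting all three; they are not taken from the real data file.

```python
def make_counted(test, counter):
    """Wrap a test so that every call increments counter[0]."""
    def wrapper(row):
        counter[0] += 1
        return test(row)
    return wrapper

# Hypothetical criteria over row numbers 0..999 (assumed overlaps).
def is_a(i): return i < 50            # 50 rows
def is_b(i): return 45 <= i < 145     # 100 rows
def is_c(i): return 45 <= i < 245     # 200 rows; rows 45..49 meet all three

def run(order):
    """Return (hits, calls) for nested ifs, then (hits, calls) for `and`."""
    counter = [0]
    first, second, third = (make_counted(t, counter) for t in order)

    nested_hits = 0
    for row in range(1000):
        if first(row):
            if second(row):
                if third(row):
                    nested_hits += 1
    nested_calls = counter[0]

    counter[0] = 0
    compound_hits = 0
    for row in range(1000):
        if first(row) and second(row) and third(row):
            compound_hits += 1
    return nested_hits, nested_calls, compound_hits, counter[0]

print(run((is_a, is_b, is_c)))  # most restrictive first
print(run((is_c, is_b, is_a)))  # least restrictive first
```

Under these made-up overlaps, most-restrictive-first costs 1055 test calls and the reverse order costs 1300, and the nested and compound forms do identical work. Real savings also depend on how expensive each test is, so a cheap test can be worth running first even if it is less restrictive.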

**deanhystad** · Apr-28-2022, 02:21 PM

Without knowing A, B and C it is difficult to answer your question. It might be possible to make a hashable result, and if you can do that, you could use a dictionary to select which function to call.

Maybe you can use a case statement. Again, I cannot say without knowing more about A, B, and C.

It is also difficult to see if "and" will lead to confusion. Python's "and" is a little different. I think it is very useful, but it is not C's &&.

Several if statements can indicate that there are better ways to solve the problem.
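To illustrate the dictionary idea: if the thing being tested is a hashable value such as a string flag, a dict can map each value to the function that handles it, replacing a chain of if statements with a single lookup. This is only a sketch. The flag values "fl" and "fs" and the handler functions are hypothetical stand-ins, not anything from a real program.

```python
def handle_fl(fields):
    # Placeholder body: a real handler would run the tests for this flag.
    return ("fl", len(fields))

def handle_fs(fields):
    # Placeholder body for the second branch.
    return ("fs", len(fields))

# Map each flag value to the function that processes it.
HANDLERS = {"fl": handle_fl, "fs": handle_fs}

def process(control_flag, fields):
    """Look up the handler once instead of testing the flag repeatedly."""
    handler = HANDLERS.get(control_flag)
    if handler is None:
        raise ValueError(f"unknown control flag: {control_flag!r}")
    return handler(fields)

print(process("fl", ["a", "b", "c"]))
```

Because the flag does not change from row to row, the lookup can also be done once before the loop, so the per-row work is just the call to the selected handler.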

Mark17 · Apr-28-2022, 02:42 PM

I'm trying to think why the specific content of A, B, and C matters. They are logical statements to be evaluated. Maybe the complexity of the statements matters? That would make sense. That is, not only the number of rows for which A, B, and C are True but also how many elements go into the truth statements themselves?

I don't know where hashable or dictionaries enter into this.

This shows A, B, and C:

bf = open(file, "r")   
for line in bf:
    datalist = line.split(",") 
    if control_flag == 'fl':
        if int(float(datalist[2])) >= mte * 30 and int(float(datalist[2])) < (((mte + 1) * 30) + 5):
            if float(datalist[6]) % 10 == 0:
                if float(datalist[6]) > float(datalist[3]) and float(datalist[6]) % 10 == 0 and float(datalist[6]) - float(datalist[3]) < 11:
                    <instructions>
    if control_flag == 'fs':
        <...>

Having me look closer does make me realize B is included in C so that's redundant. A and C are also AND statements. All should be true in order for execution of this branch to proceed.

**deanhystad** · Apr-28-2022, 04:11 PM (last modified: Apr-28-2022, 04:12 PM)

Python can use anything in an if statement, not just boolean values or expressions that have a boolean result. A Python "and" or "or" may result in a non-boolean result, like a list or a string. That is why the nature of A, B and C are important in knowing how they can be used.
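A short illustration of that point: "and" and "or" return one of their operands rather than a strict True or False, and they stop evaluating as soon as the result is known. The values below are arbitrary examples chosen for the demonstration.

```python
def boom():
    # Would raise if it were ever called.
    raise RuntimeError("right-hand side was evaluated")

a = [] and "never reached"   # left operand is falsy, so it is returned as-is
b = "first" and "second"     # left operand is truthy, so the right one is returned
c = 0 or "fallback"          # `or` returns the first truthy operand
d = False and boom()         # short-circuit: boom() is never called

print(repr(a), repr(b), repr(c), repr(d))
print(bool(b))               # an if statement only looks at truthiness
```

Inside an if statement only the truthiness matters, so for tests that are plain comparisons the compound form behaves the way a C programmer would expect. The difference shows up when the result of "and" or "or" is stored or returned.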

You could use the new (Python 3.10) match statement:

match control_flag:
    case "fl":
        ...  # more tests
    case "fs":
        ...  # more tests

If you are worried about efficiency, you should not do lots of unnecessary conversions. Why int(float(datalist[2]))? You can compare float(datalist[2]) against the int mte * 30. And I don't know if Python's code generator is smart enough to avoid evaluating float(datalist[6]) three times in this expression:

if float(datalist[6]) > float(datalist[3]) and float(datalist[6]) % 10 == 0 and float(datalist[6]) - float(datalist[3]) < 11:
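As far as I know, CPython evaluates each float(...) call exactly as written and does not cache repeated calls, so converting each field once and reusing the result avoids the repeated work. Below is a sketch of the same tests with one conversion per field. The function name row_matches and the variable names f2, f3, f6 are made up for illustration. It also drops int(), which assumes field 2 is never negative and that the bounds are whole numbers.

```python
def row_matches(datalist, lower, upper):
    """Return True if a split CSV row passes the tests shown in the thread."""
    f2 = float(datalist[2])            # converted once
    if not (lower <= f2 < upper):
        return False                   # most rows are expected to stop here
    f6 = float(datalist[6])            # converted once, reused below
    if f6 % 10 != 0:
        return False
    f3 = float(datalist[3])
    return f6 > f3 and f6 - f3 < 11

mte = 2
lower, upper = mte * 30, (mte + 1) * 30 + 5   # computed once, outside the loop

sample = "x,x,60.0,95.5,x,x,100.0".split(",")
print(row_matches(sample, lower, upper))
```

Returning early keeps the later conversions from running at all for rows that fail the first test, which is the same effect the nested if statements were after.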

**Gribouillis** · Apr-28-2022, 04:28 PM

You could compute once and for all

mtelb, mteub = mte * 30, (mte + 1) * 30 + 5

then use

if mtelb <= int(float(datalist[2])) < mteub: ...

The

if float(datalist[6]) % 10 == 0

is questionable because usually one does not compare equality of floating point values with 0. Is it really what you want?

Mark17 · Apr-28-2022, 04:32 PM

Thanks for the explanation.

My thought on those conversions was to eliminate the decimal portion of some of the output generated. Some fields of the .csv file couldn't be converted by int() directly, though. I therefore used int(float()).

Mark17 · Apr-28-2022, 04:36 PM

Thanks for the suggestions.

In the last, I'm looking for multiples of 10. Is there a better way to evaluate that?
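Two possible ways to test for multiples of 10, offered as a sketch rather than a definitive answer. When the text is a whole number such as "100.0", the float is exact and % 10 == 0 works; the risk is with values that come out of arithmetic and are off by a tiny amount. The helper names below are hypothetical, and the tolerance of 1e-9 is an arbitrary assumption.

```python
import math
from decimal import Decimal

def is_multiple_of_10_text(text):
    """Exact test on the CSV text itself, with no binary rounding involved."""
    return Decimal(text.strip()) % 10 == 0

def is_multiple_of_10_float(value, tol=1e-9):
    """Tolerant test for floats that may carry a small rounding error."""
    nearest = round(value / 10) * 10
    return math.isclose(value, nearest, rel_tol=0.0, abs_tol=tol)

print(is_multiple_of_10_text("100.0"), is_multiple_of_10_text("102.5"))
print(is_multiple_of_10_float(100.0), is_multiple_of_10_float(105.0))

slightly_off = 30 + 1e-12                     # stands in for a computed value
print(slightly_off % 10 == 0)                 # plain float test rejects it
print(is_multiple_of_10_float(slightly_off))  # tolerant test accepts it
```

If the values are only ever read straight from the file and never computed, the plain float test is probably adequate; the Decimal version is slower but sidesteps the question entirely.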

**deanhystad** · Apr-28-2022, 06:42 PM

Quote:My thought on those conversions was to eliminate the decimal portion of some of the output generated. Some fields of the .csv file couldn't be converted by int() directly, though. I therefore used int(float()).

Why do you care about the decimal portion?

mte = 2
lower_bounds = mte * 30
upper_bounds = (mte + 1) * 30 + 5
for d2 in ("59.9", "60.0", "60.1", "94.9", "95", "95.1"):
    print(d2, lower_bounds <= float(d2) < upper_bounds, lower_bounds <= int(float(d2)) < upper_bounds)

Output:
59.9 False False
60.0 True True
60.1 True True
94.9 True True
95 False False
95.1 False False

The decimal portion has no effect on the result of the comparisons.

**deanhystad** · Apr-28-2022, 09:11 PM

Maybe you should use pandas or numpy. This does not look like the optimal way to do anything.

Mark17 · Apr-29-2022, 07:36 PM

The program generates a results file with different trade statistics and parameters. That is where I got a lot of unnecessary decimal output that I figured I could clean up by converting to int so there would be no decimals.
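One way to get clean output without touching the comparisons is to keep the values as floats and format them only when writing the results file. A minimal sketch; fmt_whole is a hypothetical helper name and the numbers are arbitrary.

```python
def fmt_whole(value):
    """Format a number with no decimal places (rounds, does not truncate)."""
    return f"{value:.0f}"

print(fmt_whole(60.0))        # 60
print(fmt_whole(94.9))        # 95, rounded
print(int(94.9))              # 94, truncated
print(f"{1234.5678:.2f}")     # 1234.57, two decimal places
```

Note the difference between the two approaches: formatting rounds to the nearest value, while int() discards the fractional part, so the written results can differ by one.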
