Cleaning my code to make it more efficient

**deanhystad** · (This post was last modified: Sep-27-2023, 07:48 PM by deanhystad.)

It is not really a language thing. It is a forum thing. This is a Python forum, so posters ask technical questions related to Python. The problem with this is most programs fail in the design stage, not the coding stage. I worry that many of the answers provided here fail to solve the real problem, the problem in design that is hidden behind a technical problem. snippsat's earlier response is a technically excellent solution to a problem that I don't think exists.

I have growing confidence that your approach to filtering is wrong, but I have a question about this.

elif not any ([regionas, marketas]):
    filtered_df = df[df["State"].isin(valstija)]
elif valstija and marketas:
    filtered_df = df12[df["State"].isin(valstija) & df12["Marketas"].isin(marketas)]

What is df12, and how does it differ from df? Is df12 the same as df with just the 12 filter columns? If true, does that mean that selecting state and market results in your filtered_df having fewer columns when you filter Market and State than if you only filter Market?

In an earlier post you said this: "im using examples from internet". Do you have a link to something that uses your filtering method? I am curious.

**BSDevo** · Sep-27-2023, 08:46 PM

Understood.
Yes, I provided a video link in my last post, with a timestamp where the guy uses this filtering method.
VIDEO: https://youtu.be/7yAw1nPareM?si=1se8zb_-YN6-BV-y&t=1071
Here is my actual code, so you can see why it's df12.

        # Filtrai
        st.sidebar.header("Choose your filter")
        # Create for Region
        regionas = st.sidebar.multiselect("Pick your Region", options=df.sort_values(by="Region").Region.unique())
        if not regionas:
            df2 = df.copy()
        else:
            df2 = df[df["Region"].isin(regionas)]
        # Create for Market
        marketas = st.sidebar.multiselect("Pick the Market Area", options=df2.sort_values(by="Market").Market.unique())
        if not marketas:
            df3 = df2.copy()
        else:
            df3 = df2[df2["Market"].isin(marketas)]
        # Create for State
        valstija = st.sidebar.multiselect("Pick the State",options=df3.sort_values(by="State").State.unique())
        if not valstija:
            df4 = df3.copy()
        else:
            df4 = df3[df3["State"].isin(valstija)]
        # Create for Person
        zmogus = st.sidebar.multiselect("Pick a Person", options=df4.sort_values(by="Zmogus").Zmogus.unique())
        if not zmogus:
            df5 = df4.copy()
        else:
            df5 = df4[df4["Zmogus"].isin(zmogus)]
        # Create for Thingy
        daiktas = st.sidebar.multiselect("Pick a Thing", options=df5.sort_values(by="Daiktas").Daiktas.unique())
        if not daiktas:
            df6 = df5.copy()
        else:
            df6 = df5[df5["Daiktas"].isin(daiktas)]
        # Create for Days
        dienos = st.sidebar.multiselect("Pick a Day", options=df6.sort_values(by="WeekDay").WeekDay.unique())
        if not dienos:
            df7 = df6.copy()
        else:
            df7 = df6[df6["WeekDay"].isin(dienos)]
        # Create for Months
        menesiai = st.sidebar.multiselect("Pick a Month", options=df7.sort_values(by="Month").Month.unique())
        if not menesiai:
            df8 = df7.copy()
        else:
            df8 = df7[df7["Month"].isin(menesiai)]
        # Create for Surprise
        siurprizas = st.sidebar.multiselect("Pick a Surprise", options=df8.sort_values(by="Surprise").Surprise.unique())
        if not siurprizas:
            df9 = df8.copy()
        else:
            df9 = df8[df8["Surprise"].isin(siurprizas)]
        # Create for Shop
        parduotuve = st.sidebar.multiselect("Pick a Shop", options=df9.sort_values(by="Shop").Shop.unique())
        if not parduotuve:
            df10 = df9.copy()
        else:
            df10 = df9[df9["Shop"].isin(parduotuve)]
        # Create for Family
        seima = st.sidebar.multiselect("Sort by Family", options=df10.sort_values(by="Seima").Seima.unique())
        if not seima:
            df11 = df10.copy()
        else:
            df11 = df10[df10["Seima"].isin(seima)]
        # Create for Type
        tipas = st.sidebar.multiselect("Filter by Type", options=df11.sort_values(by="Type").Type.unique())
        if not tipas:
            df12 = df11.copy()
        else:
            df12 = df11[df11["Type"].isin(tipas)]

What is df12, and how does it differ from df? Is df12 the same as df with just the 12 filter columns? If true, does that mean that selecting state and market results in your filtered_df having fewer columns when you filter Market and State than if you only filter Market?
Yes, df12 is the same as df with the 12 filter columns.
No, it filters my rows, not columns.
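
To make the rows-versus-columns point concrete, here is a minimal sketch with made-up data (the values and the "Sales" column are assumptions, not from the real dataset). Filtering with a boolean mask built by isin() drops rows only, and every column survives:

```python
import pandas as pd

# Hypothetical sample data; the real df in this thread has many more columns.
df = pd.DataFrame(
    {
        "State": ["TX", "CA", "TX"],
        "Market": ["North", "West", "South"],
        "Sales": [10, 20, 30],
    }
)

# Boolean-mask filtering keeps only the rows where the mask is True.
filtered_df = df[df["State"].isin(["TX"])]

same_columns = list(filtered_df.columns) == list(df.columns)
row_count = len(filtered_df)
print(same_columns, row_count)
```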

**deanhystad** · (This post was last modified: Sep-27-2023, 09:36 PM by deanhystad.)

I like your new approach more than the old, but it could be improved. Keeping track of all those dataframes is going to get complicated. Unless you use df10 for something other than a way to get from df9 to df11, your approach is just a longer and more resource intensive version of this:

df2 = df.copy()

st.sidebar.header("Choose your filter")
# Create for Region
regionas = st.sidebar.multiselect("Pick your Region", options=df2.sort_values(by="Region").Region.unique())
if regionas:
    df2 = df2[df2["Region"].isin(regionas)]
# Create for Market
marketas = st.sidebar.multiselect("Pick the Market Area", options=df2.sort_values(by="Market").Market.unique())
if marketas:
    df2 = df2[df2["Market"].isin(marketas)]

. . .

# Create for Type
tipas = st.sidebar.multiselect("Filter by Type", options=df2.sort_values(by="Type").Type.unique())
if tipas:
    df2 = df2[df2["Type"].isin(tipas)]

This is shorter, but it is still messy. The less you have to type, the better the code. If you see that you are going to do the same thing over and over, use the programming tools designed for repetition, like a loop. Once you get rid of all the extra dataframes, the only differences between filtering by Region and filtering by Market are the column name and the prompt in the sidebar. The rest of the code is the same, and is easily moved into a loop. In the code below, I put the column name: sidebar prompt information in a dictionary.

# These are the columns we filter by.  column name: sidebar prompt.
# Could be list of column names if the column names always matched the prompt.
filter_columns = {
    "Region": "Region",
    "Market": "Market",
    "State": "State",
    "Zmogus": "Person",
    "Daiktas": "Thing",
    "WeekDay": "Day",
    "Month": "Month",
    "Surprise": "Surprise",
    "Shop": "Shop",
    "Seima": "Family",
    "Type": "Type",
}


st.sidebar.header("Choose your filter")

# Loop through all the columns
df2 = df.copy()
for column, prompt in filter_columns.items():
    choices = st.sidebar.multiselect(f"Pick your {prompt}", options=sorted(set(df2[column].values)))
    if choices:
        df2 = df2[df2[column].isin(choices)]

To add, remove, or change a filter column, all you need to do is edit the dictionary.
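
If you want to try the loop without running Streamlit, here is a rough sketch of the same idea as a plain function. The function name apply_filters, the sample data, and the selections dictionary (standing in for what the multiselect widgets return) are all my assumptions for illustration, not part of your program:

```python
import pandas as pd


def apply_filters(df, selections):
    """Keep the rows that match every non-empty selection, one column at a time."""
    filtered = df.copy()
    for column, choices in selections.items():
        if choices:  # an empty list means "no filter on this column"
            filtered = filtered[filtered[column].isin(choices)]
    return filtered


# Hypothetical sample data, not from the original thread.
sample = pd.DataFrame(
    {
        "Region": ["East", "East", "West", "West"],
        "Market": ["A", "B", "C", "C"],
        "State": ["NY", "NJ", "CA", "OR"],
    }
)

# Stand-in for the lists the sidebar multiselect widgets would return.
selections = {"Region": ["West"], "Market": [], "State": ["CA", "OR"]}
result = apply_filters(sample, selections)
print(result)
```

Keeping the filtering in a function like this also makes it easy to check against a small dataframe before wiring it to the sidebar.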

**BSDevo** · Sep-27-2023, 10:39 PM

Your code with the dictionary is much more pleasing. I will play with it more and try to understand it, because if I choose State, for example, it filters everything that comes after State but not what is above it (Market and Region) [in sidebar selection mode].
Also, I'm trying to use fewer df[number] names. I like names such as filtered_df, as it's easier for me to track what I have done and what I have defined.
Thank you again for your time and patience.
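
The one-directional behaviour described above seems to follow from the order of the loop: each widget's options are built from the rows that survived the earlier filters only, so a later selection cannot narrow an earlier widget. A minimal sketch of that, assuming made-up data and a hypothetical helper cascade_options that mimics the loop without Streamlit:

```python
import pandas as pd

# Hypothetical sample data, not from the original thread.
sample = pd.DataFrame(
    {
        "Region": ["East", "East", "West"],
        "Market": ["A", "B", "C"],
        "State": ["NY", "NJ", "CA"],
    }
)


def cascade_options(df, columns, selections):
    """Return the options each widget would offer, in widget order."""
    options = {}
    filtered = df
    for column in columns:
        # Options come from rows that passed the EARLIER filters only.
        options[column] = sorted(set(filtered[column]))
        choices = selections.get(column, [])
        if choices:
            filtered = filtered[filtered[column].isin(choices)]
    return options


# Select only Market "A": Region (before it) keeps every option,
# while State (after it) is narrowed.
options = cascade_options(sample, ["Region", "Market", "State"], {"Market": ["A"]})
print(options)
```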
