Python Forum
Cleaning my code to make it more efficient
#11
It is not really a language thing. It is a forum thing. This is a Python forum, so posters ask technical questions related to Python. The problem with this is most programs fail in the design stage, not the coding stage. I worry that many of the answers provided here fail to solve the real problem, the problem in design that is hidden behind a technical problem. snippsat's earlier response is a technically excellent solution to a problem that I don't think exists.

I have growing confidence that your approach to filtering is wrong, but I have a question about this.
elif not any ([regionas, marketas]):
    filtered_df = df[df["State"].isin(valstija)]
elif valstija and marketas:
    filtered_df = df12[df["State"].isin(valstija) & df12["Marketas"].isin(marketas)]
What is df12, and how does it differ from df? Is df12 the same as df with just the 12 filter columns? If true, does that mean that selecting state and market results in your filtered_df having fewer columns when you filter Market and State than if you only filter Market?

In an earlier post you said, "im using examples from internet". Do you have a link to something that uses your filtering method? I am curious.
#12
(Sep-27-2023, 07:48 PM)deanhystad Wrote: It is not really a language thing. It is a forum thing. This is a Python forum, so posters ask technical questions related to Python. The problem with this is most programs fail in the design stage, not the coding stage. I worry that many of the answers provided here fail to solve the real problem, the problem in design that is hidden behind a technical problem. snippsat's earlier response is a technically excellent solution to a problem that I don't think exists.

Understood.
Yes, I provided a video link in my last post, with a timestamp where the presenter uses this filtering method.
VIDEO: https://youtu.be/7yAw1nPareM?si=1se8zb_-YN6-BV-y&t=1071
Here is my actual code, so you can see why it's df12.
        # Filters
        st.sidebar.header("Choose your filter")
        # Create for Region
        regionas = st.sidebar.multiselect("Pick your Region", options=df.sort_values(by="Region").Region.unique())
        if not regionas:
            df2 = df.copy()
        else:
            df2 = df[df["Region"].isin(regionas)]
        # Create for Market
        marketas = st.sidebar.multiselect("Pick the Market Area", options=df2.sort_values(by="Market").Market.unique())
        if not marketas:
            df3 = df2.copy()
        else:
            df3 = df2[df2["Market"].isin(marketas)]
        # Create for State
        valstija = st.sidebar.multiselect("Pick the State",options=df3.sort_values(by="State").State.unique())
        if not valstija:
            df4 = df3.copy()
        else:
            df4 = df3[df3["State"].isin(valstija)]
        # Create for Person
        zmogus = st.sidebar.multiselect("Pick a Person", options=df4.sort_values(by="Zmogus").Zmogus.unique())
        if not zmogus:
            df5 = df4.copy()
        else:
            df5 = df4[df4["Zmogus"].isin(zmogus)]
        # Create for Thingy
        daiktas = st.sidebar.multiselect("Pick a Thing", options=df5.sort_values(by="Daiktas").Daiktas.unique())
        if not daiktas:
            df6 = df5.copy()
        else:
            df6 = df5[df5["Daiktas"].isin(daiktas)]
        # Create for Days
        dienos = st.sidebar.multiselect("Pick a Day", options=df6.sort_values(by="WeekDay").WeekDay.unique())
        if not dienos:
            df7 = df6.copy()
        else:
            df7 = df6[df6["WeekDay"].isin(dienos)]
        # Create for Months
        menesiai = st.sidebar.multiselect("Pick a Month", options=df7.sort_values(by="Month").Month.unique())
        if not menesiai:
            df8 = df7.copy()
        else:
            df8 = df7[df7["Month"].isin(menesiai)]
        # Create for Surprise
        siurprizas = st.sidebar.multiselect("Pick a Surprise", options=df8.sort_values(by="Surprise").Surprise.unique())
        if not siurprizas:
            df9 = df8.copy()
        else:
            df9 = df8[df8["Surprise"].isin(siurprizas)]
        # Create for Shop
        parduotuve = st.sidebar.multiselect("Pick a Shop", options=df9.sort_values(by="Shop").Shop.unique())
        if not parduotuve:
            df10 = df9.copy()
        else:
            df10 = df9[df9["Shop"].isin(parduotuve)]
        # Create for Family
        seima = st.sidebar.multiselect("Sort by Family", options=df10.sort_values(by="Seima").Seima.unique())
        if not seima:
            df11 = df10.copy()
        else:
            df11 = df10[df10["Seima"].isin(seima)]
        # Create for Type
        tipas = st.sidebar.multiselect("Filter by Type", options=df11.sort_values(by="Type").Type.unique())
        if not tipas:
            df12 = df11.copy()
        else:
            df12 = df11[df11["Type"].isin(tipas)]
What is df12, and how does it differ from df? Is df12 the same as df with just the 12 filter columns? If true, does that mean that selecting state and market results in your filtered_df having fewer columns when you filter Market and State than if you only filter Market?
Yes, df12 is the same as df with the 12 filter columns.
No, it filters my rows, not columns.
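For anyone following along, here is a minimal sketch of that point: a boolean mask built with isin() selects rows and leaves every column in place. The toy data is made up for illustration; only the column names and the variable names valstija and marketas come from the code above. Both masks are built from the same frame here, which is an assumption that keeps them aligned row for row.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "Region": ["East", "East", "West", "West"],
        "Market": ["A", "B", "C", "C"],
        "State": ["NY", "NJ", "CA", "WA"],
        "Sales": [10, 20, 30, 40],
    }
)

valstija = ["NY", "CA"]
marketas = ["A", "C"]

# Both masks come from the same frame, so they line up row for row.
mask = df["State"].isin(valstija) & df["Market"].isin(marketas)
filtered_df = df[mask]

print(filtered_df.shape)           # fewer rows, same number of columns
print(list(filtered_df.columns))
```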
#13
I like your new approach more than the old, but it could be improved. Keeping track of all those dataframes is going to get complicated. Unless you use df10 for something other than a way to get from df9 to df11, your approach is just a longer and more resource intensive version of this:
df2 = df.copy()

st.sidebar.header("Choose your filter")
# Create for Region
regionas = st.sidebar.multiselect("Pick your Region", options=df2.sort_values(by="Region").Region.unique())
if regionas:
    df2 = df2[df2["Region"].isin(regionas)]
# Create for Market
marketas = st.sidebar.multiselect("Pick the Market Area", options=df2.sort_values(by="Market").Market.unique())
if marketas:
    df2 = df2[df2["Market"].isin(marketas)]

. . .

# Create for Type
tipas = st.sidebar.multiselect("Filter by Type", options=df2.sort_values(by="Type").Type.unique())
if tipas:
    df2 = df2[df2["Type"].isin(tipas)]
This is shorter, but it is still messy. The less you have to type, the better the code. If you see that you are going to do the same thing over and over, use program tools designed to repeat something over and over, like a loop. Once you get rid of all the extra dataframes, the only differences between filtering by Region and filtering by Market are the name of the column and the prompt in the sidebar. The rest of the code is the same, and easily moved to a loop. In the code below, I put the column name: sidebar prompt information in a dictionary.
# These are the columns we filter by.  column name: sidebar prompt.
# Could be list of column names if the column names always matched the prompt.
filter_columns = {
    "Region": "Region",
    "Market": "Market",
    "State": "State",
    "Zmogus": "Person",
    "Daiktas": "Thing",
    "WeekDay": "Day",
    "Month": "Month",
    "Surprise": "Surprise",
    "Shop": "Shop",
    "Seima": "Family",
    "Type": "Type",
}


st.sidebar.header("Choose your filter")

# Loop through all the columns
df2 = df.copy()
for column, prompt in filter_columns.items():
    choices = st.sidebar.multiselect(f"Pick your {prompt}", options=sorted(set(df2[column].values)))
    if choices:
        df2 = df2[df2[column].isin(choices)]
To add, remove, or change a filter column, all you need to do is edit the dictionary.
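The same loop can be tried without Streamlit by passing the selections in as a plain dictionary. This is only a sketch: apply_filters and the toy data are made-up names for illustration, and an empty list stands in for a multiselect where nothing was picked.

```python
import pandas as pd


def apply_filters(df, selections):
    """Return the rows of df that match every non-empty selection.

    selections maps a column name to the list of values picked for it.
    (Hypothetical helper, not part of pandas or Streamlit.)
    """
    filtered = df
    for column, choices in selections.items():
        if choices:  # an empty list means "no filter on this column"
            filtered = filtered[filtered[column].isin(choices)]
    return filtered


# Toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "Region": ["East", "East", "West", "West"],
        "Market": ["A", "B", "C", "C"],
        "State": ["NY", "NJ", "CA", "WA"],
    }
)

result = apply_filters(df, {"Region": ["West"], "Market": [], "State": ["CA", "NY"]})
print(result)
```

Keeping the filtering in a function like this also makes it easy to check the logic on a small frame before wiring it to the sidebar.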
#14
(Sep-27-2023, 09:36 PM)deanhystad Wrote: This is shorter, but it is still messy. The less you have to type, the better the code. If you see that you are going to do the same thing over and over, use program tools designed to repeat something over and over, like a loop. Once you get rid of all the extra dataframes, the only difference between filtering by Region and Market are the names of the columns and the prompt in the sidebar. The rest of the code is the same, and easily moved to a loop. In the code below, I put the column name:sidebar prompt information in a dictionary.

Your code with the dictionary is much more pleasing. I will play with it more and try to understand it, because if I choose State, for example, it filters everything that comes after State but not what is above it (Market and Region) in the sidebar selections.
Also, I'm trying to use fewer df[number] names. I prefer names like filtered_df, since it's easier for me to track what I have done and what I have defined.
Thank you again for your time and patience.
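
The behaviour described here follows from the loop: each multiselect builds its options from a frame already narrowed by the filters above it, so a later pick never reaches back up. One possible way around it, sketched here without Streamlit, is to build each column's options from the rows that match every other column's selection. options_for and the toy data are hypothetical names for illustration. Wiring this into the sidebar would also need the selections to be known before the widgets are drawn (for example by keeping them in st.session_state), which this sketch does not attempt.

```python
import pandas as pd


def options_for(df, selections, column):
    """Sorted values of `column` among rows matching all OTHER selections.

    (Hypothetical helper, not part of pandas or Streamlit.)
    """
    subset = df
    for other, choices in selections.items():
        if other != column and choices:
            subset = subset[subset[other].isin(choices)]
    return sorted(subset[column].unique())


# Toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "Region": ["East", "East", "West", "West"],
        "Market": ["A", "B", "C", "C"],
        "State": ["NY", "NJ", "CA", "WA"],
    }
)

selections = {"Region": [], "Market": [], "State": ["CA"]}

# Picking a State now narrows the Region and Market options as well,
# while the State list itself stays complete so the pick can be changed.
print(options_for(df, selections, "Region"))
print(options_for(df, selections, "Market"))
print(options_for(df, selections, "State"))
```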