Posts: 6,780
Threads: 20
Joined: Feb 2020
Sep-27-2023, 07:48 PM
(This post was last modified: Sep-27-2023, 07:48 PM by deanhystad.)
It is not really a language thing. It is a forum thing. This is a Python forum, so posters ask technical questions related to Python. The problem with this is most programs fail in the design stage, not the coding stage. I worry that many of the answers provided here fail to solve the real problem, the problem in design that is hidden behind a technical problem. snippsat's earlier response is a technically excellent solution to a problem that I don't think exists.
I have growing confidence that your approach to filtering is wrong, but I have a question about this.
elif not any([regionas, marketas]):
    filtered_df = df[df["State"].isin(valstija)]
elif valstija and marketas:
    filtered_df = df12[df["State"].isin(valstija) & df12["Marketas"].isin(marketas)]
What is df12, and how does it differ from df? Is df12 the same as df with just the 12 filter columns? If true, does that mean that selecting state and market results in your filtered_df having fewer columns when you filter Market and State than if you only filter Market?
In an earlier post you said this: "im using examples from internet". Do you have a link to something that uses your filtering method? I am curious.
Posts: 25
Threads: 7
Joined: Aug 2023
Understood.
Yes, I provided a video link in my last post, with a timestamp where the guy uses this filtering method.
VIDEO: https://youtu.be/7yAw1nPareM?si=1se8zb_-YN6-BV-y&t=1071
Here is my actual code, so you can see why it's df12.
# Filters
st.sidebar.header("Choose your filter")
# Create for Region
regionas = st.sidebar.multiselect("Pick your Region", options=df.sort_values(by="Region").Region.unique())
if not regionas:
    df2 = df.copy()
else:
    df2 = df[df["Region"].isin(regionas)]
# Create for Market
marketas = st.sidebar.multiselect("Pick the Market Area", options=df2.sort_values(by="Market").Market.unique())
if not marketas:
    df3 = df2.copy()
else:
    df3 = df2[df2["Market"].isin(marketas)]
# Create for State
valstija = st.sidebar.multiselect("Pick the State", options=df3.sort_values(by="State").State.unique())
if not valstija:
    df4 = df3.copy()
else:
    df4 = df3[df3["State"].isin(valstija)]
# Create for Person
zmogus = st.sidebar.multiselect("Pick a Person", options=df4.sort_values(by="Zmogus").Zmogus.unique())
if not zmogus:
    df5 = df4.copy()
else:
    df5 = df4[df4["Zmogus"].isin(zmogus)]
# Create for Thingy
daiktas = st.sidebar.multiselect("Pick a Thing", options=df5.sort_values(by="Daiktas").Daiktas.unique())
if not daiktas:
    df6 = df5.copy()
else:
    df6 = df5[df5["Daiktas"].isin(daiktas)]
# Create for Days
dienos = st.sidebar.multiselect("Pick a Day", options=df6.sort_values(by="WeekDay").WeekDay.unique())
if not dienos:
    df7 = df6.copy()
else:
    df7 = df6[df6["WeekDay"].isin(dienos)]
# Create for Months
menesiai = st.sidebar.multiselect("Pick a Month", options=df7.sort_values(by="Month").Month.unique())
if not menesiai:
    df8 = df7.copy()
else:
    df8 = df7[df7["Month"].isin(menesiai)]
# Create for Surprise
siurprizas = st.sidebar.multiselect("Pick a Surprise", options=df8.sort_values(by="Surprise").Surprise.unique())
if not siurprizas:
    df9 = df8.copy()
else:
    # Filter on the same column and selection used by the widget above
    df9 = df8[df8["Surprise"].isin(siurprizas)]
# Create for Shop
parduotuve = st.sidebar.multiselect("Pick a Shop", options=df9.sort_values(by="Shop").Shop.unique())
if not parduotuve:
    df10 = df9.copy()
else:
    df10 = df9[df9["Shop"].isin(parduotuve)]
# Create for Family
seima = st.sidebar.multiselect("Sort by Family", options=df10.sort_values(by="Seima").Seima.unique())
if not seima:
    df11 = df10.copy()
else:
    df11 = df10[df10["Seima"].isin(seima)]
# Create for Type
tipas = st.sidebar.multiselect("Filter by Type", options=df11.sort_values(by="Type").Type.unique())
if not tipas:
    df12 = df11.copy()
else:
    df12 = df11[df11["Type"].isin(tipas)]
deanhystad Wrote: What is df12, and how does it differ from df? Is df12 the same as df with just the 12 filter columns? If true, does that mean that selecting state and market results in your filtered_df having fewer columns when you filter Market and State than if you only filter Market?
Yes, df12 is the same as df with the 12 filter columns.
No, it filters my rows, not my columns.
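For anyone following along, the rows-versus-columns point can be checked with a small sketch. The sample data below is made up for illustration and is not the poster's dataset; the only assumption is that the real df is an ordinary pandas DataFrame.

```python
import pandas as pd

# Hypothetical sample data; the real df comes from the poster's own file.
df = pd.DataFrame({
    "Region": ["East", "East", "West"],
    "State": ["NY", "NJ", "CA"],
    "Sales": [10, 20, 30],
})

# isin() returns one True/False value per row.
mask = df["State"].isin(["NY", "CA"])

# Indexing with the mask keeps the rows where it is True.
filtered_df = df[mask]

# The row count shrinks; the column list is unchanged.
print(filtered_df.shape)
print(list(filtered_df.columns) == list(df.columns))
```

However many filters are chained this way, the result should have the same columns as df and only fewer rows.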
Posts: 6,780
Threads: 20
Joined: Feb 2020
Sep-27-2023, 09:36 PM
(This post was last modified: Sep-27-2023, 09:36 PM by deanhystad.)
I like your new approach more than the old, but it could be improved. Keeping track of all those dataframes is going to get complicated. Unless you use df10 for something other than a way to get from df9 to df11, your approach is just a longer and more resource-intensive version of this:
df2 = df.copy()
st.sidebar.header("Choose your filter")
# Create for Region
regionas = st.sidebar.multiselect("Pick your Region", options=df2.sort_values(by="Region").Region.unique())
if regionas:
    df2 = df2[df2["Region"].isin(regionas)]
# Create for Market
marketas = st.sidebar.multiselect("Pick the Market Area", options=df2.sort_values(by="Market").Market.unique())
if marketas:
    df2 = df2[df2["Market"].isin(marketas)]
. . .
# Create for Type
tipas = st.sidebar.multiselect("Filter by Type", options=df2.sort_values(by="Type").Type.unique())
if tipas:
    df2 = df2[df2["Type"].isin(tipas)]
This is shorter, but it is still messy. The less you have to type, the better the code. If you see that you are going to do the same thing over and over, use program tools designed to repeat something over and over, like a loop. Once you get rid of all the extra dataframes, the only differences between filtering by Region and filtering by Market are the name of the column and the prompt in the sidebar. The rest of the code is the same, and easily moved to a loop. In the code below, I put the column name:sidebar prompt information in a dictionary.
# These are the columns we filter by. column name: sidebar prompt.
# Could be list of column names if the column names always matched the prompt.
# Keys must match the dataframe's column names exactly (case matters).
filter_columns = {
    "Region": "Region",
    "Market": "Market",
    "State": "State",
    "Zmogus": "Person",
    "Daiktas": "Thing",
    "WeekDay": "Day",
    "Month": "Month",
    "Surprise": "Surprise",
    "Shop": "Shop",
    "Seima": "Family",
    "Type": "Type",
}
st.sidebar.header("Choose your filter")
# Loop through all the columns
df2 = df.copy()
for column, prompt in filter_columns.items():
    choices = st.sidebar.multiselect(f"Pick your {prompt}", options=sorted(set(df2[column].values)))
    if choices:
        df2 = df2[df2[column].isin(choices)]
To add, remove, or change a filter column, all you need to do is edit the dictionary.
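To try the loop without running Streamlit, here is a minimal sketch of the same idea as a plain function. The name apply_filters, the selections dictionary, and the sample data are all hypothetical; in the real app the choices would come from the st.sidebar.multiselect calls.

```python
import pandas as pd


def apply_filters(df, selections):
    """Narrow df one column at a time; an empty selection leaves it unchanged."""
    filtered_df = df
    for column, choices in selections.items():
        if choices:
            filtered_df = filtered_df[filtered_df[column].isin(choices)]
    return filtered_df


# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Market": ["A", "B", "A", "C"],
    "State": ["NY", "NJ", "CA", "WA"],
})

# Region and State are filtered; Market is left empty, so it is skipped.
result = apply_filters(df, {"Region": ["West"], "Market": [], "State": ["CA", "WA"]})
print(result)
```

Keeping the filtering in a function like this makes it easy to test with a small dataframe before wiring it to the sidebar.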
Posts: 25
Threads: 7
Joined: Aug 2023
Your code with the dictionary is much nicer. I will play with it more and try to understand it, because if I choose State, for example, it filters everything that comes after State in the sidebar, but not what is above it (Market and Region).
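One possible way to get that behaviour, sketched here under assumptions rather than taken from the thread: build the options for each column from the rows that match every other selection, instead of only the selections above it. The function options_for and the sample data are hypothetical. In Streamlit the current selections would have to be known before the widgets are drawn, for example by giving each widget a key and reading st.session_state; that wiring is not shown.

```python
import pandas as pd


def options_for(df, column, selections):
    """Sorted values of `column` among rows matching every *other* selection."""
    subset = df
    for other, choices in selections.items():
        # Skip the column itself so its own choices do not hide its options.
        if other != column and choices:
            subset = subset[subset[other].isin(choices)]
    return sorted(subset[column].unique())


# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Market": ["A", "B", "A", "C"],
    "State": ["NY", "NJ", "CA", "WA"],
})

# Only State is selected, yet Region and Market options shrink as well.
selections = {"Region": [], "Market": [], "State": ["CA"]}
region_options = options_for(df, "Region", selections)
market_options = options_for(df, "Market", selections)
state_options = options_for(df, "State", selections)
print(region_options, market_options, state_options)
```

The State list itself stays complete, so a second state can still be added to the selection.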
Also, I'm trying to use fewer df[number] names. I like naming things filtered_df and so on, as it's easier for me to track what I have done and what I have defined.
Thank you again for your time and patience.