Python Forum
Cleaning my code to make it more efficient
I'm not asking for a description of the code. I want to know what the program does and what it is going to be used for. I am more interested in why you need to filter the dataframe than in how you filter it.

This part of your description is useful:
Quote:I have csv file containing various data and it contains many columns, but i choose 12 columns to filter what i need.
So i have some pie charts, line bars, histograms, bar charts and thats how i analyze my data also.
What is this filtering meant to accomplish? How do the filters relate to the charts? Is that related to this?
Quote:Why im saying like this: i have Market column - market column contains 152 unique areas ( Atlanta, Houston, Los Angeles etc .. some major us cities ) so to write for each - i dont think its worth it as it would be 152 extra lines of code .... and to use my logic would be way much less but longer lines as i choose only Column MARKET instead of each from column MARKET or im completely wrong
I am still stymied by this:
Quote:and to use my logic would be way much less but longer lines as i choose only Column MARKET instead of each from column MARKET or im completely wrong ?
Read that last quoted line. Does it make sense to you? What am I missing that prevents it from making sense to me? Before pressing the "post" button, read the post as if you had no knowledge of the problem other than the contents of the post. Your audience knows a great deal about Python and nothing at all about your program.

These are equivalent. Your approach:
if not any([regionas, valstija, marketas]):
    # Nothing selected: no filtering.
    filtered_df = df
elif not any([valstija, regionas]):
    filtered_df = df[df["Market"].isin(marketas)]
elif not any([valstija, marketas]):
    filtered_df = df[df["Region"].isin(regionas)]
elif not any([regionas, marketas]):
    filtered_df = df[df["State"].isin(valstija)]
# From here on at least two selections are non-empty.
elif not marketas:
    filtered_df = df[df["State"].isin(valstija) & df["Region"].isin(regionas)]
elif not regionas:
    filtered_df = df[df["State"].isin(valstija) & df["Market"].isin(marketas)]
elif not valstija:
    filtered_df = df[df["Region"].isin(regionas) & df["Market"].isin(marketas)]
else:
    filtered_df = df[df["Market"].isin(marketas) & df["Region"].isin(regionas) & df["State"].isin(valstija)]
Using if statements:
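filtered_df = df.copy()
# Each filter narrows the result of the previous one, so they combine.
if marketas:
    filtered_df = filtered_df[filtered_df["Market"].isin(marketas)]
if regionas:
    filtered_df = filtered_df[filtered_df["Region"].isin(regionas)]
if valstija:
    filtered_df = filtered_df[filtered_df["State"].isin(valstija)]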
Using a dictionary and a for loop.
filters = {"Market": marketas, "Region": regionas, "State": valstija}
filtered_df = df.copy()
for column, values in filters.items():
    if values:
        filtered_df = filtered_df[filtered_df[column].isin(values)]
Just using a for loop.
filtered_df = df.copy()
for column, values in zip(("Market", "Region", "State"), (marketas, regionas, valstija)):
    if values:
        filtered_df = filtered_df[filtered_df[column].isin(values)]
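As a quick check that the loop gives the same rows as one combined condition, here is a minimal sketch. The sample dataframe, its values, and the helper name apply_filters are made up for illustration; your real data comes from the CSV file.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV contents.
df = pd.DataFrame({
    "Market": ["Atlanta", "Houston", "Los Angeles", "Atlanta"],
    "Region": ["South", "South", "West", "South"],
    "State": ["GA", "TX", "CA", "GA"],
    "Sales": [10, 20, 30, 40],
})


def apply_filters(df, filters):
    """Keep the rows that match every non-empty selection in filters."""
    filtered_df = df.copy()
    for column, values in filters.items():
        if values:  # an empty selection means "do not filter on this column"
            filtered_df = filtered_df[filtered_df[column].isin(values)]
    return filtered_df


# Market and Region selected, State left empty.
looped = apply_filters(
    df, {"Market": ["Atlanta", "Houston"], "Region": ["South"], "State": []}
)

# The same selection written as a single combined condition.
combined = df[df["Market"].isin(["Atlanta", "Houston"]) & df["Region"].isin(["South"])]

print(looped.equals(combined))
```

Adding a fourth filter is then one more entry in the dictionary rather than another set of branches.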
I like the dictionary approach myself as it makes it really easy to add new filters to your code. Referring back to your initial post:
import streamlit as st

filters = {}

def add_filter(df, column):
    """Add filter to filters dictionary.  Return dataframe with all filters applied."""
    options = sorted(set(df[column].values))
    filters[column] = st.sidebar.multiselect(f"Pick your {column}", options)

    filtered_df = df.copy()
    for name, values in filters.items():
        # An empty selection means "no filter" for that column.
        if values:
            filtered_df = filtered_df[filtered_df[name].isin(values)]
    return filtered_df

df2 = add_filter(df, "Regionas")
That may look like a lot of extra code, but to add Marketas all you add is:
df2 = add_filter(df, "Marketas")
So for 12 filters, all you need is one function and 12 lines of code that call that function.
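If even 12 calls feel repetitive, they can go in a loop over the column names. This is only a sketch: the choose argument is a hypothetical stand-in for st.sidebar.multiselect so the example runs without Streamlit, and the sample dataframe and column names are made up.

```python
import pandas as pd

filters = {}


def add_filter(df, column, choose):
    """Record the selection for column. Return df with all filters applied.

    choose is a hypothetical stand-in for st.sidebar.multiselect: it takes a
    label and a list of options and returns the selected values.
    """
    options = sorted(set(df[column].values))
    filters[column] = choose(f"Pick your {column}", options)

    filtered_df = df.copy()
    for name, values in filters.items():
        if values:  # skip columns with nothing selected
            filtered_df = filtered_df[filtered_df[name].isin(values)]
    return filtered_df


def choose_first(label, options):
    """Pretend the user picked the first option in every widget."""
    return options[:1]


# Hypothetical sample data.
df = pd.DataFrame({
    "Region": ["South", "West", "South"],
    "Market": ["Atlanta", "Los Angeles", "Houston"],
})

# One loop replaces one add_filter call per column.
filtered_df = df
for column in ["Region", "Market"]:
    filtered_df = add_filter(df, column, choose_first)

print(filtered_df)
```

Only the dataframe returned by the last call matters, since every call applies all the filters recorded so far.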

If you are worried that updating filtered_df over and over is inefficient, you can build one boolean mask and index the dataframe a single time:
def add_filter(df, column):
    """Add filter to filters dictionary.  Return dataframe with all filters applied."""

    options = sorted(set(df[column].values))
    filters[column] = st.sidebar.multiselect(f"Pick your {column}", options)

    # Start with every row selected; share df's index so the masks align.
    selected = pd.Series(True, index=df.index)
    for name, values in filters.items():
        if values:
            selected &= df[name].isin(values)
    return df[selected].copy()
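Here is a minimal sketch of the mask approach on its own, without the Streamlit widget. The function name select_rows and the sample data are made up for illustration. The dataframe deliberately has a non-default index, because a mask built without index=df.index would not line up with such a dataframe.

```python
import pandas as pd


def select_rows(df, filters):
    """Combine all non-empty selections into one boolean mask, then index once."""
    selected = pd.Series(True, index=df.index)
    for column, values in filters.items():
        if values:  # an empty selection leaves the mask unchanged
            selected &= df[column].isin(values)
    return df[selected].copy()


# Hypothetical sample data with a non-default index.
df = pd.DataFrame(
    {"Market": ["Atlanta", "Houston", "Atlanta"], "State": ["GA", "TX", "GA"]},
    index=[10, 20, 30],
)

result = select_rows(df, {"Market": ["Atlanta"], "State": ["GA"]})
print(result)
```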
Messages In This Thread
RE: Cleaning my code to make it more efficient - by deanhystad - Sep-27-2023, 06:06 PM