Posts: 25
Threads: 7
Joined: Aug 2023
Hi, I'm new to Pandas, Python and all this awesomeness!
I'm learning the hard way...
There are a few things I'm confused about.
I'm trying to sort my data and visualize it with graphs, separated by year.
df_mc = pd.DataFrame(df.groupby(['PMarket', 'PuYear']).size().reset_index())
df_mc.columns = ['PMarket', 'PuYear', 'Count']
fig1 = px.bar(df_mc, x="PMarket", y="Count", color='PMarket', range_y=[0,100])
But it shows me the total of all years combined, and the selector won't work. Is that due to my groupby?
The Count column is created by the grouping. My range would start from 2000 with no end date, and I want to do this for each month, but I can't find how to filter on an exact month either.
So in the end it should be: the January tab shows all data from January, filtered by year with the selector; the next tab is February, the same as January but for February, and so on. (I have the tabs already.)
Also, I would like to rename the fields inside the graph and be able to choose a year. I have a selector but it does nothing: my graph does not change, but if you look at THIS IMAGE, it separates my bar with a dotted line for each year.
Also, would it be safe to combine week 52 with week 53, and if yes, how? As far as I understand, either week 52 is part of week 53, or week 53 is part of week 52 and week 1?
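On the week 52/53 question: the '%V' format code gives the ISO 8601 week number, and ISO weeks belong to an ISO year that can differ from the calendar year around New Year. Week 53 is its own week that exists only in some years, so folding it into week 52 would mix two different weeks. A minimal sketch using only the standard library (the helper name iso_year_week is made up for illustration):

```python
from datetime import date


def iso_year_week(d: date) -> tuple:
    """Return (ISO year, ISO week) for a calendar date."""
    iso = d.isocalendar()
    return iso[0], iso[1]


# Dates around New Year, where calendar year and ISO year can disagree.
samples = [
    date(2019, 12, 30),  # calendar 2019, but already ISO week 1 of 2020
    date(2020, 12, 31),  # ISO week 53 of 2020
    date(2021, 1, 1),    # calendar 2021, but still ISO week 53 of 2020
    date(2021, 1, 4),    # first Monday of 2021: ISO week 1 of 2021
]
for d in samples:
    print(d, iso_year_week(d))
```

In pandas, df['PDate'].dt.isocalendar() returns matching year, week and day columns, so grouping by ISO year together with ISO week keeps week 53 and week 1 apart instead of pairing an ISO week with the wrong calendar year.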
I'm also looking for info on how to rename things inside charts, as px.bar shows the column names.
I also want to select a market and show it on the graph across all years. Say it shows all available years from the earliest date in my data frame, and I select the area I want to look at to see the difference between each year.
Finally, add the percentage difference from the earliest possible year. Say I got a game CD on 2000-1-1; the next title in the franchise came out in 20010 and I got a CD again, but at a different price. I want to use my first CD as 100%, and the second CD's price would show me the difference in percent: was it cheaper or more expensive?
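The percentage idea can be sketched with plain pandas arithmetic: sort by year, take the earliest row as the 100% baseline, and divide every price by it. The column name Price and the sample values are made up for illustration; only PYear comes from the post.

```python
import pandas as pd

# Made-up sample: one purchase per year (hypothetical "Price" column).
prices = pd.DataFrame({
    "PYear": [2010, 2000, 2015],
    "Price": [25.0, 20.0, 15.0],
})

# Sort so the earliest year is the first row, then use it as the baseline.
prices = prices.sort_values("PYear").reset_index(drop=True)
base_price = prices.loc[0, "Price"]

# Price as a percentage of the first purchase (first row is 100).
prices["PctOfFirst"] = prices["Price"] / base_price * 100
# Signed difference: positive means more expensive than the first purchase.
prices["PctChange"] = prices["PctOfFirst"] - 100

print(prices)
```

With several markets or products, the same calculation could be applied per group, for example by dividing by groupby(...)['Price'].transform('first') after sorting.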
I do all of this inside streamlit.
Thank You.
Posts: 7,326
Threads: 123
Joined: Sep 2016
Sep-07-2023, 09:55 AM
(This post was last modified: Sep-07-2023, 09:55 AM by snippsat.)
You should post a working sample DataFrame.
You do a lot of explaining for four lines of code, but it is hard to make any sense of it without an example.
Same problem as in your other thread.
(Sep-07-2023, 09:55 AM)snippsat Wrote: You should post a working sample DataFrame.
You do a lot of explaining for four lines of code, but it is hard to make any sense of it without an example.
Same problem as in your other thread.
Understood. As I'm a beginner and barely use forums, I thought this would be enough.
My code:
Reading data from the uploaded file:
@st.cache_data
def load_data(path: str):
    df = pd.read_csv(path, converters={'PZip': str, 'PWeek': str})  # IMPORTANT: needed to read the data
    df = df.drop_duplicates()
    return df
df = load_data(uploaded_file)
Date column separation into Day, Week, Month, Year:
df['PDate'] = pd.to_datetime(df['PDate'])  # skip if your Date column is already in datetime format
df.insert(7, "PDay", "PDate")
df['PDay'] = df['PDate'].dt.day_name()
df.insert(7, "PMonth", "PDate")
df['PMonth'] = df['PDate'].dt.month_name()
df.insert(7, "PYear", "PDate")
df['PYear'] = df['PDate'].dt.year
df['PWeek'] = df['PDate'].dt.strftime('%V')  # Important to get the week of the year, but at the moment it gives 53 weeks instead of 52
Section with the chart and Year selection:
with features:
    df.column = ['Shop', 'PMarket', 'Receiver', 'Buyer', 'PDay', 'PYear', 'PMonth', 'PState', 'PWeek', 'Status']  # Buyer: me, mom, sister, etc.; Receiver: who received the acquired purchase
    st.write(df)
    January, February, March, April, May, June, July, August, September, October, November, December = st.tabs(["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"])
    with January:
        market_options = df['PMarket'].unique().tolist()
        market_date_options = df['PDate'].unique().tolist()
        market_date = st.selectbox('Choose Year', market_date_options, 3)  # only 1, 2, 3 work; don't know what it is, left for later inspection
        market_list = st.multiselect('Choose market area', market_options, ['Atlanta'])
        df = df[df['PMarket'].isin(market_list)]
        df_mc = pd.DataFrame(df.groupby(['PMarket', 'PDate']).size().reset_index())
        df_mc.columns = ['PMarket', 'PDate', 'Count']
        fig1 = px.bar(df_mc, x="PMarket", y="Count", color='PMarket', range_y=[0,30], text_auto=True)
        fig1.update_layout(width=1000)
        st.write(fig1)
I know I'm missing the month tabs; I just don't know where to put them so that it counts by month. My main goal is to choose a year and see the results from the selected year by month in each tab. At the moment it counts all years and the year selector does not do anything.
I posted only the January tab, as the other tabs are empty. I think the rest of the months should use the same code, just changed for the specific month?
I hope this helps you understand my code and helps me solve it.
Thank You.
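One possible reason the year selector has no effect in the code above is that its value is never used in a filter, and its options are built from PDate rather than PYear. A minimal sketch of the filtering step on its own, assuming the PMarket, PYear and PMonth columns from the post; the helper name count_by_market and the sample rows are made up for illustration:

```python
import pandas as pd


def count_by_market(df: pd.DataFrame, year: int, month: str) -> pd.DataFrame:
    """Count rows per market for one year and one month name."""
    mask = (df["PYear"] == year) & (df["PMonth"] == month)
    return (
        df.loc[mask]
        .groupby("PMarket")
        .size()
        .reset_index(name="Count")
    )


# Made-up sample data with the derived columns from the post.
df = pd.DataFrame({
    "PMarket": ["Atlanta", "Atlanta", "Boston", "Atlanta", "Boston"],
    "PDate": pd.to_datetime(
        ["2021-01-05", "2021-01-20", "2021-01-07", "2022-01-03", "2021-02-11"]
    ),
})
df["PYear"] = df["PDate"].dt.year
df["PMonth"] = df["PDate"].dt.month_name()

# The filtered result goes into a new variable; df itself stays unfiltered.
jan_2021 = count_by_market(df, 2021, "January")
print(jan_2021)
```

In the Streamlit app, the year options could come from sorted(df['PYear'].unique()) and each tab could call the helper with its own month name. Keeping the result in a separate variable instead of overwriting df avoids one tab's filter carrying over into the next tab.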
Sep-08-2023, 09:18 AM
(This post was last modified: Sep-08-2023, 09:18 AM by BSDevo.)
I sorted out how to choose by year, as I had missed one option.
df = df[df['PMarket'].isin(market_list)]
df = df[df['PYear']==market_date]  # <--- this one allows me to choose the year from the select list
df_mc.columns = ['PMarket', 'PDate', 'Count']
df_mc.columns = ['PMarket', 'Count']  # <--- to show on my chart
But now I realised all of this could be done using the date picker in Streamlit, as I'm having a hard time converting month, day and week to datetime.
with features:
    market_options = df['PMarket'].unique().tolist()
    min_date = pd.to_datetime(df['PuDate'], errors='coerce')  # PDate renamed to PuDate (purchase date)
    max_date = pd.to_datetime(df['PuDate'], errors='coerce')
    value=(min(df['PuDate']), max(df['PuDate'])),
    market_date = st.date_input(
        "Date picker",
        min_value=min(df['PuDate']),
        max_value=max(df['PuDate']),
        value=(min(df['PuDate']), max(df['PuDate'])),
        format="YYYY/MM/DD"
    )
    market_list = st.multiselect('Choose market area', market_options, ['Atlanta'])
    df = df[df['PMarket'].isin(market_list)]
    df = df[df['PuDate']==market_date]
    df_mc = df.groupby(df['PMarket'])['PuDate'].count().reset_index()
    df_mc.columns = ['PMarket', 'Count']
    fig1 = px.bar(df_mc, x="PMarket", y="Count", color='PMarket', range_y=[0,30], text_auto=True)
    fig1.update_layout(width=1000)
    st.write(fig1)
But I'm getting an error.
Error: ValueError: Lengths must match
Traceback:
File "/home/evo/koala/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "/home/evo/koala/koala.py", line 183, in <module>
df = df[df['PuDate']==market_date]
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/ops/common.py", line 81, in new_method
return method(self, other)
^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/arraylike.py", line 40, in __eq__
return self._cmp_method(other, operator.eq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/series.py", line 6096, in _cmp_method
res_values = ops.comparison_op(lvalues, rvalues, op)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/ops/array_ops.py", line 279, in comparison_op
res_values = op(lvalues, rvalues)
^^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/ops/common.py", line 81, in new_method
return method(self, other)
^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/arraylike.py", line 40, in __eq__
return self._cmp_method(other, operator.eq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py", line 935, in _cmp_method
other = self._validate_comparison_value(other)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/evo/koala/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py", line 574, in _validate_comparison_value
raise ValueError("Lengths must match")
I'm trying to apply my previous code.
Posts: 6,827
Threads: 20
Joined: Feb 2020
Sep-08-2023, 12:55 PM
(This post was last modified: Sep-08-2023, 12:55 PM by deanhystad.)
From the documentation: https://docs.streamlit.io/library/api-re...date_input
value (datetime.date or datetime.datetime or list/tuple of datetime.date or datetime.datetime or None)
The value of this widget when it first renders. If a list/tuple with 0 to 2 date/datetime values is provided, the datepicker will allow users to provide a range. Defaults to today as a single-date picker.
st.date_input returns (datetime.date or a tuple with 0-2 dates)
I think the format of the return value is set by the format of the value argument. You provided a tuple as the value argument, you should expect a tuple as the return type. I think the return type might also be an empty tuple, so you need to account for that possibility.
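Following on from that: the "Lengths must match" error is consistent with comparing a whole column with == against a tuple of two dates. A hedged sketch of one way to handle it, assuming the picker returns either a single date or a tuple of 0 to 2 dates as the documentation excerpt says. The helper names as_date_range and filter_by_range and the sample rows are made up for illustration, and the picked tuple stands in for the widget's return value:

```python
from datetime import date

import pandas as pd


def as_date_range(picked, fallback):
    """Normalize a single date or a 0-2 item tuple to a (start, end) pair."""
    if isinstance(picked, date):
        return picked, picked
    picked = tuple(picked)
    if len(picked) == 2:
        return picked[0], picked[1]
    if len(picked) == 1:
        # Only the start has been chosen so far: treat it as a one-day range.
        return picked[0], picked[0]
    return fallback  # empty tuple: nothing chosen yet


def filter_by_range(df, column, start, end):
    """Keep rows whose datetime column lies between start and end, inclusive."""
    return df[df[column].between(pd.Timestamp(start), pd.Timestamp(end))]


# Made-up sample data; PuDate is the purchase date column from the post.
df = pd.DataFrame({
    "PMarket": ["Atlanta", "Atlanta", "Boston"],
    "PuDate": pd.to_datetime(["2021-01-05", "2021-03-10", "2022-01-03"]),
})
full_range = (df["PuDate"].min().date(), df["PuDate"].max().date())

picked = (date(2021, 1, 1), date(2021, 12, 31))  # stand-in for st.date_input(...)
start, end = as_date_range(picked, full_range)
filtered = filter_by_range(df, "PuDate", start, end)
print(filtered)
```

The end of the range is converted to midnight of that day, so if PuDate carried times of day, rows later on the end date would be excluded; the sample data uses dates only.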