Python Forum

Should you include uneven data?
Hi guys,

I am trying to keep this post short, because it's a very general and simple question. I have a dataset which includes the date as a feature. However, approx. 80% of the data falls in the 4th quarter of the year, i.e. in the colder months.
Since the dependent variable is connected to social media, I expect a difference between warmer and colder months, due to increased social media activity when the weather is bad. However, since the data is so one-sided towards the last quarter of the year, I don't know whether I should include the date at all. My worry is that including it will bias the regression. That is just a feeling, though, and I am really not sure. I appreciate any help!
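
To make the imbalance concrete, here is a minimal sketch of how the spread of dates per quarter could be checked, and how a simple cold-season indicator could be derived instead of using the raw date. The function names, the sample dates, and the choice of October to March as "cold" months are all assumptions for illustration, not taken from my actual dataset.

```python
from collections import Counter
from datetime import date


def quarter_shares(dates):
    """Return the fraction of observations falling in each quarter (1-4)."""
    if not dates:
        raise ValueError("dates must not be empty")
    # Map month 1-12 to quarter 1-4, then count observations per quarter.
    counts = Counter((d.month - 1) // 3 + 1 for d in dates)
    total = len(dates)
    return {q: counts.get(q, 0) / total for q in range(1, 5)}


def cold_season_flag(d, cold_months=(10, 11, 12, 1, 2, 3)):
    """Return 1 if the date is in a 'cold' month, else 0.

    The default month set is an assumption; adjust it to the local climate.
    """
    return 1 if d.month in cold_months else 0


# Hypothetical sample: 8 of 10 dates in Q4, mirroring the roughly 80% in the post.
sample_dates = [
    date(2019, 10, 5), date(2019, 10, 20), date(2019, 11, 3), date(2019, 11, 15),
    date(2019, 11, 28), date(2019, 12, 1), date(2019, 12, 12), date(2019, 12, 24),
    date(2019, 4, 9), date(2019, 7, 17),
]

shares = quarter_shares(sample_dates)
flags = [cold_season_flag(d) for d in sample_dates]

print(shares)  # share of observations per quarter
print(flags)   # one 0/1 cold-season indicator per observation
```

The idea would be to look at the per-quarter shares first, and then fit the regression once with and once without the indicator to see how much the other coefficients move. With so few warm-month observations, I would expect the indicator's own coefficient to be estimated imprecisely.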