May-24-2024, 12:06 AM
Hey everyone.
I am working on a personal project that could change the face of the restaurant industry.
Let’s keep it simple. The dataset has 63k rows and 7 columns: 6 features that matter for my target, plus the target itself, which has two values (show or no-show). I want to build a model that predicts whether a person will show up at a restaurant given some characteristics (type of guest, party size, visits completed, day, hour, month). However, I have 53k rows for reservations marked “Done” against only 6k rows for no-shows. I built a random forest and a regression model, and both give me terrible results. Why? How should I deal with that? I have something big, but my model… Any help would be appreciated!
I can forward you the beginning of my dataset. The values are encoded like this: Day 1 = Monday; Hours 2 = between 6 and 7; Month 3 = March; Type of client 3 = Member; Visits completed = 4; Size = 5, meaning 5 people at the table.
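To put the imbalance in numbers, here is a minimal sketch. It assumes the class counts are exactly 53,000 and 6,000 (the rounded figures above), and the variable names are only illustrative. It shows what a model that always predicts "show" would score, and the "balanced" class weights that correspond to those counts.

```python
# Approximate class counts from the post (rounded, so treat them as assumptions).
n_show = 53_000
n_no_show = 6_000
n_total = n_show + n_no_show

# A trivial model that always predicts "show" is right on every show row...
baseline_accuracy = n_show / n_total
# ...but it never flags a single no-show, so recall on that class is zero.
baseline_no_show_recall = 0 / n_no_show

# "Balanced" class weights: n_total / (n_classes * class_count).
# This is the same heuristic scikit-learn uses for class_weight="balanced".
n_classes = 2
weights = {
    "show": n_total / (n_classes * n_show),
    "no_show": n_total / (n_classes * n_no_show),
}

print(f"always-show accuracy: {baseline_accuracy:.3f}")
print(f"always-show no-show recall: {baseline_no_show_recall:.3f}")
print(f"balanced weights: show={weights['show']:.3f}, no_show={weights['no_show']:.3f}")
```

With counts like these, an accuracy close to 90% may mean the model is doing no better than always guessing "show", so recall and precision on the no-show class are probably the more informative numbers to look at.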