I have a table in a database with about 1600 rows containing an ingredient name column with another column with all the recipe ids that ingredient is in. This second column also includes data about the number of ingredients in that recipe and a number representing the recipe popularity. So for example ;
butter <r>3897,10,200</r><r>15600,12,90</r><r>29354,7,10</r>
chopped tomatoes <r>14130,10,200</r><r>387,10,150</r><r>20888,20,50</r>
vegetable stock <r>2455,17,120</r><r>3500,12,45</r><r>3588,5,5</r>
basil <r>14130,10,200</r><r>3777,14,70</r><r>377,19,0</r>
olive oil <r>14130,10,200</r><r>387,10,150</r><r>497,7,150</r>
The r tag indicates each recipe, within those tags is recipe id, number of ingredients in recipe, popularity value. There are more than 30,000 recipes in the database, so the example above is quite a simplified one.
Given an input of a list of ingredient names & a maximum ingredients per recipe integer value in this format ;
Input ingredients: chopped tomatoes,basil,olive oil
Maximum ingredients/recipe : 15
I would like some python code that would quickly return:
* A list of unique recipe id's from the database those ingredients are in
* The list should exclude recipe id's with more than the input maximum ingredients value
* The list should be sorted by the the recipe popularity value, then by number of ingredients in each recipe.
* The list should be multidimensional, so for each recipe id, these values should also be included ; number of ingredient matches to input string, recipe popularity value, total number of ingredients in each recipe.
So for the example input values above, I would get a list like this:
[[14130, 3, 10, 200], [387, 2, 10, 150], [497, 1, 7,150]]
Any suggestions most welcome, thank you in advance.
butter <r>3897,10,200</r><r>15600,12,90</r><r>29354,7,10</r>
chopped tomatoes <r>14130,10,200</r><r>387,10,150</r><r>20888,20,50</r>
vegetable stock <r>2455,17,120</r><r>3500,12,45</r><r>3588,5,5</r>
basil <r>14130,10,200</r><r>3777,14,70</r><r>377,19,0</r>
olive oil <r>14130,10,200</r><r>387,10,150</r><r>497,7,150</r>
The r tag indicates each recipe, within those tags is recipe id, number of ingredients in recipe, popularity value. There are more than 30,000 recipes in the database, so the example above is quite a simplified one.
Given an input of a list of ingredient names & a maximum ingredients per recipe integer value in this format ;
Input ingredients: chopped tomatoes,basil,olive oil
Maximum ingredients/recipe : 15
I would like some python code that would quickly return:
* A list of unique recipe id's from the database those ingredients are in
* The list should exclude recipe id's with more than the input maximum ingredients value
* The list should be sorted by the the recipe popularity value, then by number of ingredients in each recipe.
* The list should be multidimensional, so for each recipe id, these values should also be included ; number of ingredient matches to input string, recipe popularity value, total number of ingredients in each recipe.
So for the example input values above, I would get a list like this:
[[14130, 3, 10, 200], [387, 2, 10, 150], [497, 1, 7,150]]
Any suggestions most welcome, thank you in advance.