Trying to sort a MYSQL Database - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: Trying to sort a MYSQL Database (/thread-11339.html) Pages:
1
2
|
Trying to sort a MYSQL Database - PierreSoulier - Jul-04-2018 Hello, (first of all sorry for my english, I'm french **biggrin** ) I have a database in MySQL with this table: CREATE TABLE Tri( id VARCHAR(200) PRIMARY KEY, nom TEXT, adresse TEXT, CP TEXT, ville TEXT, details TEXT, organisme_gest TEXT, organisme_gest_associe TEXT, tel TEXT, web TEXT, mail TEXT, date_d TEXT, horaire TEXT, photo TEXT, date_f TEXT, handicap TEXT ); This basicly is storing data for event in France. In details there is all the information about the event. I actually use only the id and details in my program. I want to sort the database using keywords. To do so i have an idea: Creating the list of the keyword (already done): #MC = ['route','autoroute','éclairage','public','parking','couvert','non','stationnement','rue','habitat', # 'logement','transport','bus','metro','tramway','train','avion','piste','cyclable','aire','jeu','enfant', # 'crêche','parentale','haltes','garderies','jardin','d\éveil','assistante maternelle', # 'école maternelle','école élémentaire','collège','lycée professionel','lycée général','lycée technologi', # 'lycée polyvalent','université','fac','campus','étudiant','centre','formation','sport','scolaire', # 'loisir','école','centre','aéré','rythme','péri','extra','MDPH','CDAPH','IME','IEM','ITEP','SESSAD', # 'FAM','SAVS','SAMSAH','ESAT','maladie','mentale','soin','infirmier','CLIC','aidant','EHPAD','alzheimer', # 'pôle','accueil','familial','domicile','agé','CDD','senior','loisir','CHRS','hebergement','urgence','asile', # 'femme','batue','addictologie','desintoxication','insertion','remobilisation','aide','alimentaire','colis', # 'urgence','RSA','ASE','aide','socia','enfan','protection','enfance','CAF','allocation','famill','aide', # 'devoir','soutien','MJC','maison','jeune','majeur','accès','soin','APL','résidence','universitaire', # 'CLE','hebergement','scolaire','mission','locale','Crous','financ','bourse','pharmacie','médicament', # 'santé','hôpita','hospital','clinique','ambulance','formation','sanitaire','toilette','douche', # 'patrimoine','château','exposition','collection','musée','art','contemporain','cinéma','projection','film', # 'télévision','reportage','salsa','tango','danse','hip-hop','bourrée','farandole','rock-and-roll','twist', # 'valse','bal','musette','country','flamenco','concert','musique','hestiv\'oc','concert','musique','hestiv\'oc', # 'opéra','théatre','cirque','artifice','illumination','pyrotechnie','pyromusical','carnaval','urbain','défilé', # 'médiathèque','bibliothèque','livre','lecture','ludothèque','baseball','basket-ball','football', # 'football américain','handball','hockey','pétanque','boulodrome','rugby','volley-ball','athlétisme', # 'ball-trap','poing','billard','snooker','bowling','quille','echec','equitation','cheval','equestre', # 'ecrime','fleuret','epée','gymnastique','musculation','haltérophilie','culturisme','javelot','pentathlon', # 'roller','skate','roulette','pistolet','carabine','arc','triathlo','twirling','majorette','aikido','aikibudo', # 'budo','boxe','ring','savate','full contact','judo','kendo','jujitsu','karaté','martiaux','kick','boxing', # 'lutte','Muaythai','teakwondo','tai','chuan','gong','badminton','tambourin','paume','padel','pelote', # 'pala','squash','tennis','pong','ping','aviron','canöe','kayak','char','voile','plongée','apnée','sous', # 'marin','joute','sauvetage','nautique','motonautique','jetski','natation','nage','piscine','pêche','ski', # 'nautique','surf','voile','aéromodélisme','aéronautique','aérostation','camarguaise','landaise','toromachie', # 'cyclisme','cyclotourisme','giraviation','golf','escalade','motocyclisme','parachutisme','ULM','pédestre', # 'sauvetage','secourisme','ski','spéléologie','automobile','glace','patinage','traineau','pulka','parapente', # 'planeur','état','civil','fiscal','impôt','taxe','administation','numérique','démocratie','participative', # 'participatif','assainissement','ATMO','déchêt','collecte','traitement','déchetterie','recyclage', # 'gendarmerie','commisariat','police','municipale','sécurité','ASVP','SDIS','pompiers','parc','potager', # 'lac','étang','rivière','marais','site','naturel','biodiversité','ENS','forêt','bois','point de vue', # 'table d\'orientation','chemin','pédestre','propre','netto','énergie','chauffage','électricité', # 'écologie','renouvable','durable','environnement','économie','innovation','commerc','artisan', # 'fabication','tourisme','vin','gastronomie','emploi','chômage','job','recrute','entreprise','industrie', # 'entreupreun','création','agricult','agricole','coopération','international']And i want to count the occurence of every word in every details. Depending on the word that appears the most i will put the data in a new database(I can manage this part). The only problem is how to do it efficiently? I've tried this program here: QUERY="""SELECT details from Tri""" cur.execute(QUERY) #result est une liste contenant les détails result_details=cur.fetchall() QUERY="""SELECT id from Tri""" cur.execute(QUERY) #result est une liste contenant les id result_id=cur.fetchall() #On teste les mots clés un par un MC = ['route',...,'international'] #as writen above for i in result_details : #création d'un tableau: 1ère colonne: mot-clé 2ème colonne: nb d'occurence OCCUR=[0]*len(MC) for j in range(len(MC)-1): OCCUR[j] = i.count(MC[j]) #Distinction de tous les mots qui sont apparus un maximum de fois #Liste qui contient tous les mots apparus un maximum de fois MA=[] #Occurence du mot étant apparu un nombre maximal de fois occ=0 for k in range(len(MC)-1): if OCCUR[k] > occ: occ = OCCUR[k] MA = [MC[k]] elif OCCUR[k] == occ: MA.append(MC[k])But i tried to run the program for an entire night and it keep running.... In the end i don't know if the program is running properly and/or if I do it right! Do you have any suggestions? Thank you so much in advance :) RE: Trying to sort a MYSQL Database - Larz60+ - Jul-04-2018 You don't sort a database, that's one of the reasons you use it. Instead, you add a ORDER BY clause to your Query which sorts the output of the query: example: for oby in MC: cur.execute("SELECT FROM Tri VALUES (%s, %s, %s, ... ORDER BY %)", (var1, var2, var3. ..., oby)) ...Something like above. RE: Trying to sort a MYSQL Database - PierreSoulier - Jul-04-2018 I think you don't get the point, maybe because i'm having a hard time speaking english. Details is a description, a list of word if you want. For each data i'm counting how many time each word appear in the text. In the end i want to know which word has appeared most of the time. RE: Trying to sort a MYSQL Database - Larz60+ - Jul-04-2018 It would be easier to get a count of all words in result_details, using: from collections import Counter ... # If not already a list: delimiter = '|' # Or whatever it is data_list = result_details.strip().split(delimiter) most_common_list = Counter(data_list).most_common()And then find your list of words in most_common_list If it's already a list, use directly. You will have to match ID to your list, but that's simple enough using index RE: Trying to sort a MYSQL Database - PierreSoulier - Jul-04-2018 I want to count words that has appeared most of the time and that are in my MC list. If I understand it well, your program only search for the word that has appeared most of the time, either or not it is in MC, right? RE: Trying to sort a MYSQL Database - Larz60+ - Jul-04-2018 No, it will provide a list of all words in most used order. You can match this to your list simply: untested and unfinished code: from collections import Counter ... # If not already a list: delimiter = '|' # Or whatever it is data_list = result_details.strip().split(delimiter) most_common_list = Counter(data_list).most_common() mc_dict = dict((x, y) for x, y in most_common) print(mc_dict) matches = [] for idx, word in enumerate(MC): try: count = mc_dict[word] matches.append(result_id[idx], word, count) except: continue print(matches)matches will contain a list of all your MC entries that were found with corresponding id and a count of the number of times occurred. RE: Trying to sort a MYSQL Database - PierreSoulier - Jul-05-2018 Thank you for the answer :) I think i get your idea. However result_details is a tuple composed with string of word, each word separated with ' ' (space). So i don't really understand what you are trying to do on line 5? Thank you for taking your time to answer :)! I've tried something like that based on your program: for text in result_details: # If not already a list: delimiter = ' ' text=''.join(text) #from tuple to string data_list = text.strip().split(delimiter) most_common_list = Counter(data_list).most_common(5) mc_dict = dict((x, y) for x, y in most_common_list) print(mc_dict) matches = [] for idx, word in enumerate(MC): try: count = mc_dict[word] matches.append(result_id[idx], word, count) except: continue print(matches)But this is what i get in return: {'le': 3, '(jazz': 1, 'de': 6, '\xc3\xa0': 2, 'samedi': 2} [] {'vent': 1, '8h': 1, 'de': 4, 'Le': 1, 'plein': 1} [] {':': 19, 'Prix': 20, 'du': 12, 'de': 33, '-': 13} [] {'ville': 1, 'de': 2, ':': 5, 'pour': 2, 'la': 2} [] {'12h': 1, 'et': 3, 'de': 3, 'sur': 3, '\xc3\xa0': 2} [] {'\xc3\xa0': 3, 'et': 3, 'de': 6, 'pour': 3, 'les': 5} [] {'et': 4, 'de': 6, '\xc3\xa0': 7, 'du': 4, 'la': 7} [] {'tir\xc3\xa9': 1, 'port.': 1, 'Feu': 1, 'du': 1, "d'artifice": 1} [] {'': 1, 'et': 5, 'au': 2, '18': 2, 'la': 3} [] {'et': 2, 'des': 2, 'de': 7, 'le': 2, 'la': 2} [] {'60': 1, 'Loto': 1, 'des': 1, '56': 2, '06': 2} []So it seems like the first part is working great but it can't stock data in matches I've updated the MC list with very common words as 'le','la','et' etc... but i still get matches=[] RE: Trying to sort a MYSQL Database - Larz60+ - Jul-05-2018 you can reverse the order of the dictionary example: data = [('Genres','Action'), ('Adventure','Action'), ('Adventure', 'Animation'), ('Adventure', 'Animation'), ('Comedy', 'Drama'), ('Action','Adventure'), ('Fantasy', 'Science Fiction')] mc_dict = dict((y, x) for x, y in data) print(mc_dict)gives you:
RE: Trying to sort a MYSQL Database - PierreSoulier - Jul-05-2018 Ok thank you for your help, i've with this working program: db=MySQLdb.connect(host='localhost',user='root',passwd='eisti0001',db='Citela') cur=db.cursor() QUERY="""SELECT details from Tri""" cur.execute(QUERY) #result est une liste contenant les détails result_details=cur.fetchall() QUERY="""SELECT id from Tri""" cur.execute(QUERY) #result est une liste contenant les id result_id=cur.fetchall() QUERY="""SELECT nom from Tri""" cur.execute(QUERY) #result est une liste contenant les titre result_nom=cur.fetchall() #On teste les mots clés un par un MC = ['route','autoroute','éclairage','public','parking','couvert','stationnement','rue','habitat', 'logement','transport','bus','metro','tramway','train','avion','piste','cyclable','aire','jeu','enfant','enfants' 'crêche','parentale','haltes','garderies','jardin','d\éveil','assistante maternelle', 'école maternelle','école élémentaire','collège','lycée professionel','lycée général','lycée technologi', 'lycée polyvalent','université','fac','campus','étudiant','centre','formation','sport','scolaire', 'loisir','école','centre','aéré','rythme','péri','extra','MDPH','CDAPH','IME','IEM','ITEP','SESSAD', 'FAM','SAVS','SAMSAH','ESAT','maladie','mentale','soin','infirmier','CLIC','aidant','EHPAD','alzheimer', 'pôle','accueil','familial','domicile','agé','CDD','senior','loisir','CHRS','hebergement','urgence','asile', 'femme','batue','addictologie','desintoxication','insertion','remobilisation','aide','alimentaire','colis', 'urgence','RSA','ASE','aide','socia','enfan','protection','enfance','CAF','allocation','famill','aide', 'devoir','soutien','MJC','maison','jeune','majeur','accès','soin','APL','résidence','universitaire', 'CLE','hebergement','scolaire','mission','locale','Crous','financ','bourse','pharmacie','médicament', 'santé','hôpita','hospital','clinique','ambulance','formation','sanitaire','toilette','douche', 'patrimoine','château','exposition','collection','musée','art','contemporain','cinéma','projection','film', 'télévision','reportage','salsa','tango','danse','hip-hop','bourrée','farandole','rock-and-roll','twist', 'valse','bal','musette','country','flamenco','concert','hestiv\'oc','musique','hestiv\'oc', 'opéra','théatre','cirque','artifice','illumination','pyrotechnie','pyromusical','carnaval','urbain','défilé', 'médiathèque','bibliothèque','livre','lecture','ludothèque','baseball','basket-ball','football', 'football américain','handball','hockey','pétanque','boulodrome','rugby','volley-ball','athlétisme', 'ball-trap','poing','billard','snooker','bowling','quille','echec','equitation','cheval','equestre', 'ecrime','fleuret','epée','gymnastique','musculation','haltérophilie','culturisme','javelot','pentathlon', 'roller','skate','roulette','pistolet','carabine','arc','triathlo','twirling','majorette','aikido','aikibudo', 'budo','boxe','ring','savate','full contact','judo','kendo','jujitsu','karaté','martiaux','kick','boxing', 'lutte','Muaythai','teakwondo','tai','chuan','gong','badminton','tambourin','paume','padel','pelote', 'pala','squash','tennis','pong','ping','aviron','canöe','kayak','char','voile','plongée','apnée','sous', 'marin','joute','sauvetage','nautique','motonautique','jetski','natation','nage','piscine','pêche','ski', 'nautique','surf','voile','aéromodélisme','aéronautique','aérostation','camarguaise','landaise','toromachie', 'cyclisme','cyclotourisme','giraviation','golf','escalade','motocyclisme','parachutisme','ULM','pédestre', 'sauvetage','secourisme','ski','spéléologie','automobile','glace','patinage','traineau','pulka','parapente', 'planeur','état','civil','fiscal','impôt','taxe','administation','numérique','démocratie','participative', 'participatif','assainissement','ATMO','déchêt','collecte','traitement','déchetterie','recyclage', 'gendarmerie','commisariat','police','municipale','sécurité','ASVP','SDIS','pompiers','parc','potager', 'lac','étang','rivière','marais','site','naturel','biodiversité','ENS','forêt','bois','point de vue', 'table d\'orientation','chemin','pédestre','propre','netto','énergie','chauffage','électricité', 'écologie','renouvable','durable','environnement','économie','innovation','commerc','artisan', 'fabication','tourisme','vin','gastronomie','emploi','chômage','job','recrute','entreprise','industrie', 'entreupreun','création','agricult','agricole','coopération','international'] for i in range(len(result_details)): # If not already a list: text = (''.join(result_details[i])) + (''.join(result_nom[i])) delimiter = ' ' #text=''.join(text) #from tuple to string text=text.lower() #avoid case data_list = text.strip().split() most_common_list = Counter(data_list).most_common() #print(most_common_list) mc_dict = dict((x, y) for x, y in most_common_list) #print('\n\n\n~~~~~~~~Mots et occurence des mots:') #print(mc_dict) matches = [] for idx, word in enumerate(MC): try: count = mc_dict[word] matches.append((word, count)) except: continue print('\n\n\nIdentifiant et mots clés trouvés') if not matches: print(result_id[i],':Aucun mot trouvé') else: print(result_id[i],matches)And here is what the result looks like: Identifiant et mots clés trouvés (('https://data.datatourisme.gouv.fr/20/d7b292c0-931d-301c-8c7b-4eae0b1bc8e1',), [('rue', 1), ('jardin', 3)]) Identifiant et mots clés trouvés (('https://data.datatourisme.gouv.fr/20/d7c8ed18-4746-3a01-be47-17ce9534eb86',), [('musique', 2), ('sous', 1)]) Identifiant et mots clés trouvés (('https://data.datatourisme.gouv.fr/20/d7e7bc31-1fce-3e7c-a7db-82bb3681b315',), ':Aucun mot trouv\xc3\xa9') Identifiant et mots clés trouvés (('https://data.datatourisme.gouv.fr/20/d8157157-94e4-3766-8ec0-caac8a0e4623',), [('sport', 1)])With first the id then the word and its occurence One last problem i have is that it display non encoded utf-8 character. I've tried many things but nothing seemed to work RE: Trying to sort a MYSQL Database - Larz60+ - Jul-05-2018 which version of python are you using? Can you supply an example? |