Posts: 6,778
Threads: 20
Joined: Feb 2020
Quote:but for now i have to insert data from txt to the database so something like:
INSERT INTO sbostats (partita,p1,pX,p2,p1x,px2,p12) VALUES ( "Verona - Napoli", "10.42%", "27.08%","62.5%", "37.5%", "89.58%", "72.92%")
val =
Are you sure? Do you want to put a string, "10.42%", in a database or do you want to put a number like 0.1042 or 10.42? Are you going to do with the percentage?
Do you want to use "Verona - Napoli", or do you want to create a table for Verona and add percentages for when they play Napoli? Or do you want partita to be two columns: HOME and VISITOR?
In this code I parse the file and load the values into a dataframe. Pandas supports writing dataframes to sql.
https://pandas.pydata.org/docs/reference...o_sql.html
from itertools import islice
import pandas as pd
names = {
"Verona": "Hellas Verona",
"Sassuolu": "Sassuolu Calcio",
"Monza": "AC Monza",
}
def batched(iterable, n):
"Batch data into tuples of length n. The last batch may be shorter."
# batched('ABCDEFG', 3) --> ABC DEF G
if n < 1:
raise ValueError("n must be at least one")
it = iter(iterable)
while batch := tuple(islice(it, n)):
yield batch
def percentage(text):
return float(text.strip().split()[0].replace("%", ""))/100
def load_file(filename):
with open(filename, "r") as file:
for lines in batched(file, 7):
home, visitor = lines[0].strip().split(" - ")
yield {
"HOME": names.get(home, home),
"VISITOR": names.get(visitor, visitor),
"P1": percentage(lines[1]),
"PX": percentage(lines[2]),
"P2": percentage(lines[3]),
"P1X": percentage(lines[4]),
"PX2": percentage(lines[5]),
"P12": percentage(lines[6]),
}
df = pd.DataFrame(load_file("test.txt"))
print(df) Output: HOME VISITOR P1 PX P2 P1X PX2 P12
0 Hellas Verona Napoli 0.1042 0.2708 0.6250 0.3750 0.8958 0.7292
1 Torino Inter 0.1538 0.2308 0.6154 0.3846 0.8462 0.7692
2 Sassuolo Lazio 0.2407 0.2963 0.4630 0.5370 0.7593 0.7037
3 AS Roma AC Monza 0.5758 0.3333 0.0909 0.9091 0.4242 0.6667
4 Bologna Frosinone 0.5238 0.2937 0.1825 0.8175 0.4762 0.7063
5 Salernitana Cagliari 0.3778 0.2889 0.3333 0.6667 0.6222 0.7111
6 Atalanta Genoa 0.7230 0.1757 0.1014 0.8986 0.2770 0.8243
7 AC Milan Juventus 0.4335 0.3064 0.2601 0.7399 0.5665 0.6936
8 Udinese Lecce 0.4410 0.3275 0.2314 0.7686 0.5590 0.6725
9 Fiorentina Empoli 0.6859 0.1987 0.1154 0.8846 0.3141 0.8013
10 Genoa Salernitana 0.5455 0.2597 0.1948 0.8052 0.4545 0.7403
11 Sassuolo Bologna 0.4539 0.2695 0.2766 0.7234 0.5461 0.7305
12 Lecce Torino 0.3396 0.2642 0.3962 0.6038 0.6604 0.7358
13 Juventus Hellas Verona 0.6794 0.2214 0.0992 0.9008 0.3206 0.7786
14 Cagliari Frosinone 0.2857 0.2381 0.4762 0.5238 0.7143 0.7619
15 AC Monza Udinese 0.3704 0.3580 0.2716 0.7284 0.6296 0.6420
16 Inter AS Roma 0.5510 0.2653 0.1837 0.8163 0.4490 0.7347
17 Napoli AC Milan 0.3857 0.4000 0.2143 0.7857 0.6143 0.6000
18 Empoli Atalanta 0.2500 0.3750 0.3750 0.6250 0.7500 0.6250
19 Lazio Fiorentina 0.4100 0.2650 0.3250 0.6750 0.5900 0.7350
20 Hellas Verona Napoli 0.1042 0.2708 0.6250 0.3750 0.8958 0.7292
21 Torino Inter 0.1538 0.2308 0.6154 0.3846 0.8462 0.7692
22 Sassuolo Lazio 0.2407 0.2963 0.4630 0.5370 0.7593 0.7037
23 AS Roma AC Monza 0.5758 0.3333 0.0909 0.9091 0.4242 0.6667
24 Bologna Frosinone 0.5238 0.2937 0.1825 0.8175 0.4762 0.7063
25 Salernitana Cagliari 0.3778 0.2889 0.3333 0.6667 0.6222 0.7111
26 Atalanta Genoa 0.7230 0.1757 0.1014 0.8986 0.2770 0.8243
27 AC Milan Juventus 0.4335 0.3064 0.2601 0.7399 0.5665 0.6936
28 Udinese Lecce 0.4410 0.3275 0.2314 0.7686 0.5590 0.6725
29 Fiorentina Empoli 0.6859 0.1987 0.1154 0.8846 0.3141 0.8013
30 Genoa Salernitana 0.5455 0.2597 0.1948 0.8052 0.4545 0.7403
31 Sassuolo Bologna 0.4539 0.2695 0.2766 0.7234 0.5461 0.7305
32 Lecce Torino 0.3396 0.2642 0.3962 0.6038 0.6604 0.7358
33 Juventus Hellas Verona 0.6794 0.2214 0.0992 0.9008 0.3206 0.7786
34 Cagliari Frosinone 0.2857 0.2381 0.4762 0.5238 0.7143 0.7619
35 AC Monza Udinese 0.3704 0.3580 0.2716 0.7284 0.6296 0.6420
36 Inter AS Roma 0.5510 0.2653 0.1837 0.8163 0.4490 0.7347
37 Napoli AC Milan 0.3857 0.4000 0.2143 0.7857 0.6143 0.6000
38 Empoli Atalanta 0.2500 0.3750 0.3750 0.6250 0.7500 0.6250
39 Lazio Fiorentina 0.4100 0.2650 0.3250 0.6750 0.5900 0.7350
If you don't want to use pandas, you can easily write a loop to put entries in a database.
sql = "INSERT INTO sbostats (home, visitor, p1, pX, p2, p1x, px2, p12) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
for game in load_file("test.txt"):
mycursor.execute(sql, tuple(game.values()))
cartonics likes this post
Posts: 27
Threads: 4
Joined: Oct 2023
Oct-13-2023, 02:28 PM
(This post was last modified: Oct-13-2023, 02:29 PM by cartonics.)
i need the value it can be 10.42 but also 0.1042!
i need Verona - Napoli (with - also) so i can do an update query and put them directly in calcio table
but its good to have also home and away team in other to columns to do other joins on other tables...
so with pandas i can also build the query and send to my db?
something like:
UPDATE calcio (home,visitors,p1,pX,p2,p1x,px2,p12) VALUES (" Verona","Napoli" ,"10.42 ","27.08","62.50 " ,"37.50","89.58","72.92" ) where partita ="Verona - Napoli"
and so on...
sql = "INSERT INTO sbostats (home, visitor, p1, pX, p2, p1x, px2, p12) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
for game in load_file("test.txt"):
mycursor.execute(sql, tuple(game.values())) Error: for game in load_file("test.txt"):
NameError: name 'load_file' is not defined
Posts: 6,778
Threads: 20
Joined: Feb 2020
Posts: 27
Threads: 4
Joined: Oct 2023
Oct-13-2023, 08:01 PM
(This post was last modified: Oct-13-2023, 08:01 PM by cartonics.)
(Oct-13-2023, 06:35 PM)deanhystad Wrote: See previous post.
from itertools import islice
import pandas as pd
import mysql.connector
from mysql.connector import errorcode
mydb = mysql.connector.connect(
host="localhost",
user="root",
database="my_cataviz"
)
mycursor = mydb.cursor()
names = {
"Verona": "Hellas Verona",
"Sassuolu": "Sassuolu Calcio",
"Monza": "AC Monza",
}
def batched(iterable, n):
"Batch data into tuples of length n. The last batch may be shorter."
# batched('ABCDEFG', 3) --> ABC DEF G
if n < 1:
raise ValueError("n must be at least one")
it = iter(iterable)
while batch := tuple(islice(it, n)):
yield batch
def percentage(text):
return float(text.strip().split()[0].replace("%", ""))
def load_file(filename):
with open(filename, "r") as file:
for lines in batched(file, 7):
home, visitor = lines[0].strip().split(" - ")
yield {
"PARTITA": names.get(lines[0]),
"HOME": names.get(home, home),
"VISITOR": names.get(visitor, visitor),
"P1": percentage(lines[1]),
"PX": percentage(lines[2]),
"P2": percentage(lines[3]),
"P1X": percentage(lines[4]),
"PX2": percentage(lines[5]),
"P12": percentage(lines[6]),
}
df = pd.DataFrame(load_file("statistics.txt"))
#print(df)
mycursor = mydb.cursor()
sql = "INSERT INTO sbostats (partita,casa, fcasa, p1, px, p2, p1x, px2, p12) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)"
for game in load_file("statistics.txt"):
mycursor.execute(sql, tuple(game.values()))
#mycursor.execute(sql, val)
mydb.commit()
print(mycursor.rowcount, "records inserted.") THANKS I tryed to add also the "partita" ... non errors but the column of database is empty. While other columns are fine!
Posts: 6,778
Threads: 20
Joined: Feb 2020
lines[0] will be something like "Verona - Napoli". There is no "Verona - Napoli" entry in the names dictionary, and there should never be.
There is code to get the home and visitor teams, and how to update the team names using the names dictionary. Maybe you can use that to make a partitia column.
cartonics likes this post
Posts: 27
Threads: 4
Joined: Oct 2023
Oct-14-2023, 06:33 AM
(This post was last modified: Oct-14-2023, 06:33 AM by cartonics.)
(Oct-13-2023, 08:42 PM)deanhystad Wrote: lines[0] will be something like "Verona - Napoli". There is no "Verona - Napoli" entry in the names dictionary, and there should never be.
There is code to get the home and visitor teams, and how to update the team names using the names dictionary. Maybe you can use that to make a partitia column.
i solved via mysql doing a concat of 2 columns
Posts: 14
Threads: 1
Joined: Dec 2023
Dec-15-2023, 06:28 PM
(This post was last modified: Dec-15-2023, 06:28 PM by nicoali.)
Quote:I don't see any value in putting this information in a database. A database needs to be carefully designed to be useful. If you could put this information into a database, what kind of queries would you want to perform? Use the queries to define what tables and columns you'll need in your database, then write the code to load the file into database tables. Don't create the tables first then try to figure out how to write a query.
is it possible to do searchs on the txt that has this structure?
Genoa - Juventus
26% (3.85) 1.11% 1
18% (5.56) -1.11% X
56% (1.79) -1.11% 2
44% (2.27) 1.11% 1X
74% (1.35) -1.11% X2
82% (1.22) 1.11% 12
60% (1.67) -11.11% GOL
40% (2.50) 11.11% NO GOL
96% (1.04) -1% OVER 0.5
82% (1.22) -1.11% OVER 1.5
58% (1.72) 1.11% OVER 2.5
28% (3.57) -1.11% OVER 3.5
18% (5.56) -1.11% OVER 4.5
4% (25.00) 1% UNDER 0.5
18% (5.56) 1.11% UNDER 1.5
42% (2.38) -1.11% UNDER 2.5
72% (1.39) 1.11% UNDER 3.5
82% (1.22) 1.11% UNDER 4.5
66% (1.52) -1.11% SEGNA CASA
34% (2.94) 1.11% SEGNA CASA OVER 1.5
12% (8.33) -1.11% SEGNA CASA OVER 2.5
66% (1.52) -1.11% SEGNA CASA UNDER 1.5
88% (1.14) 1.11% SEGNA CASA UNDER 2.5
90% (1.11) -11% SEGNA OSPITE
50% (2.00) -1.11% SEGNA OSPITE OVER 1.5
24% (4.17) 1.11% SEGNA OSPITE OVER 2.5
50% (2.00) 1.11% SEGNA OSPITE UNDER 1.5
76% (1.32) -1.11% SEGNA OSPITE UNDER 2.5
24% (4.17) -11.11% 1 PRIMO TEMPO
32% (3.13) 11.11% X PRIMO TEMPO
44% (2.27) -1.11% 2 PRIMO TEMPO
56% (1.79) 1.11% 1X PRIMO TEMPO
76% (1.32) 11.11% X2 PRIMO TEMPO
68% (1.47) -11.11% 12 PRIMO TEMPO
78% (1.28) -11.11% OVER 0.5 PRIMO TEMPO
36% (2.78) -1.11% OVER 1.5 PRIMO TEMPO
10% (10.00) 1.11% OVER 2.5 PRIMO TEMPO
22% (4.55) 11.11% UNDER 0.5 PRIMO TEMPO
64% (1.56) 1.11% UNDER 1.5 PRIMO TEMPO
90% (1.11) -1.11% UNDER 2.5 PRIMO TEMPO
16% (6.25) -1.11% GOL PRIMO TEMPO
84% (1.19) 1.11% NO GOL PRIMO TEMPO
18% (5.56) -1.11% 1 SECONDO TEMPO
40% (2.50) 11% X SECONDO TEMPO
42% (2.38) -11.11% 2 SECONDO TEMPO
58% (1.72) 11.11% 1X SECONDO TEMPO
82% (1.22) 1.11% X2 SECOND HALF
60% (1.67) -11% 12 SECONDO TEMPO
78% (1.28) -11% OVER 0.5 SECONDO TEMPO
52% (1.92) 1.11% OVER 1.5 SECONDO TEMPO
30% (3.33) -1.11% OVER 2.5 SECONDO TEMPO
22% (4.55) 11% UNDER 0.5 SECONDO TEMPO
48% (2.08) -1.11% UNDER 1.5 SECONDO TEMPO
70% (1.43) 1.11% UNDER 2.5 SECONDO TEMPO
40% (2.50) 1.11% GOL SECONDO TEMPO
60% (1.67) -1.11% NO GOL SECONDO TEMPO
56% (1.79) 1.11% DISPARI
44% (2.27) -1.11% PARI
54% (1.85) -1.11% MULTIGOL CASA 1-2
62% (1.61) -1.11% MULTIGOL CASA 1-3
30% (3.33) 11.11% MULTIGOL CASA 2-3
66% (1.52) -11.11% MULTIGOL OSPITE 1-2
80% (1.25) -1.11% MULTIGOL OSPITE 1-3
40% (2.50) 1.11% MULTIGOL OSPITE 2-3
38% (2.63) -1.11% MULTIGOL 1-2
68% (1.47) 1.11% MULTIGOL 1-3
78% (1.28) 1.11% MULTIGOL 1-4
86% (1.16) -1.11% MULTIGOL 1-5
92% (1.09) 1.11% MULTIGOL 1-6
54% (1.85) -1.11% MULTIGOL 2-3
64% (1.56) 1.11% MULTIGOL 2-4
72% (1.39) -1.11% MULTIGOL 2-5
78% (1.28) -1.11% MULTIGOL 2-6
40% (2.50) 11.11% MULTIGOL 3-4
48% (2.08) 1.11% MULTIGOL 3-5
54% (1.85) 1.11% MULTIGOL 3-6
18% (5.56) -1.11% MULTIGOL 4-5
24% (4.17) -1.11% MULTIGOL 4-6
14% (7.14) -1.11% MULTIGOL 5-6
4% (25.00) -1.11% MULTIGOL 7-+
26% (3.85) 1.11% 1 + OVER 1.5
14% (7.14) -1.11% X + OVER 1.5
42% (2.38) -11.11% 2 + OVER 1.5
20% (5.00) 1.11% 1 + OVER 2.5
4% (25.00) 1% X + OVER 2.5
34% (2.94) -1.11% 2 + OVER 2.5
10% (10.00) -1.11% 1 + OVER 3.5
4% (25.00) 1% X + OVER 3.5
14% (7.14) -1.11% 2 + OVER 3.5
6% (16.67) -1.11% 1 + OVER 4.5
2% (50.00) 1% X + OVER 4.5
10% (10.00) -1.11% 2 + OVER 4.5
40% (2.50) 1.11% 1X + OVER 1.5
56% (1.79) -11.11% X2 + OVER 1.5
68% (1.47) -1.11% 12 + OVER 1.5
24% (4.17) 1.11% 1X + OVER 2.5
38% (2.63) 1.11% X2 + OVER 2.5
54% (1.85) -1.11% 12 + OVER 2.5
14% (7.14) -1.11% 1X + OVER 3.5
18% (5.56) -1.11% X2 + OVER 3.5
24% (4.17) -11.11% 12 + OVER 3.5
8% (12.50) -1.11% 1X + OVER 4.5
12% (8.33) -1.11% X2 + OVER 4.5
16% (6.25) -11.11% 12 + OVER 4.5
0% (NA) 1% 1 + UNDER 1.5
4% (25.00) 1% X + UNDER 1.5
14% (7.14) 1.11% 2 + UNDER 1.5
6% (16.67) 1% 1 + UNDER 2.5
14% (7.14) -1.11% X + UNDER 2.5
22% (4.55) -1.11% 2 + UNDER 2.5
16% (6.25) 11% 1 + UNDER 3.5
14% (7.14) -1.11% X + UNDER 3.5
42% (2.38) -1.11% 2 + UNDER 3.5
20% (5.00) 11.11% 1 + UNDER 4.5
16% (6.25) -1.11% X + UNDER 4.5
46% (2.17) 1.11% 2 + UNDER 4.5
4% (25.00) 1% 1X + UNDER 1.5
18% (5.56) 1.11% X2 + UNDER 1.5
14% (7.14) 1.11% 12 + UNDER 1.5
20% (5.00) 1.11% 1X + UNDER 2.5
36% (2.78) -1.11% X2 + UNDER 2.5
28% (3.57) 1.11% 12 + UNDER 2.5
30% (3.33) 11.11% 1X + UNDER 3.5
56% (1.79) -1.11% X2 + UNDER 3.5
58% (1.72) 11.11% 12 + UNDER 3.5
36% (2.78) 1.11% 1X + UNDER 4.5
62% (1.61) -1.11% X2 + UNDER 4.5
66% (1.52) 11.11% 12 + UNDER 4.5
20% (5.00) 1.11% 1 + GOL
14% (7.14) -1.11% X + GOL
26% (3.85) -11.11% 2 + GOL
34% (2.94) -1.11% 1X + GOL
40% (2.50) -11.11% X2 + GOL
46% (2.17) -1.11% 12 + GOL
6% (16.67) 1% 1 + NO GOL
4% (25.00) 1% X + NO GOL
30% (3.33) 1.11% 2 + NO GOL
10% (10.00) 11% 1X + NO GOL
34% (2.94) 1.11% X2 + NO GOL
36% (2.78) 1.11% 12 + NO GOL for example find all the % > 65 and odd the value in () > of 1,2 and put in output match and type ?
Posts: 14
Threads: 1
Joined: Dec 2023
import re
def is_condition_met(line):
# Utilizza un'espressione regolare per estrarre la percentuale e il valore tra parentesi
match = re.search(r'(\d+\.\d+%)\s+\((\d+\.\d+)\)', line)
if match:
percentuale = float(match.group(1)[:-1]) # Rimuove il simbolo % e converte in float
valore = float(match.group(2))
# Verifica le condizioni richieste
if percentuale > 75 and valore > 1.2:
return True
return False
with open('statistics.txt', 'r') as file:
lines = file.readlines()
for line in lines:
if is_condition_met(line):
print(line.strip()) # Stampa la linea senza spazi iniziali/finali can be done something in python to have a "window" to choice which field to choice then do the search?
Posts: 1
Threads: 0
Joined: Jan 2024
Posts: 5
Threads: 0
Joined: Jan 2024
Great discussion!
I've gained a lot of insights from this. Thank you everyone.
|