Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
replace text in a txt
#11
Quote:but for now i have to insert data from txt to the database so something like:
INSERT INTO sbostats (partita,p1,pX,p2,p1x,px2,p12) VALUES ( "Verona - Napoli", "10.42%", "27.08%","62.5%", "37.5%", "89.58%", "72.92%")
val =
Are you sure? Do you want to put a string, "10.42%", in a database or do you want to put a number like 0.1042 or 10.42? Are you going to do with the percentage?

Do you want to use "Verona - Napoli", or do you want to create a table for Verona and add percentages for when they play Napoli? Or do you want partita to be two columns: HOME and VISITOR?

In this code I parse the file and load the values into a dataframe. Pandas supports writing dataframes to sql.

https://pandas.pydata.org/docs/reference...o_sql.html
from itertools import islice
import pandas as pd


names = {
    "Verona": "Hellas Verona",
    "Sassuolu": "Sassuolu Calcio",
    "Monza": "AC Monza",
}


def batched(iterable, n):
    "Batch data into tuples of length n. The last batch may be shorter."
    # batched('ABCDEFG', 3) --> ABC DEF G
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch


def percentage(text):
    return float(text.strip().split()[0].replace("%", ""))/100


def load_file(filename):
    with open(filename, "r") as file:
        for lines in batched(file, 7):
            home, visitor = lines[0].strip().split(" - ")
            yield {
                "HOME": names.get(home, home),
                "VISITOR": names.get(visitor, visitor),
                "P1": percentage(lines[1]),
                "PX": percentage(lines[2]),
                "P2": percentage(lines[3]),
                "P1X": percentage(lines[4]),
                "PX2": percentage(lines[5]),
                "P12": percentage(lines[6]),
            }


df = pd.DataFrame(load_file("test.txt"))
print(df)
Output:
HOME VISITOR P1 PX P2 P1X PX2 P12 0 Hellas Verona Napoli 0.1042 0.2708 0.6250 0.3750 0.8958 0.7292 1 Torino Inter 0.1538 0.2308 0.6154 0.3846 0.8462 0.7692 2 Sassuolo Lazio 0.2407 0.2963 0.4630 0.5370 0.7593 0.7037 3 AS Roma AC Monza 0.5758 0.3333 0.0909 0.9091 0.4242 0.6667 4 Bologna Frosinone 0.5238 0.2937 0.1825 0.8175 0.4762 0.7063 5 Salernitana Cagliari 0.3778 0.2889 0.3333 0.6667 0.6222 0.7111 6 Atalanta Genoa 0.7230 0.1757 0.1014 0.8986 0.2770 0.8243 7 AC Milan Juventus 0.4335 0.3064 0.2601 0.7399 0.5665 0.6936 8 Udinese Lecce 0.4410 0.3275 0.2314 0.7686 0.5590 0.6725 9 Fiorentina Empoli 0.6859 0.1987 0.1154 0.8846 0.3141 0.8013 10 Genoa Salernitana 0.5455 0.2597 0.1948 0.8052 0.4545 0.7403 11 Sassuolo Bologna 0.4539 0.2695 0.2766 0.7234 0.5461 0.7305 12 Lecce Torino 0.3396 0.2642 0.3962 0.6038 0.6604 0.7358 13 Juventus Hellas Verona 0.6794 0.2214 0.0992 0.9008 0.3206 0.7786 14 Cagliari Frosinone 0.2857 0.2381 0.4762 0.5238 0.7143 0.7619 15 AC Monza Udinese 0.3704 0.3580 0.2716 0.7284 0.6296 0.6420 16 Inter AS Roma 0.5510 0.2653 0.1837 0.8163 0.4490 0.7347 17 Napoli AC Milan 0.3857 0.4000 0.2143 0.7857 0.6143 0.6000 18 Empoli Atalanta 0.2500 0.3750 0.3750 0.6250 0.7500 0.6250 19 Lazio Fiorentina 0.4100 0.2650 0.3250 0.6750 0.5900 0.7350 20 Hellas Verona Napoli 0.1042 0.2708 0.6250 0.3750 0.8958 0.7292 21 Torino Inter 0.1538 0.2308 0.6154 0.3846 0.8462 0.7692 22 Sassuolo Lazio 0.2407 0.2963 0.4630 0.5370 0.7593 0.7037 23 AS Roma AC Monza 0.5758 0.3333 0.0909 0.9091 0.4242 0.6667 24 Bologna Frosinone 0.5238 0.2937 0.1825 0.8175 0.4762 0.7063 25 Salernitana Cagliari 0.3778 0.2889 0.3333 0.6667 0.6222 0.7111 26 Atalanta Genoa 0.7230 0.1757 0.1014 0.8986 0.2770 0.8243 27 AC Milan Juventus 0.4335 0.3064 0.2601 0.7399 0.5665 0.6936 28 Udinese Lecce 0.4410 0.3275 0.2314 0.7686 0.5590 0.6725 29 Fiorentina Empoli 0.6859 0.1987 0.1154 0.8846 0.3141 0.8013 30 Genoa Salernitana 0.5455 0.2597 0.1948 0.8052 0.4545 0.7403 31 Sassuolo Bologna 0.4539 0.2695 0.2766 0.7234 0.5461 0.7305 32 Lecce Torino 0.3396 0.2642 0.3962 0.6038 0.6604 0.7358 33 Juventus Hellas Verona 0.6794 0.2214 0.0992 0.9008 0.3206 0.7786 34 Cagliari Frosinone 0.2857 0.2381 0.4762 0.5238 0.7143 0.7619 35 AC Monza Udinese 0.3704 0.3580 0.2716 0.7284 0.6296 0.6420 36 Inter AS Roma 0.5510 0.2653 0.1837 0.8163 0.4490 0.7347 37 Napoli AC Milan 0.3857 0.4000 0.2143 0.7857 0.6143 0.6000 38 Empoli Atalanta 0.2500 0.3750 0.3750 0.6250 0.7500 0.6250 39 Lazio Fiorentina 0.4100 0.2650 0.3250 0.6750 0.5900 0.7350
If you don't want to use pandas, you can easily write a loop to put entries in a database.
sql = "INSERT INTO sbostats (home, visitor, p1, pX, p2, p1x, px2, p12) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
for game in load_file("test.txt"):
    mycursor.execute(sql, tuple(game.values()))
cartonics likes this post
Reply
#12
i need the value it can be 10.42 but also 0.1042!

i need Verona - Napoli (with - also) so i can do an update query and put them directly in calcio table
but its good to have also home and away team in other to columns to do other joins on other tables...

so with pandas i can also build the query and send to my db?

something like:
UPDATE calcio (home,visitors,p1,pX,p2,p1x,px2,p12) VALUES (" Verona","Napoli" ,"10.42 ","27.08","62.50 " ,"37.50","89.58","72.92" ) where partita ="Verona - Napoli"

and so on...

sql = "INSERT INTO sbostats (home, visitor, p1, pX, p2, p1x, px2, p12) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)"
for game in load_file("test.txt"):
    mycursor.execute(sql, tuple(game.values()))
Error:
for game in load_file("test.txt"): NameError: name 'load_file' is not defined
Reply
#13
See previous post.
Reply
#14
(Oct-13-2023, 06:35 PM)deanhystad Wrote: See previous post.

from itertools import islice
import pandas as pd
import mysql.connector
from mysql.connector import errorcode

mydb = mysql.connector.connect(
  host="localhost",
  user="root",
  database="my_cataviz"
)

mycursor = mydb.cursor()
 
names = {
    "Verona": "Hellas Verona",
    "Sassuolu": "Sassuolu Calcio",
    "Monza": "AC Monza",
}
 
 
def batched(iterable, n):
    "Batch data into tuples of length n. The last batch may be shorter."
    # batched('ABCDEFG', 3) --> ABC DEF G
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch
 
 
def percentage(text):
    return float(text.strip().split()[0].replace("%", ""))
 
 
def load_file(filename):
    with open(filename, "r") as file:
        for lines in batched(file, 7):
            home, visitor = lines[0].strip().split(" - ")
            yield {
                "PARTITA": names.get(lines[0]),
                "HOME": names.get(home, home),
                "VISITOR": names.get(visitor, visitor),
                "P1": percentage(lines[1]),
                "PX": percentage(lines[2]),
                "P2": percentage(lines[3]),
                "P1X": percentage(lines[4]),
                "PX2": percentage(lines[5]),
                "P12": percentage(lines[6]),
            }
 
 
df = pd.DataFrame(load_file("statistics.txt"))
#print(df)


mycursor = mydb.cursor()

sql = "INSERT INTO sbostats (partita,casa, fcasa, p1, px, p2, p1x, px2, p12) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)"
for game in load_file("statistics.txt"):
    
    mycursor.execute(sql, tuple(game.values()))
    
#mycursor.execute(sql, val)

mydb.commit()

print(mycursor.rowcount, "records inserted.")
THANKS I tryed to add also the "partita" ... non errors but the column of database is empty. While other columns are fine!
Reply
#15
lines[0] will be something like "Verona - Napoli". There is no "Verona - Napoli" entry in the names dictionary, and there should never be.

There is code to get the home and visitor teams, and how to update the team names using the names dictionary. Maybe you can use that to make a partitia column.
cartonics likes this post
Reply
#16
(Oct-13-2023, 08:42 PM)deanhystad Wrote: lines[0] will be something like "Verona - Napoli". There is no "Verona - Napoli" entry in the names dictionary, and there should never be.

There is code to get the home and visitor teams, and how to update the team names using the names dictionary. Maybe you can use that to make a partitia column.

i solved via mysql doing a concat of 2 columns
Reply
#17
Quote:I don't see any value in putting this information in a database. A database needs to be carefully designed to be useful. If you could put this information into a database, what kind of queries would you want to perform? Use the queries to define what tables and columns you'll need in your database, then write the code to load the file into database tables. Don't create the tables first then try to figure out how to write a query.

is it possible to do searchs on the txt that has this structure?

Genoa - Juventus
26% (3.85)  1.11%   1
18% (5.56)  -1.11%   X
56% (1.79)  -1.11%   2
44% (2.27)  1.11%   1X
74% (1.35)  -1.11%   X2
82% (1.22)  1.11%   12
60% (1.67)  -11.11%   GOL
40% (2.50)  11.11%   NO GOL
96% (1.04)  -1%   OVER 0.5
82% (1.22)  -1.11%   OVER 1.5
58% (1.72)  1.11%   OVER 2.5
28% (3.57)  -1.11%   OVER 3.5
18% (5.56)  -1.11%   OVER 4.5
4% (25.00)  1%   UNDER 0.5
18% (5.56)  1.11%   UNDER 1.5
42% (2.38)  -1.11%   UNDER 2.5
72% (1.39)  1.11%   UNDER 3.5
82% (1.22)  1.11%   UNDER 4.5
66% (1.52)  -1.11%   SEGNA CASA
34% (2.94)  1.11%   SEGNA CASA OVER 1.5
12% (8.33)  -1.11%   SEGNA CASA OVER 2.5
66% (1.52)  -1.11%   SEGNA CASA UNDER 1.5
88% (1.14)  1.11%   SEGNA CASA UNDER 2.5
90% (1.11)  -11%   SEGNA OSPITE
50% (2.00)  -1.11%   SEGNA OSPITE OVER 1.5
24% (4.17)  1.11%   SEGNA OSPITE OVER 2.5
50% (2.00)  1.11%   SEGNA OSPITE UNDER 1.5
76% (1.32)  -1.11%   SEGNA OSPITE UNDER 2.5
24% (4.17)  -11.11%   1 PRIMO TEMPO
32% (3.13)  11.11%   X PRIMO TEMPO
44% (2.27)  -1.11%   2 PRIMO TEMPO
56% (1.79)  1.11%   1X PRIMO TEMPO
76% (1.32)  11.11%   X2 PRIMO TEMPO
68% (1.47)  -11.11%   12 PRIMO TEMPO
78% (1.28)  -11.11%   OVER 0.5 PRIMO TEMPO
36% (2.78)  -1.11%   OVER 1.5 PRIMO TEMPO
10% (10.00)  1.11%   OVER 2.5 PRIMO TEMPO
22% (4.55)  11.11%   UNDER 0.5 PRIMO TEMPO
64% (1.56)  1.11%   UNDER 1.5 PRIMO TEMPO
90% (1.11)  -1.11%   UNDER 2.5 PRIMO TEMPO
16% (6.25)  -1.11%   GOL PRIMO TEMPO
84% (1.19)  1.11%   NO GOL PRIMO TEMPO
18% (5.56)  -1.11%   1 SECONDO TEMPO
40% (2.50)  11%   X SECONDO TEMPO
42% (2.38)  -11.11%   2 SECONDO TEMPO
58% (1.72)  11.11%   1X SECONDO TEMPO
82% (1.22)  1.11%   X2 SECOND HALF
60% (1.67)  -11%   12 SECONDO TEMPO
78% (1.28)  -11%   OVER 0.5 SECONDO TEMPO
52% (1.92)  1.11%   OVER 1.5 SECONDO TEMPO
30% (3.33)  -1.11%   OVER 2.5 SECONDO TEMPO
22% (4.55)  11%   UNDER 0.5 SECONDO TEMPO
48% (2.08)  -1.11%   UNDER 1.5 SECONDO TEMPO
70% (1.43)  1.11%   UNDER 2.5 SECONDO TEMPO
40% (2.50)  1.11%   GOL SECONDO TEMPO
60% (1.67)  -1.11%   NO GOL SECONDO TEMPO
56% (1.79)  1.11%   DISPARI
44% (2.27)  -1.11%   PARI
54% (1.85)  -1.11%   MULTIGOL CASA 1-2
62% (1.61)  -1.11%   MULTIGOL CASA 1-3
30% (3.33)  11.11%   MULTIGOL CASA 2-3
66% (1.52)  -11.11%   MULTIGOL OSPITE 1-2
80% (1.25)  -1.11%   MULTIGOL OSPITE 1-3
40% (2.50)  1.11%   MULTIGOL OSPITE 2-3
38% (2.63)  -1.11%   MULTIGOL 1-2
68% (1.47)  1.11%   MULTIGOL 1-3
78% (1.28)  1.11%   MULTIGOL 1-4
86% (1.16)  -1.11%   MULTIGOL 1-5
92% (1.09)  1.11%   MULTIGOL 1-6
54% (1.85)  -1.11%   MULTIGOL 2-3
64% (1.56)  1.11%   MULTIGOL 2-4
72% (1.39)  -1.11%   MULTIGOL 2-5
78% (1.28)  -1.11%   MULTIGOL 2-6
40% (2.50)  11.11%   MULTIGOL 3-4
48% (2.08)  1.11%   MULTIGOL 3-5
54% (1.85)  1.11%   MULTIGOL 3-6
18% (5.56)  -1.11%   MULTIGOL 4-5
24% (4.17)  -1.11%   MULTIGOL 4-6
14% (7.14)  -1.11%   MULTIGOL 5-6
4% (25.00)  -1.11%   MULTIGOL 7-+
26% (3.85)  1.11%   1 + OVER 1.5
14% (7.14)  -1.11%   X + OVER 1.5
42% (2.38)  -11.11%   2 + OVER 1.5
20% (5.00)  1.11%   1 + OVER 2.5
4% (25.00)  1%   X + OVER 2.5
34% (2.94)  -1.11%   2 + OVER 2.5
10% (10.00)  -1.11%   1 + OVER 3.5
4% (25.00)  1%   X + OVER 3.5
14% (7.14)  -1.11%   2 + OVER 3.5
6% (16.67)  -1.11%   1 + OVER 4.5
2% (50.00)  1%   X + OVER 4.5
10% (10.00)  -1.11%   2 + OVER 4.5
40% (2.50)  1.11%   1X + OVER 1.5
56% (1.79)  -11.11%   X2 + OVER 1.5
68% (1.47)  -1.11%   12 + OVER 1.5
24% (4.17)  1.11%   1X + OVER 2.5
38% (2.63)  1.11%   X2 + OVER 2.5
54% (1.85)  -1.11%   12 + OVER 2.5
14% (7.14)  -1.11%   1X + OVER 3.5
18% (5.56)  -1.11%   X2 + OVER 3.5
24% (4.17)  -11.11%   12 + OVER 3.5
8% (12.50)  -1.11%   1X + OVER 4.5
12% (8.33)  -1.11%   X2 + OVER 4.5
16% (6.25)  -11.11%   12 + OVER 4.5
0% (NA)  1%   1 + UNDER 1.5
4% (25.00)  1%   X + UNDER 1.5
14% (7.14)  1.11%   2 + UNDER 1.5
6% (16.67)  1%   1 + UNDER 2.5
14% (7.14)  -1.11%   X + UNDER 2.5
22% (4.55)  -1.11%   2 + UNDER 2.5
16% (6.25)  11%   1 + UNDER 3.5
14% (7.14)  -1.11%   X + UNDER 3.5
42% (2.38)  -1.11%   2 + UNDER 3.5
20% (5.00)  11.11%   1 + UNDER 4.5
16% (6.25)  -1.11%   X + UNDER 4.5
46% (2.17)  1.11%   2 + UNDER 4.5
4% (25.00)  1%   1X + UNDER 1.5
18% (5.56)  1.11%   X2 + UNDER 1.5
14% (7.14)  1.11%   12 + UNDER 1.5
20% (5.00)  1.11%   1X + UNDER 2.5
36% (2.78)  -1.11%   X2 + UNDER 2.5
28% (3.57)  1.11%   12 + UNDER 2.5
30% (3.33)  11.11%   1X + UNDER 3.5
56% (1.79)  -1.11%   X2 + UNDER 3.5
58% (1.72)  11.11%   12 + UNDER 3.5
36% (2.78)  1.11%   1X + UNDER 4.5
62% (1.61)  -1.11%   X2 + UNDER 4.5
66% (1.52)  11.11%   12 + UNDER 4.5
20% (5.00)  1.11%   1 + GOL
14% (7.14)  -1.11%   X + GOL
26% (3.85)  -11.11%   2 + GOL
34% (2.94)  -1.11%   1X + GOL
40% (2.50)  -11.11%   X2 + GOL
46% (2.17)  -1.11%   12 + GOL
6% (16.67)  1%   1 + NO GOL
4% (25.00)  1%   X + NO GOL
30% (3.33)  1.11%   2 + NO GOL
10% (10.00)  11%   1X + NO GOL
34% (2.94)  1.11%   X2 + NO GOL
36% (2.78)  1.11%   12 + NO GOL
for example find all the % > 65 and odd the value in () > of 1,2 and put in output match and type ?
Reply
#18
import re

def is_condition_met(line):
    # Utilizza un'espressione regolare per estrarre la percentuale e il valore tra parentesi
    match = re.search(r'(\d+\.\d+%)\s+\((\d+\.\d+)\)', line)
    
    if match:
        percentuale = float(match.group(1)[:-1])  # Rimuove il simbolo % e converte in float
        valore = float(match.group(2))
        
        # Verifica le condizioni richieste
        if percentuale > 75 and valore > 1.2:
            return True

    return False

with open('statistics.txt', 'r') as file:
    lines = file.readlines()

    for line in lines:
        if is_condition_met(line):
            print(line.strip())  # Stampa la linea senza spazi iniziali/finali
can be done something in python to have a "window" to choice which field to choice then do the search?
Reply
#19
thanks for all reply
Reply
#20
Great discussion!
I've gained a lot of insights from this. Thank you everyone.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Replace a text/word in docx file using Python Devan 4 3,470 Oct-17-2023, 06:03 PM
Last Post: Devan
  python-docx regex: replace any word in docx text Tmagpy 4 2,252 Jun-18-2022, 09:12 AM
Last Post: Tmagpy
  Replace String in multiple text-files [SOLVED] AlphaInc 5 8,168 Aug-08-2021, 04:59 PM
Last Post: Axel_Erfurt
  Iterate 2 large text files across lines and replace lines in second file medatib531 13 5,879 Aug-10-2020, 11:01 PM
Last Post: medatib531
  Search & Replace - Newlines Added After Replace dj99 3 3,413 Jul-22-2018, 01:42 PM
Last Post: buran
  Need to replace (remove) Unicode characters in text ineuw 1 8,606 Jan-02-2018, 08:01 PM
Last Post: micseydel

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020