Python Forum
Order an array
#1
Hi there,

At university, I have to write a project in Python, and I'm stuck on one function that I need to finish it.
The task is:

There is a file called data, which contains the following lines (format: id. item name; description; price):
1. grillsütő; jó állapotú; 5000
4. gyerek bicikli; 14"-os; 10000
6. roller; piros; 2000
7. kenyérpirító; fekete; 2000
11. szék; barna; 2000

There is another file called update which contains these lines:
7. 1500
4. 9000
5. túracipő; új; 20000
6. visszavon
11. 1800
12. láncfűrész; Stihl; 27000

My task is to create a file called newdata from these two, following these rules:
update the price of an item when the update file has only a price after its id,
add a new item when the update file has a complete item line,
delete the item when the update file says "visszavon" (Hungarian for "withdraw").

So the final newdata should look like this:
1. grillsütő; jó állapotú; 5000
4. gyerek bicikli; 14"-os; 9000
5. túracipő; új; 20000
7. kenyérpirító; fekete; 1500
11. szék; barna; 1800
12. láncfűrész; Stihl; 27000

The problem is that I cannot sort this list by item id. It does get sorted, but apparently alphabetically, because 11 comes before 4 in my list.

My newdata looks like this:
1. grillsütő; jó állapotú; 5000
11. szék; barna; 1800
12. láncfűrész; Stihl; 27000
4. gyerek bicikli; 14"-os; 9000
5. túracipő; új; 20000
7. kenyérpirító; fekete; 1500

This is my current code:
import re

#1. grillsütő; jó állapotú; 5000
#4. gyerek bicikli; 14"-os; 10000
#6. roller; piros; 2000
#7. kenyérpirító; fekete; 2000
#11. szék; barna; 2000

# Open the input and output files
dataFile = open("data.txt", "r")
updateFile = open("update.txt", "r")
outputFile = open("newdata.txt", "w")

data = []

for line in dataFile:
	txt=line
	re1='(\\d+)'			# Integer Number 1
	re2='(\\.)'				# Any Single Character 1
	re3='(\\s+)'			# White Space 1
	re4='(\\w+[^;]*)'	    # Word 1
	re5='(;)'				# Any Single Character 2
	re6='(\\s+)'			# White Space 2
	re7='(\\w+[^;]*)'	    # Word 2
	re8='(;)'				# Any Single Character 3
	re9='(\\s+)'			# White Space 3
	re10='(\\d+)'			# Integer Number 2

	rg = re.compile(re1+re2+re3+re4+re5+re6+re7+re8+re9+re10,re.IGNORECASE|re.DOTALL)
	m = rg.search(txt)
	if m:
		targy_azonositoja = m.group(1)
		pont = m.group(2)
		elso_szunet = m.group(3)
		targy_neve = m.group(4)
		elso_pontosvesszo = m.group(5)
		masodik_szunet = m.group(6)
		targy_leirasa = m.group(7)
		masodik_pontosvesszo = m.group(8)
		harmadik_szunet = m.group(9)
		targy_ara = m.group(10)
		data.append(targy_azonositoja+pont+elso_szunet+targy_neve+elso_pontosvesszo+masodik_szunet+targy_leirasa+masodik_pontosvesszo+harmadik_szunet+targy_ara)

print(data)
print("\n\n")
# visszavonás (withdrawal): remove the items marked "visszavon" in the update file
for line in updateFile:
	txt = line
	regex_visszavon_1 = '(\\d+)'
	regex_visszavon_2 = '(\\.)'
	regex_visszavon_3 = '( )'
	regex_visszavon_4 = '(visszavon)'
	
	rg = re.compile(regex_visszavon_1+regex_visszavon_2+regex_visszavon_3+regex_visszavon_4, re.IGNORECASE|re.DOTALL)
	m = rg.search(txt)
	if m:
		targyat_visszavon_id = m.group(1)
		for adat in data[:]:  # iterate over a copy so removing from data is safe
			if (adat.startswith(targyat_visszavon_id+". ")):
				data.remove(adat)
print(data)
print("\n")

# ár módosítása (price change): update the price of existing items
updateFile.seek(0)

for line in updateFile:
	txt = line
	regex_armodositas_1 = '(\\d+)'
	regex_armodositas_2 = '(\\.)'
	regex_armodositas_3 = '( )'
	regex_armodositas_4 = '(\\d+)'
	
	rg = re.compile(regex_armodositas_1+regex_armodositas_2+regex_armodositas_3+regex_armodositas_4, re.IGNORECASE|re.DOTALL)
	m = rg.search(txt)
	if m:
		armodositas_id = m.group(1)
		#print("Modositas: "+str(armodositas_id)+" -> "+str(m.group(4)))
		index = -1
		for adat in data:
			#print(adat)
			index+=1
			#print("Index: "+str(index))
			if (adat.startswith(armodositas_id+". ")):
				result = re.sub(';\\s+\\d{3,}', '; '+m.group(4), adat)
				#print(str(result))
				data[index] = result
	
	
print("\n")
print(data)


# új tárgy felvétele (adding a new item): append complete item lines from the update file
updateFile.seek(0)

for line in updateFile:
	txt=line
	re1='(\\d+)'			# Integer Number 1
	re2='(\\.)'				# Any Single Character 1
	re3='(\\s+)'			# White Space 1
	re4='(\\w+[^;]*)'	# Word 1
	re5='(;)'				# Any Single Character 2
	re6='(\\s+)'			# White Space 2
	re7='(\\w+[^;]*)'	# Word 2
	re8='(;)'				# Any Single Character 3
	re9='(\\s+)'			# White Space 3
	re10='(\\d+)'			# Integer Number 2

	rg = re.compile(re1+re2+re3+re4+re5+re6+re7+re8+re9+re10,re.IGNORECASE|re.DOTALL)
	m = rg.search(txt)
	if m:
		targy_azonositoja = m.group(1)
		pont = m.group(2)
		elso_szunet = m.group(3)
		targy_neve = m.group(4)
		elso_pontosvesszo = m.group(5)
		masodik_szunet = m.group(6)
		targy_leirasa = m.group(7)
		masodik_pontosvesszo = m.group(8)
		harmadik_szunet = m.group(9)
		targy_ara = m.group(10)
		data.append(targy_azonositoja+pont+elso_szunet+targy_neve+elso_pontosvesszo+masodik_szunet+targy_leirasa+masodik_pontosvesszo+harmadik_szunet+targy_ara)

print("\n\nEREDMÉNY\n\n")	
print(data)
print("\n\nVÉGE")


data.sort()

for line in data:
	print(line)
	outputFile.write(line+"\n")

dataFile.close()
updateFile.close()
outputFile.close()
#2
Can you reproduce your problem with simpler code (fewer lines)? You should be able to do so in 10 lines or fewer, and that will make it much easier to help.
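
For illustration, a reduced version might look like the sketch below. It assumes the update steps have already run and hard-codes three of the resulting lines; only the final data.sort() call from the original script is kept.

```python
# Minimal reproduction: three lines from the expected newdata file,
# hard-coded here instead of being read from data.txt / update.txt.
data = [
    "1. grillsütő; jó állapotú; 5000",
    '4. gyerek bicikli; 14"-os; 9000',
    "11. szék; barna; 1800",
]

# Same call as in the original script: the list elements are strings,
# so they are compared character by character.
data.sort()

for line in data:
    print(line)
```

With this input the line for id 11 is printed before the line for id 4, which is the behaviour described in the question.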
#3
Your problem is that you are sorting strings, which are compared character by character, so "11" comes before "4". You need to sort on the numeric value of the id instead.
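
If you would rather keep each line as a single string, one possible sketch is to pass a key function to sort(). The helper name item_id is made up for this example, and it assumes every line starts with an integer id followed by a period.

```python
lines = [
    "1. grillsütő; jó állapotú; 5000",
    "11. szék; barna; 1800",
    '4. gyerek bicikli; 14"-os; 9000',
]


def item_id(line):
    """Return the id before the first '.' as an integer (hypothetical helper)."""
    return int(line.split(".", 1)[0])


# The key function is called once per element; the strings themselves
# are left unchanged, only their order is decided by the integer id.
lines.sort(key=item_id)

for line in lines:
    print(line)
```

The lines now come out in the order 1, 4, 11 because integers are compared by value rather than character by character.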

Try the following approach, which changes the data structure slightly and replaces the first element of each record, a string, with an integer. Once the list is sorted, it is relatively easy to put the correct punctuation back. It is a shame to waste your regular expression pattern, but splitting on semicolons (';') seems to work.
my_simulated_file = ['1. one; one one; 5000',
                     '4. four; four four; 9999',
                     '111. oneoneone; oneoneone oneoneone; 111111',
                     '7. seven; seven seven; 777',
                     '6. six; six six; 6666',
                     '11. eleven; eleven eleven; 1111']

data = []

for s in my_simulated_file:
    new_s = s                               # work on a separate name rather than rebinding the loop variable
    new_s = new_s.replace(".", ";", 1)      # replace the first '.' with ';'
    small_list = new_s.split(";")           # split on semicolons ';'
    small_list[0] = int(small_list[0])      # replace the first element (a string) with an integer
    data.append(small_list)

# Lists are compared element by element, so the sort is decided by the integer first element
data.sort()

print("Original List:")
for s in my_simulated_file:
    print(s)

print()
print("Sorted List:")
for s in data:
    print(s)
Output:
Original List:
1. one; one one; 5000
4. four; four four; 9999
111. oneoneone; oneoneone oneoneone; 111111
7. seven; seven seven; 777
6. six; six six; 6666
11. eleven; eleven eleven; 1111

Sorted List:
[1, ' one', ' one one', ' 5000']
[4, ' four', ' four four', ' 9999']
[6, ' six', ' six six', ' 6666']
[7, ' seven', ' seven seven', ' 777']
[11, ' eleven', ' eleven eleven', ' 1111']
[111, ' oneoneone', ' oneoneone oneoneone', ' 111111']
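
Putting the punctuation back could look something like the sketch below. It assumes each record has the shape produced above (an integer id followed by fields that still carry their leading space), and format_row is a hypothetical helper name, not part of your code.

```python
# Records in the shape produced above: [int id, ' name', ' description', ' price']
data = [
    [11, ' eleven', ' eleven eleven', ' 1111'],
    [1, ' one', ' one one', ' 5000'],
    [4, ' four', ' four four', ' 9999'],
]

data.sort()  # the integer first element decides the order


def format_row(row):
    """Rebuild 'id. name; description; price' from one record (hypothetical helper)."""
    item_id, *fields = row
    # Each field kept its leading space from the split, so joining on ';'
    # restores the original '; ' separators.
    return f"{item_id}." + ";".join(fields)


lines = [format_row(row) for row in data]

for line in lines:
    print(line)
```

Each rebuilt line can then be written to the output file with a trailing newline, as the original script already does.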
Lewis
To paraphrase: 'Throw out your dead' code. https://www.youtube.com/watch?v=grbSQ6O6kbs Forward to 1:00