Posts: 453
Threads: 16
Joined: Jun 2022
Aug-30-2022, 08:55 AM
(This post was last modified: Aug-30-2022, 11:08 PM by rob101.
Edit Reason: Incremental code update
)
I've been studying the dictionary data structure because I wanted to find a way of searching for nested items, so this code is a demonstration of that goal; I understand that it's possibly over-engineered for the task that I have chosen.
The user input is not fully sanitized, but (as you will see from the code comments) I have that covered by a custom function that I've already written (for the sake of brevity, that function is not included here).
Also (again for the sake of brevity), I've only included a few books, but you can add as many as you like for testing.
I don't know of any bugs, so if you find any, or if you have any general comments about my coding style, I'm open to constructive criticism.
Thank you for reading and testing; I'll reply to any comments you may have, as and when.
Enjoy and, who knows, you may even find this to be the basis of a useful app.
#!/usr/bin/python3
from sys import stdout

library = {  # list indexed as 0 for the book title and 1 for the book author
    'computer science & programming': {
        '0-13-110163-3': [
            'THE C PROGRAMMING LANGUAGE',
            'BRIAN W.KERNIGHAN & DENNIS M.RITCHIE'
        ],
        '0-85934-229-8': [
            'PROGRAMMING IN QuickBASIC',
            'N.KANTARIS'
        ],
        '0-948517-48-4': [
            'HiSoft BASIC VERSION 2: USER MANUAL',
            'DAVID NUTKINS, ALEX KIERNAN and TONY KENDLE'
        ]
    },
    'reference': {
        '0-333-34806-0': [
            'DICTIONARY OF INFORMATION TECHNOLOGY',
            'DENNIS LONGLEY and MICHAEL SHAIN'
        ]
    },
    'novels': {
        '0-681-40322-5': [
            'THE MORE THAN COMPLETE HITCHHIKER\'S GUIDE',
            'DOUGLAS ADAMS'
        ]
    }
}
#===========<End of dictionary>===========#
def search(publication, term):
    # publication: 0 to search book titles, 1 to search book authors
    result = []
    results = []
    found = 0
    maximum = 6
    title = 0
    author = 1
    for category in library:
        for isbn in library[category]:
            book = library.get(category).get(isbn)
            # an ISBN search swaps the term for the matching book's title
            if term[:4] == 'ISBN' and term[4:] == isbn:
                term = book[title]
            if term in book[publication]:
                found += 1
                result.append(book[title])
                result.append(book[author])
                result.append(category)
                result.append(isbn)
                results.append(result)
                result = []
            if found > maximum:
                break
        if found > maximum:
            break
    if found:
        if found > maximum:
            return maximum  # an int signals "too many results"
        else:
            return results
    else:
        return
#=========<End of search function>=========#
def output(results, file=stdout):
    # every print() is directed to the file parameter (stdout by default)
    print("-"*50, file=file)
    if isinstance(results, list):
        for books in results:
            for book in books:
                print(book, file=file)
            print("-"*50, file=file)
    else:
        print("Search results exceeds the maximum of {}".format(results), file=file)
        print("-"*50, file=file)
#=========<End of output function>=========#
find, found = None, None
# attempt = the index reference passed to <if term in book[publication]>
attempt = 0  # 0 = book title 1 = book author
quit = False
while not find and not quit:
    print('''
Search term must be alphanumeric characters only
and greater than three characters in length.
For an ISBN search, enter ISBN and press return.
''')
    find = input("Search term: ").strip().upper()  # to-do: check the input with the user input checker function
    if find == 'QUIT':
        quit = True
    elif find == 'ISBN':
        print("ISBN search")
        isbn = input("ISBN: ").strip()
        find = find+isbn
    if len(find) > 3:
        found = search(attempt, find)
    else:
        find, found = None, None
    if find and not quit:
        while not found and attempt < 1:  # change this if more fields are added to the publication
            attempt += 1
            found = search(attempt, find)
        if found:
            output(found)
            find, found = None, None
            attempt = 0
        elif not quit:
            print("Nothing found")
            find = None
            attempt = 0
print("Search exit.")
Sig:
>>> import this
The UNIX philosophy: "Do one thing, and do it well."
"The danger of computers becoming like humans is not as great as the danger of humans becoming like computers." :~ Konrad Zuse
"Everything should be made as simple as possible, but not simpler." :~ Albert Einstein
Posts: 4,790
Threads: 76
Joined: Jan 2018
(Aug-30-2022, 08:55 AM)rob101 Wrote: Thank you for reading and testing; I'll reply to any comments you may have, as and when.
I'll try to look into this. A few remarks while reading:
- Write unit tests to automate testing while developing the code (the most useful thing).
- Add a "file" parameter to output(), defaulting to sys.stdout (better for functions that do input/output).
- Use triple quotes to define multiline strings.
Posts: 453
Threads: 16
Joined: Jun 2022
Aug-30-2022, 11:12 AM
(This post was last modified: Aug-30-2022, 11:12 AM by rob101.)
(Aug-30-2022, 09:12 AM)Gribouillis Wrote: I'll try to look into this. A few remarks while reading:
- Write unit tests to automate testing while developing the code (the most useful thing).
- Add a "file" parameter to output(), defaulting to sys.stdout (better for functions that do input/output)
- Use triple quotes to define multiline strings.
Thank you. I will update the code (above) in one go, once the feedback that calls for code changes seems to be in.
I have to admit (and as you've likely guessed) I'm not up to speed with your point 1 and point 2. As for point 3, yes; that's something that I should have taken care of and it will be.
Point 1: By this, do you mean that I should have a 'driver' to simulate user input or am I barking up the wrong tree?
Point 2: Could you (if you've time) give me a quick explainer as to why this is a good option to have and how that could be used?
Function will be amended to: def output(results, file=sys.stdout) which is (as I understand it to be) the default.
Point 3: Done. The code here will be updated as and when.
With thanks and regards.
Posts: 4,790
Threads: 76
Joined: Jan 2018
Aug-30-2022, 12:54 PM
(This post was last modified: Aug-30-2022, 12:54 PM by Gribouillis.)
(Aug-30-2022, 11:12 AM)rob101 Wrote: Point 1: By this, do you mean that I should have a 'driver' to simulate user input or am I barking up the wrong tree?
In the end, it could be an option, but unit tests are made to test small «units» of a program, not the program as a whole. For example, they test a function's behavior. Here is how you could start unit testing the output() function. I inserted the following code just before the find, found = None, None line in your code:
import io
import sys
import unittest

class TestOutput(unittest.TestCase):
    def test_print_error_message_if_results_is_integer(self):
        results = 25
        ofh = io.StringIO()
        output(results, file=ofh)
        s = ofh.getvalue()
        self.assertIn(f'exceeds the maximum of {results}', s)

    def test_output_contains_titles(self):
        results = [['ti0', 'au0', 'ca0', 'is0'], ['ti1', 'au1', 'ca1', 'is1']]
        ofh = io.StringIO()
        output(results, file=ofh)
        s = ofh.getvalue()
        self.assertIn('ti0', s)
        self.assertIn('ti1', s)

# run the tests only when the last command-line argument is 'test'
if sys.argv[-1] == 'test':
    unittest.main(argv=sys.argv[:-1])
    sys.exit(0)
Now if, instead of python program.py, you call python program.py test, it will run the tests instead of an interactive session.
To make the output() function testable, I had to inject the file into its parameters, and this answers your second question: to make an output function testable, you need to be able to inject the file object. I did it in a simple way here:
import builtins
import functools
import sys

def output(results, file=sys.stdout):
    # shadow print locally so every call below writes to the injected file
    print = functools.partial(builtins.print, file=file)
    print("-"*50)
    if type(results) is not int:
        for books in results:
            for book in books:
                print(book)
            print("-"*50)
    else:
        print("Search results exceeds the maximum of {}".format(results))
        print("-"*50)
#=========<End of output function>=========#
Output:
λ python paillasse/pf/rob101.py test
..
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
rob101 Wrote: Point 3: Done. The code here will be updated as and when.
You could perhaps upload the code to a site such as github gist, which allows you to push updates of the code through git, like I did for this module for example, and leave a link in this thread so we could have the latest version at any time.
Posts: 453
Threads: 16
Joined: Jun 2022
Aug-30-2022, 02:13 PM
(This post was last modified: Aug-30-2022, 02:13 PM by rob101.
Edit Reason: to add
)
(Aug-30-2022, 12:54 PM)Gribouillis Wrote: In the end, it could be an option, but unit tests are made to test small «units» in a program, not the program as a whole.
This is all very helpful and I need to take a little time so that I can get my head around these new (to me) concepts and evaluate the code that you have posted, so that I fully understand what you've done, as well as why.
(Aug-30-2022, 12:54 PM)Gribouillis Wrote: You could perhaps upload the code to a site such as github gist...
This is an option that I will look into. In the meantime, I will update the code that's in my first post: I feel that it's maybe better to do that than to have multiple versions sprinkled around this thread.
Given that I have the output() function and that it can be 'unit tested' in the way that you demonstrate, I feel it could be better to have all the print() functions moved to the output() function, right? That is to say, the ones that are concerned with the search results, such as "Nothing found".
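As a rough sketch of that idea (not the code from the first post): output() could treat None, which search() returns when nothing matches, as the "Nothing found" case. The None handling is an assumption about how the function might be extended; every print() goes through the file parameter so the message can be captured in a test.

```python
import io
import sys


def output(results, file=sys.stdout):
    """Print search results to file.

    Assumption for this sketch: None means nothing was found, a list holds
    the matching books, and anything else is the 'too many results' count.
    """
    rule = "-" * 50
    print(rule, file=file)
    if isinstance(results, list):
        for book in results:
            for field in book:
                print(field, file=file)
            print(rule, file=file)
    else:
        if results is None:
            message = "Nothing found"
        else:
            message = "Search results exceeds the maximum of {}".format(results)
        print(message, file=file)
        print(rule, file=file)


# Capture the output in memory instead of printing to the terminal
buffer = io.StringIO()
output(None, file=buffer)
captured = buffer.getvalue()
```

With this shape, the main loop would only need to call output(found) and would no longer print anything about the results itself.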
A thought that's come to mind, as I type this: once testing has been done, is it 'best practice' to remove the code that facilitates said testing, or does one leave it as is? I feel it should be removed, as it plays no part in the functionality of the app, right? It's details such as this, that are of as much interest to me, as is writing the code.
With that last thought in mind, I will refrain from including any of the code that is purely for testing, until I'm clear about what should and should not be included in the, shall we call it, release candidate.
Thank you very much for your time, as well as the information, and I look forward to your next reply, as and when you have more time to do so.
Posts: 1,950
Threads: 8
Joined: Jun 2018
This is too verbose:
categories = library.keys()
for category in categories:
    books = library.get(category)
    for isbn in books:
        book = books.get(isbn)
        # do something with book
If you iterate over a dictionary, you iterate over its keys. So you can reduce this to:
for category in library:
    for record in library[category]:
        # do something with library[category][record]
Which raises the question of how the data is structured. When I get data from upstream, my first action is to check whether I should convert it to make it simpler (and faster) to work with. In this particular case a list of dictionaries could be one possibility: a very simple and generic filtering function could deliver all the required functionality. The current code iterates over all the data anyway, so there should not be any performance penalty either. Another possibility is to use a dataframe and take advantage of vectorization.
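A minimal sketch of the list-of-dictionaries idea, assuming each book becomes one flat record. The field names (isbn, title, author, category) and the find_books() helper are made up for illustration, and the substring match mirrors the `term in book[...]` test from the first post.

```python
# Each book is one flat record; the field names are an assumption of this sketch
books = [
    {'isbn': '0-13-110163-3', 'title': 'THE C PROGRAMMING LANGUAGE',
     'author': 'BRIAN W.KERNIGHAN & DENNIS M.RITCHIE',
     'category': 'computer science & programming'},
    {'isbn': '0-85934-229-8', 'title': 'PROGRAMMING IN QuickBASIC',
     'author': 'N.KANTARIS',
     'category': 'computer science & programming'},
    {'isbn': '0-948517-48-4', 'title': 'HiSoft BASIC VERSION 2: USER MANUAL',
     'author': 'DAVID NUTKINS, ALEX KIERNAN and TONY KENDLE',
     'category': 'computer science & programming'},
    {'isbn': '0-681-40322-5', 'title': "THE MORE THAN COMPLETE HITCHHIKER'S GUIDE",
     'author': 'DOUGLAS ADAMS',
     'category': 'novels'},
]


def find_books(records, **criteria):
    """Return the records in which every named field contains the given text."""
    return [
        record for record in records
        if all(text in record[field] for field, text in criteria.items())
    ]


basic_books = find_books(books, title='BASIC')
adams_books = find_books(books, author='ADAMS')
```

The same function covers title, author, category and ISBN searches, for example find_books(books, isbn='0-13-110163-3'), and several criteria can be combined in one call.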
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 453
Threads: 16
Joined: Jun 2022
Aug-30-2022, 02:35 PM
(This post was last modified: Aug-30-2022, 04:04 PM by rob101.)
(Aug-30-2022, 02:17 PM)perfringo Wrote: This is too verbose:...
... Which raises the question about the way the data is structured.
Yes, it does. I'm not one for nested data structures, if they can be avoided, but keep in mind that this is an academic exercise for me: I wanted to learn how one would go about searching such a data structure, if one needed to. If I were implementing a way to store and search a book collection, I would not use this code, as there are much simpler ways in which that can be done.
I will have a look at the improvement that you've posted, for which I am grateful, as I'm sure it will be better and I will be able to apply what you've shown me.
With thanks and regards.
To add...
(Aug-30-2022, 02:17 PM)perfringo Wrote: If you iterate over dictionary then you iterate over keys. So you can reduce this to:
for category in library:
    for record in library[category]:
        # do something with library[category][record]
I've run a test and from what I can see, your improvement will work for me:
for category in library:
    for isbn in library[category]:
        book = library.get(category).get(isbn)
        if term[:4] == 'ISBN' and term[4:] == isbn:
            # do the rest of the search from here
... which I will implement and update (unforeseen issues aside).
Thank you.
Posts: 12,031
Threads: 485
Joined: Sep 2016
For what it's worth, here's how I generically display nested dictionaries.
def display_dict(dictname, indentwidth=0):
    indent = " " * (4 * indentwidth)
    for key, value in dictname.items():
        if isinstance(value, dict):
            print(f'\n{indent}{key}')
            # recurse one level deeper without changing this level's indent
            display_dict(value, indentwidth + 1)
        else:
            print(f'{indent}{key}: {value}')

def testit():
    urllist = {
        "LocalGovernment": {
            "Argentina": {
                "MisionesOpenData_AR": "http://www.datos.misiones.gov.ar/"
            },
            "Austria": {
                "ViennaOpenData_AT": "https://www.data.gv.at/"
            },
            "UnitedStates": {
                "alabama": {
                    "Alabaster": {
                        "Rank": 16,
                        "URL": "https://www.cityofalabaster.com/",
                        "Population": "33,373"
                    },
                    "Albertville": {
                        "Rank": 27,
                        "URL": "https://www.cityofalbertville.com/",
                        "Population": "21,620"
                    }
                }
            }
        }
    }
    display_dict(urllist)

if __name__ == '__main__':
    testit()
Output:

LocalGovernment

    Argentina
        MisionesOpenData_AR: http://www.datos.misiones.gov.ar/

    Austria
        ViennaOpenData_AT: https://www.data.gv.at/

    UnitedStates

        alabama

            Alabaster
                Rank: 16
                URL: https://www.cityofalabaster.com/
                Population: 33,373

            Albertville
                Rank: 27
                URL: https://www.cityofalbertville.com/
                Population: 21,620
Posts: 4,790
Threads: 76
Joined: Jan 2018
(Aug-30-2022, 08:13 PM)Larz60+ Wrote: Here's how I generically display nested dictionaries.
Or using the module asciitree from PyPI:
import asciitree

class OurTraversal(asciitree.traversal.Traversal):
    # each node is a (key, value) pair
    def get_children(self, node):
        k, v = node
        return list(v.items()) if isinstance(v, dict) else []

    def get_root(self, tree):
        return tree

    def get_text(self, node):
        k, v = node
        return k if isinstance(v, dict) else f'{k}: {v}'

def testit():
    urllist = {
        "LocalGovernment": {
            "Argentina": {
                "MisionesOpenData_AR": "http://www.datos.misiones.gov.ar/"
            },
            "Austria": {
                "ViennaOpenData_AT": "https://www.data.gv.at/"
            },
            "UnitedStates": {
                "alabama": {
                    "Alabaster": {
                        "Rank": 16,
                        "URL": "https://www.cityofalabaster.com/",
                        "Population": "33,373"
                    },
                    "Albertville": {
                        "Rank": 27,
                        "URL": "https://www.cityofalbertville.com/",
                        "Population": "21,620"
                    }
                }
            }
        }
    }
    s = str(asciitree.LeftAligned(traverse=OurTraversal())(('', urllist)))
    print(s)

if __name__ == '__main__':
    testit()
Output:
 +-- LocalGovernment
     +-- Argentina
     |   +-- MisionesOpenData_AR: http://www.datos.misiones.gov.ar/
     +-- Austria
     |   +-- ViennaOpenData_AT: https://www.data.gv.at/
     +-- UnitedStates
         +-- alabama
             +-- Alabaster
             |   +-- Rank: 16
             |   +-- URL: https://www.cityofalabaster.com/
             |   +-- Population: 33,373
             +-- Albertville
                 +-- Rank: 27
                 +-- URL: https://www.cityofalbertville.com/
                 +-- Population: 21,620
Posts: 453
Threads: 16
Joined: Jun 2022
(Aug-30-2022, 08:13 PM)Larz60+ Wrote: For what it's worth, Here's how I generically display nested dictionaries.
Thank you for that.
This...
for key, value in dictname.items():
    if isinstance(value, dict):
... looks very interesting. I'd not considered accessing the keys and values directly in a for loop with the .items() method, together with the isinstance() function. I'll certainly look at that usage, for my own understanding at least.
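A small sketch of how that .items() plus isinstance() pattern might be turned from displaying a nested dictionary into searching one, at any depth. The find_paths() name and the sample data layout are assumptions for illustration; it is not the search() function from the first post.

```python
def find_paths(tree, term, path=()):
    """Yield (path, value) for every leaf whose text contains term."""
    for key, value in tree.items():
        here = path + (key,)
        if isinstance(value, dict):
            # a nested dictionary: keep descending, remembering the keys so far
            yield from find_paths(value, term, here)
        else:
            # a leaf: treat a list as several fields, anything else as one field
            fields = value if isinstance(value, list) else [value]
            if any(term in str(field) for field in fields):
                yield here, value


sample = {
    'novels': {
        '0-681-40322-5': ["THE MORE THAN COMPLETE HITCHHIKER'S GUIDE",
                          'DOUGLAS ADAMS'],
    },
    'reference': {
        '0-333-34806-0': ['DICTIONARY OF INFORMATION TECHNOLOGY',
                          'DENNIS LONGLEY and MICHAEL SHAIN'],
    },
}

for found_path, found_value in find_paths(sample, 'ADAMS'):
    print(' > '.join(found_path), found_value)
```

Because the path is built up from the keys, each hit reports the category and ISBN it was found under without the function needing to know how many levels the dictionary has.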