Python Forum
What is distinction between 'sent3' and 'set(sent3)'? - Printable Version




What is distinction between 'sent3' and 'set(sent3)'? - AOCL1234 - Jul-08-2020

What is the distinction between 'sent3' and 'set(sent3)', 'sent2' and 'set(sent2)', etc.? 'sent3' returns the tokens of sentence 3 in order, which makes sense, but the output of 'set(sent3)' is in an unexpected order. Here is the code (caveat: I am a novice at Python tags):

>>> sent3
[output]['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.'][/output]
>>> set(sent3)
[output]{'created', '.', 'and', 'the', 'heaven', 'earth', 'God', 'In', 'beginning'}[/output]
>>> sent2
[output]['The', 'family', 'of', 'Dashwood', 'had', 'long', 'been', 'settled', 'in', 'Sussex', '.'][/output]
>>> set(sent2)
[output]{'settled', 'long', 'been', 'of', 'The', 'Dashwood', 'had', 'in', 'Sussex', '.', 'family'}[/output]



RE: What is distinction between 'sent3' and 'set(sent3)'? - deanhystad - Jul-09-2020

Sets are unordered. That is why they are not subscriptable. Try this:
a_set = {1, 2, 3, 4, 5}
a_set[0]  # raises TypeError: 'set' object is not subscriptable
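
A minimal sketch of what this means for the sentence in the question: building a set drops duplicates (the three copies of 'the' collapse into one) and does not keep the list order. If the distinct tokens are wanted in a predictable order, sorting the set or using `dict.fromkeys` (dicts keep insertion order in Python 3.7+) are two options. The list is copied from the `sent3` output above; `vocab`, `alphabetical` and `unique_in_order` are illustrative names only.

```python
sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the',
         'heaven', 'and', 'the', 'earth', '.']

vocab = set(sent3)             # duplicates removed, no defined order
print(len(sent3), len(vocab))  # number of tokens vs. number of distinct tokens

# Membership tests work on a set; indexing does not.
print('heaven' in vocab)

try:
    vocab[0]
except TypeError as err:
    print('TypeError:', err)

# Two ways to get a predictable order back (illustrative names):
alphabetical = sorted(vocab)                  # sorted list of distinct tokens
unique_in_order = list(dict.fromkeys(sent3))  # first-seen order, Python 3.7+
print(unique_in_order)
```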



RE: What is distinction between 'sent3' and 'set(sent3)'? - AOCL1234 - Jul-09-2020

OK. Understood. The question extends to the following operations. If 'set(sent3)' generates the distinct token total for sentence 3, and 'set(text1)' generates the distinct token total for text 1, clearly the former operation is less than the latter. However, when I substitute 'sent3' for 'sent2' [i.e. set(sent2) < set(text1)] why is the output 'false', when the former operation [i.e. set(sent3) < set(text1)] is 'true'? Irrespective of which sentence, the vocabulary total for these 2 sentences (or any singular sentence as a general rule) will always be less than the vocabulary total of an entire text [i.e. set(text1)]. The code is below:

>>> set(sent2) < set(text1)
[output]False[/output]
>>> set(sent3) < set(text1)
[output]True[/output]
>>> set(sent4) < set(text1)
[output]False[/output]
>>> set(sent5) < set(text1)
[output]False[/output]
>>> set(sent6) < set(text1)
[output]False[/output]
>>> set(sent7) < set(text1)
[output]False[/output]
>>> set(sent8) < set(text1)
[output]False[/output]
>>> set(sent9) < set(text1)
[output]False[/output]



RE: What is distinction between 'sent3' and 'set(sent3)'? - deanhystad - Jul-09-2020

What does set_a > set_b even mean? Lists and tuples compare element by element, but since a set has no order, it makes no sense to compare elements. You say "distinct token total", but what does that mean? setA > setB if setA has more things? That is not the basis of comparison for any of the other collection types.

To be honest I am surprised that > and < don't throw an error when used with sets. The result is meaningless. Try this:
x = {'a', 'b', 'c', 'd', 'e'}
y = {'f', 'd', 'c', 'b', 'a'}
print(x > y)
print(y > x)
print(x == y)
Equal works if both sets have the same elements, but the code above returns:
Output:
False
False
False



RE: What is distinction between 'sent3' and 'set(sent3)'? - Yoriz - Jul-09-2020

Maybe you mean to check the number of items:
list1 = [1, 2, 1, 3, 4]
set1 = set(list1)
print(len(list1), len(set1))
print(len(list1) > len(set1))
Output:
5 4
True



RE: What is distinction between 'sent3' and 'set(sent3)'? - bowlofred - Jul-11-2020

(Jul-09-2020, 03:51 AM)deanhystad Wrote: To be honest I am surprised that > and < don't throw an error when used with sets. The result is meaningless.

When used with sets, the < and > operators test for proper subset and proper superset.


>>> set([3]) < set([4, 5])  # first is not a subset of the second
False
>>> set([5]) < set([4, 5])  # first is a subset of the second
True
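
Applied to the original question, this would explain the False results: `set(sent2) < set(text1)` is True only if every token of the sentence also occurs in the text (and the text has at least one more token). It does not compare sizes. Here is a small sketch with made-up stand-ins for a sentence and a text; `sentence` and `text` are hypothetical lists, not the actual data from the question. Set difference shows which tokens break the subset test:

```python
# Hypothetical stand-ins for a sentence and a longer text.
sentence = ['The', 'family', 'of', 'Dashwood', '.']
text = ['Call', 'me', 'Ishmael', '.', 'The', 'whale', 'of', 'the', 'sea', '.']

sent_vocab = set(sentence)
text_vocab = set(text)

# Proper-subset test: False, because some sentence tokens are not in the text.
print(sent_vocab < text_vocab)

# Set difference lists the tokens that break the subset relation.
missing = sent_vocab - text_vocab
print(sorted(missing))

# Comparing vocabulary sizes is a different question and needs len().
print(len(sent_vocab) < len(text_vocab))
```

With the real data, evaluating `set(sent2) - set(text1)` should list whichever tokens of the sentence never appear in the text, and an empty result would correspond to the one comparison that returned True.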