Python Forum
[split] Time Complexity of Counting
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
[split] Time Complexity of Counting
#1
micseydel speaking here: this is a split from https://python-forum.io/Thread-List-show...-instances

(Jan-18-2017, 07:22 PM)wavic Wrote: I was thinking that this is homework   

Very generously  
Yes, except based on his previous post,
(Jan-18-2017, 11:00 AM)Pie_Fu n Wrote: ok i've got it sorted
if its of benefit to anybody else
he thinks he has already solved the problem.

Therefore he needs to be shown how to do it correctly so he doesn't think the previous attempt was the best way.
I'm not going to obfuscate a couple lines of code.

Also your suggestion,
wavic Wrote:You need to count an element in the incrementing slices of the first list and append the result for an element to the new list.
would be O(n^2) at best.

Edit: Adding my code to this post so people don't need to link back.
My suggested implementation from the previous thread:
sequence = 48,52,55,48,52,55,60,62,48
 
result = []
cum_count = {}
for num in sequence:
    cum_count[num] = cum_count.get(num, 0) + 1
    result.append(cum_count[num])
 
print(result)
Output:
[1, 1, 1, 2, 2, 2, 1, 1, 3]
Reply
#2
I came with
for s, i in enumerate(l,1):               
    c = Counter(l[:s])                                         
    res.append(c[i])   
However, how do you get this O(n^2)? How can I do the same? 
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply
#3
What you wrote is O(n^2).  The version I wrote is O(n).
Counting all elements (even of a sublist) every time through the loop is inefficient and pushes the algorithm to quadratic time.

Also note you could just use list.count (or tuple.count depending) for your method:
sequence = 48,52,55,48,52,55,60,62,48
result = [sequence[:i].count(val) for i,val in enumerate(sequence, 1)] 
but this is still O(n^2) which should be avoided.
Reply
#4
Is there a guidance how to to do it my self? How to determine the complication of the code
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply
#5
Micseydel wrote a guide for determining time complexity.  You should start there and then ask if you are confused on anything:
https://python-forum.io/Thread-Efficiency-Crash-Course
Reply
#6
Thanks for the plug, Mek.

@wavic:
To dig into it a bit more, the code
for s, i in enumerate(l, 1):
    l[:s]
is quadratic (O(n**2)) because slicing is a linear operation. During the first iteration, n work is done, during the second is (n-1) etc. down to 1. On average, n/2 work is done within the loop, but the halving doesn't matter.

Given, f(x)=x*x/2, doubling x results in a more than doubling of f(x), as shown below
f(2) = 2
f(4) = 8
f(8) = 32

The reason Mek's solution is O(n) is that within the loop (which runs n times), only constant-time operations are taken. Operating on a dictionary is efficient (O(1)) so long as you're not doing other linear operations, like collection.count().
Reply
#7
Thank you! I'll see it tomorrow. It's midnight here Coffee
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply
#8
Quote:My suggested implementation from the previous thread:
sequence = 48,52,55,48,52,55,60,62,48
 
result = []
cum_count = {}
for num in sequence:
    cum_count[num] = cum_count.get(num, 0) + 1
    result.append(cum_count[num])
 
print(result)
Output:
[1, 1, 1, 2, 2, 2, 1, 1, 3]
I think this program is not of complexity O(n); You have assumed that
dict.get(num,0)
is of constant complexity. However this is not the case; The get method has to go through the dictionary keys every time it is called. Your code has a worst case complexity of O(n^2). Correct me if I'm wrong
Reply
#9
According to https://wiki.python.org/moin/TimeComplexity time complexity for get item from a dict is:
  • O(1) in Average Case
  • O(n) in Amortized Worst Case

look at the very bottom of the page
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#10
The following code has complexity O(n log(n)) like sorting
import itertools as itt
import operator as op

def solve(problem):
    es = sorted((x, i) for i, x in enumerate(problem))
    t = sorted((x[1], i)
         for k, g in itt.groupby(es, key = op.itemgetter(0))
         for i, x in enumerate(g, 1))
    return [x[1] for x in t]

if __name__ == '__main__':
    problem = [48,52,55,48,52,55,60,62,48]
    expected = [1,1,1,2,2,2,1,1,3]
    assert solve(problem) == expected
Of course, you can reach the same asymptotic complexity by using a sorted dictionary based on binary search instead of the dict class. This module probably provides the needed SortedDict class.

But here, we do it with the standard library alone! For most practical purposes, the dict based solution is probably the best one although it's not O(n log(n))
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  counting lines in split data Skaperen 6 1,348 Oct-07-2022, 07:09 PM
Last Post: Skaperen
  Split gps files based on time (text splitting) dervast 0 1,850 Nov-09-2020, 09:19 AM
Last Post: dervast
  What is the run time complexity of this code and please explain? samlee916 2 2,257 Nov-06-2020, 02:37 PM
Last Post: deanhystad
  Basic Time Complexity Help Adakip 1 1,805 Aug-25-2019, 09:12 AM
Last Post: ThomasL
  Time complexity of dict vs attributes heras 6 3,704 Aug-24-2018, 10:18 PM
Last Post: Gribouillis
  How to calculate the mccabe complexity in module falke8? fuqianz 1 3,101 Oct-16-2017, 02:08 AM
Last Post: sparkz_alot

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020