Python Forum
How can I make this as computationally efficient as possible?
#4
I think you can speed things up by generating the individual matching lines first. The following code prints a record for each word that appears in all five columns. Each record contains five lists, one per column of sa_data, holding all the indexes where the word appears in that column.

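from collections import defaultdict, namedtuple

# One record per word: the word and, for each column, the row indexes where it occurs
Record = namedtuple('Record', 'word rownums')

def word_idx_dict(seq):
    """Map each item of seq to the list of indexes where it occurs."""
    di = defaultdict(list)
    for i, k in enumerate(seq):
        di[k].append(i)
    return di

def line_matches(sa_data):
    """Yield a Record for every word that appears in all the columns of sa_data."""
    # one index dictionary per column
    dics = [word_idx_dict(seq) for seq in sa_data]
    # get words appearing in all the columns
    s = set(dics[0])
    for d in dics[1:]:
        s &= set(d)
    # sorted so that the records come out in a reproducible order
    for w in sorted(s):
        yield Record(word=w, rownums=[d[w] for d in dics])

# sa_data is the sequence of columns from the original question
for rec in line_matches(sa_data):
    print(rec)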
This sequence of records should be fast to generate, since each column is scanned only once, and I think it is a better starting point than the raw sa_data array.
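To make the shape of the records concrete, here is a minimal sketch on made-up data. The `sample` list is only a hypothetical stand-in for sa_data (five short columns of one-letter words), and the `itertools.product` step at the end is an assumption about what you might do next with a record, not something taken from the original question.

```python
from collections import defaultdict, namedtuple
from itertools import product

Record = namedtuple('Record', 'word rownums')

def word_idx_dict(seq):
    # map each word to the list of row indexes where it occurs
    di = defaultdict(list)
    for i, k in enumerate(seq):
        di[k].append(i)
    return di

def line_matches(sa_data):
    dics = [word_idx_dict(seq) for seq in sa_data]
    # words present in every column
    common = set(dics[0]).intersection(*dics[1:])
    for w in sorted(common):
        yield Record(word=w, rownums=[d[w] for d in dics])

# Hypothetical stand-in for sa_data: five columns, four rows each
sample = [
    ['a', 'b', 'a', 'c'],
    ['b', 'a', 'd', 'a'],
    ['a', 'a', 'b', 'e'],
    ['f', 'a', 'b', 'b'],
    ['b', 'g', 'a', 'a'],
]

records = list(line_matches(sample))
for rec in records:
    print(rec)

# Assumed follow-up step: enumerate every combination of row indexes
# (one index per column) in which the word 'b' lines up.
combos = list(product(*records[1].rownums))
print(combos)
```

Only 'a' and 'b' occur in all five columns here, so only two records come out. Words such as 'c' or 'f' that are missing from at least one column are dropped before any further work is done, which is where the saving comes from.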
Messages In This Thread
RE: How can I make this as computationally efficient as possible? - by Gribouillis - Apr-15-2018, 10:05 AM
