Shouldn't merchant names be like: UA UNION SQUARE 14, 7-ELEVEN 18594, JCPENNEY 1330, Netflix.com, GEICO *AUTO, FARMERS INS BILLIN?
Or should merchants be UA, 7-ELEVEN, JCPENNEY, Netflix.com, GEICO, FARMERS (i.e. the first word after the date)?
One way to achieve these results:
>>> lst = [
...     'PURCHASE  AUTHORIZED ON  09/28 UA UNION SQUARE 14  NEW YORK  NY  S388272071655085  CARD 0057',
...     'PURCHASE  AUTHORIZED ON  09/28 7-ELEVEN 18594  ARVADA  CO  S588272206422481  CARD 7621',
...     'PURCHASE  AUTHORIZED ON  09/30 JCPENNEY 1330  CORPUS CHRIST TX  S468273721740671  CARD 8143',
... ]
>>> # The fields are separated by runs of spaces, so split on a double space
>>> # (a single-space split would break every word apart).
>>> splitted = [[word.strip() for word in row.split('  ') if word] for row in lst]
>>> splitted
[['PURCHASE', 'AUTHORIZED ON', '09/28 UA UNION SQUARE 14', 'NEW YORK', 'NY', 'S388272071655085', 'CARD 0057'], ['PURCHASE', 'AUTHORIZED ON', '09/28 7-ELEVEN 18594', 'ARVADA', 'CO', 'S588272206422481', 'CARD 7621'], ['PURCHASE', 'AUTHORIZED ON', '09/30 JCPENNEY 1330', 'CORPUS CHRIST TX', 'S468273721740671', 'CARD 8143']]
>>> # The third field is "<date> <merchant>"; drop the date to get the full merchant name.
>>> merchants = [' '.join(row[2].split()[1:]) for row in splitted]
>>> merchants
['UA UNION SQUARE 14', '7-ELEVEN 18594', 'JCPENNEY 1330']
>>> # Or keep only the first word of each merchant name.
>>> first_word = [row.split()[0] for row in merchants]
>>> first_word
['UA', '7-ELEVEN', 'JCPENNEY']
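If you'd rather not depend on the position of the field in the split list, the same idea can be sketched with a regular expression. This is only a rough sketch: the names `MERCHANT_RE`, `extract_merchant` and `first_word_of` are made up for illustration, and it assumes every row looks like the samples above (an `ON MM/DD` date, then the merchant, then a gap of two or more spaces before the city).

```python
import re

# Assumption: rows look like "... ON MM/DD <merchant>  <city> ..." where the
# merchant field ends at the first run of two or more whitespace characters.
MERCHANT_RE = re.compile(r"ON\s+\d{2}/\d{2}\s+(?P<merchant>.+?)\s{2,}")


def extract_merchant(description):
    """Return the full merchant field, or None if the row does not match."""
    match = MERCHANT_RE.search(description)
    return match.group("merchant") if match else None


def first_word_of(merchant):
    """Return only the first word of a merchant name."""
    return merchant.split()[0]


# Build a sample row with explicit double-space separators between fields.
sample = "  ".join([
    "PURCHASE", "AUTHORIZED ON", "09/28 UA UNION SQUARE 14",
    "NEW YORK", "NY", "S388272071655085", "CARD 0057",
])

merchant = extract_merchant(sample)
print(merchant)
print(first_word_of(merchant))
```

Rows that don't follow that layout return None instead of raising, so you can spot them and decide how to handle them separately.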
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.