Python Forum
Print element of list if string included = element of another list
#1
Hi :)

Python 3.6
Windows 7 64 bits
Pycharm Community 2018.1

In the following script, I want each element of list 'g_liste' to be searched for in list 'g_era_complet_liste'; if it is found, the corresponding element of 'g_era_complet_liste' should be printed:

#-*- coding: utf-8 -*-

g_liste = ['G-12','G-422','G-562','G-1803','G-2534']
print(g_liste)

g_era_complet_liste = ['gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304', 'gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304', 'gep_g;G-6529;R003341713;http://dx.doi.org/10.3931/era-33195;http://www.era.ch/titlepage/doi/10.3931/era-33195/304', 'gep_g;G-6530;2035543;http://dx.doi.org/10.3931/era-33196;http://www.e-rara.ch/titlepage/doi/10.3931/era-33196/304', 'gep_g;G-6536;R005406255;http://dx.doi.org/10.3931/era-33198;http://www.e-rara.ch/titlepage/doi/10.3931/era-33198/304']
for g_nb in g_liste:
    for g_era in g_era_complet_liste:
        if str(g_nb) in g_era:
            print(g_era)
Here the result is correct:
Output:
['G-12', 'G-422', 'G-562', 'G-1803', 'G-2534']
gep_g;G-1803;828333;http://dx.doi.org/10.3931/ea-3900;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3900/304
gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3902/304
But when list 'g_era_complet_liste' contains many elements, the result is wrong. For example, in the following output, 'G-123' is not an element of list 'g_liste', yet the printout includes the element of 'g_era_complet_liste' that contains 'G-123' (first record):
Output:
['G-12', 'G-422', 'G-562', 'G-1803', 'G-2534']
gep_g;G-123;239631;http://dx.doi.org/10.3931/e-rara-5708;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5708/304
gep_g;G-124;R213744060;http://dx.doi.org/10.3931/e-rara-5709;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5709/304
gep_g;G-1277;R005004932;http://dx.doi.org/10.3931/e-rara-5886;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5886/304
gep_g;G-120;239526;http://dx.doi.org/10.3931/e-rara-1043;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-1043/304
gep_g;G-121;239629;http://dx.doi.org/10.3931/e-rara-1044;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-1044/304
gep_g;G-126;R213771160;http://dx.doi.org/10.3931/e-rara-1045;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-1045/304
gep_g;G-122;R213783960;http://dx.doi.org/10.3931/e-rara-2459;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-2459/304
gep_g;G-1287;R005004948;http://dx.doi.org/10.3931/e-rara-2698;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-2698/304
lac1_g;G-1216;;http://dx.doi.org/10.3931/e-rara-9617;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-9617/304
lac1_g;G-1240;;http://dx.doi.org/10.3931/e-rara-9618;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-9618/304
nev_g;G-1202;R005374316;http://dx.doi.org/10.3931/e-rara-33155;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-33155/304
gep_g;G-422;239553;http://dx.doi.org/10.3931/e-rara-5763;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5763/304
gep_g;G-5620;R005389342;http://dx.doi.org/10.3931/e-rara-6944;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-6944/304
lac1_g;G-5627;;http://dx.doi.org/10.3931/e-rara-9698;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-9698/304
mhr_g;G-562;R005356211;http://dx.doi.org/10.3931/e-rara-12669;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-12669/304
gep_g;G-1803;828333;http://dx.doi.org/10.3931/e-rara-3900;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3900/304
gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/e-rara-3902;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3902/304
What is wrong with my script? Any suggestions for a fix?

Thanks :)
#2
You are checking whether one string occurs inside another string. 'G-12' is a substring of 'G-123', so your code behaves exactly as written; it is your logic that is wrong.
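
To make the difference concrete, here is a minimal sketch using one record from the output above. The names `record`, `substring_hit` and `exact_hit` are only illustrative and do not come from the original script.

```python
# One record copied from the output in the question.
record = 'gep_g;G-123;239631;http://dx.doi.org/10.3931/e-rara-5708;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5708/304'

# `in` between two strings is a substring test, so 'G-12' is found inside 'G-123'.
substring_hit = 'G-12' in record

# Comparing against the whole second field only matches identical identifiers.
fields = record.split(';')
exact_hit = fields[1] == 'G-12'

print(substring_hit, exact_hit)  # True False
```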

You also check every record of the first list against every record of the second, which is inefficient. Overall, given the structure of your strings, I would suggest the following (side comment: English variable names would be a good step towards helping your potential helpers). I use a set for the checklist because membership tests on a set are faster than on a list.

check_list = {'G-12','G-422','G-562','G-1803','G-2534'}
# data_list is your list of records (g_era_complet_liste in your script)
filtered_data = [data for data in data_list if data.split(';', 2)[1] in check_list]
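
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. The four records are copied from the output in the question; the helper name `filter_records` is hypothetical, and the sketch assumes every record has at least two ';'-separated fields.

```python
check_list = {'G-12', 'G-422', 'G-562', 'G-1803', 'G-2534'}

# A few records copied from the question's output.
data_list = [
    'gep_g;G-123;239631;http://dx.doi.org/10.3931/e-rara-5708;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5708/304',
    'gep_g;G-422;239553;http://dx.doi.org/10.3931/e-rara-5763;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5763/304',
    'gep_g;G-5620;R005389342;http://dx.doi.org/10.3931/e-rara-6944;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-6944/304',
    'mhr_g;G-562;R005356211;http://dx.doi.org/10.3931/e-rara-12669;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-12669/304',
]


def filter_records(records, wanted):
    """Keep the records whose second ';'-separated field is in `wanted`."""
    return [record for record in records if record.split(';', 2)[1] in wanted]


filtered_data = filter_records(data_list, check_list)
for record in filtered_data:
    print(record)
```

Only the 'G-422' and 'G-562' records are printed; 'G-123' and 'G-5620' are skipped because the whole field is compared, not a substring. Each record is also visited once, instead of once per entry in the checklist.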
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you an advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you an advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
#3
Dear Minister of Silly Walks,

Your logic is great, thank you for your help.

If you have a moment to explain in a few words what the following code actually does, I would be even more grateful:

data.split(';', 2)[1]
#4
(Jul-27-2018, 12:27 PM)silfer Wrote:
data.split(';', 2)[1]

Elementary, Watson! Or should I say, Unladen Swallow?!

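str.split splits a string on the separator given as the first argument. The second argument (maxsplit) limits the number of splits, i.e. here it stops splitting once the second occurrence of the separator ';' has been encountered, and the rest of the string is returned unsplit as the last element.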

RTM

Since you need the second field of the string, 2 splits will do. Of course, you don't have to limit the number of splits, but my previous incarnation as an embedded engineer hates inefficient code.
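
A small sketch of how the split limit changes the result, using the first record from the question. The variable names are illustrative only.

```python
record = 'gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304'

no_limit = record.split(';')       # splits at every ';'
two_splits = record.split(';', 2)  # at most 2 splits: 3 parts, the rest stays in one piece
one_split = record.split(';', 1)   # at most 1 split: 2 parts

print(len(no_limit), len(two_splits), len(one_split))  # 5 3 2
print(two_splits[1])  # G-1803
print(one_split[1])   # everything after the first ';', not just the identifier
```

With only one split, index 1 would hold the whole remainder of the record, which is why two splits is the smallest limit that isolates the second field.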
#5
Thank you again, Dear Sherlock.

I am neither a former nor a future engineer, so I needed to try the following kind of code to understand:

check_list = {'G-12', 'G-422', 'G-562', 'G-1803', 'G-2534'}
data_list = ['gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304', 'gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304']  # etc.
for data in data_list:
    data_split = data.split(';',2)
    print(data_split)
Output:
['gep_g', 'G-1803', '828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304']
['gep_g', 'G-2534', 'R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304']
etc.
check_list = {'G-12', 'G-422', 'G-562', 'G-1803', 'G-2534'}
data_list = ['gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304', 'gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304']  # etc.
for data in data_list:
    data_split = data.split(';',2)[1]
    print(data_split)
Output:
G-1803
G-2534
etc.
By the way, for the time being I still find reading the Python documentation (the reference in your previous post) very difficult. Some hard (and, at my age, unreasonable?) work lies ahead.

À la prochaine fois (French variable name) :)