Python Forum
Print element of list if string included = element of another list - Printable Version




Print element of list if string included = element of another list - silfer - Jul-26-2018

Hi :)

Python 3.6
Windows 7 64 bits
Pycharm Community 2018.1

In the following script, I want each element of the list 'g_liste' to be searched for in the list 'g_era_complet_liste'; if it is found, the corresponding element of 'g_era_complet_liste' should be printed:

#-*- coding: utf-8 -*-

g_liste = ['G-12','G-422','G-562','G-1803','G-2534']
print(g_liste)

g_era_complet_liste = ['gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304', 'gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304', 'gep_g;G-6529;R003341713;http://dx.doi.org/10.3931/era-33195;http://www.era.ch/titlepage/doi/10.3931/era-33195/304', 'gep_g;G-6530;2035543;http://dx.doi.org/10.3931/era-33196;http://www.e-rara.ch/titlepage/doi/10.3931/era-33196/304', 'gep_g;G-6536;R005406255;http://dx.doi.org/10.3931/era-33198;http://www.e-rara.ch/titlepage/doi/10.3931/era-33198/304']
for g_nb in g_liste:
    for g_era in g_era_complet_liste:
        if str(g_nb) in g_era:
            print(g_era)
Here the result is correct:
Output:
['G-12', 'G-422', 'G-562', 'G-1803', 'G-2534']
gep_g;G-1803;828333;http://dx.doi.org/10.3931/ea-3900;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3900/304
gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3902/304
But when the list 'g_era_complet_liste' contains many elements, the result is wrong. In the following output, for example, 'G-123' is not an element of the list 'g_liste', yet the printed lines include the element of 'g_era_complet_liste' that contains 'G-123' (first line):
Output:
['G-12', 'G-422', 'G-562', 'G-1803', 'G-2534']
gep_g;G-123;239631;http://dx.doi.org/10.3931/e-rara-5708;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5708/304
gep_g;G-124;R213744060;http://dx.doi.org/10.3931/e-rara-5709;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5709/304
gep_g;G-1277;R005004932;http://dx.doi.org/10.3931/e-rara-5886;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5886/304
gep_g;G-120;239526;http://dx.doi.org/10.3931/e-rara-1043;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-1043/304
gep_g;G-121;239629;http://dx.doi.org/10.3931/e-rara-1044;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-1044/304
gep_g;G-126;R213771160;http://dx.doi.org/10.3931/e-rara-1045;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-1045/304
gep_g;G-122;R213783960;http://dx.doi.org/10.3931/e-rara-2459;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-2459/304
gep_g;G-1287;R005004948;http://dx.doi.org/10.3931/e-rara-2698;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-2698/304
lac1_g;G-1216;;http://dx.doi.org/10.3931/e-rara-9617;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-9617/304
lac1_g;G-1240;;http://dx.doi.org/10.3931/e-rara-9618;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-9618/304
nev_g;G-1202;R005374316;http://dx.doi.org/10.3931/e-rara-33155;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-33155/304
gep_g;G-422;239553;http://dx.doi.org/10.3931/e-rara-5763;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5763/304
gep_g;G-5620;R005389342;http://dx.doi.org/10.3931/e-rara-6944;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-6944/304
lac1_g;G-5627;;http://dx.doi.org/10.3931/e-rara-9698;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-9698/304
mhr_g;G-562;R005356211;http://dx.doi.org/10.3931/e-rara-12669;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-12669/304
gep_g;G-1803;828333;http://dx.doi.org/10.3931/e-rara-3900;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3900/304
gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/e-rara-3902;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-3902/304
What is wrong with my script? Any suggestions for a correction?

Thanks :)


RE: Print element of list if string included = element of another list - volcano63 - Jul-26-2018

You are checking whether a string occurs inside another string. 'G-12' is a substring of 'G-123', so your code behaves correctly; it is your logic that is wrong.
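A minimal sketch of the difference, using one record from the output above. The variable names (record, substring_hit, identifier, exact_hit) are only illustrative and are not part of the original script:

```python
# One record taken from the output posted above.
record = 'gep_g;G-123;239631;http://dx.doi.org/10.3931/e-rara-5708;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5708/304'

# 'in' between two strings is a substring search,
# so 'G-12' is found inside 'G-123'.
substring_hit = 'G-12' in record

# Comparing against the second ';'-separated field
# requires the whole identifier to be equal.
identifier = record.split(';')[1]
exact_hit = identifier == 'G-12'

print(substring_hit)  # True
print(identifier)     # G-123
print(exact_hit)      # False
```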

You also check every record in the first list against every record of the second, which is inefficient. On the whole, considering the structure of your strings, I would suggest the following (side comment: English variable names would be a good step towards helping your potential helpers). I use a set for the checklist because membership tests on a set are faster than on a list.

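# Identifiers to look for, stored in a set for fast membership tests
check_list = {'G-12','G-422','G-562','G-1803','G-2534'}
# data_list is your list of records (g_era_complet_liste in your script);
# keep a record only if its second ';'-separated field is in check_list
filtered_data = [data for data in data_list if data.split(';', 2)[1] in check_list]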
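For anyone who wants to run this as is, here is a self-contained sketch. The four records are copied from the output in the first post; data_list stands in for g_era_complet_liste, and the sketch assumes every record has at least two ';'-separated fields (a shorter record would raise IndexError):

```python
check_list = {'G-12', 'G-422', 'G-562', 'G-1803', 'G-2534'}

# A few records copied from the first post, including the two
# that the substring test matched by mistake (G-123 and G-5620).
data_list = [
    'gep_g;G-123;239631;http://dx.doi.org/10.3931/e-rara-5708;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5708/304',
    'gep_g;G-422;239553;http://dx.doi.org/10.3931/e-rara-5763;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-5763/304',
    'gep_g;G-5620;R005389342;http://dx.doi.org/10.3931/e-rara-6944;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-6944/304',
    'mhr_g;G-562;R005356211;http://dx.doi.org/10.3931/e-rara-12669;http://www.e-rara.ch/titlepage/doi/10.3931/e-rara-12669/304',
]

# Keep only the records whose identifier field is exactly in check_list.
filtered_data = [data for data in data_list if data.split(';', 2)[1] in check_list]

for data in filtered_data:
    print(data)

# Identifiers of the records that were kept.
kept_ids = [data.split(';', 2)[1] for data in filtered_data]
print(kept_ids)  # ['G-422', 'G-562']
```

Only the G-422 and G-562 records are printed; G-123 and G-5620 no longer slip through.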



RE: Print element of list if string included = element of another list - silfer - Jul-27-2018

Dear Minister of Silly Walks,

Your logic is great, thank you for your help.

If you have a little more time to explain in a few words what the following code is actually doing, I would be even more grateful:

data.split(';', 2)[1]



RE: Print element of list if string included = element of another list - volcano63 - Jul-27-2018

(Jul-27-2018, 12:27 PM)silfer Wrote:
data.split(';', 2)[1]

Elementary, Watson! Or should I say - Unladen Swallow?!

str.split splits a string on the separator given as the first argument; the second argument limits the number of splits, i.e. it stops after the second occurrence of the separator (here ;) is encountered.

RTM

Since you need the second element of the string, 2 splits will do. Of course, you don't have to limit the number of splits, but my previous incarnation as an embedded engineer hates inefficient code.
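A small sketch of how the second argument changes the result, using one record from the first post. The names no_limit, two_splits and one_split are only for illustration:

```python
data = 'gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304'

# No limit: the string is split at every ';'.
no_limit = data.split(';')

# At most 2 splits: two clean fields, then the rest of the string untouched.
two_splits = data.split(';', 2)

# At most 1 split: the second element still contains everything after the first ';'.
one_split = data.split(';', 1)

print(len(no_limit))    # 5
print(len(two_splits))  # 3
print(two_splits[1])    # G-1803
print(one_split[1] == two_splits[1])  # False
```

With a single split, index 1 would hold the identifier plus the rest of the record, which is why two splits is the smallest limit that isolates the second field.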


RE: Print element of list if string included = element of another list - silfer - Jul-30-2018

Thank you again, Dear Sherlock.

I am neither a former nor a future engineer, so I needed to try the following kind of code to understand:

check_list = {'G-12', 'G-422', 'G-562', 'G-1803', 'G-2534'}
data_list = ['gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304', 'gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304']  # etc.
for data in data_list:
    data_split = data.split(';',2)
    print(data_split)
Output:
['gep_g', 'G-1803', '828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304']
['gep_g', 'G-2534', 'R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304']
etc.
check_list = {'G-12', 'G-422', 'G-562', 'G-1803', 'G-2534'}
data_list = ['gep_g;G-1803;828333;http://dx.doi.org/10.3931/era-3900;http://www.era.ch/titlepage/doi/10.3931/era-3900/304', 'gep_g;G-2534;R005180150;http://dx.doi.org/10.3931/era-3902;http://www.era.ch/titlepage/doi/10.3931/era-3902/304']  # etc.
for data in data_list:
    data_split = data.split(';',2)[1]
    print(data_split)
Output:
G-1803
G-2534
etc.
By the way, for the time being, I still find reading the Python documentation (the reference in your previous post) very uncomfortable. Some hard (and, at my age, perhaps unreasonable?) work lies ahead.

A la prochaine fois (French variable name) :)