Python Forum
Match and extract if found - Printable Version



Match and extract if found - Calli - Sep-07-2022

For instance, I have a list of IP addresses in one file, IPfile1.txt:

192.168.0.1
192.168.0.2
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.4
192.168.0.5
192.168.0.6

And I have another file, IP2.txt:

192.168.0.799
192.168.0.900
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.1000
192.168.0.5
192.168.0.83

What I want to achieve is to scan IPfile1.txt using IP2.txt: if an IP address from IP2.txt is found in IPfile1.txt, remove it, and output the IPs that are not in IPfile1.txt, which in this case are:
192.168.0.799
192.168.0.900
2001:4800:7819:104:be76:4eff:fe04:5809
192.168.0.1000
192.168.0.83

And skip anything which isn't an IP address, which in this case is 2001:4800:7819:104:be76:4eff:fe04:5809

Can anyone help me with this program?


RE: Match and extract if found - Gribouillis - Sep-07-2022

Start by creating a 'set' instance containing all the IP addresses found in file 1.
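
A minimal sketch of that first step, assuming the file holds one address per line. The helper name load_address_set and the in-memory sample (standing in for IPfile1.txt) are made up for illustration and are not from the thread.

```python
from io import StringIO


def load_address_set(lines):
    """Return a set of the non-empty, stripped lines (hypothetical helper)."""
    return {line.strip() for line in lines if line.strip()}


# Stand-in for open("IPfile1.txt"); any iterable of lines works.
sample_file1 = StringIO("192.168.0.1\n192.168.0.2\n192.168.0.3\n")
known = load_address_set(sample_file1)

# Membership tests on a set take roughly constant time on average.
print("192.168.0.3" in known)    # True
print("192.168.0.799" in known)  # False
```

With the set in place, each line of the second file can be checked with a single `in` test instead of scanning a list.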


RE: Match and extract if found - menator01 - Sep-07-2022

Here is a long way of doing it. Much room for optimizing.

def compare(arg1, arg2):
    list1 = []
    list2 = []
    lastlist = []
    # Read the first file, skipping IPv6 addresses (any line containing ':')
    with open(arg1, 'r') as file:
        for line in file:
            if ':' not in line:
                list1.append(line.strip())

    # Read the second file the same way
    with open(arg2, 'r') as file2:
        for line in file2:
            if ':' not in line:
                list2.append(line.strip())

    # Keep the entries of the second file that do not appear in the first
    for line in list2:
        if line not in list1:
            lastlist.append(line)
    return lastlist

print(compare('IP1.txt', 'IP2.txt'))
Output:
['192.168.0.799', '192.168.0.900', '192.168.0.1000', '192.168.0.83']



RE: Match and extract if found - Calli - Sep-08-2022

(Sep-07-2022, 06:33 PM)menator01 Wrote: Here is a long way of doing it. Much room for optimizing.

Thanks johnny, but the output I get is just an empty list. What could have gone wrong?
python3 filter.py 
[]



RE: Match and extract if found - perfringo - Sep-08-2022

(Sep-07-2022, 05:00 PM)Calli Wrote: And skip anything which isn't an ip address which in this case is 2001:4800:7819:104:be76:4eff:fe04:5809

This is an IP address: an IPv6 address.

For dealing with IP addresses, Python has the built-in ipaddress module.

This is a no-effort question, so no code from me.
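
For readers who have not used that module, here is a small sketch of how it tells address types apart. The helper name classify is hypothetical; ip_address and the version attribute are part of the standard library.

```python
from ipaddress import ip_address


def classify(text):
    """Return 'IPv4', 'IPv6', or 'invalid' for a string (hypothetical helper)."""
    try:
        addr = ip_address(text.strip())
    except ValueError:
        # ip_address raises ValueError for anything it cannot parse
        return "invalid"
    return f"IPv{addr.version}"


for candidate in (
    "192.168.0.3",
    "2001:4800:7819:104:be76:4eff:fe04:5819",
    "192.168.0.799",  # octet above 255, so not a valid IPv4 address
):
    print(candidate, "->", classify(candidate))
```

Note that sample entries such as 192.168.0.799 from the question are rejected by the module, because every IPv4 octet must be in the range 0 to 255.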


RE: Match and extract if found - Calli - Sep-08-2022

(Sep-08-2022, 04:08 AM)perfringo Wrote: This is no-effort question so no code from me.

I am learning, and thanks for nothing, aivar.paalberg.


RE: Match and extract if found - DeaD_EyE - Sep-08-2022

Use a set, if the order is not required.

from io import StringIO
from ipaddress import ip_address, IPv4Address


# fake file for test
ip1_content = StringIO(
    """192.168.0.1
192.168.0.2
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.4
192.168.0.5
192.168.0.6"""
)


ip2_content = StringIO(
    """
192.168.0.799
192.168.0.900
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.1000
192.168.0.5
192.168.0.83"""
)


def make_ipv4_set(file_like):
    results = set()

    for line in map(str.strip, file_like):
        try:
            addr = ip_address(line)
        except ValueError:
            continue

        if isinstance(addr, IPv4Address):
            results.add(addr)
            # or if you want str
            # results.add(line)

    return results


# addresses in the second file that are not in the first, as the question asks
results = make_ipv4_set(ip2_content) - make_ipv4_set(ip1_content)
print(results)



RE: Match and extract if found - Calli - Sep-08-2022

(Sep-08-2022, 08:30 AM)DeaD_EyE Wrote: Use a set, if the order is not required.


Works like a charm, but is it possible to read the first and second lists from files? The first file (ip1_content = StringIO in your example) contains 1.4 million IPs and the second contains 700,253.

Output
{IPv4Address('192.168.0.799'), IPv4Address('192.168.0.800'), IPv4Address('192.168.0.900'), IPv4Address('1.117.250.1000')}
It would be great if the output could be written to a file, newIPlist.txt, instead of being displayed in the terminal.
Only the IP addresses are needed, not the IPv4Address text. Is that possible?
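
One possible way to get plain addresses into a file is sketched below. The helper name write_addresses and the sample data are assumptions for illustration; the only file name taken from the thread is newIPlist.txt.

```python
import os
import tempfile
from ipaddress import ip_address


def write_addresses(addresses, path):
    """Write one address per line as plain text (hypothetical helper)."""
    with open(path, "w") as out:
        for addr in addresses:
            # str() turns IPv4Address('10.0.0.1') into '10.0.0.1'
            out.write(str(addr) + "\n")


# Illustrative data; in practice this would be the computed set difference.
sample = sorted({ip_address("10.0.0.2"), ip_address("10.0.0.1")})

# A temporary directory is used here only so the sketch runs anywhere;
# pass "newIPlist.txt" directly to write into the current directory.
out_path = os.path.join(tempfile.mkdtemp(), "newIPlist.txt")
write_addresses(sample, out_path)

with open(out_path) as check:
    print(check.read())
```

Converting with str() is what drops the IPv4Address(...) wrapper seen when printing the set directly.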


RE: Match and extract if found - DeaD_EyE - Sep-08-2022

This makes the code shorter:

from ipaddress import IPv4Address, ip_address


def make_ipv4_set(file_like):
    results = set()

    for line in map(str.strip, file_like):
        try:
            addr = ip_address(line)
        except ValueError:
            continue

        if isinstance(addr, IPv4Address):
            results.add(addr)

    return results


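# parenthesized context managers are officially supported since Python 3.10
with (
    open("ip1_file.txt") as ip1_file,
    open("ip2_file.txt") as ip2_file,
):
    ip1_set = make_ipv4_set(ip1_file)
    ip2_set = make_ipv4_set(ip2_file)


# addresses in the second file that are missing from the first;
# sorting works because IPv4Address is sortable
results = sorted(ip2_set - ip1_set)
# results is now a list; sorted() always returns a list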

# iterate over the elements and print them
for ip in results:
    # converting IPv4Address to str
    print(str(ip))
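
To illustrate the comment about sorting, here is a short sketch with made-up sample addresses. It compares the order produced by sorting plain strings with the order produced by sorting IPv4Address objects.

```python
from ipaddress import IPv4Address

texts = ["10.0.0.2", "9.0.0.1", "10.0.0.10"]

# Plain strings sort character by character.
as_text = sorted(texts)

# IPv4Address objects compare by numeric value.
as_addresses = [str(ip) for ip in sorted(IPv4Address(t) for t in texts)]

print(as_text)       # ['10.0.0.10', '10.0.0.2', '9.0.0.1']
print(as_addresses)  # ['9.0.0.1', '10.0.0.2', '10.0.0.10']
```

So if the output is expected in numeric order, sorting the address objects and converting to text afterwards is the safer route.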



RE: Match and extract if found - Calli - Sep-08-2022

(Sep-08-2022, 11:45 AM)DeaD_EyE Wrote: This makes the code shorter:


I entered this IP, 44.44.44.44.11, in ip2_file.txt, and when I ran the program it didn't print this IP address. Could it be because we are treating the IPs as IPv4? Can we just treat all the IPs as text or numbers, then compare and extract those that aren't in the first file, ip1_file.txt?
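
For context: 44.44.44.44.11 has five dot-separated parts, so ip_address raises ValueError for it and the `except ValueError: continue` branch in the code above skips the line. A rough sketch of the text-only comparison asked about here follows. The helper name text_difference and the sample data are assumptions; nothing is validated, so malformed entries are kept as they are.

```python
from io import StringIO


def text_difference(first_lines, second_lines):
    """Return entries of second_lines absent from first_lines, in order.

    Entries are compared as plain stripped text with no validation
    (hypothetical helper, not from the thread).
    """
    seen = {line.strip() for line in first_lines}
    missing = []
    for line in second_lines:
        entry = line.strip()
        if entry and entry not in seen:
            missing.append(entry)
            seen.add(entry)  # also drops repeats within the second file
    return missing


# Stand-ins for open("ip1_file.txt") and open("ip2_file.txt").
first = StringIO("192.168.0.3\n192.168.0.5\n")
second = StringIO("192.168.0.799\n192.168.0.3\n44.44.44.44.11\n192.168.0.799\n")
print(text_difference(first, second))  # ['192.168.0.799', '44.44.44.44.11']
```

The trade-off is that text comparison treats differently written forms of the same address as different entries, which the ipaddress-based version would normalize.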