Python Forum

Full Version: Match and extract if found
For instance, I have a list of IP addresses in one file, IPfile1.txt:

192.168.0.1
192.168.0.2
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.4
192.168.0.5
192.168.0.6

And I have another file, IP2.txt:

192.168.0.799
192.168.0.900
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.1000
192.168.0.5
192.168.0.83

What I want to achieve is to scan IPfile1.txt using IP2.txt: if an IP address from IP2.txt is found in IPfile1.txt, remove it, and output the IPs that are not in IPfile1.txt, which in this case are:
192.168.0.799
192.168.0.900
2001:4800:7819:104:be76:4eff:fe04:5809
192.168.0.1000
192.168.0.83

And skip anything that isn't an IP address, which in this case is 2001:4800:7819:104:be76:4eff:fe04:5809

Can anyone help me with this program?
Start by creating a 'set' instance containing all the IP addresses found in file 1.
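A minimal sketch of that suggestion, assuming a plain text comparison is acceptable. The StringIO objects stand in for the two files, and the names known and missing are illustrative, not from the thread:

```python
from io import StringIO

# Stand-ins for the two files; real code would use open(...) instead.
file1 = StringIO("192.168.0.1\n192.168.0.2\n192.168.0.3\n")
file2 = StringIO("192.168.0.799\n192.168.0.3\n192.168.0.83\n")

# Set of every non-blank line in file 1: membership tests are fast.
known = {line.strip() for line in file1 if line.strip()}

# Lines of file 2 that are not in file 1, in their original order.
missing = [ip for ip in map(str.strip, file2) if ip and ip not in known]
print(missing)
```

A membership test against a set takes roughly constant time, which matters once the files hold hundreds of thousands of lines.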
Here is a long way of doing it. Much room for optimizing.

def compare(arg1, arg2):
    list1 = []
    list2 = []
    lastlist = []
    # Collect the non-blank lines of the first file, skipping IPv6 (lines with ':').
    with open(arg1, 'r') as file:
        for line in file:
            line = line.strip()
            if line and ':' not in line:
                list1.append(line)

    # Same for the second file.
    with open(arg2, 'r') as file2:
        for line in file2:
            line = line.strip()
            if line and ':' not in line:
                list2.append(line)

    # Keep the lines of the second file that do not appear in the first.
    for line in list2:
        if line not in list1:
            lastlist.append(line)
    return lastlist

print(compare('IPfile1.txt', 'IP2.txt'))
Output:
['192.168.0.799', '192.168.0.900', '192.168.0.1000', '192.168.0.83']
(Sep-07-2022, 06:33 PM)menator01 Wrote: [ -> ]Here is a long way of doing it. Much room for optimizing.

Thanks johnny, but this is the output I get. What could have gone wrong?
python3 filter.py 
[]
(Sep-07-2022, 05:00 PM)Calli Wrote: [ -> ]And skip anything which isn't an ip address which in this case is 2001:4800:7819:104:be76:4eff:fe04:5809

This is an IP address: an IPv6 address.

For dealing with IP addresses, Python has the built-in ipaddress module.

This is a no-effort question, so no code from me.
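For reference, a small sketch of what the ipaddress module reports for lines like the ones in the question. The helper name classify is made up for illustration. Note that 192.168.0.799 is rejected, because an IPv4 octet cannot exceed 255:

```python
from ipaddress import ip_address


def classify(text):
    """Return 'IPv4', 'IPv6', or 'invalid' for one line of text."""
    try:
        addr = ip_address(text.strip())
    except ValueError:
        # ip_address raises ValueError for anything that is not a valid address
        return "invalid"
    return f"IPv{addr.version}"


for sample in [
    "192.168.0.3",
    "2001:4800:7819:104:be76:4eff:fe04:5819",
    "192.168.0.799",
]:
    print(sample, "->", classify(sample))
```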
(Sep-08-2022, 04:08 AM)perfringo Wrote: [ -> ]

I am learning, and thanks for nothing, aivar.paalberg.
Use a set, if the order is not required.

from io import StringIO
from ipaddress import ip_address, IPv4Address


# fake file for test
ip1_content = StringIO(
    """192.168.0.1
192.168.0.2
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.4
192.168.0.5
192.168.0.6"""
)


ip2_content = StringIO(
    """
192.168.0.799
192.168.0.900
192.168.0.3
2001:4800:7819:104:be76:4eff:fe04:5819
192.168.0.1000
192.168.0.5
192.168.0.83"""
)


def make_ipv4_set(file_like):
    results = set()

    for line in map(str.strip, file_like):
        try:
            addr = ip_address(line)
        except ValueError:
            continue

        if isinstance(addr, IPv4Address):
            results.add(addr)
            # or if you want str
            # results.add(line)

    return results


# addresses in the second file that are not in the first, as asked in the question
results = make_ipv4_set(ip2_content) - make_ipv4_set(ip1_content)
print(results)
(Sep-08-2022, 08:30 AM)DeaD_EyE Wrote: [ -> ]Use a set, if the order is not required.


Works like a charm, but is it possible to read the first and second files from disk? The first file (ip1_content = StringIO in the code) contains 1.4 million IPs and the second contains 700,253.

Output:
{IPv4Address('192.168.0.799'), IPv4Address('192.168.0.800'), IPv4Address('192.168.0.900'), IPv4Address('1.117.250.1000')}
It would be great if the output could be written to a file, newIPlist.txt, instead of being displayed in the terminal.
Is it possible to output only the IP addresses, without the IPv4Address text?
This makes the code shorter:

from ipaddress import IPv4Address, ip_address


def make_ipv4_set(file_like):
    results = set()

    for line in map(str.strip, file_like):
        try:
            addr = ip_address(line)
        except ValueError:
            continue

        if isinstance(addr, IPv4Address):
            results.add(addr)

    return results


with (
    open("ip1_file.txt") as ip1_file,
    open("ip2_file.txt") as ip2_file,
):
    ip1_set = make_ipv4_set(ip1_file)
    ip2_set = make_ipv4_set(ip2_file)


# sorting works because IPv4Address is sortable
# ip2_set - ip1_set: addresses in the second file that are not in the first
results = sorted(ip2_set - ip1_set)
# results is now a list; sorted always returns a list

# iterate over the elements and print them
for ip in results:
    # converting IPv4Address to str
    print(str(ip))
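To write the result to newIPlist.txt instead of printing it, something along these lines should work. This is a sketch, not tested on files of the size mentioned above. The helper names read_ipv4_set and write_new_ips are invented here, and the temporary directory only exists so the example runs without the real files:

```python
import tempfile
from ipaddress import IPv4Address, ip_address
from pathlib import Path


def read_ipv4_set(path):
    """Return the set of valid IPv4 addresses found in the file at path."""
    results = set()
    with open(path) as fh:
        for line in map(str.strip, fh):
            try:
                addr = ip_address(line)
            except ValueError:
                continue
            if isinstance(addr, IPv4Address):
                results.add(addr)
    return results


def write_new_ips(ip1_path, ip2_path, out_path):
    """Write addresses found in ip2_path but not in ip1_path, one per line."""
    new_ips = sorted(read_ipv4_set(ip2_path) - read_ipv4_set(ip1_path))
    with open(out_path, "w") as out:
        for ip in new_ips:
            # str(ip) gives the plain dotted form, without "IPv4Address(...)"
            out.write(str(ip) + "\n")
    return len(new_ips)


# Demo with small throwaway files (hypothetical sample data).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "ip1_file.txt").write_text("192.168.0.1\n192.168.0.3\n")
    (tmp / "ip2_file.txt").write_text("192.168.0.3\n192.168.0.83\n10.0.0.7\n")
    count = write_new_ips(
        tmp / "ip1_file.txt", tmp / "ip2_file.txt", tmp / "newIPlist.txt"
    )
    written = (tmp / "newIPlist.txt").read_text()

print(count)
print(written)
```

With the real files, the call would simply be write_new_ips("ip1_file.txt", "ip2_file.txt", "newIPlist.txt").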
(Sep-08-2022, 11:45 AM)DeaD_EyE Wrote: [ -> ]This makes the code shorter:


I entered this IP, 44.44.44.44.11, in ip2_file.txt, and when I ran the program it didn't print this IP address. Could it be because we are treating the IPs as IPv4? Can we just treat all the IPs as text or numbers, then compare and extract those that aren't in the first file, ip1_file.txt?
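44.44.44.44.11 has five dot-separated parts, so ip_address raises ValueError for it and the code above silently skips the line. Comparing raw text would keep it, but it would keep every typo as well. One possible middle ground, sketched here with invented names (split_lines, rejected2) and StringIO stand-ins for the files, is to keep the validation but collect the rejected lines so they can be reviewed:

```python
from io import StringIO
from ipaddress import ip_address


def split_lines(file_like):
    """Return (valid, rejected): parsed addresses and lines that failed to parse."""
    valid, rejected = set(), []
    for line in map(str.strip, file_like):
        if not line:
            continue  # ignore blank lines
        try:
            valid.add(ip_address(line))
        except ValueError:
            rejected.append(line)
    return valid, rejected


# Stand-ins for ip1_file.txt and ip2_file.txt (hypothetical sample data).
ip1 = StringIO("192.168.0.1\n192.168.0.3\n")
ip2 = StringIO("192.168.0.3\n44.44.44.44.11\n192.168.0.83\n")

valid1, _ = split_lines(ip1)
valid2, rejected2 = split_lines(ip2)

# Sort by (version, integer value) so a mix of IPv4 and IPv6 cannot raise TypeError.
new_ips = sorted(valid2 - valid1, key=lambda ip: (ip.version, int(ip)))

print([str(ip) for ip in new_ips])
print(rejected2)
```

This version accepts IPv6 addresses too; add the isinstance(addr, IPv4Address) check from the earlier code if only IPv4 is wanted.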