Python Forum

Full Version: merging sublist into single list in python
Hi All,

I need help in solving the below problem -

My current list looks like this -
single_list1 = [ 'Group Name Details', 'Group Name', 'Person Phone Number', 'Father Name'] 
I want to get the below output in another list -
list_output = [ 'Group', 'Name', 'Details', 'Group', 'Name', 'Person', 'Phone','Number', 'Father', 'Name'] 
Please let me know if you can share the optimum code for this.

Thanks


Adding some details that I missed in the original post -
single_list1 = [ 'New', ['Group Name Details'], ['Group Name'], ['Person Phone Number'], ['Father Name']]
So basically it is a mix of plain strings and sublists, and I want all the words as items of one single list.

I want to get the below output in another list -
list_output = [ 'New', 'Group', 'Name', 'Details', 'Group', 'Name', 'Person', 'Phone','Number', 'Father', 'Name']
sum(map(str.split, sum([[k] if isinstance(k, str) else k for k in single_list1], [])), []) 
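Unpacked into separate steps, the one-liner above might read as follows. This is only a sketch; the intermediate names `wrapped` and `flat_strings` are introduced here for illustration and are not part of the original answer.

```python
single_list1 = ['New', ['Group Name Details'], ['Group Name'],
                ['Person Phone Number'], ['Father Name']]

# Step 1: wrap bare strings in a list so every item is a list of strings.
wrapped = [[k] if isinstance(k, str) else k for k in single_list1]

# Step 2: sum(..., []) concatenates the inner lists into one flat list.
flat_strings = sum(wrapped, [])

# Step 3: split each string on whitespace, then concatenate the word lists.
list_output = sum(map(str.split, flat_strings), [])

print(list_output)
```

Note that `sum(..., [])` builds a new list at every addition, so it is fine for short inputs like this one but scales poorly for long ones.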
Ohh, that one-liner is not easy to read, scidam ;)
It can be simplified.
>>> lst = ['Group Name Details', 'Group Name', 'Person Phone Number', 'Father Name']

>>> [word for line in lst for word in line.split()]
['Group', 'Name', 'Details', 'Group', 'Name', 'Person', 'Phone', 'Number', 'Father', 'Name']

>>> sum(map(str.split, sum([[k] if isinstance(k, str) else k for k in lst], [])), [])
['Group', 'Name', 'Details', 'Group', 'Name', 'Person', 'Phone', 'Number', 'Father', 'Name']
@snippsat, unfortunately the OP has a mixture of strings and nested lists of strings [ 'New', ['Group Name Details'], ['Group Name'], ...], so invoking line.split() when line is a list will raise an error. I rewrote my solution to be more Pythonic:

def extract_words(mixture):
    """Extract words from mixture of words and lists of words.
    """

    result = list()
    for item in mixture:
        if isinstance(item, list):
            for _astring in item:
                result += _astring.split()
        else:
            result += item.split()
    return result
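For comparison, the same logic can be squeezed into a single comprehension. This is only a sketch: `extract_words_compact` is a name made up here, and it assumes, like the function above, that nesting is at most one level deep. The second half shows the error that the plain comprehension runs into on the OP's mixed input.

```python
def extract_words_compact(mixture):
    """One-level flatten and split, written as a nested comprehension."""
    return [
        word
        for item in mixture
        # treat a bare string as a one-element list
        for text in (item if isinstance(item, list) else [item])
        for word in text.split()
    ]


single_list1 = ['New', ['Group Name Details'], ['Group Name'],
                ['Person Phone Number'], ['Father Name']]
print(extract_words_compact(single_list1))

# The plain comprehension fails on the same input, because the
# sublists have no .split() method.
try:
    [word for line in single_list1 for word in line.split()]
except AttributeError as exc:
    print("plain comprehension failed:", exc)
```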
Another solution, which supports deeply nested iterables.

from collections import deque


def extract_words(mixture):
    """
    Flatten deeply nested iterables and split strings where they occur.
    """
    stack = deque([mixture])
    # using an explicit stack to allow iterating over
    # deeply nested lists
    # it could be done more easily with recursion,
    # but Python has a recursion limit
    to_split = (str, bytes)
    # we want to split str and bytes
    while stack:
        # loop runs until stack is empty
        current = stack.popleft()
        # on the first iteration the stack is now empty
        # and current holds the element that was on it
        if isinstance(current, to_split):
            # split if the current object is a str or bytes
            yield from current.split()
        else:
            # this branch is executed, if the current object
            # is not a str or bytes
            try:
                current = iter(current)
                # iter of iter returns the same iterator
                subelement = next(current)
                # the next does what the for loop does
            except StopIteration:
                # but we have to check for errors manually
                pass
            except TypeError:
                # and if an element is not iterable, iter() raises
                # TypeError. Integers, for example, are not
                # iterable, so yield the element itself
                yield current
            else:
                # if no error happens, put the current iterator back
                # on the left side of the stack
                stack.appendleft(current)
                # put the subelement at the beginning of the deque
                stack.appendleft(subelement)
This is a generator. If you call it, you have to iterate over it.

data = ['Andre Müller', ['hallo', '123'], [[[[['foo bar']]], 'bat']], 12]
extractor = extract_words(data)
# nothing has run yet; now you can decide whether you want a list, set, tuple or something different.
result = list(extractor)
# now the generator extractor is exhausted, you can't iterate over it again
# but you can make a new generator and use it, for example, in a for loop
for element in extract_words(data):
    print(element)
The benefit is that the caller decides which container type holds the elements.
You can also pipe one generator into the next.

The code above also lets integers through. If you want to filter them out, you can build a pipeline.

def filter_non_iterable(iterable):
    allowed = (tuple, list, set, str, bytes)
    for element in iterable:
        if isinstance(element, allowed):
            yield element

filtered = filter_non_iterable(data)
result = extract_words(filtered)
# now still nothing happens


print(list(result))
# now the first generator is consumed by the second generator. The first one filters, the second one flattens and splits.
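One caveat worth sketching: `filter_non_iterable` only looks at the top-level elements, so an integer nested inside a sublist would still come out of `extract_words`. A filter placed after the extraction step catches those as well. In the sketch below, `only_text` is a hypothetical helper, `data_nested_int` is a variation of the sample data with an integer moved into a sublist, and `extract_words` is a shortened stand-in for the stack-based generator above, included only so the example runs on its own.

```python
def extract_words(mixture):
    """Shortened stand-in: flatten nested iterables, split str/bytes."""
    stack = [iter([mixture])]
    while stack:
        try:
            current = next(stack[-1])
        except StopIteration:
            stack.pop()  # this level is exhausted
            continue
        if isinstance(current, (str, bytes)):
            yield from current.split()
        else:
            try:
                stack.append(iter(current))  # descend one level
            except TypeError:
                yield current  # not iterable, e.g. an int


def only_text(iterable):
    """Hypothetical downstream filter: keep only str and bytes items."""
    for element in iterable:
        if isinstance(element, (str, bytes)):
            yield element


# Assumed sample: like the data above, but with an int inside a sublist.
data_nested_int = ['Andre Müller', ['hallo', 123], [[[[['foo bar']]], 'bat']], 12]
print(list(only_text(extract_words(data_nested_int))))
```

Filtering after extraction means every yielded element is checked, regardless of how deeply it was nested.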
Just to use yield from:
def flatten(seq):
    for item in seq:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item
>>> list1 = [ 'New', ['Group Name Details'], ['Group Name'], ['Person Phone Number'], ['Father Name']]
>>> [word for line in flatten(list1) for word in line.split()]
['New', 'Group', 'Name', 'Details', 'Group', 'Name', 'Person', 'Phone', 'Number', 'Father', 'Name']

>>> data = ['Andre Müller', ['hallo', '123'], [[[[['foo bar']]], 'bat']], 12]
>>> list(flatten(data))
['Andre Müller', 'hallo', '123', 'foo bar', 'bat', 12]
Recursion limit :-)

But the code looks nice and Pythonic.
(Mar-22-2019, 07:18 PM) DeaD_EyE wrote: Recursion limit :-)
sys.setrecursionlimit(9999) (naughty)
(Mar-22-2019, 07:10 PM) snippsat wrote: Just to use yield from
That isn't as efficient as it may look, when compared to the solution with an explicit stack. In this elegant-looking solution, for deep nesting, Python will keep a stack of generators and when you want to get the "next" value you'll have to start at the "bottom" of the stack, work your way to the top, and then return the value back through that stack of generators.

The explicit-stack solution is linear time, but iterating over the recursive generator would be more like quadratic for deeply nested input (the true complexity is more involved, but I think "quadratic" describes the problem here well enough).
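To make the recursion-limit point concrete, here is a minimal sketch. `flatten` is the recursive version posted above; `flatten_iterative` is a hypothetical explicit-stack counterpart written for this comparison, and the nesting depth is chosen artificially to exceed the interpreter's limit. On a default CPython setup the recursive version should raise RecursionError on this input, while the explicit-stack version does not depend on the call stack at all.

```python
import sys


def flatten(seq):
    """Recursive version from the thread."""
    for item in seq:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


def flatten_iterative(seq):
    """Same result, but keeps its own stack of iterators."""
    stack = [iter(seq)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                stack.append(iter(item))  # descend into the sublist
                break
            yield item
        else:
            stack.pop()  # current level exhausted, go back up


# Build a list nested deeper than the interpreter's recursion limit.
deep = ['x']
for _ in range(sys.getrecursionlimit() + 100):
    deep = [deep]

try:
    list(flatten(deep))
    print("recursive version finished")
except RecursionError:
    print("recursive version hit the recursion limit")

print(list(flatten_iterative(deep)))
```

For shallow input such as the OP's list, both versions return the same items, so the difference only matters when the nesting can grow very deep.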

Also, the OP doesn't appear to need arbitrary nesting :)