Posts: 69
Threads: 34
Joined: Sep 2017
I made a basic script that passes a dataframe between 2 functions.
I understand how it works, except for the meaning of the last line.
test2(test1())
Does this mean that all the code from test1 is merged within test2?
I get that it's calling both functions, but I don't fully understand why test1 has to be inside test2, and what exactly is happening at run time.
import pandas as pd

'''
test1 function creates the dataframe
'''
def test1():
    list1 = ["item1", "item2", "item3", "item4", "item5", "item6",
             "item7", "item8", "item9", "item10", "item11", "item12"]
    df = pd.DataFrame(list1, columns=['Col 0:'])  # create new dataframe from list1
    df.insert(1, 'Col 1:', "")  # insert new column
    df.insert(2, 'Col 2:', "")  # insert new column
    return df  # pass the dataframe outside the function

'''
test2 function calls the dataframe from test1() makes a change to the cell data and prints df
'''
def test2(df):  # call df which was passed from test1()
    df.iloc[5, 2] = "NEW CELL DATA"
    print(df)

test2(test1())
Posts: 12,022
Threads: 484
Joined: Sep 2016
test2(test1())
If you called test1 by itself, you would do something like:
df = test1()
test2(df)
# or
xx = test1()
test2(xx)
So the last line is simply a shortcut: it states that test1 is to be executed and its result will be the argument to test2.
If test1 did not return a value, test2 would receive None and fail. Try it.
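A minimal sketch of that failure, using a plain list instead of a dataframe so it runs without pandas. The names make_items and show_first are made up for illustration; with the dataframe version the exception type would differ (the failure would come from the .iloc access on None), but the cause is the same.

```python
def make_items():
    items = ["item1", "item2", "item3"]
    # No return statement, so calling make_items() evaluates to None.


def show_first(items):
    # Indexing fails if the caller handed us None instead of a list.
    return items[0]


try:
    show_first(make_items())
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Adding `return items` as the last line of make_items makes the nested call work.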
Posts: 69
Threads: 34
Joined: Sep 2017
Quote:test1 is to be executed and the results will be the argument to test2
Hmm, I read that part 10x and it finally started clicking a bit, but it still seems kinda weird.
test2(df) is pulling the returned dataframe from test1() into test2(). It seemed to me like that should be all the connection I need between the two.
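A small sketch of why the parameter name alone does not connect the two functions. It uses a plain list and made-up names (make_items, mark_second) rather than the dataframe code above. The name inside the first function is local and disappears when the call returns, and the parameter of the second function is just a new local name for whatever the caller passes in.

```python
def make_items():
    items = ["item1", "item2", "item3"]
    return items  # the caller receives this list object


def mark_second(items):
    # "items" here is a separate local name, bound to the caller's argument.
    items[1] = "NEW DATA"
    return items


# There is no module-level name "items", so this call raises NameError.
try:
    mark_second(items)
except NameError as exc:
    name_error = str(exc)

# Passing the return value is what actually links the two functions.
result = mark_second(make_items())

print(name_error)
print(result)
```

The parameter could be renamed to anything without changing the behaviour; only the value passed at the call site matters.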
Posts: 7,312
Threads: 123
Joined: Sep 2016
Apr-18-2018, 08:28 AM
(This post was last modified: Apr-18-2018, 08:28 AM by snippsat.)
(Apr-18-2018, 06:52 AM)digitalmatic7 Wrote: Does this mean that all the code from test1 is merged within test2?
Not merged. test1 does only one task, which is to return df.
When you do test2(test1()), test1 is called first and its return value is the argument that gets passed to test2.
Here is another example that passes just the function object, and also shows a better way to document with docstrings.
Because functions are objects (like str, int, dict, list, etc.), you can pass them as arguments to other functions.
import pandas as pd

def test1():
    '''test1 function creates the dataframe'''
    list1 = [
        "item1", "item2", "item3", "item4", "item5", "item6",
        "item7", "item8", "item9", "item10", "item11", "item12"
    ]
    df = pd.DataFrame(list1, columns=['Col 0:'])
    df.insert(1, 'Col 1:', "")
    df.insert(2, 'Col 2:', "")
    return df

def test2(df):
    '''
    test2 function calls the dataframe from test1()
    makes a change to the cell data and prints df
    '''
    df = df()  # now call test1 inside test2
    df.iloc[5, 2] = "NEW CELL DATA"
    print(df)

if __name__ == '__main__':
    test2(test1)
Output:
    Col 0: Col 1:        Col 2:
0    item1
1    item2
2    item3
3    item4
4    item5
5    item6        NEW CELL DATA
6    item7
7    item8
8    item9
9   item10
10  item11
11  item12
Quote:Python’s functions are first-class objects.
You can assign them to variables, store them in data structures,
pass them as arguments to other functions,
and even return them as values from other functions.
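The four points in the quote can be sketched in a few lines of plain Python. All names here (shout, whisper, greet, pick_style) are invented for illustration and have nothing to do with the dataframe code. Note that a function name without parentheses refers to the function object itself, while adding parentheses calls it.

```python
def shout(text):
    return text.upper() + "!"


def whisper(text):
    return text.lower() + "..."


# 1. Assign a function to a variable (no parentheses, so nothing runs yet).
speak = shout

# 2. Store functions in a data structure.
styles = {"loud": shout, "quiet": whisper}


# 3. Pass a function as an argument; the receiver decides when to call it.
def greet(formatter, name):
    return formatter("hello " + name)


# 4. Return a function from another function.
def pick_style(loud):
    return shout if loud else whisper


print(speak("hi"))               # HI!
print(styles["quiet"]("HI"))     # hi...
print(greet(whisper, "Bob"))     # hello bob...
print(pick_style(True)("done"))  # DONE!
```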
Posts: 69
Threads: 34
Joined: Sep 2017
Apr-18-2018, 09:30 AM
(This post was last modified: Apr-18-2018, 09:30 AM by digitalmatic7.)
Thanks! I'm feeling more confident I understand passing arguments now, but I tried to apply it to a new script and I'm getting an error.
In pycharm it says:
Parameter 'fruit' unfilled
When I run the script I get:
TypeError: test2() missing 1 required positional argument: 'fruit'
Here's the full code:
from multiprocessing import Pool, Manager

def test1():
    fruit = "banana"
    return fruit

def test2(obj, fruit):  # obj is the array iterable passed from map
    counter, item = obj
    counter_val = counter.get()
    counter.set(counter_val + 1)  # increment counter
    print(item, counter_val)
    fruit()
    fruit = "apple"
    print(fruit)

test2(test1)

if __name__ == '__main__':
    list2 = ["item1",
             "item2",
             "item3",
             "item4",
             "item5",
             "item6",
             "item7",
             "item8",
             "item9",
             "item10",
             "item11",
             "item12"]
    counter = Manager().Value(int, 0)
    array = [(counter, item) for item in list2]  # link together the counter and list in an array
    p = Pool(4)  # worker count
    p.map(test2, array)  # function, iterable
    p.terminate()
I don't think I can make what I need to do any more barebones and simplified, but it doesn't work.
Is it just breaking because of some kind of conflict with the multiprocessing? I tried passing it with Manager() as well but no luck with that either.
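The traceback fits any call that supplies a single argument to the two-parameter test2. The call test2(test1) does exactly that (test1 fills obj and nothing fills fruit), and Pool.map also calls its function with exactly one item from the iterable each time. One possible way around it, sketched below under assumptions, is to bind the extra argument in advance with functools.partial. The sketch uses the built-in map in place of Pool.map and leaves out the shared counter so that it stays self-contained; the names get_fruit, process and process_with_fruit are hypothetical.

```python
from functools import partial


def get_fruit():
    return "banana"


def process(obj, fruit):
    # obj is one (index, item) pair from the iterable; fruit is bound in advance.
    index, item = obj
    return f"{index}: {item} with {fruit}"


pairs = list(enumerate(["item1", "item2", "item3"]))

# Calling process with only one argument reproduces the reported error.
try:
    process(pairs[0])
except TypeError as exc:
    error_message = str(exc)

# partial() fixes fruit, leaving a callable that needs only the per-item argument.
process_with_fruit = partial(process, fruit=get_fruit())
results = list(map(process_with_fruit, pairs))

print(error_message)
print(results)
```

A partial built from a module-level function can be pickled, so the same process_with_fruit object could be handed to Pool.map inside the `if __name__ == '__main__':` block. Another option is to put the fruit value into each tuple of the iterable and unpack it inside the worker.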