Python Forum
create new column based on condition
#8
I'm using a "vectorized" form here instead of loops; vectorization is much faster, as you'll notice.

Remarks:
  • The Matrix array is just an example; you can use your own A, B, C instead
  • D is initialized up front to avoid dynamic memory allocation; it's good practice
  • i is the "index" vector (from 1 to r-1)
  • the calculation of D is straightforward here, since it follows your formula
  • finally, the first row corresponds to index i=0
  • if you're working only with integers, it would have been more relevant in my previous post to define the dtype for D/D2 directly
  • since the first row uses a different formula, it can be calculated before or after the main block
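
To make the remarks above concrete, here is a minimal sketch on a tiny 4x3 array, small enough to check by hand. The array name M and its values are only illustrative stand-ins for your A, B, C columns, and the two formulas are the ones used in this post.

```python
import numpy as np

# Tiny 4x3 stand-in for columns A, B, C (illustrative values only)
M = np.arange(12).reshape(4, 3)
r = M.shape[0]

# Preallocate D once, with the same integer dtype as the input
D = np.zeros((r, 1), dtype=M.dtype)

# Index vector: rows 1 .. r-1
i = np.arange(1, r)

# Rows 1 .. r-1: previous row's C, minus current A, plus current B
D[i, 0] = M[i - 1, 2] - M[i, 0] + M[i, 1]

# Row 0 uses its own formula
D[0, 0] = M[0, 2] + M[0, 0] - M[0, 1]

print(D.ravel())  # [1 3 6 9]
```

Row 1, for instance, is M[0, 2] - M[1, 0] + M[1, 1] = 2 - 3 + 4 = 3.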

Test the following code to compare vectorization with (classical) loops; you'll figure out how it works.

Hope it helps

import numpy as np
import time

Nmax = 10_000
r, c = 1_000_000, 3

# Matrix = np.random.randint(1, Nmax, size=(r,c))
Matrix = np.arange(r*c).reshape(r, c)

# vectorized version: all rows except the first
t0 = time.time()
D = np.zeros((r, 1), dtype=int)
i = np.arange(1, r)
D[i, 0] = Matrix[i-1, 2] - Matrix[i, 0] + Matrix[i, 1]

# specifically for the first row
i = 0
D[i, 0] = Matrix[i, 2] + Matrix[i, 0] - Matrix[i, 1]
t1 = time.time()
print(f"Duration1 = {t1-t0}")


# same calculation using a (classical) loop
t2 = time.time()
D2 = np.zeros((r, 1), dtype=int)
for i in range(1, r):
    D2[i, 0] = Matrix[i-1, 2] - Matrix[i, 0] + Matrix[i, 1]
D2[0, 0] = Matrix[0, 2] + Matrix[0, 0] - Matrix[0, 1]
t3 = time.time()

print(f"D equals D2? => {np.array_equal(D, D2)}")
print(f"Duration2 = {t3-t2}")
print(f"Duration ratio (loops / vectorized) = {(t3-t2)/(t1-t0)}")
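
As a possible variation (not what the code above does), the index vector can be replaced by plain slices. Basic slices are views on the array, whereas indexing with an integer array makes copies, so this form may be a bit lighter on memory; that is worth measuring on your own data rather than assuming. The function names below are hypothetical and only serve the comparison.

```python
import numpy as np

def column_d_indexed(m):
    """Index-vector form, as in the post above."""
    r = m.shape[0]
    d = np.zeros((r, 1), dtype=m.dtype)
    i = np.arange(1, r)
    d[i, 0] = m[i - 1, 2] - m[i, 0] + m[i, 1]
    d[0, 0] = m[0, 2] + m[0, 0] - m[0, 1]
    return d

def column_d_sliced(m):
    """Same calculation written with slices instead of an index vector."""
    d = np.zeros((m.shape[0], 1), dtype=m.dtype)
    # m[:-1] is "previous row", m[1:] is "current row"
    d[1:, 0] = m[:-1, 2] - m[1:, 0] + m[1:, 1]
    d[0, 0] = m[0, 2] + m[0, 0] - m[0, 1]
    return d

# Random integer test data (seed and sizes are arbitrary)
rng = np.random.default_rng(0)
sample = rng.integers(1, 10_000, size=(1_000, 3))

same = np.array_equal(column_d_indexed(sample), column_d_sliced(sample))
print(f"Indexed equals sliced? => {same}")
```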


Messages In This Thread
create new column based on condition - by arvin - Dec-12-2022, 11:32 AM
RE: create new column based on condition - by arvin - Dec-12-2022, 02:22 PM
RE: create new column based on condition - by arvin - Dec-12-2022, 02:26 PM
RE: create new column based on condition - by arvin - Dec-13-2022, 06:24 AM
RE: create new column based on condition - by paul18fr - Dec-13-2022, 10:22 AM
RE: create new column based on condition - by arvin - Dec-13-2022, 10:36 AM
RE: create new column based on condition - by arvin - Dec-13-2022, 10:37 AM
RE: create new column based on condition - by arvin - Dec-13-2022, 11:36 AM
