Nov-28-2023, 07:08 PM
Hi all,
This may be a data science question or just a Python newbie question, but I'm trying to fundamentally understand how pandas modifies all records in this assignment:
import pandas as pd

df = pd.read_csv('example.csv')
print(df.head())
df['input'] = 'TEXT1: ' + df.context + ' TEXT2: ' + df.target + ' TEXT3: ' + df.anchor
print(df.head())

The CSV has existing columns called 'context', 'target' and 'anchor', and I'm adding 'input' as a column in the above code. With what looks like a single concatenated string assignment, pandas has created a new column for all rows and filled it in according to the logic in the string concatenation.

I'm coming to Python from other languages, so is this a Pythonic thing, or has pandas overridden object property assignment and they're using the expression as a shortcut to modify all rows? In another language you'd just end up with df['input'] as a property with a single string value, or it might throw an error because e.g. df.context isn't a variable that can be concatenated.
Thanks in advance,
Mark.
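
For anyone landing here with the same question: as far as I understand it, this is ordinary Python operator overloading, not special syntax. df.context returns a pandas Series (a whole column), Series defines + to work element by element, and df['input'] = ... goes through the DataFrame's __setitem__. The sketch below imitates that mechanism in plain Python so it runs without pandas. ToyColumn, ToyFrame and the sample values are made up for illustration and are not how pandas is actually implemented internally.

```python
class ToyColumn:
    """Hypothetical stand-in for a pandas Series: a list with element-wise +."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # column + column: pair the rows up; column + scalar: apply to every row.
        if isinstance(other, ToyColumn):
            return ToyColumn(a + b for a, b in zip(self.values, other.values))
        return ToyColumn(v + other for v in self.values)

    def __radd__(self, other):
        # scalar + column: a plain str cannot add a ToyColumn itself,
        # so Python falls back to the right-hand operand's __radd__.
        return ToyColumn(other + v for v in self.values)


class ToyFrame:
    """Hypothetical stand-in for a DataFrame: named columns in a dict."""

    def __init__(self, columns):
        self._columns = {name: ToyColumn(vals) for name, vals in columns.items()}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so frame.context lands here.
        columns = self.__dict__.get("_columns", {})
        if name in columns:
            return columns[name]
        raise AttributeError(name)

    def __getitem__(self, name):
        return self._columns[name]

    def __setitem__(self, name, column):
        # frame['input'] = ... stores a whole column under that name.
        self._columns[name] = column


# Made-up sample data, two rows.
frame = ToyFrame({
    "context": ["c1", "c2"],
    "target": ["t1", "t2"],
    "anchor": ["a1", "a2"],
})

# Same shape as the line in the question: one expression, evaluated for every row.
frame["input"] = "TEXT1: " + frame.context + " TEXT2: " + frame.target + " TEXT3: " + frame.anchor

print(frame["input"].values)
# ['TEXT1: c1 TEXT2: t1 TEXT3: a1', 'TEXT1: c2 TEXT2: t2 TEXT3: a2']
```

So the right-hand side is never a single string. Each + produces a new column-like object, and the assignment stores that whole column in one go.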