How is pandas modifying all rows in an assignment - python-newbie question

markm74 · Nov-28-2023, 07:08 PM

Hi all,

This may be a data science question or just a Python newbie question, but I'm trying to fundamentally understand how pandas modifies all records in this assignment:

import pandas as pd
df = pd.read_csv('example.csv')
print(df.head())
df['input'] = 'TEXT1: ' + df.context + ' TEXT2: ' + df.target + ' TEXT3: ' + df.anchor
print(df.head())

The csv has existing columns called 'context', 'target' and 'anchor' and I'm adding 'input' as a column in the above code. With what looks like a single concatenated string assignment, pandas has created a new column for all rows and modified the column according to the logic in the string concatenation. I'm coming to python from other languages, so is this a pythonic thing, or has pandas overridden object property assignment and they're using the expression as a shortcut to modify all rows? In another language you'd just end up with df['input'] as a property with a single string value - or it might throw an error because e.g. df.context isn't a variable that can be concatenated.

Thanks in advance,

Mark.

**deanhystad** · (This post was last modified: Nov-28-2023, 10:36 PM by deanhystad.)

It is doing exactly what I would expect. What were you expecting?

Maybe this will help explain.

import pandas as pd

df = pd.DataFrame(
    {"context": list("ABC"), "target": list("DEF"), "anchor": list("GHI")}
)
new_series = "TEXT1: " + df.context + " TEXT2: " + df.target + " TEXT3: " + df.anchor
print("target", df.target, "", sep="\n")
print("new_series", new_series, "", sep="\n")

Output:
target
0    D
1    E
2    F
Name: target, dtype: object

new_series
0    TEXT1: A TEXT2: D TEXT3: G
1    TEXT1: B TEXT2: E TEXT3: H
2    TEXT1: C TEXT2: F TEXT3: I
dtype: object

df.target is a Series, an array-like object that holds the "target" column of the df dataframe.
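If it helps, here is a quick check you can run to see what df.target actually is. It reuses the small dataframe from above; the variable name col is just something I made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"context": list("ABC"), "target": list("DEF"), "anchor": list("GHI")}
)

# Attribute access and bracket access both return the same column as a Series.
col = df.target
print(type(col).__name__)        # class name of the object
print(col.equals(df["target"]))  # same data either way
print(len(col) == len(df))       # one value per row of the dataframe
```

So df.target is not a single string; it is a whole column with one entry for each row.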

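Series overloads the + operator so that it works element by element. The result of this operation, which mixes plain strings and multiple Series, is therefore a new Series with one value per row. Assigning that Series to df['input'] stores it as a new column.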

new_series = "TEXT1: " + df.context + " TEXT2: " + df.target + " TEXT3: " + df.anchor
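To connect this to the "is this a Python thing or a pandas thing" part of the question: Python lets any class define what + and obj[key] = value mean, through the special methods __add__, __radd__ and __setitem__, and pandas defines these on Series and DataFrame. Below is a rough sketch of the idea using two toy classes. Column and Table are hypothetical names invented for this example, not pandas code, and the real implementation is far more involved (index alignment, dtypes, missing values and so on).

```python
class Column:
    """Toy stand-in for a pandas Series (hypothetical, not pandas code)."""

    def __init__(self, values):
        self.values = list(values)

    def _other_values(self, other):
        # Another Column is paired up element by element;
        # a scalar (e.g. a string) is repeated for every element.
        if isinstance(other, Column):
            return other.values
        return [other] * len(self.values)

    def __add__(self, other):
        # Handles: column + something
        return Column(a + b for a, b in zip(self.values, self._other_values(other)))

    def __radd__(self, other):
        # Handles: something + column. Python tries this after
        # str.__add__ returns NotImplemented for a Column operand.
        return Column(b + a for a, b in zip(self.values, self._other_values(other)))


class Table:
    """Toy stand-in for a DataFrame: a dict of named columns."""

    def __init__(self):
        self.columns = {}

    def __setitem__(self, name, column):
        # Handles: table["name"] = column
        self.columns[name] = column

    def __getitem__(self, name):
        # Handles: table["name"]
        return self.columns[name]


table = Table()
table["context"] = Column("ABC")
table["target"] = Column("DEF")
table["input"] = "TEXT1: " + table["context"] + " TEXT2: " + table["target"]
print(table["input"].values)
# ['TEXT1: A TEXT2: D', 'TEXT1: B TEXT2: E', 'TEXT1: C TEXT2: F']
```

So it is a bit of both: operator overloading is a general Python feature, and pandas uses it to make one expression apply to every row.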
