Python Forum
How is pandas modifying all rows in an assignment - python-newbie question
#1
Hi all,

This may be a data science question or just a Python newbie question, but I'm trying to fundamentally understand how pandas modifies all records in this assignment:

import pandas as pd
df = pd.read_csv('example.csv')
print(df.head())
df['input'] = 'TEXT1: ' + df.context + ' TEXT2: ' + df.target + ' TEXT3: ' + df.anchor
print(df.head())
The csv has existing columns called 'context', 'target' and 'anchor' and I'm adding 'input' as a column in the above code. With what looks like a single concatenated string assignment, pandas has created a new column for all rows and modified the column according to the logic in the string concatenation. I'm coming to python from other languages, so is this a pythonic thing, or has pandas overridden object property assignment and they're using the expression as a shortcut to modify all rows? In another language you'd just end up with df['input'] as a property with a single string value - or it might throw an error because e.g. df.context isn't a variable that can be concatenated.

Thanks in advance,

Mark.
#2
It is doing exactly what I would expect. What were you expecting?

Maybe this will help explain.
import pandas as pd

df = pd.DataFrame(
    {"context": list("ABC"), "target": list("DEF"), "anchor": list("GHI")}
)
new_series = "TEXT1: " + df.context + " TEXT2: " + df.target + " TEXT3: " + df.anchor
print("target", df.target, "", sep="\n")
print("new_series", new_series, "", sep="\n")
Output:
target
0    D
1    E
2    F
Name: target, dtype: object

new_series
0    TEXT1: A TEXT2: D TEXT3: G
1    TEXT1: B TEXT2: E TEXT3: H
2    TEXT1: C TEXT2: F TEXT3: I
dtype: object
df.target is a Series, an array-like object that holds the "target" column of the df DataFrame.
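A quick way to check this for yourself, sketched with the same toy DataFrame as in the code above: attribute access and bracket access return the same column, and that column is a pandas Series rather than a plain string. The names col and prefixed are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"context": list("ABC"), "target": list("DEF"), "anchor": list("GHI")}
)

# Attribute access is shorthand for bracket access on a column label.
col = df.target
print(type(col).__name__)        # Series
print(col.equals(df["target"]))  # True

# Adding a plain string to a Series is applied to every element.
prefixed = "TEXT2: " + col
print(prefixed.tolist())         # ['TEXT2: D', 'TEXT2: E', 'TEXT2: F']
```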

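Series overloads the + operator, so an expression that mixes strings and Series is evaluated element by element, one row at a time. The result of this operation that uses multiple Series is a new Series.
new_series = "TEXT1: " + df.context + " TEXT2: " + df.target + " TEXT3: " + df.anchor
Assigning that Series to df['input'] is what adds it to the DataFrame as a new column.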
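To answer the original question more directly: this is ordinary Python, not special syntax. Python lets any class define what + and obj[key] = value mean through the methods __add__, __radd__ and __setitem__, and pandas implements those for Series and DataFrame. The sketch below is a toy to show the mechanism, not how pandas is actually written; the class names MiniSeries and MiniFrame are made up for illustration.

```python
class MiniSeries:
    """Toy stand-in for a pandas Series (hypothetical, illustration only)."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # series + other: combine element by element, or apply a scalar to all.
        if isinstance(other, MiniSeries):
            return MiniSeries(a + b for a, b in zip(self.values, other.values))
        return MiniSeries(a + other for a in self.values)

    def __radd__(self, other):
        # other + series: Python falls back to this when the left operand
        # (here a str) does not know how to add a MiniSeries.
        return MiniSeries(other + a for a in self.values)


class MiniFrame:
    """Toy stand-in for a DataFrame: a dict of named columns."""

    def __init__(self, columns):
        self.columns = {name: MiniSeries(vals) for name, vals in columns.items()}

    def __getitem__(self, name):
        return self.columns[name]

    def __setitem__(self, name, series):
        # frame["input"] = ... stores a whole column, not a single value.
        self.columns[name] = series


frame = MiniFrame({"context": "ABC", "target": "DEF"})
frame["input"] = "TEXT1: " + frame["context"] + " TEXT2: " + frame["target"]
print(frame["input"].values)
# ['TEXT1: A TEXT2: D', 'TEXT1: B TEXT2: E', 'TEXT1: C TEXT2: F']
```

So the single line in the question is three overloaded additions followed by one overloaded item assignment. A language without operator overloading would need an explicit loop over the rows instead.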