Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
pivot
#1
Information 
Hello,
could someone help me please with a special programming question:

I have a dataframe like this:

Datum Mitarbeiter Tageszeit
0 01.01.2023 M51 V
10 01.01.2023 M51 N
37 02.01.2023 M53 V
91 02.01.2023 M51 V
506 02.01.2023 M53 N
... ... ... ...
323288 04.12.2024 M53 N
323608 05.12.2024 M53 V
324199 05.12.2024 M34 N
324210 05.12.2024 M53 N
324419 06.12.2024 M53 V

I would need to calculate a table like this

Datum Tageszeit M51 M53 M34
04.12.2024 V 0 0 0
04.12.2024 N 0 X 0

Where X is set if there is an entry in the original table at e.g. Datum 04.12.2024 Tageszeit V for Mitarbeiter M51 or Mitarbeiter M34 and a 0 if there is not.
V means noon, N means afternoon.

How can I achieve this. I was thinking for a pivot table, but I have no idea how to do this.

Regards,
Gunthert
Reply
#2
Don't know how I clicked on this old post, but I did.

I tried pandas .pivot() and .pivot_table() without success, probably I did it wrong.

But you can get the data in a basic way by using:

list_df = df.values.tolist()
Now you have a list of lists with all the contents of your df. Lists are fairly easy to manipulate.

from io import StringIO
import pandas as pd

# small data taken from your post
data = StringIO("""Datum Mitarbeiter Tageszeit
01.01.2023 M51 V
01.01.2023 M51 N
02.01.2023 M53 V
02.01.2023 M51 V
02.01.2023 M53 N
04.12.2024 M53 N
05.12.2024 M53 V
05.12.2024 M34 N
05.12.2024 M53 N
06.12.2024 M53 V""")

# read the data as csv although we don't have a csv
df = pd.read_csv(data, sep=" ")

# get a list of unique Mitarbeiterzeichen
mitwirker = list(set(df['Mitarbeiter'].tolist()))
# sort the result, der Schönheit wegen
mitwirker.sort()
# die anderen gewünschten Zeilen im Resultat
extra_zeilen = ['Datum', 'Tageszeit']
df2_columns = extra_zeilen + mitwirker
# mach eine Liste voller Indexnummern
indexnums = [i for i in range(len(df))]
# create a df full of NaN, but with index
df2 = pd.DataFrame(columns=df2_columns, index=indexnums)

# get all values in df as a list of lists
# in this case, each sub-list has 3 elements: Datum, Mitarbeiter, Tageszeit
list_df = df.values.tolist()
# mach ne Kopie von df2 als Arbeitskopie
test = df2.copy()
# spiele die Werte von list_df in test hinein
for i in range(len(list_df)):
    test.loc[i,['Datum']] = list_df[i][0]
    test.loc[i,['Tageszeit']] = list_df[i][2]
    test.loc[i,[list_df[i][1]]] = 'X'

# überschreibe NaN mit Null
resultat = test.fillna(0)
Result:

Output:
Datum Tageszeit M34 M51 M53 0 01.01.2023 V 0 X 0 1 01.01.2023 N 0 X 0 2 02.01.2023 V 0 0 X 3 02.01.2023 V 0 X 0 4 02.01.2023 N 0 0 X 5 04.12.2024 N 0 0 X 6 05.12.2024 V 0 0 X 7 05.12.2024 N X 0 0 8 05.12.2024 N 0 0 X 9 06.12.2024 V 0 0 X
I managed to get pivot_table() to work, you need to get the aggfunc parameter correct. By default it is "mean", but there is no "mean" from V and N.

pivot_df = pd.pivot_table(dp, index=["Datum"], columns=["Mitarbeiter"], values="Tageszeit", aggfunc="sum") # yes
resultat = pivot_df.fillna(0) # ok
Gives:

Output:
Mitarbeiter M34 M51 M53 Datum 01.01.2023 0 VN 0 02.01.2023 0 V VN 04.12.2024 0 0 N 05.12.2024 N 0 VN 06.12.2024 0 0 V
From this we can see, M34 is a lazy ·[expletive deleted]!
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  working with pivot table gunther 1 402 Jan-22-2025, 08:55 AM
Last Post: Keville_35
  Need Help! Pandas EXCEL PIVOT psb3958 1 1,604 Nov-13-2022, 10:37 PM
Last Post: deanhystad
  SQL Pivot EAV Quentin 2 3,573 Dec-03-2019, 11:52 PM
Last Post: Quentin
  How to pivot a dat UGuntupalli 0 2,558 Oct-17-2019, 11:13 PM
Last Post: UGuntupalli
  pivot error shyamdba 1 3,090 Feb-02-2018, 11:12 PM
Last Post: klllmmm
  Should it be pivot or unstack for this sample shyamdba 0 2,543 Feb-02-2018, 05:50 PM
Last Post: shyamdba

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020