Python Forum
How to transform from wide to long format in python
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to transform from wide to long format in python
#1
Check the Problem-2.jpg file.

The files will (should) always be the same from columns A to H. The second last column will be S3Link. The columns between H and the S3Link column change between some files depending on the data is being reported.

For the final CSV, we need to take columns A to H, the S3Link column and the DataSource as below. Columns between column H and the S3Link column need to be normalised into the AttributeName and AttributeValue columns as shown below. It needs to be dynamic. In these files, the number of dynamic columns can vary from 1 up to 5 or 6, maybe.

The problem is that I am not able to dynamic loop and have difficulty normalising stuff into the Attribute Value and Attribute name. Till now, I am able to write python code:

1. The First Loop is Looping over CSV file
2. The Second Loop is for reaching the end of the Excel loop.
3. The third loop is for traversing columns - I think this is wrong.
for fn in csv_files:
    all_dfs = pd.read_csv(fn)
        for i in range(1,len(df)):  # Row Loop
            frameNo = all_dfs.iloc[i,1]
            gpsTimestamp = all_dfs.iloc[i,2]
            videoTimeInSec = all_dfs.iloc[i,3]
            latitude = all_dfs.iloc[i,4]
            longitude = all_dfs.iloc[i,5]
            metreID = all_dfs.iloc[i,6]
            Distance = all_dfs.iloc[i,7]
            vineNumber = all_dfs.iloc[i,8]            
            for j in range(1,len(df.loc[0,"vineNumber":"S3Link")):  #Column Loop For Normalization
                AttributeName = all_dfs.iloc[i,9]
                AttributeValue = all_dfs.iloc[i,10]
                S3Link = all_dfs.loc[i,S3Link]
                DataSource = all_dfs.loc[i,DataSource]
                rs = rs.append(pd.DataFrame({
                    "frameNo" = [frameNo]
                    "gpsTimestamp" = [gpsTimestamp]
                    "videoTimeInSec" = [videoTimeInSec]
                    "latitude" = [latitude]
                    "longitude" = [longitude]
                    "metreID" = [metreID]
                    "Distance" = [Distance]
                    "vineNumber" =  [vineNumber]
                    "AttributeName" = [AttributeName]
                    "AttributeValue" = [AttributeValue]
                    "S3Link" = [S3Link]
                    "DataSource" = [DataSource]
                }),ignore_index=True
)

Input File attached.

I don't know my logic for this task is correct or wrong. Any help????

Attached Files

Thumbnail(s)
   

.csv   VY6_Row1_report_url_detections_per_frame.csv (Size: 10.56 KB / Downloads: 199)
.csv   VY6_Row2_report_url_detections_per_frame.csv (Size: 8.5 KB / Downloads: 178)
Reply
#2
I think you are overthinking/overcomplicating this problem.

Is it save to state that
  • You want first 8 columns (i.e. A-H), then last 2 columns, then the 9th and 10th columns
  • The column names for first 10 columns are the same as in the original file, the last 2 columns (former 9th and 10th) will have new header - AttributeName and AttributeValue

fname = 'VY6_Row1_report_url_detections_per_frame.csv'
# fname = 'VY6_Row2_report_url_detections_per_frame.csv'

df =  pd.read_csv(fname)
col_drop = len(df.columns) - 10
df['AttributeName'] = df[df.columns[8]]
df['AttributeValue'] = df[df.columns[9]]
df.drop(labels=df.columns[8:8+col_drop], axis=1, inplace=True)
df.to_csv('VY6_out.csv', index=False)
Do you need to use pandas? Using csv module would also do.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  [SOLVED] Looking for documentation on Reportlab's canvas.transform() function NeilUK 1 621 Aug-23-2023, 01:21 PM
Last Post: NeilUK
  Transform Dic to dataframe d9d9d 4 1,396 Apr-14-2022, 09:35 AM
Last Post: perfringo
  Transform 3 Columns into Single Column DaveG 8 1,890 Apr-04-2022, 08:42 AM
Last Post: Pedroski55
  Long-term stable source to get news headlines with Python? sandufi 4 1,945 Dec-23-2021, 09:48 AM
Last Post: sandufi
  Importing Program Wide JarredAwesome 4 2,198 Sep-07-2020, 04:34 PM
Last Post: JarredAwesome
  Python long running script - causes RDP failure to server? william101 1 2,340 Jun-08-2020, 08:18 AM
Last Post: nuffink
  What is the best way to set application-wide config values? ajorona 1 1,920 May-07-2020, 05:03 PM
Last Post: buran
  Transform list or set regardless of nesting structure blubb 2 1,971 Mar-10-2020, 07:17 PM
Last Post: ibreeden
  output a list of random numbers 'x' columns wide adityavpratap 4 3,012 Jan-13-2020, 05:32 PM
Last Post: perfringo
  Long command with characters not working in Python on Oracle Linux 7 iaas_infra 10 6,267 Jul-19-2019, 04:53 PM
Last Post: ichabod801

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020