Python Forum
How to use the excel filename as a value to populate new column, using Pandas?
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to use the excel filename as a value to populate new column, using Pandas?
#1
I'm new to Python. I have about 50 excel workbooks that all have different names e.g. 012018 CBA, 022018 CBA, etc. For each, I would like to insert a new column called "Date" and use the first 6 digits "012018" of the excel filename as the value to populate the entire Date column. I would like to accomplish this using Pandas. Thank you!
Reply
#2
If your data frame is defined and prefilled, you can create a new column and fill it as follows df['Date'] = 'desired value'. This automatically creates Date column in the data frame and fills it with desired value.

If you want to incorporate data from all excel files into one data frame, your algorithm might be the following:

  1. Define an empty list, e.g., named acc (it will be used later).
  2. Use os.path.walk (os.walk in Py3) or glob.glob to iterate over all files;
  3. Load data from each file using pandas, e.g. pandas.read_csv, pandas.read_excel;
  4. Once you iterating files, you know their names; Let filename is the current filename of a file being loaded into df; You can just do df['Dates'] = filename;
  5. Append each df to acc list;
  6. Use pd.concat to combine all dfs stored in acc into a new data frame.
Reply
#3
Thanks, but maybe I didn't explain it very well. I have 50 Excel files with a naming sequence of 012018 CBA, 022018 CBA, 032018 CBA, 042018 CBA, etc. For each file, I want to use the first 6 digits of the filename to populate the new column within that particular file ... Then do the same for the all the others.

For example, first file called "012018 CBA", I want to insert a new column called "Date" and use the "012018" to populate that column. Next file called "022018 CBA", I want to insert a new column called "Date" and use the "022018" to populate that column. Next file called "032018 CBA", I want to insert a new column called "Date" and use the "032018" to populate that column, etc for all 50 files.

Make sense?
Reply
#4
Your files have regular names, so you can extract desired dates using index slicing, e.g. date = filename[:6].

(Jun-26-2019, 10:32 AM)Dequanharrison Wrote: I want to insert a new column called "Date" and use the "032018" to populate that column, etc for all 50 files.

So, you will get a data frame with at least 50 columns that have the same name Date? In any case, you will likely use either os.walk or glob.glob to get all file names. You can read about these modules in official docs here and here.

This isn't to be a hard problem, so it would be better if you post some code to show your efforts on solving the problem.
Reply
#5
Not looking to create a datadrame with 50 columns, but rather end up with 50 dataframes/Excel files woth the additional new "date" column filled with the filename of the original file. Saved as a CSV.

Make sense?
Reply
#6
I'll try to get started with what I know and post some sample code. Thanks.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Pandas keep existing format of Excel AlphaInc 2 882 Jan-11-2024, 03:44 AM
Last Post: plonkarchivist
  HTML Decoder pandas dataframe column mbrown009 3 962 Sep-29-2023, 05:56 PM
Last Post: deanhystad
  pandas column percentile nuncio 7 2,383 Aug-10-2022, 04:41 AM
Last Post: nuncio
  pandas: Compute the % of the unique values in a column JaneTan 1 1,756 Oct-25-2021, 07:55 PM
Last Post: jefsummers
  Pandas Data frame column condition check based on length of the value aditi06 1 2,655 Jul-28-2021, 11:08 AM
Last Post: jefsummers
  How to move each team row to a new column. Pandas vladiwnl 0 1,694 Jun-13-2021, 08:10 AM
Last Post: vladiwnl
  iretate over columns in df and calculate euclidean distance with one column in pandas Pit292 0 3,268 May-09-2021, 06:46 PM
Last Post: Pit292
  [Pandas] Write data to Excel with dot decimals manonB 1 5,774 May-05-2021, 05:28 PM
Last Post: ibreeden
Question Pandas - Creating additional column in dataframe from another column Azureaus 2 2,915 Jan-11-2021, 09:53 PM
Last Post: Azureaus
  Pandas: summing columns conditional on the column labels ddd2332 0 2,076 Sep-10-2020, 05:58 PM
Last Post: ddd2332

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020