Python Forum
Grab columns from multiple files, combine into one
#1
I have some files stored in a cloud storage bucket, and each file contains different variables. I would like to write a function that takes the names of the variables I am interested in and builds a master data set containing only those columns. The function should iterate through the files and, whenever a file contains one of the requested column names, grab that column (or columns) and join it to a master dataframe. Below is what I have so far. Any help in developing this further would be much appreciated.

---
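import pandas as pd
from tensorflow.python.lib.io import file_io
# Assumption: `storage` is the Datalab storage module and `bucket_name` is defined elsewhere.
import google.datalab.storage as storage

def get_my_data(wanted):
  # Collect the requested columns from every CSV file in the bucket.
  frames = []
  files = [o.key for o in storage.Objects(bucket_name, '', '')]
  for key in files:
    path = "gs://%s/%s" % (bucket_name, key)
    # Read only the header row to see which columns this file has.
    with file_io.FileIO(path, 'r') as fh:
      header = pd.read_csv(fh, nrows=0).columns
    found = [c for c in wanted if c in header]
    if not found:
      continue
    # Reopen the file so the full read starts from the beginning.
    with file_io.FileIO(path, 'r') as fh:
      frames.append(pd.read_csv(fh, usecols=found))
  # Join side by side on the row index; assumes rows line up across files.
  return pd.concat(frames, axis=1) if frames else pd.DataFrame()

master = get_my_data(['var1', 'var2', 'var3'])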
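
The approach described in the post (check each file's header, pull the matching columns, join them into one frame) can be tried without a bucket. The following is a minimal sketch under stated assumptions: the helper name `collect_columns` and the sample CSV text are hypothetical, in-memory strings stand in for the bucket files, and the rows are assumed to line up by position across files. If the files share a key column instead, a merge on that key would likely be a better fit than a side-by-side concat.

```python
import io
import pandas as pd

def collect_columns(sources, wanted):
    """Return a DataFrame holding every wanted column found in sources.

    sources: dict mapping a name to CSV text (hypothetical stand-in for bucket files).
    wanted: list of column names to pick up.
    """
    frames = []
    for name, text in sources.items():
        # Read only the header to learn which columns this source has.
        header = pd.read_csv(io.StringIO(text), nrows=0).columns
        found = [c for c in wanted if c in header]
        if found:
            # Parse again from the start, loading only the matching columns.
            frames.append(pd.read_csv(io.StringIO(text), usecols=found))
    # Side-by-side join on the row index; assumes rows line up across sources.
    return pd.concat(frames, axis=1) if frames else pd.DataFrame()

# Hypothetical sample data standing in for two files in the bucket.
sources = {
    "a.csv": "id,var1,other\n1,10,x\n2,20,y\n",
    "b.csv": "id,var2,var3\n1,0.5,p\n2,0.7,q\n",
}
master = collect_columns(sources, ["var1", "var2", "var3"])
print(master)
```

Reading the header with `nrows=0` keeps the check cheap, and `usecols` avoids loading columns that are not needed. Note that a variable present in more than one file would show up as duplicate column names in the result.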