Python Forum
Remove some columns
Hi there,

I want to remove some columns, keeping only Age, Sex, BMI, Region and Charges, and then display the new table.
Please help, because I cannot get the new table to display as expected.

Age Sex BMI Smoker Region Children Charges
21 male 25.75 no northeast 2 3279.87
37 female 25.74 yes southeast 3 21454.49
18 male 30.03 no southeast 1 1720.35
37 male 30.68 no northeast 3 6801.44
58 male 32.01 no southeast 1 11946.63
46 male 26.62 no southeast 1 7742.11
25 male 31.19 no northeast 4 21736.33
Do you have any code?
I welcome all feedback.
The only dumb question, is one that doesn't get asked.
My Github
How to post code using bbtags

(Dec-13-2023, 12:48 AM)menator01 Wrote: Do you have any code?


import pandas as pd

# Specified list of names
specified_names = ['Age','Sex','BMI','Region','Charges']

# Read the CSV file into a pandas DataFrame
df = pd.read_csv('medical_insurance.csv')

# Get the column names from the first row of the DataFrame
columns = df.columns.tolist()

# Create a list of columns to keep (columns whose first row entries match the specified names)
columns_to_keep = [col for col in columns if df[col][0] in specified_names]

# Create a new DataFrame with only the columns to keep
filtered_df = df[columns_to_keep]

# Save the filtered DataFrame to a new CSV file
filtered_df.to_csv('filtered_medical_insurance', index=False)

#FL: Why these 2 lines don't work
#new_df = pd.read_csv('filtered_medical_insurance.csv')

You can use the usecols parameter of read_csv:

import pandas as pd

# Keep only the columns at positions 0, 1, 2, 4 and 6 (Age, Sex, BMI, Region, Charges)
df = pd.read_csv('test.csv', usecols=[0,1,2,4,6])
print(df)

   Age     Sex    BMI     Region   Charges
0   21    male  25.75  northeast   3279.87
1   37  female  25.74  southeast  21454.49
2   18    male  30.03  southeast   1720.35
3   37    male  30.68  northeast   6801.44
4   58    male  32.01  southeast  11946.63
5   46    male  26.62  southeast   7742.11
6   25    male  31.19  northeast  21736.33
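As a small variation, usecols also accepts column names, which may be easier to read than positions and should keep working if the column order in the file changes. This is only a sketch: it assumes the file is comma-separated, and it feeds the sample rows from the first post through io.StringIO as a stand-in for the real file.

```python
import io
import pandas as pd

# Sample rows from the first post, written as CSV text.
# Assumption: the real file is comma-separated with this header row.
CSV_TEXT = """Age,Sex,BMI,Smoker,Region,Children,Charges
21,male,25.75,no,northeast,2,3279.87
37,female,25.74,yes,southeast,3,21454.49
18,male,30.03,no,southeast,1,1720.35
37,male,30.68,no,northeast,3,6801.44
58,male,32.01,no,southeast,1,11946.63
46,male,26.62,no,southeast,1,7742.11
25,male,31.19,no,northeast,4,21736.33
"""

wanted = ['Age', 'Sex', 'BMI', 'Region', 'Charges']

# usecols with names: only these columns are parsed from the file
df = pd.read_csv(io.StringIO(CSV_TEXT), usecols=wanted)
print(df)
```

With a real file, replace io.StringIO(CSV_TEXT) with the file name, e.g. 'test.csv'.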

Using tabulate for display
# Do the imports
import pandas as pd
from tabulate import tabulate 

# Read the csv file with selected columns
df = pd.read_csv('test.csv', usecols=[0,1,2,4,6])

# Create new csv file
df.to_csv('newcsv.csv', index=False)

# Read new csv file
new_df = pd.read_csv('newcsv.csv')

# Using tabulate to format display output
new_df = tabulate(new_df, headers=list(new_df), showindex=False, tablefmt='pretty')

# Print new_df (tabulate returns a formatted string)
print(new_df)

+-----+--------+-------+-----------+----------+
| Age |  Sex   |  BMI  |  Region   | Charges  |
+-----+--------+-------+-----------+----------+
| 21  |  male  | 25.75 | northeast | 3279.87  |
| 37  | female | 25.74 | southeast | 21454.49 |
| 18  |  male  | 30.03 | southeast | 1720.35  |
| 37  |  male  | 30.68 | northeast | 6801.44  |
| 58  |  male  | 32.01 | southeast | 11946.63 |
| 46  |  male  | 26.62 | southeast | 7742.11  |
| 25  |  male  | 31.19 | northeast | 21736.33 |
+-----+--------+-------+-----------+----------+
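For what it's worth, the script from the earlier post looks like it fails for two reasons. First, df[col][0] is the value in the first data row (21, 'male', ...), not the column name, so nothing matches specified_names and the list of columns to keep ends up empty. Second, the file is saved as 'filtered_medical_insurance' without the .csv extension, so reading 'filtered_medical_insurance.csv' afterwards cannot find it. Below is a sketch of a corrected version. It assumes a comma-separated file, uses the sample rows from the first post in place of medical_insurance.csv, and writes to a temporary folder only so the example cleans up after itself.

```python
import io
import os
import tempfile
import pandas as pd

# Sample rows from the first post, standing in for medical_insurance.csv.
# Assumption: the real file is comma-separated with this header row.
CSV_TEXT = """Age,Sex,BMI,Smoker,Region,Children,Charges
21,male,25.75,no,northeast,2,3279.87
37,female,25.74,yes,southeast,3,21454.49
18,male,30.03,no,southeast,1,1720.35
37,male,30.68,no,northeast,3,6801.44
58,male,32.01,no,southeast,1,11946.63
46,male,26.62,no,southeast,1,7742.11
25,male,31.19,no,northeast,4,21736.33
"""

specified_names = ['Age', 'Sex', 'BMI', 'Region', 'Charges']

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Compare the column names themselves, not the values in the first row
columns_to_keep = [col for col in df.columns if col in specified_names]
filtered_df = df[columns_to_keep]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Use the same file name, including .csv, for writing and reading
    out_path = os.path.join(tmp_dir, 'filtered_medical_insurance.csv')
    filtered_df.to_csv(out_path, index=False)
    new_df = pd.read_csv(out_path)

# to_string(index=False) prints the table without the row numbers
print(new_df.to_string(index=False))
```

If the file only needs to be displayed and not saved, the to_csv and second read_csv steps can be dropped and filtered_df printed directly.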

Thank you Menator01
