Python Forum
How to remove duplicates based on keys in a CSV file
#1
Hi,

I have a CSV file with a data structure like this:

key bhk Area Property_Type
310935 2 BHK 47.32 APARTMENT
310935 2 BHK 47.43 APARTMENT
310935 2 BHK 47.86 APARTMENT
310935 2 BHK 49.8 APARTMENT
310817 1BHK 28.56 APARTMENT
310817 1BHK 30.9 APARTMENT
310817 1BHK 30.9 APARTMENT
310817 1BHK 31.45 APARTMENT
310803 1BHK 25.92 APARTMENT
310803 1BHK 30.21 APARTMENT


Now I want to remove duplicates from the Area column, but on a per-key basis: a single key must not have duplicate Area values. The same Area may appear under other keys, just not more than once within the same key.

I am trying to write this but can't work out the logic.

This is my code:

import csv
OUTPUT_FILE = 'Desired_format.csv'
filename = "optionsbook.csv"
sublist = []
with open("./"+ filename, "r") as file,open(OUTPUT_FILE, 'w') as f_out:
    reader = csv.DictReader(file)
    for line in reader:
        line["key"] = line["bhk"],line["Area"],line["Property_Type"]
        if line["Area"] in line:
            continue
        else:
            sublist.append(line["key"])
#2
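Just remember each (key, Area) pair in a set and skip any row whose pair you have already seen.

seen = set()
for row in data:  # data: the rows, e.g. from csv.DictReader
    memorize = (row['key'], row['Area'])
    if memorize in seen:
        # this key already has a row with this Area, so skip it
        continue
    else:
        seen.add(memorize)
        print(row)
del seen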
By the way, the example data you provided is malformed.
It's not comma-separated, and if you use whitespace as the delimiter,
you'll get 5 columns for key 310935 and 4 columns for the rest,
because there is a space between 2 and BHK.
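
To fit the set idea into the original script, something like the following might work. It is only a sketch: it assumes the real file is comma-separated with the header key,bhk,Area,Property_Type, and the helper names unique_rows and dedupe_csv are made up for illustration. The sample rows are adapted from the first post, with the last row changed to show that the same Area under a different key is kept.

```python
import csv
import io


def unique_rows(rows):
    """Yield only the first row for each (key, Area) combination."""
    seen = set()
    for row in rows:
        pair = (row["key"], row["Area"])
        if pair in seen:
            continue  # this key already has a row with this Area
        seen.add(pair)
        yield row


def dedupe_csv(in_file, out_file):
    """Copy CSV rows from in_file to out_file, dropping repeated (key, Area) pairs."""
    reader = csv.DictReader(in_file)
    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in unique_rows(reader):
        writer.writerow(row)


# In-memory demo; the comma-separated layout is an assumption.
sample = io.StringIO(
    "key,bhk,Area,Property_Type\n"
    "310817,1BHK,28.56,APARTMENT\n"
    "310817,1BHK,30.9,APARTMENT\n"
    "310817,1BHK,30.9,APARTMENT\n"
    "310803,1BHK,30.9,APARTMENT\n"
)
result = io.StringIO()
dedupe_csv(sample, result)
print(result.getvalue())
```

For real files, open both with newline="" (as the csv module documentation recommends) and pass the two file objects to dedupe_csv, for example optionsbook.csv for reading and Desired_format.csv for writing.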
#3
Sorry, that was a typo. Nice example, though. I am trying to fit your example into my code. Thank you so much.