Basic Beginner question
#1
Hi all,

I'm trying to work with a large CSV file to effectively return what a filter would in Excel...
So when the value in column 4 equals a defined number, return that entire row.
It might seem pointless, but some of the CSV files won't open in Excel as they have too many rows, so this would be quite useful for looking at just one customer.
Also, I plan on expanding this a lot further; this is just the first step!

What I have so far is:

import csv
import sys


number = '1'

csv_file = csv.reader(open('File.csv', 'r', encoding='Latin1'))
filename = open('Result.csv', 'w')
sys.stdout = filename            # redirect print() output into Result.csv
print(next(csv_file))            # writes the header row (as a Python list, hence the quotes)

for row in csv_file:
    if number == row[4]:
        print(','.join(row))
This gets me some of the way there; however, there are a few cases where cells have ',' within them, which results in those values being split across two different columns.

I was trying to use quotechar="'", but I must be using it wrong, as the result file contains nothing when I do.

Also, using
print(','.join(row))
changed 'text' to text (which is what I want); however, the header row still has 'text'.

Does anyone know a good way of fixing these two issues?
I'm guessing it's quite straightforward, but anything I find online I seem to use wrong: either it returns errors, or it runs but the result file only has the header row, or nothing at all!

Thanks in advance,

N
#2
I would look closely at your CSV file. Confirm that you have the right quotechar. See how the file handles quotechars within quoted text (the escapechar setting in csv.reader).

I would also use filename.write rather than redirecting sys.stdout, in the interests of keeping it simple.
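
For example, a minimal sketch of that, assuming the file wraps fields that contain commas in double quotes (the csv module's default quotechar); writing with csv.writer instead of print also handles the quoting on the way back out:

import csv

number = '1'

# Assumes fields containing commas are wrapped in double quotes; adjust
# quotechar (and escapechar if needed) to match what is actually in the file.
with open('File.csv', 'r', encoding='Latin1', newline='') as infile, \
     open('Result.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile, quotechar='"')
    writer = csv.writer(outfile)
    writer.writerow(next(reader))      # copy the header row as-is
    for row in reader:
        if row[4] == number:
            writer.writerow(row)       # the writer re-quotes fields that need it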
#3
Why not simply use a pandas DataFrame to read the CSV and filter out the row(s)? Something like this:


import pandas as pd

number = '1'    # value to filter on (use 1 without quotes if pandas reads the column as numeric)

df = pd.read_csv(r'C:\Test\Test.csv')       # raw string so the backslashes aren't treated as escapes
df_filtered = df[df['Column'] == number]    # 'Column' is the header of the column to match on
print(df_filtered)
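
If you then want the filtered rows in a file rather than just on screen, pandas can write them straight back out (the filename here is only an example):

df_filtered.to_csv('Result.csv', index=False)    # write the filtered rows to a new CSV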
#4
From a long-term perspective (given that you want to expand, etc.), the best approach would be to load the CSV file into a database.
For a start you can look into sqlite3 - built-in support comes with Python. It will be a scalable solution, and you can upgrade to something like MySQL, etc. in the future.
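
For example, a rough sketch of that idea, assuming a five-column file; the database, table, and column names here are placeholders:

import csv
import sqlite3

conn = sqlite3.connect('customers.db')
cur = conn.cursor()
cur.execute('CREATE TABLE IF NOT EXISTS orders (col1, col2, col3, col4, customer_id)')

with open('File.csv', 'r', encoding='Latin1', newline='') as f:
    reader = csv.reader(f)
    next(reader)                                       # skip the header row
    cur.executemany('INSERT INTO orders VALUES (?, ?, ?, ?, ?)', reader)
conn.commit()

# The Excel-style filter then becomes a plain SQL query.
for row in cur.execute('SELECT * FROM orders WHERE customer_id = ?', ('1',)):
    print(row)

conn.close()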
#5
Thanks for all the help.
I tried a few of the solutions, and at the moment, for the simple part of the overall task, the pandas solution worked a treat!
I will no doubt have many more questions while I'm getting up to speed in Python, so thanks again for the quick and helpful replies!