Parsing CSVs...

ChLenx79 · Dec-12-2022, 02:39 PM

Hi All,

Hope you are all well!

Need a little nudge...

I have a CSV, within which a couple of lines look like this...

CompanyName LTD, "[email protected],[email protected]", "12234,56678"

How would I select the second email from the second column? Or print each of the numbers in the third column individually?

import csv


dataSetFile = "DemoDataSetForReports.csv"

with open(dataSetFile, 'r') as dataSet:
    csvReader = csv.reader(dataSet)

    for row in csvReader:

        print(row[2])

The above code prints: 12234,56678

I assume CSV is the correct module for doing this?

Any help would be appreciated!

**perfringo** · (This post was last modified: Dec-12-2022, 04:00 PM by perfringo.)

Yes, the csv module is the right tool for the task at hand.

Some suggestions:

read file with newline='' (from documentation: "If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.")
if the file has a space after the comma, use skipinitialspace=True
remember that Python uses 0-based indexing

You need to take the second item in the row (at index 1), split it at the comma, and take the second item of the result (again at index 1).

So you can write:

import csv

with open("my_magic.csv", "r", newline="") as f:
    data = csv.reader(f, skipinitialspace=True)
    for row in data:
        print(row[1].split(',')[1])

# prints [email protected]
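For the other half of the question (printing each number in the third column on its own), the same split should work on index 2. A minimal sketch, assuming an in-memory io.StringIO copy of the sample row instead of the real file; the e-mail addresses are made-up placeholders and the helper name split_field is hypothetical, not part of the csv module:

```python
import csv
import io

# Sample row shaped like the one in the question; addresses are placeholders.
sample = 'CompanyName LTD, "first@example.com,second@example.com", "12234,56678"\n'


def split_field(row, index):
    """Split one multi-valued column into a list of stripped strings."""
    return [item.strip() for item in row[index].split(",")]


reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
for row in reader:
    numbers = split_field(row, 2)
    # Print each number from the third column on its own line.
    for number in numbers:
        print(number)
```

With a real file, the io.StringIO object would be replaced by the file handle from open(..., newline="").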

ChLenx79 · Dec-13-2022, 03:29 PM

You sir, are a genius!!

Thanks so much!

**deanhystad** · (This post was last modified: Dec-13-2022, 07:39 PM by deanhystad.)

skipinitialspace=True is required if you have spaces after your separator character.

Output:           separator
               v
CompanyName LTD, "[email protected],[email protected]"
                ^
            whitespace

If you don't skip the initial space(s), the csv reader misses the starting quote, which messes up parsing the rest of the line. Instead of three columns like this:

Output:CompanyName LTD
[email protected],[email protected]
12234,56678

you end up with five:

Output:CompanyName LTD
 "[email protected]
[email protected]"
 "12234
56678"

It's such a subtle, non-obvious difference, yet it has a big effect.
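If you want to see the difference for yourself, here is a small sketch that parses the same line both ways. It assumes an in-memory io.StringIO in place of the file, and the e-mail addresses are made-up placeholders:

```python
import csv
import io

# One line shaped like the sample row, with a space after each comma.
line = 'CompanyName LTD, "first@example.com,second@example.com", "12234,56678"\n'

# Leading spaces skipped: the quotes are seen at the start of each field.
with_skip = next(csv.reader(io.StringIO(line), skipinitialspace=True))

# Default behaviour: the space comes first, so the quote is treated as
# ordinary text and the commas inside the quotes split the field.
without_skip = next(csv.reader(io.StringIO(line)))

print(len(with_skip), with_skip)
print(len(without_skip), without_skip)
```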
The newline='' in:

with open("my_magic.csv", "r", newline="") as f:

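does not remove newline characters; it tells open() to pass line endings through untranslated so that the csv module can handle them itself. When reading, this only matters if a quoted field contains an embedded line break, so I don't think it is really needed for this file, although the documentation recommends always specifying it. When writing a csv file, newline='' prevents a blank line appearing after each row: the csv writer ends each row with \r\n, and without newline='' the \n is translated again, giving \r\r\n. That only happens on Windows.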
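As a rough illustration of what newline='' changes when reading, the sketch below writes one row whose second field contains an embedded \r\n, then reads it back with and without newline=''. The temporary file and the field contents are made up for the example:

```python
import csv
import os
import tempfile

# A row whose second field contains an embedded Windows-style line break.
row = ["CompanyName LTD", "line one\r\nline two"]

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    # newline="" on write: the csv writer's own \r\n is written unchanged.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerow(row)

    # newline="" on read: line endings reach the csv reader untranslated.
    with open(path, "r", newline="") as f:
        kept = next(csv.reader(f))

    # Default text mode: \r\n is translated to \n before the reader sees it.
    with open(path, "r") as f:
        translated = next(csv.reader(f))
finally:
    os.remove(path)

print(kept == row)
print(repr(translated[1]))
```

For a file like the one in this thread, with no line breaks inside quoted fields, both reads should give the same rows.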

I would read the file like this:

import csv

with open("test.csv", "r") as file:
    data = csv.reader(file, skipinitialspace=True)
    for company, emails, numbers in data:
        emails = emails.split(",")
        numbers = numbers.split(",")
        print(company, emails, numbers)
