Python Forum

We need to transfer some names and email addresses from the Claws Mail address book to a Samsung Galaxy Tab A. To understand what the Galaxy expects, I exported what is in its contacts app. It is in VCF version 2.1; an example from https://docs.fileformat.com/email/vcf/ is as follows:

Output:
BEGIN:VCARD
VERSION:2.1
N:Gump;Forrest;;Mr.
FN:Forrest Gump
ORG:Bubba Gump Shrimp Co.
TITLE:Shrimp Man
PHOTO;GIF:http://www.example.com/dir_photos/my_photo.gif
TEL;WORK;VOICE:(111) 555-1212
TEL;HOME;VOICE:(404) 555-1212
ADR;WORK;PREF:;;100 Waters Edge;Baytown;LA;30314;United States of America
LABEL;WORK;PREF;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:100 Waters Edge=0D=
 =0ABaytown\, LA 30314=0D=0AUnited States of America
ADR;HOME:;;42 Plantation St.;Baytown;LA;30314;United States of America
LABEL;HOME;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:42 Plantation St.=0D=0A=
 Baytown, LA 30314=0D=0AUnited States of America
EMAIL:[email protected]
REV:20080424T195243Z
END:VCARD

Claws can export the address book contents to either HTML or LDIF format. Rather than get bogged down in converting from one of those formats to VCF, I started with the script at https://r3mlab.github.io/python/2018/07/...riter.html , which reads a CSV file. After correcting a few errors, the current code is:

#!/usr/bin/env python

import csv

def vcfWriter(name, email, phone, category):
    vcfLines = []
    vcfLines.append('BEGIN:VCARD')
    vcfLines.append('VERSION:4.0')
    vcfLines.append('FN:%s' % name)
    vcfLines.append('EMAIL:%s' % email)
    vcfLines.append('TEL:%s' % phone)
    vcfLines.append('CATEGORIES:%s' % category)
    vcfLines.append('END:VCARD')
    vcfString = '\n'.join(vcfLines) + '\n'
    return vcfString

# Get data from the CSV file
csvFile = open('contacts.csv')
csvReader = csv.reader(csvFile)
csvData = list(csvReader)

# Create the output file
outputFile = open('contacts.vcf', 'w')

# Iterate over the lines of the CSV table
for row in range(len(csvData)):
    if row == 0:
        continue # Skip the first row (headers)
    else:
        # Get contact data from current row
        name = csvData[row][0]
        email = csvData[row][1]
        phone = csvData[row][2]
        category = csvData[row][3]

        # Write the corresponding vCard string to the output file:
        outputFile.write(vcfWriter(name, email,phone, category))

# Don't forget to close both files
outputFile.close()      
csvFile.close()

The input data originally used TAB as the delimiter, but that had to be changed to a COMMA to get the script to work. Here is the input data, in file contacts.csv:
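As an aside, the csv module can read tab-delimited data directly, so the file may not have needed converting to commas. A minimal sketch, assuming the same four columns; the sample rows and addresses below are made up for illustration:

```python
import csv
import io

# Hypothetical tab-separated sample standing in for the original input file
sample = (
    "Alice\talice@example.com\t0123456789\tFriends\n"
    "Bob\tbob@example.com\t0987654321\tWork\n"
)

# delimiter="\t" tells csv.reader to split fields on TAB instead of comma
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

for name, email, phone, category in rows:
    print(name, email, phone, category)
```

With a real file, `io.StringIO(sample)` would be replaced by `open('contacts.csv', newline='')`.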

Output:
Alice ,         [email protected], 0123456789, Friends
Bob, [email protected], 0987654321, Work

and the output data in file contacts.vcf is

Output:
BEGIN:VCARD
VERSION:4.0
FN:Bob
EMAIL: [email protected]
TEL: 0987654321
CATEGORIES: Work
END:VCARD

Note that it is only reading the second line of the file, not both. The version is easy to modify. The exports from Claws Mail are only in HTML or LDIF format, which are quite different from a CSV file, where there is one row per person and all the fields are delimited by a specific character.
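The likely reason only Bob appears is the `if row == 0: continue` check: it skips the first row as a header, but contacts.csv as shown has no header row, so Alice is discarded. Below is a minimal sketch of one way around it, assuming the file really has no header line. The `skipinitialspace` option and `.strip()` also remove the stray spaces that show up after `EMAIL:` and `TEL:` in the output. The function name `vcf_writer` and the email addresses are placeholders, not taken from the original script.

```python
import csv
import io

def vcf_writer(name, email, phone, category):
    """Return one vCard as a string (same fields as the script above)."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:4.0",
        "FN:%s" % name,
        "EMAIL:%s" % email,
        "TEL:%s" % phone,
        "CATEGORIES:%s" % category,
        "END:VCARD",
    ]
    return "\n".join(lines) + "\n"

# Stand-in for open('contacts.csv'); placeholder addresses, no header row
csv_file = io.StringIO(
    "Alice ,         alice@example.com, 0123456789, Friends\n"
    "Bob, bob@example.com, 0987654321, Work\n"
)

cards = []
for row in csv.reader(csv_file, skipinitialspace=True):
    if not row:
        continue  # ignore blank lines
    # Every row is data, so nothing is skipped; strip leftover whitespace
    name, email, phone, category = (field.strip() for field in row)
    cards.append(vcf_writer(name, email, phone, category))

output = "".join(cards)
print(output)
```

If the real file does have a header line, calling `next(reader, None)` once before the loop would skip it without any row counting.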

So, I definitely need to move from CSV to some sort of flat file format. The HTML looks quite messy and possibly hard to work with, as there are cells within a table, lots of HTML code, etc. The LDIF, on the other hand, looks like this:

Output:
dn: uid=538705298
objectClass: person
objectClass: inetOrgPerson
cn: Forrest Gump
sn: Gump
givenName: Forrest
displayName: Forrest Gump
mail: [email protected]

The "dn: uid=" line indicates a new record; the unique number there is irrelevant for this purpose. So in summary, how do I ensure all records are read in the above script, and how can the CSV be replaced with some sort of flat file processing, please?

Is the second part of the modifications suitably addressed by https://pypi.org/project/ldif/ ?
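For records as simple as the one above, plain line-by-line processing may be enough even without a dedicated library. The following is only a sketch with a hypothetical helper `parse_simple_ldif`; it assumes plain "key: value" lines with no folded (continued) lines, no base64 "key:: value" attributes and no comments, all of which real LDIF allows. The second record and both addresses are invented for illustration.

```python
def parse_simple_ldif(text):
    """Parse simple LDIF text into a list of dicts (attribute -> list of values).

    Assumption: only plain 'key: value' lines; folded lines, base64
    values and comments are not handled.
    """
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines just separate records
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'key: value' line
        key, value = key.strip(), value.strip()
        if key == "dn":
            # A 'dn' line starts a new record; its value is not needed here
            current = {}
            records.append(current)
        elif current is not None:
            # Attributes such as objectClass can repeat, so keep a list
            current.setdefault(key, []).append(value)
    return records

sample = """dn: uid=538705298
objectClass: person
objectClass: inetOrgPerson
cn: Forrest Gump
sn: Gump
givenName: Forrest
mail: forrest@example.com

dn: uid=538705299
objectClass: person
cn: Jenny Curran
mail: jenny@example.com
"""

records = parse_simple_ldif(sample)
for record in records:
    print(record["cn"][0], record["mail"][0])
```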

It was easier to look at doing this with the LDIF format. Here is the code:

#!/usr/bin/env python

from ldif3 import LDIFParser
from pprint import pprint

parser = LDIFParser(open("claws_export.ldif", "rb"))

for dn, record in parser.parse():
    
    name = ""
    if 'cn' in record:
        name = record['cn'][0]
    
    surname = ""
    if 'sn' in record:
        surname = record['sn'][0]
    
    given_name = ""
    if 'givenName' in record:
        given_name = record['givenName'][0]
    
    display_name = ""
    if 'displayName' in record:
        display_name = record['displayName'][0]
    
    email = ""
    if 'mail' in record:
        email = record['mail'][0]

    print ('BEGIN:VCARD')
    print ('VERSION:2.1')
    print ("N:" + surname + ";" + given_name + ";;;")
    print ("FN:" + name)
    print ("EMAIL;HOME:" + email)
    print ("END:VCARD")

The output data looks like it nearly matches what is required for import into the Galaxy contacts. If I do a

print(record)

this was the output

Quote:
OrderedDict([('objectClass', ['inetOrgPerson']), ('cn', ['Forrest Gump']), ('sn', ['Gump']), ('displayName', ['Forrest Gump']), ('mail', ['[email protected]'])])

Can any improvements be made to the code? For example, it seems a waste having to do all those "if" statements; possibly Python has some sort of lookup function to look up a key within the record?
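Since the printed record is an OrderedDict whose values are lists, one option is the dictionary `get()` method, which returns a default when the key is missing. A minimal sketch, assuming every record has that shape; the helpers `first` and `to_vcard` are hypothetical names, and the record is hand-built with a placeholder address rather than parsed from a file.

```python
from collections import OrderedDict

def first(record, key, default=""):
    """Return the first value stored under key, or default if absent or empty."""
    values = record.get(key) or [default]
    return values[0]

def to_vcard(record):
    """Build a vCard 2.1 string from one LDIF-style record."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:2.1",
        "N:" + first(record, "sn") + ";" + first(record, "givenName") + ";;;",
        "FN:" + first(record, "cn"),
        "EMAIL;HOME:" + first(record, "mail"),
        "END:VCARD",
    ]
    return "\n".join(lines) + "\n"

# Hand-built record shaped like the printed one (no givenName key)
record = OrderedDict([
    ("objectClass", ["inetOrgPerson"]),
    ("cn", ["Forrest Gump"]),
    ("sn", ["Gump"]),
    ("displayName", ["Forrest Gump"]),
    ("mail", ["forrest@example.com"]),
])

card = to_vcard(record)
print(card)
```

In the parsing loop, the five `if` blocks would then collapse into a single `print(to_vcard(record))`.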

My last question was answered at https://python-forum.io/Thread-Can-I-rep...#pid138636

jehoshua
