Exporting contacts as vCard file
I use Thunderbird as my mail client, and my contacts are stored on my server using the CardDAV protocol. Mozilla Thunderbird can export contacts from the main address book interface (« Tools -> Export », then choose the vCard export type).
If you store your contacts another way, for example with Google and Gmail, you can also export them in vCard format.
I have created two contacts to show the incompatibility with the phone.
- test 1
- test éàéç 2
You will end up with a file like this one « vcard_test_export.vcf » :
begin:vcard
fn:test 1
n:1;test
tel;cell:+33 6 00 00 00 00
version:2.1
end:vcard
begin:vcard
fn;quoted-printable:test =C3=A9=C3=A0=C3=A9=C3=A7 2
n;quoted-printable:2;test =C3=A9=C3=A0=C3=A9=C3=A7
tel;cell:+33 7 00 00 00 00
version:2.1
end:vcard
Sending vCard to Nokia phone
If we try to send this file to the Nokia phone via Bluetooth, the phone correctly detects it as a vCard file and asks whether to add the contact. But it does so only for the first contact in the file ("test 1" in our example).
So the first issue is that the Nokia phone does not support multiple contacts in one vCard file. We will have to split each contact into its own vCard file.
Also, the second contact contains characters that are not part of the ASCII table. You can see that the vCard export writes these characters in "quoted-printable" format, with escape codes such as "=C3=A9".
Quoted-printable is a way of encoding characters so that they are not mangled when data is exported or sent over the wire. Each escape code is an "=" sign followed by two hexadecimal digits that give the value of one byte.
So here, =C3=A9 stands for the bytes \xc3\xa9, which is the UTF-8 encoding of the character "é".
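To make the decoding step concrete, here is a minimal sketch using the standard quopri module. It is written for Python 3 (the script described later targets 2.7), and the variable names are only illustrative.

```python
import quopri

# Quoted-printable payload as it appears in the exported vCard.
encoded = b"test =C3=A9=C3=A0=C3=A9=C3=A7 2"

# quopri.decodestring turns each "=XX" escape back into one raw byte.
raw = quopri.decodestring(encoded)

# The raw bytes are UTF-8, so decoding them gives the readable name.
name = raw.decode("utf-8")
print(name)
```

Each accented character takes two escape codes here because UTF-8 encodes it on two bytes.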
But what happens if we send the second contact via Bluetooth? The Nokia phone does not understand the UTF-8 encoding, and saves the contact name as is:
"test =C3=A9=C3=A0=C3=A9=C3=A7 2"
If we look at the charset section of the vCard standard (RFC 6350, section 3.1), it says:
« The charset (see [RFC3536] for internationalization terminology) for vCard is UTF-8 as defined in [RFC3629]. »
And it is quite clear with this sentence:
« There is no way to override this. »
So the phone does not follow this part of the standard. But how do we find out which charset this Nokia phone expects in a vCard? It is simple: we create a contact on the Nokia phone, "test éçà", and send it via Bluetooth to the computer (Contacts menu, "Send bus. card" -> "via Bluetooth").
Open the VCF file on your computer, and what do you see? The Nokia phone overrides the encoding information by adding a charset specification ("N;CHARSET=ISO-8859-1;ENCODING=QUOTED-PRINTABLE:").
ISO-8859-1 (Latin-1) is another character set, widely used before UTF-8 to encode accented Latin characters.
So we have 2 issues here:
- Charset encoding
- vCard Split: One VCF file per contact
Issue 1 : Hacking the vCard encoding
We need to hack the vCard output so that quoted-printable UTF-8 becomes quoted-printable ISO-8859-1. Also, to keep the formatting simple, I decided to remove the FN field from the vCard output (only the N field is kept).
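The transformation can be sketched as follows. This is not the actual script, only an illustration under a few assumptions: Python 3, standard library only, names short enough that quoted-printable does not wrap the line, and function and pattern names (utf8_qp_to_latin1_qp, rewrite_card, N_QP_LINE, FN_LINE) that are made up for the example. The N prefix is the one the Nokia phone itself produces.

```python
import quopri
import re

# Matches the quoted-printable N line produced by the export.
N_QP_LINE = re.compile(r"^n;quoted-printable:([^\r\n]*)", re.IGNORECASE | re.MULTILINE)
# Matches any FN line (plain or quoted-printable), including its line ending.
FN_LINE = re.compile(r"^fn[;:][^\r\n]*\r?\n", re.IGNORECASE | re.MULTILINE)
# Prefix observed in the vCard sent by the Nokia phone.
NOKIA_N_PREFIX = "N;CHARSET=ISO-8859-1;ENCODING=QUOTED-PRINTABLE:"


def utf8_qp_to_latin1_qp(payload):
    """Re-encode a quoted-printable UTF-8 value as quoted-printable ISO-8859-1."""
    text = quopri.decodestring(payload.encode("ascii")).decode("utf-8")
    # Raises UnicodeEncodeError if a character does not exist in ISO-8859-1.
    latin1 = text.encode("iso-8859-1")
    return quopri.encodestring(latin1).decode("ascii").rstrip("\n")


def rewrite_card(card):
    """Drop the FN field and rewrite the N field for the phone."""
    card = FN_LINE.sub("", card)
    return N_QP_LINE.sub(
        lambda m: NOKIA_N_PREFIX + utf8_qp_to_latin1_qp(m.group(1)), card
    )


card = (
    "begin:vcard\n"
    "fn;quoted-printable:test =C3=A9=C3=A0=C3=A9=C3=A7 2\n"
    "n;quoted-printable:2;test =C3=A9=C3=A0=C3=A9=C3=A7\n"
    "tel;cell:+33 7 00 00 00 00\n"
    "version:2.1\n"
    "end:vcard\n"
)
print(rewrite_card(card))
```

In ISO-8859-1 each accented character fits in a single byte, so the rewritten N line carries one escape code per character instead of two. Plain ASCII lines such as "n:1;test" are left untouched.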
Issue 2 : Splitting the vCard file
For each vCard contact discovered in the vCard output, we will create a single VCF file with the vCard information.
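A minimal sketch of the splitting step, assuming Python 3 and plain line scanning instead of a vCard parser. The helper names (split_vcards, write_cards) and the numbered file names are hypothetical and chosen for simplicity; they are not what the script itself does.

```python
from pathlib import Path


def split_vcards(text):
    """Return one string per begin:vcard ... end:vcard block found in text."""
    cards, current = [], None
    for line in text.splitlines():
        marker = line.strip().lower()
        if marker == "begin:vcard":
            current = [line]          # start collecting a new card
        elif current is not None:
            current.append(line)
            if marker == "end:vcard":
                cards.append("\n".join(current) + "\n")
                current = None        # ignore anything between cards
    return cards


def write_cards(cards, out_dir="out"):
    """Write each card to its own numbered .vcf file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, card in enumerate(cards, start=1):
        path = out / f"contact_{index:03d}.vcf"
        path.write_text(card, encoding="utf-8")
        paths.append(path)
    return paths


sample = (
    "begin:vcard\nfn:test 1\nn:1;test\ntel;cell:+33 6 00 00 00 00\n"
    "version:2.1\nend:vcard\n"
    "begin:vcard\nn:2;test\ntel;cell:+33 7 00 00 00 00\n"
    "version:2.1\nend:vcard\n"
)
cards = split_vcards(sample)
print(len(cards))
```

Calling write_cards(cards) would then create the "out/" directory with one file per contact. The sketch does not handle a vCard nested inside another one.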
Bonus : Filtering interesting contacts
As a bonus, I only keep contacts that have a phone number.
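One way to express this filter is to look for a non-empty TEL property in the text of a single card. The sketch below assumes Python 3, and has_phone_number is a hypothetical helper name, not a function from the script.

```python
def has_phone_number(card):
    """Return True if the vCard text contains a TEL property with a value."""
    for line in card.splitlines():
        name, sep, value = line.partition(":")
        # The property name comes before the first ";" (parameters follow it),
        # and may be prefixed by a group such as "item1.".
        prop = name.split(";", 1)[0].split(".")[-1].strip().lower()
        if sep and prop == "tel" and value.strip():
            return True
    return False


with_phone = "begin:vcard\nn:1;test\ntel;cell:+33 6 00 00 00 00\nend:vcard\n"
without_phone = "begin:vcard\nn:3;test\nversion:2.1\nend:vcard\n"
print(has_phone_number(with_phone), has_phone_number(without_phone))
```

A card whose TEL line is present but empty is treated as having no phone number.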
Python Script
I wrote a simple Python (2.7) script to address these two issues. It takes one argument, a vCard file, then re-encodes and splits all the contacts in it (only those with a phone number) into the "out/" directory, creating one VCF file per contact.
When done, you will end up with all your vCard files in the "out/" directory. Each vCard file is named after its contact. To send them to your phone, you must send each vCard file separately via Bluetooth and manually accept each contact on the phone. (It took me about 10 minutes for ~200 contacts.)
- For vCard parsing, I use the « vobject » Python module.
- For quoted-printable decoding and re-encoding, I use the « quopri » Python module.
- To hack the encoding of the N field and to remove the FN field, I use two simple regular expressions with the Python « re » module.
The Python script is available on my personal git repository here.
You can clone it:
$ git clone https://dev.beneth.fr/beneth/vcard_legacy.git
The script name is split_vcard.py.
To run it, pass your vCard export file as the first argument, like this:
$ ./split_vcard.py vcard_test_file/vcard_test_file_1.vcf
Do not hesitate to comment and/or improve the script if you use it.