Wednesday, 21 October 2015

How to pass Unicode correctly to Perl's DBI layer?

I have the following simple Perl wrapper for an SQLite database:

#! /usr/bin/perl

use strict;
use warnings;
use DBI;
use Data::Dumper;

my $sql = shift;

my $dbh = DBI->connect(
    "dbi:SQLite:dbname=data.sqlite3",
    "", # no user
    "", # no pw
    {
        RaiseError => 1,
        sqlite_unicode => 1
    },
) || die $DBI::errstr;

my $sth = $dbh->prepare($sql);
$sth->execute();

print Dumper ($sth->fetchall_arrayref({}));

$sth->finish();
$dbh->disconnect();

Although I have set the sqlite_unicode flag, as described in the documentation, queries containing non-ASCII characters return no rows:

$ ./sqlite.pl "select * from person where lastname = 'Schütte'"
$VAR1 = [];

When I replace the 'ü' with a wildcard, the query seems to work, although I am not sure whether the \x{fc} in the output means the Latin-1 byte FC or Unicode U+00FC.

$ ./sqlite.pl "select * from person where lastname like 'Sch%tte'"
$VAR1 = [
          {
            'id' => 8,
            'firstname' => undef,
            'lastname' => "Sch\x{fc}tte"
          }
        ];
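On the \x{fc} in the dump above: in a Perl character string, \x{fc} denotes the code point 252, so Latin-1 FC and U+00FC are the same character at that level. Bytes only enter the picture when the string is encoded. The following is a minimal sketch using only the core Encode module; the variable names are chosen for illustration and are not part of the script above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The value exactly as Data::Dumper printed it.
my $name = "Sch\x{fc}tte";

# \x{fc} is a code point, not a byte sequence.
my $code_point = ord substr($name, 3, 1);
printf "code point: U+%04X\n", $code_point;

# Bytes appear only once the string is encoded for output.
my $utf8   = encode('UTF-8',      $name);
my $latin1 = encode('ISO-8859-1', $name);

# Print each encoded form as a list of hex bytes.
for my $pair (['UTF-8', $utf8], ['Latin-1', $latin1]) {
    my ($label, $bytes) = @$pair;
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $bytes;
    printf "%-8s %s\n", "$label:", $hex;
}
```

The character string has 7 characters; its UTF-8 form has 8 bytes because 'ü' becomes the two bytes C3 BC, while the Latin-1 form keeps it as the single byte FC.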

When I do the same with the SQLite command-line tool, it works fine:

$ sqlite3 data.sqlite3 "select * from person where lastname = 'Schütte'"
8||Schütte

Did I forget to tell the DBI layer something it needs in order to support Unicode characters?

My locale encoding is UTF-8:

$ locale
LANG=de_DE.utf8
LANGUAGE=
LC_CTYPE="de_DE.utf8"
LC_NUMERIC="de_DE.utf8"
LC_TIME="de_DE.utf8"
LC_COLLATE="de_DE.utf8"
LC_MONETARY="de_DE.utf8"
LC_MESSAGES="de_DE.utf8"
LC_PAPER="de_DE.utf8"
LC_NAME="de_DE.utf8"
LC_ADDRESS="de_DE.utf8"
LC_TELEPHONE="de_DE.utf8"
LC_MEASUREMENT="de_DE.utf8"
LC_IDENTIFICATION="de_DE.utf8"
LC_ALL=
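
One likely explanation, offered as an assumption rather than a confirmed diagnosis: Perl does not decode command-line arguments by default, so in a UTF-8 locale the script receives 'ü' as the two raw bytes C3 BC, while a driver configured with sqlite_unicode expects decoded character strings. The sketch below uses only the core Encode module. The helper name decode_arg is hypothetical, and the hard-coded byte string stands in for the value that `shift` would return from @ARGV.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not in the original script): turn the raw bytes of a
# command-line argument into a character string. Assumes a UTF-8 terminal
# and dies on malformed input instead of silently substituting characters.
sub decode_arg {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes, Encode::FB_CROAK());
}

# Stand-in for `shift` on @ARGV: the bytes a UTF-8 shell would pass in.
my $raw = "select * from person where lastname = 'Sch\xC3\xBCtte'";
my $sql = decode_arg($raw);

printf "raw argument:     %d bytes\n",      length $raw;
printf "decoded argument: %d characters\n", length $sql;

# If the undecoded bytes are treated as characters and encoded again,
# 'ü' turns into four bytes, which no longer match the stored value.
my $double = encode('UTF-8', $raw);
printf "double-encoded:   %d bytes\n", length $double;
```

In the wrapper this would amount to replacing `my $sql = shift;` with a decoded version of the argument before calling `$dbh->prepare`. Running the script as `perl -CA sqlite.pl "..."` should have a similar effect, since the A flag of the -C switch tells Perl to treat @ARGV as UTF-8.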
