gleu's blog

New series of articles for GLMF

After the article on the new features of version 9.3, I have started a series of articles on how the planner (also called the optimizer) works.

The first one came out this month, in issue 170 of GNU/Linux Magazine France. It covers the different types of scans the optimizer can plan.

I missed the deadline for the next issue, so the article on joins should appear in issue 172 if all goes well. In fact, I just finished writing it.

There will probably be two more articles, one on the other types of nodes and one on tools that help with understanding execution plans, but they are still to be written.

In the meantime, I welcome any remarks/criticism about my articles :)

Two pieces of news... (GLMF + translation)

It's been quite a while since I last published a post on this blog... scary :)

Anyway, two interesting pieces of news.

I have updated the French manuals following the latest minor releases. It didn't take particularly long to do. Finding the time to do it, on the other hand, wasn't easy. But it's done, the manuals are up to date.

It had also been a long time since I last wrote an article for GNU/Linux Magazine France. I finally got back to it, with an article on the new features of version 9.3, published in GLMF 169. I think there will be more articles, depending on the time I have available and my motivation. For now, I'm trying to write about the query planner, a topic I have been studying for more than a year now. That being said, if you have ideas for topics, I'm interested :)

PostgreSQL manuals as PDF

I finally got back to it. Since I switched from Ubuntu to Fedora, I hadn't been able to generate the PostgreSQL manual in PDF format. This afternoon, unable to settle on anything else, I tackled it.

To get the build working, I grabbed the latest version of docbook-xsl (1.78.1), then edited the /opt/docbook-xsl/profiling/profile-mode.xsl file to comment out line 214. Next, I edited the stylesheets/pg-profile.xsl file in the documentation sources to replace the call to /opt/docbook-xsl/profiling/profile-mode-pdf.xsl with a call to /opt/docbook-xsl/profiling/profile-mode.xsl. Finally, I exported two variables:

export PATH=/home/guillaume/bin/fop-0.20.5:$PATH
export JAVA_HOME=/usr

And the PDF for versions 8.4 and 9.0 built without trouble. Version 9.1 is still giving me a hard time, but nothing too serious. It's simply (very) slow going, because you have to run the build, sometimes several times, to find and understand each problem you hit.

But it will come. It's only a matter of days now...

Just a few hours left to submit your talks to pgconf.eu 2013

Quick blog post to remind you that you still have a few hours to submit your talks to pgconf.eu if you haven't done so yet. Read our call for papers, and submit your proposals to be part of an awesome PostgreSQL event :)

Expose the estimation of number of changed tuples since last analyze

This is the commit message of commit c87ff71f374652936a6089215a30998492b14d52 in the PostgreSQL git repository. This new feature is really interesting. It was written by Mark Kirkwood, reviewed by Laurenz Albe, and committed by Magnus Hagander.

It allows one to see when autovacuum will fire an ANALYZE on a table (autovacuum launches an ANALYZE once the number of changed tuples exceeds autovacuum_analyze_threshold plus autovacuum_analyze_scale_factor times the number of rows in the table). This is extremely useful, at least to me, when I do a training course. Before, you only had n_dead_tup in pg_stat_all_tables to guess when autovacuum would fire a VACUUM; you had nothing to guess when it would fire an ANALYZE. As the commit message says, this information is tracked by PostgreSQL but was not publicly available. This new function makes it available in the next PostgreSQL release. The PostgreSQL project only adds new features to the development branch, so the older releases won't have it, unless you have an extension that does the same thing.

That's what I did today: add the function in an extension, and add the extension on pgxn.org so that anyone can install it.
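
For the curious, the heart of the extension is tiny: the counter already exists in the statistics collector (it's what autovacuum compares to its analyze threshold), so the C function only has to fetch the table's stats entry and return that counter. A minimal sketch of what such a function can look like, not necessarily the exact code of the extension:

#include "postgres.h"
#include "fmgr.h"
#include "pgstat.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(pg_stat_get_mod_since_analyze);

Datum
pg_stat_get_mod_since_analyze(PG_FUNCTION_ARGS)
{
	Oid			relid = PG_GETARG_OID(0);
	PgStat_StatTabEntry *tabentry;

	/* no statistics entry means no recorded activity on this table */
	tabentry = pgstat_fetch_stat_tabentry(relid);
	if (tabentry == NULL)
		PG_RETURN_INT64(0);

	/* this is the counter autovacuum compares to its analyze threshold */
	PG_RETURN_INT64((int64) tabentry->changes_since_analyze);
}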

It's actually quite simple to install. The easiest way is to use pgxnclient:

[guillaume@laptop ~]$ pgxnclient install mods_since_analyze
INFO: best version: mods_since_analyze 1.0.0
INFO: saving /tmp/tmpTHjC3c/mods_since_analyze-1.0.0.zip
INFO: unpacking: /tmp/tmpTHjC3c/mods_since_analyze-1.0.0.zip
INFO: building extension
cp mods_since_analyze.sql mods_since_analyze--1.0.sql
sed 's/DROP /ALTER EXTENSION mods_since_analyze ADD /' uninstall_mods_since_analyze.sql > mods_since_analyze--unpackaged--1.0.sql
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fpic -I. -I. -I/opt/postgresql-9.2/include/server -I/opt/postgresql-9.2/include/internal -D_GNU_SOURCE -I/usr/include/libxml2   -c -o mods_since_analyze.o mods_since_analyze.c
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fpic -L/opt/postgresql-9.2/lib -Wl,--as-needed -Wl,-rpath,'/opt/postgresql-9.2/lib',--enable-new-dtags  -shared -o mods_since_analyze.so mods_since_analyze.o
sed 's/EXTVERSION/1.0/;s/EXTENSION/mods_since_analyze/;s/EXTCOMMENT/Expose the estimation of number of changed tuples since last analyze/' mods_since_analyze.control.in > mods_since_analyze.control
rm mods_since_analyze.o
INFO: installing extension
[sudo] password for guillaume: 
/usr/bin/mkdir -p '/usr/share/pgsql/extension'
/usr/bin/mkdir -p '/usr/share/pgsql/mods_since_analyze'
/usr/bin/mkdir -p '/usr/lib64/pgsql'
/bin/sh /usr/lib64/pgsql/pgxs/src/makefiles/../../config/install-sh -c -m 644 ./mods_since_analyze.control '/usr/share/pgsql/extension/'
/bin/sh /usr/lib64/pgsql/pgxs/src/makefiles/../../config/install-sh -c -m 644 ./mods_since_analyze--unpackaged--1.0.sql ./mods_since_analyze--1.0.sql  '/usr/share/pgsql/mods_since_analyze/'
/bin/sh /usr/lib64/pgsql/pgxs/src/makefiles/../../config/install-sh -c -m 755  mods_since_analyze.so '/usr/lib64/pgsql/'

Done. Easy enough, right? :)

If you don't have pgxnclient, this is still quite easy but you need to do it manually:

  1. Download mods_since_analyze (mods_since_analyze-1.0.0.zip)
  2. Unzip the downloaded file
  3. Compile it
  4. Install it

In other words:

[guillaume@laptop ~]$ wget http://api.pgxn.org/dist/mods_since_analyze/1.0.0/mods_since_analyze-1.0.0.zip
--2013-07-07 18:34:09--  http://api.pgxn.org/dist/mods_since_analyze/1.0.0/mods_since_analyze-1.0.0.zip
Resolving api.pgxn.org (api.pgxn.org)... 88.198.47.126
Connecting to api.pgxn.org (api.pgxn.org)|88.198.47.126|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5686 (5.6K) [application/zip]
Saving to: ‘mods_since_analyze-1.0.0.zip’

100%[========================================================================================>] 5,686       --.-K/s   in 0.02s   

2013-07-07 18:34:10 (271 KB/s) - ‘mods_since_analyze-1.0.0.zip’ saved [5686/5686]

[guillaume@laptop ~]$ unzip mods_since_analyze-1.0.0.zip 
Archive:  mods_since_analyze-1.0.0.zip
cfe2bbb6ea83d02d402024821696aef7de01d7be
   creating: mods_since_analyze-1.0.0/
  inflating: mods_since_analyze-1.0.0/.gitignore  
  inflating: mods_since_analyze-1.0.0/AUTHORS  
  inflating: mods_since_analyze-1.0.0/COPYING  
  inflating: mods_since_analyze-1.0.0/INSTALL  
  inflating: mods_since_analyze-1.0.0/META.json  
  inflating: mods_since_analyze-1.0.0/Makefile  
  inflating: mods_since_analyze-1.0.0/README.md  
 extracting: mods_since_analyze-1.0.0/TODO  
  inflating: mods_since_analyze-1.0.0/mods_since_analyze.c  
  inflating: mods_since_analyze-1.0.0/mods_since_analyze.control.in  
  inflating: mods_since_analyze-1.0.0/mods_since_analyze.sql  
  inflating: mods_since_analyze-1.0.0/uninstall_mods_since_analyze.sql  
[guillaume@laptop ~]$ cd mods_since_analyze-1.0.0/
[guillaume@laptop mods_since_analyze-1.0.0]$ make && make install
cp mods_since_analyze.sql mods_since_analyze--1.0.sql
sed 's/DROP /ALTER EXTENSION mods_since_analyze ADD /' uninstall_mods_since_analyze.sql > mods_since_analyze--unpackaged--1.0.sql
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fpic -I. -I. -I/opt/postgresql-9.2/include/server -I/opt/postgresql-9.2/include/internal -D_GNU_SOURCE -I/usr/include/libxml2   -c -o mods_since_analyze.o mods_since_analyze.c
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fpic -L/opt/postgresql-9.2/lib -Wl,--as-needed -Wl,-rpath,'/opt/postgresql-9.2/lib',--enable-new-dtags  -shared -o mods_since_analyze.so mods_since_analyze.o
sed 's/EXTVERSION/1.0/;s/EXTENSION/mods_since_analyze/;s/EXTCOMMENT/Expose the estimation of number of changed tuples since last analyze/' mods_since_analyze.control.in > mods_since_analyze.control
rm mods_since_analyze.o
/usr/bin/mkdir -p '/opt/postgresql-9.2/share/extension'
/usr/bin/mkdir -p '/opt/postgresql-9.2/share/mods_since_analyze'
/usr/bin/mkdir -p '/opt/postgresql-9.2/lib'
/bin/sh /opt/postgresql-9.2/lib/pgxs/src/makefiles/../../config/install-sh -c -m 644 ./mods_since_analyze.control '/opt/postgresql-9.2/share/extension/'
/bin/sh /opt/postgresql-9.2/lib/pgxs/src/makefiles/../../config/install-sh -c -m 644 ./mods_since_analyze--unpackaged--1.0.sql ./mods_since_analyze--1.0.sql  '/opt/postgresql-9.2/share/mods_since_analyze/'
/bin/sh /opt/postgresql-9.2/lib/pgxs/src/makefiles/../../config/install-sh -c -m 755  mods_since_analyze.so '/opt/postgresql-9.2/lib/'

Then you'll have to activate it in each database where you want to use it. You simply need to execute CREATE EXTENSION:

tests=# CREATE EXTENSION mods_since_analyze;
CREATE EXTENSION

It will add a single function:

tests=# \dx+ mods_since_analyze
  Objects in extension "mods_since_analyze"
             Object Description              
---------------------------------------------
 function pg_stat_get_mod_since_analyze(oid)
(1 row)

This function returns the number of changed tuples since the last ANALYZE (manual or automatic).

tests=# CREATE TABLE t1(id integer);
CREATE TABLE
tests=# SELECT now(), pg_stat_get_last_autoanalyze_time('t1'::regclass),
  pg_stat_get_last_analyze_time('t1'::regclass), pg_stat_get_mod_since_analyze('t1'::regclass);
-[ RECORD 1 ]---------------------+-----------------------------
now                               | 2013-07-07 22:32:41.25594+02
pg_stat_get_last_autoanalyze_time | 
pg_stat_get_last_analyze_time     | 
pg_stat_get_mod_since_analyze     | 0

tests=# INSERT INTO t1 SELECT generate_series(1, 1000000);
INSERT 0 1000000
tests=# SELECT now(), pg_stat_get_last_autoanalyze_time('t1'::regclass),
  pg_stat_get_last_analyze_time('t1'::regclass), pg_stat_get_mod_since_analyze('t1'::regclass);
-[ RECORD 1 ]---------------------+------------------------------
now                               | 2013-07-07 22:33:00.408815+02
pg_stat_get_last_autoanalyze_time | 
pg_stat_get_last_analyze_time     | 
pg_stat_get_mod_since_analyze     | 1000000

tests=# SELECT now(), pg_stat_get_last_autoanalyze_time('t1'::regclass),
  pg_stat_get_last_analyze_time('t1'::regclass), pg_stat_get_mod_since_analyze('t1'::regclass);
-[ RECORD 1 ]---------------------+------------------------------
now                               | 2013-07-07 22:33:04.569035+02
pg_stat_get_last_autoanalyze_time | 
pg_stat_get_last_analyze_time     | 
pg_stat_get_mod_since_analyze     | 1000000

tests=# ANALYZE;
ANALYZE
tests=# SELECT now(), pg_stat_get_last_autoanalyze_time('t1'::regclass),
  pg_stat_get_last_analyze_time('t1'::regclass), pg_stat_get_mod_since_analyze('t1'::regclass);
-[ RECORD 1 ]---------------------+------------------------------
now                               | 2013-07-07 22:33:09.00094+02
pg_stat_get_last_autoanalyze_time | 
pg_stat_get_last_analyze_time     | 2013-07-07 22:33:07.096124+02
pg_stat_get_mod_since_analyze     | 0

tests=# INSERT INTO t1 SELECT generate_series(1, 1000000);
INSERT 0 1000000
tests=# DELETE FROM t1 WHERE id<100000;
DELETE 199998
tests=# SELECT now(), pg_stat_get_last_autoanalyze_time('t1'::regclass),
  pg_stat_get_last_analyze_time('t1'::regclass), pg_stat_get_mod_since_analyze('t1'::regclass);
-[ RECORD 1 ]---------------------+------------------------------
now                               | 2013-07-07 22:33:32.42597+02
pg_stat_get_last_autoanalyze_time | 
pg_stat_get_last_analyze_time     | 2013-07-07 22:33:07.096124+02
pg_stat_get_mod_since_analyze     | 1199998

tests=# SELECT now(), pg_stat_get_last_autoanalyze_time('t1'::regclass),
  pg_stat_get_last_analyze_time('t1'::regclass), pg_stat_get_mod_since_analyze('t1'::regclass);
-[ RECORD 1 ]---------------------+------------------------------
now                               | 2013-07-07 22:34:52.344178+02
pg_stat_get_last_autoanalyze_time | 2013-07-07 22:34:15.050458+02
pg_stat_get_last_analyze_time     | 2013-07-07 22:33:07.096124+02
pg_stat_get_mod_since_analyze     | 0

It works from the 9.0 release to the 9.3 release.

Hope you'll enjoy it.

Plus que 20 jours pour proposer une conférence à pgconf.eu 2013

L'appel à conférencier est disponible depuis un moment. Du coup, il ne reste plus que 20 jours pour proposer une conférence à pgconf.eu 2013. Cela étant dit, ne désespérez pas, il vous reste un peu de temps :)

Ne pensez pas que vous n'avez rien d'intéressant à raconter. Votre expérience, vos compétences, vos outils sont autant de sujets passionnants de conférences.

Bref, n'hésitez pas à proposer une conférence. Et de toute façon, venez participer à cet événement. Les conférences, les tutoriels, les rencontres avec les contributeurs sont des raisons amplement suffisantes pour venir à Dublin.

Only 20 days left to send your talk proposal for PostgreSQL Conference Europe 2013

If you haven't already sent a proposal, you should seriously consider it. PostgreSQL Conference Europe 2013 will take place in Dublin, Ireland, on October 29-November 1, and we accept talks on a wide range of topics related to PostgreSQL:

  • Developing applications for PostgreSQL
  • Administering large scale PostgreSQL installations
  • Case studies and/or success stories of PostgreSQL deployments
  • PostgreSQL tools and utilities
  • PostgreSQL hacking
  • Community & user groups
  • Tuning the server
  • Migrating from other systems
  • Scaling/replication
  • Benchmarking & hardware
  • PostgreSQL related products

To be honest, I would love to see more case studies, as we usually don't get many of them, and they are really interesting to many people.

Well, there are only 20 days left to send your proposal. Read our call for papers, think about something you want to talk about, and submit your proposals (yes, you can submit more than one :) ).

By the way, we are also still looking for sponsors - please see our website for details about the sponsor benefits and the costs.

Adding read capabilities to a Foreign Data Wrapper

In the first article, we talked about the minimum required functions to be able to add a foreign data wrapper to a PostgreSQL database. Alas, we couldn't do much with it, as it didn't have any other capabilities. And one of the most interesting capabilities of a FDW is to access remote data. In other words, to read it.

So, in this blog post, we'll see how we can add reading capabilities to our simple FDW.

In the previous blog post, we said that there are four hook functions to implement to get this capability:

  • BeginForeignScan, to begin executing a foreign scan
  • IterateForeignScan, to fetch one row from the foreign source
  • ReScanForeignScan, to restart the scan from the beginning
  • EndForeignScan, to end the scan and release resources

Actually, we also need some functions used by the planner:

  • PlanForeignScan (before 9.2)
  • GetForeignRelSize (9.2 and later), to obtain relation size estimates for a foreign table
  • GetForeignPaths (9.2 and later), to create possible access paths for a scan on a foreign table
  • GetForeignPlan (9.2 and later), to create a ForeignScan plan node from the selected foreign access path

We first need to declare the functions:

/*
 * Callback functions
 */

/* Planner functions */
#if (PG_VERSION_NUM >= 90200)
static void simpleGetForeignRelSize(PlannerInfo *root,
    RelOptInfo *baserel,
    Oid foreigntableid);
static void simpleGetForeignPaths(PlannerInfo *root,
    RelOptInfo *baserel,
    Oid foreigntableid);
static ForeignScan *simpleGetForeignPlan(PlannerInfo *root,
    RelOptInfo *baserel,
    Oid foreigntableid,
    ForeignPath *best_path,
    List *tlist,
    List *scan_clauses);
#else
static FdwPlan *simplePlanForeignScan(Oid foreigntableid, PlannerInfo *root, RelOptInfo *baserel);
#endif

/* Executor reading functions */
static void simpleBeginForeignScan(ForeignScanState *node, int eflags);
static TupleTableSlot *simpleIterateForeignScan(ForeignScanState *node);
static void simpleReScanForeignScan(ForeignScanState *node);
static void simpleEndForeignScan(ForeignScanState *node);

And we need to set the hooks in the handler function, which looks like this now:

Datum
simple_fdw_handler(PG_FUNCTION_ARGS)
{
	FdwRoutine *fdwroutine = makeNode(FdwRoutine);

	elog(DEBUG1,"entering function %s",__func__);

	/* assign the handlers for the FDW */
#if (PG_VERSION_NUM >= 90200)
	fdwroutine->GetForeignRelSize = simpleGetForeignRelSize;
	fdwroutine->GetForeignPaths = simpleGetForeignPaths;
	fdwroutine->GetForeignPlan = simpleGetForeignPlan;
#else
	fdwroutine->PlanForeignScan = simplePlanForeignScan;
#endif
	fdwroutine->BeginForeignScan = simpleBeginForeignScan;
	fdwroutine->IterateForeignScan = simpleIterateForeignScan;
	fdwroutine->ReScanForeignScan = simpleReScanForeignScan;
	fdwroutine->EndForeignScan = simpleEndForeignScan;

	PG_RETURN_POINTER(fdwroutine);
}

The new code is the eight lines starting with fdwroutine->, and the #if/#else block around the planner ones. They set each of the hook functions we will need to read our remote data.

For now, let's do something really simple so that we can compile it, run it, and understand how it works. We'll start with this code:

static void
simpleGetForeignRelSize(PlannerInfo *root,
    RelOptInfo *baserel,
    Oid foreigntableid)
{
	SimpleFdwPlanState *fdw_private;

	elog(DEBUG1,"entering function %s",__func__);

	baserel->rows = 0;

	fdw_private = palloc0(sizeof(SimpleFdwPlanState));
	baserel->fdw_private = (void *) fdw_private;
}

static void
simpleGetForeignPaths(PlannerInfo *root,
    RelOptInfo *baserel,
    Oid foreigntableid)
{
	Cost		startup_cost,
				total_cost;

	elog(DEBUG1,"entering function %s",__func__);

	startup_cost = 0;
	total_cost = startup_cost + baserel->rows;

	add_path(baserel, (Path *)
		create_foreignscan_path(root, baserel,
			 baserel->rows,
			 startup_cost,
			 total_cost,
			 NIL,
			 NULL,
			 NIL));
}

static ForeignScan *
simpleGetForeignPlan(PlannerInfo *root,
    RelOptInfo *baserel,
    Oid foreigntableid,
    ForeignPath *best_path,
    List *tlist,
    List *scan_clauses)
{
	Index		scan_relid = baserel->relid;

	elog(DEBUG1,"entering function %s",__func__);

	scan_clauses = extract_actual_clauses(scan_clauses, false);

	return make_foreignscan(tlist,
		scan_clauses,
		scan_relid,
		NIL,
		NIL);
}

static void
simpleBeginForeignScan(ForeignScanState *node,
						  int eflags)
{
	elog(DEBUG1,"entering function %s",__func__);
}


static TupleTableSlot *
simpleIterateForeignScan(ForeignScanState *node)
{
	TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;

	elog(DEBUG1,"entering function %s",__func__);

	ExecClearTuple(slot);

	return slot;
}


static void
simpleReScanForeignScan(ForeignScanState *node)
{
	elog(DEBUG1,"entering function %s",__func__);
}


static void
simpleEndForeignScan(ForeignScanState *node)
{
	elog(DEBUG1,"entering function %s",__func__);
}

That's quite a lot of code. Simply put, all the void executor functions just call elog so that we know when each function is called. The only executor function that returns something is the one returning a tuple, and for now we simply return an empty tuple. The planner functions do as little as they can while still allowing a SELECT statement to run. We'll explain what they do later.
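
One thing the listing above doesn't show is the declaration of the SimpleFdwPlanState structure used by simpleGetForeignRelSize. It's a private, planner-side structure that each FDW is free to define as it sees fit. A minimal sketch (the fields are placeholders of my choosing; they aren't used yet):

/*
 * FDW-private planner information, stored in RelOptInfo.fdw_private.
 * It doesn't need to carry anything yet, but it will typically hold
 * options and estimates later on.
 */
typedef struct SimpleFdwPlanState
{
	char	   *database;		/* value of the database option */
	char	   *table;		/* value of the table option */
} SimpleFdwPlanState;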

Let's compile that code, and try it.

$ make
cp sql/simple_fdw.sql sql/simple_fdw--0.0.1.sql
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fpic -I. -I. -I/opt/postgresql-9.2/include/server -I/opt/postgresql-9.2/include/internal -D_GNU_SOURCE -I/usr/include/libxml2   -c -o src/simple_fdw.o src/simple_fdw.c
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fpic -shared -o simple_fdw.so src/simple_fdw.o -L/opt/postgresql-9.2/lib -Wl,--as-needed -Wl,-rpath,'/opt/postgresql-9.2/lib',--enable-new-dtags  
$ make install
/usr/bin/mkdir -p '/opt/postgresql-9.2/lib'
/usr/bin/mkdir -p '/opt/postgresql-9.2/share/extension'
/usr/bin/mkdir -p '/opt/postgresql-9.2/share/extension'
/bin/sh /opt/postgresql-9.2/lib/pgxs/src/makefiles/../../config/install-sh -c -m 755  simple_fdw.so '/opt/postgresql-9.2/lib/simple_fdw.so'
/bin/sh /opt/postgresql-9.2/lib/pgxs/src/makefiles/../../config/install-sh -c -m 644 ./simple_fdw.control '/opt/postgresql-9.2/share/extension/'
/bin/sh /opt/postgresql-9.2/lib/pgxs/src/makefiles/../../config/install-sh -c -m 644  sql/simple_fdw--0.0.1.sql '/opt/postgresql-9.2/share/extension/'

OK, everything is in place. Let's create a database, and add our objects in it:

$ createdb my_simple_test
$ psql my_simple_test 
Expanded display is used automatically.
psql (9.2.4)
Type "help" for help.

my_simple_test=# CREATE EXTENSION simple_fdw;
CREATE EXTENSION
my_simple_test=# CREATE SERVER simple_server FOREIGN DATA WRAPPER simple_fdw;
CREATE SERVER
my_simple_test=# CREATE FOREIGN TABLE simple_table() SERVER simple_server;
CREATE FOREIGN TABLE

Now, let's read our new foreign table, but first we need to make sure we get DEBUG1 log messages:

my_simple_test=# SET client_min_messages TO debug1;
LOG:  duration: 0.188 ms  statement: set client_min_messages to debug1;
SET
my_simple_test=# SELECT * FROM simple_table ;
DEBUG:  entering function simple_fdw_handler
DEBUG:  entering function simpleGetForeignRelSize
DEBUG:  entering function simpleGetForeignPaths
DEBUG:  entering function simpleGetForeignPlan
DEBUG:  entering function simple_fdw_handler
DEBUG:  entering function simpleBeginForeignScan
DEBUG:  entering function simpleIterateForeignScan
DEBUG:  entering function simpleEndForeignScan
LOG:  duration: 0.438 ms  statement: select * from simple_table ;
(No rows)

So, what happened? The SELECT statement is executed in two parts: the planner part and the executor part. The planner called our handler function to make sure the hook functions were set. Then it called each of the functions it needed: simpleGetForeignRelSize, simpleGetForeignPaths, and simpleGetForeignPlan, as I'm on a PostgreSQL 9.2 release. Then the executor called the handler function once again. And then it started reading the table: it called simpleBeginForeignScan to prepare the reading, simpleIterateForeignScan as many times as needed until it got an empty tuple (so only once in our case), and then simpleEndForeignScan to clean up everything that needed cleaning. We got no rows because our code is that simple.

Now, we need to make it actually read something. We'll use an sqlite database as our example, but the idea is the same for every FDW you can find.

To read a table in an sqlite database, you need to open a database connection, prepare a query, and then loop on all the tuples returned by your query. At the end, you have to close the database connection. Here is some simple code that does it standalone (I mean, outside of a FDW):

  sqlite3 *db;
  sqlite3_stmt *pStmt;
  const char *pzTail;
  int rc;

  rc = sqlite3_open("/path/to/database_file.db", &db);
  rc = sqlite3_prepare(db, "SELECT * FROM t1", -1, &pStmt, &pzTail);

  do {
        rc = sqlite3_step(pStmt);
        if (rc == SQLITE_ROW) {
            printf("value col 1 %d\n", sqlite3_column_int(pStmt, 0));
            printf("value col 2 %s\n", sqlite3_column_text(pStmt, 1));
        }
    } while (rc == SQLITE_ROW);

  /* release the prepared statement before closing the connection */
  sqlite3_finalize(pStmt);
  sqlite3_close(db);

Of course, this is really simple code, without error checking and the like. Anyway, it's enough to work on our simple FDW. Remember that we have at least three functions to code:

  • simpleBeginForeignScan will open the database connection (sqlite3_open)
  • simpleIterateForeignScan will prepare the query (sqlite3_prepare) and fetch the first tuple (sqlite3_step) if it didn't do that already, otherwise it will only fetch the next tuple (sqlite3_step)
  • and simpleEndForeignScan will close the database connection (sqlite3_close)

The question is: how can we give the connection handle and the statement handle created in the first and second functions to the second and third functions? And the answer is: with a private Foreign Data Wrapper structure that will be stored in the ForeignScanState structure.

So, onto the code. The first thing we need to do is add the sqlite header file:

#include <sqlite3.h>

We also need to declare our private data structure:

/*
 * FDW-specific information for ForeignScanState.fdw_state.
 */
typedef struct simpleFdwExecutionState
{
	sqlite3       *conn;
	sqlite3_stmt  *result;
	char          *query;
} SimpleFdwExecutionState;

This structure will contain the connection handle, the statement handle, and the query string.

Now, we will write simpleBeginForeignScan. It will open the connection and prepare the query. Our foreign server accepts a database option and our foreign table a table option, so we'll take advantage of this to name the table we want to read. We don't yet have any code to fetch those options back, so we'll write a function to get them. Here is its declaration:

static void simpleGetOptions(Oid foreigntableid, char **database, char **table);

And here is its code:

/*
 * Fetch the options for a simple_fdw foreign table.
 */
static void
simpleGetOptions(Oid foreigntableid, char **database, char **table)
{
	ForeignTable   *f_table;
	ForeignServer  *f_server;
	List           *options;
	ListCell       *lc;

	/*
	 * Extract options from FDW objects.
	 */
	f_table = GetForeignTable(foreigntableid);
	f_server = GetForeignServer(f_table->serverid);

	options = NIL;
	options = list_concat(options, f_table->options);
	options = list_concat(options, f_server->options);

	/* Loop through the options */
	foreach(lc, options)
	{
		DefElem *def = (DefElem *) lfirst(lc);

		if (strcmp(def->defname, "database") == 0)
			*database = defGetString(def);

		if (strcmp(def->defname, "table") == 0)
			*table = defGetString(def);
	}

	/* Check we have the options we need to proceed */
	if (!*database || !*table)
		ereport(ERROR,
			(errcode(ERRCODE_SYNTAX_ERROR),
			errmsg("a database and a table must be specified")
			));
}

The code is easy to read. We first get a handle on the foreign table and on the foreign server. We put all their options into the options variable and loop through it, setting a variable for each of the two options we care about. It errors out if we don't end up with both options we need (i.e., the database name and the table name).

Now that we can get the two options through this function, we can write the code of the simpleBeginForeignScan hook function:

static void
simpleBeginForeignScan(ForeignScanState *node,
						  int eflags)
{
	sqlite3                  *db;
	SimpleFdwExecutionState  *festate;
	char                     *svr_database = NULL;
	char                     *svr_table = NULL;
	char                     *query;
    size_t                   len;

	elog(DEBUG1,"entering function %s",__func__);

	/* Fetch options  */
	simpleGetOptions(RelationGetRelid(node->ss.ss_currentRelation), &svr_database, &svr_table);

	/* Connect to the server */
	if (sqlite3_open(svr_database, &db)) {
		/* ereport(ERROR) does not return, so save the message and close the handle first */
		char *err = pstrdup(sqlite3_errmsg(db));

		sqlite3_close(db);
		ereport(ERROR,
			(errcode(ERRCODE_FDW_OUT_OF_MEMORY),
			errmsg("Can't open sqlite database %s: %s", svr_database, err)
			));
	}

	/* Build the query */
    len = strlen(svr_table) + 15;
    query = (char *)palloc(len);
    snprintf(query, len, "SELECT * FROM %s", svr_table);

	/* Stash away the state info we have already */
	festate = (SimpleFdwExecutionState *) palloc(sizeof(SimpleFdwExecutionState));
	node->fdw_state = (void *) festate;
	festate->conn = db;
	festate->result = NULL;
	festate->query = query;
}

Once again, it's quite easy to read. We first grab the database and table names. Then we open the database with the sqlite3_open function. If that fails, we save the error message, close the handle, and use ereport to report the error (ereport at the ERROR level does not return). If everything is fine, we build the query string, which will simply be "SELECT * FROM" followed by our table name. Finally, we allocate some memory for our SimpleFdwExecutionState structure, and we push the connection handle and the query string into it.

Now that the FDW can open a connection, we need to be able to execute a query, and fetch each resulting row. This is the aim of the simpleIterateForeignScan function. It will prepare the query if it hasn't done so, and then it will fetch the next tuple. Here is its code:

static TupleTableSlot *
simpleIterateForeignScan(ForeignScanState *node)
{
	char        **values;
	HeapTuple   tuple;
	int         x;
	const char  *pzTail;
	int         rc;

	SimpleFdwExecutionState *festate = (SimpleFdwExecutionState *) node->fdw_state;
	TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;

	elog(DEBUG1,"entering function %s",__func__);

	/* Execute the query, if required */
	if (!festate->result)
	{
		rc = sqlite3_prepare(festate->conn, festate->query, -1, &festate->result, &pzTail);
		if (rc != SQLITE_OK) {
			/* ereport(ERROR) does not return, so save the message and close the connection first */
			char *err = pstrdup(sqlite3_errmsg(festate->conn));

			sqlite3_close(festate->conn);
			festate->conn = NULL;
			ereport(ERROR,
				(errcode(ERRCODE_FDW_UNABLE_TO_CREATE_EXECUTION),
				errmsg("SQL error during prepare: %s", err)
				));
		}
	}

	ExecClearTuple(slot);

	/* get the next record, if any, and fill in the slot */
	if (sqlite3_step(festate->result) == SQLITE_ROW)
	{
		/* Build the tuple */
		values = (char **) palloc(sizeof(char *) * sqlite3_column_count(festate->result));

		for (x = 0; x < sqlite3_column_count(festate->result); x++)
			values[x] = (char *) sqlite3_column_text(festate->result, x);

		tuple = BuildTupleFromCStrings(
			TupleDescGetAttInMetadata(node->ss.ss_currentRelation->rd_att),
			values);
		ExecStoreTuple(tuple, slot, InvalidBuffer, false);
	}

	/* then return the slot */
	return slot;
}

sqlite3_column_count gives us the number of columns in the resulting tuple, and we use it in a for loop to get each of the values. BuildTupleFromCStrings builds a HeapTuple from all the gathered values, and ExecStoreTuple stores the HeapTuple in the slot. Note that calling TupleDescGetAttInMetadata for every row is wasteful; a real FDW would typically build it once and reuse it, as shown in the sketch below.
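
A possible shape for that small optimization, assuming we add an attinmeta field (my addition, not part of the code above) to SimpleFdwExecutionState:

/* in SimpleFdwExecutionState, add a cached conversion helper:
 *     AttInMetadata  *attinmeta;
 * build it once, in simpleBeginForeignScan:
 */
festate->attinmeta =
	TupleDescGetAttInMetadata(node->ss.ss_currentRelation->rd_att);

/* and in simpleIterateForeignScan, build each tuple with: */
tuple = BuildTupleFromCStrings(festate->attinmeta, values);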

Finally, we need the ending function. It cleans up everything we allocated to scan the remote table. So it's pretty simple: we have to free the query string, finalize the statement handle, and close the database handle:

static void
simpleEndForeignScan(ForeignScanState *node)
{
	SimpleFdwExecutionState *festate = (SimpleFdwExecutionState *) node->fdw_state;

	elog(DEBUG1,"entering function %s",__func__);

	if (festate->result)
	{
		sqlite3_finalize(festate->result);
		festate->result = NULL;
	}

	if (festate->conn)
	{
		sqlite3_close(festate->conn);
		festate->conn = NULL;
	}

	if (festate->query)
	{
		pfree(festate->query);
		festate->query = NULL;
	}

}

And we're done. Let's test it right away. I'll use the Chinook sample database (file ChinookDatabase1.4_CompleteVersion.zip). So here is how we can use the simple FDW:

my_simple_test=# DROP EXTENSION simple_fdw cascade;
NOTICE:  drop cascades to 2 other objects
DETAIL:  drop cascades to server simple_server
drop cascades to foreign table simple_table
DROP EXTENSION
my_simple_test=# CREATE EXTENSION simple_fdw;
CREATE EXTENSION
my_simple_test=# CREATE SERVER simple_server
    FOREIGN DATA WRAPPER simple_fdw
    OPTIONS (database '/home/guillaume/FDW/example/chinook/Chinook_Sqlite.sqlite');
CREATE SERVER
my_simple_test=# CREATE FOREIGN TABLE genre (g_id integer, g_name text)
    SERVER simple_server
    OPTIONS (table 'Genre');
CREATE FOREIGN TABLE
my_simple_test=# SELECT * FROM genre;
 g_id |       g_name       
------+--------------------
    1 | Rock
    2 | Jazz
    3 | Metal
    4 | Alternative & Punk
    5 | Rock And Roll
    6 | Blues
    7 | Latin
    8 | Reggae
    9 | Pop
   10 | Soundtrack
   11 | Bossa Nova
   12 | Easy Listening
   13 | Heavy Metal
   14 | R&B/Soul
   15 | Electronica/Dance
   16 | World
   17 | Hip Hop/Rap
   18 | Science Fiction
   19 | TV Shows
   20 | Sci Fi & Fantasy
   21 | Drama
   22 | Comedy
   23 | Alternative
   24 | Classical
   25 | Opera
(25 rows)

That worked perfectly!

So, we had to create quite a lot of functions to be able to read some remote data: some planner hook functions, some executor reading hook functions, and some helper functions (actually one, to get the list of options). That wasn't hard to do, but there are plenty of things to know.

To be honest, we didn't put much thought into the planner functions. Even though we got 25 rows, the planner still thinks we'll get 0 rows and that reading the remote data will cost nothing. And we didn't use the simpleReScanForeignScan function.
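
As a teaser, improving the row estimate can start with something as small as replacing the hardcoded 0 in simpleGetForeignRelSize with a less absurd guess. A real FDW would derive the number from the remote side, for instance with a count query; the constant below is purely arbitrary:

static void
simpleGetForeignRelSize(PlannerInfo *root,
    RelOptInfo *baserel,
    Oid foreigntableid)
{
	SimpleFdwPlanState *fdw_private;

	elog(DEBUG1,"entering function %s",__func__);

	/* arbitrary guess; a real FDW would ask the remote table instead */
	baserel->rows = 1000;

	fdw_private = palloc0(sizeof(SimpleFdwPlanState));
	baserel->fdw_private = (void *) fdw_private;
}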

But this is enough for today. We'll go deeper into the planner hook functions in another blog post.

By the way, the simple FDW is available on this github repository: Writing A FDW. This repository will contain all the code written for this series of blog posts.

The handler and the validator functions of a FDW

I'm still working on my foreign data wrapper. Actually, I haven't written a lot of code. Instead, I've been reading the code of the other foreign data wrappers; it helped me a lot to understand many things, and to see how far each of them went.

I'll try to start a few blog posts about coding a foreign data wrapper.

This first one is about the two functions needed to declare a foreign data wrapper. Let's take a look at the CREATE FOREIGN DATA WRAPPER DDL statement:

CREATE FOREIGN DATA WRAPPER name
    [ HANDLER handler_function | NO HANDLER ]
    [ VALIDATOR validator_function | NO VALIDATOR ]
    [ OPTIONS ( option 'value' [, ... ] ) ]

The handler function has been available since 9.1. The other options have always been available (at least since the beginning of the SQL/MED infrastructure in PostgreSQL, in the 8.4 release).

The handler function is the function you use to declare all the foreign data wrapper hooks you'll need. Here is the complete list, with the first version in which each was available:

  • plan functions (used by the planner)
    • GetForeignRelSize (>=9.2), obtain relation size estimates for a foreign table, called at the beginning of a query planning
    • GetForeignPaths (>=9.2), create possible access paths for a scan on a foreign table, called during query planning
    • GetForeignPlan (>=9.2), create a ForeignScan plan node from the selected foreign access path, called at the end of a query planning
    • PlanForeignScan (<9.2)
  • read functions (used by the executor)
    • BeginForeignScan, begin executing a foreign scan
    • IterateForeignScan, fetch one row from the foreign source
    • ReScanForeignScan, restart the scan from the beginning
    • EndForeignScan, end the scan and release resources
  • write functions (>=9.3, used by the executor)
    • AddForeignUpdateTargets, used to add extra hidden target columns to the list of columns that are to be retrieved from the foreign table during an UPDATE or DELETE
    • PlanForeignModify, perform any additional planning actions needed for an insert, update, or delete on a foreign table
    • BeginForeignModify, begin executing a foreign table modification operation
    • ExecForeignInsert, insert one tuple into the foreign table
    • ExecForeignUpdate, update one tuple in the foreign table
    • ExecForeignDelete, delete one tuple from the foreign table
    • EndForeignModify, end the table update and release resources
  • explain functions (used during the execution of an EXPLAIN)
    • ExplainForeignScan, print additional EXPLAIN output for a foreign table scan
    • ExplainForeignModify (>=9.3), print additional EXPLAIN output for a foreign table update
  • analyze function (>=9.2, used during the execution of an ANALYZE)
    • AnalyzeForeignTable, collect statistics for this foreign table when ANALYZE is executed on it

None are really required. Of course, if you implement none of them, the foreign data wrapper won't be really useful :)

All these hooks are to be set inside a structure of type FdwRoutine. It's best to use the makeNode() function, which sets all fields to NULL. Then you can set each of the hooks you really need and ignore the rest.

At the end, the function hands the FdwRoutine structure back to its caller.

So the code looks something like this:

Datum
my_fdw_handler(PG_FUNCTION_ARGS)
{
  FdwRoutine *fdwroutine = makeNode(FdwRoutine);

  // plan functions
#if (PG_VERSION_NUM >= 90200)
  fdwroutine->GetForeignRelSize = myGetForeignRelSize;
  fdwroutine->GetForeignPaths = myGetForeignPaths;
  fdwroutine->GetForeignPlan = myGetForeignPlan;
#else
  fdwroutine->PlanForeignScan = myPlanForeignScan;
#endif

  // read functions
  fdwroutine->BeginForeignScan = myBeginForeignScan;
  fdwroutine->IterateForeignScan = myIterateForeignScan;
  fdwroutine->ReScanForeignScan = myReScanForeignScan;
  fdwroutine->EndForeignScan = myEndForeignScan;

#if (PG_VERSION_NUM >= 90300)
  // write functions
  fdwroutine->AddForeignUpdateTargets = myAddForeignUpdateTargets;
  fdwroutine->PlanForeignModify = myPlanForeignModify;
  fdwroutine->BeginForeignModify = myBeginForeignModify;
  fdwroutine->ExecForeignInsert = myExecForeignInsert;
  fdwroutine->ExecForeignUpdate = myExecForeignUpdate;
  fdwroutine->ExecForeignDelete = myExecForeignDelete;
  fdwroutine->EndForeignModify = myEndForeignModify;
#endif

  // explain functions
  fdwroutine->ExplainForeignScan = myExplainForeignScan;
#if (PG_VERSION_NUM >= 90300)
  fdwroutine->ExplainForeignModify = myExplainForeignModify;
#endif

#if (PG_VERSION_NUM >= 90200)
  // analyze functions
  fdwroutine->AnalyzeForeignTable = myAnalyzeForeignTable;
#endif

  PG_RETURN_POINTER(fdwroutine);
}

Of course, all the my* functions must be previously declared. Rather than the actual code, the more interesting thing to know about a foreign data wrapper is which hooks it supports. It really depends a lot on the FDW. For example, the Oracle FDW doesn't support the write functions yet. The MySQL FDW supports neither the write, nor the explain, nor the analyze functions. Etc.

The validator function checks the options given to a specific foreign object (foreign data wrapper, foreign server, user mapping, and foreign table). It gets a list of all the options, and the context of the options (i.e., the foreign object type). The function has to go through each of them, check that it is valid, and error out if anything is wrong (please send back the ERRCODE_FDW_INVALID_OPTION_NAME error when an option name isn't valid).

A good way of doing it is to first add a FDW option structure, this way:

struct myFdwOption
{
  const char  *optname;
  Oid   optcontext;
};

The optname will obviously be the option name. The optcontext will contain the kind of object this option is for. There are four possibilities: ForeignDataWrapperRelationId, ForeignServerRelationId, UserMappingRelationId, and ForeignTableRelationId.

The options could be declared like this:

static struct myFdwOption valid_options[] =
{
  { "database",  ForeignServerRelationId },
  { "table",     ForeignTableRelationId },
  { NULL,     InvalidOid }
};

So, simple code for the validator function might be:

Datum
my_fdw_validator(PG_FUNCTION_ARGS)
{
  List      *options_list = untransformRelOptions(PG_GETARG_DATUM(0));
  Oid       catalog = PG_GETARG_OID(1);
  ListCell  *cell;
  char      *my_database = NULL;
  char      *my_table = NULL;

  foreach(cell, options_list)
  {
    DefElem    *def = (DefElem *) lfirst(cell);

    if (!myIsValidOption(def->defname, catalog))
    {
      struct myFdwOption *opt;
      StringInfoData buf;

      /*
       * Unknown option specified, complain about it. Provide a hint
       * with list of valid options for the object.
       */
      initStringInfo(&buf);
      for (opt = valid_options; opt->optname; opt++)
      {
        if (catalog == opt->optcontext)
          appendStringInfo(&buf, "%s%s", (buf.len > 0) ? ", " : "",
               opt->optname);
      }

      ereport(ERROR, 
        (errcode(ERRCODE_FDW_INVALID_OPTION_NAME), 
        errmsg("invalid option \"%s\"", def->defname), 
        errhint("Valid options in this context are: %s", buf.len ? buf.data : "<none>")
        ));
    }

    /*
     * Here is the code for the valid options
     */

    if (strcmp(def->defname, "database") == 0)
    {
      if (my_database)
        ereport(ERROR,
          (errcode(ERRCODE_SYNTAX_ERROR),
          errmsg("redundant options: database (%s)", defGetString(def))
          ));

      my_database = defGetString(def);
    }
    else if (strcmp(def->defname, "table") == 0)
    {
      if (my_table)
        ereport(ERROR,
          (errcode(ERRCODE_SYNTAX_ERROR),
          errmsg("redundant options: table (%s)", defGetString(def))
          ));

      my_table = defGetString(def);
    }
  }

  PG_RETURN_VOID();
}

The myIsValidOption() function goes through the valid options and checks that the given option matches the right context:

static bool
myIsValidOption(const char *option, Oid context)
{
  struct myFdwOption *opt;

  for (opt = valid_options; opt->optname; opt++)
  {
    if (context == opt->optcontext && strcmp(opt->optname, option) == 0)
      return true;
  }
  return false;
}

After compilation and installation of such a FDW, we can try creating foreign objects:

yeah=# CREATE EXTENSION my_fdw ;
CREATE EXTENSION
-- we can add a server without options
yeah=# CREATE SERVER myserver FOREIGN DATA WRAPPER my_fdw ;
CREATE SERVER
-- we cannot add a server with unknown options
yeah=# CREATE SERVER myserver3 FOREIGN DATA WRAPPER my_fdw OPTIONS (inconnu '10') ;
ERROR:  invalid option "inconnu"
HINT:  Valid options in this context are: database
-- but we can add one with a valid option
yeah=# CREATE SERVER myserver2 FOREIGN DATA WRAPPER my_fdw OPTIONS (database 'this one') ;
CREATE SERVER
-- we can also create foreign table without options
yeah=# CREATE FOREIGN TABLE ft(id integer) SERVER myserver2;
CREATE FOREIGN TABLE
-- we cannot add a foreign table with unknown options
yeah=# CREATE FOREIGN TABLE ft2(id integer) SERVER myserver2 OPTIONS (inconnu '10');
ERROR:  invalid option "inconnu"
HINT:  Valid options in this context are: table
-- we can add a foreign table with valid options
yeah=# CREATE FOREIGN TABLE ft2(id integer) SERVER myserver2 OPTIONS (table '10');
CREATE FOREIGN TABLE

So the validator function seems to work. The handler function, though, doesn't do anything yet. If you try to read the foreign table, it will... err... let's simply say that we still need something to make it work:

yeah=# select * from ft2;
The connection to the server was lost. Attempting reset: Failed.
!> 

We'll see how it can send back some rows next time.

A short report on pgday.fr 2013

The PostgreSQL.fr association organized a pgday in Nantes this year. It took place on June 13th and gathered more than a hundred attendees. As far as I know, it's the biggest one-day PostgreSQL event held in France so far. Congratulations to the organizers. Not to mention that the organization was flawless: simple registration, a good caterer (though not as great as Ivanne et Sidonie, the caterers of the PostgreSQL Sessions and of pgday.fr 2009), a room big enough but not too big, a nice evening out... really good.

The talks were very good too. Cédric Villemain talked about memory resource management, Damien Clochard about the new features of version 9.3, Philippe Beaudoin about load simulation, Hugo Mercier about PostGIS 2.0, Grégoire Hubert about POMM, Gilles Darold about ora2pg, Dimitri Fontaine about handling large data volumes with PostgreSQL, myself about execution plans and the EXPLAIN command, and Vik Fearing about the LATERAL clause. A very full day, packed with information. Not to mention the discussions between talks, which are often a special moment to learn, discover, and get enthusiastic.

For me, the best talk was Philippe Beaudoin's. He has a very pleasant and rather unique way of presenting a case study. He showed how he tried (and managed) to simulate load at the database level to make sure his project was viable. Without going into every detail, he still described the tools he used, a few queries, and his results. It wasn't purely technical, and not just a case study either... so everybody could find something to take away from his talk. Very impressive. In any case, it gave me ideas for new activity reports in pgbadger.

Another interesting talk was Grégoire Hubert's on POMM. Actually, it was rather the speaker who was interesting. He really knows how to get his message across. It feels very much like a showman's act, but it's very effective, and it kept the attendees awake during the particularly difficult slot right after lunch :)

As for my own talk, it went rather well. I had to go fast, as there was a lot of content for only 50 minutes. I hope I didn't lose anyone. The feedback has been good overall, which is a pleasure. The slides are available on the pgday.fr website as well as on the dalibo.org website.

To end this report, I must admit I was pleasantly surprised, and I hope to be there next year for the 2014 edition.

And for those interested, my photos of the speakers and the organizers.

sqlite foreign data wrapper

Last week, during a training session, one of my customers asked me why there is no foreign data wrapper for sqlite (actually, you can have one with multicorn, but there's no native one). I have no idea why, but I was happy to learn that no sqlite FDW existed yet. I had wanted to write one for quite some time, and this looked like the perfect occasion.

So, in the evening, I started working on one. I took as a foundation the blackhole foreign data wrapper, written by Andrew Dunstan. It helped a lot to start quickly. I found it a bit surprising that it didn't include #ifdef for some functions and hooks, so that it could be compatible with both 9.1 and 9.2. I added them in a patch that you can find here if you need it. Otherwise, you can simply delete some parts of the blackhole code.

After that, I tried to find how to open and close a sqlite database, and how to read a table in it. They have a really nice example in their "Quick start document" and "Introduction To The SQLite C/C++ Interface".

I wanted something really simple to start with. Almost two years ago, at pgconf.eu, I went to see Dave Page's talk on FDWs (PostgreSQL at the center of your dataverse). So I already knew that you mainly need three functions to read a table: BeginForeignScan, IterateForeignScan, EndForeignScan. The first one has to open the connection (if it's not opened yet). The second one will be executed as many times as there are rows to grab: on its first execution, it must launch the query and get the first row; on every other iteration, it grabs a new row. And the third function helps clean up memory. So I started to include the sqlite tutorial code in the blackhole FDW. And it worked great. I had to write some other functions, specifically the handler and the validator, but, in an hour, I had something working. Of course, it wasn't pretty. The database filename was hardcoded, with no option to change it. The query executed remotely was also hardcoded, which means you couldn't change the table name without recompiling.

So I started to look at options for the foreign server and the foreign table. Most of the code was taken from Dave Page's mysql_fdw code. I quickly had a foreign data wrapper allowing two options: database (to change the database file), and table (to target a specific remote table).

The whole code is available on the sqlite_fdw github repository. I'll try to add the missing features as soon as possible. And when I have something working and tested, I'll post it on pgxn.

Translation of the PostgreSQL 9.3 documentation

Unlike last year, the merge was much faster: it was done in less than three days. So the translation can now begin.

Anyone who wants to help is welcome. The list of files and the instructions are available at: https://github.com/gleu/pgdocs_fr/wiki/translation-9.3. I may need to grant write access to that page, we'll see. I admit I don't know github's wiki system very well.

As for me, I'll certainly start tomorrow.

For those who need this documentation, it's available at http://docs.postgresql.fr/9.3/. It will be updated as the translation progresses (so, right now, it's just the 9.2 documentation in French, with the 9.3 additions grafted in, in English).

New minor releases of the documentation

Sorry, once again it took me a while to get around to updating the translation of the manuals. It's now done and available, for versions 8.4 to 9.2.

The 9.3 beta should be out tomorrow. I'll get on it tomorrow evening. It may well be epic, as with every major release :) Although, having looked at the diff, it doesn't seem that hard. The most affected parts are the documentation on functions, configuration, and catalogs (nothing unusual so far), along with event triggers, materialized views, Foreign Data Wrappers, pg_xlogdump, pg_isready, and Background Workers (the big new features of this release). Still, there's work to do.

Use the index, Luke! ... in French!

A few months ago, I discovered an excellent website thanks to a colleague (Marc Cousin, to name him). The site is called Use the index, Luke! It explains in very simple terms what indexes are, how a database uses them, how to design them well to get better performance, and so on. The most astonishing thing is that the explanations are simple and crystal clear despite some fairly complex concepts. In short, I devoured the whole site, hoping to learn as much as possible. Its author, Markus Winand, eventually published it as a book. I obviously bought it so I could quietly reread the whole thing. I ended up emailing Markus to ask whether he was going to make a PDF version. He was, and I even got to test a beta version of the PDF. Classy :)

While talking with Markus, we agreed on a French translation. Then came pgconf.eu (where I actually met Markus in person), then work picked up again... suffice to say that the start of the translation work was pushed back quite a bit. But I was finally able to work on it. I've been working on it more or less seriously for about three weeks now, and half of the translation is done. Markus decided this first half could be made available to the whole world (at least :) ). So it's done. It's available at http://use-the-index-luke.com/fr.

Don't hesitate to report any translation or comprehension problem to me. This book will keep evolving, at least for as long as its translation is in progress.

Talk at AFUP Lyon

Yesterday evening, I went to give a talk at AFUP Lyon with Dimitri. The goal was to introduce PostgreSQL to PHP developers.

I started at 7:30pm with a one-hour introduction to PostgreSQL. I split the talk into several parts:

  • some general information about PostgreSQL: its history, its features, its community, its sponsors, etc.;
  • installation: with information about the hardware, the operating system, and the precompiled packages for PostgreSQL;
  • configuration, of the operating system and of PostgreSQL;
  • maintenance: VACUUM, ANALYZE, REINDEX, and their automation;
  • backups, both logical and physical;
  • monitoring, with the log files and the activity statistics, and a brief review of the usable tools.

I think it gave a complete overview of PostgreSQL. Obviously, in one hour, you skim over a lot of concepts, but you can hardly expect more from an overview. For the curious, the slides are on dalibo.org.

Dimitri followed with another talk, one hour long as well. His talk walked through a development example. There was a lot of information on writing queries (CTEs, window functions, etc.). The final queries were really impressive.

There were about a hundred people in the lecture hall. I was rather expecting twenty or so, so it was a very pleasant surprise.

The evening ended at a restaurant, a typical Lyon bouchon. Good food, excellent wine, and very good company. A nice conclusion to this lightning trip to Lyon.

Thanks to the organizers, especially Gautier and Michael.

pgconf.eu 2012, a quick recap

One year of work, that's all it takes to build such an event. That may not seem like much, but believe me, it is. And we are still working: invoices to pay, feedback to collect, etc. But it's worth it. It's more than simply "worth it". Seeing so many people come and have a good time talking about PostgreSQL, that's just awesome.

I didn't see as many talks as I would have liked, but I did see some, mostly because I was a room host twice. In a weird way, the most interesting talks for me were talks from people I didn't know about before pgconf.eu, and that I would not have seen if I hadn't been a room host. Roland Sonnenschein talked about PostgreSQL in Research & Development. He explained how he uses PostgreSQL in critical missions, and how he was able to sustain a really big insert load with a few tricks. The little tricks he gave were smart and inspiring. I hope I'll have the time to use that kind of trick with some of my customers. PostBIS was another surprising talk, with its explanation of the good and the bad of the compression algorithm in PostgreSQL. Of the people I knew, the most interesting talks were from Alexey Klyukin ("Migrating Oracle queries to PostgreSQL"), Gianni Ciolli ("Debugging complex queries..."), and Hans-Jürgen Schönig ("Limiting PostgreSQL resource consumption"). They were all inspiring, and I intend to use every bit of information when doing audits for my customers or when giving a training course on PostgreSQL.

My own talk ("Understanding EXPLAIN and its output") went pretty well, I think. The feedback I got was generally good. The slides and notes are available here if you're interested.

Of course, pgconf.eu isn't only about talks. The hallway track is also the place to be. I had a great time meeting old-timers, but also newcomers. Meeting Joe Celko and Markus Winand (the guy behind the use-the-index-luke website; I recommend buying and reading his book) was great.

I also went to the lightning talks. That's always a lot of fun. I can't think of anyone who could do a better job than Harald Armin Massa. He's really the right guy to introduce the speakers in a lightning talk. Can't wait to see him next year.

Well, that's it. For those who couldn't come (shame on you :) ), you can still read the slides on the PostgreSQL wiki (but sorry, no video), and mark on your calendar that another event is planned next year, in mid-October.

See you all next year!

New version of pgSnap

It's easily been 10 months since I last worked on pgSnap. An update was really needed, notably after the release of PostgreSQL 9.2 and all the small system-catalog changes it implies. So no big new features in this release: a few bug fixes and, above all, complete support for 9.2.

You can now download it from github.

New minor releases of the manual

Once again, the documentation update took a little while. Blame a busy weekend and a week away on business. Nevertheless, the updates are available on the French documentation website.

French documentation

It won't have escaped your notice that new minor releases of PostgreSQL came out recently. It happened during my vacation, which didn't allow me to be as responsive as usual. Nevertheless, it's now done. Every new minor release has an up-to-date French translation. And even the RC1 is available.

The translation of the applications for PostgreSQL 9.2 is finished. So is the one for pgAdmin 1.16.

I think the translation side is pretty well covered.

All that's left is to remember how the PDF versions are produced, and everything will be perfect :)

Translation of the PostgreSQL 9.2 documentation

I just finished merging the changes from the English version. It took me longer than usual for various reasons (a week at PGCon, followed by a week in Berkeley, and the new minor releases). Anyway, translating is finally possible. For the record, I have the feeling there will be less work to do this year (phew...).

Anyone who wants to help is welcome. The list of files and the instructions are available at: http://www.postgresql.fr/traduction_postgresql_9.2.

As for me, I'll certainly start next Monday.

For those who need this documentation, it's available at http://docs.postgresql.fr/9.2/. It will be updated as the translation progresses (so, right now, it's just the 9.1 documentation in French, with the 9.2 additions grafted in, in English).
