gleu's blog


The handler and the validator functions of a FDW

I'm still working on my foreign data wrapper. Actually, I haven't written a lot of code yet. Instead, I've been reading the code of the other foreign data wrappers, which helped me a lot to understand many things and to see how far each one goes.

I'll try to start a few blog posts about coding a foreign data wrapper.

This first one will be about the two functions needed to declare a foreign data wrapper. Let's take a look at the CREATE FOREIGN DATA WRAPPER DDL statement:

    CREATE FOREIGN DATA WRAPPER name
        [ HANDLER handler_function | NO HANDLER ]
        [ VALIDATOR validator_function | NO VALIDATOR ]
        [ OPTIONS ( option 'value' [, ... ] ) ]

The handler function has been available since 9.1. The other clauses have always been available (at least since the beginning of the SQL/MED infrastructure in PostgreSQL, which came with the 8.4 release).

The handler function is the function you use to declare all the foreign data wrapper hooks you'll need. Here is the complete list, with the first version they were available in:

  • plan functions (used by the planner)
    • GetForeignRelSize (>=9.2), obtain relation size estimates for a foreign table, called at the beginning of query planning
    • GetForeignPaths (>=9.2), create possible access paths for a scan on a foreign table, called during query planning
    • GetForeignPlan (>=9.2), create a ForeignScan plan node from the selected foreign access path, called at the end of query planning
    • PlanForeignScan (<9.2)
  • read functions (used by the executor)
    • BeginForeignScan, begin executing a foreign scan
    • IterateForeignScan, fetch one row from the foreign source
    • ReScanForeignScan, restart the scan from the beginning
    • EndForeignScan, end the scan and release resources
  • write functions (>=9.3, used by the executor)
    • AddForeignUpdateTargets, used to add extra hidden target columns to the list of columns that are to be retrieved from the foreign table during an UPDATE or DELETE
    • PlanForeignModify, perform any additional planning actions needed for an insert, update, or delete on a foreign table
    • BeginForeignModify, begin executing a foreign table modification operation
    • ExecForeignInsert, insert one tuple into the foreign table
    • ExecForeignUpdate, update one tuple in the foreign table
    • ExecForeignDelete, delete one tuple from the foreign table
    • EndForeignModify, end the table update and release resources
  • explain functions (used during the execution of an EXPLAIN)
    • ExplainForeignScan, print additional EXPLAIN output for a foreign table scan
    • ExplainForeignModify (>=9.3), print additional EXPLAIN output for a foreign table update
  • analyze function (>=9.2, used during the execution of an ANALYZE)
    • AnalyzeForeignTable, collect statistics for this foreign table when ANALYZE is executed on it

None are really required. Of course, if you implement none of them, the foreign data wrapper won't be really useful :)

All these hooks are to be set inside a structure of type FdwRoutine. It's best to use the makeNode() function to allocate it, since it sets all fields to NULL. Then you can set each of the hooks you really need, and not care about the rest.

At the end, the handler function hands the FdwRoutine structure back to its caller.

So the code looks something like this:

Datum
my_fdw_handler(PG_FUNCTION_ARGS)
{
  FdwRoutine *fdwroutine = makeNode(FdwRoutine);

  /* plan functions */
#if (PG_VERSION_NUM >= 90200)
  fdwroutine->GetForeignRelSize = myGetForeignRelSize;
  fdwroutine->GetForeignPaths = myGetForeignPaths;
  fdwroutine->GetForeignPlan = myGetForeignPlan;
#else
  fdwroutine->PlanForeignScan = myPlanForeignScan;
#endif

  /* read functions */
  fdwroutine->BeginForeignScan = myBeginForeignScan;
  fdwroutine->IterateForeignScan = myIterateForeignScan;
  fdwroutine->ReScanForeignScan = myReScanForeignScan;
  fdwroutine->EndForeignScan = myEndForeignScan;

#if (PG_VERSION_NUM >= 90300)
  /* write functions */
  fdwroutine->AddForeignUpdateTargets = myAddForeignUpdateTargets;
  fdwroutine->PlanForeignModify = myPlanForeignModify;
  fdwroutine->BeginForeignModify = myBeginForeignModify;
  fdwroutine->ExecForeignInsert = myExecForeignInsert;
  fdwroutine->ExecForeignUpdate = myExecForeignUpdate;
  fdwroutine->ExecForeignDelete = myExecForeignDelete;
  fdwroutine->EndForeignModify = myEndForeignModify;
#endif

  /* explain functions */
  fdwroutine->ExplainForeignScan = myExplainForeignScan;
#if (PG_VERSION_NUM >= 90300)
  fdwroutine->ExplainForeignModify = myExplainForeignModify;
#endif

#if (PG_VERSION_NUM >= 90200)
  /* analyze function */
  fdwroutine->AnalyzeForeignTable = myAnalyzeForeignTable;
#endif

  PG_RETURN_POINTER(fdwroutine);
}


Of course, all the my* functions must be declared beforehand. Rather than the actual code, the more interesting thing to know about a foreign data wrapper is which hooks it supports, and that really depends on the FDW. For example, the Oracle FDW doesn't support the write functions yet. The MySQL FDW doesn't support the write, explain, and analyze functions. Etc.

The validator function checks the options given to a specific foreign object (foreign data wrapper, foreign server, user mapping, or foreign table). It gets a list of all the options, and the context of the options (i.e. the foreign object type). The function has to go through each of them, check that it is valid, and error out if anything wrong happens (send back the ERRCODE_FDW_INVALID_OPTION_NAME error if an option isn't valid, please).

A good way of doing it is to first add a FDW option structure, this way:

struct myFdwOption
{
  const char *optname;
  Oid         optcontext;
};

The optname will obviously be the option name. The optcontext will contain the kind of object this option is for. There are four possibilities: ForeignDataWrapperRelationId, ForeignServerRelationId, UserMappingRelationId, and ForeignTableRelationId.

The options could be declared like this:

static struct myFdwOption valid_options[] =
{
  { "database",  ForeignServerRelationId },
  { "table",     ForeignTableRelationId },
  { NULL,        InvalidOid }
};

So, a simple version of the validator function might be:

Datum
my_fdw_validator(PG_FUNCTION_ARGS)
{
  List       *options_list = untransformRelOptions(PG_GETARG_DATUM(0));
  Oid         catalog = PG_GETARG_OID(1);
  ListCell   *cell;
  char       *my_database = NULL;
  char       *my_table = NULL;

  foreach(cell, options_list)
  {
    DefElem    *def = (DefElem *) lfirst(cell);

    if (!myIsValidOption(def->defname, catalog))
    {
      struct myFdwOption *opt;
      StringInfoData buf;

      /*
       * Unknown option specified, complain about it. Provide a hint
       * with the list of valid options for the object.
       */
      initStringInfo(&buf);
      for (opt = valid_options; opt->optname; opt++)
      {
        if (catalog == opt->optcontext)
          appendStringInfo(&buf, "%s%s", (buf.len > 0) ? ", " : "",
                           opt->optname);
      }

      ereport(ERROR,
              (errcode(ERRCODE_FDW_INVALID_OPTION_NAME),
               errmsg("invalid option \"%s\"", def->defname),
               errhint("Valid options in this context are: %s",
                       buf.len ? buf.data : "<none>")));
    }

    /*
     * Here is the code for the valid options.
     */
    if (strcmp(def->defname, "database") == 0)
    {
      if (my_database)
        ereport(ERROR,
                (errmsg("redundant options: database (%s)",
                        defGetString(def))));

      my_database = defGetString(def);
    }
    else if (strcmp(def->defname, "table") == 0)
    {
      if (my_table)
        ereport(ERROR,
                (errmsg("redundant options: table (%s)",
                        defGetString(def))));

      my_table = defGetString(def);
    }
  }

  PG_RETURN_VOID();
}


The myIsValidOption() function goes through the valid options and checks that the given option matches the right context. Just like this:

static bool
myIsValidOption(const char *option, Oid context)
{
  struct myFdwOption *opt;

  for (opt = valid_options; opt->optname; opt++)
  {
    if (context == opt->optcontext && strcmp(opt->optname, option) == 0)
      return true;
  }
  return false;
}

After compilation and installation of such a FDW, we can try creating foreign objects:

yeah=# CREATE EXTENSION my_fdw ;
-- we can add a server without options
-- we cannot add a server with unknown options
yeah=# CREATE SERVER myserver3 FOREIGN DATA WRAPPER my_fdw OPTIONS (inconnu '10') ;
ERROR:  invalid option "inconnu"
HINT:  Valid options in this context are: database
-- but we can add one with a valid option
yeah=# CREATE SERVER myserver2 FOREIGN DATA WRAPPER my_fdw OPTIONS (database 'this one') ;
-- we can also create foreign table without options
yeah=# CREATE FOREIGN TABLE ft(id integer) SERVER myserver2;
-- we cannot add a foreign table with unknown options
yeah=# CREATE FOREIGN TABLE ft2(id integer) SERVER myserver2 OPTIONS (inconnu '10');
ERROR:  invalid option "inconnu"
HINT:  Valid options in this context are: table
-- we can add a foreign table with valid options
yeah=# CREATE FOREIGN TABLE ft2(id integer) SERVER myserver2 OPTIONS (table '10');

So the validator function seems to work. The handler function, though, doesn't do anything useful yet. If you try to read the foreign table, it will... err... let's simply say that we still need something to make it work:

yeah=# select * from ft2;
The connection to the server was lost. Attempting reset: Failed.

We'll see how it can send back some rows next time.

A quick report on the 2013 pgday

This year, the association organized a pgday in Nantes. It took place on June 13th and gathered more than a hundred attendees. As far as I know, it's the biggest one-day PostgreSQL event held in France to date. Congratulations to the organizers. Not to mention that the organization was flawless: easy registration, a good caterer (even if not as great as Ivanne et Sidonie, the caterers of the PostgreSQL Sessions and of the 2009 edition), a room big enough but not too big, a nice evening party... really well done.

The talks were very good too. Cédric Villemain talked about memory resource management, Damien Clochard about the new features of version 9.3, Philippe Beaudoin about load simulation, Hugo Mercier about PostGIS 2.0, Grégoire Hubert about POMM, Gilles Darold about ora2pg, Dimitri Fontaine about handling large volumes with PostgreSQL, myself about execution plans and the EXPLAIN command, and Vik Fearing about the LATERAL clause. A very full day, packed with information. Not to mention the discussions between talks, which are often a special moment to learn, discover, and get excited.

For me, the best talk was Philippe Beaudoin's. He has a very pleasant and rather unique way of presenting a case study. He showed how he tried (and managed) to simulate load at the database level to make sure his project was viable. Without going into details, he still described the tools he used, a few queries, and his results. It wasn't really technical, and not just a case study either... so everyone could take something home from his talk. Very impressive. In any case, it gave me ideas for new activity reports in pgbadger.

Another interesting talk was Grégoire Hubert's on POMM. Actually, it was rather the speaker who was interesting. He really knows how to get his message across. It's very showman-like, but very effective, and it kept the attendees awake during the particularly difficult slot right after lunch :)

As for my own talk, it went rather well. I had to go fast, as there was a lot of content for only 50 minutes. I hope I didn't lose anyone. Overall the feedback is good, which is a pleasure. The slides are available on the conference website as well as on my own site.

To wrap up this report, I must admit I was pleasantly surprised, and I hope to be there next year for the 2014 edition.

And for those interested, here are my photos of the speakers and the organizers.

sqlite foreign data wrapper

Last week, during a training session, one of my customers asked me why there is no foreign data wrapper for sqlite (actually, you can have one with multicorn, but there's no native one). I have no idea why, but I was happy to learn that no sqlite FDW existed yet. I had wanted to write one for quite some time, and this seemed the perfect opportunity.

So, in the evening, I started working on one. I took as a foundation the blackhole foreign data wrapper, written by Andrew Dunstan. It helped a lot to start quickly. I found it a bit surprising that it didn't include #ifdefs for some functions and hooks, so that it could be compatible with 9.1 and 9.2. I added them in a patch that you can find here if you need it. Otherwise, you can simply delete some parts of the blackhole code.

After that, I tried to find out how to open and close a sqlite database, and how to read a table in it. There are really nice examples in their "Quick start document" and "Introduction To The SQLite C/C++ Interface".

I wanted something really simple to start with. Almost two years ago, I went to see Dave Page's talk on FDWs (PostgreSQL at the center of your dataverse). So I already knew that you mainly need three functions to read a table: BeginForeignScan, IterateForeignScan, and EndForeignScan. The first one has to open the connection (if it's not already open). The second one is executed as many times as there are rows to grab: on its first execution, it must launch the query and grab the first row; on every other iteration, it grabs a new row. The third function helps clean up memory. So I started to include the sqlite tutorial code in the blackhole FDW, and it worked great. I had to write some other functions, specifically the handler and the validator, but, in an hour, I had something working. Of course, it wasn't pretty. The database filename was hardcoded, with no option to change it. The query executed remotely was also hardcoded, which means you couldn't change the table name without recompiling.
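To give an idea of what that first version looked like, here is a minimal sketch of the iterate function on top of the sqlite C API. Take it as an illustration, not the actual sqlite_fdw code: the SqliteFdwExecutionState structure and the way the statement handle is stashed in fdw_state by BeginForeignScan are assumptions for the example.

/*
 * Hypothetical execution state; BeginForeignScan is assumed to have
 * opened the database and prepared the SELECT, storing both here.
 */
typedef struct SqliteFdwExecutionState
{
  sqlite3      *db;     /* connection opened by BeginForeignScan */
  sqlite3_stmt *stmt;   /* prepared SELECT on the remote table */
} SqliteFdwExecutionState;

static TupleTableSlot *
sqliteIterateForeignScan(ForeignScanState *node)
{
  SqliteFdwExecutionState *festate = (SqliteFdwExecutionState *) node->fdw_state;
  TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;

  /* an empty slot tells the executor there are no more rows */
  ExecClearTuple(slot);

  if (sqlite3_step(festate->stmt) == SQLITE_ROW)
  {
    int         natts = sqlite3_column_count(festate->stmt);
    char      **values = (char **) palloc(natts * sizeof(char *));
    HeapTuple   tuple;
    int         i;

    /* grab every column of the current row as a C string (NULL stays NULL) */
    for (i = 0; i < natts; i++)
      values[i] = (char *) sqlite3_column_text(festate->stmt, i);

    /* in real code, you'd build the AttInMetadata once in BeginForeignScan */
    tuple = BuildTupleFromCStrings(
        TupleDescGetAttInMetadata(RelationGetDescr(node->ss.ss_currentRelation)),
        values);
    ExecStoreTuple(tuple, slot, InvalidBuffer, false);
  }

  return slot;
}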

So I started to look at options for the foreign server and the foreign table. Most of that code was taken from Dave Page's mysql_fdw. I quickly had a foreign data wrapper allowing two options: database (to change the database file) and table (to target a specific remote table).

The whole code is available in the sqlite_fdw github repository. I'll try to add the missing features as soon as possible. And when I have something working and tested, I'll post it on pgxn.

Translating the PostgreSQL 9.3 documentation

Unlike last year, the merge was much faster: it was done in less than three days. So it's now possible to start translating.

Anyone who wants to help is welcome. The list of files and the instructions are available online. I may need to grant people the right to edit that page, we'll see. I admit I don't know github's wiki system very well.

For my part, I'll most certainly start tomorrow.

For those who need this documentation, it is available online. It will be updated as the translation progresses (so, right now, it's just the 9.2 documentation in French, with the 9.3 additions grafted on in English).

New minor versions of the documentation

Sorry, once again it took me a while to get around to updating the translation of the manuals. It's now done and available, for versions 8.4 to 9.2.

The 9.3 beta should be released tomorrow. I'll get to work on it tomorrow evening. It may well be epic, as with every major update :) Although, after looking at the diff, it doesn't seem that hard. Mostly affected are the chapters on functions, configuration, and catalogs (the usual so far), as well as event triggers, materialized views, Foreign Data Wrappers, pg_xlogdump, pg_isready, and Background Workers (the big new features of this version). Still, there's work to do.

Use the index, Luke! ... in French!


A few months ago, thanks to a colleague (Marc Cousin, to name him), I discovered an excellent website called Use the index, Luke! It explains in very simple terms what indexes are, how a database uses them, how to design them well to get better performance, and so on. The most astonishing thing is that the explanations are simple and clear despite some rather complex concepts. In short, I devoured the whole site, hoping to learn as much as possible. Its author, Markus Winand, eventually published it as a book. I obviously bought it so I could reread the whole thing at my leisure. I ended up sending an email to Markus to ask whether he was going to make a PDF version. He was, and I even got to test a beta version of the PDF. Classy :)

While talking with Markus, we agreed on a French translation. Then came pgconf.eu (where I met Markus in person, by the way), then work picked up again... suffice it to say that the start of the translation work was heavily delayed. But I was finally able to work on it. I've now been working on it more or less seriously for about three weeks, and half of the translation is done. Markus decided this beginning could be made available to the whole world (at least :) ). So there it is, available online.

Don't hesitate to report any translation or comprehension problem to me. This book will keep evolving, at least until its translation is complete.

A talk at AFUP Lyon

Yesterday evening, I went to give a talk at the AFUP in Lyon with Dimitri. The goal was to introduce PostgreSQL to PHP developers.

I started at 7:30pm with a one-hour introduction to PostgreSQL, divided into several parts:

  • some global information about PostgreSQL: its history, its features, its community, its sponsors, etc.;
  • installation: with information on the hardware, on the operating system, and on the precompiled packages for PostgreSQL;
  • configuration, of the operating system and of PostgreSQL;
  • maintenance: VACUUM, ANALYZE, REINDEX, and their automation;
  • backups, logical as well as physical;
  • monitoring, with the log files and the activity statistics, and a quick overview of the available tools.

I think it gave a complete overview of PostgreSQL. Obviously, in one hour, you skim over a lot of concepts, but you can hardly expect more from an overview. For the curious, the slides are available online.

Dimitri followed with another one-hour talk. His talk went through a development example. There was a lot of information about writing queries (CTEs, window functions, etc.). The final queries were really impressive.

There were about a hundred people in the lecture hall. I was expecting around twenty, so this was a very pleasant surprise.

The evening ended in a restaurant, a typical bouchon lyonnais. Good food, excellent wine, and very good company. A fine conclusion to this lightning trip to Lyon.

Thanks to the organizers, especially Gautier and Michael.

pgconf.eu 2012, a quick recap

One year of work, that's all it takes to build such an event. That doesn't seem like much, but believe me, it is. And we are still working: invoices to pay, feedback to collect, etc. But it's worth it. It's more than simply "worth it". Seeing so many people come and have a good time talking about PostgreSQL, that's just awesome.


I didn't see as many talks as I would have liked, but I did see some, mostly because I was a room host twice. In a weird way, the most interesting talks for me were from people I didn't know about before, talks I would not have seen if I hadn't been a room host. Roland Sonnenschein talked about PostgreSQL in Research & Development. He explained how he uses PostgreSQL in critical missions, and how he was able to sustain a really big insert load with a few tricks. The little tricks he gave were smart and inspiring; I hope I'll have the time to use that kind of trick with some of my customers. PostBIS was another surprising talk, with its explanation of the good and bad sides of the compression algorithm in PostgreSQL. Of the people I knew, the most interesting talks were from Alexey Klyukin ("Migrating Oracle queries to PostgreSQL"), Gianni Ciolli ("Debugging complex queries..."), and Hans-Jürgen Schönig ("Limiting PostgreSQL resource consumption"). They were all inspiring, and I intend to use every bit of information when doing audits for my customers or when giving PostgreSQL training courses.

My own talk ("Understanding EXPLAIN and its output") went pretty well, I think. The feedback I got was generally good. The slides and notes are available here if you're interested.

Of course, pgconf.eu isn't only about talks. The hallway track is also the place to be. I had a great time meeting old-timers, but also newcomers. Meeting Joe Celko and Markus Winand (the guy behind the use-the-index-luke website; I recommend buying and reading his book) was great.

I also went to the lightning talks. That's always a lot of fun. I can't think of anyone who could do a better job than Harald Armin Massa. He's really the right guy to introduce the speakers in a lightning talk. Can't wait to see him next year.

Well, that's it. For those who couldn't come (shame on you :) ), you can still read the slides on the PostgreSQL wiki (but sorry, no video), and mark on your calendar that another event is planned next year, in mid-October.

See you all next year!

A new version of pgSnap

It's easily been 10 months since I last worked on pgSnap. An update was really needed, especially after the release of PostgreSQL 9.2 and all the small system-catalog changes it implies. So no big news in this version: a few bugfixes and, above all, complete support for 9.2.

It's now available for download on github.

New minor versions of the manual

Once again, the documentation update took a little while, due to a very busy weekend and a week on the road. Nevertheless, the updates are available on the French documentation website.

French documentation

It will not have escaped you that new minor versions of PostgreSQL were released recently. This happened during my vacation, so I couldn't be as reactive as usual. Nevertheless, it's now done. Each new minor version has an updated French translation. Even the RC1 is available.

The translation of the applications for PostgreSQL 9.2 is finished. So is pgAdmin 1.16's.

I think the translation side is pretty well covered.

All that's left is for me to remember how to build the PDF versions, and everything will be perfect :)

Translating the PostgreSQL 9.2 documentation

I just finished merging the changes from the English version. It took me longer than usual for various reasons (a week at PGCon, followed by a week in Berkeley, and the new minor versions). Anyway, it's finally possible to translate. For the record, my impression is that there will be less work to do this year (phew...).

Anyone who wants to help is welcome. The list of files and the instructions are available online.

For my part, I'll most certainly start next Monday.

For those who need this documentation, it is available online. It will be updated as the translation progresses (so, right now, it's just the 9.1 documentation in French, with the 9.2 additions grafted on in English).

pgAdmin 1.16 beta 1 is out

At the same time PostgreSQL 9.2 beta 1 was released, so was pgAdmin 1.16 beta 1 (with an unfortunate copy/paste error in the URL to download it).

We don't yet have installers for Windows or packages for Mac OS X; they will come later.

What you can expect from pgAdmin 1.16 is complete compatibility with PostgreSQL 9.2: all the new features of PostgreSQL 9.2 are handled. You can use pgAdmin 1.14 with PostgreSQL 9.2, but you won't get all the new parameters and objects. Moreover, you'll have issues with the tablespaces node (a column disappeared from the pg_tablespace system catalog).

One thing you shouldn't expect from pgAdmin 1.16, and I'm really sorry about that, is the database designer. This really cool tool isn't ready for prime time yet. We still need more time to fix the bugs, to make sure it's usable on every platform, and to do all the little things that make a product great. You can still try it if you want, but you'll need to compile pgAdmin (look for the --enable-databasedesigner ./configure option).

I'm working right now on a visual tour for 1.16, but it may take some time before it gets out.

During the beta period, I'll also fix the bugs testers will encounter, and work on the documentation.

So, as you see, still a lot of work. But don't hesitate to install pgAdmin 1.16 beta 1, search for bugs, and report them on our mailing lists, so that we can have a really great 1.16.0 release.

Call for speakers for pgconf.eu 2012

pgconf.eu 2012 will take place in Prague, in October 2012. The call for speakers has just been launched by Magnus Hagander. Don't hesitate to send proposals on any topic that seems interesting to you (related to PostgreSQL, obviously). Proposals are accepted until August 1st. A committee will then meet to sort through the proposals and select the most interesting ones.

If you're not interested in giving a talk yourself, do pass the message along to your colleagues who might be :)

And note that the event will take place from October 23rd to 26th. We're expecting all of you at the Corynthia hotel to take part in this new major event for PostgreSQL contributors and users.

Profiling PL/pgsql functions

I spent my whole week trying to fix PL/pgsql functions. More to the point, I tried to make them perform better. My customer complains that they aren't fast enough, and I agree with him: some of his PL/pgsql functions are really slow. Unfortunately, he uses a lot of PL/pgsql functions. One function may call many other functions, and it's hard to tell which one is the culprit (if there is only one).

My first step was to use pgFouine on three days of logs to find out which queries were really slow on his system. Three queries stood out as the real offenders. I tried to EXPLAIN ANALYZE them, but found that these queries involved lots of views and functions. Not really easy queries to fix. So I set track_functions to pl, so that I could see which functions were called by a query, and which ones were the really offending ones. Here is the result I got:

b1=# SELECT funcname, calls, total_time, self_time
b1-# FROM pg_stat_user_functions;
 funcname  | calls  | total_time | self_time 
-----------+--------+------------+-----------
 f0        |      1 |     328486 |      1375
 f1        |      1 |        210 |       210
 f2        |      1 |        123 |       123
 f3        |      1 |      20310 |     20310
 f4        |      1 |        763 |       763
 f51       |      1 |      33486 |     33486
 f52       |      1 |     250363 |    239703
 f53       |      1 |      19139 |     19139
 f6        |      1 |        399 |       399
 f7        |      1 |       1770 |      1770
 f8        |      1 |        543 |       543
 f_compute | 110588 |      10660 |     10660
(12 rows)

Some context first: I only executed f0(), which itself called all the other functions. So what do I get from this? f_compute() is executed 110588 times. That's a lot, but it doesn't consume much of f0()'s execution time (10660 out of 328486 ms, about 3%). f52() is only executed once, but took 76% of the total execution time of f0() (250363 out of 328486 ms). If I could make it run faster, I would probably gain a lot.

But there's more. f52() executes other functions, but its own execution takes most of the time (239703 out of 250363 ms, about 95%). So what I really need to know is how long each of its statements lasts. There aren't many ways to know that, actually. You can modify the function's source code to add RAISE LOG statements (you can find many examples in this post by depesz). It takes a lot of time, you may do it wrong, and you cannot always do it. You can also use the plprofiler from EnterpriseDB. That's the first thing I did. Due to the number of statements in each of the functions the query called, my query went from 5.4 minutes to more than two hours of execution time. I canceled it because I couldn't wait any longer. The only option left was to modify the function source code, but I really hated that idea.

So I read the plprofiler code, and found a better way to do it. The plprofiler is a shared library that uses hooks inside the PL/pgsql interpreter to grab lots of information for each statement of each executed function. I only had to change the code to make it less of a burden on my system. All I wanted was a line in the PostgreSQL log files when a function starts, when it ends, and when a function's statement is executed. It turns out that this is actually quite easy to do. You can find the code on the log_functions github page. After compiling, installing, and configuring this tool, and then executing the query, I got this log (somewhat edited to make it fit better in this blog post):

10:56:24 LOG:  log_functions, DECLARE, f52
10:56:24 LOG:  log_functions, BEGIN, f52
10:56:24 LOG:  log_functions, STMT, line 7, type 14
10:56:24 LOG:  log_functions, STMT, line 14, type 14
10:56:24 LOG:  log_functions, STMT, line 23, type 14
10:56:24 LOG:  log_functions, STMT, line 32, type 14
10:56:24 LOG:  log_functions, STMT, line 41, type 14
10:56:24 LOG:  log_functions, STMT, line 52, type 14
10:56:24 LOG:  log_functions, STMT, line 75, type 14
11:00:14 LOG:  log_functions, STMT, line 99, type 14
11:00:14 LOG:  log_functions, STMT, line 110, type 14
11:00:14 LOG:  log_functions, STMT, line 122, type 14
11:00:14 LOG:  log_functions, STMT, line 133, type 14
11:00:14 LOG:  log_functions, STMT, line 145, type 14
11:00:14 LOG:  log_functions, STMT, line 156, type 14
11:00:14 LOG:  log_functions, STMT, line 168, type 14
11:00:14 LOG:  log_functions, STMT, line 179, type 14
11:00:14 LOG:  log_functions, STMT, line 191, type 14
11:00:14 LOG:  log_functions, STMT, line 202, type 14
11:00:14 LOG:  log_functions, STMT, line 214, type 14
11:00:14 LOG:  log_functions, STMT, line 225, type 14
11:00:14 LOG:  log_functions, STMT, line 237, type 14
11:00:14 LOG:  log_functions, STMT, line 248, type 14
11:00:14 LOG:  log_functions, STMT, line 260, type 14
11:00:14 LOG:  log_functions, STMT, line 271, type 14
11:00:14 LOG:  log_functions, STMT, line 283, type 14
11:00:14 LOG:  log_functions, STMT, line 294, type 14
11:00:14 LOG:  log_functions, STMT, line 304, type 14
11:00:14 LOG:  log_functions, STMT, line 306, type 13
11:00:14 LOG:  log_functions, STMT, line 308, type 10
11:00:14 LOG:  log_functions, END, f52

I also got lucky here: all the time spent in this function is spent on the statement at line 75 (logged at 10:56:24, while the next statement only starts at 11:00:14). So what I really need to know is why that query takes so long. Hunting down that query's performance is not really relevant to this blog post; I only wanted to show how you can easily profile a PL/pgsql function without changing its source code.
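For the curious, the mechanism behind this is the PLpgSQL_plugin rendezvous variable exposed by the PL/pgsql interpreter: a shared library fills a structure of callbacks, and the interpreter calls them when a function or a statement starts or ends. Here is a condensed sketch; the callback names are mine, and the real log_functions module does more (configuration GUCs, func_end/stmt_end callbacks, etc.):

#include "postgres.h"
#include "fmgr.h"
#include "plpgsql.h"

PG_MODULE_MAGIC;

static void log_func_beg(PLpgSQL_execstate *estate, PLpgSQL_function *func);
static void log_stmt_beg(PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt);

/* func_setup, func_beg, func_end, stmt_beg, stmt_end */
static PLpgSQL_plugin plugin_funcs = { NULL, log_func_beg, NULL,
                                       log_stmt_beg, NULL };

void
_PG_init(void)
{
    /* the PL/pgsql interpreter looks this variable up and calls us back */
    PLpgSQL_plugin **var_ptr = (PLpgSQL_plugin **)
        find_rendezvous_variable("PLpgSQL_plugin");

    *var_ptr = &plugin_funcs;
}

static void
log_func_beg(PLpgSQL_execstate *estate, PLpgSQL_function *func)
{
    elog(LOG, "log_functions, BEGIN, oid %u", func->fn_oid);
}

static void
log_stmt_beg(PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt)
{
    elog(LOG, "log_functions, STMT, line %d, type %d",
         stmt->lineno, stmt->cmd_type);
}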

As I think it may be a useful tool for many people, it's available on my github page, and on pgxn. I hope it can help people profile their PL/pgsql functions.

To end this blog post, know that there are many things one can do with hooks in PostgreSQL. This is just one of them. I already explained how you can use another one to log only superuser queries. This new one will totally fit in with my talk on hooks at PGCon 2012 :)

pgAdmin and wxWidgets 2.9

pgAdmin is a multiplatform tool. This means that you can use it on various operating systems, such as Windows, Mac OS X, Linux, and even Solaris (not sure if that last one is still supported). To be honest, I write bugfixes and new features on my Linux laptop (Fedora 16), and check them, whenever I can, on Mac OS X (Snow Leopard, on my old Mac Mini) and Windows (XP, still on the old Mac Mini). We use wxWidgets to make it easier to code user interfaces for all platforms. pgAdmin supports wxWidgets 2.8. The developers of wxWidgets are working hard on the next big release, 3.0. To ease the jump, they have had a 2.9 release for quite some time. This 2.9 version will become 3.0 "when it's ready".

I'm really happy to see that we are now compatible with 2.9. There are probably still some itches to fix, but it's already quite usable.

Peter Geoghegan was the first one to work on this, and wrote a big patch that Dave Page applied on 2011-02-01. There were a few fixes here and there. The second phase, done by Peter, was the merge of the wxOGL contrib into our source tree (we had the author's permission for that). And then another bunch of fixes, mostly from reports sent by Colin Beckingham and fixed by me.

Right now, I have something that works great on Linux.

Why is it important? Because it offers better support for new features of the operating systems, and it fixes a lot of issues we previously had with wxWidgets 2.8 that the wxWidgets developers couldn't fix (either because it would require too many changes, or because they don't have the manpower to do it). 2.9 also adds some widgets that we would definitely love to use:

  • a property grid;
  • a list control with a tree in the first column (to display trees where each item has information to display in columns);
  • a native calendar control for Linux (the previous one is a wxWidgets custom control, and not really usable);
  • a tree control with checkboxes (we have that via a custom widget, one that we could get rid of... remember, less code is good);
  • and probably many other interesting enhancements (for example, support for GTK3 would make my day).

If you can test pgAdmin 1.15 with wxWidgets 2.9, please send any issues to pgadmin-hackers.

An import tool for pgAdmin

With pgAdmin 1.14 and prior releases, there were only two ways to insert data into a table:

  • use the restore tool (which uses pg_restore);
  • use the query tool to execute INSERT queries (COPY queries are allowed if they use a file, but not stdin/stdout).

There was no way to ask it to import a CSV file, for example. Then a thread began on this matter a week ago. As working on the COPY API of libpq was quite motivating, I worked on it for a few hours, and finished yesterday with this patch.

This patch adds a new window that allows a user to choose a file and select all the options available in the COPY command (but only the import options, not the export ones).

This window is displayed when someone selects the "Import" item in the table's contextual menu, as shown in the following screenshot:


The import window has 5 tabs:

  • "File options" tab makes possible to choose a file, sets its format (text, CSV, or binary), and optionaly its encoding.


  • "Columns" tab allows the user to specify which columns are contained in the file. Right now, they must be in the same order than the table (because of the UI), but they don't have to be all present in the file.


  • "Misc Options", "Quote Options", and "NULL options" tabs are not used usually, but it could be handy to set another delimiter than the usual comma, amongs other possibilities.


When the user clicks the "Import" button, pgAdmin launches a COPY query matching the selected options, and sends the file in 8KB chunks. The progress bar shows the progress of the import. If the import ends with success, the "Import" button is renamed "Done", and a click on it closes the dialog.
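For those wondering, this is the COPY API of libpq. Here is a rough sketch of the sending loop, assuming conn is an open PGconn, file the file selected by the user, and a hardcoded COPY statement (pgAdmin of course builds the real statement from the options selected in the dialog):

/* "mytable" and the CSV option are placeholders for this example */
PGresult *res = PQexec(conn, "COPY mytable FROM STDIN (FORMAT csv)");

if (PQresultStatus(res) == PGRES_COPY_IN)
{
    char    buf[8192];
    size_t  n;

    /* push the file content to the server in 8KB chunks */
    while ((n = fread(buf, 1, sizeof(buf), file)) > 0)
    {
        if (PQputCopyData(conn, buf, (int) n) != 1)
            break;              /* connection trouble, stop sending */
    }

    PQputCopyEnd(conn, NULL);   /* NULL means "no error, end COPY normally" */
    PQclear(res);
    res = PQgetResult(conn);    /* final status of the COPY command */
}

PQclear(res);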

That was a really interesting patch to write, for the UI, and the COPY API. And it is a quite nice feature for pgAdmin.

New minor versions, French translation update

Contrary to my habit, the French PostgreSQL manuals were only updated now, five days after the releases. To make up for it, I finally fixed the search problem in the 9.1 documentation (thanks to Thomas for the report).

I almost forgot... the 8.2 documentation hasn't disappeared. It has just moved in with the documentation of the other obsolete versions.

pgsnap 0.7.0 released

As promised, here's the new version of pgsnap. Among the new features: compatibility with PostgreSQL 9.1, a report on the transaction logs and another one on index sizes, the ability to sort all tables (thanks, jquery), not to mention the ability to automatically build a directory containing the reports for all databases.

In short, good stuff, but nothing major either. Version 0.8 should offer more new reports, and it may well arrive quickly (i.e., before the end of the year).

I took the opportunity to write a wiki page on github summing up some information about this tool.

How to log superuser queries

A few weeks ago, one of my customers asked me if they could log all queries coming from superusers. They have to make sure that none of their DBAs can fire a query without that query being logged.

You can already log queries with the log_statement or log_min_duration_statement GUCs, and you can set these GUCs cluster-wide or for specific users. But there's a big issue here: a superuser can change the value of these GUCs, making it impossible to be sure that everything they do is logged.

I first thought of writing a patch to handle this in PostgreSQL, but that also has a major issue: it wouldn't be available for 9.0, the release they use. All new features are added to the development release, and I don't want to maintain a custom PostgreSQL release for anyone (that's too much of a burden). So I came up with a nicer way of doing it.

PostgreSQL has allowed adding hooks to the executor since 8.4, and you can use these hooks to know which queries are executed and by whom. Actually, that's how pg_stat_statements works. So all I needed to do was create a small shared library that adds hooks to the executor. The hooks check that the user executing a query is a superuser and, if so, log the query. Quite simple; here is the code of the hook:

static void
pgluq_log(const char *query)
{
    Assert(query != NULL);

    if (superuser())
        elog(log_level, "%s %s: %s", log_label,
             GetUserNameFromId(GetUserId()), query);
}

Of course, the library has more code than just that: adding/removing the hooks, and taking care of two module GUCs (log level and log prefix), but the most important stuff is in the pgluq_log() function.
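For reference, here is roughly how such a library installs its executor hook. This is a sketch of the general pattern, not the module's exact code: I only show ExecutorEnd here, and I reuse the pgluq_log() function from above.

static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;

static void
pgluq_ExecutorEnd(QueryDesc *queryDesc)
{
    /* log the query, then hand control back to the regular executor */
    pgluq_log(queryDesc->sourceText);

    if (prev_ExecutorEnd)
        prev_ExecutorEnd(queryDesc);
    else
        standard_ExecutorEnd(queryDesc);
}

void
_PG_init(void)
{
    /* save any previously installed hook so we can chain to it */
    prev_ExecutorEnd = ExecutorEnd_hook;
    ExecutorEnd_hook = pgluq_ExecutorEnd;
}

void
_PG_fini(void)
{
    ExecutorEnd_hook = prev_ExecutorEnd;
}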

The module is available right now on github, and I intend to publish it via the pgxn website once my account is approved. Docs are available here. It's compatible with PostgreSQL 8.4, 9.0, and 9.1.

The next thing I'll do for this module is add support for syslog. Right now, it only logs to the same location as PostgreSQL.
