gleu's blog

Aller au contenu | Aller au menu | Aller à la recherche

pgconf.eu 2012, a quick recap

One year of work, that's all it takes to build such an event. That doesn't seem a lot, but believe me, it is. And we are still working: invoices to pay, feedbacks to collect, etc. But it's worth it. It's more than simply "worth it". Seeing so many people come and have a good time talking about PostgreSQL, that's just awesome.

prague.png

I didn't see as many talks as I would have liked, but I did see some, mostly because I was a room host twice. In a weird way, the most interesting talks for me were talks from people I didn't know about before pgconf.eu and that I would have not seen if I weren't a room host. Roland Sonnenschein talked about PostgreSQL in Research & Development. He explained how he uses PostgreSQL in critical missions, and how he was able to sustain really big insert load with some tricks. The little tricks he gave were smart and inspiring. I hope I'll have the time to use that kind of tricks with some of my customers. PostBIS was another surprising talk with its explanation of the good and bad of the compression algorithm in PostgreSQL. Of the people I knew, the most interesting talks were from Alexey Klyukin ("Migrating Oracle queries to PostgreSQL"), Gianni Ciolli ("Debugging complex queries..."), and Hans-Jürgen Schönig ("Limiting PostgreSQL ressource consumption"). They were all inspiring, and I intend to use every bit of information when doing audits for my customers or when giving a training course on PostgreSQL.

My own talk ("Understanding EXPLAIN and its output") went pretty good, I think. The feedbacks I got were usually good. The slides and notes are available here if you're interested.

Of course, pgconf.eu isn't only about talks. The hallway track is also a place to be. I had a great time meeting old-timers, but also newcomers. Meeting Joe Celko and Markus Winand (the guy behind the use-the-index-luke website, and I recommend you to buy and read his book) was great.

I also went to the lightning talks. That's always a lot of fun. I can't think of anyone who could do a better job than Harald Armin Massa. He's really the right guy to introduce the speakers in a lightning talk. Can't wait to see him next year.

Well, that's it. For those who couldn't come (shame on you :) ), you can still read the slides on the PostgreSQL wiki (but sorry, no video), and mark on your calendar that another event is planned next year, in mid-october.

See you all next year!

pgAdmin 1.16 beta 1 is out

At the same time PostgreSQL 9.2 beta 1 was released, so did pgAdmin 1.16 beta 1 (with an unfortunate copy/paste error on the URL to download beta 1).

We don't have yet installers for Windows, and packages for Mac OS X. They will come at a later time.

What you can expect from pgAdmin 1.16 is complete compatibility with PostgreSQL 9.2. All the new features of PostgreSQL 9.2 are handled by pgAdmin 1.16. You can use pgAdmin 1.14 with PostgreSQL 9.2, but you won't get all the new parameters, and objects. Moreover, you'll have issues with the tablespaces node (one column disappears in the pg_tablespace system catalog).

One thing you shouldn't expect from pgAdmin 1.16, and I'm really sorry about that, is the database designer. This really cool tool isn't ready for prime time yet. We still need more time to fix the bugs, to make sure it's usable on every plateform, and to do all the little things that make a product great. You can still try it if you want, but you'll need to compile pgAdmin (look for the --enable-databasedesigner ./configure option).

I'm working right now on a visual tour for 1.16, but it may take some time before it gets out.

During the beta period, I'll also fix the bugs testers will encounter, and work on the documentation.

So, as you see, still a lot of work. But don't hesitate to install pgAdmin 1.16 beta 1, search for bugs, and report them on our mailing lists, so that we can have a really great 1.16.0 release.

Profiling PL/pgsql functions

I spent my whole week trying to fix PL/pgsql functions. More to the point, I tried to make them perform better. My customer complains that they aren't fast enough, and I agree with him: some of his PL/pgsql functions are really slow. Unfortunately, he uses a lot of PL/pgsql functions. One function may call many other functions, and it's hard to tell which one is the culprit (if there is only one).

My first step was to use pgFouine on a three-days log to know which queries are really slow on his system. Three queries stood out as being the real offenders. I tried to EXPLAIN ANALYZE them, but found that these queries involved lots of views and functions. Not really easy queries to fix. So I set track_functions to pl, so that I can see which functions were called by one query, and which functions are the really offending ones. Here is the result I got:

b1=# SELECT funcname, calls, total_time, self_time
b1-# FROM pg_stat_user_functions;
 funcname  | calls  | total_time | self_time 
-----------+--------+------------+-----------
 f0        |      1 |     328486 |      1375
 f1        |      1 |        210 |       210
 f2        |      1 |        123 |       123
 f3        |      1 |      20310 |     20310
 f4        |      1 |        763 |       763
 f51       |      1 |      33486 |     33486
 f52       |      1 |     250363 |    239703
 f53       |      1 |      19139 |     19139
 f6        |      1 |        399 |       399
 f7        |      1 |       1770 |      1770
 f8        |      1 |        543 |       543
 f_compute | 110588 |      10660 |     10660
(12 rows)

Some context first, I only executed f0(), which itself called all the other functions. So what do I get from this? f_compute() is executed 110588 times. That's a lot, but it doesn't consume much of f0() execution time (3%). f52() is only executed once, but took 76% of the total execution time of f0(). If I can make it run faster, I would probably gain a lot.

But there's more. f52() executes other functions, but its own execution took most of the time (95%). So what I really need to know is how long each of its statements last. There's not many way to know that actually. You can modify the function source code to add RAISE LOG statements in this function (you can find many examples on this depesz's post). It takes a lot of time to do that, you may do it wrong, you cannot always do it. You can also use the plprofiler from EnterpriseDB. That's the first thing I did. Due to the number of statements in each of the functions the query called, my query went from 5.4 minutes to more than two hours of execution time. I canceled it because I couldn't wait any longer. The only thing left was to modify the function source code, but I really hate this idea.

So I read the plprofiler code, and found a better way to do it. The plprofiler is a shared library that uses hooks inside the PL/pgsql interpreter to grab lots of information for each statement of each executed function. I only had to change the code to make it less a burden on my system. All I wanted was to get a line in the PostgreSQL log files when a function starts, when it ends, and when a function's statement is executed. It turns out that it's actually quite easy to do. You can find the code on the log_functions github page. After compilation, installation, and configuration of this tool, and then the execution of the query, I got this log (somewhat edited to make it fit better on this blog post):

10:56:24 LOG:  log_functions, DECLARE, f52
10:56:24 LOG:  log_functions, BEGIN, f52
10:56:24 LOG:  log_functions, STMT, line 7, type 14
10:56:24 LOG:  log_functions, STMT, line 14, type 14
10:56:24 LOG:  log_functions, STMT, line 23, type 14
10:56:24 LOG:  log_functions, STMT, line 32, type 14
10:56:24 LOG:  log_functions, STMT, line 41, type 14
10:56:24 LOG:  log_functions, STMT, line 52, type 14
10:56:24 LOG:  log_functions, STMT, line 75, type 14
11:00:14 LOG:  log_functions, STMT, line 99, type 14
11:00:14 LOG:  log_functions, STMT, line 110, type 14
11:00:14 LOG:  log_functions, STMT, line 122, type 14
11:00:14 LOG:  log_functions, STMT, line 133, type 14
11:00:14 LOG:  log_functions, STMT, line 145, type 14
11:00:14 LOG:  log_functions, STMT, line 156, type 14
11:00:14 LOG:  log_functions, STMT, line 168, type 14
11:00:14 LOG:  log_functions, STMT, line 179, type 14
11:00:14 LOG:  log_functions, STMT, line 191, type 14
11:00:14 LOG:  log_functions, STMT, line 202, type 14
11:00:14 LOG:  log_functions, STMT, line 214, type 14
11:00:14 LOG:  log_functions, STMT, line 225, type 14
11:00:14 LOG:  log_functions, STMT, line 237, type 14
11:00:14 LOG:  log_functions, STMT, line 248, type 14
11:00:14 LOG:  log_functions, STMT, line 260, type 14
11:00:14 LOG:  log_functions, STMT, line 271, type 14
11:00:14 LOG:  log_functions, STMT, line 283, type 14
11:00:14 LOG:  log_functions, STMT, line 294, type 14
11:00:14 LOG:  log_functions, STMT, line 304, type 14
11:00:14 LOG:  log_functions, STMT, line 306, type 13
11:00:14 LOG:  log_functions, STMT, line 308, type 10
11:00:14 LOG:  log_functions, END, f52

I also got lucky here. All the time spent on this function is spent on the statement in line 75. So what I really need to know is why it takes so long to do this query. The search of performance for this query is not really relevant for this blog post. I only wanted to show how you can profile a PL/pgsql function easily without changing its source code.

As I think it may be a useful tool for many people, it's available on my github page, and on pgxn. Hope it can help people profiling their PL/pgsql functions.

To end this blog post, know that there are many things one can do with hooks in PostgreSQL. This is just one of them. I already explained how you can use another one to log only superuser queries. This new one will totally fit in with my talk on hooks at PGCon 2012 :)

pgAdmin and wxWidgets 2.9

pgAdmin is a multiplatform tool. This means that you can use it on various operating systems, such as Windows, Mac OS X, Linux, and even Solaris. Not sure if that last one is still supported. To be honest, I write bugfixes and new features on my Linux laptop (Fedora 16), and checks them, whenever I can, on Mac OS X (Snow Leopard, on my old Mac Mini) and Windows (XP, still on the old Mac Mini). We use wxWidgets to make it easier to code user interfaces for all platforms. pgAdmin supports wxWidgets 2.8. The developers of wxWidgets are working hard on the next big release, 3.0. To ease the jump, they have a 2.9 release since quite some time. This 2.9 version will become 3.0 "when it's ready".

I'm really happy to see that we are now compatible with 2.9. There are probably still some itches to fix, but it's already quite usable.

Peter Geoghegan was the first one to work on this, and wrote a big patch, that Dave Page applied on 2011-02-01. There were a few fixes here, and there. The second phase, done by Peter, was the merge of the wxOGL contrib into our source tree (we had the author's permission for that). And then another bunch of fixes, mostly from reports sent by Colin Beckingham, and fixed by me.

Right now, I have something that works great on Linux.

Why is it important? because it offers a better support of new features of the operating systems, and it fixes a lot of issues we had previously with wxWidgets 2.8 that the developers of wxWidgets couldn't fix (either because it would require too much changes, or because they don't have the manpower to do it). 2.9 also adds some widgets that we would definitely love to use :

  • a property grid ;
  • a list control with a tree in the first column (to display tree where each item has informations to display in columns) ;
  • a native calendar control for Linux (the previous one is a wxWidgets custom control, and is not really usable) ;
  • a tree control with checkbox (we have that via a custom widget, one that we could get rid of... remember, less code is good) ;
  • and probably many other interesting enhancements (for example, support for GTK3 would make my day).

If you can test pgAdmin 1.15 with wxWidgets 2.9, please send any issues to pgadmin-hackers.

An import tool for pgAdmin

With pgAdmin 1.14 and prior releases, there were only two ways to insert data in a table:

  • use the restore tool (which uses pg_restore) ;
  • use the query tool to execute INSERT queries (COPY queries are allowed if they use a file, but not stdin/stdout).

There was no way to ask it to import a CSV file for example. And a thread began on this matter a week ago. As working on the COPY API of libpq was quite motivating, I worked on this a few hours, and finished yesterday with this patch.

This patch adds a new window that allows a user to choose a file, and to select all options available in the COPY command (but only options available for import, not export).

This window is displayed when someone selects the "Import" item in the table's contextual menu, as shown in the following screenshot:

importmenu.png

The import window has 5 tabs:

  • "File options" tab makes possible to choose a file, sets its format (text, CSV, or binary), and optionaly its encoding.

import_win_1.png

  • "Columns" tab allows the user to specify which columns are contained in the file. Right now, they must be in the same order than the table (because of the UI), but they don't have to be all present in the file.

import_win_2.png

  • "Misc Options", "Quote Options", and "NULL options" tabs are not used usually, but it could be handy to set another delimiter than the usual comma, amongs other possibilities.

import_win_3.png import_win_4.png import_win_5.png

When the user clicks the "Import" button, pgAdmin launches a COPY query that matches the selected options, and sends the file by chunks of 8KB. The scrollbar shows the progress of the import. If the import ends with success, the "Import" button is renamed "Done", and a click on it closes the dialog.

That was a really interesting patch to write, for the UI, and the COPY API. And it is a quite nice feature for pgAdmin.

How to log superuser queries

A few weeks ago, one of my customers asked me if they could log all queries coming from superuser users. They have to make sure that none of their DBA could fire a query without that query being logged.

You can already log queries with the log_statement, or log_min_duration_statement GUCs. You can set these GUCs cluster-wide or for specific users. But there's a big issue here: the user can change the value of these GUCs, making it impossible to be sure everything they do is logged.

I first thought of writing a patch to handle this in PostgreSQL, but it also has a major issue: it won't be available for 9.0, the release they use. All new features are added to the development release, and I don't want to maintain a custom PostgreSQL release for anyone (that's too much burden). So, I came up with a nicer way of doing it.

PostgreSQL allows you to add hooks to the executor since 8.4. And you can use these hooks to know which queries are executed and by who. Actually, that's how pg_stat_statements works. So, all I needed to do was to create a small shared library that will add hooks to the executor. The hooks will check that the user executing a query is a superuser, and if that's true, log the query. Quite simple, here is the code of the hook:

static void
pgluq_log(const char *query)
{
    Assert(query != NULL);

    if (superuser())
        elog(log_level, "%s %s: %s", log_label, GetUserNameFromId(GetUserId()), query);
}

Of course, the library has more code than just that: to add/remove the hooks, to take care of two module GUCs (log level, and log prefix), but the most important stuff is on the pglug_log() function.

The module is available right now on github, and I intend to publish it via the pgxn.org website once my account will be approved. Docs are available here. It's compatible with PostgreSQL 8.4, 9.0, and 9.1.

Next thing I will do for this module is to add support for syslog. Right now, it only logs to the same location than PostgreSQL.

Follow-up on pgconf.eu 2011

Just got an email asking me where we did put the slides of the talks. As they are a bit difficult to find, here is the URL: http://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2011. Unfortunately, you won't find every speaker's slides yet. And for the record, I fixed my slides (two SQL statements were wrong... kinda lame of me).

I guess we should have the room host collects them at the end of each talk. That's something I expect us to do next year.

BTW, attendees, I don't know if you've all already done that, but remember that you can add feedbacks for the event itself and for each of the talks you've attended. Here is the webpage to do that: http://2011.pgconf.eu/feedback/.

End of pgAdmin's GSoC 2011

Luis Ochoa sent me his latest patch some time ago. It took me a while to find some time to review his patch, to fix a few bugs, but I finally did it. And I applied the resulting patch this afternoon.

What we now have is really awesome: reverse engineering of an existing schema. Even more, we can build an SQL script to create the reversed model or only apply the changes between the model and a selected database. It could be another database than the one used for the reverse engineering. One simple example to show its power. You can reverse engineer your schema on your development database, connect to your production database and ask the database designer to build the script that will contain only the SQL to push the definition changes.

Of course, right now, the code is probably quite buggy. And it lacks a lot of features. First one would be a better datatype support. Only a few types are accepted right now. User defined types aren't obviously but also some uncommon types (like tsvector), and common ones (timestamp with timezone for example). Also lacking is a support of all the table properties (fillfactor, autovacuum settings are part of the missing ones).

If you want to take a look, you should probably see the video attached to this blogpost. You'll see how the database designer works with the pagila database (I fixed a few types in the schema to show a better demo but that's the only changes I made).

If you want to see it live, maybe you should come to pgconf.eu in Amsterdam. Luis Ochoa and I will be presenting this tool, do some demos, and wait for suggestions, and feature requests.

Wanna help us adding new features to pgAdmin?

And you don't know how to start? I've been there when I first worked on pgAdmin's source code. You need to understand pgAdmin's code, and you need to know wxWidgets. And PostgreSQL of course. Trust me, it takes quite some time to feel confident.

But there are a few things you can do to help us in our coding department. There are some little features, eagerly waited by some of our users, that are quite easy to code and that could help you start coding on pgAdmin. We have a list of feature requests on http://code.pgadmin.org and I spent some time this evening to add a juniorjob tag to each easy-to-code feature. If you're interested, you should get a look at this junior job list. It would really help us to have some people working on these tickets.

And if you don't want to work on C++ code, you can still help us with the documentation. We'll try to have a complete new manual for 1.16, but we need help to write it. We also badly need translators.

Last month work on pgAdmin

Yeah, my last blog post on pgAdmin is a month old. Time to give you some news.

Let's begin with PostgreSQL compatibility. 1.14 won't handle security labels. We haven't find time to add support for that. Good news is that 1.16 will. We already have that in git. Here is how it looks on a table:

X

See the new "Security labels" tab? that's it. A better news is that we already support security labels on shared objects (I mean databases, tablespaces, and roles). And that's a 9.2 new feature.

X

Speaking of 9.2, we also added support for invalid domains and invalid check constraints.

invaliddomain.png

The user can create these objects without the validate step; They can validate them later.

I think that's about it on PostgreSQL compatibility.

I know pgAdmin for quite a long time, and I've often been bitten by this old bug: open a property dialog, refresh the object on the browser, change the object in the property dialog... and crash. This shouldn't happen again. Once a property dialog is open for an object, the user can't refresh the object or any of its parents. He cannot either drop the object or any of its parents. And he cannot close the connection. Till he closes the property dialog. I'm really glad we finally put an end to this bug. Unfortunately, it's rather a big change in behaviour, so it won't be backpatched.

Luis Ochoa continued his work on his 2011 GSoC project (the database designer). He added load/save support. Files are in XML format, and have a .pgd extension. He also added multimodel diagram, so that the user can view his schema in different ways. Here are two views of the same design:

X

X

And last, we redesigned the options window, so that we could put more options without making the dialog bigger. It follows what LibreOffice and other programs are already doing:

X

So, despite the summer vacations, we're still going forward and at a good pace.

Allow change of a schema using pgAdmin's GUI

That's something I wanted to do for quite some time, and Thom Brown beats me to it. He added a "Schema" combobox in each object that's part of a schema. For example, on a sequence, you'll get this:

sequence.png

Before this patch, the user needed to execute the "ALTER SEQUENCE... SET SCHEMA" in the query tool, switch back to the browser, and refresh it. Now, it only has to get into the properties dialog of the sequence, select the right schema in the combobox, and click on the "OK" button. Much simpler.

Of course, it works with all objects you can find in a schema.

Thanks, Thom.

Nice talks at char(11)

As Dimitri already said, char(11) is over. This was my first time at this event, and I don't regret it. The talks were mostly great. My favorites were Jan's talk on the evolution of replication, Magnus's talk on the replication protocol, and Simon's talk on the replication features in core. The GreenPlum talks were also really interesting to me because there's not much information out there on the internet about that fork and finally getting some informations feels good.

Oh, and Harald has been a huge room host, really dedicated to his work. So, congrats to him too.

Well done, 2ndQuadrant guys. Hope there will be a char(12) next year. (And don't forget to add the talks' slides on a website so that we could have another look at them. Thanks :) )

Search Object tool in pgAdmin

Something was missing in pgAdmin, something I didn't believe at first. Jasmin Dizdarevic came up some (long) time ago with a patch implementing a Search Object tool. After lots of discussion, I commited the final patch.

The contextual menu of each database contains a new menu item: "Search objects...".

menu.png

When you select this item, it'll open a new window asking you to enter a name pattern, and press the "Find" button. Once you've done that, pgAdmin will search each object whose name contains the pattern. It'll will display each object found in the results list.

searchobject_tool.png

When you click on one of the results, it will select the object in the browser.

Quite impressive feature if you ask me, and, sure, one that was missing. There are probably a few more tweaks that could be interesting to enhance this feature (like search in the comments, and objects's icons in the results list). What's great is that we still have plenty of time to do that :)

Anyway, thanks a lot to Jasmin Dizdarevic!

Database Designer is finally in

I commited it this morning. First part of Luis's GSoC project this year is now commited. pgAdmin has a graphical database designer. It's not pretty yet, it doesn't have all features I would love to see implemented, so it's rather raw. But it's there. And we can build something great with this.

Here is what I just did with it.

dd.png

I'll try to find a way to record a session of work with the database designer.

Anyway, it's available in the master branch of pgAdmin's GIT repository (http://git.postgresql.org/gitweb?p=pgadmin3.git;a=summary). Go grab it, and tell us what you think of it (but keep in mind it's its early stage :) ).

pgAdmin 1.14 beta 1 available

pgAdmin v1.14.0 Beta 1 is now available for testing purpose. You can grab a copy (source or binary) on http://www.postgresql.org/ftp/pgadmin3/release/v1.14.0-beta1/ .

There are many new features in this release. Not as many as I would have liked, but still. There are plenty new things I'm sure you'll like.

Of course, PostgreSQL 9.1 support was our first goal. And it seems we did pretty well on this one. The only thing left out, IIRC, is the security labels. All the other new objects are handled: collation, extension, foreign table. Some old objects are finally available: the SQL/Med ones of course, but also the exclusion constraint and the constraint trigger.

We try to make the UI better too. All windows should work on a 800x600 screen. Even if I'm not sure people still use such small screens.

I'm sorry we still don't have the now usual visual tour. I wrote it last week-end, but it still needs a review of an english native speaker.

Of course, we still find bugs. We already fixed a few. For example, Thom Brown found a lot of issues in the foreign table support. They are fixed. There was a segfault on the property dialog for the index. Fixed too. I also fixed some issues on the DDL for permissions, a bug found by Erwin Brandstetter.

All this means we really need you to test this beta release, and send us (on pgadmin-support mailing list for example) the bugs you found, so that we could get rid of them for the final release.

GSoC project "Development of improvements in pgAdmin physical database designer module" accepted

I'm a bit late to announce it, but Luis's project has been accepted.

Last year, he worked on a database designer. We had something nearly working. Still a few bugs here and there, and quite a lot of cleaning to do. I did some, but didn't had the time to do much of it. So we won't have that for the next release of pgAdmin.

But the good news is that Luis is back. The first part of this project is to clean and debug his code. That should be done in early june. We'll have a commitable patch then. Something I'll commit with pleasure :)

Then, his main work will be to add more, and more features to the database designer.

This will be a really interesting summer :-D

pgAdmin 1.12.3 available...

... since 2011-04-18. I guess we're not good at announcing things :-/

This bugfix release is a long awaited one. We missed to get one out when PostgreSQL 9.0.3 went out. So, this release contains a lot of bugfixes (26). The complete list is available in the changelog. The most interesting patch fixes the issue when a user tried to dump a database whose name is in mixed case. That one was asked so many time, it is just great the release having the fix is finally out.

And while I'm blogging, remember that pgAdmin 1.14 beta 1 will be out next week.

Quick reminder: only three days to apply as a student to GSoC 2011

If you're a student and interested in working on a PostgreSQL project such as PostgreSQL itself, pgAdmin, phpPgAdmin, pgPool-II, you should really hurry to apply as a student and submit a proposal.

There are still three days left to do so.

Attendees of PGDay.eu 2010, send your feedback please

Now that PGDay.eu 2010 is over, this is a good time for us to think about what we did right and, perhaps more important, what we did wrong. You can help us on that work by telling us what you think about the venue, the caterer, and the talks (and I mean, each of them, including the keynote).

Thanks.

Bugfixes for pgAdmin

Now that most of the already commited 9.1 features are supported by pgAdmin, I realized that we had quite a lot of bugs to fix. 14 if trac is up to date. I'll work on these starting tonight and during next week. Obviously, I won't be able to fix those on Solaris, but I would really like us to have a much smaller number of bugs.

If you know of a bug on pgAdmin, we would like to know about it too. You've got two mailing lists to let us know: pgadmin-hackers and pgadmin-support.

- page 1 de 2