A simple custom aggregate

Custom units (custom aggregates) are one of the unique features of PostgreSQL that seems to be beyond the power of anyone. However, as soon as to create at least one really working example, the clouds open up, and you'll wonder how you ever lived without such hardened features. So let's create a simple unit. It will return the state, i.e. the most frequently occurring value, logical (Boolean) fields.

Why would someone want strange? Well, imagine you day and night to monitor a few of your web servers and you want to have an hourly pattern of uptime (uptime). Let's say every 30 seconds, the server status is entered in the table: TRUE — the server is running, FALSE — the server is. Then if most of the time the server was working, we want to return TRUE, and if for the most part, the server stood, respectively, then FALSE. And if the system monitor is and, therefore, there is no data will return NULL.

Of course, all this can crank and using a variety of other mechanisms, for example using the mechanism of WINDOW. However, imagine that in one query you will need to work with other gathered statistics, for example, downtime or server. In this case, PostgreSQL offers a graceful mechanism.

First, we need an aggregate function, which will accumulate the data about the field is Boolean. Usually, this function has two input parameters:
the
    the
  • option to store the calculated values (because each row will call this function);
  • the
  • the type parameter of the column in which the value for the current row.


Let's say we want to store the number of readings UP and DOWN for the server. You can use an integer array. With the same success it can be done with composite type, approx. interpreter. Such a function is easy to write even in pure SQL:

the
CREATE OR REPLACE function mode_bool_state(int[], boolean) RETURNS int[]
LANGUAGE sql
as $body$
SELECT CASE $2
WHEN TRUE THEN
array[ $1[1] + 1, $1[2] ]
WHEN FALSE THEN
array[ $1[1], $1[2] + 1 ]
ELSE
$1
END;
$body$;


Note that the result of the function int[] will be filed as the first parameter on entry to this function, when you call for the next line, approx. interpreter.

For decision-making and output the final result will create one more function:
the
CREATE OR REPLACE FUNCTION mode_bool_final(INT[]) RETURNS boolean
LANGUAGE sql
as $body$
SELECT CASE WHEN ( $1[1] = 0 AND $1[2] = 0 )
THEN NULL
ELSE $1[1] >= $1[2]
END;
$body$;


The case for small — to declare the Assembly:
the
CREATE AGGREGATE mode(boolean) (
SFUNC = mode_bool_state,
STYPE = INT[],
FINALFUNC = mode_bool_final,
INITCOND = '{0,0}'
);


Here SFUNC and FINALFUNC — the names of our functions, STYPE data type for statistics collection, INITCOND initial conditions.

Let's see how it works!

the
SELECT server_name,
sum(CASE WHEN server_up THEN 0.5 ELSE 0 END) as minutes_up,
mode(server_up) as mode
FROM servers 
WHERE montime BETWEEN '2013-04-01' and '2013-04-01 01:00:00';


the
server_name minutes_up mode
web1 56.5 TRUE
web2 0.0 FALSE
web3 48.0 TRUE
web4 11.5 FALSE


PS Article in English Tom brown also tells the story of how to create a custom unit. The author does not use the final optional FINALFUNC, because the type of data collection STYPE in his example, the same as the base type of the aggregate.
Article based on information from habrahabr.ru

Comments

Popular posts from this blog

Powershell and Cyrillic in the console (updated)

Active/Passive PostgreSQL Cluster, using Pacemaker, Corosync

Automatic deployment ElasticBeanstalk using Bitbucket Pipelines