Course of the young fighter PostgreSQL

I want to share with you some useful techniques for working with PostgreSQL (and other DBMS have similar functionality, but can have different syntax).

I will try to cover a variety of topics and techniques that will help when working with data, trying not to go into a detailed description of a functionality. I loved these articles, when studied independently. It's time ~~to pay tribute to free Internet self-education and~~ to write your own article.

This material will be useful to those who fully mastered the basic skills of SQL and want to learn more. I advise you to run and experiment with the examples in the pgAdmin'e, I did all the SQL queries feasible without expanding any of the dumps.

Go!

the

1. The use of temporary tables

When solving complex problems hard to put the decision in one request (although many try to do so). In such cases, it is convenient to put any intermediate data.econtinuing the table to use them in the future.

These tables are created as usual, but with the keyword TEMP, and are automatically deleted after the session.

The key ON COMMIT DROP automatically deletes a table (and all related objects) at the completion of the transaction.

Example:
the

ROLLBACK;
BEGIN;
CREATE TEMP TABLE my_fist_temp_table -- should use a more unique name
ON COMMIT DROP -- drop the table at the end of the transaction
AS 
SELECT 1 AS id, CAST ('some value' AS TEXT) AS val;

------------ Additional manipulation of the table: ------------------

-- alter the table by adding a column. I will often touch on related topics
ALTER TABLE my_fist_temp_table 
ADD COLUMN is_deleted BOOLEAN NOT NULL DEFAULT FALSE;
- for those who do not know, most often the  data  in the tables are not deleted, but are marked as deleted with a similar flag

CREATE UNIQUE INDEX ON my_fist_temp_table (lower(val))
WHERE is_deleted = FALSE; -- you could even create an index/constraint, if necessary
-- this index will not allow to insert a duplicate(not case sensitive) for column VAL for non-deleted rows

-- manipulate table data
UPDATE my_fist_temp_table 
SET id=id+3; 

- to check/use the contents of a table
SELECT * FROM my_fist_temp_table;
--COMMIT;

the

2. Commonly used shorthand syntax Postgres

the

Convert data types.

Expression:
the

SELECT CAST ('365' AS INT);

you can write less cumbersome:
the

SELECT '365'::INT;

the

Shorthand (I)LIKE '%text%'

LIKE perceives template expression. Details the manual
the operator LIKE can be replaced by ~~ (two tildes)
the operator ILIKE can be replaced by ~~* (two tilde with an asterisk)

Search regular expressions (has non-LIKE syntax)
the ~ (a single tilde) takes a regular expression
the ~* (one tilde and an asterisk) are case-insensitive version of ~

Give an example of search in different ways the rows that contain the word text
the the the the the

Abbreviated syntax	Description	Analog (I)LIKE
~ ‘text’ or ~~ ‘%text%’	Verifies that the expression is case-sensitive	LIKE '%text%'
~* ‘text’ ~~* ‘%text%’	Validates the expression case insensitive	ILIKE '%text%'
!~ ‘text’ !~~ ‘%text%’	Checks the mismatch expression, case sensitive	NOT LIKE '%text%'
!~* ‘text’ !~~* ‘%text%’	Checks the discrepancy between the expression case insensitive	NOT ILIKE '%text%'

the

3. Common table expressions (CTE). Design WITH

Very comfortable design, allows you to put the query result into a temporary table and then use it.

Examples are primitive to capture the essence.

a) a Simple SELECT

the

WITH cte_table_name AS ( -- set we were comfortable with the table name
SELECT schemaname, tablename, - our any request

ORDER BY 1,2
)
SELECT * FROM cte_table_name; -- specify our table
--in fact receive the result of the query in parentheses

This way you can 'wrap' any requests (even UPDATE, DELETE and INSERT, this will be below) and use their results in the future.

b) you Can create multiple tables, listing them in the following way

the

WITH
table_1 (col,b) AS (SELECT 1,1), -- first table
table_2 (col,c) AS (SELECT 2,2) -- second table
--,table_3 (cool,yah) AS (SELECT 2,2 from table_2) -- recently learned that you can refer to the table above
SELECT * FROM table_1 FULL JOIN table_2 USING (col);

c) you Can even put the above design in one (or more) WITH

the

WITH super_with (col,b,c) AS ( /* can set the column names in parentheses after the table name */
WITH
table_1 (col,b) AS (SELECT 1,1),
table_2 (col,c) AS (SELECT 2,2) 
SELECT * FROM table_1 FULL JOIN table_2 USING (col)-- specify our table
)
SELECT col, b*20, c*30 FROM super_with;

Performance should say that it is not necessary to place in the section WITH data that will largely be filtered subsequent to the external environment (outside the parentheses of the query), because the optimizer will not be able to build an effective query. Convenient to put in CTE the results that you want some time to apply.

the

4. Function array_agg(MyColumn).

Values in a relational database is stored separately (the attributes of one object can be represented in multiple rows). To transfer data to any application, you often need to collect data in one row (cell) or array.
In PostgreSQL there is a function array_agg(), she allows to collect in an array the data in the entire column (if the sample from one column).
When you use the GROUP BY in the array gets the data for a column for each group.

Immediately describe another feature and move on to the example.
array_to_string(array[], ';') allows you to convert the array into a string: the first parameter specifies an array, the second we were comfortable with the delimiter in single quotes (apostrophes). As delimiter you can use

special characters

Tab \t — for example, will allow you to insert cells in EXCEL without effort to split the values into columns (you can use it like this: array_to_string(array[], E'\t') )
Line \n — decomposed array values across rows in a single cell (use array_to_string(array[], E'\n') — 'll explain why below)

Example:
the

-- create and fill data table with above described method
WITH my_table (ID, year, any_val) AS ( 
VALUES (1, 2017,56)
,(2, 2017,67)
,(3, 2017,12)
,(4, 2017,30)
,(5, 2020,8)
,(6, 2030,17)
,(7, 2030,50)
) 
SELECT year
,array_agg(any_val) -- collect data (for each year) into an array
,array_agg(any_val ORDER BY any_val) AS sort_array_agg -- the order of the items can be sorted (with 9+ versions of Postgres) 
,array_to_string(array_agg(any_val),';') -- convert an array into a string
,ARRAY['This', 'is', 'my' , 'array'] AS my_simple_array is a way of creating an array
FROM my_table 
GROUP BY year; -- group data by year

Will give the result:

Perform the back action. Decompose the array into a string using the function UNNEST, at the same time demonstrate the design SELECT columns INTO table_name. To put this in a spoiler so that the article is not much swollen.

UNNEST a query

-- 1 Preparatory phase
- in the process request table will be created tst_unnest_for_del, with the SELECT INTO 
-- below query is not generated an error, if you will several times to run this script, will the script delete the table.
-- I also hope that you start this is not your production server any project where there is a table

DROP TABLE IF EXISTS tst_unnest_for_del; /* IF EXISTS will not cause errors if the tables to be deleted does not exist */

WITH my_table (ID, year, any_val) AS ( 
VALUES (1, 2017,56)
,(2, 2017,67)
,(3, 2017,12)
,(4, 2017,30)
,(5, 2020,8)
,(6, 2030,17)
,(7, 2030,50)
) 
SELECT year
,array_agg(id) AS -- collect data arr_id(id) for each year in the mountains

INTO tst_unnest_for_del -- !! way to create and populate the tables from the result
FROM my_table 
GROUP BY year;

--2 Demonstration functions Unnest
SELECT unnest(arr_id) unnest_id -- parse the id column
that year
,unnest(arr_any_val) unnest_any_val -- parse the column any_val
FROM tst_unnest_for_del 
ORDER BY 1 -- restore the sorting by id, no forced sorting data can be arranged randomly

The result:

the

5. Keyword RETURNIG *

is specified after the query INSERT, UPDATE or DELETE allows you to see the rows affected by the modification (usually the server reports only the number of modified rows).
Convenient in conjunction with BEGIN to see what impact the request, in case of uncertainty in the result or to transmit any id to the next step.

Example:
the

--1
DROP TABLE IF EXISTS for_del_tmp; /* IF EXISTS will not cause errors if the tables to be deleted does not exist */
For_del_tmp -- CREATE TABLE Create table
AS --Fill data generated from the query below
SELECT generate_series(1.1000 level) AS id, -- Generated 1000 numbered rows 
random() AS values; -- Fill with random numbers

--2
DELETE FROM for_del_tmp 
WHERE id > 500
RETURNING *; 
/*Will show all the deleted rows this command 
RETURNING * will return all columns of the table test 
you can also list the columns in the SELECT (approx. RETURNING id,name)*/

Can be used in conjunction with CTE, organize ~~insane~~ an example.

I am very confuse, fear that came hard, but I tried to comment on.

the

--1
DROP TABLE IF EXISTS for_del_tmp; /* IF EXISTS will not cause errors if the tables to be deleted does not exist */
For_del_tmp -- CREATE TABLE Create table
AS --Fill data generated from the query below
SELECT generate_series(1.1000 level) AS id, -- Generated 1000 numbered rows 
((random()*1000)::INTEGER)::text as values; /* Fill with random numbers. PS I Have Postgre 9.2 Random() returns a fractional number less than one, multiply by 1000 to obtain the integer part, then convert to INTEGER to get rid of the fractional part and convert to text, because I want the data type of created column TEXT*/

--2
DELETE FROM for_del_tmp 
WHERE id > 500
RETURNING *; -- This query simply deletes the record, returning the deleted rows on the screen

--3
WITH deleted_id (id) AS
(
DELETE FROM for_del_tmp
WHERE id > 25
RETURNING id -- removed another piece of data, recording an id in our CTE "deleted_id"
)
INSERT INTO for_del_tmp -- initiate INSERT
SELECT id, 'Deleted row' || now()::TIME || 'and if to be exact,' | | timeofday()::TIMESTAMP /* here you can see how the time difference returned by functions (depends on the description of the functions will not go deep, and so far)*/
FROM deleted_id -- insert the deleted data from the "for_del_tmp" in it
RETURNING *; -- see what poinsettias
- the whole unit can be run indefinitely, we insert the deleted data in the same table.

--4
SELECT * FROM for_del_tmp; -- check what happened in the end

Thus, to perform the data deletion, and deleted values will be passed to the next stage. It all depends on your imagination and goals. Before applying intricate designs be sure to check the documentation for your version of DBMS! (with a parallel combination of INSERT, UPDATE or DELETE are the subtleties).

the

6. Saving query result to a file

The team COPY many different settings and assignments, will describe the simplest application for review.
the

COPY (
SELECT * FROM pg_stat_activity /* Our request. For example: system table of running processes of the database */
--) TO 'C:/TEMP/my_proc_tst.csv' -- Write the query result to a file. Example for Windows
) TO '/tmp/my_proc_tst.csv' -- the Record of a query result to a file. Example for LINUX
--) TO STDOUT -- displays the data to the console or log pgAdmin
WITH CSV HEADER -- Optional string. Reports the name of the table columns in file

the

7. Execution of a query on another database

Not so long ago found out that you can address a request to the other base, this is the function dblink (all the details in the manual)

Example:
the

SELECT * FROM dblink(

'SELECT "Remote database:" || current_database()' /* has its own nuances and limitations. As an example, the request is passed in single quotes, so the quotes inside the query should be escaped (in this example, for shielding use two single quotes in a row). */
) 
RETURNS (col_name TEXT)
UNION ALL
SELECT 'Current database:' || current_database();

If an error occurs:

"ERROR: function dblink(unknown, unknown) does not exist"

you must install the extension with the following command:
the

CREATE EXTENSION dblink;

the

8. The similarity function

A function for determining the similarity of one value to another.

Used to map text data that were similar, but not equal to each other (there were typos). Saved a lot of time and nerves, minimizing manual binding.
similarity(a, b) produces a fractional number from 0 to 1, the closer to 1, the more accurate the match.
Let us turn to the example. Using WITH set up a temp table with fictional data (and specifically warped for demonstration functions), and will compare each line of our text. In the example below we look for something more like , OOO "ROMASHKA" (substitute the second parameter of the function).
the

WITH company (id,c_name) AS (
VALUES (1, 'OOO Daisy')
UNION ALL
/* PS UNION ALL is faster than UNION, because there is no forced sorting to eliminate duplicates, which we do not need in this case */
VALUES (2, 'OOO "ROMASHKA"')
UNION ALL
VALUES (3, 'OOO Ramashka')
UNION ALL
VALUES (4, 'JSC "ROMAXA"')
UNION ALL
VALUES (5, 'JSC DAISY')
UNION ALL
VALUES (6, 'OOO RO MASHKA')
UNION ALL
VALUES (7, 'OOO BARNYARD')
UNION ALL
VALUES (8, 'ZAO DAISY')
UNION ALL
VALUES (9, 'How did this get here?')
UNION ALL
VALUES (10, 'Daisy 33')
UNION ALL
VALUES (11, 'SP "Romashkevich"')
UNION ALL
VALUES (12, 'OOO "Roma Skovic"')
UNION ALL
VALUES (13, 'FE "Roma Skovic"')
)
SELECT *, similarity(c_name,', OOO "ROMASHKA"')
,dense_rank() OVER (ORDER BY similarity(c_name,', OOO "ROMASHKA"') DESC) 
AS "result Ranking" -- window functions, it is described below
FROM company
WHERE similarity(c_name,', OOO "ROMASHKA"') >0.25 -- values from 0 to 1, the closer to 1, the more accurate the match
ORDER BY similarity DESC;

We get the following result:

If an error occurs

"ERROR: function similarity(unknown, unknown) does not exist"

you must install the extension with the following command:
the

CREATE EXTENSION pg_trgm;

a more complex Example

WITH company (id,c_name) AS ( -- input data table
VALUES (1, 'OOO Daisy')
,(2, 'OOO "ROMASHKA"')
,(3, 'OOO Ramashka')
,(4, 'JSC "ROMAXA"')
,(5, 'JSC DAISY')
,(6, 'OOO RO MASHKA')
,(7, 'OOO BARNYARD')
,(8, 'ZAO DAISY')
,(9, 'How did this get here?')
,(10, 'Daisy 33')
,(11, 'SP "Romashkevich"')
,(12, 'OOO "Roma Skovic"')
,(14, 'FE "Roma Skovic"')
,(13, 'OOO RAGA AND CAPITA')
),
compare (id, need) -- AS our base for comparison
(VALUES (100500, 'OOO "ROMASHKA"')
,(9999, 'OPEN COMPANY "HORNS AND HOOFS"')
)

SELECT c1.id, c1.c_name, 'compare with' || c2.need, similarity(c1.c_name, c2.need) 
,dense_rank() OVER (PARTITION BY c2.need ORDER BY similarity(c1.c_name, c2.need) DESC) 
AS "result Ranking" -- window functions, it is described below
FROM company c1 CROSS JOIN c2 compare
WHERE similarity(c_name, c2.need) >0.25 -- values from 0 to 1, the closer to 1, the more accurate the match
ORDER BY similarity DESC;

Get this result:

Sortable similarity DESC. The first results we see the most similar row (1 is perfect similarity).

Optionally print the value of similarity SELECT, you can simply use it in the WHERE clause similarity(c_name,', OOO "ROMASHKA"') >0.7
and ask yourself suits us the option.

PS I would be grateful if you tell me what are some other ways of mapping the text data. Tried to remove regular expressions everything except the letters/numbers, and compare for equality, but this option does not work if there are typos.

the

9. The window function OVER() (PARTITION BY ORDER BY )

Almost described in your draft this is a very powerful tool, found ~~(sadness and joy)~~ that such a quality article on this subject already exists. I see no reason to duplicate information, so we recommend to read this article (link — habrahabr.ru/post/268983/, the author kowtow ) to those who do not know how to use window functions in SQL.
the

10. Multiple template LIKE

Task. You want to filter the list of users whose names must conform to certain templates.

As always, will present the simplest example:
the

-- Create table with data
CREATE TEMP TABLE users_tst (id, u_name) 
AS (VALUES (1::INT, NULL::VARCHAR(50))
,(2, 'Juliana H.')
,(3, 'Simon I.')
,(4, 'Victoria T')
,(5, 'Olga S.')
,(6, 'Elizabeth I.')
,(7, 'Nicholas H.')
,(8, 'Isaac R.')
,(9, 'Elisha A.')
);

Have a query that performs its function, but becomes cumbersome when a large number of filters.
the

SELECT * FROM users_tst
WHERE u_name LIKE'%' 
OR u_name LIKE '%AA%' 
OR u_name LIKE 'Juliana H.'
OR u_name LIKE 'Elisha%'
-- etc

Will show you how to make it more compact:
the

SELECT * FROM users_tst
WHERE u_name LIKE ANY (ARRAY ['%', '%AA%', 'Juliana H.', 'Elisha%'])

You can do interesting tricks, using a similar approach.
Write in the comments if you have any thoughts on how to rewrite the original query.

the

11. Some useful functions

NULLIF(a,b)
There are situations when a certain value should be interpreted as NULL.
For example, a zero-length string ( " — empty string) or zero(0).
You can write CASE, but more concise to use the NULLIF function which has 2 parameters, with the equality which returns NULL, otherwise prints out the original value.
the

SELECT id
,param
,CASE WHEN param = 0 THEN NULL ELSE -- param END the decision through a CASE
,NULLIF(param,0) -- solution using NULLIF
val 
FROM(
VALUES( 1, 0, 'IN the column on the left was 0' )
) AS tst (id,param,val);

COALESCE selects the first not NULL value
the

SELECT COALESCE(NULL,NULL,-20,1,NULL,-7); --choose -20

GREATEST selects the maximum value of the following
the

SELECT GREATEST(2,1,NULL,5,7,4,-9); --choose 7

LEAST selects the minimum value of the following
the

SELECT LEAST(2,1,NULL,5,7,4,-9); -- choose -9

PG_TYPEOF shows the data type of the column
the

SELECT pg_typeof(id), pg_typeof(arr), pg_typeof(NULL) 
FROM (VALUES ('1'::SMALLINT, array[1,2,'3',3.5])) AS x(id,arr);
- will show smallint, numeric [], and unknown, respectively

IT stopping unwanted processes in a database
the

SELECT pid, query, * FROM pg_stat_activity -- table process database. In older versions of postgres column PID was called PROCPID
WHERE state <> 'idle' and pid < > pg_backend_pid(); -- exclude connection and its just a process called

SELECT pg_terminate_backend(PID); /* substitute in here the PID of the process which we want to stop, unlike commands, send a gentle signal to end, which may not always kill the process*/
SELECT it(PID); /* substitute in here the PID of the process which we want to stop. Almost guaranteed kills the query, something like KILL -9 in LINUX */

12. Escaping characters

Start with the basics.
In SQL, string values are enclosed in ' an apostrophe (single quote).
Numeric values are not framing apostrophes, and to separate the fractional part you need to use a period, since the comma is interpreted as separator
the

SELECT 'My text', 365, 567.6, 567,6

the result:

All is well, until not want to output the apostrophe character '
For this there are two methods of screening (known to me)
the

SELECT 1, 'Apostrophe," and two apostrophes in a row "" ' -- escaping double writing "
UNION ALL
SELECT 2, E Apostrophe \' and two apostrophes in a row, \ ' \ '' -- escaping with a backslash, , English letter E before the first quotation mark is necessary so that the \ character was perceived as an escape character

the result is the same:

In PostgreSQL there is a more convenient way to use the data without escape characters. In framed by two dollar signs $$ line, you can use almost any character.

Example:
the

select $$is optional to write " just to pull out an apostrophe ', or to bother with E'\' $$

receive the data in its original form:

If this is not enough, and inside you want to use two dollar symbol in a row $$, Postgres allows you to specify a "delimiter". It is only between two dollars to write your text, for example:
the

select $uniq_tAg$ is not necessary to write " just to pull out an apostrophe ', or to bother with E'\', bramlette $$ or $any_text$ $uniq_tAg$

See our text:

This method has opened not so long ago, when I began to study writing functions.

the

Conclusion

We hope this information will help you to learn a lot of new beginners and "srednicki". I myself am not a developer, but I can only call myself a fan of SQL, so how to use the techniques is up to You.

I wish you success in learning SQL. Waiting for comments and thanks for reading!

Article based on information from habrahabr.ru

Search This Blog

computer express