Databene Benerator: benerate it

"Get a dictionary and look up what "catharsis" is. If that's what he wants to soak us with, I want to know what it is." (c) Analyze This


Introduction


Late in the evening, when the design of a 64-table database was almost complete and the interface for filling it in had not even been started, the question arose of how to populate the tables with data.
Fill them in by hand? That idea was thrown aside immediately.
"Code something up!" screamed the soul.
"Download something!" insisted the mind.
After scouring the Internet and turning up a dozen different solutions, installable and SaaS, paid and free, I found it: databene-benerator, a generator of linked data (fixtures) for databases. There was an article in Russian describing it and its syntax (1), and the same article in English (2). I realized it was exactly what I needed! But where do I get it? How do I use it under Windows? Is it convenient? Does it support Russian characters?

So, "catharsis" (3) is a concept from ancient philosophy: a term for the process and result of the relieving, cleansing, and ennobling effect that various factors have on a person.

How is this connected with the topic of the post? You will understand once you read it. Welcome under the cut!


Creating a project in Eclipse


What the aforementioned articles describe about benerator did not quite suit me:
  1. I use Windows;
  2. I like a GUI (a weakness, like a cat's for... well, you understand);
  3. I know how to work with MySQL, but not with PostgreSQL;
  4. I also need the data in Russian.

If any of the above applies to you, then you need a different path, or rather a different entrance to the same path!

First we need to obtain benerator. To do so, fill out the form at
bergmann-it.de/download/download_benerator?lang=en
and click "Download".

At the time of publication version 0.9.8 was available; I used 0.9.7. You will most likely not notice the difference, since the most recent manual I could find (4) is for version 0.8.1.

I stumbled on it by accident, while comparing the version of the manual on the website (http://databene.org/download/databene-benerator-manual-0.7.6.pdf) with the version of benerator. I started trying version numbers in the manual's address, and to my surprise found 0.8.1! Further searching was not crowned with success...

So, done! In our hands, or rather at our fingertips, is the archive "databene-benerator-0.9.7" (yours may be newer). Now, what do we do with it?

Extract it to "D:\databene-benerator-0.9.7".

Then pure shamanism began: the forums mentioned Maven. What kind of beast that is I do not know, but I can say that everything works without it!
A simple look at what is in the archive shows batch files (and sh scripts too, by the way) that launch things: benerator.bat calls benerator_common.bat, which starts java.exe. The first takes benerator.xml as its parameter; the second contains the path to the lib folder with its *.jar files.
At that time I had tried only two IDEs for Java development: NetBeans and Eclipse. I asked Google "databene Benerator with eclipse" and got "databene.org/databene-benerator/115-my-first-ide-based-benerator-project.html" in the results, yet no page of the official website links to it!


Now all we need is Eclipse: download and extract it if you do not have it yet. Any version will do. I am somewhat familiar with PHP, so you can guess my choice. By the way, the Eclipse perspective I use for working with benerator is also PHP (you can select it in the upper right corner).
So, launch Eclipse and create a project:
Select "File->New->Project...".

Then select "Java Project" and click "Next->".
In the window that appears, enter the project name "generatedb", set the Project layout to "Use project folder as root for sources and class files", and click "Next->".
Switch to the Libraries tab and click "Add External JARs...". In the window that opens, go to "D:\databene-benerator-0.9.7\lib" and select all the files there.
However, to run benerator we still need to set up a run configuration!
Select "Run->Run Configurations...".

In the window that appears:
1. Right-click "Java Application" and select "New".
2. In "Name", specify a name for the run configuration.
3. Leave "Project" unchanged.
4. In "Main class", enter "org.databene.benerator.main.Benerator".
5. Click "Apply".
If you click "Run" now, the "Console" tab will print a large number of curses, not all of them in Russian. That is because we have not done the most important thing yet. So what are we waiting for?


Project structure


It is time to add to our project the files "benerator.xml" and "log4j.xml", whose absence benerator was cursing about.
Right-click the project in Project Explorer, select "New->XML File", enter the file name, and click "Finish".
benerator.xml is the main project file; it describes everything you will do with your tables.
log4j.xml is the logging configuration file; its settings determine how much service information benerator prints to the console.

Bring the content of "log4j.xml" to the following form:
log4j.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">

<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="false">

<!-- Append messages to the console -->
<appender name="CONSOLE" class="org.apache.log4j.ConsoleAppender">
<param name="Target" value="System.out"/>
<param name="Threshold" value="debug"/>
<layout class="org.apache.log4j.PatternLayout">
<param name="ConversionPattern" value="%d{ABSOLUTE} %-5p (%t) [%c{1}] %m%n"/>
</layout>
</appender>

<!-- Limit categories -->

<category name="org.apache">
<priority value="warn"/>
</category>

<category name="org.databene">
<priority value="info"/>
</category>

<!-- <category name="org.databene.commons">
<priority value="debug"/>
</category> -->

<category name="org.databene.COMMENT">
<priority value="debug"/>
</category>

<category name="org.databene.benerator.STATE">
<priority value="info"/>
</category>

<category name="org.databene.domain">
<priority value="info"/>
</category>

<category name="org.databene.SQL">
<priority value="debug"/>
</category>

<!-- ======================= -->
<!-- Setup the Root category -->
<!-- ======================= -->

<root>
<priority value="info"/>
<appender-ref ref="CONSOLE"/>
</root>

</log4j:configuration>



Bring the content of "benerator.xml" to the following form:
benerator.xml
<?xml version="1.0" encoding="UTF-8"?>
<setup xmlns="http://databene.org/benerator/0.9.7"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://databene.org/benerator/0.9.7 benerator-0.9.7.xsd"
defaultEncoding="UTF-8"
defaultDataset="EN"
defaultLocale="EN"
defaultLineSeparator="\r\n"
defaultSeparator=";">

<import platforms="db csv" />
<import defaults="true" domains="organization,address,person,net" />
<import class="org.databene.benerator.distribution.function.*,
org.databene.benerator.primitive.*,org.databene.platform.db.*"/>
<import class="org.databene.commons.TimeUtil"/>

<database id="db"
url="jdbc:mysql://localhost:3306/qs?characterEncoding=UTF-8"
driver="com.mysql.jdbc.Driver"
user="root"
password=""
catalog="qs"/>

<memstore id="memstore"/>

</setup>




Main script and some of the working solutions


I will not dwell on "benerator.xml" here: it is covered in more detail in the first Russian-language article and in the manual for version 0.8.1.
I will only note that the magic string "characterEncoding=UTF-8" in the url parameter solves the problem of transferring Russian characters (and not only Russian ones) into the database.

The developers forgot to mention that as well. Then again, it is not their concern: this is the generic JDBC driver configuration string for Java applications. I found it on a resource that had nothing to do with what I was searching for.

Clear database tables before re-generating
To begin, prepare the file "truncate_tables.mysql.sql" (text)
truncate_tables.mysql.sql
SET foreign_key_checks = 0;
-- truncate table s_person;
-- truncate table s_job_title;
-- truncate table s_organization;
-- truncate table s_department;
-- truncate table t_orgstructure;
-- truncate table s_type_project;
-- truncate table s_direction_project;
-- truncate table s_norm_labor;
-- truncate table s_timetable;

SET foreign_key_checks = 1;



The first and last lines disable and re-enable foreign key checks. Otherwise one table may block the deletion of records from another (referential integrity).
1. First comes a separate group of reference tables (not linked to any others).
2. Next come the cascading groups, which can build on tables that are already filled.
Clean either a whole group or one table at a time, but watch out for dependencies.
Commenting tables out is convenient because it gives the flexibility to regenerate only the data that is needed.

In benerator.xml, after the memstore definition, add the lines:
<comment>Preparing the database</comment>
<execute uri="truncate_tables.mysql.sql" target="db" />

Beware! Once benerator starts, every table that is not commented out will be cleared!

Dropping tables and creating the database: in my opinion it is not worth doing such manipulations here. There are handy tools for synchronizing models and databases; I use MySQL Workbench (5).

Integrating scripts into benerator in... JavaScript!? Yes, yes, it is possible!
In benerator.xml, after the execute element shown above, add the lines:
<comment>Prepare external scripts</comment>
<execute uri="script.js" type="js"/>

Create the file "script.js" (text)
script.js
function toLink(str) {
    var space = ''; // separator inserted in place of any other character
    str = str.toLowerCase();
    // transliteration table: Russian letter -> Latin equivalent
    var transl = {
        'а': 'a', 'б': 'b', 'в': 'v', 'г': 'g', 'д': 'd', 'е': 'e', 'ё': 'e', 'ж': 'zh',
        'з': 'z', 'и': 'i', 'й': 'j', 'к': 'k', 'л': 'l', 'м': 'm', 'н': 'n',
        'о': 'o', 'п': 'p', 'р': 'r', 'с': 's', 'т': 't', 'у': 'u', 'ф': 'f', 'х': 'h',
        'ц': 'c', 'ч': 'ch', 'ш': 'sh', 'щ': 'sh', 'ъ': space, 'ы': 'y', 'ь': space, 'э': 'e', 'ю': 'yu', 'я': 'ya'
    };
    var link = '';
    for (var i = 0; i < str.length; i++) {
        if (/[а-яё]/.test(str.charAt(i))) { // Russian letter: replace it
            link += transl[str.charAt(i)];
        } else if (/[a-z0-9]/.test(str.charAt(i))) {
            link += str.charAt(i); // Latin letter or digit: keep it as is
        } else {
            if (link.slice(-1) !== space) link += space; // anything else: insert the separator, but not twice in a row
        }
    }
    return link;
}

// Returns cutEnd characters of str starting at cutStart
// (substr takes a length as its second argument, not an end index)
function cut(str, cutStart, cutEnd) {
    return str.substr(cutStart, cutEnd);
}



The first function transliterates Russian characters into Latin ones (taken from (6), with slight modifications).
The second cuts a piece out of a string.
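
As a rough illustration of how the two helpers behave outside benerator, here is a minimal self-contained sketch that can be run in Node.js. The shortened map, the function names toLinkSketch and cutSketch, the sample name, and the sample domain are all assumptions made for this example and are not part of the project. It also shows that the second argument of substr is a length rather than an end index.

```javascript
// Hypothetical, shortened transliteration map: only the letters needed below.
var translit = {
  'и': 'i', 'в': 'v', 'а': 'a', 'н': 'n', 'п': 'p',
  'е': 'e', 'т': 't', 'р': 'r', 'о': 'o'
};

// Same idea as toLink: lowercase the input, map Russian letters,
// keep Latin letters and digits, silently drop everything else.
function toLinkSketch(str) {
  var link = '';
  str = str.toLowerCase();
  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    if (/[а-яё]/.test(ch)) {
      link += translit[ch] || ''; // letters missing from the short map are dropped
    } else if (/[a-z0-9]/.test(ch)) {
      link += ch;
    }
  }
  return link;
}

// Same call as cut in script.js: a start index and a LENGTH.
function cutSketch(str, start, length) {
  return str.substr(start, length);
}

// Sample values are invented for the illustration.
var email = toLinkSketch('Иван' + 'Петров') + '@' + 'example.org';
var theme = cutSketch('A seed sentence that is far too long for the column', 0, 10) + '.';

console.log(email);
console.log(theme);
```

Because the second argument is a length, a call such as cut(theme_tmp,0,44) followed by appending '.' yields at most 45 characters, which fits a column with maxLength="45".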

An example of using JavaScript in a generation script:
<generate type="s_organization" count="20" consumer="db,ConsoleExporter">
...
<variable name="sgn" script="{js: (p.gender.name()=='MALE') ? sgnMALE : sgnFEMALE}"/>
<attribute name="email" type='string' script="{js:toLink(p.givenName+p.familyName)+'@'+d}" converter="ToLowerCaseConverter, UniqueStringConverter"/>
<variable name="theme_tmp" type='string' generator="new SeedSentenceGenerator('csv/notes.txt',3)" />
<attribute name="theme" maxLength="45" script="{js:cut(theme_tmp,0,44)+'.'}"/>
...
</generate>

The main idea: everything in a script attribute inside {js: } is JavaScript. Variables are passed in transparently; the rest can be seen from the examples.
Did you notice the short form of the if statement?
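
For readers who have not met it before, the short form is JavaScript's conditional (ternary) operator: condition ? valueIfTrue : valueIfFalse. The sketch below compares it with the long form. The functions pickLong and pickShort and the sample patronymics are hypothetical stand-ins; in the real script sgnMALE and sgnFEMALE are supplied by benerator.

```javascript
// Hypothetical stand-ins for the values benerator would supply.
var sgnMALE = 'Ivanovich';
var sgnFEMALE = 'Ivanovna';

// Long form: an ordinary if/else statement.
function pickLong(gender) {
  if (gender == 'MALE') {
    return sgnMALE;
  } else {
    return sgnFEMALE;
  }
}

// Short form: the same choice written as a single expression,
// which is what fits inside a {js: ...} script attribute.
function pickShort(gender) {
  return (gender == 'MALE') ? sgnMALE : sgnFEMALE;
}

console.log(pickShort('MALE'));
console.log(pickShort('FEMALE'));
```

Both functions return the same value for any input; the short form is simply easier to embed in an XML attribute.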

Splitting the database tables into separate files for convenient generation
It proved convenient to put each table, or each group of 2-3 interrelated tables that cannot be generated independently, into a separate "*.ben.xml" file. Each file is commented out separately in the main file, so that it can be generated on its own.
Please note: these files must have the "*.ben.xml" extension.
In the main file it looks like this:
<!-- <include uri="table.s_organization.ben.xml" /> -->
<!-- <include uri="table.s_job_title.ben.xml" /> -->
<!-- <include uri="table.s_type_doc.ben.xml" />-->

An example of the file "table.s_organization.ben.xml" (XML)
table.s_organization.ben.xml
<?xml version="1.0" encoding="UTF-8"?>
<setup xmlns="http://databene.org/benerator/0.9.7"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://databene.org/benerator/0.9.7 benerator-0.9.7.xsd"
defaultEncoding="UTF-8"
defaultDataset="EN"
defaultLocale="EN"
defaultLineSeparator="\r\n"
defaultSeparator=";"
>

<comment>[[POPULATE TABLE s_organization]] Processed...</comment>

<generate type="s_organization" count="20" consumer="db,ConsoleExporter">
<attribute name="bik" type='string' pattern='[0-9]{9}'/>

<variable name="c" generator="CompanyNameGenerator" dataset="US" locale="us"/>
<attribute name="caption" type='string' script="c.fullName" />
<attribute name="short_caption" type='string' script="c.shortName" />
<attribute name="form_sobs" type='string' script="c.legalForm" />

<variable name="a" generator="AddressGenerator" dataset="US" locale="us"/>
<attribute name="ur_strana" type='string' script="a.country" />
<attribute name="ur_index" type='string' pattern="[0-9]{6}"/>
<attribute name="ur_nas_punkt" type='string' script="a.city" />
<attribute name="ur_ulica" type='string' script="a.street" />

<attribute name='ur_dom' type='int' min='1' max='150' />
<attribute name='ur_office' type='int' min='1' max='100' />

<attribute name="telefon" type="string" script="a.officePhone" unique="true" />
<attribute name="faks" type="string" script="a.fax" unique="true" />

<variable name="d" generator="DomainGenerator" dataset="US" locale="us"/>
<variable name="p" generator="PersonGenerator" dataset="EN" locale="EN"/>

<variable name="tag1" source="memstore" type="sgnMALE" distribution="random"/>
<variable name="tag2" source="memstore" type="sgnFEMALE" distribution="random"/>
<variable name="sgn" script="{js: (p.gender.name()=='MALE') ? sgnMALE : sgnFEMALE}"/>

<attribute name="email" type='string' script="{js:toLink(p.givenName+p.familyName)+'@'+d}" converter="ToLowerCaseConverter, UniqueStringConverter"/>
<attribute name="webpage" type='string' script="d" converter="ToLowerCaseConverter, UniqueStringConverter"/>
<attribute name="fio_ruk" type='string' script="p.familyName +' '+ p.givenName +' '+ sgn.secondgiven"/>
<attribute name="rschet" type='string' pattern="[0-9]{20}"/>
<attribute name="kschet" type='string' pattern="[0-9]{20}"/>
<attribute name="INN" type='string' pattern="[0-9]{10}"/>
<attribute name="KPP" type='string' pattern="[0-9]{9}"/>

<attribute name="date_update" type="datetime" generator="dtGen"/>
<attribute name="note" type='string' generator="new SeedSentenceGenerator('csv/notes.txt',3)" maxLength="255"/>
</generate>
<comment>[[POPULATE TABLE s_organization]] End. OK!</comment>

</setup>



Please note: the structure is similar to "benerator.xml", but you do not need to describe the database connection or import the common modules, because all of that has already been done in the main configuration file.


Conclusion


Now, this is why I experienced a "catharsis": after so much pain, it worked:
1. databene-benerator ran and filled a table with data. Just 2-3 nights, and voila: a convenient tool for solving urgent problems!
2. It turns out that it does understand Russian characters, and it is my own fault that I was not familiar with the JDBC driver syntax in Java projects (a universal syntax). Another 3 nights, and that worked too!
3. The algorithms for populating the tables came one by one; they surrendered to my head night after night. I managed to fill all 64 tables in 6 nights.
Yes, many questions remain, but the main ones are answered: the task is completed, knowledge acquired, experience gained. To change the quantity and quality of records in the tables I no longer need to "shovel" them by hand. Benerator does the trick in a few minutes.

Not covered in this article:
1. generating interrelated tables;
2. working with dates and times;
3. generating real numbers.
However, this information is available in the referenced posts and in the documentation, so after this head start the reader should have little trouble mastering these topics.


For this article I registered on GitHub and posted sources that can help in understanding the examples. To use them, download the *.zip file and unzip it. Create a new project and import the files via "File->Import->General->File System": select the whole project and press "Finish". Do not forget to add the run configuration and the benerator libraries.

Thank you for your attention!


Materials used


1. habrahabr.ru/post/169713. [Online]
2. sysmagazine.com/posts/169713. [Online]
3. ru.wikipedia.org/wiki/Катарсис. [Online]
4. databene.org/download/databene-benerator-manual-0.8.1.pdf. [Online]
5. dev.mysql.com/downloads/workbench. [Online]
6. ajaxs.ru/lesson/javascript/137-transliteracija_stroki_na_javascript.html. [Online]
Article based on information from habrahabr.ru
