mirror of https://github.com/FirebirdSQL/firebird.git synced 2025-01-22 20:43:02 +01:00

Documentation.

This commit is contained in:
Vlad Khorsun 2022-06-16 18:48:10 +03:00
parent 889ff29299
commit 7d271e9e24
3 changed files with 186 additions and 3 deletions


@ -1062,16 +1062,24 @@
# ============================
#
# Limit number of parallel workers for the single task. Per-process.
# Limits the total number of parallel workers that could be created within a
# single Firebird process for each attached database.
# Note, workers are accounted for each attached database independently.
# Valid values are from 1 (no parallelism) to 64. All other values are
# silently ignored and the default value of 1 is used.
# Per-process.
#
# Type: integer
#
#MaxParallelWorkers = 1
#
# Default number of parallel workers for the single task. Per-process.
# Default number of parallel workers for the single task.
# Valid values are from 1 (no parallelism) to MaxParallelWorkers (above).
# Values less than 1 are silently ignored and the default value of 1 is used.
# Per-process.
#
# Type: integer
#
#ParallelWorkers = 1
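
The validation rules in the comments above can be sketched as follows. This is
only an illustration, not the engine's code: the function names are
hypothetical, and capping ParallelWorkers at MaxParallelWorkers is an
assumption, since the comments do not say what happens to a value above the
limit.

```python
MAX_LIMIT = 64  # upper bound for MaxParallelWorkers stated in the comments


def effective_max_parallel_workers(raw: int) -> int:
    """Values outside 1..64 are ignored and the default of 1 is used."""
    return raw if 1 <= raw <= MAX_LIMIT else 1


def effective_parallel_workers(raw: int, max_workers: int) -> int:
    """Values below 1 fall back to the default of 1.

    Assumption: a value above MaxParallelWorkers is capped at that limit;
    the configuration comments do not describe this case.
    """
    if raw < 1:
        return 1
    return min(raw, max_workers)


if __name__ == "__main__":
    max_workers = effective_max_parallel_workers(8)
    print(max_workers, effective_parallel_workers(4, max_workers))
```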


@ -1,4 +1,9 @@
In Firebird 4.0 a new switch was added to gbak: -INCLUDE(_DATA).
gbak enhancements in Firebird v4.
---------------------------------
A new switch was added to gbak: -INCLUDE(_DATA).
Author: Dimitry Sibiryakov <sd at ibphoenix com>
It takes one parameter which is "similar like" pattern matching
table names in a case-insensitive way.
@ -17,3 +22,93 @@ a table is following:
| MATCH | excluded | excluded | excluded |
| NOT MATCH | included | included | excluded |
+-----------+------------+------------+------------+
gbak enhancements in Firebird v5.
---------------------------------
1. Parallel execution.
Author: Vladyslav Khorsun <hvlad at users sourceforge net>
a) gbak backup
Backup can read source database tables using multiple threads in parallel.
New switch
-PAR(ALLEL) parallel workers
sets the number of workers to be used for the backup process. Default is 1.
Every additional worker creates its own thread and its own new connection, used
to read data in parallel with the other workers. All worker connections share
the same database snapshot to ensure a consistent view of the data across all
of them. Workers are created and managed by gbak itself. Note that metadata is
still read by a single thread.
b) gbak restore
Restore can put data into user tables using multiple threads in parallel.
New switch
-PAR(ALLEL) parallel workers
sets the number of workers to be used for the restore process. Default is 1.
Every additional worker creates its own thread and its own new connection, used
to load data in parallel with the other workers. Metadata is still created
using a single thread. Also, the "main" connection uses the DPB tag
isc_dpb_parallel_workers to pass the value of the -PARALLEL switch to the
engine, which lets the engine build indices in parallel. If the -PARALLEL
switch is not used, gbak will load data using a single thread and will not use
the DPB tag isc_dpb_parallel_workers. In this case the engine will use the
value of the ParallelWorkers setting when building indices, i.e. this phase
could be run in parallel by the engine itself. To fully avoid parallel
operations when restoring a database, use -PARALLEL 1.
Note that gbak does not read firebird.conf itself, so the ParallelWorkers
setting does not affect gbak's own operations.
Examples.
Set in firebird.conf ParallelWorkers = 4, MaxParallelWorkers = 8 and restart
Firebird server.
a) backup using 2 parallel workers
gbak -b <database> <backup> -parallel 2
Here gbak will read user data using 2 connections and 2 threads.
b) restore using 2 parallel workers
gbak -r <backup> <database> -parallel 2
Here gbak will put user data using 2 connections and 2 threads. Also,
engine will build indices using 2 connections and 2 threads.
c) restore using no parallel workers but let the engine decide how many workers
should be used to build indices
gbak -r <backup> <database>
Here gbak will put user data using a single connection. The engine will build
indices using 4 connections and 4 threads as set by ParallelWorkers.
d) restore using no parallel workers and do not allow the engine to build
indices in parallel
gbak -r <backup> <database> -par 1
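
The restore examples above can be summarised with a small model. It is a sketch
of the behaviour described in this section, not gbak source code: the names
RestorePlan and restore_plan are hypothetical, and the model ignores the
MaxParallelWorkers limit for simplicity.

```python
from typing import NamedTuple, Optional


class RestorePlan(NamedTuple):
    gbak_threads: int    # connections/threads gbak uses to load user data
    index_workers: int   # workers the engine uses to build indices
    sends_dpb_tag: bool  # whether isc_dpb_parallel_workers is passed


def restore_plan(parallel_switch: Optional[int],
                 parallel_workers_conf: int) -> RestorePlan:
    """Model of 'gbak -r' as described in the text.

    parallel_switch is the value of -PARALLEL, or None if the switch is absent.
    parallel_workers_conf is the ParallelWorkers value in firebird.conf.
    """
    if parallel_switch is None:
        # No switch: gbak loads data in one thread and sends no DPB tag,
        # so the engine falls back to ParallelWorkers for index creation.
        return RestorePlan(1, parallel_workers_conf, False)
    # Switch present: gbak uses that many workers and passes the same
    # value to the engine through isc_dpb_parallel_workers.
    return RestorePlan(parallel_switch, parallel_switch, True)


if __name__ == "__main__":
    for switch in (2, None, 1):  # examples b), c) and d) with ParallelWorkers = 4
        print(switch, restore_plan(switch, 4))
```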
2. Direct IO for backup files.
New switch
-D(IRECT_IO) direct IO for backup file(s)
instructs gbak to open or create backup file(s) in direct IO (unbuffered) mode.
This avoids consuming file system cache memory for backup files. Usually a
backup is read (by restore) or written (by backup) just once, so there is
little benefit in caching its contents. Performance should not suffer, as gbak
uses sequential IO with relatively big chunks.
Direct IO mode is silently ignored if the backup file is redirected to standard
input or output, i.e. if "stdin" or "stdout" is used as the backup file name.
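
The rule for when direct IO takes effect can be written as a one-line
predicate. This is a sketch only: the function name is hypothetical, and the
exact, case-sensitive comparison with "stdin" and "stdout" is an assumption,
since the text does not say how gbak compares the names.

```python
def direct_io_applies(direct_io_requested: bool, backup_file: str) -> bool:
    """Return True if -DIRECT_IO would take effect for this backup file.

    Assumption: the special names are matched exactly as written in the text.
    """
    return direct_io_requested and backup_file not in ("stdin", "stdout")


if __name__ == "__main__":
    print(direct_io_applies(True, "employee.fbk"))  # regular file
    print(direct_io_applies(True, "stdout"))        # redirected, switch ignored
```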


@ -0,0 +1,80 @@
Firebird engine parallel features in v5.
----------------------------------------
Author: Vladyslav Khorsun <hvlad at users sourceforge net>
The Firebird engine can now execute some tasks using multiple threads in
parallel. Currently parallel execution is implemented for the sweep and the
index creation tasks. Parallel execution is supported for both auto- and manual
sweep.
To handle the same task with multiple threads, the engine runs additional
worker threads and creates internal worker attachments. By default, parallel
execution is not enabled. There are two ways to enable parallelism in a user
attachment:
- set number of parallel workers in DPB using new tag isc_dpb_parallel_workers,
- set default number of parallel workers using new setting ParallelWorkers in
firebird.conf.
For the gfix utility there is a new command-line switch -parallel that sets the
number of parallel workers for the sweep task. For example:
gfix -sweep -parallel 4 <database>
will run sweep on the given database and ask the engine to use 4 workers. gfix
uses the DPB tag isc_dpb_parallel_workers when it attaches to <database>, if the
switch -parallel is present.
The new firebird.conf setting ParallelWorkers sets the default number of
parallel workers that can be used by any user attachment running a
parallelizable task. The default value is 1, which means no additional parallel
workers are used. A value in the DPB has higher priority than the setting in
firebird.conf.
To control the number of additional workers that can be created by the engine
there are two new settings in firebird.conf:
- ParallelWorkers - sets the default number of parallel workers used by user
attachments.
It can be overridden by an attachment using the tag isc_dpb_parallel_workers in
the DPB.
- MaxParallelWorkers - limits the number of simultaneously used workers for the
given database and Firebird process.
Internal worker attachments are created and managed by the engine itself.
The engine maintains per-database pools of worker attachments. The number of
items in each such pool is limited by the value of the MaxParallelWorkers
setting. The pools are created by each Firebird process independently.
In the Super Server architecture worker attachments are implemented as
lightweight system attachments, while in Classic and Super Classic they look
like usual user attachments. All worker attachments are embedded into the
creating server process. Thus in Classic architectures there are no additional
server processes. Worker attachments are present in monitoring tables. An idle
worker attachment is destroyed after 60 seconds of inactivity. Also, in Classic
architectures worker attachments are destroyed immediately after the last user
connection detaches from the database.
Examples:
Set in firebird.conf ParallelWorkers = 4, MaxParallelWorkers = 8 and restart
Firebird server.
a) Connect to the test database without using isc_dpb_parallel_workers in the
DPB and execute a "CREATE INDEX ..." SQL statement. On commit the index will
actually be created and the engine will use 3 additional worker attachments. In
total, 4 attachments in 4 threads will work on index creation.
b) Ensure auto-sweep is enabled for the test database. When auto-sweep runs on
that database, it will also use 3 additional workers (and run within 4 threads).
c) More than one task at a time can be parallelized: make 2 attachments and
execute "CREATE INDEX ..." in each of them (of course, the indices to be built
should be different). Each index will be created using 4 attachments (1 user
and 3 worker) and 4 threads.
d) Run gfix -sweep <database> - not specifying switch -parallel: sweep will run
using 4 attachments in 4 threads.
e) Run gfix -sweep -parallel 2 <database>: sweep will run using 2 attachments in
2 threads. This shows that the value in DPB tag isc_dpb_parallel_workers
overrides the value of the ParallelWorkers setting.
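
The arithmetic behind these examples can be sketched in a few lines. This only
models the priority rule and the attachment counts described above: the
function names are hypothetical, and the model does not account for the
MaxParallelWorkers limit.

```python
from typing import Optional


def effective_workers(dpb_value: Optional[int], parallel_workers_conf: int) -> int:
    """Workers for a task: the DPB value wins over firebird.conf.

    dpb_value is the value passed in isc_dpb_parallel_workers, or None if
    the tag is absent from the DPB.
    """
    return dpb_value if dpb_value is not None else parallel_workers_conf


def additional_worker_attachments(total_workers: int) -> int:
    """The user attachment counts as one worker; the rest are worker attachments."""
    return max(total_workers - 1, 0)


if __name__ == "__main__":
    conf = 4  # ParallelWorkers = 4, as in the examples
    for dpb in (None, 2):
        total = effective_workers(dpb, conf)
        print(dpb, total, additional_worker_attachments(total))
```

With ParallelWorkers = 4 and no DPB tag this gives 4 workers in total, 3 of
them additional worker attachments. With -parallel 2 it gives 2 workers, 1 of
them additional.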