Documentation.

2025-01-22 17:23:03 +01:00 · 2022-06-16 18:48:10 +03:00 · 2022-06-16 18:48:10 +03:00 · 7d271e9e24
commit 7d271e9e24
parent 889ff29299
3 changed files with 186 additions and 3 deletions
--- a/builds/install/misc/firebird.conf
+++ b/builds/install/misc/firebird.conf
@@ -1062,16 +1062,24 @@
 # ============================

 #
-#  Limit number of parallel workers for the single task. Per-process.
+#  Limits the total number of parallel workers that can be created within a
+#  single Firebird process for each attached database.
+#  Note that workers are counted for each attached database independently.
 #  Valid values are from 1 (no parallelism) to 64. All other values
 #  silently ignored and default value of 1 is used.
+#  Per-process.
+#
+# Type: integer
 #
 #MaxParallelWorkers = 1

 #
-#  Default number of parallel workers for the single task. Per-process.
+#  Default number of parallel workers for the single task. 
 #  Valid values are from 1 (no parallelism) to MaxParallelWorkers (above). 
 #  Values less than 1 is silently ignored and default value of 1 is used.
+#  Per-process.
+#
+# Type: integer
 #
 #ParallelWorkers = 1

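The value ranges described in the firebird.conf comments above can be sketched as a small validation routine. This is only an illustration under stated assumptions, not the engine's actual code: the function names are hypothetical, and the behaviour for a ParallelWorkers value above MaxParallelWorkers is not specified in the comments, so the cap used here is an assumption.

```python
from typing import Optional

MAX_LIMIT = 64  # documented upper bound for MaxParallelWorkers


def effective_max_parallel_workers(raw: Optional[int]) -> int:
    """Documented rule: values outside 1..64 are silently ignored, 1 is used."""
    if raw is None or not 1 <= raw <= MAX_LIMIT:
        return 1
    return raw


def effective_parallel_workers(raw: Optional[int], max_workers: int) -> int:
    """Documented rule: values less than 1 are silently ignored, 1 is used.

    ASSUMPTION: the comments do not say what happens when the value exceeds
    MaxParallelWorkers; this sketch caps it at that limit.
    """
    if raw is None or raw < 1:
        return 1
    return min(raw, max_workers)


if __name__ == "__main__":
    limit = effective_max_parallel_workers(8)
    print(limit, effective_parallel_workers(4, limit))
```

With MaxParallelWorkers = 8 and ParallelWorkers = 4, both values are in range and are used unchanged.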
--- a/doc/README.gbak
+++ b/doc/README.gbak
@@ -1,4 +1,9 @@
-In Firebird 4.0 a new switch was added to gbak: -INCLUDE(_DATA).
+gbak enhancements in Firebird v4.
+---------------------------------
+
+A new switch was added to gbak: -INCLUDE(_DATA).
+
+Author: Dimitry Sibiryakov <sd at ibphoenix com>

 It takes one parameter which is "similar like" pattern matching
 table names in a case-insensitive way.
@@ -17,3 +22,93 @@ a table is following:
 |   MATCH   |  excluded  |  excluded  |  excluded  |
 | NOT MATCH |  included  |  included  |  excluded  |
 +-----------+------------+------------+------------+
+
+
+
+gbak enhancements in Firebird v5.
+---------------------------------
+
+1. Parallel execution.
+
+Author: Vladyslav Khorsun <hvlad at users sourceforge net>
+
+a) gbak backup 
+
+Backup can read source database tables using multiple threads in parallel.
+
+New switch 
+-PAR(ALLEL)           parallel workers
+
+sets the number of workers used for backup. The default is 1. Every additional
+worker creates its own thread and its own connection to read data in parallel
+with the other workers. All worker connections share the same database
+snapshot, which gives them a consistent view of the data. Workers are created
+and managed by gbak itself. Note that metadata is still read by a single thread.
+
+b) gbak restore
+
+Restore can load data into user tables using multiple threads in parallel.
+
+New switch 
+-PAR(ALLEL)           parallel workers
+
+sets the number of workers used for restore. The default is 1. Every additional
+worker creates its own thread and its own connection to load data in parallel
+with the other workers. Metadata is still created by a single thread. In
+addition, the "main" connection uses the DPB tag isc_dpb_parallel_workers to
+pass the value of the -PARALLEL switch to the engine, which lets the engine
+build indices in parallel. If the -PARALLEL switch is not used, gbak loads data
+using a single thread and does not set the DPB tag isc_dpb_parallel_workers. In
+this case the engine uses the ParallelWorkers setting when building indices,
+i.e. this phase can still be run in parallel by the engine itself. To fully
+avoid parallel operations when restoring a database, use -PARALLEL 1.
+
+  Note that gbak itself does not read firebird.conf, so the ParallelWorkers
+setting does not affect its operations.
+
+
+Examples.
+
+  Set ParallelWorkers = 4 and MaxParallelWorkers = 8 in firebird.conf and
+restart the Firebird server.
+
+a) backup using 2 parallel workers
+
+	gbak -b <database> <backup> -parallel 2
+
+  Here gbak will read user data using 2 connections and 2 threads.
+
+
+b) restore using 2 parallel workers
+
+	gbak -r <backup> <database> -parallel 2
+
+  Here gbak will load user data using 2 connections and 2 threads. Also, the
+engine will build indices using 2 connections and 2 threads.
+
+c) restore using no parallel workers but let the engine decide how many workers
+should be used to build indices
+
+	gbak -r <backup> <database>
+
+  Here gbak will load user data using a single connection. The engine will
+build indices using 4 connections and 4 threads as set by ParallelWorkers.
+
+d) restore using no parallel workers and do not allow the engine to build
+indices in parallel
+
+	gbak -r <backup> <database> -par 1
+
+
+2. Direct IO for backup files.
+
+New switch
+-D(IRECT_IO)          direct IO for backup file(s)
+
+instructs gbak to open or create backup file(s) in direct IO (unbuffered) mode.
+This avoids consuming file system cache memory for backup files. A backup is
+usually read (by restore) or written (by backup) just once, so there is little
+benefit in caching its contents. Performance should not suffer, as gbak uses
+sequential IO with relatively large chunks.
+  Direct IO is silently ignored if the backup file is redirected to standard
+input or output, i.e. if "stdin" or "stdout" is used as the backup file name.
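The restore cases b), c) and d) from README.gbak above can be summarised in a short sketch. It is a model of the documented behaviour only, not gbak code: the names RestorePlan, restore_plan and gbak_restore_args are hypothetical, and any cap imposed by MaxParallelWorkers is deliberately left out because README.gbak does not describe it.

```python
from typing import List, NamedTuple, Optional


class RestorePlan(NamedTuple):
    gbak_workers: int    # threads/connections gbak uses to load user data
    index_workers: int   # workers the engine uses when building indices
    sends_dpb_tag: bool  # whether isc_dpb_parallel_workers is passed


def restore_plan(parallel_switch: Optional[int],
                 parallel_workers_conf: int) -> RestorePlan:
    """Model the documented effect of -PARALLEL on a restore."""
    if parallel_switch is None:
        # No switch: gbak loads with one thread, the engine falls back to
        # the ParallelWorkers setting for index creation.
        return RestorePlan(1, parallel_workers_conf, False)
    # Switch given: the same value drives gbak and, via the DPB, the engine.
    return RestorePlan(parallel_switch, parallel_switch, True)


def gbak_restore_args(backup: str, database: str,
                      parallel: Optional[int] = None) -> List[str]:
    """Build a restore command line in the form used by the examples."""
    args = ["gbak", "-r", backup, database]
    if parallel is not None:
        args += ["-parallel", str(parallel)]
    return args


if __name__ == "__main__":
    print(restore_plan(None, 4))
    print(gbak_restore_args("<backup>", "<database>", 2))
```

With ParallelWorkers = 4, omitting the switch gives one loader thread and four index workers, while -parallel 1 disables parallelism in both phases.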
--- a/doc/README.parallel_features
+++ b/doc/README.parallel_features
@@ -0,0 +1,80 @@
+Firebird engine parallel features in v5.
+----------------------------------------
+
+Author: Vladyslav Khorsun <hvlad at users sourceforge net>
+
+
+  The Firebird engine can now execute some tasks using multiple threads in
+parallel. Currently parallel execution is implemented for the sweep and the
+index creation tasks. Parallel execution is supported for both auto- and manual
+sweep.
+
+  To process one task with multiple threads, the engine runs additional worker
+threads and creates internal worker attachments. By default, parallel execution
+is not enabled. There are two ways to enable parallelism in a user attachment:
+- set number of parallel workers in DPB using new tag isc_dpb_parallel_workers,
+- set default number of parallel workers using new setting ParallelWorkers in
+  firebird.conf.
+
+  The gfix utility has a new command-line switch, -parallel, that sets the
+number of parallel workers for the sweep task. For example:
+
+  gfix -sweep -parallel 4 <database>
+
+will run sweep on the given database and ask the engine to use 4 workers. gfix
+passes the DPB tag isc_dpb_parallel_workers when it attaches to <database>, if
+the -parallel switch is present.
+
+  The new firebird.conf setting ParallelWorkers sets the default number of
+parallel workers that can be used by any user attachment running a
+parallelizable task. The default value is 1, which means that no additional
+workers are used. A DPB value takes priority over the firebird.conf setting.
+
+  To control the number of additional workers that the engine can create,
+there are two new settings in firebird.conf:
+- ParallelWorkers - sets the default number of parallel workers used by user
+  attachments.
+  An attachment can override it using the DPB tag isc_dpb_parallel_workers.
+- MaxParallelWorkers - limits the number of simultaneously used workers for the
+  given database and Firebird process.
+
+  Internal worker attachments are created and managed by the engine itself.
+The engine maintains per-database pools of worker attachments. The number of
+items in each such pool is limited by the value of the MaxParallelWorkers
+setting. The pools are created by each Firebird process independently.
+
+  In the Super Server architecture, worker attachments are implemented as
+lightweight system attachments, while in Classic and Super Classic they look
+like ordinary user attachments. All worker attachments are embedded into the
+server process that creates them, so in Classic architectures there are no
+additional server processes. Worker attachments are present in the monitoring
+tables. An idle worker attachment is destroyed after 60 seconds of inactivity.
+Also, in Classic architectures worker attachments are destroyed immediately
+after the last user connection detaches from the database.
+
+
+Examples:
+
+  Set ParallelWorkers = 4 and MaxParallelWorkers = 8 in firebird.conf and
+restart the Firebird server.
+
+a) Connect to a test database without isc_dpb_parallel_workers in the DPB and
+execute a "CREATE INDEX ..." SQL statement. The index is actually built on
+commit, and the engine will use 3 additional worker attachments. In total, 4
+attachments in 4 threads will work on index creation.
+
+b) Ensure auto-sweep is enabled for the test database. When auto-sweep runs on
+that database, it will also use 3 additional workers (and run within 4 threads).
+
+c) More than one task can be parallelized at a time: make 2 attachments and
+execute "CREATE INDEX ..." in each of them (of course, the indices to be built
+must be different). Each index will be created using 4 attachments (1 user and
+3 worker) and 4 threads.
+
+d) Run gfix -sweep <database> without the -parallel switch: sweep will run
+using 4 attachments in 4 threads.
+
+e) Run gfix -sweep -parallel 2 <database>: sweep will run using 2 attachments in
+2 threads. This shows that the value in the DPB tag isc_dpb_parallel_workers
+overrides the value of the ParallelWorkers setting.
+
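The precedence rule illustrated by the examples in README.parallel_features (a DPB value overrides ParallelWorkers, and N workers mean 1 user attachment plus N - 1 worker attachments) can be expressed as a minimal sketch. The function name task_workers is hypothetical, the treatment of DPB values below 1 is an assumption, and the MaxParallelWorkers limit is not modelled.

```python
from typing import Optional, Tuple


def task_workers(dpb_parallel_workers: Optional[int],
                 parallel_workers_conf: int) -> Tuple[int, int]:
    """Return (total attachments, additional worker attachments) for one task.

    dpb_parallel_workers is the value passed with isc_dpb_parallel_workers,
    or None when the tag is absent; parallel_workers_conf is ParallelWorkers.
    """
    if dpb_parallel_workers is not None:
        total = dpb_parallel_workers   # the DPB value has higher priority
    else:
        total = parallel_workers_conf  # fall back to firebird.conf
    # ASSUMPTION: values below 1 are treated as 1 (no parallelism).
    total = max(total, 1)
    return total, total - 1


if __name__ == "__main__":
    print(task_workers(None, 4))  # examples a), b) and d)
    print(task_workers(2, 4))     # gfix -sweep -parallel 2 <database>
```

With ParallelWorkers = 4, a task started without the DPB tag runs on 4 attachments, 3 of which are worker attachments, matching example a).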