UpdateTasks: Difference between revisions
| Line 40: | Line 40: | ||
| For large environments, we propose to prepare a distributed parallel execution of update tasks based on the assumption that each DB instance should be able to handle "some" (e.g. four) update tasks in parallel. The exact number depends on the DB hardware and needs to be determined by experience. But since the update tasks on different schemas operate independently we don't experience bottlenecks like lock contention from parallel execution of update tasks; it is only about available (CPU, IOPS, etc) resources. | For large environments, we propose to prepare a distributed parallel execution of update tasks based on the assumption that each DB instance should be able to handle "some" (e.g. four) update tasks in parallel. The exact number depends on the DB hardware and needs to be determined by experience. But since the update tasks on different schemas operate independently we don't experience bottlenecks like lock contention from parallel execution of update tasks; it is only about available (CPU, IOPS, etc) resources. | ||
| Thus, the idea is to create text files, one per DB instance, with one schema per line. Use the SQL query given below in the "How to see all schemas" section for guidance how to obtain / create such a list. (Use the <code>-B -N</code> options and split manually the file per <code>db_pool_id</code> into separate files per DB instance. Delete all columns but the <code>db_schema</code> column.) | Thus, the idea is to create text files, one per DB instance, with one schema per line. Use the SQL query given below in the "How to see all schemas" section for guidance how to obtain / create such a list. (Use the <code>-B -N</code> options and split manually the file per <code>db_pool_id</code> into separate files per DB instance. Delete all columns but the <code>db_schema</code> column. Name the files like <code>schema-n.txt</code> where <code>n</code> is the DB pool id.) | ||
| Use these separate, per-DB-instance files to run update tasks per DB instance in parallel using something like | Use these separate, per-DB-instance files to run update tasks per DB instance in parallel using something like | ||
Revision as of 14:38, 12 December 2017
Update Task management in Open-Xchange
Overview
OX App Suite occasionally requires updates to the DB schemas via so-called update tasks. Usually an update task is something like adding tables, adding columns to tables, adding incides to tables, dropping columns, dropping tables, and so on.
Update tasks take care to apply changes to database tables which are required for new features or bugfixes.
Usually update tasks are only included in feature updates to OX App Suite (and not bugfix releases), but if a bugfix requires an update task, we will ship an update task also with a bugfix release.
Running update tasks
Automatically
Update tasks are applied when either the first user logs in to the OX UI or when triggered manually. Update tasks will be applied schema based. Per default, Open-Xchange stores 1000 contexts within a single schema.
We highly recommend to avoid this automatic update task execution on any real world production site in order to not overwhelm the databases with massive amounts of parallel running update tasks. See the Running_a_cluster#The_Big_Picture page (and the following sections of this page) for more background information.
Run update tasks on all schemas serially
Since 7.8.3 we offer a tool to execute all update tasks serially, one by one.
$ /opt/open-xchange/sbin/runallupdate
In large environments with multiple database instances this probably wastes a lot of speedup potential by parallelizing since update tasks can at least be parallelized by one per DB instance, probably also to (some small integer number) per DB instance. We'll cover how to do so in the following sections.
Run update tasks on one schema
Before 7.8.3 the only tool we offered to trigger update tasks was runupdate which can run update tasks for a given schema.
$ /opt/open-xchange/sbin/runupdate -n schema
This tool can also serve as component for some advanced combination of serial and parallel execution of update tasks to achive something like "execute N update tasks per DB instance in parallel".
Distributed execution of update tasks with limited parallelity
For large environments, we propose to prepare a distributed parallel execution of update tasks based on the assumption that each DB instance should be able to handle "some" (e.g. four) update tasks in parallel. The exact number depends on the DB hardware and needs to be determined by experience. But since the update tasks on different schemas operate independently we don't experience bottlenecks like lock contention from parallel execution of update tasks; it is only about available (CPU, IOPS, etc) resources.
Thus, the idea is to create text files, one per DB instance, with one schema per line. Use the SQL query given below in the "How to see all schemas" section for guidance how to obtain / create such a list. (Use the -B -N options and split manually the file per db_pool_id into separate files per DB instance. Delete all columns but the db_schema column. Name the files like schema-n.txt where n is the DB pool id.)
Use these separate, per-DB-instance files to run update tasks per DB instance in parallel using something like
cat schemas-3.txt | xargs -n1 -P4 /opt/open-xchange/sbin/runupdate -n
The -P switch to xargs defines parallel execution with the given number of parallel processes.
Spawn a command like this for every DB instance in parallel.
You end up with (4xN) parallel runupdate processes, where N is the number of DB instances.
Diagnosis / Monitoring
List executed update tasks for a given schema
$ /opt/open-xchange/sbin/listExecutedUpdateTasks -n schema
Sample output:
$ /opt/open-xchange/sbin/listExecutedUpdateTasks -n oxdatabase_5 taskName successful lastModified [...] LOCKED true 2014-02-02 11:35:49 MEZ com.openexchange.jslob.storage.db.groupware.DBJSlobCreateTableTask true 2014-02-02 11:35:52 MEZ com.openexchange.groupware.update.tasks.RemoveUnnecessaryIndexes true 2014-02-02 11:35:54 MEZ com.openexchange.groupware.update.tasks.CreateIcalPrincipalPrimaryKeyTask true 2014-02-02 11:35:54 MEZ com.openexchange.groupware.update.tasks.MailAccountAddArchiveTask true 2014-02-02 11:35:54 MEZ com.openexchange.groupware.update.tasks.GenconfAttributesBoolsAddUuidUpdateTask true 2014-02-02 11:35:54 MEZ com.openexchange.groupware.update.tasks.HeaderCacheDropFKTask true 2014-02-02 11:35:55 MEZ com.openexchange.groupware.update.tasks.ResourceClearDelTablesTask true 2014-02-02 11:35:55 MEZ com.openexchange.groupware.update.tasks.AddUUIDForUpdateTaskTable true 2014-02-02 11:35:55 MEZ com.openexchange.groupware.update.tasks.MailAccountAddReplyToTask true 2014-02-02 11:35:55 MEZ com.openexchange.groupware.tasks.database.TasksModifyCostColumnTask true 2014-02-02 11:35:59 MEZ com.openexchange.groupware.update.tasks.PrgContactsLinkageAddUuidUpdateTask true 2014-02-02 11:36:00 MEZ com.openexchange.ajax.requesthandler.converters.preview.cache.groupware.PreviewCacheCreateTableTask true 2014-02-02 11:36:00 MEZ com.openexchange.groupware.update.tasks.InfostoreExtendReservedPathsNameTask true 2014-02-02 11:36:00 MEZ com.openexchange.contact.storage.rdb.sql.AddFilenameColumnTask true 2014-02-02 11:37:47 MEZ com.openexchange.groupware.update.tasks.GenconfAttributesStringsAddUuidUpdateTask true 2014-02-02 11:37:48 MEZ [...]
How to see all schemas?
The database schemas are not exposed via OX tooling. You need to read them from the configdb. In principle you're looking for db_schema entries from the context_server2db_pool table and join that with corresponding lines from the db_pool table to get the database instance the schema is living on.
Sample query (works in general as of writing this documentation, may break on schema updates in future):
SELECT d.db_pool_id, SUBSTRING(hp.host_port, 1, LOCATE(':', hp.host_port)-1) AS host, SUBSTRING(hp.host_port, LOCATE(':', hp.host_port)+1) AS port, a.db_schema FROM db_pool d INNER JOIN (SELECT write_db_pool_id, db_schema FROM context_server2db_pool GROUP BY db_schema) AS a ON d.db_pool_id=a.write_db_pool_id INNER JOIN (SELECT d.db_pool_id as id, REPLACE(SUBSTRING(d.url, 1, LOCATE('/?', d.url)-1), 'jdbc:mysql://', '') AS host_port FROM db_pool d) AS hp ON d.db_pool_id=hp.id ORDER BY d.db_pool_id, CAST(SUBSTRING(db_schema, LOCATE('_', db_schema)+1) AS UNSIGNED);
If you also want to see login name and password for the given db hosts, add d.login, d.password to the fields returned by the query.
On a lab machine with 10 schemas on one DB instance the output looks like this:
+------------+---------+------+-----------+ | db_pool_id | host | port | db_schema | +------------+---------+------+-----------+ | 3 | glb.lan | 5507 | oxdb_5 | | 3 | glb.lan | 5507 | oxdb_6 | | 3 | glb.lan | 5507 | oxdb_7 | | 3 | glb.lan | 5507 | oxdb_8 | | 3 | glb.lan | 5507 | oxdb_9 | | 3 | glb.lan | 5507 | oxdb_10 | | 3 | glb.lan | 5507 | oxdb_11 | | 3 | glb.lan | 5507 | oxdb_12 | | 3 | glb.lan | 5507 | oxdb_13 | | 3 | glb.lan | 5507 | oxdb_14 | +------------+---------+------+-----------+
Troubleshooting
What if I have Update Tasks that are LOCKED?
If the command listExecutedUpdateTasks lists tasks that have the word LOCKED in the taskName row, these tasks could not be completed. This usually happens when Open-Xchange is being stopped while the update tasks are still running.
Do NOT stop Open-Xchange while Update Tasks are running!
If that happened to you, you need to manually remove these locks. In order to do that, you have to remove the rows from the table updateTask in every schema which have taskName set to LOCKED.
mysql> DELETE FROM updateTask WHERE taskName='LOCKED';
If you have multiple schemas, you can list all of them which contain that lock e.g. using this command:
 for i in $(echo show databases | mysql -uopenexchange -psecret | grep oxdatabase); do \
      echo "select taskName from ${i}.updateTask where taskName=\"LOCKED\"" | \
      mysql -uopenexchange -psecret | grep LOCKED > /dev/null && echo "database $i has a LOCK"; done
now for each of these schemas, run the sql query
mysql> DELETE FROM updateTask WHERE taskName='LOCKED';
If you're still having issues with locked tasks (check /var/log/open-xchange/ for logs), you might also need to clear the lock in the version table in the corresponding schema(s).
mysql> SELECT * from version; +---------+--------+---------------+------------------+--------------+ | version | locked | gw_compatible | admin_compatible | server | +---------+--------+---------------+------------------+--------------+ | 200 | 1 | 1 | 1 | oxserver | +---------+--------+---------------+------------------+--------------+
if locked is set to 1, run
mysql> UPDATE version SET locked=0;
What if I have Update Tasks that are in status false?
If the command listExecutedUpdateTasks lists tasks that have the word false in the successful row,
| taskName | successful | lastModified | 
| com.openexchange.groupware.update.tasks.PrgDatesPrimaryKeyUpdateTask | false | 2013-11-19 17:16:32 CET | 
a reason for that could be database servers that died under the high IO load of updating multiple schemas concurrently. To solve this problem, run the command /opt/open-xchange/sbin/forceupdatetask on the affected schema.
To prevent this situation, we recommend to run the updatetasks during low-traffic times, e.g. in the night on a machine that is not available to customers.