UpdateTasks: Difference between revisions

From Open-Xchange

Revision as of 14:38, 12 December 2017

Update Task management in Open-Xchange

Overview

OX App Suite occasionally requires updates to the DB schemas via so-called update tasks. Typically an update task adds tables, adds columns to tables, adds indices to tables, drops columns, drops tables, and so on.

Update tasks take care to apply changes to database tables which are required for new features or bugfixes.

Usually update tasks are only included in feature updates to OX App Suite (and not bugfix releases), but if a bugfix requires an update task, we will ship an update task also with a bugfix release.

Running update tasks

Automatically

Update tasks are applied either when the first user logs in to the OX UI or when triggered manually. Update tasks are applied per schema. By default, Open-Xchange stores 1000 contexts within a single schema.

We strongly recommend avoiding this automatic update task execution on any real-world production site, so that the databases are not overwhelmed by a massive number of update tasks running in parallel. See the Running_a_cluster#The_Big_Picture page (and the following sections of this page) for more background information.

Run update tasks on all schemas serially

Since 7.8.3 we offer a tool to execute all update tasks serially, one by one.

$ /opt/open-xchange/sbin/runallupdate

In large environments with multiple database instances, serial execution probably leaves a lot of speedup potential unused: update tasks can be parallelized with at least one per DB instance, and probably up to some small integer number per DB instance. The following sections cover how to do so.

Run update tasks on one schema

Before 7.8.3 the only tool we offered to trigger update tasks was runupdate, which runs update tasks for a given schema.

$ /opt/open-xchange/sbin/runupdate -n schema

This tool can also serve as a building block for more advanced combinations of serial and parallel execution of update tasks, to achieve something like "execute N update tasks per DB instance in parallel".

Distributed execution of update tasks with limited parallelity

For large environments, we propose to prepare a distributed parallel execution of update tasks based on the assumption that each DB instance should be able to handle "some" (e.g. four) update tasks in parallel. The exact number depends on the DB hardware and needs to be determined by experience. Since update tasks on different schemas operate independently, parallel execution does not cause bottlenecks like lock contention; the limit is only the available resources (CPU, IOPS, etc.).

Thus, the idea is to create text files, one per DB instance, with one schema per line. Use the SQL query given below in the "How to see all schemas" section as guidance on how to obtain such a list. (Use the -B -N options of the mysql client and manually split the output by db_pool_id into separate files per DB instance. Delete all columns but the db_schema column. Name the files like schemas-n.txt where n is the DB pool id.)
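The manual split described above could also be scripted. The following is a minimal sketch, assuming the query output was saved with -B -N as tab-separated lines in the column order db_pool_id, host, port, db_schema. The function name split_schemas and the input file name all-schemas.tsv are made up for illustration.

```shell
#!/bin/sh
# Sketch: split the tab-separated query output into one file per DB pool id.
# Assumed column order: db_pool_id, host, port, db_schema.
# $1 = input file (hypothetical name, e.g. all-schemas.tsv)
split_schemas() {
    # Inside the awk program, $1 and $4 are awk fields (pool id and schema);
    # the trailing "$1" is the shell argument holding the input file name.
    awk -F '\t' '{ print $4 > ("schemas-" $1 ".txt") }' "$1"
}
```

Calling split_schemas all-schemas.tsv in an empty working directory would write files such as schemas-3.txt, each containing only schema names, one per line.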

Use these separate, per-DB-instance files to run update tasks per DB instance in parallel using something like

cat schemas-3.txt | xargs -n1 -P4 /opt/open-xchange/sbin/runupdate -n

The -P switch to xargs defines parallel execution with the given number of parallel processes.

Spawn a command like this for every DB instance in parallel.

You end up with (4xN) parallel runupdate processes, where N is the number of DB instances.
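One way to spawn the per-instance commands together is a small wrapper that starts one xargs per schema list in the background and then waits for all of them. This is only a sketch: the variables RUNUPDATE and PARALLEL, the function name run_all_instances, and the per-instance log files are assumptions made for illustration and are not part of the OX tooling.

```shell
#!/bin/sh
# Sketch: run runupdate for every schema list in the current directory,
# one xargs process per DB instance, each limited to $PARALLEL workers.
# RUNUPDATE and PARALLEL are illustrative variables, not OX settings.
RUNUPDATE="${RUNUPDATE:-/opt/open-xchange/sbin/runupdate}"
PARALLEL="${PARALLEL:-4}"

run_all_instances() {
    for f in schemas-*.txt; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        # Background job per DB instance; output goes to e.g. schemas-3.log
        xargs -n1 -P"$PARALLEL" "$RUNUPDATE" -n < "$f" > "${f%.txt}.log" 2>&1 &
    done
    wait                                 # block until all xargs jobs finished
}
```

Note that wait without arguments does not report failures of individual jobs, so the log files and listExecutedUpdateTasks should still be checked afterwards.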

Diagnosis / Monitoring

List executed update tasks for a given schema

$ /opt/open-xchange/sbin/listExecutedUpdateTasks -n schema

Sample output:

$ /opt/open-xchange/sbin/listExecutedUpdateTasks -n oxdatabase_5
 taskName                                                                                                successful lastModified
 [...]
 LOCKED                                                                                              true       2014-02-02 11:35:49 MEZ 
 com.openexchange.jslob.storage.db.groupware.DBJSlobCreateTableTask                                  true       2014-02-02 11:35:52 MEZ 
 com.openexchange.groupware.update.tasks.RemoveUnnecessaryIndexes                                    true       2014-02-02 11:35:54 MEZ 
 com.openexchange.groupware.update.tasks.CreateIcalPrincipalPrimaryKeyTask                           true       2014-02-02 11:35:54 MEZ 
 com.openexchange.groupware.update.tasks.MailAccountAddArchiveTask                                   true       2014-02-02 11:35:54 MEZ 
 com.openexchange.groupware.update.tasks.GenconfAttributesBoolsAddUuidUpdateTask                     true       2014-02-02 11:35:54 MEZ 
 com.openexchange.groupware.update.tasks.HeaderCacheDropFKTask                                       true       2014-02-02 11:35:55 MEZ 
 com.openexchange.groupware.update.tasks.ResourceClearDelTablesTask                                  true       2014-02-02 11:35:55 MEZ 
 com.openexchange.groupware.update.tasks.AddUUIDForUpdateTaskTable                                   true       2014-02-02 11:35:55 MEZ 
 com.openexchange.groupware.update.tasks.MailAccountAddReplyToTask                                   true       2014-02-02 11:35:55 MEZ 
 com.openexchange.groupware.tasks.database.TasksModifyCostColumnTask                                 true       2014-02-02 11:35:59 MEZ 
 com.openexchange.groupware.update.tasks.PrgContactsLinkageAddUuidUpdateTask                         true       2014-02-02 11:36:00 MEZ 
 com.openexchange.ajax.requesthandler.converters.preview.cache.groupware.PreviewCacheCreateTableTask true       2014-02-02 11:36:00 MEZ 
 com.openexchange.groupware.update.tasks.InfostoreExtendReservedPathsNameTask                        true       2014-02-02 11:36:00 MEZ 
 com.openexchange.contact.storage.rdb.sql.AddFilenameColumnTask                                      true       2014-02-02 11:37:47 MEZ 
 com.openexchange.groupware.update.tasks.GenconfAttributesStringsAddUuidUpdateTask                   true       2014-02-02 11:37:48 MEZ 
 [...]

How to see all schemas?

The database schemas are not exposed via OX tooling. You need to read them from the configdb. In principle you're looking for db_schema entries from the context_server2db_pool table and join that with corresponding lines from the db_pool table to get the database instance the schema is living on.

Sample query (works in general as of writing this documentation, may break on schema updates in future):

SELECT d.db_pool_id,
       SUBSTRING(hp.host_port, 1, LOCATE(':', hp.host_port)-1) AS host,
       SUBSTRING(hp.host_port, LOCATE(':', hp.host_port)+1) AS port,
       a.db_schema
FROM db_pool d
INNER JOIN (SELECT write_db_pool_id, db_schema FROM context_server2db_pool GROUP BY db_schema) AS a
        ON d.db_pool_id=a.write_db_pool_id
INNER JOIN (SELECT d.db_pool_id as id, REPLACE(SUBSTRING(d.url, 1, LOCATE('/?', d.url)-1), 'jdbc:mysql://', '') AS host_port FROM db_pool d) AS hp
        ON d.db_pool_id=hp.id
ORDER BY d.db_pool_id, CAST(SUBSTRING(db_schema, LOCATE('_', db_schema)+1) AS UNSIGNED);

If you also want to see login name and password for the given db hosts, add d.login, d.password to the fields returned by the query.

On a lab machine with 10 schemas on one DB instance the output looks like this:

 +------------+---------+------+-----------+
 | db_pool_id | host    | port | db_schema |
 +------------+---------+------+-----------+
 |          3 | glb.lan | 5507 | oxdb_5    |
 |          3 | glb.lan | 5507 | oxdb_6    |
 |          3 | glb.lan | 5507 | oxdb_7    |
 |          3 | glb.lan | 5507 | oxdb_8    |
 |          3 | glb.lan | 5507 | oxdb_9    |
 |          3 | glb.lan | 5507 | oxdb_10   |
 |          3 | glb.lan | 5507 | oxdb_11   |
 |          3 | glb.lan | 5507 | oxdb_12   |
 |          3 | glb.lan | 5507 | oxdb_13   |
 |          3 | glb.lan | 5507 | oxdb_14   |
 +------------+---------+------+-----------+

Troubleshooting

What if I have Update Tasks that are LOCKED?

If the command listExecutedUpdateTasks lists tasks that have the word LOCKED in the taskName column, these tasks could not be completed. This usually happens when Open-Xchange is stopped while update tasks are still running.

Do NOT stop Open-Xchange while Update Tasks are running!

If this has happened, you need to remove these locks manually. To do that, delete the rows which have taskName set to LOCKED from the table updateTask in every affected schema.

mysql> DELETE FROM updateTask WHERE taskName='LOCKED';

If you have multiple schemas, you can list all schemas which contain that lock, e.g. using this command:

 for i in $(echo show databases | mysql -uopenexchange -psecret | grep oxdatabase); do \
      echo "select taskName from ${i}.updateTask where taskName=\"LOCKED\"" | \
      mysql -uopenexchange -psecret | grep LOCKED > /dev/null && echo "database $i has a LOCK"; done

Now, for each of these schemas, run the SQL query

mysql> DELETE FROM updateTask WHERE taskName='LOCKED';
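With many affected schemas, typing the statement per schema gets tedious. As a cautious sketch, the statements could be generated from a list of schema names and reviewed before they are executed. The function name lock_cleanup_sql and the file names in the usage note are hypothetical; the sketch only prints SQL and does not touch any database.

```shell
#!/bin/sh
# Sketch: read schema names (one per line) from stdin and print one
# DELETE statement per schema. Nothing is executed against the database.
lock_cleanup_sql() {
    while IFS= read -r schema; do
        [ -n "$schema" ] || continue     # ignore empty lines
        printf 'DELETE FROM `%s`.updateTask WHERE taskName='\''LOCKED'\'';\n' "$schema"
    done
}
```

For example, lock_cleanup_sql < locked-schemas.txt > cleanup.sql writes the statements to a file which can be inspected and then fed to the mysql client.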

If you're still having issues with locked tasks (check /var/log/open-xchange/ for logs), you might also need to clear the lock in the version table in the corresponding schema(s).

mysql> SELECT * from version;
+---------+--------+---------------+------------------+--------------+
| version | locked | gw_compatible | admin_compatible | server       |
+---------+--------+---------------+------------------+--------------+
|     200 |      1 |             1 |                1 | oxserver     |
+---------+--------+---------------+------------------+--------------+

If locked is set to 1, run

mysql> UPDATE version SET locked=0;

What if I have Update Tasks that are in status false?

If the command listExecutedUpdateTasks lists tasks that have the word false in the successful column,

taskName successful lastModified
com.openexchange.groupware.update.tasks.PrgDatesPrimaryKeyUpdateTask false 2013-11-19 17:16:32 CET

a possible reason is that database servers died under the high IO load of updating multiple schemas concurrently. To solve this problem, run the command /opt/open-xchange/sbin/forceupdatetask on the affected schema.

To prevent this situation, we recommend running the update tasks during low-traffic times, e.g. at night, on a machine that is not available to customers.
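To find such tasks quickly in long listings, the output can be filtered on the second column. This is a small sketch, assuming the output format matches the sample above (whitespace-separated columns, task names without spaces). The function name failed_tasks is made up for illustration.

```shell
#!/bin/sh
# Sketch: read listExecutedUpdateTasks output from stdin and print the
# names of all tasks whose "successful" column (second field) is false.
failed_tasks() {
    awk '$2 == "false" { print $1 }'
}
```

A possible invocation is /opt/open-xchange/sbin/listExecutedUpdateTasks -n oxdatabase_5 | failed_tasks, which prints one task name per line and nothing if all tasks succeeded.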