Merge changes from topic "notedb-2.15-prep"

* changes:
  Rename dev-note-db.txt to note-db.txt
  dev-note-db.txt: Document trial mode
  Move non-migration-related NoteDb config options to regular docs
  Revamp NoteDb docs to describe supported migration process
  InitExperimental: Tone down NoteDb warnings
  MigrateToNoteDb: Default to non-trial mode
  Daemon: Unmark --migrate-to-note-db as experimental
Dave Borowitz 2017-08-16 16:22:14 +00:00 committed by Gerrit Code Review
commit 5169864d68
8 changed files with 205 additions and 193 deletions


@ -3286,8 +3286,8 @@ Common examples:
=== Section noteDb
NoteDb is the next generation of Gerrit storage backend, currently powering
`googlesource.com`. It is not (yet) recommended for general use, but if you want
to learn more, see the link:dev-note-db.html[developer documentation].
`googlesource.com`. For more information, including how to migrate your data,
see the link:note-db.html[documentation].
[[notedb.accounts.sequenceBatchSize]]notedb.accounts.sequenceBatchSize::
+
@ -3302,6 +3302,26 @@ each process retrieves at once.
+
By default, 1.
[[noteDb.retryMaxWait]]noteDb.retryMaxWait::
+
Maximum time to wait between attempts to retry update operations when one
attempt fails due to contention (aka lock failure) on the underlying ref
storage. Operations are retried with exponential backoff, plus some random
jitter, until the interval reaches this limit. After that, retries continue to
occur after a fixed timeout (plus jitter), up to
link:#noteDb.retryTimeout[`noteDb.retryTimeout`].
+
Defaults to 5 seconds; unit suffixes are supported, and milliseconds are assumed
if no unit is specified.
[[noteDb.retryTimeout]]noteDb.retryTimeout::
+
Total timeout for retrying update operations when one attempt fails due to
contention (aka lock failure) on the underlying ref storage.
+
Defaults to 20 seconds; unit suffixes are supported, and milliseconds are assumed
if no unit is specified.
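For illustration only, both retry options could be set together in
`gerrit.config` as sketched below. The values simply restate the documented
defaults, and the spelled-out `seconds` suffix is assumed to be one of the
accepted unit suffixes.

----
[noteDb]
  # Cap on the backoff interval between retry attempts (documented default).
  # Assumption: "seconds" is accepted as a unit suffix.
  retryMaxWait = 5 seconds
  # Total time budget for retrying one update (documented default).
  retryTimeout = 20 seconds
----

With these values, the interval between attempts grows until it reaches 5
seconds, and retrying stops once 20 seconds have elapsed in total.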
[[oauth]]
=== Section oauth


@ -1,179 +0,0 @@
= Gerrit Code Review - NoteDb Backend
NoteDb is the next generation of Gerrit storage backend, which replaces the
traditional SQL backend for change and account metadata with storing data in the
same repository as code changes.
.Advantages
- *Simplicity*: All data is stored in one location in the site directory, rather
than being split between the site directory and a possibly external database
server.
- *Consistency*: Replication and backups can use a snapshot of the Git
repository refs, which will include both the branch and patch set refs, and
the change metadata that points to them.
- *Auditability*: Rather than storing mutable rows in a database, modifications
to changes are stored as a sequence of Git commits, automatically preserving
history of the metadata. +
There are no strict guarantees, and meta refs may be rewritten, but the
default assumption is that all operations are logged.
- *Extensibility*: Plugin developers can add new fields to metadata without the
core database schema having to know about them.
- *New features*: Enables simple federation between Gerrit servers, as well as
offline code review and interoperation with other tools.
== Current Status
- Storing change metadata is fully implemented in master, and is live on the
servers behind `googlesource.com`. In other words, if you use
link:https://gerrit-review.googlesource.com/[gerrit-review], you're already
using NoteDb. +
- Storing some account data, e.g. user preferences, is implemented in releases
back to 2.13.
- Storing the rest of account data is a work in progress.
- Storing group data is a work in progress.
To use a configuration similar to the current state of `googlesource.com`, paste
the following config snippet in your `gerrit.config`:
----
[noteDb "changes"]
write = true
read = true
primaryStorage = NOTE_DB
disableReviewDb = true
----
This is the configuration that will be produced if you enable experimental
NoteDb support on a new site with `init`.
`googlesource.com` additionally uses `fuseUpdates = true`, because its repo
backend supports atomic multi-ref transactions. Native JGit doesn't, so setting
this option on a dev server would fail.
For an example NoteDb change, poke around at this one:
----
git fetch https://gerrit.googlesource.com/gerrit refs/changes/70/98070/meta \
&& git log -p FETCH_HEAD
----
== Configuration
Account and group data is migrated to NoteDb automatically using the normal
schema upgrade process during updates. The remainder of this section details the
configuration options that control migration of the change data, which is mostly
but not fully implemented.
Change migration state is configured in `gerrit.config` with options like
`noteDb.changes.*`. These options are undocumented outside of this file, and the
general approach has been to add one new option for each phase of the migration.
Assume that each config option in the following list requires all of the
previous options, unless otherwise noted.
- `noteDb.changes.write=true`: During a ReviewDb write, the state of the change
in NoteDb is written to the `note_db_state` field in the `Change` entity.
After the ReviewDb write, this state is written into NoteDb, resulting in
effectively double the time for write operations. NoteDb write errors are
dropped on the floor, and no attempt is made to read from ReviewDb or correct
errors (without additional configuration, below). +
This state allows for a rolling update in a multi-master setting, where some
servers can start reading from NoteDb, but older servers are still reading
only from ReviewDb.
- `noteDb.changes.read=true`: Change data is written
to and read from NoteDb, but ReviewDb is still the source of truth. During
reads, first read the change from ReviewDb, and compare its `note_db_state`
with what is in NoteDb. If it doesn't match, immediately "auto-rebuild" the
change, copying data from ReviewDb to NoteDb and returning the result.
- `noteDb.changes.primaryStorage=NOTE_DB`: New changes are written only to
NoteDb, but changes whose primary storage is ReviewDb are still supported.
Continues to read from ReviewDb first as in the previous stage, but if the
change is not in ReviewDb, falls back to reading from NoteDb. +
Migration of existing changes is described in the link:#migration[Migration]
section below. +
Due to an implementation detail, writes to Changes or related tables still
result in write calls to the database layer, but they are inside a transaction
that is always rolled back.
- `noteDb.changes.disableReviewDb=true`: All access to Changes or related tables
is disabled; reads return no results, and writes are no-ops. Assumes the state
of all changes in NoteDb is accurate, and so is only safe once all changes are
NoteDb primary. Otherwise, reading changes only from NoteDb might result in
inaccurate results, and writing to NoteDb would compound the problem. +
Thus it is up to an admin of a previously-ReviewDb site to ensure
MigratePrimaryStorage has been run for all changes. Note that the current
implementation of the `migrate-to-note-db` program does not do this. +
In this phase, it would be possible to delete the Changes tables out from
under a running server with no effect.
- `noteDb.changes.fuseUpdates=true`: Code and meta updates within a single
repository are fused into a single atomic `BatchRefUpdate`, so they either
all succeed or all fail. This mode is only possible on a backend that
supports atomic ref updates, which notably excludes the default file repository
backend.
[[migration]]
== Migration
Once configuration options are set, migration to NoteDb is primarily
accomplished by running the `migrate-to-note-db` program. Currently, this program
bulk copies ReviewDb data into NoteDb, but leaves primary storage of these
changes in ReviewDb, so the site is runnable with
`noteDb.changes.{write,read}=true`, but ReviewDb is still required.
Eventually, `migrate-to-note-db` will set primary storage to NoteDb for all
changes by default, so a site will be able to stop using ReviewDb for changes
immediately after a successful run.
There is code in `PrimaryStorageMigrator.java` to migrate individual changes
from NoteDb primary to ReviewDb primary. This code is not intended to be used
except in the event of a critical bug in NoteDb primary changes in production.
It will likely never be used by `migrate-to-note-db`, and in fact it's not
recommended to run `migrate-to-note-db` until the code is stable enough that the
reverse migration won't be necessary.
=== Zero-Downtime Multi-Master Migration
Single-master Gerrit sites can use `migrate-to-note-db` on an offline site to
rebuild NoteDb, but this doesn't work in a zero-downtime environment like
googlesource.com.
Here, the migration process looks like:
- Turn on `noteDb.changes.write=true` to start writing to NoteDb.
- Run a parallel link:https://research.google.com/pubs/pub35650.html[FlumeJava]
pipeline to write NoteDb data for all changes, and update all `note_db_state`
fields. (Sorry, this implementation is entirely closed-source.)
- Turn on `noteDb.changes.read=true` to start reading from NoteDb.
- Turn on `noteDb.changes.primaryStorage=NOTE_DB` to start writing new changes
to NoteDb only.
- Run a Flume to migrate all existing changes to NoteDb primary. (Also
closed-source, but basically just a wrapper around `PrimaryStorageMigrator`.)
- Turn off access to ReviewDb changes tables.
== Configuration
This section describes general configuration for the NoteDb backend that is not
directly related to the migration process. These options will continue to be
supported after the migration is complete and the migration-related options are
obsolete.
[[noteDb.retryMaxWait]]noteDb.retryMaxWait::
+
Maximum time to wait between attempts to retry update operations when one
attempt fails due to contention (aka lock failure) on the underlying ref
storage. Operations are retried with exponential backoff, plus some random
jitter, until the interval reaches this limit. After that, retries continue to
occur after a fixed timeout (plus jitter), up to
link:#noteDb.retryTimeout[`noteDb.retryTimeout`].
+
Only applicable when `noteDb.changes.fuseUpdates=true`.
+
Defaults to 5 seconds; unit suffixes are supported, and milliseconds are assumed
if no unit is specified.
[[noteDb.retryTimeout]]noteDb.retryTimeout::
+
Total timeout for retrying update operations when one attempt fails due to
contention (aka lock failure) on the underlying ref storage.
+
Only applicable when `noteDb.changes.fuseUpdates=true`.
+
Defaults to 20 seconds; unit suffixes are supported, and milliseconds are assumed
if no unit is specified.


@ -66,6 +66,7 @@
. link:config-reverseproxy.html[Reverse Proxy]
. link:config-auto-site-initialization.html[Automatic Site Initialization on Startup]
. link:pgm-index.html[Server Side Administrative Tools]
. link:note-db.html[NoteDb]
== Developer
. Getting Started
@ -82,7 +83,6 @@
.. link:dev-stars.html[Starring Changes]
. link:dev-design.html[System Design]
. link:i18n-readme.html[i18n Support]
. link:dev-note-db.html[NoteDb]
== Maintainer
. link:dev-release.html[Making a Gerrit Release]

Documentation/note-db.txt

@ -0,0 +1,173 @@
= Gerrit Code Review - NoteDb Backend
NoteDb is the next generation of Gerrit storage backend, which replaces the
traditional SQL backend for change and account metadata with storing data in the
same repository as code changes.
.Advantages
- *Simplicity*: All data is stored in one location in the site directory, rather
than being split between the site directory and a possibly external database
server.
- *Consistency*: Replication and backups can use a snapshot of the Git
repository refs, which will include both the branch and patch set refs, and
the change metadata that points to them.
- *Auditability*: Rather than storing mutable rows in a database, modifications
to changes are stored as a sequence of Git commits, automatically preserving
history of the metadata. +
There are no strict guarantees, and meta refs may be rewritten, but the
default assumption is that all operations are logged.
- *Extensibility*: Plugin developers can add new fields to metadata without the
core database schema having to know about them.
- *New features*: Enables simple federation between Gerrit servers, as well as
offline code review and interoperation with other tools.
== Current Status
- Storing change metadata is fully implemented in the 2.15 release. Admins may
use an link:#offline-migration[offline] or link:#online-migration[online] tool
to migrate change data from ReviewDb.
- Storing account data is fully implemented in the 2.15 release. Account data is
migrated automatically during the upgrade process by running `gerrit.war
init`.
- Account and change metadata on the servers behind `googlesource.com` is fully
migrated to NoteDb. In other words, if you use
link:https://gerrit-review.googlesource.com/[gerrit-review], you're already
using NoteDb.
For an example NoteDb change, poke around at this one:
----
git fetch https://gerrit.googlesource.com/gerrit refs/changes/70/98070/meta \
&& git log -p FETCH_HEAD
----
== Future Work ("Gerrit 3.0")
- Storing group data is a work in progress. Like account data, it will be
migrated automatically.
- NoteDb will be the only database format supported by Gerrit 3.0. The offline
change data migration tool will be included in Gerrit 3.0, but online
migration will only be available in the 2.x line.
[[migration]]
== Migration
Migrating change metadata can take a long time for large sites, so
administrators can choose whether to do the migration offline or online,
depending on their available resources and tolerance for downtime.
Only change metadata requires manual steps to migrate it from ReviewDb; account
and group data is migrated automatically by `gerrit.war init`.
[[online-migration]]
=== Online
To start the online migration, set the `noteDb.changes.autoMigrate` option in
`gerrit.config` and restart Gerrit:
----
[noteDb "changes"]
autoMigrate = true
----
Alternatively, pass the `--migrate-to-note-db` flag to
`gerrit.war daemon`:
----
java -jar gerrit.war daemon -d /path/to/site --migrate-to-note-db
----
Both ways of starting the online migration are equivalent. Once started, it is
safe to restart the server at any time; the migration will pick up where it left
off. Migration progress will be reported to the Gerrit logs.
*Advantages*
* No downtime required.
*Disadvantages*
* Only available in 2.x; will not be available in Gerrit 3.0.
* Much slower than offline; uses only a single thread, to leave resources
available for serving traffic.
* Performance may be degraded, particularly of updates; data needs to be written
to both ReviewDb and NoteDb while the migration is in progress.
[[offline-migration]]
=== Offline
To run the offline migration, run the `migrate-to-note-db` program:
----
java -jar gerrit.war migrate-to-note-db /path/to/site
----
Once started, it is safe to cancel and restart the migration process, or to
switch to the online process.
*Advantages*
* Much faster than online; can use all available CPUs, since no live traffic
needs to be served.
* No degraded performance of live servers due to writing data to 2 locations.
* Available in both Gerrit 2.x and 3.0.
*Disadvantages*
* May require substantial downtime; takes about twice as long as an
link:pgm-reindex.html[offline reindex]. (In fact, one of the migration steps is a
full reindex, so it can't possibly take less time.)
[[trial-migration]]
==== Trial mode
The offline migration tool also supports "trial mode", where changes are
migrated to NoteDb and read from NoteDb at runtime, but their primary storage
location is still ReviewDb, and data is kept in sync between the two locations.
To run the offline migration in trial mode, add `--trial` to
`migrate-to-note-db`:
----
java -jar gerrit.war migrate-to-note-db --trial /path/to/site
----
There are several use cases for trial mode:
* Help test early releases of the migration tool for bugs with lower risk.
* Try out new NoteDb-only features like
link:rest-api-changes.html#get-hashtags[hashtags] without running the full
migration.
To continue with the full migration after running the trial migration, use
either the online or offline migration steps as normal. To revert to
ReviewDb-only, remove `noteDb.changes.read` and `noteDb.changes.write` from
`gerrit.config` and restart Gerrit.
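As a rough illustration, a site left in trial mode would be expected to carry
only the read and write options in `gerrit.config`. The exact contents written
by the tool are an assumption here, inferred from the option descriptions in the
Configuration section:

----
# Assumed state after a --trial run; the tool's actual output may differ.
[noteDb "changes"]
  write = true
  read = true
----

Reverting to ReviewDb-only then amounts to deleting these two options (or the
whole section, if it contains nothing else) before restarting Gerrit.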
== Configuration
The migration process works by setting a configuration option in `gerrit.config`
for each step in the process, then performing the corresponding data migration.
In general, users should not set these options manually; this section serves
primarily as a reference.
- `noteDb.changes.write=true`: During a ReviewDb write, the state of the change
in NoteDb is written to the `note_db_state` field in the `Change` entity.
After the ReviewDb write, this state is written into NoteDb, resulting in
effectively double the time for write operations. NoteDb write errors are
dropped on the floor, and no attempt is made to read from ReviewDb or correct
errors (without additional configuration, below).
- `noteDb.changes.read=true`: Change data is written to and read from NoteDb,
but ReviewDb is still the source of truth. During
reads, first read the change from ReviewDb, and compare its `note_db_state`
with what is in NoteDb. If it doesn't match, immediately "auto-rebuild" the
change, copying data from ReviewDb to NoteDb and returning the result.
- `noteDb.changes.primaryStorage=NOTE_DB`: New changes are written only to
NoteDb, but changes whose primary storage is ReviewDb are still supported.
Continues to read from ReviewDb first as in the previous stage, but if the
change is not in ReviewDb, falls back to reading from NoteDb. +
Migration of existing changes is described in the link:#migration[Migration]
section above. +
Due to an implementation detail, writes to Changes or related tables still
result in write calls to the database layer, but they are inside a transaction
that is always rolled back.
- `noteDb.changes.disableReviewDb=true`: All access to Changes or related tables
is disabled; reads return no results, and writes are no-ops. Assumes the state
of all changes in NoteDb is accurate, and so is only safe once all changes are
NoteDb primary. Otherwise, reading changes only from NoteDb might result in
inaccurate results, and writing to NoteDb would compound the problem.


@ -76,7 +76,7 @@ public class StandaloneNoteDbMigrationIT extends StandaloneSiteTest {
assertNotesMigrationState(NotesMigrationState.REVIEW_DB);
setUpOneChange();
migrate();
migrate("--trial");
assertNotesMigrationState(NotesMigrationState.READ_WRITE_NO_SEQUENCE);
try (ServerContext ctx = startServer()) {
@ -104,7 +104,7 @@ public class StandaloneNoteDbMigrationIT extends StandaloneSiteTest {
assertNotesMigrationState(NotesMigrationState.REVIEW_DB);
setUpOneChange();
migrate("--trial", "false");
migrate();
assertNotesMigrationState(NotesMigrationState.NOTE_DB);
try (ServerContext ctx = startServer()) {
@ -142,7 +142,7 @@ public class StandaloneNoteDbMigrationIT extends StandaloneSiteTest {
status.save();
assertServerStartupFails();
migrate("--trial", "false");
migrate();
assertNotesMigrationState(NotesMigrationState.NOTE_DB);
status = new GerritIndexStatus(sitePaths);


@ -170,7 +170,7 @@ public class Daemon extends SiteProgram {
@Option(
name = "--migrate-to-note-db",
usage = "(EXPERIMENTAL) Automatically migrate changes to NoteDb",
usage = "Automatically migrate changes to NoteDb",
handler = ExplicitBooleanOptionHandler.class
)
private boolean migrateToNoteDb;


@ -74,10 +74,9 @@ public class MigrateToNoteDb extends SiteProgram {
name = "--trial",
usage =
"trial mode: migrate changes and turn on reading from NoteDb, but leave ReviewDb as"
+ " the source of truth",
handler = ExplicitBooleanOptionHandler.class
+ " the source of truth"
)
private boolean trial = true; // TODO(dborowitz): Default to false in 3.0.
private boolean trial;
@Option(
name = "--sequence-gap",


@ -58,10 +58,9 @@ class InitExperimental implements InitStep {
private void initNoteDb() {
ui.message(
"Use experimental NoteDb for change metadata?\n"
+ " NoteDb is not recommended for production servers."
+ " Please familiarize yourself with the documentation:\n"
+ " https://gerrit-review.googlesource.com/Documentation/dev-note-db.html\n");
"Use NoteDb for change metadata?\n"
+ " See documentation:\n"
+ " https://gerrit-review.googlesource.com/Documentation/note-db.html\n");
if (!ui.yesno(false, "Enable")) {
return;
}