# gerritcp

A utility to copy changesets from one gerrit instance to another.

## usage

    gerritcp -dir [dir] -config [config.yaml]

`dir` specifies a directory to keep the git repo in (because it's usually too
large to keep in RAM); `config.yaml` points to the configuration for upstream,
downstream and work streams.

## Theory of operation

The tool is built primarily for the use case of downstreaming coreboot into
Chromium OS. If feasible, extension of its scope to other scenarios is welcome.

The tool must:

* copy submitted changes from upstream to reviewable change sets
  in downstream;
* mask out changes to a configured set of files;
* keep a configured set of files around that exists only in downstream;
* provide multiple work streams with well-defined semantics so that changes
  to a given subdirectory (originally: util/crossgcc) can be handled
  separately from everything else;
* un-abandon change sets in downstream if they would be revived by some
  transaction of this tool;
* skip change sets that are marked WIP in downstream;
* leave already downstreamed change sets alone, e.g. if they have been
  reordered.

gerritcp keeps a bare git repo for tracking both upstream and downstream.
Its new commits are created by directly adding objects to the git repo so
there's no current checked-out tree that can get out of sync. It works its
way backward in the source branch, collecting information about which work
stream a change belongs to and whether it is already present in downstream
and usable. As soon as all work streams have a root change to work from,
gerritcp works forward through the collected changes, creating new change
sets and immediately pushing them to downstream gerrit (marking them
unabandoned as necessary).

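The backward scan and forward replay described above can be pictured with a
small Python sketch. This is only an illustration under simplifying
assumptions, not gerritcp's actual code: the `Change` record, its fields and
the `plan` function are hypothetical, and each change is assumed to belong to
exactly one work stream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    change_id: str        # hypothetical identifier for an upstream change
    stream: str           # work stream the change was assigned to
    in_downstream: bool   # already present in downstream and usable


def plan(newest_first, streams):
    """Scan backward until every stream has a root, then order work forward."""
    roots = {}
    pending = []
    for change in newest_first:
        if change.stream in roots:
            continue  # older than this stream's root, nothing to do
        if change.in_downstream:
            roots[change.stream] = change.change_id
        else:
            pending.append(change)
        if all(stream in roots for stream in streams):
            break
    pending.reverse()  # apply oldest first
    return roots, pending
```
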
A change is "usable" if it is not marked WIP and if it contains metadata
of upstream's change submission process. WIP allows skipping changes, even
the top-of-patchtree change, in case they need to be put aside for a while,
e.g. when waiting for an accompanying fix. Checking for upstream's metadata
ensures that the downstream change in question is a downstreamed change:
there are sometimes changes that have been pushed (and reviewed) downstream
first and then put upstream for submission when ready. These should be
integrated with the history, but they shouldn't derail other upstream changes.

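As a rough illustration, the usability check could look like the Python
sketch below. It assumes that upstream's submission process leaves a
`Reviewed-on:` footer pointing at the upstream gerrit in the commit message;
that footer, the `is_usable` name and the example URL are assumptions made for
this sketch, not something this README specifies.

```python
import re

# Assumption: a submitted upstream change carries a "Reviewed-on: <url>" footer.
UPSTREAM_FOOTER = re.compile(r"^Reviewed-on: (\S+)$", re.MULTILINE)


def is_usable(message, wip, upstream_url):
    """Return True if a downstream change is not WIP and came from upstream."""
    if wip:
        return False  # WIP changes are skipped on purpose
    urls = UPSTREAM_FOOTER.findall(message)
    return any(url.startswith(upstream_url) for url in urls)
```
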
When applying a change to downstream, several things need to be taken care of:

- Since it's possible that changes have been taken out of downstream's patch
  train, gerritcp can't simply adopt the top-level `tree` object of the
  upstream commit. Even files can't be adopted 1:1 but need to be merged
  because there might be changes to skip.
- Since downstream handles a few files differently, these might have to be
  skipped entirely, by dropping the downstream object ids into the upstream-ish
  `tree` objects that result from the prior step.
- For work streams, the right parent commit needs to be chosen: it must be
  the work stream specific parent (if that one isn't merged already) or
  the upstream parent (in other cases) to account for situations in which
  commits in the work stream touch files outside its direct responsibility.

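The second item in the list above, putting downstream's object ids back into
the upstream-ish tree, can be sketched as follows. This is a simplified
Python model that treats a tree as a flat mapping from path to object id
(real git trees are nested objects); `overlay_protected` and the `protected`
path list are hypothetical names for this sketch.

```python
def overlay_protected(upstream_tree, downstream_tree, protected):
    """Build a tree that takes protected paths from downstream, the rest from upstream."""

    def is_protected(path):
        # A path is protected if it equals an entry or lies below it.
        return any(
            path == entry or path.startswith(entry.rstrip("/") + "/")
            for entry in protected
        )

    # Drop upstream's version of every protected path ...
    result = {p: oid for p, oid in upstream_tree.items() if not is_protected(p)}
    # ... and keep downstream's version, including downstream-only files.
    result.update({p: oid for p, oid in downstream_tree.items() if is_protected(p)})
    return result
```
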
Pushes to downstream need to be authenticated and there needs to be enough
access to gerrit to be able to unabandon changes, while pulls from upstream
can be anonymous. Future additions such as watching upstream's event stream
might require authentication on the upstream side as well.

## Caching opportunities

There are fewer caching opportunities than one might think:

Every run to push changes to downstream needs to figure out anew the root
changes to apply work streams to because downstream might have changed:
commits might have been submitted, patches might have been marked WIP or
reordered, ...

However, even when starting from scratch, if not much happened between
two runs, the scan needn't go deep into upstream's history (and therefore
doesn't have to mess with downstream a whole lot). The root change should be
found pretty soon. For smaller work streams there might be nothing to do at
all (if no commit matching its specification has been collected while going
back until a common point in time has been found).

## Limited work streams

With `onlyIfTouching` configured, finding the right root change to work from
is slightly more involved than just going back through git history. Once
the old commit that needs to be processed has been found (as determined by
gerrit metadata in commit messages and commits lining up in upstream and
downstream), the commit to use as parent needs to be identified.

For this, the next-oldest commit touching any of the `onlyIfTouching`
paths needs to be determined, which can be done locally
(`git log $those $paths` on upstream history). The next thing to determine is
whether that commit is already submitted downstream. If not, it becomes the
parent commit for the work stream. Otherwise the upstream's parent commit (no
matter if it's touching any of these files or not) becomes the downstream
parent change.

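The parent selection described above could be sketched like this in Python.
It is a minimal model, not gerritcp's implementation: `choose_parent`, the
shape of `ancestors` and the `submitted_downstream` set are hypothetical, and
falling back to the upstream parent when no ancestor touches the configured
paths is an assumption of this sketch.

```python
def touches(paths, prefixes):
    """True if any path equals one of the prefixes or lies below it."""
    return any(
        path == prefix or path.startswith(prefix.rstrip("/") + "/")
        for path in paths
        for prefix in prefixes
    )


def choose_parent(ancestors, only_if_touching, submitted_downstream):
    """Pick the downstream parent for a change in a limited work stream.

    ancestors: non-empty list of (commit_id, touched_paths), nearest first,
    so ancestors[0] is the upstream parent of the change being applied.
    """
    upstream_parent = ancestors[0][0]
    for commit_id, touched in ancestors:
        if touches(touched, only_if_touching):
            if commit_id not in submitted_downstream:
                return commit_id  # work stream specific parent
            break  # already submitted downstream: use the upstream parent
    return upstream_parent
```
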
## Configuration format

Configuration is stored in a single file in YAML format:

```yaml
sites:
  upstream|downstream:
    url: string
    repo: string
    branch: string
    authentication: none|cookie
    cookieName: string
    cookieVal: string

workstreams:
  {name}:
    onlyIfTouching:
      - string # paths
      - ...
    neverModify:
      - string # paths
      - ...
```

There's a template [coreboot.org ->
chromiumos configuration](chromium-coreboot.yaml) in this repo.

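To catch mistakes early, a configuration that has already been parsed into
nested dicts can be checked against the format above. The Python sketch below
is an illustration only: `validate_config` is a hypothetical helper, and which
keys are mandatory (here `url`, `repo` and `branch`, plus the cookie fields
when `authentication` is `cookie`) as well as the default of `none` are
assumptions, since the format description does not say.

```python
def validate_config(cfg):
    """Return a list of problems found in an already-parsed configuration dict."""
    errors = []
    sites = cfg.get("sites") or {}
    for side in ("upstream", "downstream"):
        site = sites.get(side)
        if not isinstance(site, dict):
            errors.append(f"sites.{side} is missing")
            continue
        for key in ("url", "repo", "branch"):
            if not isinstance(site.get(key), str):
                errors.append(f"sites.{side}.{key} must be a string")
        # Assumption: a missing authentication key means "none".
        auth = site.get("authentication", "none")
        if auth not in ("none", "cookie"):
            errors.append(f"sites.{side}.authentication must be none or cookie")
        elif auth == "cookie":
            for key in ("cookieName", "cookieVal"):
                if not isinstance(site.get(key), str):
                    errors.append(
                        f"sites.{side}.{key} is required for cookie authentication"
                    )
    for name, stream in (cfg.get("workstreams") or {}).items():
        for key in ("onlyIfTouching", "neverModify"):
            paths = (stream or {}).get(key, [])
            if not isinstance(paths, list) or not all(
                isinstance(p, str) for p in paths
            ):
                errors.append(f"workstreams.{name}.{key} must be a list of paths")
    return errors
```
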