
gerritcp

A utility to copy changesets from one gerrit instance to another

usage

gerritcp -d [dir] -c [config.yaml]

dir specifies a directory to keep the git repo in (it is usually too large to keep in RAM); config.yaml points to the configuration for upstream, downstream and work streams.

Theory of operation

The tool is built primarily for the use case of downstreaming coreboot into Chromium OS. If feasible, extension of its scope to other scenarios is welcome.

The tool must:

  • copy submitted changes from upstream to reviewable change sets in downstream;
  • mask out changes to a configured set of files;
  • keep a configured set of files around that only exists in downstream;
  • provide multiple work streams with well-defined semantics so that changes to a given subdirectory (originally: util/crossgcc) can be handled separately from everything else;
  • un-abandon change sets in downstream if they would be revived by some transaction of this tool;
  • skip change sets that are marked WIP in downstream;
  • leave already downstreamed change sets alone, e.g. if they have been reordered.

gerritcp keeps a bare git repo for tracking both upstream and downstream. Its new commits are created by directly adding objects to the git repo so there's no current checked-out tree that can get out of sync. It works its way backward in the source branch, collecting information about which work stream a change belongs to and if it is already present in downstream and usable. As soon as all work streams have a root change to work from, gerritcp works forward through the collected changes, creating new change sets and immediately pushing them to downstream gerrit (marking them unabandoned as necessary).
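The backward-then-forward walk described above can be sketched roughly as follows. This is a minimal illustration, not gerritcp's actual code: the `Commit` type, its fields and the `plan` function are hypothetical, and the sketch assumes each change has already been assigned to exactly one work stream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    change_id: str
    stream: str               # work stream the change belongs to (assumed precomputed)
    usable_downstream: bool   # already present in downstream and usable


def plan(history_newest_first, streams):
    """Walk backward until every stream has a root, then order the rest forward."""
    roots = {}
    pending = {name: [] for name in streams}
    for commit in history_newest_first:
        # Ignore unknown streams and streams whose root is already known.
        if commit.stream not in pending or commit.stream in roots:
            continue
        if commit.usable_downstream:
            roots[commit.stream] = commit.change_id
        else:
            pending[commit.stream].append(commit)
        if len(roots) == len(pending):
            break  # all work streams have a root change to work from
    # Reverse the collected changes so they are applied oldest first.
    todo = {name: [c.change_id for c in reversed(cs)] for name, cs in pending.items()}
    return roots, todo
```

With a history in which only the newest change of each stream is missing downstream, the walk stops as soon as both roots are found and never looks at older commits.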

A change is "usable" if it is not marked WIP and if it contains metadata of upstream's change submission process. WIP allows skipping changes, even the top-of-patchtree change, in case they need to be put aside for a while, e.g. when waiting for an accompanying fix. Checking for upstream's metadata ensures that the downstream change in question is a downstreamed change. There are sometimes changes that have been pushed (and reviewed) downstream first and then put upstream for submission when ready. These should be integrated with the history, but they shouldn't derail other upstream changes.
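The usability rule can be expressed as a small predicate. The sketch below is an assumption-laden illustration: `is_usable` is a hypothetical helper, and treating a `Reviewed-on:` footer as the sign of upstream's submission process is a guess about what the metadata looks like, not something this README specifies.

```python
def is_usable(is_wip, commit_message, upstream_marker="Reviewed-on:"):
    """Return True if a downstream change may serve as a base for new changes.

    upstream_marker is an assumed footer that upstream's submission process
    adds to commit messages; adjust it to whatever upstream really emits.
    """
    if is_wip:
        return False  # WIP changes are deliberately set aside
    # Only changes carrying upstream's submission metadata count as downstreamed.
    return any(
        line.startswith(upstream_marker) for line in commit_message.splitlines()
    )
```

A change that was pushed downstream first has no such footer, so it is not treated as a downstreamed change.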

When applying a change to downstream, several things need to be taken care of:

  • Since it's possible that changes have been taken out of downstream's patch train, gerritcp can't simply adopt the toplevel tree object of the upstream commit. Even files can't be adopted 1:1 but need to be merged because there might be changes to skip.
  • Since downstream handles a few files differently, these might have to be skipped entirely, by dropping the downstream object ids into the upstream-ish tree objects that result from the prior step.
  • For work streams, the right parent commit needs to be chosen: It must be the work stream specific parent (if that one isn't merged already) or the upstream parent (in other cases) to account for situations in which commits in the work stream touch files outside its direct responsibility.
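The second point, dropping downstream object ids into the upstream-ish tree, might look like this when a tree is modelled as a flat mapping from path to object id. Everything here is a simplified sketch: `compose_tree` and `is_protected` are hypothetical names, real git trees are nested, and the per-file merging from the first point is not shown.

```python
def is_protected(path, never_modify):
    """True if path equals or lies below one of the protected paths."""
    return any(
        path == p.rstrip("/") or path.startswith(p.rstrip("/") + "/")
        for p in never_modify
    )


def compose_tree(upstream_tree, downstream_tree, never_modify):
    """Take upstream's tree, but keep downstream's version of protected paths."""
    result = {
        path: oid
        for path, oid in upstream_tree.items()
        if not is_protected(path, never_modify)
    }
    # Protected paths always come from downstream, including files that
    # exist only there.
    result.update(
        (path, oid)
        for path, oid in downstream_tree.items()
        if is_protected(path, never_modify)
    )
    return result
```

Because only object ids are swapped, no file contents need to be read or written for this step.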

Pushes to downstream need to be authenticated and there needs to be enough access to gerrit to be able to unabandon changes, while pulls from upstream can be anonymous. Future additions such as watching upstream's event stream might require authentication on the upstream side as well.

Caching opportunities

There are fewer caching opportunities than one might think:

Every run to push changes to downstream needs to newly figure out the root changes to apply work streams to because downstream might have changed: commits might have been submitted, patches might have been marked WIP or reordered, and so on.

However, even when starting from scratch, if not much happened between two runs, the scan needn't go deep into upstream's history (and therefore doesn't have to mess with downstream a whole lot). The root change should be found pretty soon. For smaller work streams there might be nothing to do at all (if no commit matching its specification has been collected while going back until a common point in time has been found).

Limited work streams

With onlyIfTouching configured, finding the right root change to work from is slightly more involved than just going back through git history. Once the old commit that needs to be processed has been found (as determined by gerrit metadata in commit messages and commits lining up in upstream and downstream), the commit to use as its parent needs to be identified.

For this, the next-oldest commit touching any of the onlyIfTouching paths needs to be determined, which can be done locally (git log $those $paths on upstream history). The next thing to determine is if that commit is already submitted downstream. If not, it becomes the parent commit for the work stream. Otherwise the upstream's parent commit (no matter if it's touching any of these files or not) becomes the downstream parent change.
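The parent selection rule can be sketched as follows, assuming a linear upstream history given oldest first as `(commit_id, touched_paths)` pairs. The names `pick_parent`, `touches` and `submitted_downstream` are hypothetical and only serve the illustration; the real tool would query git and gerrit instead.

```python
def touches(paths, prefixes):
    """True if any path equals or lies below any of the given prefixes."""
    return any(
        path == prefix.rstrip("/") or path.startswith(prefix.rstrip("/") + "/")
        for path in paths
        for prefix in prefixes
    )


def pick_parent(index, history, only_if_touching, submitted_downstream):
    """Choose the parent for history[index] within a limited work stream."""
    if index == 0:
        return None  # no older commit to build on
    # Find the next-oldest commit touching the work stream's paths.
    for commit_id, paths in reversed(history[:index]):
        if touches(paths, only_if_touching):
            if commit_id not in submitted_downstream:
                return commit_id  # work stream specific parent
            break  # already submitted downstream: fall back below
    # Otherwise use upstream's direct parent, whatever files it touches.
    return history[index - 1][0]
```

For example, with an older util/crossgcc commit that is still pending downstream, that commit is chosen; once it has been submitted, upstream's direct parent is used instead.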

Configuration format

Configuration is stored in a single file in yaml format:

sites:
  upstream|downstream:
    url: string
    repo: string
    branch: string
    authentication: none|cookie
    cookieName: string
    cookieVal: string

workstreams:
  {name}:
    onlyIfTouching:
      - string # paths
      - ...
    neverModify:
      - string # paths
      - ...

There's a template coreboot.org -> chromiumos configuration in this repo.
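A configuration that has already been parsed into a dictionary (for instance with a YAML library) could be checked against the format above roughly like this. The sketch is not part of gerritcp: `validate_config` is a hypothetical helper, and which keys are mandatory, as well as the default of `none` for authentication, are assumptions.

```python
ALLOWED_AUTH = {"none", "cookie"}


def validate_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    sites = config.get("sites") or {}
    for role in ("upstream", "downstream"):
        site = sites.get(role)
        if site is None:
            errors.append(f"sites.{role} is missing")
            continue
        for key in ("url", "repo", "branch"):
            if not isinstance(site.get(key), str):
                errors.append(f"sites.{role}.{key} must be a string")
        auth = site.get("authentication", "none")  # assumed default
        if auth not in ALLOWED_AUTH:
            errors.append(f"sites.{role}.authentication must be none or cookie")
        elif auth == "cookie":
            for key in ("cookieName", "cookieVal"):
                if not isinstance(site.get(key), str):
                    errors.append(f"sites.{role}.{key} is required for cookie auth")
    for name, stream in (config.get("workstreams") or {}).items():
        for key in ("onlyIfTouching", "neverModify"):
            paths = (stream or {}).get(key, [])
            if not all(isinstance(p, str) for p in paths):
                errors.append(f"workstreams.{name}.{key} must be a list of paths")
    return errors
```

Collecting all problems instead of stopping at the first one lets a user fix a configuration in a single pass.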