OEP-14: Archiving edX GitHub Repositories

OEP OEP-14
Title Archiving edX GitHub Repositories
Last Modified 2017-01-18
Author Christina Roberts <christina@edx.org>, Feanil Patel <feanil@edx.org>
Arbiter Nimisha Asthagiri <nasthagiri@edx.org>
Status Accepted
Type Process
Created 2017-01-09
References ORA PR Discussion, Initial Archiving Discussions

Abstract

The edx organization contains a large number of repositories, most of which are active and maintained, but some of which are now obsolete. To clarify the status of repositories, a process for archiving a repository is defined below.

Motivation

Recently, openedx.yaml files were added to edX repositories per OEP-2. In the course of assigning owners to those repositories, there was an ORA PR Discussion about how best to handle deprecated or obsolete repositories. In particular, do obsolete repositories need owners, and how can repositories be clearly marked as kept for archive purposes only?

The discussion resurfaced in connection with edX’s use of Gemnasium to report third-party libraries that have known security issues. All repositories under the edx organization were being monitored, so the obsolete ones added noise when trying to understand how many third-party library updates the actively maintained repositories required.

Specification

When a repository under the edx organization is no longer in use and will therefore no longer be maintained, follow the steps below.

Transfer to New Owner if Interest

First, if the repository is public and part of Open edX releases, follow these steps to see if anyone would like to take ownership of it:

  1. Post a notice to Open edX Deprecation Announcements stating that the repository will be archived and asking whether anyone would like to take ownership of it. If there are no responses after 2 work days, skip to Archive Steps.
  2. If someone does wish to take ownership of the repository, email the internal edX developers mailing list to see if there are any objections. If there are no objections after 2 work days, post to Open edX Deprecation Announcements that the transfer will take place.
  3. Create an IT help ticket for the repository to be transferred to the new owner’s organization.
  4. Once the transfer has occurred, fork the transferred repository back into the edx organization and follow the Archive Steps below for the forked repository.
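
The waiting periods in steps 1 and 2 can be computed mechanically. The following is a minimal sketch, not part of the official process: it assumes that "work days" means Monday through Friday with no holiday calendar, and that the waiting period is over once two full work days have passed since the notice was posted. The function names are hypothetical.

```python
from datetime import date, timedelta

WAITING_WORK_DAYS = 2  # the "2 work days" named in steps 1 and 2


def add_work_days(start: date, work_days: int) -> date:
    """Return the date that falls `work_days` working days after `start`.

    Assumption: Monday-Friday are work days; holidays are ignored.
    """
    current = start
    remaining = work_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def waiting_period_over(posted_on: date, today: date) -> bool:
    """True once the waiting period after a notice has elapsed."""
    return today >= add_work_days(posted_on, WAITING_WORK_DAYS)
```

For example, under these assumptions a notice posted on Friday 2017-01-13 would allow the next step on Tuesday 2017-01-17 at the earliest.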

Archive Steps

  1. Update the README.rst file in the repository to state that it is archived, using the README Archive Statement below.
  2. Update the openedx.yaml file, creating it if necessary:
    • Add archived: True.
    • Remove the openedx-release key if it is present.
    • It is not necessary for the openedx.yaml file to define an owner for archived repos.
  3. Create an IT help ticket asking that the repository’s description be updated to begin with [ARCHIVED] and that the repository be archived using GitHub’s archive process.
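
The metadata changes in step 2 and the description change in step 3 can be sketched as plain data transformations. This is an illustrative sketch only: it assumes the openedx.yaml content has already been parsed into a dictionary (for instance with a YAML library), and the function names are hypothetical rather than part of any edX tooling.

```python
def archive_metadata(metadata: dict) -> dict:
    """Return a copy of parsed openedx.yaml data marked as archived."""
    updated = dict(metadata)
    updated["archived"] = True            # step 2: add "archived: True"
    updated.pop("openedx-release", None)  # step 2: remove the key if present
    # An existing owner entry is left alone; archived repos do not need one.
    return updated


def archived_description(description: str) -> str:
    """Return the repository description prefixed with [ARCHIVED]."""
    prefix = "[ARCHIVED]"
    if description.startswith(prefix):
        return description  # already marked, leave unchanged
    return f"{prefix} {description}".strip()
```

Passing an empty dictionary covers the case where the openedx.yaml file has to be created: the result contains only the archived flag.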

README Archive Statement

Include this statement in the README.rst file:

This repository has been archived and is no longer supported—use it at your own risk. This repository may depend on out-of-date libraries with security issues, and security updates will not be provided. Pull requests against this repository will also not be merged.
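
If the README update is scripted, it helps to make it safe to run more than once. The sketch below assumes the statement is placed at the very top of README.rst, which this OEP does not prescribe; the function name is hypothetical and the caller passes in the statement text given above.

```python
def add_archive_notice(readme_text: str, statement: str) -> str:
    """Prepend the archive statement to README text unless already present."""
    if statement in readme_text:
        return readme_text  # notice was added earlier; do nothing
    # A blank line keeps the statement a separate reStructuredText paragraph.
    return f"{statement}\n\n{readme_text}"
```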

For Repos that are Forks

If a repository that will no longer be maintained is a fork of an upstream repository outside the edx organization, it can be transferred to the edx-unsupported organization.

  1. If you have the permissions on GitHub, simply transfer the repo to the edx-unsupported organization. If you don’t have permissions, file an IT ticket to have it done.
  2. Once the repo is in the edx-unsupported organization, archive it.
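
For those who do have the permissions, the two steps above can also be carried out through the GitHub REST API, which offers a repository transfer endpoint and an `archived` field on the repository update endpoint. The sketch below only builds the requests and does not send them. The repository name, token, and helper names are placeholders, and nothing in this OEP requires using the API instead of the GitHub web interface.

```python
import json
from urllib.request import Request

API_ROOT = "https://api.github.com"


def _github_request(method: str, path: str, payload: dict, token: str) -> Request:
    """Build (but do not send) an authenticated GitHub REST API request."""
    return Request(
        url=f"{API_ROOT}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def transfer_request(owner: str, repo: str, new_owner: str, token: str) -> Request:
    """Step 1: transfer the fork to another organization."""
    return _github_request(
        "POST", f"/repos/{owner}/{repo}/transfer", {"new_owner": new_owner}, token
    )


def archive_request(owner: str, repo: str, token: str) -> Request:
    """Step 2: mark the repository as archived (read-only)."""
    return _github_request("PATCH", f"/repos/{owner}/{repo}", {"archived": True}, token)
```

Each returned request could then be sent with `urllib.request.urlopen`. Note that the archive request must address the repository under its new owner, since it runs after the transfer.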

We archive our original code in place, rather than transferring it as we do with forks, so that GitHub searches will still find code we authored. We don’t delete the forks because older, unsupported Open edX installations still need them.

Rationale

The proposed process leverages the already-existing archived flag in openedx.yaml. It does not require introducing a new organization that is maintained by edX, and the source code remains easily visible and searchable (see Rejected Alternatives).

Backward Compatibility

This proposal does not introduce any backward compatibility issues.

Reference Implementation

The Discussions Hackathon repository has been updated to conform to the Archive Steps.

Rejected Alternatives

A couple of variations of this proposal were originally discussed in Initial Archiving Discussions. Many of the steps for updating documentation and notifying the open source community are essentially the same; the major differences from the proposed process are outlined below.

Alternative 1: Transfer Repository

Transfer the obsolete repository to a new organization: edx-archived.

Note

We now use the edx-unsupported organization for forks that we no longer maintain.

Pros:

  • The edx organization is no longer littered with unsupported or obsolete repositories.
  • GitHub search results within the edx organization do not include matches in archived repositories. This could decrease confusion, especially since repo descriptions are not included in search results.
  • Gemnasium monitoring may cease automatically (although this would need to be confirmed).
  • Pattern followed by Facebook, and thus might be familiar to others.

Reasons rejected:

  • This creates another organization that edX must maintain and adds administrative overhead.
  • It could be difficult for people to find the code through search, though forwarding links would work for anyone who already linked to the repositories.

Alternative 2: Create Archive Branch

Move the code from the master branch to an archived branch, while leaving the repository itself within the edx organization.

Pros:

  • No need to create and maintain a new organization.
  • Gemnasium monitoring will cease automatically.
  • No help tickets to IT or DevOps are required.
  • This pattern was recommended on Anselm Hannemann’s blog, though it is not known how many organizations (if any) have adopted this process.

Reasons rejected:

  • It is non-intuitive and could leave developers confused about the state of the code, as cloning the repo or viewing it on GitHub would show an empty repository. (Note: this could possibly be improved by changing the default branch for the repository, but that might reintroduce the Gemnasium monitoring issue.)
  • It is unclear what the implications would be for any existing forks.

Change History

2017-01-18

  • Original publication

2017-05-23

  • Added steps for repositories that live in the edX org, but are forks of other, independent repositories

2019-05-16

  • Updated to use GitHub’s archive capability.
  • Don’t ask the community about public repos in the edx org that are not a part of Open edX.