MSR repository size utility

You can use the repository_size.py Python 3 utility to determine the size of your MSR repositories. The tool supports both basic size queries and simple size queries:

  • Basic size queries return the total size of a repository, the portion shared with other repositories, and the portion unique to the repository itself.

  • Simple size queries return only the total size of a repository, with no breakdown into the portion shared with other repositories and the portion unique to the repository itself.

Set up the Python 3 utility tool

  1. Create a new Python 3 virtual environment:

    ~ python3 -m venv myenv
    
  2. Activate the virtual environment:

    ~ source myenv/bin/activate
    
  3. Create the requirements.txt file with the following content:

    certifi==2024.7.4
    charset-normalizer==3.3.2
    idna==3.7
    requests==2.32.3
    urllib3==2.2.2
    
  4. Install the requirements in the virtual environment:

    (myenv)   ~ pip install -r requirements.txt
    
  5. Run the tool with --help from within the virtual environment to confirm that it starts and to list its parameters:

    (myenv)   ~ python3 repository_size.py --help
    usage: repository_size.py [-h] [--host HOST] [--username USERNAME] [--password PASSWORD] [--page-size PAGE_SIZE]
                              [--simple [SIMPLE]] [--namespaces NAMESPACES] [--repositories REPOSITORIES] [--cacert CACERT]
                              [--insecure [INSECURE]] [--output OUTPUT] [--log-level LOG_LEVEL]
    
    Command line parameters

    Parameter     Description
    ------------  ------------------------------------------------------------
    host          The host and port to use for MSR access.
                  Default: 127.0.0.1:8443.
    username      The MSR username.
    password      The MSR password.
    page-size     Maximum number of results to return per API request.
                  Default: 10.
    simple        If set to True, only the total size of the repository is
                  fetched. If set to False, the total size of the repository
                  is fetched, as is the size that is unique to the repository,
                  and the size that is shared with other repositories through
                  common blobs.
    namespaces    List of comma-separated namespaces.
    repositories  List of comma-separated repositories.
    cacert        Path to the MSR CA certificate file.
    insecure      Use an insecure connection.
    output        Output the result to a JSON file, or to console if “-” is
                  provided.
    log-level     Log level.
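
The bracketed optional values in the usage line, such as [--simple [SIMPLE]], are what Python's argparse prints for a flag whose value may be omitted. The following is a minimal sketch of how such parameters could be declared, not the source of repository_size.py. The helper names str_to_bool, comma_list, and build_parser, the accepted True/False spellings, and the empty-list defaults are assumptions made for illustration; only the parameter names and the host and page-size defaults come from the table above.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse common true/false spellings (the accepted spellings are an assumption)."""
    lowered = value.strip().lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def comma_list(value: str) -> list:
    """Split a comma-separated string into a list, dropping empty items."""
    return [item for item in value.split(",") if item]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a subset of the documented parameters."""
    parser = argparse.ArgumentParser(prog="repository_size.py")
    parser.add_argument("--host", default="127.0.0.1:8443")
    parser.add_argument("--page-size", type=int, default=10)
    # nargs="?" makes the value optional: a bare --simple takes the const value.
    parser.add_argument("--simple", nargs="?", const=True, default=False,
                        type=str_to_bool)
    parser.add_argument("--insecure", nargs="?", const=True, default=False,
                        type=str_to_bool)
    parser.add_argument("--namespaces", type=comma_list, default=[])
    parser.add_argument("--repositories", type=comma_list, default=[])
    return parser


# Parse the same flags used in the simple size query example.
args = build_parser().parse_args([
    "--insecure",
    "--namespaces=admin",
    "--repositories=msr/msr-api,msr/msr-nginx",
    "--simple",
])
print(args.simple, args.insecure, args.namespaces, args.repositories)
```

Under these assumptions, both a bare --simple and an explicit --simple True select a simple size query, while omitting the flag leaves the basic query in effect.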

Query MSR repository size

To query the sizes of all repositories within MSR and output a summary to stdout:

(myenv)   ~ python3 repository_size.py --cacert=/path/to/msr/cacert.pem --output -

Example output:

[2024-06-25 00:04:55,132] - INFO - Fetching simple=False repository sizes from MSR (127.0.0.1:8443) with user (admin)
[2024-06-25 00:04:55,132] - INFO - Received 0 repositories from user input
[2024-06-25 00:04:55,132] - INFO - No repositories or namespaces provided, so getting ALL repositories from MSR
[2024-06-25 00:04:55,737] - INFO - Fetched 11 repositories from MSR. Now fetching size for each repository. This may take a while...
[2024-06-25 00:04:55,737] - INFO - Retrieving sizes for 11 repositories (duplicates removed). This may take a while...
[2024-06-25 00:04:56,011] - INFO - Fetched size for 'msr/msr-api' repository: (unique=42680869, shared=5338472, total=48019341)
[2024-06-25 00:04:56,288] - INFO - Fetched size for 'msr/msr-content-cache' repository: (unique=15473937, shared=5338278, total=20812215)
[2024-06-25 00:04:56,568] - INFO - Fetched size for 'msr/msr-garant' repository: (unique=26427185, shared=5338472, total=31765657)
[2024-06-25 00:04:56,847] - INFO - Fetched size for 'msr/msr-installer' repository: (unique=26337376, shared=5338472, total=31675848)
[2024-06-25 00:04:57,148] - INFO - Fetched size for 'msr/msr-jobrunner' repository: (unique=1261495926, shared=0, total=1261495926)
[2024-06-25 00:04:57,459] - INFO - Fetched size for 'msr/msr-nginx' repository: (unique=40166380, shared=5338472, total=45504852)
[2024-06-25 00:04:57,747] - INFO - Fetched size for 'msr/msr-notary-server' repository: (unique=5742530, shared=5338472, total=11081002)
[2024-06-25 00:04:58,023] - INFO - Fetched size for 'msr/msr-notary-signer' repository: (unique=5293658, shared=5338472, total=10632130)
[2024-06-25 00:04:58,303] - INFO - Fetched size for 'msr/msr-registry' repository: (unique=37647972, shared=5338472, total=42986444)
[2024-06-25 00:04:58,581] - INFO - Fetched size for 'admin/harbor-core-base' repository: (unique=18599503, shared=0, total=18599503)
[2024-06-25 00:04:58,859] - INFO - Fetched size for 'admin/ubuntu' repository: (unique=27207556, shared=0, total=27207556)
{
    "msr/msr-api": {
        "namespace": "msr",
        "name": "msr-api",
        "unique": 42680869,
        "shared": 5338472,
        "total": 48019341
    },
    "msr/msr-content-cache": {
        "namespace": "msr",
        "name": "msr-content-cache",
        "unique": 15473937,
        "shared": 5338278,
        "total": 20812215
    },
    "msr/msr-garant": {
        "namespace": "msr",
        "name": "msr-garant",
        "unique": 26427185,
        "shared": 5338472,
        "total": 31765657
    },
    "msr/msr-installer": {
        "namespace": "msr",
        "name": "msr-installer",
        "unique": 26337376,
        "shared": 5338472,
        "total": 31675848
    },
    "msr/msr-jobrunner": {
        "namespace": "msr",
        "name": "msr-jobrunner",
        "unique": 1261495926,
        "shared": 0,
        "total": 1261495926
    },
    "msr/msr-nginx": {
        "namespace": "msr",
        "name": "msr-nginx",
        "unique": 40166380,
        "shared": 5338472,
        "total": 45504852
    },
    "msr/msr-notary-server": {
        "namespace": "msr",
        "name": "msr-notary-server",
        "unique": 5742530,
        "shared": 5338472,
        "total": 11081002
    },
    "msr/msr-notary-signer": {
        "namespace": "msr",
        "name": "msr-notary-signer",
        "unique": 5293658,
        "shared": 5338472,
        "total": 10632130
    },
    "msr/msr-registry": {
        "namespace": "msr",
        "name": "msr-registry",
        "unique": 37647972,
        "shared": 5338472,
        "total": 42986444
    },
    "admin/harbor-core-base": {
        "namespace": "admin",
        "name": "harbor-core-base",
        "unique": 18599503,
        "shared": 0,
        "total": 18599503
    },
    "admin/ubuntu": {
        "namespace": "admin",
        "name": "ubuntu",
        "unique": 27207556,
        "shared": 0,
        "total": 27207556
    }
}
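
In the example output above, each repository's total equals the sum of its unique and shared values (for msr/msr-api, 42680869 + 5338472 = 48019341). The following sketch shows one way to check that relation and to compute the shared fraction after loading the JSON. It embeds two entries copied from the example rather than reading a file, and the function names find_mismatches and shared_fraction are hypothetical helpers, not part of the tool.

```python
import json

# Two entries copied from the example output; a real run would load the
# JSON written by the tool instead.
report_text = """
{
    "msr/msr-api": {
        "namespace": "msr", "name": "msr-api",
        "unique": 42680869, "shared": 5338472, "total": 48019341
    },
    "msr/msr-jobrunner": {
        "namespace": "msr", "name": "msr-jobrunner",
        "unique": 1261495926, "shared": 0, "total": 1261495926
    }
}
"""


def find_mismatches(report: dict) -> list:
    """Return repository names where unique + shared differs from total."""
    return [
        name
        for name, sizes in report.items()
        if sizes["unique"] + sizes["shared"] != sizes["total"]
    ]


def shared_fraction(sizes: dict) -> float:
    """Fraction of a repository's total size that is shared with others."""
    return sizes["shared"] / sizes["total"] if sizes["total"] else 0.0


report = json.loads(report_text)
print("mismatches:", find_mismatches(report))
for name, sizes in report.items():
    print(f"{name}: {shared_fraction(sizes):.1%} shared")
```

Adding the total values of several repositories together may count shared content more than once, so such a sum is best read as an upper bound rather than as the space actually consumed.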

To make a simple size query over an insecure connection for every repository in the listed namespaces plus the individually listed repositories:

(myenv)   ~ python3 repository_size.py --insecure --namespaces=admin --repositories=msr/msr-api,msr/msr-nginx --simple --output -

Example output:

[2024-06-25 00:06:15,567] - INFO - Fetching simple=True repository sizes from MSR (127.0.0.1:8443) with user (admin)
[2024-06-25 00:06:15,567] - INFO - Received 2 repositories from user input
[2024-06-25 00:06:15,855] - INFO - Fetched 2 repositories in 'admin' namespace
[2024-06-25 00:06:15,856] - INFO - Retrieving sizes for 4 repositories (duplicates removed). This may take a while...
[2024-06-25 00:06:16,124] - INFO - Fetched size for 'msr/msr-api' repository: (unique=0, shared=0, total=48019341)
[2024-06-25 00:06:16,426] - INFO - Fetched size for 'msr/msr-nginx' repository: (unique=0, shared=0, total=45504852)
[2024-06-25 00:06:16,699] - INFO - Fetched size for 'admin/harbor-core-base' repository: (unique=0, shared=0, total=18599503)
[2024-06-25 00:06:16,964] - INFO - Fetched size for 'admin/ubuntu' repository: (unique=0, shared=0, total=27207556)
{
    "msr/msr-api": {
        "namespace": "msr",
        "name": "msr-api",
        "total": 48019341
    },
    "msr/msr-nginx": {
        "namespace": "msr",
        "name": "msr-nginx",
        "total": 45504852
    },
    "admin/harbor-core-base": {
        "namespace": "admin",
        "name": "harbor-core-base",
        "total": 18599503
    },
    "admin/ubuntu": {
        "namespace": "admin",
        "name": "ubuntu",
        "total": 27207556
    }
}
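
In the simple query example, the log lines report unique=0 and shared=0, while the JSON omits those keys altogether, so code that consumes the output should not assume they are present. The following sketch ranks the repositories from the example by total size and prints the totals in binary units. It assumes the sizes are byte counts, which this page does not state, and the helper names format_size and largest_first are hypothetical.

```python
import json

# Entries copied from the simple query example output.
simple_report_text = """
{
    "msr/msr-api": {"namespace": "msr", "name": "msr-api", "total": 48019341},
    "msr/msr-nginx": {"namespace": "msr", "name": "msr-nginx", "total": 45504852},
    "admin/harbor-core-base": {"namespace": "admin", "name": "harbor-core-base", "total": 18599503},
    "admin/ubuntu": {"namespace": "admin", "name": "ubuntu", "total": 27207556}
}
"""


def format_size(num_bytes: int) -> str:
    """Render a size in binary units, assuming the value is a byte count."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GiB"


def largest_first(report: dict) -> list:
    """Return (name, sizes) pairs sorted by total size, largest first."""
    return sorted(report.items(), key=lambda item: item[1]["total"], reverse=True)


report = json.loads(simple_report_text)
for name, sizes in largest_first(report):
    # "unique" and "shared" are absent in simple output, so use .get().
    breakdown = "yes" if sizes.get("unique") is not None else "no"
    print(f"{name}: {format_size(sizes['total'])} (breakdown available: {breakdown})")
```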