After familiarizing ourselves with the basics and configuration of gocfl, the first practical step in an OCFL-based archive is the creation of a Storage Root.
The Storage Root is the top level of your archive. It is the container in which all OCFL objects are stored. A storage root contains:
0=ocfl_1.1 (or the marker for another version): marks the directory as an OCFL storage root.
ocfl_layout.json: defines how objects are structured and named within the root.
extensions/: folder for specific functional extensions.

According to the OCFL Specification on Root Structure, clear rules apply to a storage root to ensure long-term interpretability:
0=ocfl_1.1 (the version part varies with the OCFL version): the conformance declaration identifying the path as an OCFL Storage Root.
ocfl_layout.json: Recommended to define the mapping of object IDs to directory paths.
extensions/: A directory for extension configurations (see OCFL Storage Root Extensions).
Directories: Only the extensions folder (see OCFL Storage Root Extensions) and the directory structures for OCFL objects are permitted. Other folders are not allowed.
Other files: Additional files (e.g., the .md files created by gocfl) may be present as long as they do not collide with reserved names or object directories. They serve as self-documentation.

gocfl init

With gocfl, you initialize a new storage root via the init subcommand.
gocfl init [path to storage root] [flags]
Flags for init:

--ocfl-version: Sets the specification version (default: 1.1).
-d, --digest: Defines the default hash algorithm (e.g., sha512 or sha256).
--default-storageroot-extensions: Path to a folder with configurations for extensions to be loaded initially. This extension template is crucial for the automatic configuration of the storage root (see below).

In this example, we initialize a storage root using a specific configuration file and a target directory:
gocfl --config ./gocfl/config/gocfl.toml init ./gocfl/temp/test42/
--config ./gocfl/config/gocfl.toml: gocfl loads settings from the specified file. Especially important here is the entry under [Init] pointing to the extension template:
[Init]
StorageRootExtensions = "/home/ocfl/gocfl/config/extensions/storageroot"
init ./gocfl/temp/test42/: Creates the necessary OCFL structure files in the directory ./gocfl/temp/test42/ and copies the extension configurations and documentation defined in the template into the storage root.

If you look into the directory after the command, you will see a structure that goes beyond the minimal OCFL specification, as gocfl creates useful additional information and configurations:
test42/
├── 0=ocfl_1.1 # Storage root marker (OCFL Version 1.1)
├── ocfl_layout.json # Definition of the path layout
├── ocfl_spec_1.1.md # The OCFL specification as a reference
├── extensions/ # Configurations for enabled extensions
│ ├── 0004-hashed-n-tuple-storage-layout/
│ ├── initial/
│ └── NNNN-gocfl-extension-manager/
├── 00XX-*.md # Documentation for standard extensions
├── NNNN-*.md # Documentation for the used GOCFL extensions
└── initial.md # Documentation of the initial configuration
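A listing like this can be checked against the top-level rules for a storage root. The following is a minimal sketch in Python, not part of gocfl: the helper name check_storage_root is hypothetical, and it only looks at the marker file, ocfl_layout.json, and the extensions entry. It does not validate objects or extension configurations.

```python
import json
from pathlib import Path


def check_storage_root(root: Path) -> list[str]:
    """Return a list of problems found at the top level of a storage root.

    Hypothetical helper for illustration; an empty list means that the
    three checks below passed, not that the root is fully valid.
    """
    problems = []

    # The version marker is a file whose name starts with "0=ocfl_".
    markers = [p.name for p in root.glob("0=ocfl_*") if p.is_file()]
    if len(markers) != 1:
        problems.append(
            f"expected exactly one 0=ocfl_* marker, found {len(markers)}"
        )

    # ocfl_layout.json is recommended, not required. If it exists, it
    # should be a JSON object that names the layout extension.
    layout = root / "ocfl_layout.json"
    if layout.is_file():
        try:
            data = json.loads(layout.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append("ocfl_layout.json is not valid JSON")
        else:
            if not isinstance(data, dict) or "extension" not in data:
                problems.append("ocfl_layout.json has no 'extension' key")

    # extensions is optional, but if present it must be a directory.
    ext = root / "extensions"
    if ext.exists() and not ext.is_dir():
        problems.append("extensions exists but is not a directory")

    return problems
```

Calling check_storage_root(Path("./gocfl/temp/test42")) after the init command should return an empty list if the top level looks as shown above; the additional .md files are ignored by this sketch.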
0=ocfl_1.1: This is the conformance declaration that the specification requires in every storage root; it declares the OCFL version of the root. The specification requires its content to be just the version string ocfl_1.1 followed by a newline, so the file serves only as a “name tag”.
ocfl_layout.json: This file is crucial for scalability. In our example, 0004-hashed-n-tuple-storage-layout is used. This means that object IDs (like ark:/12345/bcd987) are hashed, and the leading characters of the hash become nested subdirectories (e.g., a1b/2c3/d4e/...) so that no single directory has to hold too many folders.
ocfl_spec_1.1.md: This file contains the complete OCFL specification directly in the storage root. Thus, the root is not only self-contained (all data is present) but also self-describing, as the rules for access and interpretation are provided directly.
extensions/: This folder holds the configuration files (config.json) for the extensions (see OCFL Storage Root Extensions).
  0004-... configures the layout mentioned above.
  initial determines which extension is loaded first (in our case, the NNNN-gocfl-extension-manager).
  NNNN-gocfl-extension-manager is a gocfl-specific extension responsible solely for initializing the storage root.
.md files (e.g., ocfl_spec_1.1.md, 0001-*.md and NNNN-*.md): During initialization, the Extension Manager copies both the complete OCFL specification and an extensive collection of descriptions for the available extensions (such as 0004-...md, NNNN-indexer.md or NNNN-migration.md) directly into the storage root. This consistently implements the principle of self-documentation: anyone accessing this medium in the future will find not only the general specification but also technical explanations for all functions used in the archive right on-site, without depending on external websites.
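The mapping performed by the 0004-hashed-n-tuple-storage-layout extension can be sketched in a few lines of Python. This is a minimal illustration, not gocfl's implementation: the function name hashed_ntuple_path is hypothetical, and the parameter values (sha256, tuples of 3 characters, 3 tuples, full digest as the object directory name) are assumed here as the defaults documented for the extension. The config.json in a given storage root may use other values.

```python
import hashlib


def hashed_ntuple_path(
    object_id: str,
    digest_algorithm: str = "sha256",   # assumed default of extension 0004
    tuple_size: int = 3,                # characters per directory level
    number_of_tuples: int = 3,          # number of directory levels
    short_object_root: bool = False,    # True: drop the tuple prefix from the leaf
) -> str:
    """Map an object ID to a relative path below the storage root."""
    # Hash the UTF-8 bytes of the ID and use the lowercase hex digest.
    digest = hashlib.new(digest_algorithm, object_id.encode("utf-8")).hexdigest()

    # Cut the start of the digest into equally sized directory names.
    tuples = [
        digest[i * tuple_size:(i + 1) * tuple_size]
        for i in range(number_of_tuples)
    ]

    # The object directory is the full digest, or only the remainder
    # after the tuples when short_object_root is enabled.
    used = tuple_size * number_of_tuples
    leaf = digest[used:] if short_object_root else digest

    return "/".join(tuples + [leaf])
```

Calling hashed_ntuple_path("ark:/12345/bcd987") yields three 3-character directory levels followed by the full hex digest as the object directory. Because the path depends only on the ID and the layout parameters, any tool that knows the configuration can locate an object without an index.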