## lamindb.integrations.lightning

PyTorch Lightning integration for LaminDB.

The public API has two layers:

* "Checkpoint" is the concrete LaminDB implementation that persists
  checkpoint, config, and "hparams.yaml" files as "Artifact" objects
  and annotates them with "Feature" objects.

* "ArtifactPublishingModelCheckpoint" is the generic extension layer
  that adds checkpoint artifact lifecycle hooks without implementing
  Lamin persistence details itself.

External integrations can either subclass "Checkpoint" directly or
attach an "ArtifactObserver" to react to saved and removed artifacts.

For a walkthrough, see the guide: Lightning.

# Main API

class lamindb.integrations.lightning.Checkpoint(dirpath=None, *, features=None, monitor=None, verbose=False, save_last=None, save_top_k=1, save_weights_only=False, mode='min', auto_insert_metric_name=True, every_n_train_steps=None, train_time_interval=None, every_n_epochs=None, save_on_train_epoch_end=None, enable_version_counter=True, run_uid_is_version=True, artifact_observers=None)

 A "ModelCheckpoint" that annotates PyTorch Lightning checkpoints.

 Extends Lightning's "ModelCheckpoint" with artifact creation and
 feature annotation. Each checkpoint is a separate artifact whose
 key is derived from either the explicit "dirpath" or the trainer's
 logger configuration.

 When "dirpath" is omitted (recommended), Lightning decides where to
 store checkpoints locally (typically
 "lightning_logs/version_N/checkpoints/") and the artifact key is
 derived from the logger's "save_dir", "name", and "version".  When
 "dirpath" is provided, it is used directly as the key prefix.

 All artifacts are scoped under a single **base prefix**.
 Checkpoints (and "hparams.yaml") live under "{base}/checkpoints/";
 other artifacts (e.g. "config.yaml") live directly under "{base}/".

 Base prefix derivation (highest priority first):

 1. "dirpath" provided → "{dirpath}" (logger is ignored for key
 purposes)

 2. "dirpath" omitted, logger present →
 "{save_dir_basename}/{name}/{version}"

 3. "dirpath" omitted, no logger → empty

 When "run_uid_is_version" is "True" (the default) and a Lamin run
 context is active, the run UID is incorporated into the base
 prefix:

 * Case 1/3: the run UID is appended as an extra path segment (e.g.
 "my/dir/{run_uid}", or just "{run_uid}").

 * Case 2: the logger's auto-incremented "version" is *replaced* by
 the run UID ("{save_dir_basename}/{name}/{run_uid}").

 Resulting key layout (with run UID active):

 {base}/checkpoints/epoch=0-step=100.ckpt
 {base}/checkpoints/hparams.yaml
 {base}/config.yaml
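 The derivation rules above can be sketched as a plain function. This is
 illustrative only: "derive_base_prefix" and its argument names are
 hypothetical, not part of the integration's API.

 ```python
 import posixpath


 def derive_base_prefix(
     dirpath=None,
     logger_save_dir=None,
     logger_name=None,
     logger_version=None,
     run_uid=None,
 ):
     """Sketch of the three derivation cases plus run-UID scoping."""
     if dirpath is not None:
         # Case 1: dirpath wins; the run UID is appended as an extra segment.
         base = dirpath.rstrip("/")
         return f"{base}/{run_uid}" if run_uid else base
     if logger_save_dir is not None:
         # Case 2: derived from the logger; the run UID replaces the version.
         version = run_uid if run_uid else logger_version
         return f"{posixpath.basename(logger_save_dir)}/{logger_name}/{version}"
     # Case 3: no dirpath, no logger -> empty (or just the run UID).
     return run_uid or ""
 ```

 For example, with a "CSVLogger(save_dir='logs')" and no run context this
 yields "logs/lightning_logs/version_0"; with an active run it yields
 "logs/lightning_logs/{run_uid}".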

 If the corresponding features have been registered in the database via
 "save_lightning_features()", the following "lamindb.lightning"
 features are automatically tracked:

 * Artifact-level: "is_best_model", "is_last_model", "score",
 "model_rank", "save_weights_only", "monitor", "mode"

 * Run-level: "logger_name", "logger_version", "max_epochs",
 "max_steps", "precision", "accumulate_grad_batches",
 "gradient_clip_val", "monitor", "mode"

 Additionally, model hyperparameters (from "pl_module.hparams") and
 datamodule hyperparameters (from "trainer.datamodule.hparams") are
 captured if corresponding features exist.

 This is the concrete LaminDB implementation built on top of
 "ArtifactPublishingModelCheckpoint". Use it when you want LaminDB
 to be the persistence layer. For secondary systems such as ClearML,
 prefer attaching an "ArtifactObserver" or subclassing "Checkpoint"
 and reacting in "on_artifact_saved()".
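 A secondary-system observer might look like the sketch below. The
 "on_artifact_saved" / "on_artifact_removed" method names mirror the
 callback hooks documented further down, and "FakeSavedEvent" merely
 stands in for "ArtifactSavedEvent" (same field names) so the sketch is
 self-contained; the real "ArtifactObserver" protocol may differ.

 ```python
 from dataclasses import dataclass
 from typing import Any


 @dataclass
 class FakeSavedEvent:
     """Stand-in for ArtifactSavedEvent with the same field names."""

     kind: str
     key: str
     local_path: str
     trainer: Any
     artifact: Any
     storage_uri: str


 class RegistryObserver:
     """Forward saved-artifact URIs to a secondary registry (e.g. ClearML)."""

     def __init__(self):
         self.registered = []

     def on_artifact_saved(self, event):
         # storage_uri is the stable hand-off value for external registries.
         self.registered.append((event.key, event.storage_uri))

     def on_artifact_removed(self, event):
         # Only checkpoint files are removed; configs/hparams are save-only.
         pass
 ```

 Instances would then be passed via
 "Checkpoint(artifact_observers=[RegistryObserver()])".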

 Parameters:
 * **dirpath** (_PATH | None, default: "None") -- Directory for
 checkpoints.  When provided, also used as the artifact key
 prefix.  When omitted (recommended), Lightning picks the local
 directory and the key prefix is derived from the logger.

 * **features** (dict[Literal['run', 'artifact'], dict[str, Any]] |
 None, default: "None") -- Features to annotate runs and
 artifacts. Use "run" key for run-level features (static
 metadata). Use "artifact" key for artifact-level features
 (values can be static or None for auto-population from trainer
 metrics/attributes).

 * **monitor** (str | None, default: "None") -- Quantity to
 monitor for saving best checkpoint.

 * **verbose** (bool, default: "False") -- Verbosity mode.

 * **save_last** (bool | None, default: "None") -- Save a copy of
 the last checkpoint.

 * **save_top_k** (int, default: "1") -- Number of best
 checkpoints to keep.

 * **save_weights_only** (bool, default: "False") -- Save only
 model weights (not optimizer state).

 * **mode** (Literal['min', 'max'], default: "'min'") -- One of
 "min" or "max" for monitor comparison.

 * **auto_insert_metric_name** (bool, default: "True") -- Include
 metric name in checkpoint filename.

 * **every_n_train_steps** (int | None, default: "None") --
 Checkpoint every N training steps.

 * **train_time_interval** (timedelta | None, default: "None") --
 Checkpoint at time intervals.

 * **every_n_epochs** (int | None, default: "None") -- Checkpoint
 every N epochs.

 * **save_on_train_epoch_end** (bool | None, default: "None") --
 Run checkpointing at end of training epoch.

 * **enable_version_counter** (bool, default: "True") -- Append
 version to filename to avoid collisions.

 * **run_uid_is_version** (bool, default: "True") -- When "True"
 (default) and a Lamin run context is active, incorporate the
 run UID into the base prefix. For the logger case the logger's
 auto-incremented version is replaced; for the dirpath and no-
 logger cases the run UID is appended as an extra path segment.
 Prevents cross-run key collisions.

 * **artifact_observers** (list[ArtifactObserver] | None,
 default: "None") -- Optional observer objects notified when
 checkpoint, config, or hparams artifacts are saved or when
 checkpoint files are removed locally. Observers follow
 "ArtifactObserver" and receive "ArtifactSavedEvent" and
 "ArtifactRemovedEvent".
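 The "features" mapping might look like the following. The feature names
 "experiment" and "val_loss" are illustrative, and this assumes the
 corresponding "Feature" records exist in the database.

 ```python
 # Illustrative "features" argument for Checkpoint(features=...).
 features = {
     # Static run-level metadata, annotated once on the run.
     "run": {"experiment": "ablation-3"},
     # Artifact-level features; a None value is auto-populated
     # from trainer metrics/attributes at save time.
     "artifact": {"val_loss": None},
 }
 ```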

 -[ Examples ]-

 Let Lightning decide where to store checkpoints (recommended):

 import lightning as pl
 import lamindb as ln
 from lightning.pytorch.loggers import CSVLogger
 from lamindb.integrations import lightning as ll

 ll.save_lightning_features()

 callback = ll.Checkpoint(monitor="val_loss", save_top_k=3)
 logger = CSVLogger(save_dir="logs")

 trainer = pl.Trainer(callbacks=[callback], logger=logger)
 trainer.fit(model, dataloader)

 # Query checkpoints — key prefix is derived from the logger
 # e.g. "logs/lightning_logs/version_0/checkpoints/"
 ln.Artifact.filter(key__startswith=callback.checkpoint_key_prefix)

 Explicit "dirpath" for full control over the artifact key prefix:

 callback = ll.Checkpoint(
     dirpath="deployments/my_model/",
     monitor="val_loss",
     save_top_k=3,
 )

 trainer = pl.Trainer(callbacks=[callback])
 trainer.fit(model, dataloader)

 # Query checkpoints
 ln.Artifact.filter(key__startswith=callback.checkpoint_key_prefix)

 Using the CLI:

 # config.yaml
 trainer:
   callbacks:
     - class_path: lamindb.integrations.lightning.Checkpoint
       init_args:
         monitor: val_loss
         save_top_k: 3

 # Run with:
 # python main.py fit --config config.yaml

 For more, see the guide: Lightning.

 property base_prefix: str

 The base artifact key prefix for all artifacts from this
 callback.

 Checkpoints live under "{base_prefix}/checkpoints/" and configs
 directly under "{base_prefix}/".

 Available after "setup()" has been called.

 property checkpoint_key_prefix: str

 The artifact key prefix used for checkpoint artifacts.

 Available after "setup()" has been called, for example once
 "trainer.fit()" has started.

 setup(trainer, pl_module, stage)

 Validate user features and detect available auto-features.

 Return type:
 "None"

 resolve_artifact_storage_uri(artifact)

 Resolve the physical artifact location for downstream
 registries.

 This is the stable abstraction external packages should use
 instead of reconstructing storage locations from Lamin
 internals.

 Return type:
 "str"

 resolve_artifact_key(trainer, filepath, kind)

 Return the Lamin artifact key for a checkpoint-related file.

 Return type:
 "str"

 save_checkpoint_artifact(trainer, filepath, *, feature_values=None)

 Save a checkpoint artifact to Lamin and emit the corresponding
 event.

 This is the main persistence hook used by "_save_checkpoint()".
 It is a useful override point for subclasses that want to
 augment Lamin persistence while keeping the generic lifecycle
 behavior from the base class.

 Return type:
 "Artifact"

 save_config_artifact(trainer, config_path)

 Save a Lightning CLI config artifact and emit the corresponding
 event.

 Config artifacts are routed through the same lifecycle surface
 as checkpoints so observers and subclasses see a unified event
 stream.

 Return type:
 "Artifact"

 save_hparams_artifact(trainer, hparams_path)

 Save Lightning's auto-generated hparams file and emit the event.

 Returns "None" if Lightning did not generate "hparams.yaml" for
 the current run.

 Return type:
 "Artifact" | "None"

lamindb.integrations.lightning.save_lightning_features()

 Save features to auto-track Lightning parameters and metrics.

 Creates the following features under the "lamindb.lightning"
 feature type if they do not already exist:

 Artifact-level features:

 * "is_best_model" (bool): Whether this checkpoint is the best
 model.

 * "is_last_model" (bool): Whether this checkpoint is the most
 recently saved model.

 * "score" (float): The monitored metric score.

 * "model_rank" (int): Rank among all checkpoints (0 = best).

 * "save_weights_only" (bool): Whether this checkpoint only stores
 model weights.

 * "monitor" (str): Metric name this checkpoint uses for comparison.

 * "mode" (str): Optimization mode ("min" or "max") used for
 checkpoint ranking.

 Run-level features:

 * "logger_name" (str): Name from the first Lightning logger.

 * "logger_version" (str): Version from the first Lightning logger.

 * "max_epochs" (int): Maximum number of epochs.

 * "max_steps" (int): Maximum number of training steps.

 * "precision" (str): Training precision (e.g., "32", "16-mixed",
 "bf16").

 * "accumulate_grad_batches" (int): Number of batches to accumulate
 gradients over.

 * "gradient_clip_val" (float): Gradient clipping value.

 * "monitor" (str): Metric name being monitored.

 * "mode" (str): Optimization mode ("min" or "max").

 Parameters:
 **None.**

 Return type:
 "None"

 -[ Example ]-

 Save the features to the database:

 from lamindb.integrations import lightning as ll

 ll.save_lightning_features()

# Auxiliary classes

class lamindb.integrations.lightning.ArtifactPublishingModelCheckpoint(*args, artifact_observers=None, **kwargs)

 ModelCheckpoint with observable artifact lifecycle hooks.

 This layer captures artifact kinds, observer registration,
 saved/removed events, latest artifact tracking, and key
 compatibility hooks. Concrete subclasses remain responsible for
 how artifacts are persisted.

 Subclasses are expected to implement:

 * "resolve_artifact_key()" to map local files to logical
 artifact keys

 * "resolve_artifact_storage_uri()" to expose a stable backend
 URI

 * "save_checkpoint_artifact()", "save_config_artifact()", and
 "save_hparams_artifact()" to persist files

 "SaveConfigCallback" only depends on this base class, which
 means a custom checkpoint callback can participate in config
 saving without inheriting from Lamin's concrete "Checkpoint".

 property last_checkpoint_artifact: Any | None

 The most recently saved checkpoint artifact handle.

 property last_config_artifact: Any | None

 The most recently saved config artifact handle.

 property last_hparams_artifact: Any | None

 The most recently saved hparams artifact handle.

 property last_artifact_event: ArtifactSavedEvent | ArtifactRemovedEvent | None

 The last artifact lifecycle event emitted by this callback.

 get_last_artifact(kind)

 Return the most recently saved artifact for a given artifact
 kind.

 Return type:
 "Any" | "None"

 add_artifact_observer(observer)

 Register an observer notified about artifact lifecycle events.

 Return type:
 "None"

 remove_artifact_observer(observer)

 Unregister a previously added artifact observer.

 Return type:
 "None"

 resolve_artifact_storage_uri(artifact)

 Resolve the physical location for a persisted artifact.

 Return type:
 "str"

 resolve_artifact_key(trainer, filepath, kind)

 Return the logical artifact key for a checkpoint-related file.

 Return type:
 "str"

 on_artifact_saved(event)

 Hook for subclasses after an artifact has been saved.

 Return type:
 "None"

 on_artifact_removed(event)

 Hook for subclasses after a checkpoint file has been removed.

 Return type:
 "None"

 save_checkpoint_artifact(trainer, filepath, *, feature_values=None)

 Persist a checkpoint artifact and emit the corresponding event.

 Return type:
 "Any"

 save_config_artifact(trainer, config_path)

 Persist a config artifact and emit the corresponding event.

 Return type:
 "Any"

 save_hparams_artifact(trainer, hparams_path)

 Persist an hparams artifact and emit the corresponding event.

 Return type:
 "Any" | "None"

class lamindb.integrations.lightning.SaveConfigCallback(*args, **kwargs)

 A "SaveConfigCallback" that also saves the resolved config to the
 LaminDB instance.

 Use with LightningCLI to save the resolved configuration file
 alongside checkpoints.

 The local config file is saved under "{save_dir}/{name}/{version}/"
 derived from the first logger, avoiding Lightning's
 "trainer.log_dir" which hardcodes an "isinstance" check for
 "TensorBoardLogger" / "CSVLogger" and silently changes the
 directory for other loggers.

 This callback looks for any "ArtifactPublishingModelCheckpoint",
 not just Lamin's concrete "Checkpoint". That keeps the config-save
 path aligned with custom subclasses built on the generic artifact-
 publishing base.

 Config artifacts are stored directly under the **base prefix** of
 the active "Checkpoint" callback.  The base prefix follows the same
 derivation rules as for checkpoints (dirpath > logger > empty), so
 configs are always co-located with their checkpoints:

 * "Checkpoint.dirpath" set → "{dirpath}/config.yaml"
 ("{dirpath}/{run_uid}/config.yaml" with run-UID scoping)

 * Logger present, no "dirpath" →
 "{save_dir_basename}/{name}/{version}/config.yaml"

 * Neither → "config.yaml" (or "{run_uid}/config.yaml" with run-UID
 scoping)
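 In code, the co-location rule amounts to joining "config.yaml" onto
 whatever base prefix is active. "config_artifact_key" is a hypothetical
 helper written for illustration, not real API.

 ```python
 def config_artifact_key(base_prefix):
     """Join config.yaml onto the active base prefix (empty base -> bare key)."""
     if not base_prefix:
         return "config.yaml"
     return f"{base_prefix.rstrip('/')}/config.yaml"
 ```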

 Example:

 from lightning.pytorch.cli import LightningCLI
 from lamindb.integrations import lightning as ll

 cli = LightningCLI(
     MyModel,
     MyDataModule,
     save_config_callback=ll.SaveConfigCallback,
 )

 setup(trainer, pl_module, stage)

 Save resolved configuration file alongside checkpoints.

 Return type:
 "None"

class lamindb.integrations.lightning.ArtifactSavedEvent(kind, key, local_path, trainer, artifact, storage_uri)

 Metadata emitted after a checkpoint-related artifact has been
 persisted.

 "artifact" is intentionally typed generically so downstream
 integrations can expose their own persisted object while still
 using the common lifecycle API. "storage_uri" is the stable hand-
 off value for registries such as ClearML.

class lamindb.integrations.lightning.ArtifactRemovedEvent(kind, key, local_path, trainer, artifact=None, storage_uri=None)

 Metadata emitted after a local checkpoint file has been removed.

 Removal currently applies to checkpoint files. Config and hparams
 artifacts are save-only in the current Lightning integration.

 artifact: "Any" | "None" = None

 storage_uri: "str" | "None" = None