Reference
This page describes project-level structure and conventions in the full repository.
Use this page after any of the numbered options.
For package internals and reusable package reference material, use:
planktonclas: https://github.com/woutdecrop/planktonclas
Package entry points
planktonclas.api: DEEPaaS-facing API layer. Handles metadata, schema generation, training dispatch, model loading, file validation, and prediction formatting.
planktonclas.train_runfile: Direct training runner. Creates output directories, builds generators, trains the TensorFlow model, stores metrics, saves checkpoints, and optionally evaluates a test split.
planktonclas.config: Loads the packaged default config template or a user-provided project config.yaml, validates values, and exposes the flattened configuration dictionary used across the package.
planktonclas.paths: Central path resolver for images, models, checkpoints, logs, stats, and predictions.
planktonclas.report_utils: Generates evaluation plots and summary files in the timestamped results/ directory.
planktonclas.test_utils: Inference helpers for crop-based prediction and top-k accuracy computation.
planktonclas.visualization: Visualization and explainability utilities, including saliency-related helpers used by the notebooks.
Configuration map
The runtime configuration is grouped in the active config.yaml under:
general
model
dataset
training
monitor
augmentation
testing
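As a quick sanity check, the group names above can be compared against a loaded configuration. The sketch below is only an illustration: it assumes the configuration dictionary is keyed by group name with one nested dictionary per group, which this page implies but does not spell out. The helper `missing_groups` and the sample dictionary are hypothetical and not part of planktonclas.

```python
# Top-level groups named in this section.
EXPECTED_GROUPS = (
    "general", "model", "dataset", "training",
    "monitor", "augmentation", "testing",
)


def missing_groups(conf):
    """Return the expected group names that are absent from a config dict."""
    return [name for name in EXPECTED_GROUPS if name not in conf]


# Hypothetical stand-in for a loaded configuration; a real one comes from
# the project's config.yaml and holds many more keys per group.
example_conf = {
    "general": {"images_directory": "data/images"},
    "model": {},
    "dataset": {},
    "training": {},
    "monitor": {},
    "augmentation": {},
}

print(missing_groups(example_conf))  # ['testing']
```

In a real project the dictionary returned by `config.get_conf_dict()` could be passed instead of `example_conf`, provided it has the assumed shape.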
Important conventions
images are read from general.images_directory
if data/dataset_files/ is empty, training can generate split files automatically from the image-folder structure
if you provide custom split files, classes.txt and train.txt are the minimum expected files under data/dataset_files/
outputs are organized by training timestamp under models/<timestamp>/
training with test evaluation saves both prediction JSON files and a compact metrics JSON under models/<timestamp>/predictions/
inference defaults to the latest available trained timestamp
new local training runs save their final exported model as final_model.keras, while the legacy pretrained Phytoplankton_EfficientNetV2B0 model still uses final_model.h5
planktonclas report suggests the most recent timestamp when --timestamp is omitted and can prompt for another run by number
planktonclas report defaults to quick mode and only generates the subfolder threshold plots in full mode
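Two of these conventions are easy to check from a script: the minimum split files, and which final-model file a run contains. The following is a minimal sketch using only the standard library. The helper names are hypothetical and not part of planktonclas. Because this page does not state where inside a run directory the final model is stored, the directory is passed in explicitly.

```python
from pathlib import Path

# File names taken from the conventions listed above.
REQUIRED_SPLIT_FILES = ("classes.txt", "train.txt")
MODEL_FILENAMES = ("final_model.keras", "final_model.h5")


def missing_split_files(dataset_files_dir):
    """Return the required split files that are not present in the directory."""
    folder = Path(dataset_files_dir)
    return [name for name in REQUIRED_SPLIT_FILES if not (folder / name).is_file()]


def find_final_model(search_dir):
    """Return the first final-model file found in search_dir, or None.

    The newer .keras export is checked before the legacy .h5 file.
    """
    folder = Path(search_dir)
    for name in MODEL_FILENAMES:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None
```

For example, `missing_split_files("data/dataset_files")` returns an empty list when both files are present. An empty dataset_files directory is also valid according to the conventions above, since split files can then be generated automatically.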
Practical usage after a model is created
Once a model has been trained through the command-line, API, or notebook workflow, you can also interact with it directly from Python.
Typical things you may want to do are:
load a project config
load a trained model from a specific timestamp
predict one image from Python
inspect where the package is writing model outputs
Load the project config
from planktonclas import config

# Point the package at the project config, then read the resulting dictionary.
config.set_config_path("my_project/config.yaml")
conf = config.get_conf_dict()
Load a trained model
from planktonclas import config, paths
from planktonclas.api import load_inference_model

# Make the path helpers use the project configuration.
config.set_config_path("my_project/config.yaml")
paths.CONF = config.get_conf_dict()

# Load a specific checkpoint from a specific training run.
load_inference_model(
    timestamp="2026-03-26_120000",
    ckpt_name="best_model.keras",
)
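If you want to pick the timestamp yourself rather than hard-code it, the run directories can be listed and sorted. This sketch is not the package's own resolution logic. It assumes every run directory under the models folder is named like the timestamp in the example, so that alphabetical order matches chronological order. The helper `latest_timestamp` is hypothetical.

```python
import tempfile
from pathlib import Path


def latest_timestamp(models_dir):
    """Return the name of the most recent run directory, or None if there is none.

    Assumes run directories are named like "2026-03-26_120000", so that
    sorting the names alphabetically also sorts them by time.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return None
    runs = sorted(p.name for p in root.iterdir() if p.is_dir())
    return runs[-1] if runs else None


# Demo on a throwaway directory containing two made-up run folders.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("2026-03-25_090000", "2026-03-26_120000"):
        (Path(tmp) / name).mkdir()
    newest = latest_timestamp(tmp)

print(newest)  # 2026-03-26_120000
```

The returned name could then be passed as the `timestamp` argument shown earlier.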
Predict one image from Python
from planktonclas import config, paths, api, test_utils

config.set_config_path("my_project/config.yaml")
paths.CONF = config.get_conf_dict()

# With no arguments, the latest available trained timestamp is used.
api.load_inference_model()
conf = config.conf_dict

# Ask for the five most likely classes for a single local image.
labels, probabilities = test_utils.predict(
    model=api.model,
    X=["/absolute/path/to/image.png"],
    conf=conf,
    top_K=5,
    filemode="local",
    merge=False,
    use_multiprocessing=False,
)
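The returned values usually need to be turned into readable class names. The sketch below assumes that `labels` holds integer class indices and `probabilities` the matching scores, with one row per image, and that class names are stored one per line in classes.txt. None of this is confirmed on this page, so check the actual return values before relying on it. The helper `top_k_table` and the sample values are hypothetical.

```python
def top_k_table(labels_row, probabilities_row, class_names):
    """Pair each predicted class index with its name and probability."""
    return [
        (class_names[int(idx)], float(prob))
        for idx, prob in zip(labels_row, probabilities_row)
    ]


# Made-up values for one image; real ones would come from the predict call
# and from the project's classes.txt file.
class_names = ["Chaetoceros", "Noctiluca", "Phaeocystis"]
labels_row = [2, 0, 1]
probabilities_row = [0.81, 0.15, 0.04]

for name, prob in top_k_table(labels_row, probabilities_row, class_names):
    print(f"{name}: {prob:.2f}")
```

With real output, `labels[0]` and `probabilities[0]` would take the place of the sample rows, under the shape assumption stated above.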
Inspect output locations
from planktonclas import config, paths

config.set_config_path("my_project/config.yaml")
paths.CONF = config.get_conf_dict()

# Print the directories the package resolves for the active configuration.
print(paths.get_models_dir())
print(paths.get_checkpoints_dir())
print(paths.get_logs_dir())
print(paths.get_predictions_dir())
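Once the predictions directory is known, the JSON files saved there can be read with the standard library. This is a minimal sketch. The helper `load_prediction_jsons` is hypothetical, and no particular file names or JSON layout are assumed, because this page does not specify them.

```python
import json
from pathlib import Path


def load_prediction_jsons(predictions_dir):
    """Load every *.json file in a predictions directory, keyed by file name."""
    folder = Path(predictions_dir)
    if not folder.is_dir():
        return {}
    results = {}
    for path in sorted(folder.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            results[path.name] = json.load(handle)
    return results
```

Assuming `paths.get_predictions_dir()` returns a directory path, its result could be passed straight to this helper to see which prediction and metrics files a run produced.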
Source files
For the implementation details, start with these files in the repository:
planktonclas/api.py
planktonclas/train_runfile.py
planktonclas/config.py
planktonclas/paths.py
planktonclas/test_utils.py