I want to develop a solution to manage the dataset and also the development environment detection.
One solution I can think of is setup an environment like what wandb has done with WANDB_MODE.
class SummaryWriter(LightningLoggerBase):
def is_full_training(self):
return os.get("ENVIRONMENT") == "PRODUCTION"
The self-hosted runner also should be mount to a shared NFS, eg. DATA_PATH on our lab!