ML Training
Submit a training job
curl -sS -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  https://computalot.com/api/v1/jobs \
  -d '{
    "type": "structured_runner",
    "runner_command": ["python", "train.py"],
    "payload": {"epochs": 100, "batch_size": 32},
    "project": "my-ml-project",
    "timeout_s": 7200,
    "requirements": {"profile": "gpu"}
  }'

Report progress
import json, sys
for epoch in range(100):
    loss = train_one_epoch()
    print(f"COMPUTALOT_PROGRESS:{json.dumps({'epoch': epoch, 'loss': loss})}")
    sys.stdout.flush()

Save artifacts
import os, torch
model_path = os.path.join(os.environ['COMPUTALOT_ARTIFACT_DIR'], 'model.pt')
torch.save(model.state_dict(), model_path)
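
The progress and artifact snippets above can be combined into a single script skeleton. The following is a minimal sketch, not the platform's reference implementation: `artifact_path`, `report_progress`, `fake_train_one_epoch`, and `run` are hypothetical helper names, the stand-in training function replaces the real `train_one_epoch()`, and the fallback to a temporary directory assumes `COMPUTALOT_ARTIFACT_DIR` may be unset when the script runs outside the platform. It writes a `metrics.json` file rather than a model so that it runs without PyTorch installed.

```python
import json
import os
import tempfile


def artifact_path(filename):
    """Return a path for `filename` inside the artifact directory.

    Assumption: COMPUTALOT_ARTIFACT_DIR may be unset during local runs,
    so fall back to a fresh temporary directory in that case.
    """
    base = os.environ.get("COMPUTALOT_ARTIFACT_DIR") or tempfile.mkdtemp(prefix="artifacts-")
    os.makedirs(base, exist_ok=True)
    return os.path.join(base, filename)


def report_progress(epoch, loss):
    """Print one line in the COMPUTALOT_PROGRESS:<json> format and return it."""
    line = "COMPUTALOT_PROGRESS:" + json.dumps({"epoch": epoch, "loss": loss})
    print(line, flush=True)  # flush so the line is not held in a buffer
    return line


def fake_train_one_epoch(epoch):
    """Hypothetical stand-in for train_one_epoch(); returns a decreasing loss."""
    return 1.0 / (epoch + 1)


def run(epochs=3):
    """Train for `epochs` epochs, report progress, and save the loss history."""
    history = []
    for epoch in range(epochs):
        loss = fake_train_one_epoch(epoch)
        report_progress(epoch, loss)
        history.append({"epoch": epoch, "loss": loss})
    # Write the loss history into the same directory model.pt would go to.
    metrics_file = artifact_path("metrics.json")
    with open(metrics_file, "w") as f:
        json.dump(history, f)
    return metrics_file


if __name__ == "__main__":
    run()
```

In a real `train.py`, the `torch.save(...)` call shown earlier could use `artifact_path('model.pt')` in place of the direct `os.environ` lookup, which avoids a `KeyError` when the variable is missing.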