* Rename to datamanager since it handles a complete "database" backed by an
objectstorage and etcd
* Don't write every single entry as a single file but group them in a single
file. In future improve this to split the data in multiple files of a max size.
`lts` was choosen to reflect a "long term storage" but currently it's just an
object storage implementation. So use this term and "ost" as its abbreviation
(to not clash with "os").
An executor can handle multiple archs (an executor that talks with a k8s cluster
with multi arch nodes). Don't use a label for archs but a custom executor
field.
* Add the concept of executor groups and siblings executors
* Add the concept of dynamic executor: an executor in an executor group that
doesn't need to be manually deleted from the scheduler since the other sibling
executors will take care of cleaning up its pods.
* Remove external labels visibility from pod.
* Add functions to return the sibling executors and the executor group
* Delete pods of disappeared sibling executors
take and change a copy of the current run so we'll change newRun and use curRun
status for logic decision. In this way result are reproducible or they will be
affected by the random run.Tasks map iteration order.
Handle config files with name `config.jsonnet`, `config.json` and
`config.yml` and take the first from the repository in this order
For a jsonnet file execute it and use the generated output as the config
The current config format was thought for future extensions for reusing runtimes
and job definitions adding some parameters.
After a lot of thoughts this looks like a complex approach: the final result
will be a sort of templating without a lot of powers.
Other approach like external templating should be an alternative but I really
don't think templating yaml is the way to go.
A much better approach will to just use jsonnet when we need to create matrix
runs and a lot of other use cases.
So just make the config a simple yaml/json. User can generate their config using
any preferred tool and in future we'll leverage jsonnet automated parsing and
provide a lot of jsonnet based examples for most use cases.
Main changes:
* Runs are now an array and not a map. The run name is in the Name field
* Tasks are now an array and not a map. The task name is in the Name field
* Use https://github.com/ghodss/yaml so we'll use json struct tags and unmarshall functions
Handle the task dependencies conditions:
* on_success (default if no conditions are specified)
* on_failure
* on_skipped
Not the runservice won't stop run but continue executing tasks that depends on a
parent also if this is failed
Additionally don't save a CloneURL field inside the project type.
If in future some git source doesn't provide a clone url we could just calculate
it from project.RepoPath or call the remote api to retrieve it.
* split functions in sub parts to ease future testing
* save run fewer times
* rework events logic to considere both run phase and result changes (emit an
event on every phase or result change)
Add the ability to define a run with a setuperror phase.
When the run setup has errors client could submit a run with a list of setup
errors. In such case the run will be created in the setuperror phase.
Setup errors are currently generated by the webhook receiver and the run service
when it checks the run config for possible issues.
* Use just RunConfig
* Use StaticEnvironment vs Environment in RunConfig to distinguish between env
that won't change at run recreation from env that could change at every
recreation
* The RunCreate api will just receive the runtasks instead of a runconfig (more
right)
* client: always parse the json error message field and return its contents
* Use ErrBadRequest and ErrNotFound in every handler and command
* Gateway: by default pass underlying service error (configstore, runservice) to
client keeping the status code and message. In future, if some errors must be
masked, we should change the specific parts that need special handling.
* Command: use ErrBadRequest
* Always return a json message also on error. For internal errors return a
generic "internal server error" message to not leak the real internal error to
clients
* Return 201 Created on resource creation
* Return 204 No Content on resource deletion and other action with no json
output
* Always return a json message also on error. For internal errors return a
generic "internal server error" message to not leak the real internal error to
clients
* Return 201 Created on resource creation
* Return 204 No Content on resource deletion and other action with no json
output
* Always return a json message also on error. For internal errors return a
generic "internal server error" message to not leak the real internal error to
clients
* Return 201 Created on resource creation
* Return 204 No Content on resource deletion and other action with no json
output
* Remove all the small index files on the lts
* Keep on s3 only a full index of all runs containing the runid, grouppath and phase
million of runs can take only some hundred of megabytes
* Periodically create a new dump of the index