Commit Graph

122 Commits

Author SHA1 Message Date
Simone Gotti ce7924d7fd runservice: use the path escaped cache key
Use the path escaped cache key so we can also handle cache keys with slashes
inside.
2019-05-08 12:15:17 +02:00
Simone Gotti bec9476d6c runservice: store related runid with logs and archives
Logs and archives can be shared by multiple runs. So removing a run doesn't
imply that we could also remote the logs and archives since they could be
"referenced" by another run.

Store also the runids as specific objects along with the logs and archives so,
we'll remove them only when no runids objects exist.
2019-05-08 12:11:46 +02:00
Simone Gotti 1e34dca95d runservice: split and simplify scheduler and executor naming
Also if they are logically part of the runservice the names runserviceExecutor
and runserviceScheduler are long and quite confusing for an external user

Simplify them separating both the code parts and updating the names:

runserviceScheduler -> runservice
runserviceExecutor -> executor
2019-05-07 23:56:10 +02:00
Simone Gotti e4e7de4ad2 runservice/gateway: return run on run action 2019-05-07 23:23:58 +02:00
Simone Gotti afae185e11 *: rework run approval and annotations
* runservice: use generic task annotations instead of approval annotations
* runservice: add method to set task annotations

* gateway: when an user call the run task approval action, it will set in the
task annotations the approval users ids. The task won't be approved.

* scheduler: when the number of approvers meets the required minimum number
(currently 1) call the runservice to approve the task

In this way we could easily implement some approval features like requiring a
minimum number of approvers (saved in the task annotations) before marking the
run as approved in the runservice.
2019-05-06 15:19:29 +02:00
Simone Gotti a590c21127 runservice api: get run from readdb 2019-05-06 15:18:49 +02:00
Simone Gotti 3139ef38d9 runservice readdb: get run from ost db if it's not in run db 2019-05-06 14:55:10 +02:00
Simone Gotti cb78ea48bc runservice: rename command(handler) to action(handler)
Since we're going to migrate all actions (also queries that now are implemented
in the api handlers) there
2019-05-03 23:59:21 +02:00
Simone Gotti bad18bf814 *: report objects size for objectstorage.WriteObject 2019-05-02 09:49:55 +02:00
Simone Gotti 34cfdfeb3b objectstorage: add size option to WriteObject
On s3 limit the max object size to 1GiB when the size is not provided (-1) or
the minio client will calculate a big part size since it tries to use the
maximum object size (5TiB) and will allocate a very big buffer in ram. Also
leave as commented out the previous hack that was firstly creating the file
locally to calculate the size and then put it (for future reference).
2019-05-02 09:47:38 +02:00
Simone Gotti e964aa3537 objectstorage: add persist option to WriteObject
This options is a noop on s3 but on the posix implementation it becomes useful
when there isn't the need to have a persistent file, thus avoiding some fsync
calls.
2019-05-01 15:06:47 +02:00
Simone Gotti 27f84738d6 runservice: simplify workspace restore 2019-04-30 14:00:34 +02:00
Simone Gotti da6aefa7e2 runservice readdb: also resync changegroups 2019-04-29 10:16:19 +02:00
Simone Gotti f5cf3b9fa7 runservice: check changegroup name 2019-04-29 10:12:34 +02:00
Simone Gotti 2c3e6bf9e4 wal: update and rename to datamanager
* Rename to datamanager since it handles a complete "database" backed by an
objectstorage and etcd

* Don't write every single entry as a single file but group them in a single
file. In future improve this to split the data in multiple files of a max size.
2019-04-26 16:00:03 +02:00
Simone Gotti 41e333d7ec *: rename "lts" to "ost"
`lts` was choosen to reflect a "long term storage" but currently it's just an
object storage implementation. So use this term and "ost" as its abbreviation
(to not clash with "os").
2019-04-27 15:16:48 +02:00
Simone Gotti 33c328b3f5 runservice: move all scheduler etcd data to own dir 2019-04-27 08:59:47 +02:00
Simone Gotti e1368d18d6 runservice: add etcd pinger loop 2019-04-27 08:50:25 +02:00
Simone Gotti 9c7c589bba runservice executor: use k8s client informers/listers
Use k8s client informers/listers instead of polling the api every time
2019-04-26 10:15:23 +02:00
Simone Gotti 8989bd0e8e runservice: pass arch to driver
k8s driver: use the provided arch and set the related nodeselector label
(`kubernetes.io/arch`) when not empty.
2019-04-25 13:42:34 +02:00
Simone Gotti 6f88bd3d53 runservice: handle multiple executor archs
An executor can handle multiple archs (an executor that talks with a k8s cluster
with multi arch nodes). Don't use a label for archs but a custom executor
field.
2019-04-25 13:30:46 +02:00
Simone Gotti e0d37b08f2 runservice: add k8s driver 2019-04-22 17:54:24 +02:00
Simone Gotti 07bc4a21ff runservice scheduler: automatically remove dynamic executors 2019-04-24 13:25:41 +02:00
Simone Gotti 7c9be9b57d runservice executor: remove unused GetPodByID method 2019-04-24 15:53:03 +02:00
Simone Gotti a0d69f4bc3 runservice executor: update for executor groups
* Add the concept of executor groups and siblings executors
* Add the concept of dynamic executor: an executor in an executor group that
doesn't need to be manually deleted from the scheduler since the other sibling
executors will take care of cleaning up its pods.
* Remove external labels visibility from pod.
* Add functions to return the sibling executors and the executor group
* Delete pods of disappeared sibling executors
2019-04-24 12:37:55 +02:00
Simone Gotti 4da4f48f98 runservice executor: rename pod labels
* Use a command namespaced prefix
* Add executor id label for future usage
2019-04-22 18:19:43 +02:00
Simone Gotti abf908bcad runservice executor: rename makeEnv to makeEnvSlice 2019-04-22 18:19:13 +02:00
Simone Gotti 7e9abbf529 runservice executor: add driver Setup method
Remote custom `copytoolbox` hack and use a generic `Setup` function in the
driver interface
2019-04-22 18:17:55 +02:00
Simone Gotti 7ebc436854 runservice executor: generate pod id outside driver 2019-04-22 17:53:34 +02:00
Simone Gotti 17f3dc89f2 runservice executor: remove unused CopyTo method from driver 2019-04-22 18:27:48 +02:00
Simone Gotti dfeba334f6 runservice: update docker registry auth 2019-04-22 14:38:25 +02:00
Simone Gotti 9c74b4ddc1 runservice scheduler: choose scheduler only if it has capacity 2019-04-17 20:59:28 +02:00
Simone Gotti 1ac139434e runservice scheduler: cancel unscheduled root tasks when run has result
When run has a result set, root tasks not yet scheduled must be cancelled.
2019-04-17 18:00:34 +02:00
Simone Gotti 9f89a923c0 runservice scheduler: take a copy of run in advanceRunTasks
take and change a copy of the current run so we'll change newRun and use curRun
status for logic decision. In this way result are reproducible or they will be
affected by the random run.Tasks map iteration order.
2019-04-17 18:06:31 +02:00
Simone Gotti 4dd89646af runservice executor: report ActiveTasksLimit
Add a config option to set the active tasks limit and report it.
2019-04-17 15:51:20 +02:00
Simone Gotti 455623e58a runservice executor: report running tasks 2019-04-17 15:47:58 +02:00
Simone Gotti adf9c73518 runservice scheduler: choose executor with right arch
Choose an executor matching the required arch or any if no arch is required
2019-04-17 15:26:09 +02:00
Simone Gotti 22f0865aa3 runconfig: add and populate Runtime.Arch 2019-04-17 15:23:50 +02:00
Simone Gotti a511fbf10c runservice: executor: provide architecture information 2019-04-17 15:22:26 +02:00
Simone Gotti d3f658c5ad runservice: add run cache cleaner
Removes old cache entries (defaults to 7 days)
2019-04-17 13:58:41 +02:00
Simone Gotti 06374e14fd runservice: resolve ~ in working_dir 2019-04-15 11:12:07 +02:00
Simone Gotti 8bde2f2bc0 runservice: implement caching
Add `save_cache` and `restore_cache steps`
2019-04-13 14:58:56 +02:00
Simone Gotti 3928851c10 runservice: rename Run.RunTasks to Run.Tasks 2019-04-12 17:45:38 +02:00
Simone Gotti 68e95ad3be runservice: implement task dependencies conditions
Handle the task dependencies conditions:
* on_success (default if no conditions are specified)
* on_failure
* on_skipped

Not the runservice won't stop run but continue executing tasks that depends on a
parent also if this is failed
2019-04-12 16:46:04 +02:00
Simone Gotti 5165984030 runservice: convert RunConfigTask.Depends to a map 2019-04-12 17:04:07 +02:00
Simone Gotti 991fcc59de runservice: stop all running executor tasks when run is marked to stop 2019-04-11 23:44:55 +02:00
Simone Gotti c300a37d09 runservice: add some initial scheduler tests 2019-04-11 17:23:59 +02:00
Simone Gotti 634a8a543c runservice: implement docker registry auth
By now just support default username/password login

In future also support additional container registries with their own credential
helpers
2019-04-10 17:13:51 +02:00
Simone Gotti 751361daea runservice: refactor scheduling logic
* split functions in sub parts to ease future testing
* save run fewer times
* rework events logic to considere both run phase and result changes (emit an
event on every phase or result change)
2019-04-10 14:48:47 +02:00
Simone Gotti da27348a1d runservice: implement run setup errors
Add the ability to define a run with a setuperror phase.

When the run setup has errors client could submit a run with a list of setup
errors. In such case the run will be created in the setuperror phase.

Setup errors are currently generated by the webhook receiver and the run service
when it checks the run config for possible issues.
2019-04-09 16:51:37 +02:00
Simone Gotti 671b89d391 runservice: merge RunConfig and RunData
* Use just RunConfig
* Use StaticEnvironment vs Environment in RunConfig to distinguish between env
that won't change at run recreation from env that could change at every
recreation
* The RunCreate api will just receive the runtasks instead of a runconfig (more
right)
2019-04-09 18:11:00 +02:00
Simone Gotti 3642be6f21 */api: Use helpers for error handling
* client: always parse the json error message field and return its contents
* Use ErrBadRequest and ErrNotFound in every handler and command
* Gateway: by default pass underlying service error (configstore, runservice) to
client keeping the status code and message. In future, if some errors must be
masked, we should change the specific parts that need special handling.
2019-04-09 14:53:00 +02:00
Simone Gotti 643dfe4072 runservice api: improve response handling
* Command: use ErrBadRequest
* Always return a json message also on error. For internal errors return a
generic "internal server error" message to not leak the real internal error to
clients
* Return 201 Created on resource creation
* Return 204 No Content on resource deletion and other action with no json
output
2019-04-08 18:04:42 +02:00
Simone Gotti 7d787c5f77 *: implement task approval 2019-04-08 17:29:57 +02:00
Simone Gotti 04f3905ea1 client: fix content type header case 2019-04-08 12:28:15 +02:00
Simone Gotti f3781c9087 *: fix rest methods
* use POST instead of PUT for resource creation
* use PUT instead of POST for resource special actions
2019-04-08 08:54:45 +02:00
Simone Gotti fc891409ca update wal and readdb 2019-04-01 12:54:43 +02:00
Simone Gotti 75b5b65da3 runservice: check that readdb is initialized 2019-03-29 12:20:54 +01:00
Simone Gotti e46766829c runservice: rework store and readdb logic
* Remove all the small index files on the lts
* Keep on s3 only a full index of all runs containing the runid, grouppath and phase
  million of runs can take only some hundred of megabytes
* Periodically create a new dump of the index
2019-03-29 12:15:48 +01:00
Simone Gotti 3c5eb71ba8 runservice: make and check that group paths are absolute 2019-03-29 12:09:32 +01:00
Simone Gotti 48ab496beb *: add api to query last run per group 2019-03-29 12:00:18 +01:00
Simone Gotti 21447fc59d etcd: allow specifying a revision for a get 2019-03-29 11:37:22 +01:00
Simone Gotti c9089c3ccc runservice: allow restart run only if possible 2019-03-29 09:09:57 +01:00
Simone Gotti 1657a35a6f runservice: refactor fetch phase check
Use dedicated functions
2019-03-29 09:00:19 +01:00
Simone Gotti 65c425b22b wal: report when wal is ready
in this way the wal instance will be used only after it's ready (initialized
etcd when needed)
2019-03-28 15:46:24 +01:00
Simone Gotti 8f4a5b29b9 *: implement setup step 2019-03-13 15:48:35 +01:00
Simone Gotti b05b377d31 runservice: add option to define custom container entrypoint 2019-03-13 12:12:32 +01:00
Simone Gotti 16ac6ada66 runservice: add privileged containers options 2019-03-13 12:11:46 +01:00
Simone Gotti f09602cdc3 *: implement run stop 2019-03-08 10:02:37 +01:00
Simone Gotti 6f38c48066 *: initial implementation of when conditions 2019-03-07 18:01:34 +01:00
Simone Gotti 9d2c133817 runservice implement initial basic run restart 2019-03-04 16:11:18 +01:00
Simone Gotti 36fc79dfc6 runservice: initial commit 2019-02-21 15:54:50 +01:00