492 Commits

Author SHA1 Message Date
Simone Gotti 627f40987e run: export also pull request id environment variable to runs 2020-03-09 09:42:51 +01:00
Simone Gotti 1820c0247c config: provide jsonnet context top level argument 2020-03-06 11:19:49 +01:00
Simone Gotti e20abf053c runconfig: disable password authentication in clone step
If the ssh public key auth fails for some reason, prevent the clone step from
blocking during a git clone while waiting for a password.
2020-03-05 16:11:20 +01:00
Simone Gotti 0e2b01a586 runservice: correctly handle skipped tasks in fetcher
Skip fetching of tasks with status skipped, not only tasks marked as skip.
This avoids many wrong and noisy logs of type "executor task with id taskid
doesn't exist. This shouldn't happen. Skipping fetching"
2020-03-02 10:40:59 +01:00
Simone Gotti eb180da914
Merge pull request #225 from sgotti/runservice_fix_handling_of_wrong_executortask_status
runservice: fix handling of wrong executortask status
2020-03-02 10:26:32 +01:00
Simone Gotti 4da7c23bb8
Merge pull request #224 from sgotti/executor_use_cancellable_context_in_executetask
executor: use cancellable context in executetask
2020-03-02 10:26:16 +01:00
Simone Gotti 382705bde9
Merge pull request #223 from sgotti/executor_fix_stopping_of_not_running_task
executor: fix stopping of not running tasks
2020-03-02 09:45:39 +01:00
Simone Gotti 19611c18e7 runservice: fix handling of wrong executortask status
updateRunTaskStatus should also accept transitions from not started to a
finished state like "success", "failed", "stopped" since we could miss some
status updates from the executor for many reasons.
2020-02-28 13:02:35 +01:00
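The transition rule described in this commit can be illustrated with a minimal sketch. The `TaskStatus` type, its constants and `canTransition` are hypothetical names chosen for this example and are not taken from the agola code.

```go
package main

import "fmt"

// TaskStatus is a hypothetical, simplified task status type.
type TaskStatus string

const (
	StatusNotStarted TaskStatus = "notstarted"
	StatusRunning    TaskStatus = "running"
	StatusSuccess    TaskStatus = "success"
	StatusFailed     TaskStatus = "failed"
	StatusStopped    TaskStatus = "stopped"
)

// isFinished reports whether s is one of the final states.
func isFinished(s TaskStatus) bool {
	return s == StatusSuccess || s == StatusFailed || s == StatusStopped
}

// canTransition reports whether a status update should be accepted.
func canTransition(from, to TaskStatus) bool {
	switch from {
	case StatusNotStarted:
		// Accept "running" and also any finished state, since the
		// intermediate "running" update may have been missed.
		return to == StatusRunning || isFinished(to)
	case StatusRunning:
		return isFinished(to)
	default:
		// A finished task does not change status anymore.
		return false
	}
}

func main() {
	fmt.Println(canTransition(StatusNotStarted, StatusFailed)) // true
	fmt.Println(canTransition(StatusSuccess, StatusRunning))   // false
}
```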
Simone Gotti e4507446ed executor: use cancellable context in executetask
Use a cancellable context to handle running task stop.
When the context is done the pod will be stopped.
2020-02-28 10:52:36 +01:00
Simone Gotti 97d145a9d3 executor: fix stopping of not running tasks
When a related runningTask doesn't exist and the executor task status is
running, just report it as failed, regardless of whether it's marked to stop.
2020-02-28 10:45:07 +01:00
Simone Gotti 3ac018e6e5 runservice: use all scheduled tasks in scheduleRun
Rename activeExecutorTasks to scheduledExecutorTasks and don't filter out
finished tasks.
Some of the logic needs all the scheduled tasks and not only the unfinished
ones.
2020-02-28 09:56:12 +01:00
Simone Gotti a4e280cd7d
Merge pull request #222 from sgotti/executor_fix_reporting_of_stopped_tasks_and_steps
executor: fix reporting of stopped tasks and steps
2020-02-28 09:55:40 +01:00
Simone Gotti 19b8c7f427 executor: fix reporting of stopped tasks and steps
In executeTask set the executor task and step phase to stop if task spec Stop is
true.
2020-02-27 17:38:00 +01:00
Simone Gotti 88dbca15a3 executor: serialize task handling
taskUpdater will be called serially and won't block. It'll execute a goroutine
for executing the task and for sending the task state to the scheduler.

executeTask will just start task execution; all the logic for deciding whether
to start a task is moved inside taskUpdater.

In this way we avoid concurrency issues when handling the same executorTask
in parallel.
2020-02-27 17:19:42 +01:00
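The pattern described above (a serially called, non-blocking handler that makes the start decision and delegates execution to a goroutine) might look like the following sketch. The `taskUpdater` type and its fields are hypothetical and much simpler than a real executor.

```go
package main

import (
	"fmt"
	"sync"
)

// taskUpdater is a hypothetical stand-in for the executor's update handler.
type taskUpdater struct {
	started map[string]bool // only touched by handle, which is called serially
	wg      sync.WaitGroup
	mu      sync.Mutex // protects ran, which is written by the goroutines
	ran     []string
}

func newTaskUpdater() *taskUpdater {
	return &taskUpdater{started: make(map[string]bool)}
}

// handle decides whether a task must be started and never blocks:
// the execution itself happens in a separate goroutine.
func (u *taskUpdater) handle(taskID string) bool {
	if u.started[taskID] {
		return false // already handled, nothing to start
	}
	u.started[taskID] = true
	u.wg.Add(1)
	go func() {
		defer u.wg.Done()
		u.mu.Lock()
		u.ran = append(u.ran, taskID)
		u.mu.Unlock()
	}()
	return true
}

func main() {
	u := newTaskUpdater()
	for _, id := range []string{"task1", "task1", "task2"} {
		fmt.Println(id, u.handle(id))
	}
	u.wg.Wait()
	fmt.Println(len(u.ran)) // 2
}
```

Because the decision is taken in one place and in order, a duplicate update for the same task cannot start it twice.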
Simone Gotti 145c87b4c0 runservice: minimize scheduling of tasks that will be queued by the executor
Since the executor only periodically updates its state we could end up
scheduling many more tasks than the executor ActiveTasksLimit. This will happen
in the case of many parallel tasks that can all start at the same time.

To avoid this, also consider the executor tasks saved in etcd, which represent
the real view of scheduled tasks.
2020-02-27 11:03:03 +01:00
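One possible way to picture the idea is the sketch below. It assumes that the stale count reported by the executor and the count of executor tasks stored in etcd are combined by taking the larger of the two; this rule, and the `freeSlots` helper itself, are assumptions made for illustration and not the actual scheduler logic.

```go
package main

import "fmt"

// freeSlots returns how many more tasks can be scheduled on an executor.
// reportedActive is the (possibly stale) count reported by the executor,
// storedScheduled is the count of executor tasks saved in the store.
func freeSlots(activeTasksLimit, reportedActive, storedScheduled int) int {
	inUse := reportedActive
	if storedScheduled > inUse {
		inUse = storedScheduled
	}
	if free := activeTasksLimit - inUse; free > 0 {
		return free
	}
	return 0
}

func main() {
	// The executor still reports 0 active tasks, but 4 were already scheduled.
	fmt.Println(freeSlots(5, 0, 4)) // 1
	fmt.Println(freeSlots(5, 2, 7)) // 0
}
```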
Simone Gotti 5dd9e587fe runservice: mark not running tasks as skipped when run marked to stop
Currently when a run is marked to stop we stop the currently running tasks and
then their children are marked as skipped.

But tasks not depending on a stopped task (root tasks or children with a
finished parent) that are just waiting for an executor slot will still be
scheduled as soon as a slot is free, even if the run is marked to stop (and
then the scheduler will stop them after some seconds).

This patch marks all not started tasks as skipped when the run is marked to
stop.
2020-02-26 16:45:09 +01:00
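A minimal sketch of this behavior, using a hypothetical `Task` type with plain string statuses rather than the real run types:

```go
package main

import "fmt"

// Task is a hypothetical, simplified run task.
type Task struct {
	Name   string
	Status string // "notstarted", "running", "success", "skipped", ...
}

// skipNotStarted marks every not started task as skipped when the run is
// marked to stop, and returns how many tasks were changed.
func skipNotStarted(tasks []*Task, runStopping bool) int {
	if !runStopping {
		return 0
	}
	changed := 0
	for _, t := range tasks {
		if t.Status == "notstarted" {
			t.Status = "skipped"
			changed++
		}
	}
	return changed
}

func main() {
	tasks := []*Task{
		{Name: "build", Status: "running"},
		{Name: "lint", Status: "notstarted"}, // waiting for an executor slot
		{Name: "test", Status: "notstarted"},
	}
	fmt.Println(skipNotStarted(tasks, true)) // 2
	fmt.Println(tasks[0].Status, tasks[1].Status, tasks[2].Status)
}
```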
Simone Gotti eb48e73a54 gateway: move authentication apis to /api/v1alpha/auth
Move the various authentication apis to /api/v1alpha/auth since they should be
versioned like other apis.
2020-02-19 10:48:14 +01:00
Simone Gotti ed53183431 go.mod: update dependencies
Update all the updatable dependencies
2020-02-18 13:55:50 +01:00
Simone Gotti dad7447989 gitsources: handle skipverify also in oauth2 requests
When required, pass a custom http client configured to skip tls verification to
the oauth2 calls.
2020-02-11 21:49:32 +01:00
Simone Gotti 0611b5cc67 github: handle nil user email 2020-02-11 15:59:30 +01:00
Simone Gotti 59463944db github: use the provided api url
we were always setting the public github url in the github client instead of the
provided api url.
2020-02-11 09:10:09 +01:00
Carlo Mandelli 182eb14b20 cmd: project option to disable passing variables to PR from forked repo 2020-01-28 09:02:37 +01:00
Carlo Mandelli d049782e29 tests: add unique name for logs of the third ConfigStore instance 2020-01-16 13:54:13 +01:00
Simone Gotti 2de91549a3 tests: improve services logging
During tests provide a zaptest Logger so all services output will be redirected
to golang testing logger.

When multiple services of the same type are provided add a unique name field to
distinguish them.
2020-01-15 12:30:34 +01:00
Simone Gotti ecf355721f docker: create a toolbox volume for every pod
The current hack copies the agola toolbox inside the host tmp dir (always done
but only needed when running the executor inside a docker container). It has
several issues (like tmp file removal done by tmpwatch/systemd-tmpfiles).
Instead, use a solution similar to the k8s driver: for every pod create a
volume containing the agola-toolbox and remove it at pod removal.

We could also use a single "global" volume, but then we would have to handle
cases like volume removal (i.e. a docker volume prune command). So for now just
create a dedicated per pod volume.
2020-01-10 12:25:12 +01:00
Simone Gotti eafa4d1381 datamanager tests: don't wait for etcd down
Waiting causes some timeout errors, since another instance from a test running
in parallel may have been started on the same port.
2019-12-02 13:32:57 +01:00
Carlo Mandelli 3e47bc601a gateway/runservice: add api to delete step logs 2019-11-18 10:34:56 +01:00
Simone Gotti 7e8f7155d7 runservice: improve errors in logsHandler
Return errNotExist in readTaskLogs when the run, task or step doesn't exist.
2019-11-15 15:50:58 +01:00
Simone Gotti f7d0950ca1 *: write and flush header on log handlers
Explicitly write and flush the headers in the various services LogHandlers.

Currently the 200 response and the other headers will be automatically written
by the golang http implementation only when we send something in the body. But if
there's nothing to send (no logs yet written) the client will never receive the
headers and cannot know if the request was successful.
2019-11-14 10:52:45 +01:00
Simone Gotti 32a08ec5c8
Merge pull request #179 from sgotti/runservice_logshandler_improve_errors
runservice: improve errors in logsHandler
2019-11-14 09:56:24 +01:00
Simone Gotti 66e182a55d runservice: improve errors in logsHandler
* Return errNotExist in readTaskLogs when the executor task doesn't exist, so
the client will receive a 404 instead of a 500 (since a generic error will be
mapped to a 500).
* Wrap the errNotExist returned by readTaskLogs with a new ErrNotExist reporting
"log doesn't exist"
2019-11-13 15:50:20 +01:00
Simone Gotti 07cde065c8 runservice: use etcd mutex TryLock on fetching
When fetching avoid concurrent fetches from multiple runservices using an etcd
mutex TryLock.
2019-11-13 11:53:54 +01:00
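The TryLock idea (skip the work if someone else already holds the lock, instead of waiting) can be shown with a local analogy. The sketch uses `sync.Mutex.TryLock` from the Go standard library (Go 1.18+) as a stand-in for the distributed etcd mutex; it only protects goroutines in one process, and `fetchOnce` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchMu stands in for the distributed mutex shared by all fetchers.
var fetchMu sync.Mutex

// fetchOnce runs fetch only if no other fetch is in progress.
// It returns false, without waiting, when the lock is already held.
func fetchOnce(fetch func()) bool {
	if !fetchMu.TryLock() {
		return false
	}
	defer fetchMu.Unlock()
	fetch()
	return true
}

func main() {
	var inner bool
	outer := fetchOnce(func() {
		// A second attempt while the first one is running is skipped.
		inner = fetchOnce(func() {})
	})
	fmt.Println(outer, inner) // true false
}
```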
Simone Gotti 9fd4b662a8
Merge pull request #175 from camandel/api_logarchived
gateway: add api to get log status
2019-11-13 11:45:31 +01:00
Carlo Mandelli 8ed813e722 gateway: add api to get log status 2019-11-13 10:01:51 +01:00
Simone Gotti 5ab9f7c970 *: use etcd mutex TryLock
etcd PR 11104 (https://github.com/etcd-io/etcd/pull/11104) implemented mutex
TryLock. Since it's only available in etcd master just copy relevant code and
add a TODO to remove it when updating the etcd client to a version implementing
TryLock.

Use TryLock wherever it is useful.
2019-11-12 22:27:17 +01:00
Simone Gotti 24a9563872
Merge pull request #167 from sgotti/datamanager_writedatasnapshot_skip_already_checkpointed_wals
datamanager: skip already applied wals in writeDataSnapshot
2019-11-12 14:57:42 +01:00
Simone Gotti d679254516 readdb: improve HandleEvents goroutine exiting
Rename errCh to doneCh (error is not needed) and always send to it when one of
the HandleEvents functions exits (not only on error).

This will ensure that all the goroutines will be stopped even if one of them
returns without an error.
2019-11-12 11:03:21 +01:00
Simone Gotti 1e70e3404b datamanager: skip already applied wals in writeDataSnapshot
As an optimization don't apply already applied wals.
2019-11-12 11:02:52 +01:00
Simone Gotti dfd0f8c806 datamanager tests: increase etcd waitdown timeout 2019-11-12 10:47:28 +01:00
Simone Gotti 72f279c4c3 *: improve error handling
* objectstorage: remove `types` package and move `ErrNotExist` in base package
* objectstorage: Implement .Is and add helper `IsErrNotExist` for `ErrNotExist`
* util: Rename `ErrNotFound` to `ErrNotExist`
* util: Add `IsErr*` helpers and use them in place of `errors.Is()`
* datamanager: add `ErrNoDataStatus` to report when there's no data status in ost
* runservice/common: remove `ErrNotExist` and use errors in util package
2019-11-11 12:17:35 +01:00
Simone Gotti 5af07d0852 objectstorage: use a single package
remove all the subpackages and just use a single package
2019-11-08 16:31:48 +01:00
Simone Gotti 35e1ec0e15 datamanager: remove old storage wals
Remove all wals not required by the existing data status files and not existing
in etcd.
2019-11-08 15:39:17 +01:00
Simone Gotti 9c0eb3d7ef datamanager: refactor ReadWal
make ReadWal directly return a *WalHeader
2019-11-08 13:24:43 +01:00
Simone Gotti 9c1f3b2a69 datamanager: check wal previouswalsequence is correct in initEtcd 2019-11-07 17:05:40 +01:00
Simone Gotti acd62a3f90 datamanager: don't create ost wal checkpointed files
currently creating .checkpointed files in the ost isn't useful.
We already have the data snapshot that reports the last checkpointed wal.
2019-11-07 10:32:41 +01:00
Simone Gotti 4fcb067052 datamanager: clean old data files
keep the last n (now set to 3) data status files and remove all other data status
files and unneeded data files.
2019-11-07 10:30:31 +01:00
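The retention rule (keep the newest n, remove the rest) can be sketched as below. It assumes, purely for illustration, that data status file names sort lexicographically from oldest to newest; `filesToRemove` and the sample names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// filesToRemove returns the names that fall outside the newest `keep`
// entries. Names are assumed to sort from oldest to newest.
func filesToRemove(names []string, keep int) []string {
	if keep < 0 {
		keep = 0
	}
	sorted := append([]string(nil), names...) // don't modify the caller's slice
	sort.Strings(sorted)
	if len(sorted) <= keep {
		return nil
	}
	return sorted[:len(sorted)-keep]
}

func main() {
	names := []string{"0003", "0001", "0005", "0002", "0004"}
	fmt.Println(filesToRemove(names, 3)) // [0001 0002]
}
```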
Simone Gotti 52cb683267 datamanager: fix index creation on multiple data files
When more than one file is created during a checkpoint, the positions of the
entries in the index are wrong, since the position is not reset at every new
index.

Fix it and add related tests.
2019-11-06 15:33:40 +01:00
Carlo Mandelli aab2321d58 services: check config only for enabled services 2019-11-05 09:25:22 +01:00
Simone Gotti 4b4416fc99 datamanager: add data sequence to data file name
When creating a datafile name make it start with the current data sequence. This
will be useful in the future to know which data sequence created a new data file.
2019-11-04 09:23:12 +01:00
Simone Gotti e06dc332e2 sequence: add tests for String and Parse methods
Add tests to sequence String() and Parse(string) methods.
2019-10-31 16:53:57 +01:00