Commit Graph

513 Commits

Author SHA1 Message Date
Simone Gotti
32a08ec5c8
Merge pull request #179 from sgotti/runservice_logshandler_improve_errors
runservice: improve errors in logsHandler
2019-11-14 09:56:24 +01:00
Simone Gotti
66e182a55d runservice: improve errors in logsHandler
* return errNotExist in readTaskLogs when the executor task doesn't exist: so
the client will receive a 404 instead of a 500 (since a generic error will be
mapped to a 500).
* Wrap the errNotExist returned by readTaskLogs with a new ErrNotExist reporting
"log doesn't exist"
2019-11-13 15:50:20 +01:00
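A minimal sketch of the 404-versus-500 mapping described in commit 66e182a55d. The sentinel `errNotExist`, the map-backed `readTaskLogs` and the `statusFor` helper are hypothetical stand-ins written for illustration; they are not the project's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errNotExist is a hypothetical sentinel standing in for the project's own error.
var errNotExist = errors.New("does not exist")

// readTaskLogs is a stand-in: it reports errNotExist when the task is unknown.
func readTaskLogs(tasks map[string]string, id string) (string, error) {
	logs, ok := tasks[id]
	if !ok {
		return "", fmt.Errorf("log doesn't exist: %w", errNotExist)
	}
	return logs, nil
}

// statusFor maps an error to an HTTP status: a not-exist error becomes 404,
// any other (generic) error falls back to 500.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotExist):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	tasks := map[string]string{"task1": "step output"}
	_, err := readTaskLogs(tasks, "missing")
	fmt.Println(statusFor(err))                // 404
	fmt.Println(statusFor(errors.New("boom"))) // 500
}
```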
Simone Gotti
07cde065c8 runservice: use etcd mutex TryLock on fetching
When fetching avoid concurrent fetches from multiple runservices using an etcd
mutex TryLock.
2019-11-13 11:53:54 +01:00
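The try-lock idea in commit 07cde065c8 can be illustrated in-process. This sketch swaps the etcd mutex for Go's `sync.Mutex.TryLock` (Go 1.18 or later), so it only shows the "skip if someone else is already fetching" behaviour within one process, not coordination between runservices. The `fetcher` type is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// fetcher guards a fetch with a lock that is tried, never waited on.
type fetcher struct {
	mu      sync.Mutex
	fetches int
}

// fetch runs do only if no other fetch holds the lock.
// It returns false when the fetch was skipped.
func (f *fetcher) fetch(do func()) bool {
	if !f.mu.TryLock() {
		return false // someone else is already fetching
	}
	defer f.mu.Unlock()
	f.fetches++
	do()
	return true
}

func main() {
	f := &fetcher{}
	outer := f.fetch(func() {
		// A second attempt while the first is still running is skipped.
		fmt.Println("concurrent attempt ran:", f.fetch(func() {}))
	})
	fmt.Println("first attempt ran:", outer, "fetches:", f.fetches)
}
```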
Simone Gotti
9fd4b662a8
Merge pull request #175 from camandel/api_logarchived
gateway: add api to get log status
2019-11-13 11:45:31 +01:00
Carlo Mandelli
8ed813e722 gateway: add api to get log status 2019-11-13 10:01:51 +01:00
Simone Gotti
5ab9f7c970 *: use etcd mutex TryLock
etcd PR 11104 (https://github.com/etcd-io/etcd/pull/11104) implemented mutex
TryLock. Since it's only available in etcd master just copy relevant code and
add a TODO to remove it when updating the etcd client to a version implementing
TryLock.

Use TryLock everywhere where it'll be useful.
2019-11-12 22:27:17 +01:00
Simone Gotti
24a9563872
Merge pull request #167 from sgotti/datamanager_writedatasnapshot_skip_already_checkpointed_wals
datamanager: skip already applied wals in writeDataSnapshot
2019-11-12 14:57:42 +01:00
Simone Gotti
d679254516 readdb: improve HandleEvents goroutine exiting
Rename errCh to doneCh (error is not needed) and always send to it when one of
the HandleEvents functions exits (not only on error).

This will ensure that all the goroutines will be stopped also if one of them
returns without an error.
2019-11-12 11:03:21 +01:00
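The doneCh behaviour described in commit d679254516 could look roughly like the following. The `runAll` and `demo` functions are hypothetical; the sketch assumes each handler takes a context and that cancelling it is how the remaining goroutines are stopped.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// runAll starts every worker and stops all of them as soon as any one
// returns, whether or not it failed.
func runAll(ctx context.Context, workers []func(context.Context)) {
	if len(workers) == 0 {
		return
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	// Buffered so every worker can signal completion without blocking.
	doneCh := make(chan struct{}, len(workers))
	var wg sync.WaitGroup
	for _, w := range workers {
		wg.Add(1)
		go func(w func(context.Context)) {
			defer wg.Done()
			w(ctx)
			doneCh <- struct{}{} // always signal, not only on error
		}(w)
	}

	<-doneCh // the first worker exited
	cancel() // ask the others to stop
	wg.Wait()
}

// demo runs one worker that returns at once without an error and one that
// only stops on cancellation, then reports how many of them exited.
func demo() int32 {
	var exited int32
	quick := func(ctx context.Context) { atomic.AddInt32(&exited, 1) }
	blocking := func(ctx context.Context) {
		<-ctx.Done()
		atomic.AddInt32(&exited, 1)
	}
	runAll(context.Background(), []func(context.Context){quick, blocking})
	return atomic.LoadInt32(&exited)
}

func main() {
	fmt.Println(demo()) // 2
}
```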
Simone Gotti
1e70e3404b datamanager: skip already applied wals in writeDataSnapshot
As an optimization don't apply already applied wals.
2019-11-12 11:02:52 +01:00
Simone Gotti
dfd0f8c806 datamanager tests: increase etcd waitdown timeout 2019-11-12 10:47:28 +01:00
Simone Gotti
72f279c4c3 *: improve error handling
* objectstorage: remove `types` package and move `ErrNotExist` in base package
* objectstorage: Implement .Is and add helper `IsErrNotExist` for `ErrNotExist`
* util: Rename `ErrNotFound` to `ErrNotExist`
* util: Add `IsErr*` helpers and use them in place of `errors.Is()`
* datamanager: add `ErrNoDataStatus` to report when there's no data status in ost
* runservice/common: remove `ErrNotExist` and use errors in util package
2019-11-11 12:17:35 +01:00
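One way the `.Is` method and `IsErrNotExist` helper mentioned in commit 72f279c4c3 might be shaped. The struct layout below is an assumption made for illustration; only the names `ErrNotExist` and `IsErrNotExist` come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotExist is a hypothetical wrapper type; the real one may differ.
type ErrNotExist struct {
	Err error
}

func (e *ErrNotExist) Error() string { return e.Err.Error() }
func (e *ErrNotExist) Unwrap() error { return e.Err }

// Is reports a match for any *ErrNotExist target, so errors.Is works
// without comparing the wrapped message.
func (e *ErrNotExist) Is(target error) bool {
	_, ok := target.(*ErrNotExist)
	return ok
}

// IsErrNotExist is the helper callers use in place of a direct errors.Is call.
func IsErrNotExist(err error) bool {
	return errors.Is(err, &ErrNotExist{})
}

func main() {
	base := &ErrNotExist{Err: errors.New("object missing")}
	wrapped := fmt.Errorf("read failed: %w", base)
	fmt.Println(IsErrNotExist(wrapped))             // true
	fmt.Println(IsErrNotExist(errors.New("other"))) // false
}
```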
Simone Gotti
5af07d0852 objectstorage: use a single package
remove all the subpackages and just use a single package
2019-11-08 16:31:48 +01:00
Simone Gotti
35e1ec0e15 datamanager: remove old storage wals
Remove all wals not required by the existing data status files and not existing
in etcd.
2019-11-08 15:39:17 +01:00
Simone Gotti
9c0eb3d7ef datamanager: refactor ReadWal
make ReadWal directly return a *WalHeader
2019-11-08 13:24:43 +01:00
Simone Gotti
9c1f3b2a69 datamanager: check wal previouswalsequence is correct in initEtcd 2019-11-07 17:05:40 +01:00
Simone Gotti
acd62a3f90 datamanager: don't create ost wal checkpointed files
currently creating .checkpointed files in the ost isn't useful.
We already have the data snapshot that reports the last checkpointed wal.
2019-11-07 10:32:41 +01:00
Simone Gotti
4fcb067052 datamanager: clean old data files
keep the last n (now set to 3) data status files and remove all other data status
files and unneeded data files.
2019-11-07 10:30:31 +01:00
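The "keep the last n" selection from commit 4fcb067052 can be sketched as below. The `splitOld` helper is hypothetical, and it assumes file names sort in age order, which is an assumption of this sketch rather than something the commit states.

```go
package main

import (
	"fmt"
	"sort"
)

// splitOld sorts names in ascending order and returns the newest `keep`
// entries plus the older ones that can be removed.
// Assumption: lexicographic order of the names matches their age.
func splitOld(names []string, keep int) (kept, removed []string) {
	if keep < 0 {
		keep = 0
	}
	sorted := append([]string(nil), names...) // don't reorder the caller's slice
	sort.Strings(sorted)
	if len(sorted) <= keep {
		return sorted, nil
	}
	cut := len(sorted) - keep
	return sorted[cut:], sorted[:cut]
}

func main() {
	kept, removed := splitOld([]string{"003", "001", "005", "002", "004"}, 3)
	fmt.Println(kept)    // [003 004 005]
	fmt.Println(removed) // [001 002]
}
```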
Simone Gotti
52cb683267 datamanager: fix index creation on multiple data files
When during a checkpoint more than one file is created the entries position in
the index is not right since it's not reset at every new index.

Fix it and add related tests.
2019-11-06 15:33:40 +01:00
Carlo Mandelli
aab2321d58 services: check config only for enabled services 2019-11-05 09:25:22 +01:00
Simone Gotti
4b4416fc99 datamanager: add data sequence to data file name
When creating a datafile name make it start with the current data sequence. This
is useful in future to know which data sequence created a new data file.
2019-11-04 09:23:12 +01:00
Simone Gotti
e06dc332e2 sequence: add tests for String and Parse methods
Add tests to sequence String() and Parse(string) methods.
2019-10-31 16:53:57 +01:00
Simone Gotti
e18794764e go.mod: update dependencies
Update all the updatable dependencies
2019-10-29 09:31:38 +01:00
Carlo Mandelli
7a51404b71 run config: add tty option for run steps 2019-10-28 16:58:54 +01:00
Simone Gotti
1eb16886d8 objectstorage: add WriteObject tests
Test WriteObject using different size values: unspecified, equal to the buffer
size or less than the buffer size.
2019-10-25 15:16:42 +02:00
Simone Gotti
2e520dae55
Merge pull request #154 from sgotti/objectstorage_object_size
objectstorage: return object size in objectinfo
2019-10-25 12:23:48 +02:00
Simone Gotti
4c88bb75a3
Merge pull request #153 from sgotti/objectstorage_posix_limitreader_only_size_gt_0
objectstorage posix: use limitreader only when size is specified.
2019-10-25 12:23:35 +02:00
Simone Gotti
ae1e92b115 objectstorage: return object size in objectinfo
Return object size in object info.
2019-10-25 10:53:36 +02:00
Simone Gotti
58f68601e6 objectstorage posix: use limitreader only when size is specified.
Use limitreader only when size is specified (greater than or equal to 0).
When size is unknown (less than 0) limitreader will immediately return EOF
instead of writing the whole data.
2019-10-25 10:26:59 +02:00
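The size check from commit 58f68601e6 can be sketched with the standard `io.LimitReader`, which returns EOF immediately when its limit is zero or negative. The `writeObject` and `copied` functions here are simplified, hypothetical stand-ins that write to any `io.Writer`, not the posix backend itself.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// writeObject copies data to dst. size < 0 means "unknown": the whole
// stream is copied. size >= 0 caps the copy at size bytes.
func writeObject(dst io.Writer, data io.Reader, size int64) (int64, error) {
	r := data
	if size >= 0 {
		r = io.LimitReader(data, size)
	}
	return io.Copy(dst, r)
}

// copied is a small helper that returns what writeObject actually wrote.
func copied(s string, size int64) string {
	var buf bytes.Buffer
	if _, err := writeObject(&buf, strings.NewReader(s), size); err != nil {
		return ""
	}
	return buf.String()
}

func main() {
	fmt.Println(copied("hello world", -1)) // hello world
	fmt.Println(copied("hello world", 5))  // hello
}
```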
Simone Gotti
0388003d09 objectstorage s3: use limitreader in write object
If size is specified limit reads to size bytes.
2019-10-25 10:06:22 +02:00
Simone Gotti
a0450a5e69
Merge pull request #146 from sgotti/gitea_fix_getref
gitea: use GetRepoRefs instead of GetRepoRef
2019-10-23 10:29:23 +02:00
Simone Gotti
f0e4bbfeeb gitea: use GetRepoRefs instead of GetRepoRef
Looks like GetRepoRef doesn't correctly handle gitea repo refs response expecting
a single entry. Instead, at least with latest gitea version, the response is
always an array of refs. So use GetRepoRefs.
2019-10-22 10:17:00 +02:00
Simone Gotti
446e626f9f gateway: fix project create run http method
Make it a POST instead of a PUT.
2019-10-22 09:50:01 +02:00
Carlo Mandelli
6fccb935c4 docker: mount multiple volumes 2019-10-17 09:21:46 +02:00
Simone Gotti
3d0c68b5fc gitsources: don't set branch value when in a pull request
We were passing the source branch name as the Branch value in the webhook data.

This patch will just delete this assignment. If in future it's needed let's add
it with a different name to not cause confusion.
2019-10-14 22:42:08 +02:00
Simone Gotti
fa4b41ab74 when: match only the current ref type
Only match the current ref type, ie: don't match a branch when the ref type is a
tag or pull request.

Ref is always matched because it's not related to a specific ref type.
2019-10-14 17:08:44 +02:00
Simone Gotti
7d62481415 *: implement ability to add tmpfs volumes to containers
* Add a generic container volume option that currently only supports tmpfs. In
future it could be expanded to host volumes or other kinds of volumes (if
supported by the underlying executor)

* Implement creation of tmpfs volumes in docker and k8s drivers.
2019-10-08 16:36:23 +02:00
Carlo Mandelli
4fc8f3ebed cmd: add version command 2019-10-05 12:11:18 +02:00
Simone Gotti
a7ca2848e6 datamanager: remove some logs from tests 2019-10-02 09:27:55 +02:00
Simone Gotti
11da186913 gateway: add parentRef field to project group update api
and make all the request fields optional
2019-10-01 14:39:45 +02:00
Simone Gotti
973cfe8770 configstore: implement project group move 2019-10-01 12:03:57 +02:00
Simone Gotti
e2a0fedfb8 configstore: disable root project group deletion 2019-10-01 10:26:17 +02:00
Simone Gotti
8b9464486d gateway: add parentRef field to project update api
and make all the request fields optional
2019-09-27 17:10:04 +02:00
Simone Gotti
fe9e3e8317 configstore: implement project move 2019-09-27 11:07:49 +02:00
Carlo Mandelli
cd8175c156 add clone options 2019-09-25 09:29:39 +02:00
Simone Gotti
f5c0e91f39 gateway: api implement get secrets removeoverridden
Implement missing removeoverridden option for get secrets.
2019-09-20 15:39:21 +02:00
Carlo Mandelli
436aa8f1de skip run with special commit 2019-09-18 11:53:22 +02:00
Simone Gotti
714da3ffe3 executor: add missing mutex unlock
Add a missing mutex unlock; without it deadlocks can occur.
2019-09-18 09:49:38 +02:00
Simone Gotti
9cfb21d365 gateway: run api return more step data
For every step return the step type.

For a run step also return the shell and exit status.
2019-09-17 17:43:48 +02:00
Simone Gotti
39829f1ec4 runservice: save step exitstatus in run.
For every step save also the command exit status.
2019-09-17 14:35:37 +02:00
Simone Gotti
12b02143b2 runservice: don't save executor task data in etcd
Reorganize ExecutorTask to better distinguish between the task Spec and
the Status.

Split the task Spec in a sub part called ExecutorTaskSpecData that contains
tasks data that don't have to be saved in etcd because it contains data that can
be very big and can be generated starting from the run and the runconfig.
2019-09-17 12:03:43 +02:00
Simone Gotti
7d375e4c4e runservice: add run workspace cleaner
Removes old workspace files (defaults to 7 days)
2019-09-17 09:40:23 +02:00
Simone Gotti
6ee76274d7 executor: set the container exec user in every step 2019-09-12 10:55:07 +02:00
Simone Gotti
51e9a32db7 runconfig: set task default shell
Currently, if no shell is defined in the task and in the step, the executor will
use a hardcoded default shell.

This will cause changed run behavior if we add an option to globally set the
agola default shell.

To avoid this set the task shell to the default shell inside the runconfig if
it's empty so future executions will always use this value.
2019-09-11 18:51:18 +02:00
Simone Gotti
6d7410cfce runservice: remove run step user
Defining an option to override the user for a run step is too fine-grained
and, for consistency, would require doing the same for the other steps
(clone, *workspace etc...).

Remove it since it's probably enough to define it at the task level.
2019-09-11 15:02:08 +02:00
Simone Gotti
33c860e78c
Merge pull request #103 from sgotti/gitserver_dont_write_on_error
gitserver: don't return http response/error when calling external git process
2019-09-09 16:43:40 +02:00
Simone Gotti
0e61aa4e39 gitserver: don't return http response/error when calling external git process
On a git process error don't write the error message to the response body since
it'll break the git protocol and don't try to write the status header (since it's not
possible as it was automatically written by the go http server before writing
the body).
2019-09-09 15:46:32 +02:00
Simone Gotti
9f580863da util: Fix PathList output when path ends with slashes
Fix PathList when a path ends with one or more slashes and add related tests.
2019-09-09 14:49:00 +02:00
Simone Gotti
70eeddb719 types: use a global When type
Currently we are using different `When` types for every service and convert
between them. This is a good approach if we want to keep isolated all the
services (like if we were using different repos for every service instead of the
current monorepo).

But currently, since When is identical between all the services, simplify this by
using a common When type.
2019-09-05 09:37:27 +02:00
Simone Gotti
bfc42ef60e runservice: fix get tasks to run
Currently `advanceRunTasks` isn't deterministic and doesn't calculate the final
state in one call. So it could happen that `getTasksToRun` selects a task to be
executed since its parents are finished (marked as skipped in advanceRunTasks)
but the task itself isn't marked to be skipped (because advanceRunTasks has
calculated this task before its parents).

For now, fix this by applying the same task selection logic used in
`advanceRunTasks` and add a TODO to make `advanceRunTasks` deterministic by
processing tasks by their level (from level 0).
2019-08-30 15:59:25 +02:00
Simone Gotti
6d1f159500 tests: add wait function in place of sleep
Add a function to wait for a specific condition instead of sleeping for a fixed
number of seconds.
2019-08-30 12:50:49 +02:00
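A wait helper of the kind commit 6d1f159500 describes might look like this. The name `waitFor` and its signature are assumptions for illustration; the commit does not show the actual function.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitFor polls cond every interval until it returns true, returns an
// error, or the timeout elapses.
func waitFor(timeout, interval time.Duration, cond func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		ok, err := cond()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for condition")
		}
		time.Sleep(interval)
	}
}

func main() {
	start := time.Now()
	err := waitFor(time.Second, 10*time.Millisecond, func() (bool, error) {
		return time.Since(start) > 50*time.Millisecond, nil
	})
	fmt.Println(err)
}
```

Compared with a fixed sleep, the test returns as soon as the condition holds and fails with a clear error when it never does.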
Simone Gotti
53dad95cd0 cmd: fix variable create/update
In c1ff28ef9f we exported various types. Unfortunately the types used by cmd
variable create/update are the wrong types and marshalling fails. Fix it using
the right type. In future these internal types should be exported.
2019-08-29 16:38:19 +02:00
Simone Gotti
79c74e9423 config: fix check on task and parents with common deps 2019-08-12 23:11:19 +02:00
Simone Gotti
2676770336 userdirectrun: add options to define variables
Add --var and --var-file options (repeatable multiple times) to define the
variables to be used in the run.
2019-08-06 16:58:00 +02:00
Simone Gotti
e31b0b47ef executor: listen on wildcard address
Since the current logic is to use the first available private ip address as the
advertized address we have to listen on wildcard since a different host provided
in web.ListenAddress will make the executor unreachable.

In future improve this to let the user manually define the bind and the
advertized address (perhaps using go-sockaddr templates like done by consul) to
also support nat between the schedulers and the executors.
2019-08-06 13:42:42 +02:00
Simone Gotti
db742a6cd6 config: add run when field
Don't create a run if a when condition is defined and it doesn't match.
2019-08-05 16:07:47 +02:00
Simone Gotti
4ec0b33eb4 userdirectrun: allow setting destination branch/tag/ref
Allow setting the destination branch/tag/ref so users can test the run
conditions based on the branch/tag/ref.

To simulate a pull request a user can define a ref that matches one of these
regular expressions: `refs/pull/(\d+)/head`, `refs/merge-requests/(\d+)/head`
2019-08-05 14:45:34 +02:00
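The two patterns quoted in commit 4ec0b33eb4 can be tried out with Go's `regexp` package. The `pullRequestID` helper is hypothetical, and anchoring the patterns with `^` and `$` is an assumption of this sketch; the commit only lists the expressions themselves.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns quoted in the commit message, anchored here for a full match
// (the anchoring is an assumption of this sketch).
var pullRequestRefs = []*regexp.Regexp{
	regexp.MustCompile(`^refs/pull/(\d+)/head$`),
	regexp.MustCompile(`^refs/merge-requests/(\d+)/head$`),
}

// pullRequestID returns the captured number and true when ref looks like
// a pull request ref.
func pullRequestID(ref string) (string, bool) {
	for _, re := range pullRequestRefs {
		if m := re.FindStringSubmatch(ref); m != nil {
			return m[1], true
		}
	}
	return "", false
}

func main() {
	fmt.Println(pullRequestID("refs/pull/42/head"))
	fmt.Println(pullRequestID("refs/merge-requests/7/head"))
	fmt.Println(pullRequestID("refs/heads/master"))
}
```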
Simone Gotti
c17772040b tests: test also clone step
Also test clone step so we are sure that the clone url is correct.
2019-08-05 13:33:58 +02:00
Simone Gotti
1c96b5fbff
Merge pull request #81 from sgotti/docker_driver_use_fixed_client_api_version
docker driver: use fixed client api version
2019-08-04 23:56:17 +02:00
Simone Gotti
df66cfc736 docker driver: use fixed client api version
Set the client required api version to 1.26. In this way we'll work with docker
>= 1.13.1
2019-08-04 23:38:20 +02:00
Simone Gotti
b3672bf927 docker driver: use toolbox exec
Older versions of docker don't support the exec api Env and WorkingDir options.

Support these versions by doing the same we already do with the k8s driver: use
the `toolbox exec` command that will set the provided Env, change the cwd to the
WorkingDir and then exec the wanted command.
2019-08-04 18:09:34 +02:00
Simone Gotti
c1ff28ef9f *: export clients and related types
Export clients and related packages.

The main rule is to not import internal packages from exported packages.

The gateway client and related types are totally decoupled from the gateway
service (not shared types between the client and the server).

Instead the configstore and the runservice client currently share many types
that are now exported (decoupling them would require duplicating many types and
adding functions to convert between them; this will be done in future when the
APIs are declared stable).
2019-08-02 12:02:01 +02:00
Simone Gotti
fd26e617b3 configstore: move configstore types inside configstore package
Since they're not types common to all the services but belong to the
configstore.

Next step will be to make them local to the configstore and not directly used by
other services since these types are also stored.
2019-08-02 10:05:47 +02:00
Simone Gotti
d0c5621201 util: remove time.go
The same function is already provided by pointer.go
2019-08-01 14:14:56 +02:00
Simone Gotti
e48a28d5b9
Merge pull request #71 from sgotti/executor_set_task_endtime_when_marking_task_failed
executor: set task endTime when marking as failed
2019-07-29 14:43:13 +02:00
Simone Gotti
43e445f3aa
Merge pull request #70 from sgotti/executor_fix_task_endtime_setup_step_failed
executor: fix typo in setting task endTime when setup failed
2019-07-29 14:43:02 +02:00
Simone Gotti
b81ad4cd8c runservice: fix/improve executor delete logic
* Don't fail tasks inside the delete executor action, just delete the executor
from etcd

* The scheduler, when detecting a task without a related executor will mark the
task as failed and correctly set end time of the task and its steps.
2019-07-29 12:06:15 +02:00
Simone Gotti
f812597410 runservice: maintenance/export/import
Implement runservice maintenance mode and export/import.

When runservice is set in maintenance mode it'll start only the maintenance and
export/import handlers.

Setting maintenance mode will set a key in etcd so all the runservice instances
will detect it and enter maintenance mode. This is done asynchronously so it
could take some time (future improvements will add some api to show all the
runservice states)

Export is always available and will export the datamanager contents. Currently
only datamanager contents are exported (no logs and workspace archives).

Import is available only during maintenance: given a datamanager export, it will
import it and reset etcd to this import state.
2019-07-29 11:52:30 +02:00
Simone Gotti
0ecfc24def executor: fix typo in setting task endTime when setup failed
There was a typo so we weren't setting the task endTime when the setup step
failed.

Also unify all logic to just use `et` (instead of a mix of `et` or `rt.et`)
2019-07-29 10:00:32 +02:00
Simone Gotti
1707be9528 executor: set task endTime when marking as failed
Add missing set of task endTime when the executor is marking the task as failed
due to no related running task (usually after executor restart).
2019-07-29 09:58:17 +02:00
Simone Gotti
fafa5188c2 configstore: maintenance/export/import
Implement configstore maintenance mode and export/import.

When configstore is set in maintenance mode it'll start only the maintenance and
export/import handlers.

Setting maintenance mode will set a key in etcd so all the configstore instances
will detect it and enter maintenance mode. This is done asynchronously so it
could take some time (future improvements will add some api to show all the
configstore states)

Export is always available and will export the datamanager contents.

Import is available only during maintenance: given a datamanager export, it will
import it and reset etcd to this import state.
2019-07-26 10:55:04 +02:00
Simone Gotti
f3fa229f6c util: add GoWait function
GoWait will increase the provided waitGroup on start and execute a goroutine
that will run the provided functions and then decrease the waitGroup
2019-07-26 10:55:04 +02:00
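Going by the description in commit f3fa229f6c, a `GoWait` helper could be as small as the sketch below. The single-function signature is an assumption, and `countTo` is a hypothetical usage helper.

```go
package main

import (
	"fmt"
	"sync"
)

// GoWait increases wg, runs f in a new goroutine and decreases wg when f
// returns. Taking a single function is an assumption of this sketch.
func GoWait(wg *sync.WaitGroup, f func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		f()
	}()
}

// countTo starts n goroutines through GoWait and waits for all of them.
func countTo(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < n; i++ {
		GoWait(&wg, func() {
			mu.Lock()
			total++
			mu.Unlock()
		})
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(countTo(10)) // 10
}
```

Calling `wg.Add` before the goroutine starts, rather than inside it, is what makes a later `wg.Wait()` safe.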
Simone Gotti
bd035e9840 util: use context in backoff 2019-07-26 10:36:11 +02:00
Simone Gotti
ceafc2ef98 readdb: close and open readdb on Run 2019-07-25 17:59:54 +02:00
Simone Gotti
6f3798e8fe *: use sleep timer in loops
So we'll react instantly to a context cancel instead of waiting on time.Sleep
returning.
2019-07-25 16:22:54 +02:00
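The timer-based sleep from commit 6f3798e8fe can be sketched as a `select` over the context and a `time.Timer`. The names `sleepCtx`, `loop` and `demo` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepCtx waits for d or until ctx is cancelled, whichever comes first.
// It returns false when the wait was cut short by cancellation.
func sleepCtx(ctx context.Context, d time.Duration) bool {
	t := time.NewTimer(d)
	defer t.Stop()
	select {
	case <-ctx.Done():
		return false
	case <-t.C:
		return true
	}
}

// loop runs work repeatedly until ctx is cancelled and returns the number
// of iterations executed.
func loop(ctx context.Context, interval time.Duration, work func()) int {
	n := 0
	for {
		work()
		n++
		if !sleepCtx(ctx, interval) {
			return n
		}
	}
}

// demo cancels the context from inside the first iteration: the loop
// returns right away instead of sleeping for the full hour.
func demo() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return loop(ctx, time.Hour, cancel)
}

func main() {
	fmt.Println(demo()) // 1
}
```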
Simone Gotti
b8c2b4020a db: use context functions
Use the go sql context functions (ExecContext, QueryContext etc...)

The context is saved inside Tx so the library users should only pass it one time
to the db.Do function.
2019-07-25 14:49:53 +02:00
Simone Gotti
3404cb94b9 datamanager: implement import/export
* export: exports the newest data checkpoint. It forces a checkpoint before
exporting (currently no wals are exported)

* import: cleans up etcd, creates a new datasnapshot from the provided import stream
and then initializes etcd. Currently no old data is removed from the object
storage but it's just ignored.
2019-07-25 11:12:49 +02:00
Simone Gotti
3987caf664 datamanager: implement maintenance mode
when datamanager is started in maintenance mode no goroutines are scheduled and
no initial etcd initialization is done
2019-07-24 17:37:27 +02:00
Simone Gotti
3297244d5d db: retry on sqlite locked error
Since we are using the shared cache with the lock notify we won't receive
SQLITE_BUSY errors but we could receive SQLITE_LOCKED errors due to deadlocks or
locked tables on concurrent read and write transactions.

This patch catches this kind of errors and retries the tx until maxTxRetries.
2019-07-24 12:20:33 +02:00
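The retry loop described in commit 3297244d5d might be shaped like this. `errLocked` is a hypothetical stand-in for the driver's SQLITE_LOCKED error, and the value of `maxTxRetries` is illustrative only; the commit names the constant but not its value.

```go
package main

import (
	"errors"
	"fmt"
)

// errLocked stands in for the driver's SQLITE_LOCKED error (hypothetical).
var errLocked = errors.New("database table is locked")

// maxTxRetries is an illustrative value, not taken from the source.
const maxTxRetries = 20

// doTx runs f, retrying while it fails with errLocked, for at most
// maxTxRetries attempts.
func doTx(f func() error) error {
	var err error
	for i := 0; i < maxTxRetries; i++ {
		if err = f(); !errors.Is(err, errLocked) {
			return err // success or a non-retryable error
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxTxRetries, err)
}

func main() {
	attempts := 0
	err := doTx(func() error {
		attempts++
		if attempts < 3 {
			return errLocked
		}
		return nil
	})
	fmt.Println(attempts, err)
}
```

A real implementation would likely also roll back the failed transaction and pause between attempts; both are left out to keep the sketch short.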
Simone Gotti
77ee8d9e7d
Merge pull request #58 from sgotti/readdb_fix_deadlock
readdb: fix deadlock in Run method
2019-07-23 15:47:05 +02:00
Simone Gotti
85876310af readdb: fix deadlock in Run method
In runservice readdb Run method we could end with a deadlock if two of the
goroutines that call HandleEvents.* try to write to the errCh at the same
time before the errCh is read. If this happens one of the two will be blocked on
writing to the channel but the read won't happen since it'll be blocked by
wg.Wait().

Fix this by:
* using a buffered channel as large as the number of executed goroutines.
* creating a new errCh at every loop (so we'll ignore later errors after the first
one)

Note: we could also use a non blocking send to avoid this situation but we
should also start the wg.Wait before the goroutines or earlier errors could be
lost causing another kind of hang.
2019-07-23 14:56:26 +02:00
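The buffered-channel fix from commit 85876310af, reduced to a standalone sketch. `runOnce` and `errHandler` are hypothetical; the point is that with one buffer slot per goroutine no sender can block, so `wg.Wait()` always returns even when several handlers fail at once.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errHandler is a hypothetical error used by the example handlers.
var errHandler = errors.New("handler failed")

// runOnce starts all handlers, waits for them, and returns the first
// error seen (if any). A fresh channel is created on every call.
func runOnce(handlers []func() error) error {
	// One slot per goroutine: every send succeeds even if nobody reads yet.
	errCh := make(chan error, len(handlers))
	var wg sync.WaitGroup
	for _, h := range handlers {
		wg.Add(1)
		go func(h func() error) {
			defer wg.Done()
			if err := h(); err != nil {
				errCh <- err
			}
		}(h)
	}
	wg.Wait() // safe: no sender can be stuck on errCh

	select {
	case err := <-errCh:
		return err
	default:
		return nil
	}
}

func main() {
	failing := func() error { return errHandler }
	// With an unbuffered channel, two failing handlers would hang here.
	fmt.Println(runOnce([]func() error{failing, failing}))
}
```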
Simone Gotti
75d68b2b52 runservice: stop run also if result is not set 2019-07-23 12:11:01 +02:00
Simone Gotti
3a963ef95f readdb: error if there's no wal in etcd 2019-07-18 16:44:28 +02:00
Simone Gotti
3f64bda0cc readdb: save walSequence provided by data file 2019-07-18 16:44:28 +02:00
Simone Gotti
16820e9033 readdb: insert current wal sequence after checking wal status 2019-07-18 16:44:28 +02:00
Simone Gotti
f7175c4592 datamanager: save previous wal in waldata 2019-07-18 16:44:28 +02:00
Simone Gotti
18c5ae0492 datamanager: error if there're wals but not a datastatus in ost 2019-07-18 16:44:28 +02:00
Simone Gotti
c034819087 datamanager: accept optional datastatus in initEtcd 2019-07-18 16:44:28 +02:00
Simone Gotti
df716fccc6 datamanager: create a new wal and checkpoint in initEtcd
When doing an initEtcd (new instance or etcd reset) create a new wal (that will
have a new sequence epoch) and do a checkpoint.

In this way:

* readdb will detect an epoch change and do a full resync
* we always have a data file (also if empty) that provides the last checkpointed
wal. This information could be used by readdb to resync
2019-07-18 16:44:28 +02:00
Simone Gotti
cb2a871be6 datamanager: start initEtcd from last datastatus 2019-07-18 16:44:28 +02:00
Simone Gotti
445ef24daa datamanager: add option to force a checkpoint 2019-07-18 16:44:27 +02:00