Commit Graph

108 Commits

Author SHA1 Message Date
Simone Gotti 77ee8d9e7d
Merge pull request #58 from sgotti/readdb_fix_deadlock
readdb: fix deadlock in Run method
2019-07-23 15:47:05 +02:00
Simone Gotti 85876310af readdb: fix deadlock in Run method
In runservice readdb Run method we could end with a deadlock if two of the
goroutines that call HandleEvents.* try to write to the errCh at the same
time before the errCh is read. If this happens one of the two will be blocked on
writing to the channel but the read won't happen since it'll blocked by
wg.Wait().

Fix this doing:
* use a buffered channel large as the number of executed goroutines.
* create a new errCh at every loop (so we'll ignore later errors after the first
one)

Note: we could also use a non blocking send to avoid this situation but we
should also start the wg.Wait before the goroutines or earlier errors could be
lost causing another kind of hang.
2019-07-23 14:56:26 +02:00
Simone Gotti 75d68b2b52 runservice: stop run also if result is not set 2019-07-23 12:11:01 +02:00
Simone Gotti 3a963ef95f readdb: error if there's no wal in etcd 2019-07-18 16:44:28 +02:00
Simone Gotti 3f64bda0cc readdb: save walSequence provided by data file 2019-07-18 16:44:28 +02:00
Simone Gotti 16820e9033 readdb: insert current wal sequence after checking wal status 2019-07-18 16:44:28 +02:00
Simone Gotti 86d822a247 service: handle cors config and use it only on gateway
* Don't make cors enabled on all (*) by default.
* Handle related web.allowedOrigins options
* Only the gateway api should be called by a browser so setup the cors handler
only on it
2019-07-13 23:15:00 +02:00
Simone Gotti 940264e413 runservice: add lock around compatchangegroups
just to avoid concurrency errors when multiple instances are running
2019-07-10 10:20:35 +02:00
Simone Gotti 11a2ff48d6 runservice: delete executor task early
currently we are deleting the executor tasks only when all the run tasks
log/archives were fetched. But it'll better to remove a single executor task
when the task fetching is finished.

This could also fix possible issues on k8s since we are scheduling tasks but the
k8s scheduler may not schedule them if there aren't enough resources causing a
scheduling deadlock since we won't remove finished pods because their related
tasks are not removed and k8s cannot start new pods since it has no resources.
2019-07-08 16:03:14 +02:00
Simone Gotti 45a460ebc0 runservice: handle run not existing
Check if the response from the readdb is null and return an http not found error
2019-07-08 09:30:15 +02:00
Simone Gotti 04ef20464d runservice: add getruns filter by result 2019-07-05 10:32:51 +02:00
Simone Gotti 929a6fb654 service/*: log error only if nil 2019-07-04 15:50:37 +02:00
Simone Gotti 5db23410d0 runservice: set base path for datamanager
Don't put datamanager base dirs inside the root of the ost but use a base path.

Let's do it now before releasing since this is a breaking change that requires
moving the ost data to the new path
2019-07-03 17:18:21 +02:00
Simone Gotti d989fe9639 datamanager: always handle basepath
Currently we aren't setting a basepath and it wasn't always correctly handled.
Fix missing basepath handling and improve tests to also use a non empty
basepath.
2019-07-03 17:03:37 +02:00
Simone Gotti 87a472aaaf runservice: add CacheGroup field to runconfig
The cache group fields defines under which cache group the run cache data will
belong. This is needed/useful for some next changes:

* Make cache correctly work for user direct runs. Since the user direct runs all
belong to the same run group (the user id) all the use direct runs will share the
same caches. To distinguish between the different caches we need to use something
in addition to the user id (the local repo uuid generated by the direct run
start command)
* Share the cache between multiple projects
2019-07-03 15:16:37 +02:00
Simone Gotti 19793db0c2 runservice: fix linter errors
Fix errors reported by default golangci-lint linters
2019-07-02 14:53:01 +02:00
Simone Gotti 8d67844cc4 *: use vanity url
use agola.io domain
2019-07-01 11:40:20 +02:00
Simone Gotti 5e21089baa *: remove unneeded logging
remove many log.Info entries that where old debugging entries and move some of
them to the Debug level.
2019-06-14 11:28:00 +02:00
Simone Gotti 5c911523c5 sentinel: skip executor that don't allow privileged containers
if they are requested.
2019-06-13 18:32:56 +02:00
Simone Gotti 0296d594b5 executor: add option to allow privileged containers
* add a config option allowPrivilegedContainers
* fail task setup if privileged containers are requested but they aren't
allowed.
* report if privileged containers are allowed to the runservice
2019-06-13 18:31:08 +02:00
Simone Gotti a53e14b4e8 runservice: check if executor is alive before scheduling tasks
Check that the last update time is less than 1 minute (currently hardcoded)
2019-06-12 18:12:37 +02:00
Simone Gotti 6e8d467c80 util: add ErrInternal
ErrInternal is an internal error that should be provided to the user (http api
will return a 500 with the error message)

It'll be used for any kind of error that are not auth or bad requests (like
errors to communicate to another service)
2019-06-11 10:59:21 +02:00
Simone Gotti 863277af2d runservice types: refactor unmarshal
* Don't use a complex UnmarshalJSON for RunConfigTask and ExecutorTask but
introduce a Steps type as a slice of Step (where Step is an empty interface)
and declare an UnmarshalJSON method on the Step type.
2019-06-08 16:30:36 +02:00
Simone Gotti 22bd181fc8 datamanager: implement data files splitting
split data files in multiple files of a max size (default 10Mib)
In this way every data snapshot will change only the datafiles that have some
changes instead of the whole single file.
2019-06-03 16:17:27 +02:00
Simone Gotti 6d095cbe50 readdb: ensure that we apply only etcd committed wals
Ensure that we apply only etcd commited wals to avoid doing an unuseful insert
when the wal becomes committed storage.
2019-06-03 18:02:09 +02:00
Simone Gotti c0a165de31 runservice: fix query when grouping by path
When grouping by group path we have to apply the filter to the group by using
HAVING and not using WHERE
2019-05-23 23:21:45 +02:00
Simone Gotti 4d7605a86b gateway: improve ErrFromRemote handling
Don't create an ErrFromRemote wrapping the returned error but
wrap the ErrFromRemote

Also use xerrors Is/As to get the underlying error to return to api clients
while maintaining context for logging
2019-05-23 12:59:11 +02:00
Simone Gotti 9b2ce717c7 *: migrate to "golang.org/x/xerrors"
Just a raw replace of "github.com/pkg/errors".

Next steps will improve errors (like remote errors, api errors, not exist errors
etc...) to leverage its functionalities
2019-05-23 11:23:14 +02:00
Simone Gotti b3867fb7ca objectstorage: add posix standard storage
rename the previous posix storage to posixflat and make it currently not user
selectable (since I'm not sure it's really worth using it).

The new posix storage uses the filesystem without any escaping so it's not a
real flat namespace.

This isn't a real issue since also minio is not a flat namespace and we are so
forced to use it like a hierarchycal filesystem.
2019-05-21 15:17:53 +02:00
Simone Gotti 0e10a406f9 *: remove server sent events from logs handlers
Just use basic http streaming and send all the data as it's available without
splitting by new lines
2019-05-19 14:35:04 +02:00
Simone Gotti bda7a3eb8b *: add run cancel action
and remove unused change phase to finished
2019-05-15 14:42:50 +02:00
Simone Gotti 81d557d785 runservice: add runEvents handler 2019-05-15 09:46:21 +02:00
Simone Gotti ac893f1c91 runservice: trigger run event in change run phase action 2019-05-15 09:41:56 +02:00
Simone Gotti b95fb98f3c runservice: move RunEvent to types 2019-05-15 09:40:32 +02:00
Simone Gotti 59e4a1f0ba runservice: update runconfig group comment 2019-05-10 11:16:57 +02:00
Simone Gotti 0c94386149 gateway: add badges endpoint
Currently we generate badges only for projects branches. In future this could be
extended to also generate badges for tags and PRs
2019-05-10 11:08:24 +02:00
Simone Gotti ce7924d7fd runservice: use the path escaped cache key
Use the path escaped cache key so we can also handle cache keys with slashes
inside.
2019-05-08 12:15:17 +02:00
Simone Gotti bec9476d6c runservice: store related runid with logs and archives
Logs and archives can be shared by multiple runs. So removing a run doesn't
imply that we could also remote the logs and archives since they could be
"referenced" by another run.

Store also the runids as specific objects along with the logs and archives so,
we'll remove them only when no runids objects exist.
2019-05-08 12:11:46 +02:00
Simone Gotti 1e34dca95d runservice: split and simplify scheduler and executor naming
Also if they are logically part of the runservice the names runserviceExecutor
and runserviceScheduler are long and quite confusing for an external user

Simplify them separating both the code parts and updating the names:

runserviceScheduler -> runservice
runserviceExecutor -> executor
2019-05-07 23:56:10 +02:00
Simone Gotti e4e7de4ad2 runservice/gateway: return run on run action 2019-05-07 23:23:58 +02:00
Simone Gotti afae185e11 *: rework run approval and annotations
* runservice: use generic task annotations instead of approval annotations
* runservice: add method to set task annotations

* gateway: when an user call the run task approval action, it will set in the
task annotations the approval users ids. The task won't be approved.

* scheduler: when the number of approvers meets the required minimum number
(currently 1) call the runservice to approve the task

In this way we could easily implement some approval features like requiring a
minimum number of approvers (saved in the task annotations) before marking the
run as approved in the runservice.
2019-05-06 15:19:29 +02:00
Simone Gotti a590c21127 runservice api: get run from readdb 2019-05-06 15:18:49 +02:00
Simone Gotti 3139ef38d9 runservice readdb: get run from ost db if it's not in run db 2019-05-06 14:55:10 +02:00
Simone Gotti cb78ea48bc runservice: rename command(handler) to action(handler)
Since we're going to migrate all actions (also queries that now are implemented
in the api handlers) there
2019-05-03 23:59:21 +02:00
Simone Gotti bad18bf814 *: report objects size for objectstorage.WriteObject 2019-05-02 09:49:55 +02:00
Simone Gotti 34cfdfeb3b objectstorage: add size option to WriteObject
On s3 limit the max object size to 1GiB when the size is not provided (-1) or
the minio client will calculate a big part size since it tries to use the
maximum object size (5TiB) and will allocate a very big buffer in ram. Also
leave as commented out the previous hack that was firstly creating the file
locally to calculate the size and then put it (for future reference).
2019-05-02 09:47:38 +02:00
Simone Gotti e964aa3537 objectstorage: add persist option to WriteObject
This options is a noop on s3 but on the posix implementation it becomes useful
when there isn't the need to have a persistent file, thus avoiding some fsync
calls.
2019-05-01 15:06:47 +02:00
Simone Gotti 27f84738d6 runservice: simplify workspace restore 2019-04-30 14:00:34 +02:00
Simone Gotti da6aefa7e2 runservice readdb: also resync changegroups 2019-04-29 10:16:19 +02:00
Simone Gotti f5cf3b9fa7 runservice: check changegroup name 2019-04-29 10:12:34 +02:00