it's fairly straightforward to install a gitlab-runner and execute
locally, as far as I can tell a malicious GitLab installation could
still send a modified "script" (post-processed .gitlab-ci.yml) or repo
checkout down to the runner. Maybe there's some way to audit this, but I
couldn't find an obvious one. Maybe configuring the runner to log at
debug level would record enough?
Advanced configuration | GitLab
That's not what I mean. I don't mean installing your own runner locally
and hooking it up with GitLab. I mean installing the gitlab-runner
package (only!) and *not* hooking it up in GitLab.
Instead, you run the job completely locally, without involving GitLab at
all. That's done with the `gitlab-runner exec` command:
GitLab Runner commands | GitLab
We have docs about this here:
ci · Wiki · The Tor Project / TPA / TPA team · GitLab
This removes a large part of the attack surface because GitLab is taken
out of the equation. It reduces the stack to:
* your local computer and operating system
* your git repository
* the executor (e.g. Docker) and its image
It's still pretty darn large, but it's better than before.
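For illustration, here is a minimal Python sketch of driving a single job this way. It assumes a gitlab-runner release that still ships the `exec` subcommand and that the command is run from the repository checkout, where it looks for .gitlab-ci.yml; the helper names `exec_argv` and `run_job_locally` are hypothetical.

```python
import shutil
import subprocess


def exec_argv(job_name, executor="docker"):
    """Build the command line for running one job without a GitLab server."""
    return ["gitlab-runner", "exec", executor, job_name]


def run_job_locally(job_name, repo_dir, executor="docker"):
    """Run a single job from the checkout in repo_dir and return its exit code."""
    if shutil.which("gitlab-runner") is None:
        raise FileNotFoundError("gitlab-runner is not on PATH")
    # Run inside the checkout so the runner finds .gitlab-ci.yml there.
    result = subprocess.run(exec_argv(job_name, executor), cwd=repo_dir, check=False)
    return result.returncode
```

Something like `run_job_locally("build", ".")` would then run a job named `build` (a made-up job name) with the Docker executor.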
Ahhh right, I'd forgotten about `gitlab-runner`'s `exec` feature. Unfortunately the current implementation of the feature is a bit hacky and not super well-documented. IIUC they took it from a 3rd party pull request, tried to rip it back out, but too many people screamed so it's still there in a semi-zombie state. It looks like they're working on designing a new implementation that they'll be happier with. Local runner execution MVC (#2797) · Issues · GitLab.org / gitlab-runner · GitLab.
The current version only runs a single job, not a whole pipeline, so you still need some wrapper logic for multi-job pipelines to run the jobs in the right order, copy artifacts between them, initialize pipeline-level variables, etc.
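A rough sketch of the ordering part of such wrapper logic, in Python. It assumes the .gitlab-ci.yml has already been parsed into a dict (e.g. by a YAML library) and it ignores `needs:`, `rules:` and artifact handling entirely; the function name and the list of reserved keys are hypothetical and not exhaustive.

```python
# Top-level keys that hold configuration rather than jobs (not an exhaustive list).
RESERVED_KEYS = {
    "stages", "variables", "default", "include", "workflow",
    "image", "services", "before_script", "after_script", "cache",
}

# Stages used when the file defines none (leaving out `.pre` and `.post`).
DEFAULT_STAGES = ["build", "test", "deploy"]


def jobs_in_stage_order(config):
    """Return job names ordered by the position of their stage in `stages`."""
    stages = config.get("stages", DEFAULT_STAGES)
    ordered = []
    for stage in stages:
        for name, job in config.items():
            if name in RESERVED_KEYS or name.startswith("."):
                continue  # skip global keys and hidden (template) jobs
            if not isinstance(job, dict):
                continue
            # A job without an explicit stage belongs to "test".
            if job.get("stage", "test") == stage:
                ordered.append(name)
    return ordered
```

Each name in the returned list could then be handed to whatever runs a single job, one after the other.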
For the Debian package build I got it partly working, but couldn't find a way to run a single job out of a parameterized matrix (which they use to build for multiple platforms and architectures). Given the other headaches and lack of documentation I shelved this approach for the moment (Confirm Tor Project tor.git package builds are reproducible (#40615) · Issues · The Tor Project / Core / Tor · GitLab).
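To make the matrix problem concrete: a `parallel: matrix:` entry expands to one job instance per combination of its variable values. The Python sketch below does that expansion by hand so that a single combination can be picked out and passed as environment variables to a one-off local run. The variable names and values (`DISTRO`, `ARCH`, ...) are made up for the example and are not taken from the actual tor package build.

```python
from itertools import product


def expand_matrix(matrix):
    """Expand a `parallel: matrix:` list into one variables dict per job instance."""
    combos = []
    for entry in matrix:
        keys = list(entry)
        # A value may be a single string or a list of strings.
        value_lists = [v if isinstance(v, list) else [v] for v in entry.values()]
        for values in product(*value_lists):
            combos.append(dict(zip(keys, values)))
    return combos


if __name__ == "__main__":
    # Hypothetical matrix, only for illustration.
    matrix = [{"DISTRO": ["bullseye", "bookworm"], "ARCH": ["amd64", "arm64"]}]
    for variables in expand_matrix(matrix):
        print(variables)
```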
I agree that this feature is potentially very useful. The "v2" proposal of the feature will run a whole pipeline, but communicates with GitLab to help do so, which may defeat the purpose again from our perspective (at least without some careful auditing of the communication between GitLab and the runner). Local runner execution MVC (#2797) · Issues · GitLab.org / gitlab-runner · GitLab
For that issue I ended up hacking together a small python script that
processes the .gitlab-ci.yml into something to feed directly through
Docker. It's currently a bit hacky and specialized for the Debian tor
package build. I think it could be generalized further to be reusable if
that's of interest (maybe using Docker Compose to orchestrate jobs
within a pipeline), but am still thinking about whether there's a better
reproduce_pipeline.py · main · Jim Newsome / reproduce-tor-debian-build · GitLab
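As a generic illustration of the idea (this is not the linked script, just a sketch under assumptions), a job that has been parsed into a dict can be turned into a plain `docker run` command line roughly like this. The `/build` mount point and the function name `docker_argv` are hypothetical, and `services`, `before_script`, caching and artifacts are ignored.

```python
import shlex


def docker_argv(job, repo_dir, default_image=None, extra_env=None):
    """Build a `docker run` command line that runs the job's script in its image."""
    image = job.get("image", default_image)
    if isinstance(image, dict):
        image = image.get("name")  # `image:` may be given as a mapping with a `name` key
    if not image:
        raise ValueError("job has no image and no default image was given")

    env = dict(job.get("variables", {}))
    env.update(extra_env or {})

    argv = ["docker", "run", "--rm",
            "--volume", f"{repo_dir}:/build",  # hypothetical mount point
            "--workdir", "/build"]
    for key, value in sorted(env.items()):
        argv += ["--env", f"{key}={value}"]

    # Run all script lines in one shell and stop at the first failing command.
    script = "\n".join(["set -e"] + list(job.get("script", [])))
    argv += [image, "sh", "-c", script]
    return argv


if __name__ == "__main__":
    job = {"image": "debian:stable", "script": ["make", "make check"]}
    print(shlex.join(docker_argv(job, "/path/to/checkout")))
```

Returning an argument list rather than a shell string keeps quoting out of the picture; the list can be passed directly to `subprocess.run`.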
Note that @eighthave has done a similar thing for F-Droid, you might
want to collaborate.
Thanks, good to know!
I think the improvement of that over the above is that you remove the
"gitlab-runner" part of the attack surface. It's a pretty large attack
surface because the runners are a surprisingly large amount of code, but
I wonder if it's worth the trouble...
What's the threat model here specifically? Backdoored gitlab-runner code?
Right - I agree there's not much security benefit over the `gitlab-runner exec` approach. I just found I ultimately wasn't getting that much benefit out of it since I was already having to write all the pipeline-orchestration, and got tired of wrestling with the lack of documentation etc :).
Right now my top candidate that we haven't tried yet is to install a full
local GitLab in addition to a local gitlab-runner; maybe using their
published Docker images: https://docs.gitlab.com/ee/install/docker.html
This seems like the least engineering effort (~none) but a bit more work
for every individual wanting to do such a local build.
Other organisations run *two* GitLab instances for that purpose, by the
way. GitLab.com included, from what I understand.
Keeping as much logic out of the .gitlab-ci.yml as possible so that the
gitlab yml is trivial to manually reproduce outside of gitlab (e.g. run
`./build.sh`) is probably ideal, though gives up some gitlab
What functionality are you thinking of here?
For example the debian package build in particular makes heavy use of yml templating. The same thing could be achieved other ways - e.g. moving the yml snippets out to shell files/functions that can be invoked by the other "job scripts", but it adds more indirection and fragmentation vs having everything in one place in the yml file.
For multi-job pipelines, you also still end up having to duplicate the outer orchestration between jobs in the pipeline between yml and some other driver script. You can mitigate this by using fewer jobs (maybe just 1) but that's again giving up some gitlab functionality.
Thanks for the input!
On 6/20/22 09:20, Antoine Beaupré wrote:
tor-project mailing list