Ask.Cyberinfrastructure

How do I provide authentication on GitHub Pages?

We have static documentation for our group on GitHub Pages, and it’s a known fact that even private repos produce static content that is public.

I’m curious what kinds of strategies other groups have used to put authentication in front of all, or part of, a site built from a private repo. For our use case, we would ideally want SAML (single sign-on).

It looks like there is something called jekyll-auth that we could use for some kind of auth, but it depends on deploying with Heroku (which we don’t do).

Here is an example using GitHub credentials, grabbing the content from a private repository, which would work if we didn’t want to use SAML (which we do).

It seems to me like however you try to do this, you’ll end up having to host something behind the authentication regime you want – and if that’s SAML, that implies at least a slightly sophisticated backend, I think?

In such a situation, you might not be gaining anything from involving GitHub Pages at all.

@iki this is a good point! The benefits of having docs on GitHub come down to community contribution. Within our group, if the docs are hosted on a private server, it might still be a bit of a hassle to get access to change them. Having them on GitHub means that if I’m a user or a member of the hosting group and I see something missing or off, I can open an issue or pull request. Or I can ask a question! This is possible with a custom deployment (to an external server), but it’s much easier to just push to the master branch and have it magically appear on GitHub Pages :slight_smile:

So - what about a server existing solely for the SAML auth bit? Do you think that could be possible? (And if you have an idea how, please share!)

I thought about this a bit, but I can’t think of a reasonable way to do this that keeps the documentation strictly within GitHub Pages.

We have our (new, still under construction) documentation in a normal public repo (https://github.com/UCL-RITS/mkdocs-rc-docs) and deployed automatically via a simple cronjob running a slightly fancier version of git pull on an Apache server. Would this sort of option, with a private repo, Deploy Keys, and a standard Apache Shibboleth setup, work for you? It’s significantly more complex, but I’m not seeing any sensible ways to make it simpler while maintaining the SAML component.
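To make the cron part a bit more concrete, here is a minimal sketch in Python of a pull-based deploy using a read-only Deploy Key. This is not our actual script: the paths `REPO_DIR` and `DEPLOY_KEY` and the function names are made up for illustration, and it assumes the private repo has already been cloned over SSH onto the server. The Shibboleth protection is separate Apache configuration and is not shown.

```python
import os
import subprocess
from pathlib import Path

# Hypothetical paths, for illustration only: adjust to your own server layout.
REPO_DIR = Path("/var/www/docs-src")            # existing clone of the private repo
DEPLOY_KEY = Path("/etc/docs-deploy/id_ed25519")  # private half of the Deploy Key


def git_env(deploy_key: Path) -> dict:
    """Return a copy of the environment that makes git use the deploy key over SSH."""
    env = dict(os.environ)
    # GIT_SSH_COMMAND tells git which ssh invocation to use for this run.
    env["GIT_SSH_COMMAND"] = f"ssh -i {deploy_key} -o IdentitiesOnly=yes"
    return env


def pull_command(repo_dir: Path) -> list:
    """Build a fast-forward-only pull, so the server copy never creates merge commits."""
    return ["git", "-C", str(repo_dir), "pull", "--ff-only"]


def update_docs(repo_dir: Path = REPO_DIR, deploy_key: Path = DEPLOY_KEY) -> bool:
    """Pull the latest docs; return True if git exited cleanly, False otherwise."""
    result = subprocess.run(
        pull_command(repo_dir),
        env=git_env(deploy_key),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A cron entry would then run a small wrapper that calls `update_docs()` every few minutes. If the docs need a build step after the pull, that would go in the same wrapper.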

It would definitely be worth a try, because we don’t have a solution at the moment! If the repo is public, then the docs are freely browsable via GitHub? What does the Shibboleth protect?

I would love to take a look at what you have set up, and see if we can customize it for the private case. The one issue that comes to mind (given a private repo) is that if a user wants to contribute, they couldn’t easily open a PR. But I wonder, if it comes down to git pulls anyway, whether there could be a public and a private repo that are served alongside one another?