auth: mandate JWT bearer token auth #199

uniqueg · 2023-10-04T08:09:20Z

Closes #198

The proposed change requires compliant TES implementations to implement JWT-based bearer token authentication.

This will make it easy for client applications to implement the popular OpenID Connect protocol to authenticate users and generate access tokens that could be used to give access to a variety of GA4GH API-backed microservices, following the OAuth2 framework.

In fact, this is also the authentication/authorization flow suggested by the current GA4GH Authentication and Authorization Infrastructure guidelines.

As highlighted in the overview section of #198, bearer token authentication is also the current consensus across other GA4GH OpenAPI specifications that have defined at least one security scheme.

The suggestions to state JWT as the bearerFormat (which accepts arbitrary strings but mentions JWT in its documentation) and to describe the expected behavior explicitly were included to strongly encourage implementers to follow JWT OAuth2 bearer token specifications expressly.

Of course, each implementation can still choose to support any number of alternative, or additional, security schemes.
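For illustration only, here is a minimal sketch of what bearer token authentication means on the client side: the access token travels in the `Authorization` header of each request. The base URL and the helper name `build_tes_request` are hypothetical placeholders, and the request is only constructed, not sent.

```python
import urllib.request


def build_tes_request(base_url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the TES task list.

    base_url is a placeholder for wherever a TES instance is served;
    access_token is whatever bearer token the client obtained upstream.
    """
    url = base_url.rstrip("/") + "/tasks"
    headers = {
        # The bearer scheme: the token itself is the credential.
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)
```

The server side then decides how to validate the token; the header format is the same whether the token is a JWT or an opaque string.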

@MattMcL4475

denis-yuen

Thanks for the heads-up!

patmagee · 2023-10-04T16:01:16Z

openapi/task_execution_service.openapi.yaml

+    bearerAuth:
+      type: http
+      scheme: bearer
+      bearerFormat: JWT


Requiring JWT for an access token may be a bit overly prescriptive. The OIDC standard does not require access tokens to be in JWT format, only ID tokens, so systems are free to implement access tokens as opaque strings.

Less important than the format of the token is the flow that is involved. Is there a way to define OIDC as the required flow?
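To make the JWT versus opaque distinction concrete, here is a rough sketch of a structural check a server could run on an incoming bearer token. It only tests whether the token has the shape of a signed JWT (three dot-separated segments whose first two decode to JSON objects); it does not verify any signature, and the function name `looks_like_jwt` is made up for this example.

```python
import base64
import json


def looks_like_jwt(token: str) -> bool:
    """Return True if the token is shaped like a signed JWT (compact form).

    This is a structural check only: no signature or claim validation.
    """
    parts = token.split(".")
    if len(parts) != 3:
        return False
    try:
        for segment in parts[:2]:  # header and payload
            # base64url segments in JWTs omit padding; restore it.
            padded = segment + "=" * (-len(segment) % 4)
            decoded = json.loads(base64.urlsafe_b64decode(padded))
            if not isinstance(decoded, dict):
                return False
    except ValueError:
        # Covers bad base64, invalid UTF-8 and invalid JSON.
        return False
    return True
```

An opaque access token would fail this check even though it may be perfectly valid under OAuth2, which is the concern raised above.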

@patmagee do you think both BearerAuth and OpenID should be specified, or just OpenID?

https://swagger.io/docs/specification/authentication/

components:
  securitySchemes:
    OpenID:
      type: openIdConnect
      openIdConnectUrl: https://example.com/.well-known/openid-configuration

fyi @uniqueg @patmagee :

"The GA4GH TES (Global Alliance for Genomics and Health Task Execution Schema) is designed for interoperability and often used in multi-tenant, distributed, and security-sensitive environments. Here are some considerations for each security scheme:

OpenID:
Since you want to support OpenID, including this is a must. OpenID Connect is well-suited for scenarios where identity information and distributed single sign-on are required.

OpenID:
  type: openIdConnect
  openIdConnectUrl: https://example.com/.well-known/openid-configuration

BasicAuth:
Basic Authentication is simple but less secure. In a setting where advanced features like single sign-on or multi-factor authentication are important, BasicAuth falls short.

BearerAuth:
Bearer Tokens are a common way to handle API authentication. They can work well with OpenID Connect if the tokens are JWTs issued by an OIDC Identity Provider. You might include this for more simplistic clients that aren't OIDC-compliant but can handle JWTs.

ApiKeyAuth:
API keys are straightforward for client-to-server authentication but aren't designed for scenarios where you need to authenticate end-users or deal with federated identity providers.

OAuth2:
OAuth2 is more flexible than OIDC and can provide more granular permissions via scopes. OIDC is actually built on top of OAuth2, so if you're implementing OIDC, you're implicitly using OAuth2. Including a separate OAuth2 security scheme could be redundant, but it might be useful if you have different services or levels of access requiring OAuth2's more extensive capabilities.

Recommendations:
Mandatory: Since you wish to support OpenID, the OpenID entry should definitely be there.

Conditional: BearerAuth can stay if you envision clients that will use simple Bearer Tokens instead of going through the full OIDC flow.

Optional: OAuth2 can be included for use-cases that need more granular permissions not covered by your OIDC scopes or for clients that are specifically OAuth2-compliant but not OIDC-compliant.

Discouraged: BasicAuth and ApiKeyAuth are generally not recommended for sensitive, distributed, or multi-user environments.

Based on your specific needs, you can decide to keep some, all, or none of the other schemes in addition to OpenID."

Good point @patmagee. Indeed, TES and similar microservices would receive access tokens, not ID tokens, and (OAuth2-compliant) access tokens could indeed be opaque as well.

However, the main point for this issue and PR is to form a basis on which different TES implementations would be interoperable in terms of authN and authZ (we can worry about that later, it's much more complex), so as to be able to realize multicloud use cases. And if every implementation uses a different token format, you would still always need agreements between client and server, which is not scalable (at least that is my understanding). Therefore I feel that if we do not explicitly request implementers to support JWT-based bearer tokens, we wouldn't have gained much.

Also, I think the issue is only relevant for public-facing APIs. If organizations choose to use, e.g., TES internally, there is no strong need to be fully compliant. The same is true for any organization that deliberately chooses not to support one requirement or another for their public-facing APIs, for whatever reason. Nobody is required to implement the entirety of a specification - it would just limit the interoperability. In that way I feel that putting this in the specs prescriptively is for implementers caring about a wide level of interoperability, in which case JWT bearer tokens make sense (I feel).

As for the flow: OpenAPI supports OAuth2 where you can then specify flows. However, I think this is going too deep into AuthZ for the time being. As far as I can tell from previous discussions on the topic (work order tokens, Google's macaroons, the complex use cases and requirements summarized here), we are not at all sure what an "interoperable access control" would look like, and whether it would or could be compliant with OAuth2 (e.g., it would appear to me that adding Passports/Visas to the payload of a POST endpoint is not - apart from being a rather serious REST violation). So while I would actually love to be even more prescriptive here, I'm not sure that that wouldn't come back to haunt us (I feel moving from a bearer to an OAuth2 scheme is easier than the other way around).

As for OIDC, OpenAPI supports the OIDC Discovery scheme. But as I said above, I think TES and similar services would need to process access tokens, not ID tokens, so I don't think it makes sense to use that scheme here or elsewhere among the Cloud APIs (although through the strong encouragement to support JWT bearer tokens, it is pretty much implied that access tokens would be generated through OIDC on the client side, and not, e.g., SAML).

All in all, I feel that the suggested PR, including the explicit mention of the JWT format (which aligns well with Passport and the GA4GH AAI policies), is a good middle ground to prepare for an integrated interoperable AuthN/AuthZ policy across GA4GH API-powered multiclouds, without necessarily committing to OAuth2 or any other standard except JWT.

And thanks @MattMcL4475: I guess ChatGPT kinda supports that view, then. I guess OpenID again is on the level of authentication, which would generally happen on the client side of TES and implies OIDC, which I think we all agree with, but is not in scope for TES.

@uniqueg I can see what you are trying to achieve here.

While JWTs do simplify server code (i.e., you just need to understand how to parse them and all the information is there), they do not simplify trust.

In a real-world setting, a single JWT is likely only usable for a single TES API, since the TES service would need to trust the upstream issuer, and the token should ideally be scoped to a specific audience.

If that is the case, the calling service will need to get a token for each of the downstream TES services anyway. If the flow is discoverable via a discovery endpoint, then this is possible and the specific payload of the token is really not important.

If the intention is that a single token will work on all downstream TES servers, I can see there being issues there unless there is a priori trust with the issuer. At which point the specific format also does not matter, since all services in the system trust the same issuer.

Yes, the use case is for using the same access token across a network of multiple TES instances and/or other GA4GH (Cloud) API-powered microservices.

I guess for us this would never happen. Reusing an access token across services within a network, especially across organizational boundaries, is generally considered a bad security practice.

You end up with an excessively powerful token that has a fairly large attack surface.

The ideal scenario is for each service to accept a token with a specific and unique audience, which means that token is only usable on that service.
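As a sketch of the per-service audience idea, the check below takes the claims of an already validated token and accepts it only if the issuer is one this service trusts and the service's own identifier appears in `aud`. The `iss` and `aud` claim names come from the JWT specification (where `aud` may be a single string or a list); the function name and the example identifiers are assumptions, and signature verification is deliberately out of scope here.

```python
def token_targets_service(
    claims: dict,
    expected_audience: str,
    trusted_issuers: set[str],
) -> bool:
    """Accept a token only if it was issued by a trusted issuer for this service.

    `claims` is assumed to come from a token whose signature has already
    been verified elsewhere; this function checks `iss` and `aud` only.
    """
    aud = claims.get("aud", [])
    # `aud` may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return claims.get("iss") in trusted_issuers and expected_audience in audiences
```

With a check like this, a token minted for one TES instance is rejected by another, which limits the attack surface of a leaked token.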

For us, we accomplish this using token exchange to take the user's ID token or access token and exchange it for a token that can work on the target service with the designated issuer. Hence the actual contents of that token do not really matter to us.

Using token exchange, each TES can have its own issuer it trusts, so long as the gateway WES/TES knows about it and has a client with that issuer that can perform a token exchange.

This has the benefit of using the actual user permissions.

This sounds like an interesting pattern indeed. Would you care to elaborate on who calls whom for the token exchange, and how you think a TES could broadcast its trusted issuer so that the gateway (or WES, or whatever upstream client that may be interested in talking to multiple TES instances) could negotiate the exchange for any given issuer once the preferred TES has been determined? Or a reference with a good explanation and ideally a call diagram specifically for the cross-organizational use case? I've been looking for more information on this exact problem for a long time, but failed to make any real headway.

Also would like to tag @martin-kuba here

@uniqueg for a commercial example you can look at Google Cloud's Workload Identity Federation for inspiration.

In terms of TES, I think we really have everything we need to make this possible.

Please note there are definitely multiple ways to achieve this, but this is just one:

The service info would need to expose authorization information such as the token URI or, better yet, the well-known configuration endpoint.

The TES/WES gateway would need to know about the existence of the TES instance and its auth info (retrieved from the well-known config).

In a perfect world, each issuer will also expose an OAuth endpoint for client registration.

In reality, the gateway would likely need to have a client with the target TES pre-configured with the appropriate grant types (in this case the token exchange grant type).

The issuer may need to be configured (as is the case with Google) to "know" how to process the upstream token.

When a request comes into the gateway, the gateway takes the submitted access token and uses the appropriate client to exchange the token for a new one that is usable with the target TES instance.
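The exchange step could look roughly like the sketch below, assuming the target issuer supports the OAuth 2.0 Token Exchange grant (RFC 8693) and publishes a `token_endpoint` in its well-known configuration. The function names, the discovery document passed in as a plain dict, and the example audience are all hypothetical; client authentication (which a real issuer would require) is omitted, and the request is built but not sent.

```python
import urllib.parse
import urllib.request

# Identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_endpoint_from_discovery(discovery_doc: dict) -> str:
    """Read the token endpoint from an already fetched well-known config."""
    return discovery_doc["token_endpoint"]


def build_token_exchange_request(
    token_endpoint: str, subject_token: str, audience: str
) -> urllib.request.Request:
    """Build the POST that trades the incoming token for one scoped to `audience`."""
    form = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,  # the token the gateway received
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,  # the target TES instance
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The gateway would send this request with the client it holds at the target issuer, then forward the returned token to the TES instance in the `Authorization` header.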

If this seems complex, it's because it kind of is. We are dealing with federated trust here and that is great, but it also introduces complexity.

Bringing it back to your original PR (this is a bit orthogonal to it), I think this probably illustrates why requiring JWT may be a bit restrictive as opposed to just requiring bearer auth.

auth: mandate JWT bearer token auth

9b52bc2

uniqueg requested review from kellrott, vsmalladi, patmagee and denis-yuen October 4, 2023 08:09

auth: update AuthN/AuthZ documentation

cdcee6d

denis-yuen assigned uniqueg Oct 4, 2023

denis-yuen approved these changes Oct 4, 2023

patmagee reviewed Oct 4, 2023
