auth: mandate JWT bearer token auth #199

Open
wants to merge 2 commits into
base: develop

Conversation

uniqueg
Contributor

@uniqueg uniqueg commented Oct 4, 2023

Closes #198

The proposed change requires compliant TES implementations to implement JWT-based bearer token authentication.

This will make it easy for client applications to implement the popular OpenID Connect protocol to authenticate users and generate access tokens that could be used to give access to a variety of GA4GH API-backed microservices, following the OAuth2 framework.

In fact, this is also the authentication/authorization flow suggested by the current GA4GH Authentication and Authorization Infrastructure guidelines.

As highlighted in the overview section of #198, bearer token authentication is also the current consensus across other GA4GH OpenAPI specifications that have defined at least one security scheme.

The suggestions to state JWT as the bearerFormat (which accepts arbitrary strings but mentions JWT in its documentation) and to describe the expected behavior explicitly were included to strongly encourage implementers to follow JWT OAuth2 bearer token specifications expressly.

Of course, each implementation can still choose to support any number of alternative, or additional, security schemes.
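
For illustration, a server-side check for the proposed scheme could start by pulling the token out of the `Authorization` header. The following is a minimal sketch under stated assumptions: the function name `extract_bearer_token` is hypothetical, and it covers only the header syntax, not validation of the token itself.

```python
def extract_bearer_token(authorization_header):
    """Return the token from an 'Authorization: Bearer <token>' value, or None."""
    if not authorization_header:
        return None
    parts = authorization_header.strip().split()
    # Expect exactly a scheme name and one token; HTTP auth scheme names
    # are compared case-insensitively.
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]
```

A request carrying any other scheme (for example Basic) yields `None` here, which a server would typically translate into a 401 response.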

@MattMcL4475

Member

@denis-yuen denis-yuen left a comment

Thanks for the heads-up!

bearerAuth:
  type: http
  scheme: bearer
  bearerFormat: JWT
Copy link

Requiring JWT for an access token may be overly prescriptive. The OIDC standard does not require access tokens to be in JWT format, only ID tokens, so systems are free to implement access tokens as opaque strings.

Less important than the format of the token is the flow that is involved. Is there a way to define OIDC as the flow that is required?
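
To make the JWT versus opaque distinction concrete, here is a rough sketch of a structural check. It is only a heuristic written for illustration: the name `looks_like_jwt` is hypothetical, and the function inspects the shape of the token without verifying any signature.

```python
import base64
import json


def _b64url_decode(segment):
    # JWT segments are unpadded base64url; restore padding before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def looks_like_jwt(token):
    """Heuristic: True if the token has the three-part shape of a signed JWT."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    try:
        header = json.loads(_b64url_decode(parts[0]))
        payload = json.loads(_b64url_decode(parts[1]))
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON alike.
        return False
    return isinstance(header, dict) and "alg" in header and isinstance(payload, dict)
```

An opaque access token such as a random string fails this check, and a server receiving one would instead have to ask the issuer about it (for example through token introspection).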

Contributor

@MattMcL4475 MattMcL4475 Oct 4, 2023

@patmagee do you think both BearerAuth and OpenID should be specified, or just OpenID?

https://swagger.io/docs/specification/authentication/

components:
  securitySchemes:
    OpenID:
      type: openIdConnect
      openIdConnectUrl: https://example.com/.well-known/openid-configuration

Contributor

@MattMcL4475 MattMcL4475 Oct 4, 2023

fyi @uniqueg @patmagee :

"The GA4GH TES (Global Alliance for Genomics and Health Task Execution Schema) is designed for interoperability and often used in multi-tenant, distributed, and security-sensitive environments. Here are some considerations for each security scheme:

OpenID:
Since you want to support OpenID, including this is a must. OpenID Connect is well-suited for scenarios where identity information and distributed single sign-on are required.

OpenID:
  type: openIdConnect
  openIdConnectUrl: https://example.com/.well-known/openid-configuration

BasicAuth:
Basic Authentication is simple but less secure. In a setting where advanced features like single sign-on or multi-factor authentication are important, BasicAuth falls short.

BearerAuth:
Bearer Tokens are a common way to handle API authentication. They can work well with OpenID Connect if the tokens are JWTs issued by an OIDC Identity Provider. You might include this for more simplistic clients that aren't OIDC-compliant but can handle JWTs.

ApiKeyAuth:
API keys are straightforward for client-to-server authentication but aren't designed for scenarios where you need to authenticate end-users or deal with federated identity providers.

OAuth2:
OAuth2 is more flexible than OIDC and can provide more granular permissions via scopes. OIDC is actually built on top of OAuth2, so if you're implementing OIDC, you're implicitly using OAuth2. Including a separate OAuth2 security scheme could be redundant, but it might be useful if you have different services or levels of access requiring OAuth2's more extensive capabilities.

Recommendations:
Mandatory: Since you wish to support OpenID, the OpenID entry should definitely be there.

Conditional: BearerAuth can stay if you envision clients that will use simple Bearer Tokens instead of going through the full OIDC flow.

Optional: OAuth2 can be included for use-cases that need more granular permissions not covered by your OIDC scopes or for clients that are specifically OAuth2-compliant but not OIDC-compliant.

Discouraged: BasicAuth and ApiKeyAuth are generally not recommended for sensitive, distributed, or multi-user environments.

Based on your specific needs, you can decide to keep some, all, or none of the other schemes in addition to OpenID."

Contributor Author

@uniqueg uniqueg Oct 4, 2023

Good point @patmagee. Indeed, TES and similar microservices would receive access tokens, not ID tokens, and (OAuth2-compliant) access tokens could indeed be opaque as well.

However, the main point for this issue and PR is to form a basis on which different TES implementations would be interoperable in terms of authN and authZ (we can worry about that later, it's much more complex), so as to be able to realize multicloud use cases. And if every implementation uses a different token format, you would still always need agreements between client and server, which is not scalable (at least that is my understanding). Therefore I feel that if we do not explicitly request implementers to support JWT-based bearer tokens, we wouldn't have won much.

Also, I think the issue is only relevant for public-facing APIs. If organizations choose to use, e.g., TES internally, there is no strong need to be fully compliant. The same is true for any organization that deliberately chooses not to support one requirement or another for their public-facing APIs, for whatever reason. Nobody is required to implement the entirety of a specification - it would just limit the interoperability. In that way I feel that putting this in the specs prescriptively is for implementers caring about a wide level of interoperability, in which case JWT bearer tokens make sense (I feel).

As for the flow: OpenAPI supports OAuth2 where you can then specify flows. However, I think this is going too deep into AuthZ for the time being. As far as I can tell from previous discussions on the topic (work order tokens, Google's macaroons, the complex use cases and requirements summarized here), we are not at all sure what an "interoperable access control" would look like, and whether it would or could be compliant with OAuth2 (e.g., it would appear to me that adding Passports/Visas to the payload of a POST endpoint is not - apart from being a rather serious REST violation). So while I would actually love to be even more prescriptive here, I'm not sure that that wouldn't come back to haunt us (I feel moving from a bearer to an OAuth2 scheme is easier than the other way around).

As for OIDC, OpenAPI supports the OIDC Discovery scheme. But as I said above, I think TES and similar services would need to process access tokens, not ID tokens, so I don't think it makes sense to use that scheme here or elsewhere among the Cloud APIs (although through the strong encouragement to support JWT bearer tokens, it is pretty much implied that access tokens would be generated through OIDC on the client side, and not, e.g., SAML).

All in all, I feel that the suggested PR, including the explicit mention of the JWT format (which aligns well with Passport and the GA4GH AAI policies), is a good middle ground to prepare for an integrated interoperable AuthN/AuthZ policy across GA4GH API-powered multiclouds, without necessarily committing to OAuth2 or any other standard except JWT.

And thanks @MattMcL4475: I guess ChatGPT kinda supports that view, then. I guess OpenID again is on the level of authentication, which would generally happen on the client side of TES and implies OIDC, which I think we all agree with, but is not in scope for TES.

Copy link

@uniqueg I can see what you are trying to achieve here.

While JWTs do simplify server code (i.e., you just need to understand how to parse them and all the information is there), they do not simplify trust.

In a real-world setting, a single JWT is likely only usable for a single TES API, since the TES service would need to trust the upstream issuer, and the token should ideally be scoped to a specific audience.

If that is the case, the calling service will need to get a token for each of the downstream TES services anyway. If the flow is discoverable via a discovery endpoint, then this is possible, and the specific payload of the token is really not important.

If the intention is that a single token will work on all downstream TES servers, I can see there being issues unless there is a priori trust with the issuer. At that point, the specific format also does not matter, since all services in the system trust the same issuer.
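
As a rough sketch of the trust point: even with a parsed JWT in hand, a TES service would still need to check who issued it and whom it was meant for. The example below assumes the signature has already been verified elsewhere and that `claims` is the decoded payload; the function name, the issuer URL, and the audience values are hypothetical.

```python
def claims_acceptable(claims, trusted_issuers, own_audience):
    """Check issuer trust and audience scoping on already-verified claims."""
    # Assumption: signature verification happened before this call.
    if claims.get("iss") not in trusted_issuers:
        return False
    aud = claims.get("aud")
    # The "aud" claim may be a single string or a list of strings.
    if isinstance(aud, str):
        audiences = [aud]
    elif isinstance(aud, (list, tuple)):
        audiences = list(aud)
    else:
        audiences = []
    return own_audience in audiences


# Hypothetical configuration for one TES instance.
trusted = {"https://issuer.example.org"}
ok = claims_acceptable(
    {"iss": "https://issuer.example.org", "aud": "tes-a"}, trusted, "tes-a"
)
```

A token minted for `tes-b` is rejected by `tes-a` under this check, which is why the caller ends up needing one token per downstream service regardless of the token format.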

Copy link

Yes, the use case is for using the same access token across a network of multiple TES instances and/or other GA4GH (Cloud) API-powered microservices.

I guess for us this would never happen. Reusing an access token across services within a network, especially across organizational boundaries, is generally considered a bad security practice.

You end up with an excessively powerful token that has a fairly large attack surface.

The ideal scenario is for each service to accept a token with a specific and unique audience, which means that the token is only usable on that service.

For us, we accomplish this using token exchange to take the user's ID token or access token and exchange it for a token that can work on the target service with the designated issuer. Hence the actual contents of that token do not really matter to us.
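
For readers unfamiliar with token exchange, the request body can be sketched as follows. This is an illustration based on the OAuth 2.0 Token Exchange parameters (RFC 8693), not a description of our actual setup: the function name and the example audience are hypothetical, and client authentication and the HTTP call itself are left out.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_body(subject_token, audience):
    """Build the form-encoded body for a token exchange request."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        # The token the gateway received from the user.
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        # The downstream service the new token should be scoped to.
        "audience": audience,
    }
    return urlencode(params)


body = build_token_exchange_body("user-token", "https://tes.example.org")
```

The body would be POSTed to the target issuer's token endpoint with `Content-Type: application/x-www-form-urlencoded`; the response then carries a token scoped to the requested audience.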

Copy link

Using token exchange, each TES can have its own issuer that it trusts, so long as the gateway WES/TES knows about it and has a client with that issuer that can perform a token exchange.

This has the benefit of using the actual user permissions.

Contributor Author

This sounds like an interesting pattern indeed. Would you care to elaborate on who calls whom for the token exchange, and how you think a TES could broadcast its trusted issuer so that the gateway (or WES, or whatever upstream client that may be interested in talking to multiple TES instances) could negotiate the exchange for any given issuer once the preferred TES has been determined? Or a reference with a good explanation and ideally a call diagram specifically for the cross-organizational use case? I've been looking for more information on this exact problem for a long time, but failed to make any real headway.

Contributor Author

Also would like to tag @martin-kuba here

Copy link

@uniqueg for a commercial example, you can look at Google Cloud's Workload Identity Federation for inspiration.

In terms of TES, I think we really have everything we need to make this possible.

Please note there are definitely multiple ways to achieve this, but this is just one:

  1. The service info would need to expose authorization information such as the token URI or, better yet, the well-known configuration endpoint.
  2. The TES/WES gateway would need to know about the existence of the TES instance and its auth info (retrieved from the well-known config).
  3. In a perfect world, each issuer would also expose an OAuth endpoint for client registration.
  4. In reality, the gateway would likely need to have a client with the target TES pre-configured with the appropriate grant types (in this case, the token exchange grant type).
     • The issuer may need to be configured (as is the case with Google) to "know" how to process the upstream token.
  5. When a request comes into the gateway, the gateway takes the submitted access token and uses the appropriate client to exchange the token for a new one that is usable with the target TES instance.
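
The discovery part of these steps could look roughly like the sketch below. It assumes the TES service info exposes an issuer URL (which it does not today) and that the issuer publishes standard OpenID Connect discovery metadata; the function names and example URLs are hypothetical, and fetching the document over HTTP is omitted.

```python
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"


def discovery_url(issuer):
    """Derive the well-known configuration URL from an issuer URL."""
    # OpenID Connect Discovery places the metadata under this fixed path.
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def token_exchange_endpoint(metadata):
    """Return the token endpoint if the issuer advertises token exchange, else None."""
    # An absent "grant_types_supported" is treated as "not advertised".
    if TOKEN_EXCHANGE_GRANT not in metadata.get("grant_types_supported", []):
        return None
    return metadata.get("token_endpoint")


# Hypothetical metadata, as it might be returned by the discovery URL.
metadata = {
    "issuer": "https://issuer.example.org",
    "token_endpoint": "https://issuer.example.org/token",
    "grant_types_supported": ["authorization_code", TOKEN_EXCHANGE_GRANT],
}
endpoint = token_exchange_endpoint(metadata)
```

With the endpoint in hand, the gateway would use its pre-configured client for that issuer to perform the exchange described in the last step.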

If this seems complex, it's because it kind of is. We are dealing with federated trust here and that is great, but it also introduces complexity.

Bringing it back to your original PR (this is a bit orthogonal to it), I think this probably illustrates why requiring JWT may be a bit restrictive as opposed to just requiring bearer auth.

Development

Successfully merging this pull request may close these issues.

Require JWT Bearer Token authentication
4 participants