Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formally define (ABNF) 3.x URL templating behavior #3256

Open
handrews opened this issue Apr 24, 2023 · 27 comments
Open

Formally define (ABNF) 3.x URL templating behavior #3256

handrews opened this issue Apr 24, 2023 · 27 comments
Assignees
Labels
clarification requests to clarify, but not change, part of the spec request matching Matching requests to URL templates, media types, etc. review
Milestone

Comments

@handrews
Copy link
Member

As noted in Slack discussions, there is no formal specification for where variables can appear in server URL templates and path templates. There does not seem to be any reason to place restrictions on it.

Ideally, we could leverage the ABNF of RFC 6570 URI Templates even though OAS 3.x templating is not the same as RFC 6570 in general. (@darrelmiller @webron @MikeRalphson any concerns here?)

@handrews handrews self-assigned this Apr 27, 2023
davishmcclurg added a commit to davishmcclurg/OpenAPI-Specification that referenced this issue Nov 20, 2023
`Server.url` and `Link.operationRef` both allow variable substitution
with {brackets}, which means they're not always valid URI references.

For example, the [current specification][0] shows
`https://{username}.gigantic-server.com:{port}/{basePath}` as a Server
Object `url`, but it's not a valid URI reference because the host
includes curly brackets.

[`operationRef`][1] similarly includes
`https://na2.gigantic-server.com/#/paths/~12.0~1repositories~1{username}/get`
as an example that isn't valid using the `uri-reference` format.

I looked into the other uses of `uri-reference` and they seemed ok.

Related:
- OAI#2586
- OAI#3235
- OAI#3256
- davishmcclurg/json_schemer#158

[0]: https://spec.openapis.org/oas/v3.1.0#server-object-example
[1]: https://spec.openapis.org/oas/v3.1.0#operationref-examples
@handrews
Copy link
Member Author

Issue #2119 brings up a lot of points relevant to this.

@handrews handrews modified the milestones: v3.1.1, v3.0.4 Jan 27, 2024
@handrews handrews added clarification requests to clarify, but not change, part of the spec request matching Matching requests to URL templates, media types, etc. and removed 3.0.4 labels Jan 27, 2024
@char0n
Copy link
Contributor

char0n commented May 9, 2024

@handrews I created a implementation with formal ANBF grammar in https://github.com/char0n/openapi-path-templating.

I did a couple of assumption, because of the fact that the templating is not defined very precisely. Hope it helps a bit.

There does not seem to be any reason to place restrictions on it.

IMHO the restriction would be to now allow template expressions in URL query (searchParams) and URL fragment

@handrews
Copy link
Member Author

@char0n thank you! I don't know if we're going to get to this in the current patch releases or not (I'm unassigning it for onw as I have too many things in my backlog), but this will help when we do get there!

@handrews handrews removed their assignment May 13, 2024
@handrews
Copy link
Member Author

handrews commented Jun 4, 2024

@char0n I took a look at your ABNF- it looks like it restricts path template variables to whole segments. This issue got filed because in a discussion with TSC folks on slack, it was decided that template variables are not restricted to whole segments. You can have a template like /foo{bar}/whatever. I don't know how many API descriptions actually do that, but I figured you might want to know since the initial comment in this issue was not sufficiently clear.

@char0n
Copy link
Contributor

char0n commented Jun 6, 2024

@handrews I've created implementation with formal ABNF grammar for Server URL Templating in https://github.com/char0n/openapi-server-url-templating.

It's distinct from https://github.com/char0n/openapi-path-templating, as the Server URL Templating is based on RFC 6570, but Path Templating is more based on RFC 3986.

@char0n
Copy link
Contributor

char0n commented Jun 6, 2024

@char0n I took a look at your ABNF- it looks like it restricts path template variables to whole segments. This issue got filed because in a discussion with TSC folks on slack, it was decided that template variables are not restricted to whole segments. You can have a template like /foo{bar}/whatever. I don't know how many API descriptions actually do that, but I figured you might want to know since the initial comment in this issue was not sufficiently clear.

The grammar is not restricted to the whole segments:

path-segment                   = 1*( path-literal / template-expression )

path-segment is defined as non-terminal rule containing one or more path-literal or template-expression.

This path template /foo{bar}/whatever' is parsed into following AST:

[
  [ 'path-template', '/foo{bar}/whatever' ],
  [ 'path', '/foo{bar}/whatever' ],
  [ 'slash', '/' ],
  [ 'path-literal', 'foo' ],
  [ 'template-expression', '{bar}' ],
  [ 'template-expression-param-name', 'bar' ],
  [ 'slash', '/' ],
  [ 'path-literal', 'whatever' ]
]

NOTE: there is a test that guarantee this works as expected in https://github.com/char0n/openapi-path-templating/blob/bea2682e889a6bc06da95596b303dcc77304402f/test/parse.js#L45

@char0n
Copy link
Contributor

char0n commented Jun 6, 2024

@handrews,

There is one un-clarity related to template-expression-param-name rule from openapi-path-templating. OpenAPI spec doesn't technically constrain Parameter Object.name fixed field in any way. It just says it's a string. This means that people can use "parameter name" as a value.

This also means that the path template will need to be defined as /{parameter name}

It's just another example how OpenAPI Path Templating differs from RFC 6570.


@handrews taking into account the above, template-expression-param-name should be less constrained and in theory should allow any non empty string value. What do you think?

@LasneF
Copy link
Member

LasneF commented Jun 17, 2024

The key point here is that RFC is defining a way to represent template
according to this specification it defines that template are a string , with basic rules about concatenation and one about size. no more.

in OAS leveraging parameter with a type and a format allows munch more complex definition but the ambigious pattern policy is not fully defined, especially this one /pets/{petId} vs /pets/{name} mentioned as invalid ; where there is nothing defined toward type of format

in the other hands some large framework has implemented that path templating can support either type , like in Python on Flask

  • /Hello/int:post_id
  • /hello/string:you

or in Java where you URI are always considered as a string BUT allowing regular expression for matching

  • "/users/{id: [0-9]*}
  • or even more complex by defining a "binder"

changing to the RFC 'like' meaning path as string with low format capability , could be seen as a breaking change , and would requires to a significant change in OAS spec to be precise (like not leveraging parameter object for instance)

using the full OAS capability as defines here need just more sample about ambiguous path
and setting the limit as
do we consider the json type only , or the type + the format (but that s more tricky about as for pattern can overlap)

notice that if /a/{b} vs /a/c is declared as non ambigious , looks on URI evaluation to me /{b}/a and /a/{b} should be same level taking the same principle but applying segment per segment from left to right

@char0n
Copy link
Contributor

char0n commented Jun 17, 2024

@handrews,

So is it safe to assume that {name} part must abide to RFC6570 and is defined by following ABNF non-terminal?

name       =  varchar *( ["."] varchar )
varchar    =  ALPHA / DIGIT / "_" / pct-encoded

My current implementation assume (probably incorrectly):

name                   = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved          = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded         = "%" HEXDIG HEXDIG
sub-delims          = "!" / "$" / "&" / "'" / "(" / ")"
                                / "*" / "+" / "," / ";" / "="

@LasneF
Copy link
Member

LasneF commented Jun 19, 2024

@handrews

@LasneF currently, parameter type is not considered as part of request matching

here i do not want to open a pandora box , may be better to add a sample mentioning it

@char0n , i am not even sure that is safe , as i am not sure of the support of the complexity of such pattern by most of framework

{/list*,path:4} or
{+path,x}/here with sample like this /foo/bar,1024/here

i am wondering if this should not be limited to something simplistic (that still need to be defined)

@handrews
Copy link
Member Author

@char0n I don't think there's anything safe to assume here. I have no idea what tools do in such cases, and we do not place syntactic restrictions on our URL Template variables. I think it would be equally valid (based on the current text) to say you can use whatever name you want, but before you pass it to an RFC6570 library you need to encode it. That's essentially how the new Appendix C says to treat in: query params.

@handrews
Copy link
Member Author

@OAI/tsc review request: Does anyone have the bandwidth to sort this out for 3.0.4 and 3.1.1? If not, let's change the milestone to 3.2.0. It does not seem to be something that is causing problems in practice and I need to focus more on Moonwalk now and don't have the time to research this.

@ralfhandl
Copy link
Contributor

My naive interpretation of the current specification text:

  • The Paths Object points to section Path Templating.
  • Section Path Templating defines (a bit informally, yet unmistakably) that a template expression has the form { + path parameter name + }, which is then replaced with the path parameter value.
  • It then points to the Path Item Object and the Operation Object, which in turn point to the Parameter Object.
  • The Parameter Object explains how to construct a path parameter value and also points to the new Appendix C: Using RFC6570 Implementations.

Not sure what we could add here.

@lornajane lornajane modified the milestones: v3.0.4, v3.2.0 Sep 5, 2024
@handrews handrews changed the title Formally define 3.x URL templating behavior Formally define (ABNF) 3.x URL templating behavior Nov 21, 2024
@baywet
Copy link
Contributor

baywet commented Nov 25, 2024

Hey everyone,
Thank your for starting to think about that.

Like we've done in 3.1.1 I believe we're going to be able to make more progress here by scoping things out a bit more.

Ideally (IMHO) we'd be leveraging RFC6570 URI templates as it improves reusability, and even unlocks scenarios like /foo/bar?query=baz accepts on schema for the POST request body and /foo/bar?query=baz&query2=zag another (arguably a bad design, but not even possible today).
However, this is most likely a change we can't make in 3.X since it'd be a breaking change, so we can/should push that to 4.X instead.

Working backwards from that, there are two confusions I often see in specifications:

  • Which paths are "conflicting" according to the spec? e.g. /foo/{bar} and /foo/{zag}. While this call out is present in 4.8.8.2 it's not present in 3.5. TLDR I think the spec should call this out in a more prominent way.
  • Which characters are allowed for path variable names? (e.g. is a + sign ok?) While the spec calls out RFC3986 for the values in section 3.5 it's hard to discover. And doesn't say anything (as far as I understand) for the name of the variable itself. Maybe we could start aligning with RFC 6570 here? without completely overhauling path templating.

I hope this helps. Happy to expand if I wasn't clear on some aspects here.

@handrews
Copy link
Member Author

@baywet This issue is just for describing the existing system with a formal ABNF, not for changing how it works.

For improvements on what templates are ambiguous, there are several request matching issues filed on that topic.

@handrews
Copy link
Member Author

@baywet

Which characters are allowed for path variable names?

How to handle characters outside of the defnitely-OK range is addressed in Appendix C "Using RFC6570-based serialization" §C.4.4 Illegal variable names as parameter names.

@baywet
Copy link
Contributor

baywet commented Nov 26, 2024

Thank you for the additional information.

Implementations of this specification MAY use an implementation of RFC6570 to perform variable expansion

Appendix C

To me, it's not obvious that OAS path template variable names MUST follow the naming pattern of variables in RFC 6570. Not only the appendix says "MAY", but also:

To be clear, I'm not debating what convention should the variable names follow, I'm simply trying to point out that discovering such convention in the current form of the document is difficult.

@lornajane
Copy link
Contributor

Discussed in TDC and agreed that we can better signpost the existing content in Appendix C, and also make it clearer how flexible these sections are or could be. We discussed examples that include URLs like /sort/{date},{user}/filter/... or /user-{userId}.

Next steps are for @baywet to put together some proposed changes and @handrews to review.

@baywet
Copy link
Contributor

baywet commented Nov 28, 2024

@handrews I've created implementation with formal ABNF grammar for Server URL Templating in https://github.com/char0n/openapi-server-url-templating.

It's distinct from https://github.com/char0n/openapi-path-templating, as the Server URL Templating is based on RFC 6570, but Path Templating is more based on RFC 3986.

@char0n thanks for starting such great work here! While reviewing those, I was wondering what lead you to the conclusion that Server URL templating supports RFC6570 since

Variables and MAY be relative, to indicate that the host location is relative to the location where the document containing the Server Object is being served. Variable substitutions will be made when a variable is named in {braces}.

here

Or maybe I missed something?

@baywet
Copy link
Contributor

baywet commented Nov 29, 2024

@handrews I've added a bunch of test cases to the excellent work @char0n started here. This is meant to be a base for discussion, please have a look at it when you have a couple minutes. swaggerexpert/openapi-path-templating#145

@char0n
Copy link
Contributor

char0n commented Dec 3, 2024

@char0n thanks for starting such great work here! While reviewing those, I was wondering what lead you to the conclusion that Server URL templating supports RFC6570 since

From what I remember: Server URL templating doesn't refer to any RFC explicitly, I've inferred (from the information available within the OpenAPI spec) that the closest thing to this is subset of RFC 6570: URI Template. Because Server Variable can be substituted in any part of the URI, literals around server-variables are parsed differently that for Path Templating, where template-expressions are only occurring in path part of the URI.

Hence - the Server URL templating uses ABNF non-terminals from RFC 6570: URI Template, and Path templating uses ABNF non-terminals from RFC3986 section 3.3. to parse literals around template expressions and server variables.

Both openapi-path-templating and openapi-server-url-templating uses the same same value for the ABNF non-terminal describing what's allowed as the value between { and }.

# openapi-path-templating
template-expression-param-name = 1*( unreserved / pct-encoded / sub-delims / ":" / "@" )
# openapi-server-url-templating
server-variable-name   = 1*( unreserved / pct-encoded / sub-delims / ":" / "@" )

Hope it clears things up a bit.

@baywet
Copy link
Contributor

baywet commented Dec 3, 2024

Thank you for the additional information.

The specification aspect of that brings so many questions:

  • what do we expect client applications to do? resolve the server URL first, then append the path template, and then resolve this? or concat first and resolve all at once?
  • if resolve all at once, what should happen if a server variable conflicts with a path template parameter (same name) ? Can the path template use server variables at all?
  • what characters are valid for the server variable name?
  • do we care that authorized values for the variables living in the scheme, port and base path should be different? What about variables that span the whole authority? or both scheme and authority? (i.e. do we want to prevent people expanding to invalid server urls)

I'd suggest that we focus on locking down path templating first, and come back to this one later to avoid loosing focus. Plus the answers from path templating should help us with server templating, and the relationship to one another. Thoughts?

@char0n
Copy link
Contributor

char0n commented Dec 3, 2024

what do we expect client applications to do? resolve the server URL first, then append the path template, and then resolve this? or concat first and resolve all at once?

The tooling that I work on, resolve the server URL and path template separately and then concatenate them.

if resolve all at once, what should happen if a server variable conflicts with a path template parameter (same name) ? Can the path template use server variables at all?

Because of above answer, this is a non issue.

do we care that authorized values for the variables living in the scheme, port and base path should be different? What about variables that span the whole authority? or both scheme and authority? (i.e. do we want to prevent people expanding to invalid server urls)

AFAICT, from spec itself and how people use it, the variable can be wherever you want it to be. One of the more exotic usecase I've seen is SASS on Premise:

servers:
  - url: "{server}/v1"
    variables:
      server:
        default: https://api.example.com # SaaS server

More are described here: https://swagger.io/docs/specification/v3_0/api-host-and-base-path/

notEthan pushed a commit to notEthan/scorpio that referenced this issue Dec 4, 2024
`Server.url` and `Link.operationRef` both allow variable substitution
with {brackets}, which means they're not always valid URI references.

For example, the [current specification][0] shows
`https://{username}.gigantic-server.com:{port}/{basePath}` as a Server
Object `url`, but it's not a valid URI reference because the host
includes curly brackets.

[`operationRef`][1] similarly includes
`https://na2.gigantic-server.com/#/paths/~12.0~1repositories~1{username}/get`
as an example that isn't valid using the `uri-reference` format.

I looked into the other uses of `uri-reference` and they seemed ok.

Related:
- OAI/OpenAPI-Specification#2586
- OAI/OpenAPI-Specification#3235
- OAI/OpenAPI-Specification#3256
- davishmcclurg/json_schemer#158

[0]: https://spec.openapis.org/oas/v3.1.0#server-object-example
[1]: https://spec.openapis.org/oas/v3.1.0#operationref-examples

(cherry picked from commit 5f765f29e3c12dcab370f5155fe21c6895a8ac5e)
@baywet
Copy link
Contributor

baywet commented Dec 5, 2024

From the TDC discussions: there's a preference to start a pull request from what we have today rather than tweaking the tests suite from the library. @baywet to put a pull request together for the path templates and we'll iterate from there.

@baywet
Copy link
Contributor

baywet commented Dec 5, 2024

@handrews @mikekistler just put together #4244 as a starting point for discussion. Let me know what you think!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification requests to clarify, but not change, part of the spec request matching Matching requests to URL templates, media types, etc. review
Projects
None yet
Development

No branches or pull requests

6 participants