-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release R137 #1238
Release R137 #1238
Conversation
Thanks for your pull request. Is this your first contribution to a Snowplow open source project? Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://docs.snowplowanalytics.com/docs/contributing/contributor-license-agreement/ to learn more and sign. Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks. |
The yauaa schema already got thoroughly reviewed in a previous PR. It was not included in the R136 release, because of concerns about breaking RDB loader migrations. Since then, RDB Loader 4.3.0 was released, which fixes the bug in which Redshift columns were not altered for the new field lengths. So we are safe to release a new version of yauaa schema with longer field lengths. We have not, however, fixed a different RDB Loader bug which causes problems when evolving a schema to have new fields. This bug is going to take longer to fix. It is a race condition, so it will affect some pipelines, but not others. Given the high number of pipelines that use the yauaa schema, I think it is dangerous to add a new field to this schema. This new PR therefore drops the idea of adding a new field |
.github/scripts/constants.sh
Outdated
VALIDATOR_URI="https://github.com/snowplow-incubator/igluctl/releases/download/${VALIDATOR_V}/igluctl" | ||
VALIDATOR_V="0.10.1" | ||
VALIDATOR_ZIP="igluctl_${VALIDATOR_V}.zip" | ||
VALIDATOR_URI="https://github.com/snowplow-incubator/igluctl/releases/download/${VALIDATOR_V}/$VALIDATOR_ZIP" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how long GitHub maintains redirecting old repo addresses, it works now but the repo lives under snowplow
now, you might want to change that to
VALIDATOR_URI="https://github.com/snowplow/igluctl/releases/download/${VALIDATOR_V}/$VALIDATOR_ZIP"
LGTM |
1e0ada7
to
60e8667
Compare
Update: I also removed |
LGTM, one note, I think we should make it very clear in the Enrich release which includes this new schema that they should be running RDB Loader 4.3+ to ensure they don't hit the length issue. |
@paulboocock it's a bit more complicated: The RDB Loader migrates tables immediately when the schema is published. So if a pipeline has events containing So for anyone running a version of RDB Loader released between April 2022 and 6th September 2022, the migration will "run" but won't actually change the column lengths. Nothing will break, and open source users won't immediately notice that anything happened. It could only affect pipelines later, when they upgrade to an Enrich with a newer version of yauaa. Some events might exceed the maximum lengths in their warehouse, and then loading would break. When it gets this far, it will not even help to update their RDB loader version. It's a horrible mess. My only suggestions to alleviate this problem are:
|
I think:
is likely the best result we've got here in terms of balancing the situation. Bumping to 2-0-0 is also going to be extremely disruptive for users (data models everywhere) and I'd hope that anyone who is running an upgraded RDB Loader since April 2022 is reasonably on top of their upgrades. |
This is now tested against RDB Loader 4.3.0 and against the pending Enrich PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy with the changes made.
4272b14
to
85ca48c
Compare
Release 137 (2022-11-01) | ||
------------------------ | ||
Add nl.basjes/yauaa_context/jsonschema/1-0-4 (#1216) | ||
Bump igluctl version to allow schema lists in dev registry (#1236) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add ?
Add dev.amp.snowplow/amp_session/jsonschema/1-0-0 (#1235)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @igneel64 -- I had changed it locally and then I forgot to push it!! Fixed now.
85ca48c
to
0ce84c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving this one but going to wait to merge it until closer to EMEA coming online.
On top of the warning message can we also prepare some SQL statements for Redshift that would allow a user to fix their YAUAA table in case they do fall in that awkward loader version window that doesn't perform the update correctly?
That means that when we merge this in we can post a message on Discourse with an advisory and a way to fix it if they are unlucky enough.
Wdyr @istreeter ?
To be clear... merging this PR is not the dangerous merge. The dangerous step (for some pipelines) will be upgrading enrich at a later date. Having said that, it's sensible to merge this at a time of day when people are on hand to trouble shoot something unexpected. And it's courteous to put a notice on discourse just in case. Bear with me until I prepare a notice. Then hopefully you can merge this either Friday morning, or a morning early next week. |
No description provided.