-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add load_tstamp field #572
Conversation
modules/loader/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/db/Statement.scala
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
modules/loader/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/db/Migration.scala
Outdated
Show resolved
Hide resolved
modules/loader/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/db/Statement.scala
Show resolved
Hide resolved
modules/loader/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/db/Statement.scala
Show resolved
Hide resolved
5024381
to
175f811
Compare
case s: DSchemaList.Single => | ||
val message = s"Illegal State: updateTable called for a table with known single schema [${s.schema.self.schemaKey.toSchemaUri}]\ncolumns: ${columns.mkString(", ")}\nstate: $state" | ||
case _: DSchemaList.Single => | ||
val message = s"Illegal State: updateTable called for a table with known single schema [${current.toSchemaUri}]\ncolumns: ${columns.mkString(", ")}\nstate: $state" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chuwy I just changed the s.schema.self.schemaKey
to current
as you described in the issue. Couldn't be sure what to do improve it further. If you have any idea, just let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we can change Schema
case class there to JSON as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find where Schema
case class is used in here. Could you show me, please ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We realized that Schema
case class of SchemaDDL has derived encoder and it outputs lots of null values. We decided with @chuwy that best way is updating encoder in SchemaDDL initially then use it in here. Related issue on SchemaDDL can be found in here: snowplow/schema-ddl#105
...edder/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/shredder/batch/Discovery.scala
Show resolved
Hide resolved
Hey @chuwy! Could you give it a look one more time ? |
...on/src/test/scala/com.snowplowanalytics.snowplow.rdbloader.common/TableDefinitionsSpec.scala
Outdated
Show resolved
Hide resolved
.fold(_ => (f::l, r), { | ||
case time if since.fold(true)(_.isBefore(time)) && until.fold(true)(_.isAfter(time)) => (l, f::r) | ||
case _ => (l, r) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think pattern match would work better here:
Common.parseFolderTime(time) match {
case Right(time) if since.fold(true)(_.isBefore(time)) && until.fold(true)(_.isAfter(time)) => (l, f :: r)
case Right(_) => (l, r)
case Left(_) => (f :: l, r)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized using filter
is better than foldLeft
therefore change the function a bit. I think fold
is going better in here with the new version of this function but let me know if you still think otherwise.
}) | ||
} else (f::l, r) | ||
} | ||
inInterval.reverse ++ nonTimeFormatted.reverse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two points on performance:
- My hunch is that first concatenating, then reversing would be faster, given how bad the concatenation of two linked lists?
- Also IIRC (can be wrong - worth to check) list concatenation is much faster if your first list is smaller - it does a lot less iterations.
I wouldn't be usually concerned about it, but these lists could have tens of thousands of elements and it can cause big unnecessary delays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, definitely agree. I will look into it 👍 It will look like this in the end: (nonTimeFormatted ++ inInterval).reverse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized using filter
is much more better than foldLeft
therefore change the function a bit. We don't have to use list concat or reverse with this way which is a big plus I guess.
...edder/src/main/scala/com/snowplowanalytics/snowplow/rdbloader/shredder/batch/Discovery.scala
Show resolved
Hide resolved
case s: DSchemaList.Single => | ||
val message = s"Illegal State: updateTable called for a table with known single schema [${s.schema.self.schemaKey.toSchemaUri}]\ncolumns: ${columns.mkString(", ")}\nstate: $state" | ||
case _: DSchemaList.Single => | ||
val message = s"Illegal State: updateTable called for a table with known single schema [${current.toSchemaUri}]\ncolumns: ${columns.mkString(", ")}\nstate: $state" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we can change Schema
case class there to JSON as well?
Moved to #604, closing. |
Reopened since it is removed from 2.1.0 release. |
No description provided.