For a given dataset, I want to know how many records have dates. I also want to know how many of those are useful, have issues, and maybe what their precision is. I envision this as a bar chart, where the records are grouped in categories based on the quality of the dates.
So the 3 relevant issues, RECORDED_DATE_INVALID, RECORDED_DATE_MISMATCH and RECORDED_DATE_UNLIKELY, are of limited use as indicators of data quality. One approach that would provide much more relevant date information is to use the Canadensys Narwhal Processor.
eventDate
issue
eventDate from verbatim.txt
verbatimEventDate
year
month
day
Process
IF eventDate != "" AND issue DOES NOT CONTAIN (RECORDED_DATE_MISMATCH)
  THEN category = "Valuable date (all in ISO8601)" /* Well, MM-DD-YYYY are still in there */
ELSE IF issue CONTAINS (RECORDED_DATE_MISMATCH) /* The only issue that keeps eventDate populated */
  OR verbatim.txt.eventDate != "" /* Since GBIF empties eventDate (see #27) in occurrence.txt, we'd have to look in verbatim.txt :( */
  OR verbatimEventDate != ""
  OR year != ""
  OR (year != "" AND month != "")
  OR (year != "" AND month != "" AND day != "") /* A date was provided */
  THEN category = "Date provided, but not interpreted by GBIF"
ELSE category = "Date not provided"
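The process above could be sketched in Python roughly as follows. This is only an illustration, not an agreed implementation: the function names are made up, records are assumed to be plain dicts keyed by column name, the `issue` field is assumed to hold flags separated by `;`, and the unmarked join after the RECORDED_DATE_MISMATCH check is read as OR.

```python
def has_value(record, field):
    """True if the field exists and is not blank."""
    return (record.get(field) or "").strip() != ""


def categorize_date(record, verbatim_event_date=""):
    """Assign a date quality category to one occurrence record.

    record: dict for a row of occurrence.txt (assumed layout).
    verbatim_event_date: eventDate of the same record in verbatim.txt, if known.
    """
    # Assumption: issue flags are separated by ";" in the download.
    issues = {flag for flag in (record.get("issue") or "").split(";") if flag}
    mismatch = "RECORDED_DATE_MISMATCH" in issues

    if has_value(record, "eventDate") and not mismatch:
        return "Valuable date (all in ISO8601)"

    # The three year/month/day checks in the pseudocode all reduce to
    # year != "", since each of them requires a year.
    date_provided = (
        mismatch
        or verbatim_event_date.strip() != ""
        or has_value(record, "verbatimEventDate")
        or has_value(record, "year")
    )
    if date_provided:
        return "Date provided, but not interpreted by GBIF"
    return "Date not provided"
```

Counting the returned categories per dataset (for example with `collections.Counter`) would give the numbers for the bar chart.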
The process is pretty useless if we rely only on GBIF issues:
IF issue CONTAINS (RECORDED_DATE_INVALID, RECORDED_DATE_MISMATCH, RECORDED_DATE_UNLIKELY)
  THEN category = "Date with issues"
ELSE IF eventDate != ""
  THEN category = "Valuable date (all in ISO8601)"
ELSE category = "Date not provided" /* This is just incorrect! See issue #27 */
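To make the problem with the issue-only process concrete, here is a small Python sketch of it, applied to a made-up record. The function name, the `;` separator for the `issue` field and the example record are all assumptions for illustration.

```python
DATE_ISSUES = {
    "RECORDED_DATE_INVALID",
    "RECORDED_DATE_MISMATCH",
    "RECORDED_DATE_UNLIKELY",
}


def naive_category(record):
    """Categorize a record using only the GBIF issue flags and eventDate."""
    # Assumption: issue flags are separated by ";" in the download.
    issues = {flag for flag in (record.get("issue") or "").split(";") if flag}
    if issues & DATE_ISSUES:
        return "Date with issues"
    if (record.get("eventDate") or "").strip():
        return "Valuable date (all in ISO8601)"
    return "Date not provided"


# Hypothetical record: eventDate was blanked without any issue being raised
# (the situation described in #27), although the provider did supply a date.
blanked = {"eventDate": "", "issue": "", "verbatimEventDate": "14 June 1987"}
misclassified = naive_category(blanked)
```

Here `misclassified` ends up as "Date not provided", even though a date was clearly supplied, which is why this version cannot be trusted.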
We need to look in verbatim.txt to get a useful eventDate, as GBIF overwrites them without warning in occurrence.txt (see "eventDate can be set blank with no issue thrown", #27). We still need to confirm with them that no field in occurrence.txt holds the original eventDate. If so, how challenging is it to loop over that file too?
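As a rough idea of what looping over verbatim.txt could involve, here is a Python sketch using only the standard library. It assumes both files are tab-delimited, unquoted, and share a `gbifID` column that can be used to match rows; those assumptions and all function names are illustrative and would need checking against a real download.

```python
import csv
import io


def read_tsv(handle):
    """Iterate over a tab-delimited file as dicts (assumed: no quoting)."""
    return csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def load_verbatim_event_dates(verbatim_handle):
    """Map gbifID to the original eventDate found in verbatim.txt."""
    return {
        row["gbifID"]: (row.get("eventDate") or "")
        for row in read_tsv(verbatim_handle)
    }


def find_blanked_event_dates(occurrence_handle, verbatim_dates):
    """List (gbifID, original eventDate) where occurrence.txt lost the date."""
    blanked = []
    for row in read_tsv(occurrence_handle):
        original = verbatim_dates.get(row["gbifID"], "").strip()
        interpreted = (row.get("eventDate") or "").strip()
        if original and not interpreted:
            blanked.append((row["gbifID"], original))
    return blanked


# Tiny in-memory stand-ins for the two files (made-up data).
verbatim_txt = io.StringIO("gbifID\teventDate\n1\t1987-06-14\n2\t99 XXX 9999\n3\t\n")
occurrence_txt = io.StringIO("gbifID\teventDate\tissue\n1\t1987-06-14\t\n2\t\t\n3\t\t\n")

dates = load_verbatim_event_dates(verbatim_txt)
blanked = find_blanked_event_dates(occurrence_txt, dates)
```

With real files, the `io.StringIO` objects would be replaced by `open(path, encoding="utf-8", newline="")`. The cost is one extra pass over verbatim.txt plus a dict holding one eventDate per record in memory.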
Do we use the Canadensys Narwhal processor to provide high quality categories, instead of the current basic ones?
Categories (in order of increasing data quality)
Questions
RECORDED_DATE_MISMATCH if day, year, month are correctly provided.
RECORDED_DATE_UNLIKELY also matches invalid dates: 99 XXX 9999
RECORDED_DATE_INVALID, RECORDED_DATE_MISMATCH and RECORDED_DATE_UNLIKELY are of limited use as indicators of data quality. One approach that would provide much more relevant date information is to use the Canadensys Narwhal Processor.
eventDate is provided, GBIF doesn't seem to look in verbatimEventDate or year, month, day. The literal values of those fields are shown on the website though.
Terms we need
Process