Taxon names can be ambiguous due to synonymies and homonymies. To facilitate integration of returned trait data matrices with other trait data, it would help greatly to have identifiers for taxa in addition to names.
Some characters (entities) in a matrix may be known to be much more similar semantically (or conceptually) than others, but to assess this with metrics the entities need to be tied into an ontology. To enable this, the identifiers for characters (for `pk_ontotrace`, this would be the identifiers of their entities) are required.
The identifiers could all be queried one by one from the Phenoscape API, but for larger matrices this may be time-consuming. Because the identifiers are (or ought to be, see phenoscape/phenoscape-kb-services#20) already returned in the NeXML from the Phenoscape API, having to query for them again seems unnecessary.
The initial plan for implementing this is to optionally return a list instead of a data.frame. The list would include the matrix, a table of taxon identifiers, and a table of entity identifiers. @sckott and @cboettig - are there better ways of doing this?
Implementing this depends on the metadata extraction in RNeXML getting fixed (see ropensci/RNeXML#129), and on character identifier annotations being added to the output NeXML in OntoTrace (see phenoscape/phenoscape-kb-services#20).
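To make the proposed return shape concrete, here is a rough sketch in base R. Everything in it is an assumption for illustration only: the helper name `as_ontotrace_result`, the `with_ids` flag, the column names, and the placeholder identifiers are hypothetical and are not part of the current package or of the Phenoscape API.

```r
# Hypothetical sketch of the optional list return value.
# With with_ids = FALSE the caller gets the plain data.frame, as before;
# with with_ids = TRUE the matrix is bundled with the two identifier tables.
as_ontotrace_result <- function(matrix, taxa, entities, with_ids = FALSE) {
  if (!with_ids) {
    return(matrix)
  }
  list(matrix = matrix, taxa = taxa, entities = entities)
}

# Toy inputs; labels and identifiers are placeholders, not real VTO or entity IDs.
m <- data.frame(
  taxa = c("Taxon A", "Taxon B"),
  `entity 1` = c(1, 0),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
taxon_ids <- data.frame(
  label = c("Taxon A", "Taxon B"),
  id = c("TAXON_ID_A", "TAXON_ID_B"),
  stringsAsFactors = FALSE
)
entity_ids <- data.frame(
  label = "entity 1",
  id = "ENTITY_ID_1",
  stringsAsFactors = FALSE
)

res <- as_ontotrace_result(m, taxon_ids, entity_ids, with_ids = TRUE)
names(res)
```

Keeping the identifiers in separate tables keyed by label would leave the matrix itself unchanged, so existing code that expects a data.frame could keep working when the flag is off.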
To be clear, what are these taxa identifiers? I assume they are IDs used internally within Phenoscape?
As for what we get back in the NeXML, these are VTO identifiers. Ideally there'd also be NCBI identifiers, I suppose (though they are only available for a small subset of VTO).