[backend] Add search terms #554
This code enhances the metadata information with a set of search terms, which simplify query operations and avoid manual inspection of the Perceval items.

The search terms are included in a dict with the following shape:

    {
        'term-1': 'value-1',
        'term-2': 'value-2',
        'term-3': 'value-3',
    }

The search terms are added to the metadata information of each item in the `search_terms` attribute. If `search_terms` is not set, it will be set to `None`.

Tests have been added accordingly. The backend version is set to 0.9.0.

Signed-off-by: Valerio Cosentino <[email protected]>
This code extends the Jira backend by including a set of search terms to simplify query operations. The search terms introduced are: `project_id`, `project_key` and `project_name`. Tests have been added accordingly. The backend version is set to 0.13.0. Signed-off-by: Valerio Cosentino <[email protected]>
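For readers who want a concrete picture of the two commits above, here is a minimal sketch of how the search terms could end up in the metadata. It is not the code from this PR: the helper names `jira_search_terms` and `build_metadata` are hypothetical, and the `id` and `key` lookups under `item['fields']['project']` are assumed by analogy with the `name` lookup visible in the diff.

```python
from typing import Any, Callable, Dict, Optional


def jira_search_terms(item: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the three project terms from a raw Jira issue.

    Hypothetical helper: the 'id' and 'key' paths are assumptions that
    mirror the 'name' path shown in the PR diff.
    """
    project = item['fields']['project']
    return {
        'project_id': project['id'],
        'project_key': project['key'],
        'project_name': project['name'],
    }


def build_metadata(
    item: Dict[str, Any],
    extract_terms: Optional[Callable[[Dict[str, Any]], Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Wrap a raw item with a `search_terms` attribute.

    Hypothetical stand-in for the metadata step: when no extractor is
    given, `search_terms` is set to None, as the description states.
    """
    terms = extract_terms(item) if extract_terms is not None else None
    return {'data': item, 'search_terms': terms}


issue = {'fields': {'project': {'id': '10000', 'key': 'PRJ', 'name': 'Project'}}}
with_terms = build_metadata(issue, jira_search_terms)
without_terms = build_metadata(issue)
```

With this sketch, a consumer can read `with_terms['search_terms']['project_key']` directly instead of walking `data['fields']['project']`.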
An initial evaluation of the new feature has been done to measure the impact on the raw indexes in GrimoireLab. The evaluation consisted of running the raw collection on a Jira instance with >10000 issues using
Furthermore, the additional benefits of the search terms are:
The next evaluation will focus on the impact of the search terms approach on `filter-raw` data.
To use the search_terms in the filter raw a minor change is needed in ELK (
In both cases, the same number of items are enriched and there is no significant variation in the execution times.
The PR in ELK is available at: chaoss/grimoirelab-elk#665
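To illustrate why search terms help with raw filtering, here is a small in-memory sketch. It does not reproduce the ELK change referenced above; `filter_by_search_term` is a hypothetical helper and the sample items are made up for illustration only.

```python
from typing import Any, Dict, Iterable, Iterator


def filter_by_search_term(
    items: Iterable[Dict[str, Any]], term: str, value: Any
) -> Iterator[Dict[str, Any]]:
    """Yield the items whose `search_terms` map `term` to `value`.

    Hypothetical helper: items whose `search_terms` is None or missing
    are skipped rather than raising an error.
    """
    for item in items:
        terms = item.get('search_terms') or {}
        if terms.get(term) == value:
            yield item


# Made-up sample items; only the `search_terms` attribute matters here.
raw_items = [
    {'uuid': 'a', 'search_terms': {'project_key': 'PRJ'}},
    {'uuid': 'b', 'search_terms': {'project_key': 'OTHER'}},
    {'uuid': 'c', 'search_terms': None},
]
selected = [i['uuid'] for i in filter_by_search_term(raw_items, 'project_key', 'PRJ')]
```

The filter only touches the flat `search_terms` dict, so it does not need to know the backend-specific layout of each item's data.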
The idea looks great, but I have some minor comments on the implementation. Please look at them and tell me whether they make sense. I think it would be better to use the same behaviour that we have for `classified_fields`.
@@ -74,14 +74,27 @@ class Backend:
Classified data filtering and archiving are not compatible to prevent
data leaks or security issues.

Each backend can also provides a set of search terms to simplify query
I would change this to:
- Each backend can also provides a set of search terms to simplify query
+ Each backend might provide a set of search terms to simplify query
I'm not sure about the use of `might` (which expresses permission) over `can` (which expresses ability/capability) in this sentence.
I'll fix the typo (`provides`).
project_name = item['fields']['project']['name']

terms = {
'project_id': project_id,
Does it make sense to include the `id` of the item too? The one returned by `metadata_id()`.
I have prepared another PR addressing your comments: #560, so we can better evaluate the two solutions.
Closing this PR since PR #560 seems better.
This PR proposes to extend the metadata attributes by adding a new field (`search_terms`) with the purpose of simplifying query operations and avoiding inspection of the data attribute.

The search terms are included in a dict with the following shape:

    {
        'term-1': 'value-1',
        'term-2': 'value-2',
        'term-3': 'value-3',
    }

The search terms are added to the metadata information of each item in the `search_terms` attribute. If search terms are not defined, `search_terms` will be set to `None`.

This new feature is showcased with the Jira backend.