[gitter] add gitter backend support to ELK #831

Merged
1 commit merged on Apr 16, 2020

Contributor

@imnitishng imnitishng commented Apr 3, 2020

This PR adds early support for the Gitter backend in grimoirelab-elk. Please have a look.
Fixes #820

Signed-off-by: Nitish Gupta [email protected]

@imnitishng
Contributor Author

I have added Raw and Enricher classes for the backend; however, the enrichment of the data is quite shallow at the moment because of the limited data returned by Perceval.
Studies are not implemented in this commit. I think it will be better to add them after some discussion, so please share some ideas on that, as well as on the creation of enriched data.
I will be adding tests and documentation soon.

@valeriocos
Member

Hi @imnitishng, sorry for the late reaction. I'm reviewing the PR right now

@valeriocos valeriocos self-requested a review April 5, 2020 08:06
Member

@valeriocos valeriocos left a comment

Thank you @imnitishng for the PR. Overall it looks great, I left a first round of comments :)

Please remember to

grimoire_elk/enriched/gitter.py (outdated, resolved)
grimoire_elk/enriched/gitter.py (outdated, resolved)
grimoire_elk/raw/gitter.py (resolved)
grimoire_elk/raw/gitter.py (outdated, resolved)
grimoire_elk/raw/gitter.py (resolved)
grimoire_elk/enriched/gitter.py (outdated, resolved)
grimoire_elk/enriched/gitter.py (outdated, resolved)
grimoire_elk/raw/gitter.py (outdated, resolved)
grimoire_elk/raw/gitter.py (outdated, resolved)
grimoire_elk/raw/gitter.py (outdated, resolved)

@imnitishng
Contributor Author

I have resolved the issues and created a schema for Gitter. Please have a look. I just realised that some of the attributes need to be removed from the schema I developed. I will do that soon and add the required tests for this backend. Thank you for the review!

@imnitishng
Contributor Author

imnitishng commented Apr 5, 2020

@valeriocos, I also wanted to discuss the kinds of studies and visualizations we could perform on the data.
So far, based on the panels of the Git dashboard, this is what I have in mind:

  • Count of authors in a room over time
  • Count of issues and the issue submitters
  • People being mentioned the most
  • Messages by time zones
  • Developers being inactive
  • Attracted developers
  • Active users
  • Median time for replies
  • Average people seeing the messages plotted over time
  • Latest or Oldest URLs or issues being mentioned

This is a very high-level idea of the future support I have planned; I will be digging into the details to implement it very soon. Please suggest anything or point me to references that you feel I should look through. Thank you!
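
As a rough illustration of the first idea in the list (count of authors in a room over time), the minimal sketch below groups messages by month and counts the distinct authors in each month. The record shape (`author`, `sent`) and the sample values are simplifying assumptions made for this example; they are not the structure returned by Perceval.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, simplified message records (assumed field names).
messages = [
    {"author": "alice", "sent": "2020-03-30T10:15:00"},
    {"author": "bob", "sent": "2020-03-31T18:02:00"},
    {"author": "alice", "sent": "2020-04-02T09:40:00"},
    {"author": "carol", "sent": "2020-04-05T21:11:00"},
    {"author": "carol", "sent": "2020-04-06T07:30:00"},
]


def authors_per_month(msgs):
    """Return {"YYYY-MM": number of distinct authors} for the given messages."""
    buckets = defaultdict(set)
    for msg in msgs:
        month = datetime.fromisoformat(msg["sent"]).strftime("%Y-%m")
        buckets[month].add(msg["author"])
    # Sort by month so the result reads as a time series.
    return {month: len(authors) for month, authors in sorted(buckets.items())}


counts = authors_per_month(messages)
print(counts)
```

In a dashboard the same aggregation would more likely be done by the visualization layer over the enriched index; the sketch only shows the intended grouping.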

@valeriocos
Member

Thank you @imnitishng for sharing an initial set of metrics! Can you answer the following questions?

Count of issues and the issue submitters

How can we get the issue submitters?

People being mentioned the most

It's a really interesting objective. If we go in this direction, I understand that we should expand the get_identities method to collect the user info stored in mentions. Does it make sense to measure also people mentioning the most?
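
To make the two directions concrete (people being mentioned the most, and people mentioning the most), here is a minimal sketch that counts both from a list of messages. The field names `fromUser`, `username`, `mentions` and `screenName` are assumptions about the message shape made for this example, and `mention_counts` is a hypothetical helper, not part of grimoirelab-elk.

```python
from collections import Counter

# Hypothetical message records; the field names are assumptions for this sketch.
messages = [
    {"fromUser": {"username": "alice"}, "mentions": [{"screenName": "bob"}]},
    {"fromUser": {"username": "carol"},
     "mentions": [{"screenName": "bob"}, {"screenName": "alice"}]},
    {"fromUser": {"username": "alice"}, "mentions": []},
    {"fromUser": {"username": "alice"},
     "mentions": [{"screenName": "carol"}, {"screenName": "bob"}]},
]


def mention_counts(msgs):
    """Return two Counters: times each user was mentioned, and mentions made by each author."""
    mentioned = Counter()
    mentioning = Counter()
    for msg in msgs:
        author = msg["fromUser"]["username"]
        for mention in msg.get("mentions", []):
            mentioned[mention["screenName"]] += 1
            mentioning[author] += 1
    return mentioned, mentioning


mentioned, mentioning = mention_counts(messages)
print(mentioned.most_common(1))   # most mentioned user
print(mentioning.most_common(1))  # user mentioning others the most
```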

Developers being inactive, Attracted developers

How would you like to implement these metrics?

@imnitishng
Contributor Author

imnitishng commented Apr 6, 2020

How can we get the issue submitters?

Well issue submitters are the people mentioning some issue in the messages, so we already have the user ID and username of the individual who has mentioned issues in their message. We could easily develop some rich data associated with a particular user ID, just as the Git backend implements the number of issues or pull requests opened by a user.

people mentioning the most

Yes, that sounds great. We could add it too; it might be a good metric for identifying the most active users in a room.

Developers being inactive, Attracted developers
How would you like to implement these metrics?

Well, I had this idea as I went through the Git and GitHub dashboards. I suppose people joining the rooms within some defined time period (like the GSoC period of 1-2 months) could be classified as attracted developers.
People who go inactive, without any recent messages or mentions in the chat room (again within a specified time period), could be considered inactive.
This is a very non-technical overview; however, thanks for the ideas. I will dive deeper into the codebase to implement these metrics and get back to you with an initial implementation soon.
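
One possible reading of these two definitions, sketched under stated assumptions: a user is "attracted" if their first message falls inside the time window, and "inactive" if their last message is older than the window. The `(author, sent)` tuples, the 60-day window and the `classify_users` helper are all hypothetical choices for this example; mentions are ignored for simplicity.

```python
from datetime import datetime, timedelta

# Hypothetical (author, sent) pairs; real items would carry many more fields.
messages = [
    ("alice", datetime(2020, 1, 10)),
    ("alice", datetime(2020, 1, 20)),
    ("bob", datetime(2020, 1, 15)),
    ("bob", datetime(2020, 4, 1)),
    ("carol", datetime(2020, 3, 20)),
    ("carol", datetime(2020, 4, 5)),
]


def classify_users(msgs, now, window):
    """Return (attracted, inactive) user lists for the window ending at `now`."""
    first_seen, last_seen = {}, {}
    for author, sent in msgs:
        first_seen[author] = min(sent, first_seen.get(author, sent))
        last_seen[author] = max(sent, last_seen.get(author, sent))
    cutoff = now - window
    # Attracted: first message ever was sent inside the window.
    attracted = sorted(u for u, ts in first_seen.items() if ts >= cutoff)
    # Inactive: no message at all inside the window.
    inactive = sorted(u for u, ts in last_seen.items() if ts < cutoff)
    return attracted, inactive


attracted, inactive = classify_users(
    messages, now=datetime(2020, 4, 6), window=timedelta(days=60)
)
print(attracted, inactive)
```

With this sample data, bob is neither attracted (first seen before the window) nor inactive (last seen inside it).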

@valeriocos
Member

Thank you for the clarifications @imnitishng !

Well issue submitters are the people mentioning some issue in the messages

Is this an approximation you are making, or does Gitter convert the issue submitters (the ones that opened an issue in GitHub) to users that mention those issues in their Gitter messages?

In the first case, we can redefine the metrics as Count of issues and the users mentioning issues

To move things forward, I would propose to work on a first implementation of the enricher/dashboard that focuses on a limited set of metrics among the ones you proposed. This set shouldn't require the implementation of a study. Later, we can see how to incorporate more metrics. WDYT?

@imnitishng
Contributor Author

imnitishng commented Apr 6, 2020

Is this an approximation you are making, or does Gitter convert the issue submitters (the ones that opened an issue in GitHub) to users that mention those issues in their Gitter messages?

Gitter does not do anything like that; it just returns data consisting of:

  • The GitHub repository whose issue has been mentioned
  • The issue number (e.g., #728)
  • The name and ID of the person who mentioned the issue in their message.

There is no way of knowing who opened the issue in the GitHub repository from the data returned by the Gitter API. However, we could rely on some new functions based on the GitHub Perceval backend to get info about the person who opened the issue. I will think about this.

In the first case, we can redefine the metrics as Count of issues and the users mentioning issues

Yes, this seems better. I'll do it.
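
A minimal sketch of the redefined metric (count of issues and the users mentioning issues), assuming each message carries the repository, the issue number and the author as described above. The field names (`fromUser`, `issues`, `repo`, `number`), the repository name and the `issue_mentions` helper are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical message records; field names and values are assumptions.
messages = [
    {"fromUser": {"id": "u1", "username": "alice"},
     "issues": [{"repo": "example-org/example-repo", "number": "820"}]},
    {"fromUser": {"id": "u2", "username": "bob"},
     "issues": [{"repo": "example-org/example-repo", "number": "820"},
                {"repo": "example-org/example-repo", "number": "728"}]},
    {"fromUser": {"id": "u1", "username": "alice"}, "issues": []},
]


def issue_mentions(msgs):
    """Return two Counters: mentions per (repo, issue number), and issue mentions per user."""
    issues = Counter()
    users = Counter()
    for msg in msgs:
        for issue in msg.get("issues", []):
            issues[(issue["repo"], issue["number"])] += 1
            users[msg["fromUser"]["username"]] += 1
    return issues, users


issues, users = issue_mentions(messages)
print(len(issues), "distinct issues mentioned")
print(users.most_common())
```

Note that this counts users who mention issues, not the people who opened them on GitHub; that information is not available from these records.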

I would propose to work on a first implementation of the enricher/dashboard that focuses on a limited set of metrics

Sure, I am on it. I will update the PR once the basic implementation is complete. Thanks for the valuable suggestions.

@valeriocos
Member

Thank you @imnitishng , ping me when the PR is ready for review

@imnitishng
Contributor Author

imnitishng commented Apr 11, 2020

Hi @valeriocos, I have added the metrics and enriched data. Please have a look and share your suggestions. I will add tests soon.
I have the dashboard ready too and will open a PR for it in Sigils.

Member

@valeriocos valeriocos left a comment

Thank you @imnitishng for working on this PR. I have some questions about the method __get_rich_links and the ones related to it. Beyond this the PR looks good.

Can you add some tests? You can look at the existing tests (e.g., https://github.com/chaoss/grimoirelab-elk/blob/master/tests/test_slack.py) to write the ones for Gitter.

The tests build on a common base (ref here). In the test data folder, you need to add a file containing some docs extracted from the Pagure backend of Perceval (see example here). These docs are then used for testing the code of the raw and enrich connectors.

Thanks!

grimoire_elk/enriched/gitter.py (outdated, resolved)
grimoire_elk/enriched/gitter.py (outdated, resolved)
grimoire_elk/enriched/gitter.py (outdated, resolved)
grimoire_elk/enriched/gitter.py (resolved)
grimoire_elk/enriched/gitter.py (outdated, resolved)
schema/gitter.csv (outdated, resolved)
schema/gitter.csv (outdated, resolved)
schema/gitter.csv (outdated, resolved)
schema/gitter.csv (resolved)
grimoire_elk/enriched/gitter.py (outdated, resolved)

@imnitishng
Contributor Author

Hi @valeriocos, I have added the tests, please have a look. However, some work on the logic and schemas remains because of the pending discussions. I will finalize the PR as soon as we are done with the discussion. Thank you for the help.

@coveralls

coveralls commented Apr 13, 2020

Pull Request Test Coverage Report for Build 2115

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 42 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.1%) to 79.321%

Files with Coverage Reduction | New Missed Lines | %
/home/travis/build/chaoss/grimoirelab-elk/grimoire_elk/enriched/github.py | 2 | 75.62%
/home/travis/build/chaoss/grimoirelab-elk/grimoire_elk/utils.py | 40 | 63.26%

Totals Coverage Status
Change from base Build 2111: 0.1%
Covered Lines: 7733
Relevant Lines: 9749

💛 - Coveralls

Member

@valeriocos valeriocos left a comment

Hi @imnitishng , overall the PR looks good. I left some minor comments. Thanks for your work

grimoire_elk/enriched/gitter.py (resolved)
grimoire_elk/enriched/gitter.py (resolved)

@imnitishng
Contributor Author

Hi @valeriocos, I have finished the work on this backend based on your suggestions and submitted PRs in the concerned repositories. Please have a look, thank you! 😃

@imnitishng imnitishng changed the title from "added gitter backend support" to "[gitter] add gitter backend support to ELK" on Apr 14, 2020
Member

@valeriocos valeriocos left a comment

Hi @imnitishng, thank you for the PR. I left a really minor comment.

If you are interested, you can add a note to this PR which will appear in the next release. The note should include a high-level description of the changes introduced in this PR (to let non technical people understand what the change is about).

If you are interested in adding the note, please follow the instructions at: https://github.com/Bitergia/release-tools#changelog. Please also consider sharing your feedback on using the tool (what you liked, what you didn't like and things you would like to improve).

If you're not interested, can you clarify why?

In both cases, your feedback is valuable and will allow us to improve the way to reflect the code changes in the release.

Thanks!

grimoire_elk/enriched/gitter.py (outdated, resolved)

@imnitishng
Contributor Author

@valeriocos done, please have a look.

@valeriocos
Member

Thanks @imnitishng! Did you commit the release note file (#831 (review))?

@imnitishng
Contributor Author

Doing it now.

Raw and Enriched indexes have been added along with their tests and schemas.

Signed-off-by: Nitish Gupta <[email protected]>
@imnitishng
Contributor Author

Done!

Member

@valeriocos valeriocos left a comment

LGTM, thanks @imnitishng for your patience and good work!

@valeriocos valeriocos merged commit fa7ce42 into chaoss:master Apr 16, 2020
Successfully merging this pull request may close these issues.

[gitter] Adding data storage for gitter
3 participants