
I'm collecting GitHub issue statistics over time on our project: total number of issues, number of issues with a particular label, number of issues in a given state (open/closed). Right now, I have a Python script to parse the project webpage with the desired labeling/state for the info I want, e.g.,

http://github.com/<projectname>/issues?label=<label_of_interest>&state=<state_of_interest>
However, parsing the HTML is fragile: whenever GitHub changes its page markup, my code usually breaks.

Can someone describe how to use the GitHub API (or, failing that, some other way, preferably in Python) to collect these statistics without relying on the underlying HTML?


3 Answers

May I be so forward as to suggest that you use my wrapper around the GitHub API for this? With github3.py, you can do the following:

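import github3

# Log in with your own credentials. GitHub has since dropped password
# authentication for its API, so a token (github3.login(token=...)) is needed today.
github = github3.login("braymp", "braymp's super secret password")
repo = github.repository("owner", "reponame")
# iter_issues() defaults to open issues; github3.py 1.0 renamed it to issues().
open_issues = [i for i in repo.iter_issues()]
closed_issues = [i for i in repo.iter_issues(state='closed')]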

A call to refresh may be necessary, because I honestly don't recall whether GitHub sends all of the issue information when you iterate like that (i.e., use i.refresh() for i in <generator> as the body of the list comprehensions above).

With those, you can iterate over the two lists and use the labels attribute on each issue to see which labels it carries. If you decide to merge the two lists, you can always check the state of an issue with the is_closed method.

I suspect the actual statistics you can do yourself. :)
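
For what it's worth, the tallying could look roughly like the sketch below. It assumes only what is described above (each issue has a labels attribute and an is_closed method), plus one assumption of mine: that each label object exposes its text as a name attribute. The function name tally_issues is made up for illustration.

```python
from collections import Counter


def tally_issues(issues):
    """Count issues by open/closed state and by label name."""
    by_state = Counter()
    by_label = Counter()
    for issue in issues:
        by_state["closed" if issue.is_closed() else "open"] += 1
        # Assumption: every entry in issue.labels has a .name attribute.
        for label in issue.labels:
            by_label[label.name] += 1
    return by_state, by_label
```

Calling tally_issues(open_issues + closed_issues) would then give you the total via sum(by_state.values()), and the per-label counts via by_label.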

The documentation for github3.py can be found on ReadTheDocs and you'll be particularly interested in Issue and Repository objects.

You can also ask further questions about github3.py by adding the tag for it in your StackOverflow question.

Cheers!


1 Comment

Thanks @sigmavirus24 (and sorry for the delay in accepting your response). This seems to work for me.

I'd take a look at Octokit. It doesn't support Python currently, but it does provide a supported interface to the GitHub API for Ruby.

https://github.com/blog/1517-introducing-octokit


Although this doesn't fully meet your specifications (the "preferably Python" part), Octokit is a fantastic (and official - it's developed by GitHub) way of interacting with the GitHub API. You wrote you'd like to get Issues data. It's as easy as installing, requiring the library, and getting the data (no need for authentication if the project is public).

Install:

gem install octokit

Add this to your Ruby file to require the Octokit library:

require 'octokit'

Although there are a lot of things you can get from Octokit::Client::Issues, you may want to get a paginated list of all the issues in a repository:

Octokit.list_issues('octokit/octokit.rb')
  # => [Array<Sawyer::Resource>] A list of issues for a repository.

If you're really keen on using Python, you might want to have a look at the GitHub API docs for Issues. Really, it's as easy as requesting a URL like https://api.github.com/repos/octokit/octokit.rb/issues and reading the JSON data that comes back (although I'm not familiar with Python, I'm sure there's some JSON parsing library); no authentication is needed for public repos.
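
As a rough illustration of that approach, here is a minimal Python sketch using only the standard library. The helper names (build_issues_url, fetch_issues, summarize) are hypothetical. The state, labels, per_page and page query parameters come from the GitHub REST API's "list repository issues" endpoint. Note that this endpoint also returns pull requests, which the sketch filters out, and that unauthenticated requests are rate-limited.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

API_ROOT = "https://api.github.com"


def build_issues_url(owner, repo, state="all", labels=None, page=1, per_page=100):
    """Build the URL for one page of a repository's issues."""
    params = {"state": state, "per_page": per_page, "page": page}
    if labels:
        # The API expects a comma-separated list of label names.
        params["labels"] = ",".join(labels)
    query = urllib.parse.urlencode(params)
    return f"{API_ROOT}/repos/{owner}/{repo}/issues?{query}"


def fetch_issues(owner, repo, state="all", labels=None):
    """Fetch every page of issues (network call, unauthenticated)."""
    issues, page = [], 1
    while True:
        request = urllib.request.Request(
            build_issues_url(owner, repo, state, labels, page),
            headers={"Accept": "application/vnd.github+json"},
        )
        with urllib.request.urlopen(request) as response:
            batch = json.load(response)
        if not batch:  # an empty page means there is nothing left
            return issues
        issues.extend(batch)
        page += 1


def summarize(issues):
    """Count issues by state and by label, skipping pull requests."""
    real_issues = [i for i in issues if "pull_request" not in i]
    by_state = Counter(i["state"] for i in real_issues)
    by_label = Counter(
        label["name"] for i in real_issues for label in i["labels"]
    )
    return {"total": len(real_issues), "by_state": by_state, "by_label": by_label}
```

Something like summarize(fetch_issues("octokit", "octokit.rb")) would then give the totals for a public repository; running it on a schedule and storing the result with a timestamp is one way to build up statistics over time.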
