Introducing the Fivetran dbt Package for GitHub
For an overview of how dbt powers advanced transformations, and information about our other dbt packages, take a look at this recent blog post.
dbt Package for GitHub
Our dbt package for GitHub helps you track the state of issues, pull requests, and their related assignments so that you can increase the velocity of codebase updates. The package builds on the Fivetran GitHub connector, which ingests all of the data passed through the GitHub API, and joins these disparate tables for:
- Enriching GitHub issues with their assignees and time to completion
- Attaching time metrics to pull requests to track life cycles from creation, to review, to merge
- Providing a weekly, monthly, and quarterly overview of your opened and closed issues and pull requests
- Detecting a disproportionate issue-to-assignee ratio
- Establishing a potential “cliff” timeline for a pull request to fall through the cracks
- Tracking high-level pull request completion by week, month, and quarter
- Determining the average time taken in each stage of a pull request to forecast an issue completion timeline
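To make the time-in-stage idea concrete, here is a minimal sketch in Python. The package itself does this work in dbt models inside your warehouse; the record layout and field names below (`created_at`, `first_review_at`, `merged_at`) are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from statistics import mean

# Hypothetical pull request records; these field names are assumptions,
# not the connector's actual column names.
pull_requests = [
    {"id": 1, "created_at": "2020-06-01T09:00:00",
     "first_review_at": "2020-06-01T15:00:00", "merged_at": "2020-06-02T09:00:00"},
    {"id": 2, "created_at": "2020-06-03T10:00:00",
     "first_review_at": "2020-06-04T10:00:00", "merged_at": "2020-06-04T22:00:00"},
]


def hours_between(start, end):
    """Return the elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def average_stage_hours(prs):
    """Average the time spent in each stage across all pull requests."""
    to_review = [hours_between(pr["created_at"], pr["first_review_at"]) for pr in prs]
    to_merge = [hours_between(pr["first_review_at"], pr["merged_at"]) for pr in prs]
    return {
        "creation_to_review": mean(to_review),
        "review_to_merge": mean(to_merge),
    }


stage_averages = average_stage_hours(pull_requests)
print(stage_averages)
```

Averages like these could then feed a rough forecast of how long a newly opened pull request is likely to take.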
Challenges of the GitHub API
The GitHub API splits contextual information about issues and pull requests, such as assignees and history, across various endpoints. This makes it easier to target your API requests at exactly the information you’re looking for, but harder to join the data for analytics.
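As a rough illustration of the join problem, the sketch below stitches issue records back together with assignee records that arrive separately. The two lists stand in for data fetched from different endpoints; their shapes and field names (`issue_id`, `user_login`) are hypothetical, and no real API call is made.

```python
from collections import defaultdict

# Hypothetical rows, as if each list came from a different endpoint.
issues = [
    {"issue_id": 101, "title": "Fix login bug", "state": "open"},
    {"issue_id": 102, "title": "Update README", "state": "closed"},
]
issue_assignees = [
    {"issue_id": 101, "user_login": "alice"},
    {"issue_id": 101, "user_login": "bob"},
]


def enrich_issues(issues, issue_assignees):
    """Attach a list of assignee logins to each issue (a left join)."""
    assignees_by_issue = defaultdict(list)
    for row in issue_assignees:
        assignees_by_issue[row["issue_id"]].append(row["user_login"])
    # Issues without any assignee rows get an empty list.
    return [
        {**issue, "assignees": assignees_by_issue.get(issue["issue_id"], [])}
        for issue in issues
    ]


enriched = enrich_issues(issues, issue_assignees)
for issue in enriched:
    print(issue["issue_id"], issue["assignees"])
```

Every additional piece of context (history, labels, reviewers) needs another join of this kind, which is why doing it by hand quickly becomes tedious.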
How Fivetran Helps
Our native GitHub connector automatically brings in data about issues, pull requests and their corresponding contextual information in a pre-defined format (see Fivetran’s documentation for the GitHub schema) that makes it easy to start querying your data right away. By continuing to replicate your GitHub data into your centralized data warehouse at a frequency that you dictate, and using the provided dbt package, you’ll be able to better track and optimize your development team’s efficiency. Use GitHub as a standalone source or combine this data with common project tracking software, such as Jira or Asana, to provide your organization insight into the complete software delivery process.
Next Steps
Get the dbt package for GitHub: it handles the advanced modeling, i.e., data transformations, dependencies, and target table creation. The primary output models of this package, which are built from intermediate models, are:
- github_issues: Each record represents a GitHub issue, enriched with data about its assignees, milestones, and time comparisons
- github_pull_requests: Each record represents a GitHub pull request, enriched with data about its repository, reviewers, and durations between review requests, merges, and reviews
- github_metrics: Each record represents enriched metrics about PRs and issues that were created and closed during day, week, month, or quarter periods
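The period-based metrics can be pictured with a small Python sketch that counts issues opened and closed per ISO week. This is not the package's implementation, which lives in dbt models; the sample records and the `created_on` and `closed_on` field names are assumptions made for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical issue records; an issue that is still open has closed_on = None.
issues = [
    {"created_on": date(2020, 6, 1), "closed_on": date(2020, 6, 3)},
    {"created_on": date(2020, 6, 4), "closed_on": date(2020, 6, 9)},
    {"created_on": date(2020, 6, 10), "closed_on": None},
]


def iso_week(d):
    """Format a date as an ISO week label such as '2020-W23'."""
    year, week, _ = d.isocalendar()
    return f"{year}-W{week:02d}"


def weekly_issue_counts(issues):
    """Count issues opened and closed in each ISO week."""
    opened = Counter(iso_week(i["created_on"]) for i in issues)
    closed = Counter(
        iso_week(i["closed_on"]) for i in issues if i["closed_on"] is not None
    )
    weeks = sorted(set(opened) | set(closed))
    return {w: {"opened": opened[w], "closed": closed[w]} for w in weeks}


weekly = weekly_issue_counts(issues)
print(weekly)
```

Swapping `iso_week` for a function that returns a day, month, or quarter label would give the other period rollups.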