Reporting Server Replication
Strategy for reporting within the Kinetic Platoform
Requirements
Kinetic Platform and “reporting database”. We targeted SQL databases. Suggestions are: Postgres, Microsoft SQL Server, and Oracle
How to setup replication to a reporting server
To enable replication you need to configure and enable webhooks for submissions in your kapp.
We will setup a webhook. The webhook will relay the action type (create, update, delete) and the related data to the associated workflow. This workflow will then be configured to forward the data to the SQL database optimized for reporting.
- Create a SQL database and setup replication user
- Import handler and configure for SQL database
- Import (or create) tree(s) for create, update, delete
- Test (fake data)
- Configure webhooks for create, update, delete
- Test
Reporting Data Model
Kapp name, table name
Kapp table (kapp fields -- allows reporting across all forms)
Space attributes
Kapp attributes
Table attributes
What is happening in the handler
The Kinetic Platform (Core) forwards the transaction data via webhook to the workflow engine (Task). Task workflow then executes the handler which parses the transaction data and performs the appropriate work.
Types of actions
- Create
- Update
- Delete
Types of functions
- Kapp create
- Table create
- Table modification
Types of data
- Space data
- Kapp data
- Record data
Under the sheets of the handler, the handler is checking to see if the following exists.
- A table with the name of the “Kapp slug”, if not it creates it
- Inserts/upserts/deletes “generic fields” + form slug into the table
- A table with the name of the Kapp + slug of the form, if not it creates it
- Validates the form has the correct fields, if not, creates/removes them
- Inserts/upserts/deletes fields from form into the table
- Validates the form has the correct fields, if not, creates/removes them
Field Naming Convention
The columns in the database tables are prefixed by the following letters:
- c: The c indicates that this is a Core field which is added automatically by Core.
- l: The l indicates that this is a Question on your form and contains the Limited Answer. The data stored in the l column is Limited to 4,000 characters.
- u: The u indicates that this is a Question on your form and contains the Unlimited Answer.
Security
The Kinetic Platform has a robust security model and is enforced when retrieving data from the system. Reporting data is replicated outside of the system and therefore the security models are not enforced. Security must be managed by the SQL system and appropriate rules must be defined by the system administrator.
Design Decisions
Another Database
Reporting directly against a production database can have negative performance impact. If a reporting database is slow then reports can be tied up, and that can be bad, but if the production database is tied up because of poor queries that is a whole other level of bad. That’s hundreds of people calling your service desk reporting an outage bad. There are ways to protect against that in production but I’ve seen far too many various software implementations through out my years (Remedy, ITCAM, and ITM to name a few) where this exact problem has come up to not feel comfortable relying on that and instead wanting a separate reporting database.
Production databases are optimized for user input. Reporting databases are structured and indexed for reporting needs.
Dynamic Schema
We wanted report writing to be as easy as possible. That meant as few JOINs, and PIVOTs as possible. Our constantly updating schema for database tables and columns gives as meaningful as possible names to tables and columns. This means that we’ve lowered the barrier of technical experience needed for getting basic reports. With very minimal effort on multiple BI platforms, you can empower a non-technical manager to find out how many requests their direct reports have filed within the last 30, 60, etc days. Or how many requests their direct reports have fulfilled within the last 30, 60, etc days.
This means that someone will need to manually add indexes to these tables potentially (we add basic ones) but it also means that without foreign keys the data is straightforward to archive and set retention policies.
Deletes
We replicate the delete -- you may want to keep deleted data and mark with a status of deleted vs actually deleting the record.
Kapp and Table changes
Renaming a field, kapp, or attribute has potential consequences, as all changes are replicated.
Asynchronous
The reporting data will not be instantly replicated. The reporting system is not “real-time” updated with the Kinetic Platform. The reporting system is typically a couple seconds up to a minute behind the original Kinetic Platform (Core) database.
It is important to not use the reporting database as a “source of record” or as a “cache” of any sort. It will be out of date, may be “pending deletes” and other problematic issues from a “production data” perspective. The reporting database is designed to facilitate reporting typically used by business to see trends and roll-ups typically grouped in date ranges (examples: counts by quarter, counts by group).
It is not a production source of data. Data used in the Kinetic Platform should always be pulled from it’s system of record.
Most of our clients don’t need the performance overhead of ensuring that submission data is available to report on exactly as soon as it’s updated in CE. A slight delay is okay. And with webhooks this continuous, asynchronous, and light-weight replication turned out to be a very good solution. One other benefit to the asynchronous, continuous replication is if a problem occurs adding the submission to the reporting database, it’s easily, and safely retryable. The workflow will only perform upserts and will not update a record in the reporting database unless the submission has a more current date/time than what is already in the reporting database for that submission.
No Auditing
We didn’t add a new record to the reporting database for everytime a submission is updated for the same reason we decided to go with a dynamic schema, simplicity. If there is only one record in the database per submission, this is easier for less technical report writers to navigate and work with.
Disclaimer
We designed this strategy to fit the needs of the majority of our clients and to address the common recurring requests we’ve heard for reporting needs. Every client isn’t the same though, and that’s okay! The general strategy still works and it’s completely open to be modified if this doesn’t work for your requirements. Not sure where to start implementing your deviated reporting requirement? Reach out to us! We’d love to hear from you and see where we could possibly help you out.
Updated over 3 years ago