How to Troubleshoot Missing Data
Question
How can I troubleshoot missing data?
Environment
HVR 5
Answer
There can be multiple reasons why data may be missing. Some of the reasons are:
- Due to a configuration error.
- By moving the capture position forward to quickly overcome some errors but failing to recover the skipped data.
Details
We need the client to help us:
- identify a missed change
- determine when that change took place on the source database
- establish whether that change was recorded in the source database transaction log
Once we've ascertained those facts, HVR can investigate why that change was not captured or integrated.
Steps
Perform an online refresh of the out-of-sync table to reconcile the source and target to establish a baseline.
Add the parameter Integrate /JournalRouterFiles for the integration location group and regenerate the integrate job. This will save all processed router files to
$HVR_CONFIG/jnl/<hub>/<chn>/YYYYMMDD
.These router files will provide us with a history of what HVR captured and submitted for integration. Additionally, you should do some regular housekeeping on the hub. For example, schedule the hvrmaint utility to manage disk utilization with option
–journal_keep_days=n
to periodically purge the router files aftern
days, i.e., 7.It is very important to schedule a frequent row-by-row and verbose HVR Compare so any mismatches are detected as soon as possible.
If you anticipate that the compare will produce more output than the current hub can handle, move the channel to a less busy hub. It is important to note that the compare will almost always report differences, but that is typically due to 'in-flight data' – changes that have been committed at the source but have yet to be replicated to the target. Instead, it is the difference that persists across multiple runs that we need to consider ourselves with.
Once you've identified a mismatch, use column
update_date
or equivalent to determine when the row was last written at the source.Liaise with the source DBA and secure the DBMS transaction log(s) for the transaction that made the missed change.
We will then use a transaction log viewer to verify that the change, as well as the entire transaction containing that change, is indeed present in the transaction log(s). After all, it is possible that the application (temporarily) disables transaction logging, preventing HVR from capturing the changes.
Once you've identified a mismatch, use column
update_date
or equivalent to determine when the row was last written at the source.Attach or upload the transaction logs, router files, and .enroll* files to HVR.
In conclusion, work with your DBA to ensure that the transaction logs remain available for some time. For example, should we identify a missed change that took place a week ago, we will need the transaction log that was active at that time.
Advice will be provided based on the cause of the discrepancy. Once the data has been collected, upload the data to the ticket.
TIP: This may be requested multiple times to identify the issue.
Once we have enough information, it may be requested by support to schedule a refresh in the short term but wait for that action to be specifically requested else valuable information needed to troubleshoot may be lost.