Why Are Old Rows Charged MAR When Using Aurora XMIN Replication?
Question
I have enabled the pg_visibility extension and created the fivetran.get_all_pages wrapper function so that Fivetran uses the new set of extract queries described here. I notice that some old, unchanged, or frozen rows are still being extracted. Why is Fivetran still extracting these rows and counting them as MAR?
Environment
- Connector: Amazon Aurora PostgreSQL
- Sync method: XMIN system column
Answer
Amazon Aurora does not support the pageinspect extension, which would allow Fivetran to filter at the tuple (row) level instead of the page level.
Aurora supports only the pg_visibility extension, which reports which pages are frozen.
pageinspect, by contrast, shows which tuples inside a page are frozen.
To ensure data integrity, Fivetran must extract every page that is not frozen, even though such a page may contain multiple frozen tuples.
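The difference between page-level and tuple-level filtering can be illustrated with a small simulation. This is only a sketch, not Fivetran's actual extract logic: the `Page` class, both counting functions, and the three-page table layout are hypothetical and exist purely to show how the row counts diverge.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical heap page, reduced to the frozen flag of each tuple."""
    tuples_frozen: List[bool]

    @property
    def all_frozen(self) -> bool:
        # A page-level view can only say whether *every* tuple is frozen.
        return all(self.tuples_frozen)


def rows_extracted_page_level(pages: List[Page]) -> int:
    """Rows read when only per-page frozen status is available."""
    # Any page that is not entirely frozen is read in full.
    return sum(len(p.tuples_frozen) for p in pages if not p.all_frozen)


def rows_extracted_tuple_level(pages: List[Page]) -> int:
    """Rows read if per-tuple frozen status were available."""
    return sum(1 for p in pages for frozen in p.tuples_frozen if not frozen)


# Assumed layout: three tuples per page.
pages = [
    Page([False, False, False]),  # no frozen tuples
    Page([True, True, False]),    # two frozen tuples, one active tuple
    Page([True, True, True]),     # fully frozen, skipped either way
]

print("page-level rows:", rows_extracted_page_level(pages))
print("tuple-level rows:", rows_extracted_tuple_level(pages))
```

Under these assumptions, the page-level approach reads all six rows on the first two pages, while a tuple-level filter would read only the four rows that are not frozen.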
The following screenshot shows an example where the second page contains two frozen tuples and one active tuple. In this case, we would extract page one and page two and charge six MAR: