Why Is the XMIN Wraparound Causing MAR Spikes?
Issue
The value of the xmin
column has an upper limit of 4,294,967,295. After the database's total number of transactions exceeds this value, it "wraps around" to zero. After this event occurs in PostgreSQL versions 9.5 and later, multiple rows may have the same xmin
value and Fivetran may additionally sync older, unchanged "frozen" rows, resulting in a MAR spike and higher costs.
Environment
- Connector: PostgreSQL
- Sync method: XMIN system column
Resolution
The recommended solution to this problem is to use logical replication as the incremental sync method. If logical replication is not an option, take the following steps to resolve the excess MAR issue:
- Verify that Fivetran has read access to the
ctid
andxmax
columns on all tables. - Enable the
pg_visibility
extension and create an associated wrapper function to filter frozen pages. - Enable the
pageinspect
extension and create an associated wrapper function to filter frozen rows.
This solution will not be enabled for:
- Non-ordinary tables (views, materialized views, partitioned tables, and etc.).
- Cases where the number of transactions between syncs is higher than the
vacuum_freeze_min_age
. - Heroku PostgreSQL.
- Amazon Aurora PostgreSQL (partially works, see note below).
NOTE: Amazon Aurora PostgreSQL supports the
pg_visibility
extension, but not thepageinspect
extension. When using an Aurora PostgreSQL source database, only frozen pages can be detected, not frozen rows in a page. Any frozen rows contained in unfrozen pages may still be synced, resulting in excess MAR. For more information, see our Aurora frozen rows article.
Verify that Fivetran has read access to the ctid
and xmax
columns
Fivetran must be able to access the ctid
column and xmax
column in every table that needs to be synced. If table level read-only access has been granted to the Fivetran user, we will have access to those columns and you can skip to the next step. If column level permissions were granted, then manual access likely needs to be granted.
If Fivetran does not have access to these columns on any particular table, you will be notified with a warning explaining which tables are missing privileges to read these columns. For information on how to verify and grant access, see this article.
Enable the pg_visibility
extension
NOTE: Do I need to make the Fivetran user a superuser?
No. We access the
pg_visibility
andpageinspect
extensions using wrapper functions that run with elevated superuser privileges using theSECURITY DEFINER
keyword. This allows the Fivetran user to make use of these extensions without requiring superuser access.
Run the following command as a superuser:
CREATE EXTENSION pg_visibility;
Confirm the extension is installed by running the following query:
SELECT * FROM pg_available_extensions WHERE name = 'pg_visibility' AND installed_version IS NOT NULL;
Create get_all_pages
wrapper function
Fivetran will use this to leverage the pg_visibility_map
function from the pg_visibility
extension to detect frozen pages in a table. Run the following commands as a superuser:
CREATE SCHEMA fivetran;
CREATE OR REPLACE FUNCTION fivetran.get_all_pages(v_table_name character varying)
RETURNS TABLE (
pagenumber integer,
all_visible_yn boolean,
all_frozen_yn boolean)
LANGUAGE plpgsql
SECURITY definer
AS $function$
declare
begin
RETURN QUERY
SELECT blkno::int as pageNumber,
all_visible as all_visible_yn,
all_frozen as all_frozen_yn
FROM pg_visibility_map($1::regclass);
END;
$function$;
GRANT USAGE ON SCHEMA fivetran TO FIVETRAN_CONNECTOR_DATABASE_USERNAME_HERE;
GRANT EXECUTE ON FUNCTION fivetran.get_all_pages TO FIVETRAN_CONNECTOR_DATABASE_USERNAME_HERE;
NOTE: Ensure that the function is defined with
SECURITY definer
and owned by a superuser, not the Fivetran user.
Confirm the wrapper function was created by running the following query:
SELECT p.proname as name
FROM pg_catalog.pg_namespace n
JOIN pg_catalog.pg_proc p ON p.pronamespace = n.oid
WHERE n.nspname = 'fivetran'
AND proname IN ('get_all_pages')
AND has_function_privilege(p.oid, 'execute')
AND has_schema_privilege(n.oid, 'usage');
NOTE: Replace
FIVETRAN_CONNECTOR_DATABASE_USERNAME_HERE
with the specific PostgreSQL user account name, which the Fivetran connector uses to access the source database.
NOTE: Skip the remaining steps if you are using Amazon Aurora PostgreSQL. See our Aurora frozen rows article for more information.
Enable the pageinspect
extension
Run the following command as a superuser:
CREATE EXTENSION pageinspect;
Confirm the extension is installed by running the following query:
SELECT * FROM pg_available_extensions WHERE name = 'pageinspect' AND installed_version IS NOT NULL;
Create get_items_in_page
wrapper function
Fivetran will use this to leverage the get_raw_page
and heap_page_items
functions from the pageinspect
extension to detect frozen rows in a page. Run the following commands as a superuser:
CREATE OR REPLACE FUNCTION fivetran.get_items_in_page(
v_table_name character varying,
v_page_number integer)
RETURNS TABLE (
lp smallint,
lp_off smallint,
lp_flags smallint,
lp_len smallint,
t_xmin xid,
t_xmax xid,
t_field3 integer,
t_ctid tid,
t_infomask integer,
t_hoff smallint,
t_bits text,
t_oid oid,
t_data bytea)
LANGUAGE plpgsql
SECURITY DEFINER
AS $function$
declare
begin
RETURN QUERY
SELECT h.lp, h.lp_off, h.lp_flags,
h.lp_len, h.t_xmin, h.t_xmax,
h.t_field3, h.t_ctid, h.t_infomask,
h.t_hoff, h.t_bits, h.t_oid, h.t_data
FROM heap_page_items(get_raw_page($1,$2)) as h;
END;
$function$;
GRANT EXECUTE ON FUNCTION fivetran.get_items_in_page TO FIVETRAN_CONNECTOR_DATABASE_USERNAME_HERE;
NOTE: Ensure that the function is defined with
SECURITY definer
and owned by a superuser, not the Fivetran user.
Verify wrapper function creation
Confirm the wrapper function was created by running the following query:
SELECT p.proname as name
FROM pg_catalog.pg_namespace n
JOIN pg_catalog.pg_proc p ON p.pronamespace = n.oid
WHERE n.nspname = 'fivetran'
AND proname IN ('get_items_in_page')
AND has_function_privilege(p.oid, 'execute')
AND has_schema_privilege(n.oid, 'usage');
NOTE: Replace
FIVETRAN_CONNECTOR_DATABASE_USERNAME_HERE
with the specific PostgreSQL user account name that the Fivetran connector uses to access the source database.