PDF tracking & web analytics

Track PDF Engagement with PDF.js (Page Views, Scroll Depth, Downloads, Print). GA4 / GTM ready


Backstory (why this post exists at all)

This article basically wrote itself because I ran into a simple question: how do you track what users actually do inside a PDF on a website?

Companies love PDFs. Whitepapers, manuals, reports, price lists, templates… you name it.

And as a site owner you usually know only one thing: how many times the “Open PDF” button was clicked. That’s… cute, but useless.

Did users scroll?
Did visitors reach the last page of the PDF?
Did they read even half a page, or just bounce instantly?
Did someone download this PDF? Print it?

If you ask GPT, it will confidently suggest “just track the click” or “use an iframe.” Spoiler: that won’t give you real engagement either.

The root problem is simple: a raw PDF file isn’t a normal web page. Web analytics tools can’t hook into it the way they hook into HTML. No DOM events you control. No reliable scroll tracking. No interaction layer.

So, I went digging in a direction I hadn’t touched since my “HTML5 era” (201x): <canvas>. You can render anything there… and if it’s in the browser as HTML/JS, you can track it.

That’s how I landed on PDF.js – Mozilla’s open-source PDF renderer/viewer. It’s free, runs in the browser, and (this is the important part) it exposes an event bus you can subscribe to. Meaning: you can hang analytics events on almost everything the user does.

The idea in one line

Don’t open PDFs as PDFs. Open them inside a PDF.js viewer page – and track events inside that viewer.

Stage 1: Install PDF.js (self-hosted)

I’m assuming you use the prebuilt PDF.js distribution and host it on your site. Mozilla’s own docs recommend using the latest release for production.

Example structure:

/pdfjs/
  ├── build/
  │   ├── pdf.js
  │   └── pdf.worker.js
  └── web/
      └── viewer.html

Note: depending on the build/version you might see module files (.mjs) too. The concept stays the same.

Stage 2: Embed the viewer (iframe approach)

If your PDF is displayed inside a page (blog post, landing page, etc.) – embed the viewer:

<iframe
  src="/pdfjs/web/viewer.html?file=/docs/sample.pdf"
  width="100%"
  height="600"
  style="border:0"
></iframe>

A couple of practical notes:

  • The viewer uses a file= query parameter (URL-encode it if needed).
  • If you load PDFs from another domain, you’ll hit CORS unless that domain explicitly allows it. PDF.js uses Fetch/XHR under the hood.
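
To make the URL-encoding note concrete, here is a minimal sketch of a helper that builds the iframe src. The name buildViewerUrl is my own (it is not part of PDF.js), and the paths are just the example ones from this post.

```javascript
// Hypothetical helper (not a PDF.js API): builds the viewer URL for an iframe src.
function buildViewerUrl(viewerPath, pdfUrl) {
  // encodeURIComponent keeps "?", "&", "#" and spaces inside the PDF URL
  // from being parsed as part of the viewer's own query string.
  return `${viewerPath}?file=${encodeURIComponent(pdfUrl)}`;
}

const src = buildViewerUrl("/pdfjs/web/viewer.html", "/docs/my report.pdf?v=2");
console.log(src);
```

This matters most when the PDF URL carries its own query string (cache busters, signed URLs), because an unencoded "&" would otherwise be read as a second viewer parameter.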

Stage 3: Get events from the PDF.js EventBus (the good stuff)

PDF.js viewer exposes PDFViewerApplication.eventBus. That’s where the tracking magic happens.

The reliable hook

PDF.js fires a webviewerloaded event and exposes PDFViewerApplication.initializedPromise. Mozilla even documents this pattern for third-party viewer usage.

Put this JS inside viewer.html (or load it as an extra script from viewer.html). That’s the cleanest approach because you avoid cross-frame access problems.

document.addEventListener("webviewerloaded", () => {
  PDFViewerApplication.initializedPromise.then(() => {
    const bus = PDFViewerApplication.eventBus;

    bus.on("documentloaded", () => console.log("PDF loaded"));
    bus.on("pagechanging", (e) => console.log("Page:", e.pageNumber));
    bus.on("scalechanging", (e) => console.log("Zoom:", e.scale));
  });
});

Tiny warning (because reality exists)

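There are versions/edge cases where webviewerloaded behaves weirdly when the viewer is inside an iframe (yes, really). As far as I can tell, the viewer tries to dispatch the event on the parent page's document when it's embedded in a same-origin iframe, so a listener attached inside viewer.html may never fire. This has been reported upstream.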

If you ever hit that, the dumb-but-effective fallback is: run your code after DOMContentLoaded and wait for initializedPromise anyway (when your code lives inside viewer.html, it usually just works):

document.addEventListener("DOMContentLoaded", () => {
  PDFViewerApplication.initializedPromise.then(() => {
    // same code here
  });
});
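
If you don't want to guess which of the two signals will arrive, both hooks can be folded into one helper. This is only a sketch: onViewerReady is a name I made up, and it assumes PDFViewerApplication and initializedPromise behave as described above.

```javascript
// Hypothetical helper: runs `callback(app)` once the viewer is initialized,
// whichever signal arrives first. `win` is the viewer's window object.
function onViewerReady(win, callback) {
  let started = false;
  const tryStart = () => {
    const app = win.PDFViewerApplication;
    // Bail out if we already started or the viewer object isn't there yet.
    if (started || !app || !app.initializedPromise) return;
    started = true;
    app.initializedPromise.then(() => callback(app));
  };
  win.document.addEventListener("webviewerloaded", tryStart);
  win.document.addEventListener("DOMContentLoaded", tryStart);
  tryStart(); // covers the case where the viewer was set up before this script ran
}

// In viewer.html you would call:
// onViewerReady(window, (app) => {
//   app.eventBus.on("documentloaded", () => console.log("PDF loaded"));
// });
```

The started flag is there so your tracking code is wired only once, even if both events fire.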

Stage 4: Send events to GA4 (gtag) or to GTM (dataLayer)

Option A: Direct GA4 via gtag()

function sendPDFEvent(name, params = {}) {
  gtag("event", name, {
    event_category: "PDF",
    ...params,
  });
}

Now wire it to EventBus:

PDFViewerApplication.initializedPromise.then(() => {
  const app = PDFViewerApplication;
  const bus = app.eventBus;

  bus.on("documentloaded", () => {
    sendPDFEvent("pdf_loaded", {
      pdf_pages: app.pdfDocument?.numPages,
      pdf_url: app.url || location.href,
    });
  });

  bus.on("pagechanging", (e) => {
    sendPDFEvent("pdf_page_view", {
      page_number: e.pageNumber,
    });
  });

  bus.on("scalechanging", (e) => {
    sendPDFEvent("pdf_zoom_change", {
      zoom_scale: e.scale,
    });
  });
});

Option B: GTM-friendly dataLayer.push()

This is my usual preference if the site is GTM-driven:

window.dataLayer = window.dataLayer || [];

function pushPDFEvent(event, params = {}) {
  window.dataLayer.push({
    event,
    event_category: "PDF",
    ...params,
  });
}

Then swap sendPDFEvent() -> pushPDFEvent().
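
One thing I'd consider adding on top: users flip back and forth, so raw pagechanging events can count the same page several times. Below is a rough sketch of a wrapper that reports each page once and fires a separate event when the last page is reached. createPageTracker and the pdf_last_page_reached / pages_seen names are my own assumptions, not anything PDF.js or GA4 defines.

```javascript
// Hypothetical helper: reports each page once and flags when the last page is reached.
// `send` is sendPDFEvent or pushPDFEvent; `numPages` comes from the loaded document.
function createPageTracker(numPages, send) {
  const seen = new Set();
  return function onPageChanging(e) {
    if (seen.has(e.pageNumber)) return; // skip repeats when users flip back and forth
    seen.add(e.pageNumber);
    send("pdf_page_view", { page_number: e.pageNumber, pages_seen: seen.size });
    if (e.pageNumber === numPages) {
      // Event name is an assumption; rename it to fit your own naming scheme.
      send("pdf_last_page_reached", { pdf_pages: numPages });
    }
  };
}

// Wiring inside the viewer (after documentloaded, so numPages is known):
// bus.on("pagechanging", createPageTracker(app.pdfDocument.numPages, pushPDFEvent));
```

If you'd rather keep every page change (for example to measure re-reading), drop the Set and keep only the last-page check.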

Stage 5: Downloads & Printing (two ways)

The stable way: listen to EventBus actions

PDF.js viewer dispatches download and print actions through the event bus. So you can do:

bus.on("download", () => pushPDFEvent("pdf_download"));
bus.on("print", () => pushPDFEvent("pdf_print"));

This is nicer than click listeners because UI IDs and toolbar structure can change between versions.
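
If the list of tracked actions grows, a small lookup table keeps the wiring in one place. A minimal sketch, assuming the bus only needs an on(name, handler) method as used above; wireActionEvents and BUS_TO_ANALYTICS are names I invented for illustration.

```javascript
// EventBus event name -> analytics event name (both taken from this post).
const BUS_TO_ANALYTICS = {
  download: "pdf_download",
  print: "pdf_print",
};

// Hypothetical helper: subscribes to every bus event in the table and
// forwards it to `send` (sendPDFEvent or pushPDFEvent).
function wireActionEvents(bus, send) {
  for (const [busEvent, analyticsEvent] of Object.entries(BUS_TO_ANALYTICS)) {
    bus.on(busEvent, () => send(analyticsEvent));
  }
}

// Inside the viewer: wireActionEvents(PDFViewerApplication.eventBus, pushPDFEvent);
```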

The “just click the button” way (works, but version-sensitive)

If you do want click tracking, be aware: in current viewer builds, the IDs are usually downloadButton and printButton (not download / print).

document.getElementById("downloadButton")
  ?.addEventListener("click", () => pushPDFEvent("pdf_download"));

document.getElementById("printButton")
  ?.addEventListener("click", () => pushPDFEvent("pdf_print"));

Stage 6: Scroll depth (because PDF.js won’t hand it to you)

PDF.js doesn’t give you a neat “25%/50%/75%” scroll event out of the box.

But the viewer scroll container is trackable. Typical target: #viewerContainer.

const vc = document.getElementById("viewerContainer");
const fired = {};

vc?.addEventListener("scroll", () => {
  const pct = Math.floor(((vc.scrollTop + vc.clientHeight) / vc.scrollHeight) * 100);

  [25, 50, 75, 100].forEach((th) => {
    if (pct >= th && !fired[th]) {
      fired[th] = true;
      pushPDFEvent("pdf_scroll_depth", { scroll_depth: th });
    }
  });
}, { passive: true });

Tip from practice:
Scroll depth is good, but it’s not the same as “read”. Pair it with pagechanging and time-on-doc if you want something closer to “consumption”.
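
A slightly more defensive variant of the same idea, as a sketch: the threshold logic lives in a plain function so it can be tested without a browser, and it skips containers that have nothing to scroll (otherwise the formula reports 100% right away). createScrollDepthTracker is a hypothetical name, not a PDF.js API.

```javascript
// Hypothetical helper: returns a function that takes scroll metrics and
// calls `onHit(threshold)` the first time each threshold is crossed.
function createScrollDepthTracker(onHit, thresholds = [25, 50, 75, 100]) {
  const fired = new Set();
  return function update({ scrollTop, clientHeight, scrollHeight }) {
    // Nothing to scroll (PDF not rendered yet, or it fits on screen): skip.
    if (scrollHeight <= clientHeight) return;
    const pct = Math.floor(((scrollTop + clientHeight) / scrollHeight) * 100);
    for (const th of thresholds) {
      if (pct >= th && !fired.has(th)) {
        fired.add(th);
        onHit(th);
      }
    }
  };
}

// Wiring inside the viewer:
// const vc = document.getElementById("viewerContainer");
// const update = createScrollDepthTracker((th) =>
//   pushPDFEvent("pdf_scroll_depth", { scroll_depth: th }));
// vc?.addEventListener("scroll", () => update(vc), { passive: true });
```

The trade-off: a short PDF that fits entirely on screen never produces a scroll depth event with this guard, so for those documents rely on pdf_loaded and page events instead.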

Stage 7: Time engaged (simple, realistic version)

I like “active time” more than “time since open”.

Rule of thumb: if there’s no activity for 5 seconds, the user is probably not engaging.

let lastActivity = Date.now();
let engagedMs = 0;

["scroll", "click", "keydown", "mousemove", "touchstart"].forEach((evt) => {
  window.addEventListener(evt, () => (lastActivity = Date.now()), { passive: true });
});

setInterval(() => {
  const idle = Date.now() - lastActivity > 5000;
  const visible = document.visibilityState === "visible";

  if (!idle && visible) engagedMs += 1000;
}, 1000);

// Every 15s, send whatever active time has accumulated since the last ping (or change the interval)
setInterval(() => {
  if (engagedMs > 0) {
    pushPDFEvent("pdf_engagement_ping", { engaged_ms: engagedMs });
    engagedMs = 0;
  }
}, 15000);

If you hate ping events, send a single “summary” event on close (pagehide) – just keep in mind some browsers may drop last-second events.
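
Here is a rough sketch of that summary approach. Two assumptions to flag: the timer measures real elapsed time between ticks instead of adding a fixed 1000 ms (interval timers can drift or get throttled), and it flushes whenever the tab becomes hidden rather than only on pagehide. createEngagementTimer and pdf_engagement_summary are names I made up.

```javascript
// Hypothetical helper: accumulates "active" milliseconds using a clock you pass in.
function createEngagementTimer({ idleMs = 5000, now = Date.now } = {}) {
  let lastActivity = now();
  let lastTick = lastActivity;
  let engagedMs = 0;
  return {
    // Call on scroll/click/keydown/etc.
    activity() {
      lastActivity = now();
    },
    // Call periodically; counts the time since the previous tick
    // only if the tab is visible and the user was recently active.
    tick(visible = true) {
      const t = now();
      if (visible && t - lastActivity <= idleMs) engagedMs += t - lastTick;
      lastTick = t;
    },
    // Returns the accumulated time and resets the counter.
    flush() {
      const total = engagedMs;
      engagedMs = 0;
      return total;
    },
  };
}

// Wiring inside the viewer:
// const timer = createEngagementTimer();
// ["scroll", "click", "keydown", "mousemove", "touchstart"].forEach((evt) =>
//   window.addEventListener(evt, () => timer.activity(), { passive: true }));
// setInterval(() => timer.tick(document.visibilityState === "visible"), 1000);
// document.addEventListener("visibilitychange", () => {
//   if (document.visibilityState !== "hidden") return;
//   const ms = timer.flush();
//   if (ms > 0) pushPDFEvent("pdf_engagement_summary", { engaged_ms: ms });
// });
```

With this wiring a user who switches tabs three times produces three summary events, so sum engaged_ms per session in your reports.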

What I ended up with (example)

On my site I embedded a PDF via a viewer shortcode.

(Your implementation may differ, but the tracking logic stays the same.)

If iframe doesn’t fit your case (PDF must live on its own URL)

Let’s say you don’t want the PDF inside a page. You want a “normal” PDF link, but still track engagement.

The trick: make the “PDF link” point to the viewer, not to the raw file.

Example idea:

  • User clicks: /docs/my-whitepaper/
  • That route serves: PDF.js viewer
  • Viewer loads: /files/my-whitepaper.pdf

This gives you analytics and a clean URL structure.
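
The mapping itself is tiny. A minimal sketch, assuming the /docs/<slug>/ and /files/<slug>.pdf layout from the example and the /pdfjs/web/viewer.html path from Stage 1; viewerUrlForDoc is a hypothetical helper you would call from whatever router or rewrite layer your site uses.

```javascript
// Hypothetical mapping: /docs/<slug>/ -> PDF.js viewer loading /files/<slug>.pdf
function viewerUrlForDoc(pathname) {
  // Only accept simple slugs so the route can't be used to reach arbitrary files.
  const match = /^\/docs\/([a-z0-9-]+)\/?$/.exec(pathname);
  if (!match) return null; // not a document route
  const pdfPath = `/files/${match[1]}.pdf`;
  return `/pdfjs/web/viewer.html?file=${encodeURIComponent(pdfPath)}`;
}

console.log(viewerUrlForDoc("/docs/my-whitepaper/"));
```

The simplest server-side use is a redirect to the returned URL. If the clean URL has to stay in the address bar, serve a page at that route that embeds the viewer in an iframe instead.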

GTM setup (viewer.html as a tracked environment)

Step 1: Add GTM snippet into viewer.html

  1. Open pdfjs/web/viewer.html.
    Inside <head>:
<!-- Google Tag Manager -->
<script>
(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
  j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
  'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-XXXXXXX');
</script>
<!-- End Google Tag Manager -->
  2. Right after the opening <body> tag:
<!-- Google Tag Manager (noscript) -->
<noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-XXXXXXX"
height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>
<!-- End Google Tag Manager (noscript) -->

GTM tracking options (pick your poison)

1) Click tracking on viewer buttons (fast setup)

In many current builds, the main toolbar buttons are:

  • Download: downloadButton
  • Print: printButton
  • Prev/Next: previous, next

Create triggers:

  • Trigger type: Click – All Elements
  • Filter: Click ID equals downloadButton (same for printButton, previous, next)

Then create GA4 Event tags:

  • pdf_download
  • pdf_print
  • pdf_prev_page
  • pdf_next_page

Add event_category = PDF.

2) Page view tracking (better via EventBus → dataLayer)

Yes, PDF.js can dispatch DOM events in some setups, but this breaks across versions and config changes.

If you want this rock-solid: push a custom event into dataLayer from EventBus:

bus.on("pagechanging", (e) => {
  dataLayer.push({ event: "pdf_page_view", page_number: e.pageNumber });
});

Then in GTM:

  • Trigger type: Custom Event
  • Event name: pdf_page_view
  • GA4 Event tag: pdf_page_view + parameter page_number

3) Scroll depth trigger on #viewerContainer (needs JS variable)

GTM’s built-in Scroll Depth trigger is for the page, not for the PDF container. So, either:

  • use my JS scroll tracking and push events, or
  • build custom JS variables + listeners in GTM.

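Bonus: “lazy mode” GTM JSON (import template)

This export uses the current common button IDs (downloadButton, printButton). Treat it as a starting template rather than a guaranteed one-click import: a container exported from GTM itself carries extra fields (account and container IDs, tag and trigger IDs, and so on), so you may need to adjust it before GTM accepts it.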

Replace GTM-XXXXXXX and check IDs in your version (open web/viewer.js and search for getElementById("downloadButton")).

{
  "exportFormatVersion": 2,
  "exportTime": "2026-02-16T12:00:00",
  "containerVersion": {
    "container": {
      "publicId": "GTM-XXXXXXX",
      "name": "PDF.js Tracker"
    },
    "tag": [
      {
        "name": "GA4 - PDF Download",
        "type": "ga4Event",
        "parameter": [
          { "key": "eventName", "value": "pdf_download" },
          {
            "key": "eventParameters",
            "type": "list",
            "list": [
              {
                "type": "map",
                "map": [
                  { "key": "name", "value": "event_category" },
                  { "key": "value", "value": "PDF" }
                ]
              }
            ]
          }
        ],
        "triggerId": ["1"]
      },
      {
        "name": "GA4 - PDF Print",
        "type": "ga4Event",
        "parameter": [
          { "key": "eventName", "value": "pdf_print" },
          {
            "key": "eventParameters",
            "type": "list",
            "list": [
              {
                "type": "map",
                "map": [
                  { "key": "name", "value": "event_category" },
                  { "key": "value", "value": "PDF" }
                ]
              }
            ]
          }
        ],
        "triggerId": ["2"]
      }
    ],
    "trigger": [
      {
        "name": "Click - PDF Download",
        "type": "CLICK",
        "filter": [
          {
            "type": "EQUALS",
            "parameter": [
              { "type": "template", "key": "arg0", "value": "{{Click ID}}" },
              { "type": "template", "key": "arg1", "value": "downloadButton" }
            ]
          }
        ],
        "uniqueTriggerId": "1"
      },
      {
        "name": "Click - PDF Print",
        "type": "CLICK",
        "filter": [
          {
            "type": "EQUALS",
            "parameter": [
              { "type": "template", "key": "arg0", "value": "{{Click ID}}" },
              { "type": "template", "key": "arg1", "value": "printButton" }
            ]
          }
        ],
        "uniqueTriggerId": "2"
      }
    ]
  }
}
