Asset Graph

The asset graph is built during the analyze phase of every pageflare run. It tracks two things: the dependency relationships between files, and a rename map used when content-hash-based filenames are enabled.

Dependency tracking

Each node in the graph is a file path. Each directed edge represents a “references” relationship — a source file that contains a URL or path pointing to a target file.

Examples of relationships that are tracked:

index.html → styles/main.css (via <link rel="stylesheet">)
styles/main.css → fonts/inter.woff2 (via url() in @font-face)
index.html → scripts/app.js (via <script src="...">)
blog/post.html → images/hero.jpg (via <img src="...">)

This graph lets processors answer questions like “which HTML pages reference this font?” and “what resources does this page depend on?” without re-scanning files during processing.
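As an illustration only, the structure described above can be sketched as a pair of adjacency maps. The `AssetGraph` class and its method names are hypothetical and are not pageflare's actual API; the edges are the example relationships listed earlier.

```python
from collections import defaultdict


class AssetGraph:
    """Directed graph: an edge source -> target means 'source references target'."""

    def __init__(self):
        self.references = defaultdict(set)     # source -> files it references
        self.referenced_by = defaultdict(set)  # target -> files that reference it

    def add_reference(self, source, target):
        self.references[source].add(target)
        self.referenced_by[target].add(source)

    def dependencies_of(self, path):
        """Everything `path` depends on, directly or transitively."""
        return self._walk(path, self.references)

    def dependents_of(self, path):
        """Everything that references `path`, directly or transitively."""
        return self._walk(path, self.referenced_by)

    @staticmethod
    def _walk(start, edges):
        # Iterative depth-first traversal; `seen` also guards against cycles.
        seen, stack = set(), [start]
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


graph = AssetGraph()
graph.add_reference("index.html", "styles/main.css")
graph.add_reference("styles/main.css", "fonts/inter.woff2")
graph.add_reference("index.html", "scripts/app.js")
graph.add_reference("blog/post.html", "images/hero.jpg")

# "Which HTML pages reference this font?"
pages_using_font = sorted(
    p for p in graph.dependents_of("fonts/inter.woff2") if p.endswith(".html")
)
# "What resources does this page depend on?"
page_deps = sorted(graph.dependencies_of("index.html"))
```

Keeping both a forward and a reverse map means either question is answered by a graph walk rather than by re-reading file contents.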

Reference rewriting

When filenames change — either due to content hashing or output path remapping — all references to those files must be updated. The asset graph makes this straightforward:

After processing computes the final output path for each file, a rename map (original path → new path) is populated.
During the write phase, each file’s content is scanned for references using the dependency graph.
Any reference that appears in the rename map is rewritten to the new path before the file is saved.

This covers references in HTML attributes (href, src, action, data-src), CSS url() values, and JavaScript string literals that match known asset paths.
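The rewriting step can be sketched roughly as follows. This is a simplified illustration, not pageflare's implementation: the `rewrite_references` function and both regular expressions are hypothetical, references are assumed to appear verbatim as site-relative paths, and JavaScript string literals are left out for brevity.

```python
import re

# Hypothetical rename map (original path -> new path), as populated after processing.
rename_map = {
    "styles/main.css": "styles/main.a3f9b2c1.css",
    "scripts/app.js": "scripts/app.d84e21f0.js",
}

# Quoted values of the HTML attributes named in the text above.
ATTR_RE = re.compile(r'''(\b(?:href|src|action|data-src)\s*=\s*)(["'])(.*?)\2''')
# CSS url() values, with or without quotes.
URL_RE = re.compile(r'''url\(\s*(["']?)(.*?)\1\s*\)''')


def rewrite_references(content, rename_map):
    """Replace every reference found in the rename map; leave all others untouched."""

    def fix_attr(match):
        prefix, quote, path = match.groups()
        return f"{prefix}{quote}{rename_map.get(path, path)}{quote}"

    def fix_url(match):
        quote, path = match.groups()
        return f"url({quote}{rename_map.get(path, path)}{quote})"

    return URL_RE.sub(fix_url, ATTR_RE.sub(fix_attr, content))


html = (
    '<link rel="stylesheet" href="styles/main.css">'
    '<script src="scripts/app.js"></script>'
    '<img src="images/hero.jpg">'
)
rewritten = rewrite_references(html, rename_map)
```

References that are not in the rename map, such as `images/hero.jpg` here, pass through unchanged.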

Content hashing (PRO)

Content hashing (the hash_filenames feature) appends a short hash derived from the file’s contents to its filename before the extension:

styles/main.css  →  styles/main.a3f9b2c1.css
scripts/app.js   →  scripts/app.d84e21f0.js

The hash is computed over the final processed bytes — after all processors have run — so it reflects the actual content that will be served.
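A minimal sketch of the naming scheme, under stated assumptions: the text does not say which hash function or length pageflare uses, so SHA-256 truncated to 8 hex characters is chosen here only to match the shape of the examples above. The `hashed_filename` helper is hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(path, content, length=8):
    """Insert a short content hash before the file extension.

    `content` must be the final processed bytes of the file.
    SHA-256 and the 8-character length are assumptions for illustration.
    """
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


name = hashed_filename("styles/main.css", b"body { margin: 0 }")
```

Because the name depends only on the bytes, identical content always yields the same filename, and any change to the content yields a different one.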

Why content hashing improves caching

Without content hashing, browsers and CDNs rely on Cache-Control: max-age alone to decide when to re-fetch a file. A long max-age (e.g. one year) means stale files may be served after a deployment. A short max-age wastes bandwidth on unnecessary revalidation requests.

Content hashing enables the best of both worlds:

Set Cache-Control: max-age=31536000, immutable on all hashed assets. The URL uniquely identifies the content, so the bytes served at a hashed URL never change.
When a file changes, its hash changes, producing a new URL. The HTML (or CSS or JS) that references it is updated to the new URL by the reference rewriting step above.

HTML files themselves are not hashed — they are the entry points that browsers request by path. HTML files should use a short max-age or Cache-Control: no-cache so browsers always get the current version (which then references the latest hashed assets).
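The caching policy described above could be expressed on the serving side along these lines. This is a sketch, not something pageflare provides: `cache_control_for` is a hypothetical helper, the pattern assumes the 8-hex-character hash shape shown in the earlier examples, and the fallback value for unhashed non-HTML files is an arbitrary placeholder.

```python
import re

# Assumes hashed names look like "name.<8 hex chars>.ext", as in the examples above.
HASHED_NAME = re.compile(r"\.[0-9a-f]{8}\.[A-Za-z0-9]+$")


def cache_control_for(path):
    """Pick a Cache-Control header value for a served file."""
    if path.endswith(".html"):
        # Entry points: always revalidate so the latest hashed URLs are picked up.
        return "no-cache"
    if HASHED_NAME.search(path):
        # The URL changes whenever the content does, so cache for a year.
        return "max-age=31536000, immutable"
    # Unhashed asset: placeholder short lifetime (an assumption, tune as needed).
    return "max-age=300"
```

For example, `cache_control_for("styles/main.a3f9b2c1.css")` yields the long-lived immutable policy, while `cache_control_for("index.html")` yields `no-cache`.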

Hash computation order

Because a CSS file’s hash must be known before the HTML that references it can be finalized, hashing proceeds in dependency order:

Leaf assets (images, fonts) are hashed first.
CSS files (which may reference images and fonts) are hashed next, after their references are rewritten to hashed paths.
JS files are hashed after any asset paths in their string literals are rewritten to hashed paths.
HTML files are finalized last, with all CSS, JS, and image references already rewritten to their hashed URLs.

The dependency graph determines this order automatically, ensuring references are always consistent.
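One way to derive such an order is a topological sort over the reference edges. The sketch below uses Python's standard-library `graphlib` (Python 3.9+) and is an assumption about the general technique, not a description of pageflare's internals. The `references` map reuses the example edges from the dependency tracking section.

```python
from graphlib import TopologicalSorter

# file -> set of files it references (its dependencies)
references = {
    "index.html": {"styles/main.css", "scripts/app.js"},
    "styles/main.css": {"fonts/inter.woff2"},
    "scripts/app.js": set(),
    "fonts/inter.woff2": set(),
}

# TopologicalSorter takes node -> predecessors and emits predecessors first,
# so every file appears after all of the files it references.
order = list(TopologicalSorter(references).static_order())
```

In the resulting order the font precedes the stylesheet, and both the stylesheet and the script precede `index.html`. The relative order of unrelated files is not fixed. A reference cycle (for example, two CSS files referencing each other) makes `static_order()` raise `graphlib.CycleError`, so a real implementation would need a policy for that case.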

Pipeline — the analyze phase that populates the graph
Processors — how processors read the graph during the process phase
CSS processors — critical CSS extraction uses the HTML→CSS graph edges
Font processors — font preloading uses CSS→font graph edges to determine which fonts each page needs