Asset Graph
The asset graph is built during the analyze phase of every pageflare run. It tracks two things: the dependency relationships between files, and a rename map used when content-hash-based filenames are enabled.
Dependency tracking
Each node in the graph is a file path. Each directed edge represents a “references” relationship: the source file contains a URL or path pointing to the target file.
Examples of relationships that are tracked:
- index.html → styles/main.css (via <link rel="stylesheet">)
- styles/main.css → fonts/inter.woff2 (via url() in @font-face)
- index.html → scripts/app.js (via <script src="...">)
- blog/post.html → images/hero.jpg (via <img src="...">)
This graph lets processors answer questions like “which HTML pages reference this font?” and “what resources does this page depend on?” without re-scanning files during processing.
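To make the two example questions concrete, here is a minimal sketch of such a graph in Python. The `AssetGraph` class and its method names are hypothetical and are not pageflare's actual API. The sketch assumes edges are stored in both directions so that "who references this file?" is as cheap to answer as "what does this file reference?".

```python
from collections import defaultdict


class AssetGraph:
    """Hypothetical in-memory graph: nodes are file paths, edges mean 'references'."""

    def __init__(self):
        self.targets = defaultdict(set)  # source -> files it references
        self.sources = defaultdict(set)  # target -> files that reference it

    def add_reference(self, source, target):
        self.targets[source].add(target)
        self.sources[target].add(source)

    def dependencies(self, path):
        """All files reachable from `path` by following reference edges."""
        return self._walk(path, self.targets)

    def dependents(self, path):
        """All files that reach `path`, directly or through other files."""
        return self._walk(path, self.sources)

    @staticmethod
    def _walk(start, edges):
        # Iterative depth-first traversal; `seen` also guards against cycles.
        seen, stack = set(), [start]
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


graph = AssetGraph()
graph.add_reference("index.html", "styles/main.css")
graph.add_reference("styles/main.css", "fonts/inter.woff2")
graph.add_reference("index.html", "scripts/app.js")
graph.add_reference("blog/post.html", "images/hero.jpg")

# "Which HTML pages reference this font?" (here: indirectly, through the CSS)
pages = sorted(p for p in graph.dependents("fonts/inter.woff2") if p.endswith(".html"))
# "What resources does this page depend on?"
resources = sorted(graph.dependencies("index.html"))
```

With the four example edges, the font is reached from index.html only through styles/main.css, which is why the lookup follows edges transitively rather than stopping at direct references.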
Reference rewriting
When filenames change, either due to content hashing or output path remapping, all references to those files must be updated. The asset graph makes this straightforward:
- After processing computes the final output path for each file, a rename map (original path → new path) is populated.
- During the write phase, each file’s content is scanned for references using the dependency graph.
- Any reference that appears in the rename map is rewritten to the new path before the file is saved.
This covers references in HTML attributes (href, src, action, data-src), CSS url() values, and JavaScript string literals that match known asset paths.
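The steps above can be sketched roughly as follows. This is an illustration only: `rewrite_references` and the sample `rename_map` are hypothetical, and the sketch uses plain text substitution, whereas a real implementation would more likely parse HTML attributes and CSS url() values. It also assumes references appear in the content exactly as they are keyed in the graph.

```python
import re

# Hypothetical rename map: original path -> new path.
rename_map = {
    "styles/main.css": "styles/main.a3f9b2c1.css",
    "scripts/app.js": "scripts/app.d84e21f0.js",
}


def rewrite_references(content, references, rename_map):
    """Replace each renamed reference in `content` with its new path.

    `references` is the set of paths the graph recorded for this file, so
    only known references are touched.
    """
    renamed = [ref for ref in references if ref in rename_map]
    if not renamed:
        return content
    # Longest first, so a short path cannot match inside a longer one.
    renamed.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(ref) for ref in renamed))
    # A single pass means already-rewritten text is never rewritten again.
    return pattern.sub(lambda m: rename_map[m.group(0)], content)


html = '<link rel="stylesheet" href="styles/main.css"><script src="scripts/app.js"></script>'
rewritten = rewrite_references(html, {"styles/main.css", "scripts/app.js"}, rename_map)
```

Restricting the substitution to references the graph already knows about keeps unrelated strings that merely look like paths from being modified.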
Content hashing PRO
Content hashing (the hash_filenames feature) appends a short hash derived from the file’s contents to its filename before the extension:
styles/main.css → styles/main.a3f9b2c1.css
scripts/app.js → scripts/app.d84e21f0.js

The hash is computed over the final processed bytes, after all processors have run, so it reflects the actual content that will be served.
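As a rough illustration of the naming scheme, the sketch below derives a hashed filename from a file's bytes. The hash algorithm is not stated here, so SHA-256 truncated to 8 hex characters is an assumption chosen to match the shape of the examples, and `hashed_path` is a hypothetical helper.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(path, data, length=8):
    """Insert a short content hash before the file extension.

    Assumption: SHA-256, truncated to `length` hex characters.
    """
    digest = hashlib.sha256(data).hexdigest()[:length]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


css = b"body { margin: 0 }"
name = hashed_path("styles/main.css", css)  # styles/main.<8 hex chars>.css
```

Because the digest depends only on the bytes passed in, identical output always produces the same filename, and any change to the processed content produces a different one.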
Why content hashing improves caching
Without content hashing, your CDN or browser cache must use Cache-Control: max-age to decide when to re-fetch a file. Setting a long max-age (e.g. one year) means stale files may be served after a deployment. Setting a short max-age wastes bandwidth with unnecessary revalidation requests.
Content hashing enables the best of both worlds:
- Set Cache-Control: max-age=31536000, immutable on all hashed assets. Each URL uniquely identifies its content, so the response for that URL never changes.
- When a file changes, its hash changes, producing a new URL. The HTML (or CSS or JS) that references it is updated to the new URL by the reference rewriting step above.
HTML files themselves are not hashed — they are the entry points that browsers request by path. HTML files should use a short max-age or Cache-Control: no-cache so browsers always get the current version (which then references the latest hashed assets).
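If your server or deployment script sets headers in code, the policy above could be expressed along these lines. The header values come from the text; the function name, its arguments, and the choice of no-cache for every unhashed file are assumptions made for illustration.

```python
# Values taken from the text above; the function and its arguments are hypothetical.
IMMUTABLE = "max-age=31536000, immutable"
REVALIDATE = "no-cache"


def cache_control_for(path, hashed_paths):
    """Return a Cache-Control value for an output file.

    Hashed assets can be cached for a year. Everything else, including HTML
    entry points, is revalidated on each request (an assumption for files the
    text does not cover).
    """
    return IMMUTABLE if path in hashed_paths else REVALIDATE


hashed = {"styles/main.a3f9b2c1.css", "scripts/app.d84e21f0.js"}
headers = {p: cache_control_for(p, hashed) for p in ["index.html", *sorted(hashed)]}
```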
Hash computation order
Because a CSS file’s hash must be known before the HTML that references it can be finalized, hashing proceeds in dependency order:
- Leaf assets (images, fonts) are hashed first.
- CSS files (which may reference images and fonts) are hashed next, after their references are rewritten to hashed paths.
- JS files are hashed.
- HTML files are finalized last, with all CSS, JS, and image references already rewritten to their hashed URLs.
The dependency graph determines this order automatically, ensuring references are always consistent.
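One common way to derive such an order is a topological sort of the reference edges, so that every file comes after the files it references. The sketch below uses Python's standard graphlib module on a small hypothetical graph. It illustrates the idea and is not a description of pageflare's internals.

```python
from graphlib import TopologicalSorter

# source -> set of files it references (hypothetical example graph)
references = {
    "index.html": {"styles/main.css", "scripts/app.js"},
    "styles/main.css": {"fonts/inter.woff2"},
    "scripts/app.js": set(),
    "fonts/inter.woff2": set(),
}

# static_order() yields each file only after everything it references,
# so leaf assets come first and HTML entry points come last.
order = list(TopologicalSorter(references).static_order())
```

The relative order of unrelated files (for example the font and the script) is not significant. Note that TopologicalSorter raises CycleError when files reference each other in a loop, so a graph with circular references would need separate handling.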
Related concepts
- Pipeline: the analyze phase that populates the graph
- Processors: how processors read the graph during the process phase
- CSS processors: critical CSS extraction uses the HTML→CSS graph edges
- Font processors: font preloading uses CSS→font graph edges to determine which fonts each page needs