Syd Harris Notes

Building Optimize CLI tool

Syd Harris — Wed, 25 Feb 2026 00:00:00 +0000

Back in September, I had this idea for a project (a launch for another time). There I was, building another microsite. Run it locally. Everything looks perfect. Push to production...

Broken Open Graph tags. Alt text missing on half the images. No Schema markup. URLs that reference the dev environment. And a critical page that has noindex from testing.

That's the thing about meta tags, schema markup, and accessibility attributes: they're invisible until they're not. Until you push to production. Run the schema.org validator. Open Chrome DevTools and use Lighthouse. Someone shares your link and you don't see that pretty social image. A screen reader encounters your page. Or Google Search Console says you've got a bunch of 404 pages.

Who's got time to be missing key things when we're all trying to keep up with changes in AI? I just wanted to build a static site that technically had a fighting chance to stand out in generative and traditional search.

Why Not Just Use Lighthouse?

Lighthouse is great. But it's solving a different problem.

It boots up headless Chrome, executes JavaScript, measures Core Web Vitals, captures screenshots, analyzes runtime performance... For a static site where I just need to know "are my meta tags correct?" or "does the schema JSON exist?", it's overkill during development.

Have you ever tried auditing a 50-page site? With Lighthouse you can only audit one page at a time. If you want to run batch audits there's Unlighthouse, which is pretty cool. The downside is that it has the same dependencies as Lighthouse, so everything above still applies.

And if you're working in a resource-constrained environment (not by choice but by circumstance), every node_module and dependency counts. Running Lighthouse locally was not an option. Imagine building on a 20GB flash drive. How can you efficiently test a site before it makes it to production—and without causing your system to crash?

Creative problem solving

Then I thought, why not just test the build folder?

What started out as just a few simple Node.js script checks turned into something I could customize for any project. Space was no longer a problem. And what took minutes, I was able to do in seconds.
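
To give a sense of what one of those script checks can look like, here's a minimal sketch. It is not Optimize's actual code. It takes one page's HTML as a string (reading the files out of the build folder is left out) and flags a few of the problems listed earlier. The `auditHtml` name, the `devHosts` option, and the regex matching are all assumptions for illustration; a real checker would likely use a proper HTML parser.

```javascript
// Hypothetical helper, not part of Optimize: flags a few common problems in one page's HTML.
// Regex matching is a simplification; a real checker would likely use an HTML parser.
function auditHtml(html, { devHosts = ['localhost', '127.0.0.1'] } = {}) {
  const issues = [];

  // A leftover robots noindex from testing keeps the page out of search results.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  const hasNoindex = metaTags.some(
    (tag) => /name\s*=\s*["']robots["']/i.test(tag) && /noindex/i.test(tag)
  );
  if (hasNoindex) issues.push('robots meta contains noindex');

  // Social previews need at least an og:image.
  const hasOgImage = metaTags.some((tag) => /property\s*=\s*["']og:image["']/i.test(tag));
  if (!hasOgImage) issues.push('missing og:image');

  // Every <img> should carry an alt attribute (an empty alt is valid for decorative images).
  const images = html.match(/<img\b[^>]*>/gi) || [];
  const missingAlt = images.filter((tag) => !/\salt\s*=/i.test(tag)).length;
  if (missingAlt > 0) issues.push(`${missingAlt} image(s) missing alt`);

  // URLs that still point at the dev environment.
  const urls = html.match(/https?:\/\/[^\s"'<>]+/gi) || [];
  const devUrls = urls.filter((url) => devHosts.some((host) => url.includes(`//${host}`)));
  if (devUrls.length > 0) issues.push(`${devUrls.length} URL(s) point at a dev host`);

  return issues;
}
```

An empty array means the page passed these four checks. Because the function only needs a string, it can run against every HTML file in a build folder without starting a browser.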

Feedback loop resolved. Code, test (errors), fix, test (pass), push.

Constraints breed creativity

The What?

A modular, extensible, CLI-first static site validator and diagnostic tool that runs post-build or continuously and plugs into any environment. There are 8 core checks:

Meta
Open Graph
Schema
Media
Links
Sources
Accessibility
Hierarchy

Each check is built as a module that can be configured for different project requirements via the configuration file. It scans 50 pages in seconds. That's fast enough to run on every build and get instant feedback while you're building.
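
One way to picture checks as configurable modules is sketched below. This is not Optimize's API: `checkModules`, `runChecks`, and the `checks` toggle object are hypothetical names, and the two checks shown are deliberately tiny stand-ins for the real ones.

```javascript
// Hypothetical sketch of "checks as modules"; none of these names come from Optimize.
// Each check is a function that receives a page and returns a list of issue strings.
const checkModules = {
  meta: (page) => (/<title>[^<]+<\/title>/i.test(page.html) ? [] : ['missing <title>']),
  hierarchy: (page) => {
    const h1Count = (page.html.match(/<h1\b/gi) || []).length;
    return h1Count === 1 ? [] : [`expected exactly one <h1>, found ${h1Count}`];
  },
};

// Runs every enabled check against every page and groups the issues by page path.
function runChecks(pages, config = {}) {
  const toggles = config.checks || {}; // e.g. { hierarchy: false } turns a check off
  const report = {};
  for (const page of pages) {
    const issues = [];
    for (const [name, check] of Object.entries(checkModules)) {
      if (toggles[name] === false) continue;
      issues.push(...check(page).map((msg) => `${name}: ${msg}`));
    }
    report[page.path] = issues;
  }
  return report;
}
```

Adding a custom check in this sketch means adding one more function to the object, which is the kind of extension point the modular design is meant to allow.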

The core engine is 38.4KB unpacked. It uses no dependencies other than Changesets for versioning, pnpm for modular repo management, and Node.js.

No bloated dependency trees. No hidden framework requirements. Just focused tooling that you can extend as needed. Run it as a dev dependency, include it in your agentic or CI/CD workflow, or adapt it and make your own tools with extensions or custom checks that plug into the core engine.

I decided to call it Optimize, and it's a project I'm working on at NOVL. If you're building static sites—whether with Eleventy, Hugo, Astro, Next.js static exports, or Jekyll—this is for you.

Try it out

Currently it's in beta and I'd love your feedback. The plan is to open source it, but for now, real feedback matters. It solved a problem I kept running into. Maybe it'll solve yours too.

Install the package:

# Install on your project
npm install -D @bynovl/optimize

Set up your config (optimize.config.js):

export default {
  outDir: 'dist', // required
  ignore: ['404.html', '403.html', '402.html'],
}

Start testing:

# Runs a full audit
npx optimize

Join me

Come join me in optimizing the web. Leaving no site behind. Sign up for the beta.

Introducing the `context` Attribute: A New Standard for Semantic Web Markup

Syd Harris — Sun, 12 Apr 2026 00:00:00 +0000

Modern web development, especially with frameworks like React, often produces HTML that is visually correct but semantically ambiguous. Class and ID names are frequently auto-generated or used solely for styling and JavaScript hooks, making it difficult for both humans and AI to understand the true intent of each element.

While class and ID names can provide some context, relying on them for semantic meaning is not ideal: they are often repurposed for styling or dynamic behavior, which blurs the separation of presentation and function. I've been thinking a lot about a dedicated context attribute for HTML elements. This attribute encodes the semantic role or intent of an element, separate from its styling or dynamic data.



Jane Doe

Why Not `aria-`, `data-`, `id`, `class`, or `itemscope`?

ARIA attributes (like aria-label and role) are designed for accessibility, specifically to help screen readers and assistive technologies interpret web content. They are not intended for general semantic annotation or AI-driven workflows. ARIA's primary audience is accessibility tools, and its vocabulary is specific to accessibility needs.

Microdata and RDFa (e.g., itemscope) are standards for embedding structured data in HTML, usually for search engines and knowledge graphs. They require using pre-defined vocabularies (like schema.org), which can be too rigid or verbose for many real-world content needs. Microdata is best for marking up entities for external consumption, not for subjective, project-specific context.


  Jane Doe

Data attributes (e.g. data-*) are meant for dynamic data and JavaScript hooks. While you can add semantic context using a data attribute (e.g., data-context), this is not their primary purpose. Data attributes are meant for storing custom data private to the page or application, often for scripting or state.

ID attributes are meant to be unique within a page and are primarily used for JavaScript targeting and as URL-friendly hash anchors (e.g., #section-faqs or #what-is-context-attr). While useful for navigation and scripting, they are not intended to convey semantic meaning about the content itself, even if some meaning can be extracted.


What is context?

Classes are meant for presentation. While they can become descriptive using BEM (Block Element Modifier) or OOCSS (Object Oriented CSS), their focus should be on styling the element or component, not on communicating semantic intent. When BEM classes try to combine both presentational and semantic intent, they can become unwieldy and ambiguous.

How `context` Differs

context is for subjective, project- or content-specific meaning, not limited to accessibility or external knowledge graphs, though it can work hand in hand with both.
It enables richer, more flexible annotation of intent, role, or meaning—beyond what ARIA, microdata and schema allow.
It is designed for both human and AI understanding, and can evolve with your content and workflows.
Every element can be mapped to a JSON object with both its presentational and semantic roles clearly defined.
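
As a rough illustration of that last point, the sketch below maps one element to a JSON-friendly object that keeps class (presentation) and context (meaning) apart. `describeElement` and the output keys are hypothetical, and the regex only handles a single, non-nested element with double-quoted attributes; in a browser you would read the same values from the DOM instead.

```javascript
// Hypothetical sketch: turn one simple element into a JSON-friendly object that keeps
// presentation (class) and meaning (context) apart. Regex parsing only handles a single,
// non-nested element with double-quoted attributes.
function describeElement(html) {
  const match = html.trim().match(/^<([a-z][a-z0-9]*)\b([^>]*)>([\s\S]*?)<\/\1>$/i);
  if (!match) return null;
  const [, tag, rawAttrs, text] = match;

  // Collect name="value" pairs into a plain object.
  const attrs = {};
  for (const [, name, value] of rawAttrs.matchAll(/([a-z-]+)\s*=\s*"([^"]*)"/gi)) {
    attrs[name.toLowerCase()] = value;
  }

  return {
    tag: tag.toLowerCase(),
    semantic: attrs.context || null,
    presentational: attrs.class ? attrs.class.split(/\s+/).filter(Boolean) : [],
    text: text.trim(),
  };
}
```

For an element such as `<h3 class="faq__title text-lg" context="faq-question">What is context?</h3>`, the result carries `semantic: "faq-question"` next to `presentational: ["faq__title", "text-lg"]`, so neither role has to be guessed from the other.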

Benefits

Separation of Concerns: Style, function, and meaning are clearly separated.
AI & Automation: Enables robust extraction, translation, and content QA workflows.
Accessibility: Tools can better infer the purpose of each element.
Maintainability: Future developers can quickly understand the intent of markup.

Final Note

While context is ideal for clarity and semantics, current tooling and frameworks (including React and some HTML validators) may not fully support custom attributes without the data- prefix. However, in most modern browsers you can select and manipulate the context attribute directly via the DOM (e.g., document.querySelector('[context="faq-question"]')).
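
If a toolchain does force the data- prefix, one possible workaround is to query for both spellings. The helper below is a small sketch of that idea; `contextSelector` and the `data-context` fallback name are assumptions, not part of any standard.

```javascript
// Hypothetical helper: builds a selector that matches the bare `context` attribute and,
// as a fallback, a `data-context` attribute for toolchains that reject unprefixed names.
function contextSelector(name) {
  // Escape backslashes and double quotes so the value is safe inside the quoted selector.
  const safe = String(name).replace(/["\\]/g, '\\$&');
  return `[context="${safe}"], [data-context="${safe}"]`;
}
```

In a browser, `document.querySelectorAll(contextSelector('faq-question'))` would then find elements marked up either way, which lets a project move between the two spellings without touching its scripts.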

By making this small change in how we develop, we enable immediate, robust HTML-to-JSON extraction, unlocking what I call an HTML-driven API. This means anyone who understands HTML and structured data can create an API or data contract directly from the markup—no need for separate backend schemas or complex integrations. The structure and content of the API payload are derived from the annotated HTML itself, using context attributes to map markup to JSON objects that preserve both presentational and semantic meaning. This approach lets the frontend drive the data model, ensures consistency between UI and data, and empowers rapid prototyping and integration.

Example: Putting It All Together

This flow shows how annotated HTML can be extracted to JSON, processed by a backend or AI, and then used to update the UI—enabling a seamless, dynamic, and intelligent web experience.

Composing HTML


  
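<!-- Tag names are illustrative; the context values match the script below -->
<section context="faq-list">
  <div context="faq-item">
    <h3 context="faq-question">What is context?</h3>
    <p context="faq-answer">A semantic marker for intent.</p>
  </div>
</section>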

Extracting JSON with JavaScript

const faqList = Array.from(document.querySelectorAll('[context="faq-item"]')).map(item => ({
  "faq-question": item.querySelector('[context="faq-question"]')?.textContent.trim() || '',
  "faq-answer": item.querySelector('[context="faq-answer"]')?.textContent.trim() || ''
}));
const result = { "faq-list": faqList };

Generated JSON (HTML-Driven API) for Backend/AI Automation

// Can be stored as a JS module or as JSON
{
  "faq-list": [
    {
      "faq-question": "What is context?",
      "faq-answer": "A semantic marker for intent."
    }
  ]
}
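
Since this payload acts as the data contract, it can help to check its shape before writing anything back to the page. The guard below is a minimal sketch of such a check; `validateFaqPayload` is a hypothetical name, and the rule that both fields must be non-empty strings is an assumption about what a valid FAQ entry looks like.

```javascript
// Hypothetical guard: checks that a payload matches the FAQ shape before it is applied to the page.
// Returns a list of problems; an empty list means the payload has the expected shape.
function validateFaqPayload(data) {
  const problems = [];
  const list = data && data['faq-list'];
  if (!Array.isArray(list)) return ['"faq-list" must be an array'];

  list.forEach((faq, i) => {
    for (const key of ['faq-question', 'faq-answer']) {
      if (!faq || typeof faq[key] !== 'string' || faq[key].trim() === '') {
        problems.push(`faq-list[${i}] is missing a non-empty "${key}"`);
      }
    }
  });
  return problems;
}
```

Running this on a response from a backend or an AI step before updating the DOM keeps a malformed payload from blanking out content that was already on the page.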

Dynamic HTML Update from API/JSON

// Select all FAQ items
const faqItems = document.querySelectorAll('[context="faq-item"]');

// Write each FAQ entry back into the matching item, by position
function applyFaqs(data) {
  data["faq-list"].forEach((faq, i) => {
    const item = faqItems[i];
    if (!item) return;
    const q = item.querySelector('[context="faq-question"]');
    const a = item.querySelector('[context="faq-answer"]');
    if (q) q.textContent = faq["faq-question"];
    if (a) a.textContent = faq["faq-answer"];
  });
}

// Option 1: Backend or local fetch (API)
fetch('/faqs.json').then(res => res.json()).then(applyFaqs);

// Option 2: Import as JS module (faqs.js exports the JSON as `data`).
// Use this instead of Option 1, with the import at the top of the module:
// import { data } from './faqs.js';
// applyFaqs(data);

If context were to become a new standard, the real challenge would not be technical but social: developer adoption and education. Like Microdata, RDFa, Schema.org, and accessibility rules before it, the value of context will only be realized if teams understand, believe in, and apply it.