When There's No API, the Browser Is the API

An AI agent is only as useful as the systems it can touch. A voice agent that can hold a perfect conversation but cannot actually book the appointment is a demo, not a product. So the real constraint on agent automation is rarely the model. It is integration: can the agent read and write the system of record the business actually runs on?

For a large share of the platforms small businesses depend on, the conventional answer is "no." A booking platform, a clinic management system, a salon scheduler: many expose no public API at all, or offer a thin one that omits the operations you need, or gate the real API behind an enterprise tier the customer will never buy. The standard integration playbook stops there. Ours doesn't.

The reframe

A platform without an API is not a platform without an interface. It has a very rich, very well-tested interface: the web application its own users log into every day. Staff create bookings, reschedule them, pull up a client's history, take a payment. Every one of those actions is reachable through the browser, because that is the only way the platform's own customers do them.

So the question stops being "does this platform have an API?" and becomes "can a human do this in a browser?" If the answer is yes, an agent can do it too, by driving that same browser with Playwright. The web UI becomes the integration surface. The browser is the API.

What this looks like in practice

We run headless Chromium under Playwright, server-side, and treat each platform's UI as a set of operations exposed to the agent as tools. For a booking platform that means the obvious verbs:

Authenticate as the business, the same way their staff log in.
Read the schedule, availability, and client records.
Create a booking, reschedule it, cancel it, change its status.
Create or update a customer record.

The agent doesn't know or care that there's a browser underneath. It calls a tool like createBooking(...) exactly as it would call a REST endpoint; behind that tool, Playwright navigates the platform's pages, fills the same forms a receptionist would, waits for the same confirmations, and returns structured data to the model. From the agent's side it is just another capability. From the platform's side it is just another logged-in session.
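
To make that concrete, here is a minimal sketch of what such a tool might look like in Python. It is an illustration under assumptions, not our production code: the URL path, the selectors, and the BookingRequest shape are all hypothetical, and `page` is assumed to behave like a Playwright sync Page (only goto, fill, click, wait_for_selector, and inner_text are used).

```python
from dataclasses import dataclass


@dataclass
class BookingRequest:
    """Hypothetical input shape for the tool."""
    client_name: str
    service: str
    start_iso: str  # e.g. "2025-03-14T10:30"


def create_booking(page, base_url: str, req: BookingRequest) -> dict:
    """Create a booking by driving the platform's own web form.

    `page` is expected to behave like a Playwright sync Page.
    Every path and selector below is a placeholder for illustration.
    """
    page.goto(f"{base_url}/bookings/new")

    # Fill the same fields a receptionist would.
    page.fill("#client-name", req.client_name)
    page.fill("#service", req.service)
    page.fill("#start-time", req.start_iso)
    page.click("button[type=submit]")

    # Wait for the confirmation the UI shows, then read the reference off it.
    page.wait_for_selector(".booking-confirmation")
    reference = page.inner_text(".booking-confirmation .reference").strip()

    # Hand structured data to the agent, not HTML.
    return {
        "status": "confirmed",
        "reference": reference,
        "client": req.client_name,
        "start": req.start_iso,
    }
```

The agent only ever sees the function signature and the returned dictionary. Everything browser-shaped stays inside the tool.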

This is how we've integrated scheduling and clinic platforms across the board, including ones with no usable public API. The pattern is the same each time; only the selectors and the login flow change.

The hard parts (and how we handle them)

Browser automation is easy to demo and hard to run in production. The difference lies entirely in everything off the happy path.

Sessions, not logins. Logging in is the slow, fragile, sometimes MFA-guarded step. You do not want to do it on every request. So we authenticate once, persist the resulting browser context (its cookies and storage), and reuse the session across many operations. A scheduled health check pings each integration's session, detects when one has expired, and re-authenticates proactively, before an agent ever hits a logged-out page. The agent experiences a warm, always-logged-in connection; the messy re-auth happens out of band.
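
The shape of that logic can be sketched in a few lines of Python. This is a simplified illustration, not our implementation: SessionStore, its method names, and the one-JSON-file-per-integration layout are assumptions made for the example. The `login` and `is_alive` callables are injected; with Playwright, `login` would typically end by returning `context.storage_state()`, and the saved state would later be passed to `browser.new_context(storage_state=...)`.

```python
import json
from pathlib import Path
from typing import Callable, Optional


class SessionStore:
    """Keeps one persisted browser state per integration (hypothetical design)."""

    def __init__(self, state_dir: Path,
                 login: Callable[[str], dict],
                 is_alive: Callable[[str, dict], bool]):
        self.state_dir = state_dir
        self.login = login        # slow path: full UI login, returns storage state
        self.is_alive = is_alive  # cheap probe: does this state still reach a logged-in page?

    def _path(self, integration_id: str) -> Path:
        return self.state_dir / f"{integration_id}.json"

    def load(self, integration_id: str) -> Optional[dict]:
        """Return the saved state, or None if we have never logged in."""
        path = self._path(integration_id)
        return json.loads(path.read_text()) if path.exists() else None

    def refresh(self, integration_id: str) -> dict:
        """Log in again and persist the new state.

        Session state is sensitive; a real deployment would protect this file.
        """
        state = self.login(integration_id)
        self.state_dir.mkdir(parents=True, exist_ok=True)
        self._path(integration_id).write_text(json.dumps(state))
        return state

    def health_check(self, integration_id: str) -> str:
        """Run on a schedule, out of band from agent requests."""
        state = self.load(integration_id)
        if state is not None and self.is_alive(integration_id, state):
            return "ok"
        self.refresh(integration_id)
        return "reauthenticated"
```

Agent-facing tools only ever call `load`; the scheduler is the one calling `health_check`, so the expensive login never sits on the request path.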

Credentials, encrypted and per-tenant. Driving a UI means holding real login credentials. Those are encrypted at rest per business (we use envelope encryption via a managed KMS) and only decrypted in memory at the moment a session needs establishing. The automation layer never stores a credential in the clear.
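
For readers unfamiliar with envelope encryption, the flow looks roughly like the Python sketch below. It shows structure only: KeyService and Cipher are hypothetical interfaces standing in for a managed KMS client and an authenticated cipher such as AES-GCM, and the function and field names are invented for the example. No real KMS API is being described here.

```python
from dataclasses import dataclass
from typing import Protocol, Tuple


class KeyService(Protocol):
    """Stand-in for a managed KMS client (hypothetical interface)."""

    def generate_data_key(self, tenant_id: str) -> Tuple[bytes, bytes]:
        """Return (plaintext_data_key, wrapped_data_key) for this tenant."""

    def unwrap_data_key(self, tenant_id: str, wrapped_key: bytes) -> bytes:
        """Return the plaintext data key for a previously wrapped key."""


class Cipher(Protocol):
    """Stand-in for an authenticated cipher (hypothetical interface)."""

    def encrypt(self, key: bytes, plaintext: bytes) -> bytes: ...

    def decrypt(self, key: bytes, ciphertext: bytes) -> bytes: ...


@dataclass(frozen=True)
class SealedCredential:
    """What gets stored at rest: never the credential, never the raw data key."""
    tenant_id: str
    wrapped_key: bytes  # data key, encrypted by the KMS-held master key
    ciphertext: bytes   # credential, encrypted by the data key


def seal(kms: KeyService, cipher: Cipher, tenant_id: str, secret: str) -> SealedCredential:
    """Encrypt a credential under a fresh per-tenant data key."""
    data_key, wrapped_key = kms.generate_data_key(tenant_id)
    ciphertext = cipher.encrypt(data_key, secret.encode("utf-8"))
    return SealedCredential(tenant_id, wrapped_key, ciphertext)


def open_sealed(kms: KeyService, cipher: Cipher, sealed: SealedCredential) -> str:
    """Decrypt in memory, only at the moment a session needs establishing."""
    data_key = kms.unwrap_data_key(sealed.tenant_id, sealed.wrapped_key)
    return cipher.decrypt(data_key, sealed.ciphertext).decode("utf-8")
```

The point of the two layers is that the stored record is useless without a call to the key service, and that call can be scoped and audited per tenant.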

API where it exists, browser where it doesn't. This is not an anti-API stance. Where a platform offers a real endpoint for an operation, we use it: it is faster, cheaper, and more stable than driving a page. Several of our integrations are hybrids, calling the official API for the operations it covers and falling back to Playwright only for the ones it doesn't. Browser automation is the universal fallback, not the first choice.
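
A hybrid integration of this kind reduces to a small routing decision per operation. The Python sketch below is one way to express it, under assumptions: HybridIntegration and its method names are hypothetical, and the handlers are plain callables so the example stays independent of any particular platform or HTTP client.

```python
from typing import Any, Callable, Dict

Handler = Callable[..., Any]


class HybridIntegration:
    """Routes each operation to the official API if it covers it, else to the browser."""

    def __init__(self, api_ops: Dict[str, Handler], browser_ops: Dict[str, Handler]):
        self.api_ops = api_ops          # operations the platform's real API supports
        self.browser_ops = browser_ops  # operations implemented by driving the UI

    def backend_for(self, operation: str) -> str:
        """Prefer the API; the browser is the fallback, not the first choice."""
        if operation in self.api_ops:
            return "api"
        if operation in self.browser_ops:
            return "browser"
        raise KeyError(f"unsupported operation: {operation}")

    def call(self, operation: str, **kwargs: Any) -> Any:
        backend = self.backend_for(operation)
        handlers = self.api_ops if backend == "api" else self.browser_ops
        return handlers[operation](**kwargs)
```

Because the routing lives in one place, moving an operation from browser to API when a platform ships a new endpoint is a one-line change that the agent never notices.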

Anti-bot and scale. Some platforms actively fight automation. Hardened launch flags for headless Chromium handle the basics; for the harder cases we can route a session through a managed scraping browser with residential egress instead of local Chromium. The agent-facing tool is identical either way.
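
Keeping the tool identical mostly comes down to isolating the choice of browser in one function. A minimal Python sketch follows, assuming the managed browser exposes a Chrome DevTools Protocol endpoint, which is common but not universal. The function name and the `remote_cdp_url` parameter are hypothetical; `playwright` is the object yielded by `sync_playwright()`.

```python
from typing import Optional


def open_browser(playwright, remote_cdp_url: Optional[str] = None):
    """Return a browser: the managed remote one if configured, else local headless Chromium.

    Assumes the managed service speaks CDP; other services may need a
    different connection method.
    """
    if remote_cdp_url:
        # Attach to a browser running elsewhere, with its own egress.
        return playwright.chromium.connect_over_cdp(remote_cdp_url)
    # Default: launch Chromium locally, headless.
    return playwright.chromium.launch(headless=True)
```

Everything downstream (contexts, pages, the tools themselves) works against the returned browser object and does not need to know which branch was taken.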

The honest trade-offs

This method is powerful but not free, and the costs are worth stating plainly:

Fragility. A UI is an undocumented, unversioned contract. When the platform ships a redesign, selectors break. Real APIs change far less often. Mitigation is monitoring and fast selector maintenance, not pretending it won't happen.
Latency and cost. Driving a browser is heavier than an HTTP call: more time, more compute. For high-volume paths a real API wins decisively, which is exactly why we use one when it's there.
Terms of service. Automating a platform on a customer's own behalf, with the customer's own credentials and consent, is a different posture from scraping a third party. We treat that line seriously and integrate only on behalf of the account holder.

None of these are reasons not to do it. They are reasons to engineer it properly: sessions, encryption, monitoring, and a real API wherever one is offered.

The payoff

Reframing integration around the browser changes what "integrable" means. The set of systems an AI agent can act inside is no longer limited to the ones with a developer portal and an API key. It expands to every system a human can operate, which is to say, all of them.

Across the scheduling, salon, and clinic platforms our customers actually run on, with this method we have not yet found a platform we cannot integrate with our AI-powered agents. If your staff can do it in a browser, our agents can do it too, on your behalf, around the clock. The API was never the point. The capability was.