Deprecated: This function will be removed in the future.
Extracts a structured object from a web page.
export declare function extractObjectFromPage(
  page: Page,
  options: {
    label: string;
    entityName: string;
    entitySchema: SimpleObjectSchema;
    strategy?: ImageStrategy | HtmlStrategy;
    prompt?: string;
    optionalPropertiesInvalidator?: (
      result: Record<string, string | null> | null
    ) => string[];
    variantKey?: string;
    apiKey?: string;
  }
): Promise<Record<string, string | null> | null>;
Examples
import { extractObjectFromPage } from "@intuned/browser/optimized-extractors";

// Assumes `page` is an existing Playwright Page instance.
await page.goto(
  "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"
);

const book = await extractObjectFromPage(page, {
  entityName: "book",
  label: "book-extraction",
  entitySchema: {
    type: "object",
    required: ["name", "price", "reviews"],
    properties: {
      name: {
        type: "string",
        description: "book name",
      },
      price: {
        type: "string",
        description: "book price",
      },
      reviews: {
        type: "string",
        description: "Number of reviews",
      },
    },
  },
});

console.log(book);
// output:
// { name: 'A Light in the Attic', price: '£51.77', reviews: '0' }
Arguments
page
The Playwright Page object from which to extract the data.
options.label
A label for this extraction process, used for billing and monitoring.
options.entityName
The name of the entity being extracted. Must be 1–50 characters long and can only contain letters, digits, periods, underscores, and hyphens.
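The entity-name rule (1–50 characters; letters, digits, periods, underscores, and hyphens) can be checked before calling the extractor. The following is a minimal sketch, not part of `@intuned/browser`: `isValidEntityName` is a hypothetical helper, and it assumes "letters" means ASCII letters, which this page does not state.

```typescript
// Hypothetical helper (not part of @intuned/browser): checks the documented
// entityName rule of 1–50 characters drawn from letters, digits, ".", "_", "-".
// Assumption: "letters" means ASCII letters; the source does not say.
const ENTITY_NAME_PATTERN = /^[A-Za-z0-9._-]{1,50}$/;

function isValidEntityName(name: string): boolean {
  return ENTITY_NAME_PATTERN.test(name);
}

console.log(isValidEntityName("book")); // true
console.log(isValidEntityName("book listing")); // false: contains a space
console.log(isValidEntityName("")); // false: empty string
```

Validating locally like this may surface a bad name earlier than the extraction call would.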
options.entitySchema
The schema of the entity being extracted.
options.strategy
Optional. The strategy to use for extraction. If not provided, the HTML strategy with Claude Haiku is used.
options.prompt
Optional. A prompt to guide the extraction process.
options.optionalPropertiesInvalidator
Optional. A function to invalidate optional properties. It receives the extracted result (or null) and returns an array of strings.
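Per the signature, the invalidator receives the extracted result (or null) and returns a string array. The sketch below assumes the returned strings are the names of optional properties to treat as invalid; that is an interpretation of the signature and is not confirmed by this page. `invalidateBlankOptionals` and the property names `author` and `isbn` are hypothetical.

```typescript
type ExtractionResult = Record<string, string | null> | null;

// Hypothetical optional properties for a book entity.
const OPTIONAL_PROPERTIES = ["author", "isbn"];

// Assumption: the returned strings name the optional properties to treat as invalid.
function invalidateBlankOptionals(result: ExtractionResult): string[] {
  if (result === null) return [];
  return OPTIONAL_PROPERTIES.filter((key) => {
    const value = result[key];
    // Flag values that are present but blank; null or missing is left alone.
    return typeof value === "string" && value.trim() === "";
  });
}

const flagged = invalidateBlankOptionals({
  name: "A Light in the Attic",
  author: "  ",
  isbn: null,
});
console.log(flagged); // an array containing only "author"
```

Under that assumption, the function would be passed as `optionalPropertiesInvalidator: invalidateBlankOptionals` in the options object.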
options.variantKey
Optional. A variant key for the extraction process.
options.apiKey
Optional. An API key for AI extraction. Extractions made with your API key won’t be billed to your account.
Returns: Promise<Record<string, string | null> | null>
A promise that resolves to the extracted object, or to null.
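Because the declared return type allows both a null result and null property values, callers may want to narrow the result before using it. The following is a sketch under that reading of the signature; `requireFields` is a hypothetical helper, not part of the library, and the sample object stands in for a real extraction result.

```typescript
type ExtractionResult = Record<string, string | null> | null;

// Hypothetical helper: narrows an extraction result to non-null strings for the
// requested keys, throwing if the result is null or a requested field is absent.
function requireFields<K extends string>(
  result: ExtractionResult,
  keys: readonly K[]
): Record<K, string> {
  if (result === null) {
    throw new Error("Extraction returned null");
  }
  const out = {} as Record<K, string>;
  for (const key of keys) {
    const value = result[key];
    if (typeof value !== "string") {
      throw new Error(`Missing value for "${key}"`);
    }
    out[key] = value;
  }
  return out;
}

// Stand-in for the value resolved by extractObjectFromPage.
const extracted: ExtractionResult = {
  name: "A Light in the Attic",
  price: "£51.77",
  reviews: null,
};

const checked = requireFields(extracted, ["name", "price"]);
console.log(checked.name, checked.price);
```

Throwing on missing fields is one design choice; returning a default value or skipping the record would be equally reasonable depending on the pipeline.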