ocr()

Performs Optical Character Recognition (OCR) on an image to extract text.

function ocr(params: OCRClientParams): { blocks: Promise<{ bbox?: [number, number, number, number]; confidence?: number; text: string }[]>; blockStream: AsyncGenerator<{ bbox?: [number, number, number, number]; confidence?: number; text: string }[]>; stats: Promise<{ detectionTime?: number; recognitionTime?: number; totalTime?: number } | undefined> }

Parameters

| Name | Type | Required? | Description |
| --- | --- | --- | --- |
| params | OCRClientParams | | The OCR parameters |

OCRClientParams

| Field | Type | Required? | Description |
| --- | --- | --- | --- |
| image | string \| Buffer | | The image to process: either a file path string or a Buffer containing the image data |
| modelId | string | | The ID of the loaded OCR model to use |
| options | OCROptions | | Additional OCR processing options |
| stream | boolean | | When true, text blocks are yielded incrementally via blockStream. Defaults to false |

Returns

{ blocks: Promise<{ bbox?: [number, number, number, number]; confidence?: number; text: string }[]>; blockStream: AsyncGenerator<{ bbox?: [number, number, number, number]; confidence?: number; text: string }[]>; stats: Promise<{ detectionTime?: number; recognitionTime?: number; totalTime?: number } | undefined> }
| Field | Type | Description |
| --- | --- | --- |
| blocks | Promise | Resolves with all detected text blocks, each containing the recognized text, bounding box coordinates, and confidence score |
| blockStream | AsyncGenerator | Yields arrays of text blocks as they are detected in streaming mode |
| stats | Promise | Resolves with OCR timing metrics (detection time, recognition time, total time), or undefined if not available |
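
Because stats may resolve to undefined, callers probably want to guard before reading any timing field. The sketch below is one way to do that. The helper name summarizeOcr and the type aliases OcrBlock and OcrStats are hypothetical and not part of the SDK. The shapes are copied from the return type documented above, and the helper accepts any object with matching blocks and stats promises.

```typescript
// Shapes copied from the documented return type of ocr().
// OcrBlock and OcrStats are illustrative aliases, not SDK exports.
type OcrBlock = { bbox?: [number, number, number, number]; confidence?: number; text: string };
type OcrStats = { detectionTime?: number; recognitionTime?: number; totalTime?: number };

// Hypothetical helper: waits for both promises and builds a one-line summary.
async function summarizeOcr(result: {
  blocks: Promise<OcrBlock[]>;
  stats: Promise<OcrStats | undefined>;
}): Promise<string> {
  const [blocks, stats] = await Promise.all([result.blocks, result.stats]);

  // stats can be undefined, and every timing field is optional, so check both.
  const timing =
    stats?.totalTime !== undefined ? `totalTime=${stats.totalTime}` : "no timing available";

  return `${blocks.length} block(s), ${timing}`;
}
```

Assuming the object returned by ocr() matches the documented type, it could be passed in directly, for example summarizeOcr(ocr({ modelId, image })).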

Example

// Non-streaming mode (default) - get all blocks at once
const { blocks } = ocr({ modelId, image: "/path/to/image.png" });
for (const block of await blocks) {
  console.log(block.text, block.bbox, block.confidence);
}

// Streaming mode - process blocks as they arrive
const { blockStream } = ocr({ modelId, image: imageBuffer, stream: true });
for await (const blocks of blockStream) {
  console.log("Detected:", blocks);
}
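
In streaming mode each iteration yields an array of blocks, so collecting the full result means flattening those arrays. The following is a minimal sketch of that pattern with an optional confidence filter. The helper collectConfidentBlocks and the alias TextBlock are hypothetical names. The threshold is assumed to be on whatever scale the model reports for confidence, which this page does not specify.

```typescript
// Shape copied from the documented block type; TextBlock is an illustrative alias.
type TextBlock = { bbox?: [number, number, number, number]; confidence?: number; text: string };

// Hypothetical helper: drains a block stream into one flat array,
// dropping blocks whose confidence is below minConfidence.
async function collectConfidentBlocks(
  blockStream: AsyncIterable<TextBlock[]>,
  minConfidence: number,
): Promise<TextBlock[]> {
  const kept: TextBlock[] = [];
  for await (const batch of blockStream) {
    for (const block of batch) {
      // confidence is optional, so blocks without a score are kept rather than discarded.
      if (block.confidence === undefined || block.confidence >= minConfidence) {
        kept.push(block);
      }
    }
  }
  return kept;
}
```

Keeping unscored blocks is a design choice for this sketch. A stricter caller could treat a missing confidence as a reason to drop the block instead.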
