Extract text from any image — printed documents, screenshots, scanned pages, signs — using Tesseract.js OCR running entirely in your browser. No upload, no server.
OCR (Optical Character Recognition) converts the text visible in an image into machine-readable characters. Tesseract.js, an open-source OCR engine, works in modern browsers, supports 100+ languages, and runs offline once the language model has been downloaded. It is privacy-friendly because nothing leaves your device.
Click the file picker and choose an image (JPG, PNG, WEBP, or BMP), then click Run OCR. The first run downloads the language model (about 10 MB), which is cached for later runs. Tesseract analyses the image, recognises the characters, and returns the extracted text, which you can copy with one click.
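For readers who want to reproduce the Run OCR step in their own page, here is a minimal sketch. It assumes the Tesseract.js v5+ style API, where `createWorker(lang)` resolves to a worker exposing `recognize` and `terminate`. The names `OcrWorker`, `WorkerFactory`, and `extractText` are hypothetical helpers introduced for illustration and are not part of Tesseract.js or of this tool's code. The worker factory is passed in as a parameter so the sketch does not depend on how the library is loaded.

```typescript
// Minimal shape of the worker this sketch relies on (an assumption modelled on
// the Tesseract.js v5+ worker, which exposes `recognize` and `terminate`).
interface OcrWorker {
  recognize(image: unknown): Promise<{ data: { text: string } }>;
  terminate(): Promise<unknown>;
}

// A function that creates a worker for a language code such as "eng".
type WorkerFactory = (lang: string) => Promise<OcrWorker>;

// Hypothetical helper: recognise the text in one image and release the worker.
async function extractText(
  createWorker: WorkerFactory,
  image: unknown,
  lang: string = "eng",
): Promise<string> {
  // The first call for a language is the one that triggers the model download.
  const worker = await createWorker(lang);
  try {
    const { data } = await worker.recognize(image);
    return data.text.trim();
  } finally {
    // Always free the worker, even if recognition throws.
    await worker.terminate();
  }
}
```

In a browser page, one plausible way to wire this up would be to import `createWorker` from `tesseract.js` and call `extractText(createWorker, file)` with the `File` chosen in the file picker. Depending on the library version, a type cast on the factory may be needed.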
Use it to extract text from scanned documents, photos of receipts, screenshots where you want the text rather than the image, book pages you've photographed, and images of non-editable PDF pages.
OCR works best on clean, high-resolution printed text. Handwriting, low-resolution images, blurry scans, and non-uniform backgrounds reduce accuracy. For maximum quality, scan at 300 DPI minimum and ensure even lighting.
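The 300 DPI guideline can be turned into a quick check when the physical size of the page is known. The sketch below is plain arithmetic with hypothetical helper names (`requiredPixels`, `effectiveDpi`); it is not part of Tesseract.js, and the page dimensions in inches are something you must supply yourself.

```typescript
// Pixel dimensions needed to reach a target DPI for a page of known size.
function requiredPixels(
  widthInches: number,
  heightInches: number,
  dpi: number = 300,
): { width: number; height: number } {
  return {
    width: Math.round(widthInches * dpi),
    height: Math.round(heightInches * dpi),
  };
}

// Effective DPI of an image, taking the weaker of the two axes.
function effectiveDpi(
  pixelWidth: number,
  pixelHeight: number,
  widthInches: number,
  heightInches: number,
): number {
  return Math.min(pixelWidth / widthInches, pixelHeight / heightInches);
}

// US Letter (8.5 x 11 in) scanned at 300 DPI:
console.log(requiredPixels(8.5, 11)); // { width: 2550, height: 3300 }

// A 1275 x 1650 px image of the same page is only 150 DPI, below the guideline:
console.log(effectiveDpi(1275, 1650, 8.5, 11) >= 300); // false
```

An image that falls short of the target is usually better rescanned or rephotographed at higher resolution than enlarged afterwards, since upscaling does not add detail.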
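Handwriting support is limited: Tesseract handles clean printed text best. For handwriting, use specialised tools or human transcription.
The language model does not need to be downloaded again on later runs: it is cached in browser storage after the first run.
Your images are never sent to a server: Tesseract.js runs in your browser, and nothing is uploaded.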