Build an OCR Service in Node.js & Express Using Tesseract.js 🧠

Apr 18, 2025

Optical Character Recognition (OCR) is a powerful tool for extracting text from images and PDFs. In this guide, we’ll build a modular OCR microservice using Node.js, Express, and Tesseract.js, with support for both image and PDF uploads. We’ll follow best practices for structure, error handling, and file management.

Table of Contents

  1. Introduction
  2. Project Setup
  3. Project Structure
  4. Installing Dependencies
  5. Setting Up the Express Server
  6. Designing the Modular Architecture
  7. Implementing the OCR Service
  8. Creating Controllers
  9. Defining Routes
  10. Testing the Service

1. Introduction

Optical Character Recognition (OCR) is a powerful technology that enables the extraction of text from images, scanned documents, or handwritten notes. By converting visual information into machine-readable text, OCR opens up a wide range of possibilities, from automating data entry to making printed documents searchable and accessible.
OCR plays a vital role in bridging the gap between the physical and digital worlds. In this article, we’ll explore how to build a simple yet effective OCR service using Node.js, Express, and the Tesseract OCR engine.

Tesseract.js brings the power of Tesseract OCR to Node.js, making it easy to integrate OCR into your web services.

We’ll build a REST API that accepts image or PDF uploads and returns the extracted text, using a clean, maintainable architecture.

You can find the complete source code for this OCR service on GitHub. Feel free to explore, clone, or modify the project to suit your needs; it’s completely open source! If you find it helpful or end up using it in your own projects, a ⭐️ on the repo would be greatly appreciated. Your support helps keep the project alive and encourages further development!

2. Project Setup

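mkdir ocr-service
cd ocr-service
npm init -y
# The code in this guide uses ES module syntax (import/export and top-level await),
# so mark the package as an ES module
npm pkg set type=module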

3. Project Structure

ocr-service/
├── src/
│   ├── controllers/
│   │   └── ocrController.js
│   ├── routes/
│   │   └── ocrRoutes.js
│   ├── services/
│   │   └── ocrService.js
│   ├── temp/              # Temporary files for processing
│   ├── server.js
│   └── eng.traineddata    # Optional: Language data
├── .gitignore
├── package.json
└── README.md

4. Installing Dependencies

We’ll need the following:

  • express – Web framework
  • multer – For handling file uploads
  • tesseract.js – OCR engine
  • sharp – Image processing
  • dotenv – Environment config
  • cors – CORS support
  • nodemon – Auto-restart for dev
  • child_process – Built into Node.js (nothing to install); used to run pdftoppm for PDF conversion

Install with:

npm install express multer tesseract.js sharp dotenv cors
npm install --save-dev nodemon

Note: PDF support requires pdftoppm.
macOS: brew install poppler
Ubuntu: sudo apt-get install poppler-utils

5. Setting Up the Express Server

src/server.js

import express from 'express';
import cors from 'cors';
import dotenv from 'dotenv';
import ocrRoutes from './routes/ocrRoutes.js';

dotenv.config();

const app = express();
const port = process.env.PORT || 3000;

app.use(cors());
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

app.use('/', ocrRoutes);

// Centralized error handler
app.use((err, req, res, next) => {
  console.error('Error caught by central handler:', err.stack);
  res.status(500).json({ error: err.message || 'Something went wrong!' });
});

app.listen(port, () => {
  console.log(`🧠 OCR server listening at http://localhost:${port}`);
});
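
The central handler above answers every failure with status 500. As a rough sketch of how a status could be chosen per error instead, the helper below maps an error object to a status code and a JSON body. The `toErrorResponse` name and the `status` property on errors are assumptions of this sketch, not something the service code sets; `LIMIT_FILE_SIZE` is the code multer attaches to its error when an upload exceeds a configured size limit.

```javascript
// Sketch: derive an HTTP status and JSON body from an error.
// `err.status` is a hypothetical property that your own code would set.
function toErrorResponse(err) {
  // multer reports oversized uploads with code 'LIMIT_FILE_SIZE'
  if (err && err.code === 'LIMIT_FILE_SIZE') {
    return { status: 413, body: { error: 'Uploaded file is too large' } };
  }
  // Accept only sensible HTTP error codes; otherwise fall back to 500
  const hasValidStatus =
    err && Number.isInteger(err.status) && err.status >= 400 && err.status <= 599;
  const status = hasValidStatus ? err.status : 500;
  const message = (err && err.message) || 'Something went wrong!';
  return { status, body: { error: message } };
}

// Possible use inside the central handler:
// app.use((err, req, res, next) => {
//   const { status, body } = toErrorResponse(err);
//   res.status(status).json(body);
// });
```

Keeping the mapping in a plain function means it can be unit-tested without starting Express.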

6. Designing the Modular Architecture

We break our logic into:

  • Routes: Handle endpoints & uploads
  • Controllers: Handle requests/responses
  • Services: Core logic for OCR & file processing

This promotes clean separation of concerns, testability, and scalability.
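
To make the testability point concrete, here is a minimal sketch of a controller that receives its service as an argument, so a test can pass in a stub instead of running Tesseract. The names `makeOcrHandler` and `makeFakeResponse` are hypothetical and are not part of the project code in the following sections; the only assumption about the service is that it exposes `performOCR(buffer)` returning a promise of text.

```javascript
// Hypothetical factory: builds an Express-style handler around any object
// that exposes performOCR(buffer) -> Promise<string>.
function makeOcrHandler(service) {
  return async function handleOcr(req, res, next) {
    if (!req.file) {
      return res.status(400).json({ error: 'No file uploaded' });
    }
    try {
      const text = await service.performOCR(req.file.buffer);
      return res.json({ text });
    } catch (err) {
      // Delegate failures to the central error middleware
      return next(err);
    }
  };
}

// Minimal stand-in for Express's response object, for tests only.
function makeFakeResponse() {
  const res = { statusCode: 200, body: undefined };
  res.status = (code) => { res.statusCode = code; return res; };
  res.json = (payload) => { res.body = payload; return res; };
  return res;
}
```

A test can then call `makeOcrHandler({ performOCR: async () => 'hello' })` with a fake request and inspect `res.body`, with no OCR engine or HTTP server involved.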

7. Implementing the OCR Service

src/services/ocrService.js

import { createWorker } from 'tesseract.js';
import sharp from 'sharp';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';
import { execFile } from 'child_process';
import { promisify } from 'util';
import { randomUUID } from 'crypto';

const execFileAsync = promisify(execFile);
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

// Scratch directory for the intermediate PDF and page-image files
const tempDir = path.join(__dirname, '../temp');
await fs.mkdir(tempDir, { recursive: true });

// OCR a single image held in memory
const performOCR = async (imageBuffer) => {
  const worker = await createWorker('eng');
  try {
    const { data: { text } } = await worker.recognize(imageBuffer);
    return text;
  } finally {
    await worker.terminate();
  }
};

// OCR a PDF: render each page to PNG with pdftoppm, then OCR page by page
const performPDFOCR = async (pdfBuffer) => {
  let worker = null;
  const tempFiles = [];
  // Unique per-call ID so concurrent requests never share file names
  // (Date.now() alone can collide when two uploads arrive in the same millisecond)
  const jobId = randomUUID();

  try {
    const pdfPath = path.join(tempDir, `temp_${jobId}.pdf`);
    tempFiles.push(pdfPath);
    await fs.writeFile(pdfPath, pdfBuffer);

    // execFile passes arguments directly, with no shell quoting involved
    const outputPrefix = path.join(tempDir, `page_${jobId}`);
    await execFileAsync('pdftoppm', ['-png', '-r', '300', pdfPath, outputPrefix]);

    // pdftoppm zero-pads page numbers, so a plain sort keeps page order
    const files = await fs.readdir(tempDir);
    const pageFiles = files
      .filter(file => file.startsWith(path.basename(outputPrefix)))
      .sort();
    // Register every page image up front so cleanup also covers a mid-loop failure
    tempFiles.push(...pageFiles.map(file => path.join(tempDir, file)));

    worker = await createWorker('eng');
    let extractedText = '';

    for (const pageFile of pageFiles) {
      const imageBuffer = await fs.readFile(path.join(tempDir, pageFile));

      // Light sharpening before recognition
      const processedImage = await sharp(imageBuffer).sharpen().toBuffer();
      const { data: { text } } = await worker.recognize(processedImage);
      extractedText += text + '\n\n';
    }

    return extractedText.trim();
  } finally {
    if (worker) await worker.terminate();
    // Best-effort cleanup; a file that is already gone is not an error
    for (const file of tempFiles) {
      try { await fs.unlink(file); } catch {}
    }
  }
};

export default { performOCR, performPDFOCR };
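
The PDF path has two small pieces of logic that are easy to isolate: building the pdftoppm command line and picking out the generated page images in page order. The sketch below shows one way to write both as pure functions. The helper names are hypothetical, and the file-name pattern `<prefix>-<number>.png` is an assumption based on how pdftoppm names its PNG output.

```javascript
// Sketch: build an argument array suitable for execFile('pdftoppm', args).
// -png selects PNG output and -r sets the resolution in DPI.
function buildPdftoppmArgs(pdfPath, outputPrefix, dpi = 300) {
  return ['-png', '-r', String(dpi), pdfPath, outputPrefix];
}

// Sketch: pick the page images for one prefix and order them by page number.
// Assumes names of the form "<prefix>-<number>.png".
function selectPageFiles(fileNames, prefix) {
  const extension = '.png';
  const pages = [];
  for (const name of fileNames) {
    if (!name.startsWith(prefix + '-') || !name.endsWith(extension)) continue;
    const numberPart = name.slice(prefix.length + 1, name.length - extension.length);
    if (!/^\d+$/.test(numberPart)) continue; // skip anything that is not a page number
    pages.push({ name, page: Number(numberPart) });
  }
  // Numeric sort, so the order does not depend on zero-padding
  pages.sort((a, b) => a.page - b.page);
  return pages.map((entry) => entry.name);
}
```

Passing arguments as an array avoids shell quoting problems with unusual file paths, and sorting on the parsed page number keeps pages in order even if the numbers were not zero-padded.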

8. Creating Controllers

src/controllers/ocrController.js

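import ocrService from '../services/ocrService.js';

// Simple liveness endpoint
const handleHealthCheck = (req, res) => {
  res.json({ message: 'OCR server is running!' });
};

// Run OCR on the uploaded file, choosing the pipeline by MIME type
const handleOCRRequest = async (req, res, next) => {
  if (!req.file) {
    return res.status(400).json({ error: 'No file uploaded' });
  }
  // PDFs go through pdftoppm first; everything else is treated as an image
  const isPdf = req.file.mimetype === 'application/pdf';
  try {
    const text = isPdf
      ? await ocrService.performPDFOCR(req.file.buffer)
      : await ocrService.performOCR(req.file.buffer);
    res.json({ text });
  } catch (error) {
    // Log the underlying cause; the client only sees a generic message
    console.error('OCR processing failed:', error);
    next(new Error(isPdf ? 'PDF OCR processing failed' : 'Image OCR processing failed'));
  }
};

export { handleHealthCheck, handleOCRRequest };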
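
Because images and PDFs arrive on the same endpoint, the handler needs some way to tell them apart, and the MIME type on an upload is whatever the client declares. As a hedged sketch, the hypothetical helper below classifies a buffer by its leading bytes instead. It only recognizes PDF, PNG, and JPEG signatures at the very start of the file, which is an assumption that will not hold for every valid input.

```javascript
// Sketch: classify an upload by its leading bytes rather than its declared type.
const FILE_SIGNATURES = [
  { kind: 'pdf', bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { kind: 'image', bytes: [0x89, 0x50, 0x4e, 0x47] },     // PNG
  { kind: 'image', bytes: [0xff, 0xd8, 0xff] },           // JPEG
];

// Returns 'pdf', 'image', or 'unknown' for a Buffer (or any Uint8Array).
function detectUploadKind(buffer) {
  for (const { kind, bytes } of FILE_SIGNATURES) {
    const matches =
      buffer.length >= bytes.length && bytes.every((byte, i) => buffer[i] === byte);
    if (matches) return kind;
  }
  return 'unknown';
}
```

A handler could call `ocrService.performPDFOCR` when the result is `'pdf'`, `ocrService.performOCR` when it is `'image'`, and reject anything else with a 415 response.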

9. Defining Routes

src/routes/ocrRoutes.js

import express from 'express';
import multer from 'multer';
import { handleHealthCheck, handleOCRRequest } from '../controllers/ocrController.js';

const router = express.Router();
const upload = multer({ storage: multer.memoryStorage() });

router.get('/', handleHealthCheck);
router.post('/ocr', upload.single('image'), handleOCRRequest);

export default router;
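
With memory storage, every upload is buffered in RAM, so it can be worth capping the size and rejecting unsupported types before the controller runs. The sketch below uses multer's `fileFilter` callback and `limits.fileSize` option; the 10 MB cap, the list of allowed types, and the name `ocrFileFilter` are assumptions chosen for illustration.

```javascript
// Hypothetical limits for this sketch; tune them to your own needs.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024; // 10 MB
const ALLOWED_MIME_TYPES = new Set(['image/png', 'image/jpeg', 'application/pdf']);

// multer calls fileFilter(req, file, cb): cb(null, true) accepts the file,
// cb(null, false) skips it, and cb(error) fails the request.
function ocrFileFilter(req, file, cb) {
  if (ALLOWED_MIME_TYPES.has(file.mimetype)) {
    cb(null, true);
  } else {
    cb(new Error(`Unsupported file type: ${file.mimetype}`));
  }
}

// Possible wiring in ocrRoutes.js:
// const upload = multer({
//   storage: multer.memoryStorage(),
//   limits: { fileSize: MAX_UPLOAD_BYTES },
//   fileFilter: ocrFileFilter,
// });
```

Errors raised here reach the centralized error handler, the same as errors passed to `next()` in the controller.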

10. Testing the Service

You can test the OCR service using curl or Postman:

Image or PDF Upload

curl -X POST http://localhost:3000/ocr \
  -F "image=@/path/to/your/file.png"

curl -X POST http://localhost:3000/ocr \
  -F "image=@/path/to/your/file.pdf"

Sample Response:

{
  "text": "Extracted text from your PDF or image..."
}
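
The same request can also be sent from JavaScript. The sketch below relies on the `fetch`, `FormData`, and `Blob` globals available in Node.js 18+ and in browsers; the function names and the base URL you pass in are assumptions, while the `image` field name and the `text` and `error` response keys come from the route and responses shown in this guide.

```javascript
// Sketch: build the multipart body for POST /ocr.
function buildOcrForm(fileBytes, fileName, mimeType) {
  const form = new FormData();
  // 'image' matches the field name passed to upload.single('image')
  form.append('image', new Blob([fileBytes], { type: mimeType }), fileName);
  return form;
}

// Sketch: send the form and return the extracted text.
async function requestOcr(baseUrl, form) {
  const response = await fetch(new URL('/ocr', baseUrl), { method: 'POST', body: form });
  const payload = await response.json();
  if (!response.ok) {
    throw new Error(payload.error || `OCR request failed with status ${response.status}`);
  }
  return payload.text;
}
```

For example, after reading a file into a buffer you could call `requestOcr('http://localhost:3000', buildOcrForm(buffer, 'file.png', 'image/png'))`.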

Error Handling

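Errors passed to next() are handled by the centralized middleware and returned as JSON with status 500. A request without a file is rejected directly by the controller with status 400.

{
  "error": "PDF OCR processing failed"
}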

Environment Variables

You can use a .env file to configure settings like PORT, future API keys, etc.
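
As a small illustration, a `.env` file containing a line such as `PORT=4000` would be picked up by `dotenv.config()` and exposed on `process.env`. The sketch below shows one way to validate that value before the server starts; `loadConfig` is a hypothetical helper, and the default of 3000 mirrors the fallback in server.js.

```javascript
// Sketch: read and validate settings from an environment-like object.
function loadConfig(env = process.env) {
  const rawPort = env.PORT;
  if (rawPort === undefined || rawPort === '') {
    return { port: 3000 }; // same default as server.js
  }
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${rawPort}`);
  }
  return { port };
}
```

Failing fast on a malformed value tends to be easier to debug than a server that silently listens on an unexpected port.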

Conclusion

You now have a working OCR microservice that:

  • Accepts images and PDFs
  • Extracts text using Tesseract.js
  • Follows a modular and clean architecture
  • Cleans up temp files automatically

Written by Aaleen Mirza

Life-long learner | Enthusiastic | Software Developer