Categories: News

iText launches iText pdfOCR, a powerful open-source product enabling text recognition in scanned documents and conversion into editable PDFs

LONDON, UK – Media
OutReach – 30 June 2020 – iText Group NV, a globally recognized thought-leader and
innovator in PDF libraries and solutions, today announced the launch of iText
pdfOCR, the newest addition to their award-winning software offering. 

 

iText
pdfOCR, which is part of the renowned iText 7 PDF SDK, offers Optical
Character Recognition (OCR) functionality to convert printed text in scanned
documents and images into a fully searchable PDF/A-3u compliant format (PDF
version 1.7) and make accessing those texts easier and faster. Without
machine-readable text, printed or scanned documents cannot be searched, indexed
or interpreted. Logical follow-up actions could be data extraction with iText pdf2Data, secure content
redaction with iText pdfSweep, or multilingual
document recreation with iText pdfCalligraph. With repurposing data
with the low-code document generator iText DITO® often being the final
cherry on the cake.

The iText
pdfOCR add-on is built on the Tesseract OCR engine technology. Tesseract
supports over 100 languages and was originally developed by Hewlett-Packard
(’85), and was released under the Apache open source license in 2005. Since
2006, its development has been sponsored by Google.

 

“With COVID-19
urging companies to accelerate their digital transformation projects,
organizations are forced to explore new ways of accessing and managing their
data — existing and new. Being a leader in the digital documents space, we’re
pleased to be at the forefront of this new era. As such, I am very proud to
announce the latest addition to our PDF library for today’s new world: thanks
to the OCR capabilities of iText pdfOCR many new opportunities
will open up for users and enterprises that want to maximize their data
potential.” Yeonsu Kim, CEO at iText Group NV stated.

 

“Staying true
to our open-source roots, we’ve decided to build iText
pdfOCR upon the open-source Tesseract OCR Engine. With this, we wish
to reconfirm our positioning as an open-source company – a value which is
appreciated by our millions of users and clients.”

 

“With this new
addition to our PDF library, developers will now be able to leverage data
locked away in documents which until now weren’t accessible. Our latest product
enables them to enlarge their digital workflow capabilities by accessing the
data buried in scanned files and deploy it for any action or purpose they or
their end-user would like.” Tony Van den Zegel, VP of Products & Marketing
at iText Group NV and General Manager at iText Software Belgium, said.

 

The applications
of iText
pdfOCR are various: for instance, archiving of historical documents,
translations of legal documents, automatic data entry while processing all
sorts of physical applications or claims, and sorting of otherwise not editable
printed or scanned documents.

 

Please tune in
for live demos on 9 July 2020. More information can be found here.

Miscw.com

Recent Posts

Cathay expands lifestyle offerings in Southeast Asia with Mitsui Mall Group partnership in Malaysia

KUALA LUMPUR, MALAYSIA - Media OutReach Newswire - 24 March 2025 - Premium travel lifestyle…

3 hours ago

The United Laboratories and Novo Nordisk announce exclusive license agreement for UBT251, a GLP-1/GIP/glucagon triple receptor agonist

GUANGDONG, CHINA AND BAGSVÆRD, DENMARK - Media OutReach Newswire - 24 March 2025 - The…

4 hours ago

Contractor Sentiment Generally Positive as the Worst of Price Pressures Ease, According to Office Fit Out Vendors Across Asia Pacific

Six Greater China cities are ranked in the top 20 most expensive office fit out…

5 hours ago

SonicWall Partner Focus Drives Key Business Momentum in Transformational Year

SonicWall’s Wide-ranging Platform of Cybersecurity Solutions and Growth with Managed Security Services Providers (MSSPs) Underpin…

6 hours ago

Lenovo Hybrid AI Advantage with NVIDIA Boosts Business Productivity and Efficiency with New Scalable Agentic AI Solutions

New Lenovo agentic AI and hybrid AI factory platforms include NVIDIA Blackwell Ultra support and…

6 hours ago

CTF Life Introduces “GBA MediAccess” Outpatient Insurance Plan

Leverages CTF Group’s Diverse Conglomerate to Address Medical Needs of Families in the Guangdong–Hong Kong–Macau…

6 hours ago