The NetBSD Project

CVS log for pkgsrc/graphics/tesseract/Makefile

Default branch: MAIN


Revision 1.67, Tue Apr 25 14:57:05 2023 UTC (5 weeks, 2 days ago) by wiz
Branch: MAIN
CVS Tags: HEAD
Changes since 1.66: +3 -1 lines

tesseract: require gcc 8 for std::filesystem

Revision 1.66, Wed Apr 19 08:10:27 2023 UTC (6 weeks, 1 day ago) by adam
Branch: MAIN
Changes since 1.65: +2 -1 lines

revbump after textproc/icu update

Revision 1.65, Sun Apr 2 12:36:27 2023 UTC (8 weeks, 4 days ago) by adam
Branch: MAIN
Changes since 1.64: +2 -2 lines

tesseract: updated to 5.3.1

5.3.1

Update README.md
Fix FP division by zero
Fix linkage of icu and pango
Fix build with gcc 13 by including
msvc debug: fix wrong lib name in generated pkgconfig file
Fix libdir in tesseract.pc from CMake
Replace 'can not' by 'cannot'
Readme: Link to list of supported languages
Improve the DebugDump output by slightly adjusting the format.

Revision 1.64, Mon Jan 30 07:57:28 2023 UTC (4 months ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2023Q1-base, pkgsrc-2023Q1
Changes since 1.63: +10 -7 lines

tesseract: updated to 5.3.0

5.3.0

Fix memory issues in ScrollView::MessageReceiver
autotools: Add rule for svpaint executable
Replace call of exit function by return statement in main function
Fix the build on CodeQL/Analyze
CI: Remove Ubuntu 18.04
configure.ac: fix build on aarch64_be
SW CI: Add paths filter
Create .mailmap
Fix tesseract.pc from cmake to match autotools
Update README.md
Fixed 2 errors
fix issue 3940 - remove colormap before thresholding
Update upload-artifact action
Update checkout action to version 3
Fix Markdownlint
Fix broken links in CONTRIBUTING.md
pdfrenderer.cpp: Ignore non-text blocks
lstm.train: allow .box from .raw.png too
Fix a number of performance issues
Fix training tools for legacy engine
Fix function tesseract::WriteFeature
Modernize function ObjectCache::DeleteUnusedObjects
More fixes for issue

Revision 1.63, Sun Jan 29 21:16:47 2023 UTC (4 months ago) by ryoon
Branch: MAIN
Changes since 1.62: +2 -2 lines

*: Recursive revbump from graphics/freetype2

Revision 1.62, Tue Jan 3 17:36:27 2023 UTC (4 months, 3 weeks ago) by wiz
Branch: MAIN
Changes since 1.61: +2 -2 lines

*: recursive bump for tiff shlib major bump

Revision 1.61, Wed Nov 23 16:20:23 2022 UTC (6 months, 1 week ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2022Q4-base, pkgsrc-2022Q4
Changes since 1.60: +2 -2 lines

massive revision bump after textproc/icu update

Revision 1.60, Mon Apr 18 19:11:24 2022 UTC (13 months, 1 week ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2022Q3-base, pkgsrc-2022Q3, pkgsrc-2022Q2-base, pkgsrc-2022Q2
Changes since 1.59: +2 -2 lines

revbump for textproc/icu update

Revision 1.59, Wed Dec 8 16:05:07 2021 UTC (17 months, 3 weeks ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2022Q1-base, pkgsrc-2022Q1, pkgsrc-2021Q4-base, pkgsrc-2021Q4
Changes since 1.58: +2 -2 lines

revbump for icu and libffi

Revision 1.58, Fri Jul 16 09:16:27 2021 UTC (22 months, 2 weeks ago) by jperkin
Branch: MAIN
CVS Tags: pkgsrc-2021Q3-base, pkgsrc-2021Q3
Changes since 1.57: +3 -2 lines

tesseract: Avoid C++ <version> issue on macOS.

Revision 1.57, Wed Apr 21 11:41:59 2021 UTC (2 years, 1 month ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2021Q2-base, pkgsrc-2021Q2
Changes since 1.56: +2 -2 lines

revbump for textproc/icu

Revision 1.56, Thu Nov 5 09:08:27 2020 UTC (2 years, 6 months ago) by ryoon
Branch: MAIN
CVS Tags: pkgsrc-2021Q1-base, pkgsrc-2021Q1, pkgsrc-2020Q4-base, pkgsrc-2020Q4
Changes since 1.55: +2 -2 lines

*: Recursive revbump from textproc/icu-68.1

Revision 1.55, Mon Aug 17 20:19:10 2020 UTC (2 years, 9 months ago) by leot
Branch: MAIN
CVS Tags: pkgsrc-2020Q3-base, pkgsrc-2020Q3
Changes since 1.54: +2 -2 lines

*: revbump after fontconfig bl3 changes (libuuid removal)

Revision 1.54, Fri Jun 5 12:49:00 2020 UTC (2 years, 11 months ago) by jperkin
Branch: MAIN
CVS Tags: pkgsrc-2020Q2-base, pkgsrc-2020Q2
Changes since 1.53: +2 -2 lines

*: Apply revbump for graphics/giflib API change.

Revision 1.53, Tue Jun 2 08:24:08 2020 UTC (2 years, 11 months ago) by adam
Branch: MAIN
Changes since 1.52: +2 -2 lines

Revbump for icu

Revision 1.52, Sun Apr 12 08:28:51 2020 UTC (3 years, 1 month ago) by adam
Branch: MAIN
Changes since 1.51: +2 -2 lines

Recursive revision bump after textproc/icu update

Revision 1.51, Tue Mar 10 22:10:14 2020 UTC (3 years, 2 months ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2020Q1-base, pkgsrc-2020Q1
Changes since 1.50: +2 -2 lines

librsvg: update bl3.mk to remove libcroco in rust case

recursive bump for the dependency change

Revision 1.50, Sun Mar 8 16:50:09 2020 UTC (3 years, 2 months ago) by wiz
Branch: MAIN
Changes since 1.49: +2 -1 lines

*: recursive bump for libffi

Revision 1.49, Sun Dec 29 16:44:12 2019 UTC (3 years, 5 months ago) by adam
Branch: MAIN
Changes since 1.48: +2 -2 lines

tesseract: updated to 4.1.1

4.1.1 Release:
Implemented sw build (cppan is deprecated)
Improved cmake build
Code cleanup and optimization
A lot of bug fixes...

Revision 1.48, Mon Jul 8 18:37:03 2019 UTC (3 years, 10 months ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2019Q4-base, pkgsrc-2019Q4, pkgsrc-2019Q3-base, pkgsrc-2019Q3
Changes since 1.47: +2 -3 lines

tesseract: updated to 4.1.0

4.1.0 Release
Added new renderers Alto, LSTMBox, WordStrBox.
Added character boxes in hOCR output.
Added Python training scripts (experimental) as an alternative to the shell scripts.
Better support AVX / AVX2 / SSE.
Disable OpenMP support by default.
Fix for bounding box problem.
Implemented support for whitelist/blacklist in LSTM engine.
Improved cmake configuration.
Code modernization and improvements.
A lot of bug fixes...

Revision 1.47, Sat May 4 16:05:33 2019 UTC (4 years ago) by leot
Branch: MAIN
CVS Tags: pkgsrc-2019Q2-base, pkgsrc-2019Q2
Changes since 1.46: +2 -2 lines

tesseract: Avoid unportable `==' test(1) operator

PKGREVISION++

(There should be no change, i.e. the test(1) code path seems still never
crossed, but bump it for extra paranoia.)
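
A PKGREVISION bump such as the one in this commit amounts to a small text edit on the package Makefile. The sketch below is an illustration only and not the pkgsrc tooling: the function name `bump_pkgrevision` and the sample Makefile text are hypothetical, and it assumes that PKGREVISION, when present, sits on a line of its own and that a missing one belongs right after the first DISTNAME or PKGNAME line.

```python
import re

def bump_pkgrevision(makefile_text: str) -> str:
    """Return makefile_text with PKGREVISION incremented by one.

    Hypothetical helper for illustration. If no PKGREVISION line exists,
    one with value 1 is added after the first DISTNAME or PKGNAME line.
    """
    pattern = re.compile(r"^(PKGREVISION=[ \t]*)(\d+)[ \t]*$", re.MULTILINE)
    match = pattern.search(makefile_text)
    if match:
        bumped = int(match.group(2)) + 1
        return (makefile_text[:match.start()]
                + f"{match.group(1)}{bumped}"
                + makefile_text[match.end():])

    lines = makefile_text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.startswith(("DISTNAME=", "PKGNAME=")):
            if not line.endswith("\n"):
                # Terminate the last line before appending a new one.
                lines[index] = line + "\n"
            lines.insert(index + 1, "PKGREVISION=\t1\n")
            return "".join(lines)
    raise ValueError("no DISTNAME or PKGNAME line found")

# Made-up Makefile fragment, not the real graphics/tesseract Makefile.
sample = "DISTNAME=\ttesseract-4.0.0\nPKGREVISION=\t3\nCATEGORIES=\tgraphics\n"
print(bump_pkgrevision(sample))
```

The two branches correspond to the two diff shapes seen throughout this log: "+2 -2" style changes when an existing PKGREVISION is incremented, and changes that add a line when the variable is introduced.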

Revision 1.46, Wed Apr 3 00:32:47 2019 UTC (4 years, 2 months ago) by ryoon
Branch: MAIN
Changes since 1.45: +2 -2 lines

Recursive revbump from textproc/icu

Revision 1.45, Sun Dec 9 18:52:31 2018 UTC (4 years, 5 months ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2019Q1-base, pkgsrc-2019Q1, pkgsrc-2018Q4-base, pkgsrc-2018Q4
Changes since 1.44: +2 -2 lines

revbump after updating textproc/icu

Revision 1.44, Wed Nov 28 12:04:20 2018 UTC (4 years, 6 months ago) by adam
Branch: MAIN
Changes since 1.43: +3 -1 lines

tesseract: build depends on asciidoc

Revision 1.43, Sun Nov 18 18:07:20 2018 UTC (4 years, 6 months ago) by adam
Branch: MAIN
Changes since 1.42: +4 -3 lines

tesseract: use REPLACE_BASH; fix building man-pages; courtesy of Mustafa D. :)

Revision 1.42, Wed Nov 14 22:21:47 2018 UTC (4 years, 6 months ago) by kleink
Branch: MAIN
Changes since 1.41: +2 -2 lines

Revbump after cairo 1.16.0 update.

Revision 1.41, Mon Nov 12 03:52:18 2018 UTC (4 years, 6 months ago) by ryoon
Branch: MAIN
Changes since 1.40: +2 -1 lines

Recursive revbump from harfbuzz-2.1.1

Revision 1.40, Sat Nov 3 09:13:07 2018 UTC (4 years, 6 months ago) by adam
Branch: MAIN
Changes since 1.39: +3 -5 lines

tesseract: updated to 4.0.0

V4.0.0:
New OCR engine
- Added a new OCR engine that uses a neural network system based on LSTMs, with major accuracy gains.
- This includes new training tools for the LSTM OCR engine. A new model can be trained from scratch or by fine tuning an existing model.
- Added trained data that includes LSTM models for 123 languages.
- Added optional accelerated code paths for the LSTM recognizer:
  * Using OpenMP
  * Using SIMD: AVX2 / AVX / SSE4.1
- Added a new parameter lstm_choice_mode that allows including alternative symbol choices in the hOCR output.
- The new LSTM engine still does not support all features from the old legacy engine (see missing features).

Other OCR engines
- The pattern matching OCR engine that was the primary OCR engine in previous versions is still available in this version.
- Removed the 'Cube' OCR engine from the codebase. It was used for Hindi and for Arabic. The New LSTM engine performs much better, thus the Cube engine was no longer needed.

Updated build system
- Tesseract now uses semantic versioning.
- Tesseract now requires Leptonica 1.74.0 or a higher version.
- For building Tesseract from source code, a compiler with good C++11 support is required. See here for a list of officially supported compilers.
- Added unit tests to the main repo. The unit tests require Git submodules and the code for training.
- Added an option to compile Tesseract without the code of the legacy OCR engine.
- Update minimum required autoconf version to 2.63.
- Training tools dependencies - Update minimum required versions: ICU 52.1, Pango 1.22.0.
- Reorganized Tesseract's source tree. Most sources are now below the src directory.

Bug fixes and enhancements
- Fixed many issues that triggered compiler warnings.
- Fixed many issues reported by Coverity Scan or LGTM.
- Fixes to trainingdata rendering.
- Fixed damage to binary images when processing PDFs.
- Don't trigger a deliberate segmentation fault for fatal errors in release code.
- Fixed some issues in OpenCL code. OpenCL now works for the legacy Tesseract OCR engine, but does not improve the performance. It is not implemented for the LSTM OCR engine.
- Improved multi-page TIFF handling.
- Improvements to PDF rendering.
- Added version information and improved help texts to the training tools.
- Added faster version of log2().
- Documented in tesseract man page the option to use an input text file which contains lists of images.
- Made 'osd' the default traineddata when psm 0 is requested (currently this feature is only implemented in the command line interface, but not in the API).
- Removed tessedit_pageseg_mode 1 from hocr, pdf, and tsv config files. The user should explicitly use --psm 1 if that is desired.
- The list of available languages and scripts is now sorted alphabetically.
- Parameter unlv_tilde_crunching changed to false, because the default value caused issues with unlv output in Tesseract 4.
- Removed obsolete code.

Revision 1.39, Fri Jul 20 03:34:16 2018 UTC (4 years, 10 months ago) by ryoon
Branch: MAIN
CVS Tags: pkgsrc-2018Q3-base, pkgsrc-2018Q3
Changes since 1.38: +2 -1 lines

Recursive revbump from textproc/icu-62.1

Revision 1.38, Fri Jun 22 09:50:16 2018 UTC (4 years, 11 months ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2018Q2-base, pkgsrc-2018Q2
Changes since 1.37: +2 -5 lines

tesseract: updated to 3.05.02

V3.05.02
* Fixed linking with Leptonica
* Fix build for Mingw-w64
* Fix Training error "Couldn't find a matching blob"
* Fix unterminated string

Revision 1.37, Mon Jun 11 15:01:49 2018 UTC (4 years, 11 months ago) by fhajny
Branch: MAIN
Changes since 1.36: +3 -3 lines

graphics/tesseract: Revert update to data version 4.00. Using version 4 data with version 3 program is not supported. Fixes https://github.com/joyent/pkgsrc/issues/113.

Revision 1.36, Sun Apr 29 10:16:20 2018 UTC (5 years, 1 month ago) by adam
Branch: MAIN
Changes since 1.35: +3 -3 lines

tesseract: added buildlink3; fixed COMMENT and HOMEPAGE

Revision 1.35, Mon Apr 16 14:34:41 2018 UTC (5 years, 1 month ago) by wiz
Branch: MAIN
Changes since 1.34: +2 -2 lines

Recursive bump for new fribidi dependency in pango.

Revision 1.34, Sat Apr 14 07:34:26 2018 UTC (5 years, 1 month ago) by adam
Branch: MAIN
Changes since 1.33: +2 -2 lines

revbump after icu update

Revision 1.33, Mon Mar 12 11:16:51 2018 UTC (5 years, 2 months ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2018Q1-base, pkgsrc-2018Q1
Changes since 1.32: +2 -2 lines

Recursive bumps for fontconfig and libzip dependency changes.

Revision 1.32, Thu Jan 25 11:30:34 2018 UTC (5 years, 4 months ago) by adam
Branch: MAIN
Changes since 1.31: +4 -4 lines

tesseract: updated tessdata to 4.00

Revision 1.31, Thu Nov 30 16:45:27 2017 UTC (5 years, 6 months ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2017Q4-base, pkgsrc-2017Q4
Changes since 1.30: +2 -2 lines

Revbump after textproc/icu update

Revision 1.30, Mon Sep 18 09:53:23 2017 UTC (5 years, 8 months ago) by maya
Branch: MAIN
CVS Tags: pkgsrc-2017Q3-base, pkgsrc-2017Q3
Changes since 1.29: +2 -1 lines

revbump for requiring ICU 59.x

Revision 1.29, Sun Jul 30 22:32:18 2017 UTC (5 years, 10 months ago) by wiz
Branch: MAIN
Changes since 1.28: +2 -2 lines

Switch github HOMEPAGEs to https.

Revision 1.28, Wed Jun 14 14:41:26 2017 UTC (5 years, 11 months ago) by fhajny
Branch: MAIN
CVS Tags: pkgsrc-2017Q2-base, pkgsrc-2017Q2
Changes since 1.27: +2 -3 lines

Update graphics/tesseract to 3.05.01.

- Fixed several build issues
- Fixed C-API
- Backport pdfrenderer changes
- Code clean up

Revision 1.27, Sat Apr 22 21:03:38 2017 UTC (6 years, 1 month ago) by adam
Branch: MAIN
Changes since 1.26: +2 -2 lines

Revbump after icu update

Revision 1.26, Tue Feb 28 15:20:07 2017 UTC (6 years, 3 months ago) by ryoon
Branch: MAIN
CVS Tags: pkgsrc-2017Q1-base, pkgsrc-2017Q1
Changes since 1.25: +2 -1 lines

Recursive revbump from graphics/libwebp

Revision 1.25, Tue Feb 21 17:51:18 2017 UTC (6 years, 3 months ago) by fhajny
Branch: MAIN
Changes since 1.24: +7 -4 lines

Update graphics/tesseract to 3.05.00

- Made some fine tuning to the hOCR output.
- Added TSV as another optional output format.
- Fixed ABI break introduced in 3.04.00 with the AnalyseLayout()
  method.
- text2image tool - Enable all OpenType ligatures available in a font.
  This feature requires Pango 1.38 or newer.
- Training tools - Replaced asserts with tprintf() and exit(1).
- Fixed Cygwin compatibility.
- Improved multipage tiff processing.
- Improved the embedded pdf font (pdf.ttf).
- Enable selection of OCR engine mode from command line.
- Changed tesseract command line parameter '-psm' to '--psm'.
- Added new C API for orientation and script detection, removed the
  old one.
- Increased minimum autoconf version to 2.59.
- Removed dead code.
- Fixed many compiler warnings.
- Fixed memory and resource leaks.
- Fixed some issues with the 'Cube' OCR engine.
- Fixed some OpenCL issues.
- Added option to build Tesseract with CMake build system.
- Implemented CPPAN support for easy Windows building.
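
The '-psm' to '--psm' rename listed above matters for scripts that have to drive both older and newer tesseract binaries. A minimal sketch, assuming the caller already knows the installed version as a purely numeric dotted string such as "3.04.01"; the helper names `parse_version` and `psm_arguments` are made up for this illustration and are not part of Tesseract.

```python
def parse_version(version: str) -> tuple:
    """Turn '3.05.00' into (3, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def psm_arguments(version: str, mode: int) -> list:
    """Return the page segmentation mode arguments for a given version.

    Hypothetical helper: uses '--psm' from 3.05.00 on, as stated in the
    changelog above, and '-psm' for anything older.
    """
    flag = "--psm" if parse_version(version) >= (3, 5, 0) else "-psm"
    return [flag, str(mode)]

print(psm_arguments("3.04.01", 6))
print(psm_arguments("3.05.00", 6))
```

The returned list is meant to be appended to an argument vector built elsewhere. Version strings with suffixes, such as "3.03(rc1)", are outside what this sketch handles.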

Revision 1.24, Sun Feb 12 06:25:31 2017 UTC (6 years, 3 months ago) by ryoon
Branch: MAIN
Changes since 1.23: +2 -2 lines

Recursive revbump from fonts/harfbuzz

Revision 1.23, Mon Feb 6 13:55:30 2017 UTC (6 years, 3 months ago) by wiz
Branch: MAIN
Changes since 1.22: +2 -2 lines

Recursive bump for harfbuzz's new graphite2 dependency.

Revision 1.22, Sun Dec 4 05:17:30 2016 UTC (6 years, 5 months ago) by ryoon
Branch: MAIN
CVS Tags: pkgsrc-2016Q4-base, pkgsrc-2016Q4
Changes since 1.21: +2 -2 lines

Recursive revbump from textproc/icu 58.1

Revision 1.21, Mon Apr 11 19:01:53 2016 UTC (7 years, 1 month ago) by ryoon
Branch: MAIN
CVS Tags: pkgsrc-2016Q3-base, pkgsrc-2016Q3, pkgsrc-2016Q2-base, pkgsrc-2016Q2
Changes since 1.20: +2 -1 lines

Recursive revbump from textproc/icu 57.1

Revision 1.20, Sun Apr 3 12:46:18 2016 UTC (7 years, 1 month ago) by joerg
Branch: MAIN
CVS Tags: pkgsrc-2016Q1-base, pkgsrc-2016Q1
Changes since 1.19: +2 -2 lines

Needs pkg-config.

Revision 1.19, Wed Mar 30 11:38:59 2016 UTC (7 years, 2 months ago) by fhajny
Branch: MAIN
Changes since 1.18: +3 -1 lines

Make sure leptonica is detected properly

Revision 1.18, Thu Mar 17 12:51:14 2016 UTC (7 years, 2 months ago) by fhajny
Branch: MAIN
Changes since 1.17: +23 -20 lines

Update graphics/tesseract to 3.04.01.
Move to new home at Github. Clean up.

2016-02-17 - V3.04.01
- Added OSD renderer for psm 0. Works for single page and
  multi-page images.
- Improve tesstrain.sh script.
- Simplify build and run of ScrollView.
- Improved PDF output for OS X Preview utility.
- INCOMPATIBLE fix to hOCR line height information - commit
  134ebc3.
- Added option to build Tesseract without Cube OCR engine
  (-DNO_CUBE_BUILD).
- Enable OpenMP support.
- Many bug fixes.

2015-07-11 - V3.04.00
- Tesseract development is now done with Git and hosted at
  github.com (Previously we used Subversion as a VCS and
  code.google.com for hosting).
- Tesseract now requires leptonica 1.71 or a higher version.
- Removed official support for VS 2008.
- Added support for 39 additional scripts/languages, including:
  amh, asm, aze_cyrl, bod, bos, ceb, cym, dzo, fas, gle, guj, hat,
  iku, jav, kat, kat_old, kaz, khm, kir, kur, lao, lat, mar, mya,
  nep, ori, pan, pus, san, sin, srp_latn, syr, tgk, tir, uig, urd,
  uzb, uzb_cyrl, yid
- Major updates to training system as a result of extensive
  testing on 100 languages.
- New training data for over 100 languages
- Improved performance with PIC compilation option.
- Significant change to invisible font system in pdf output to
  improve correctness and compatibility with external programs,
  particularly ghostscript.
- Improved font identification.
- Major change to improve layout analysis for heavily diacritic
  languages: Thai, Vietnamese, Kannada, Telugu etc.
- Fixed problems with shifted baselines so recognition can recover
  from layout analysis errors.
- Major refactor to improve speed on difficult images, especially
  when running a heap checker.
- Moved params from global in page layout to tesseractclass.
- Improved single column layout analysis.
- Allow ocr output to multiple formats using tesseract command
  line executable.
- Fixed issues with mixed eng+ara scripts.
- Improved script consistency in numbers.
- Major refactor of control.cpp to enable line recognition.
- Added tesstrain.sh - a master training script.
- Added ability to text2image training tool to just list available
  fonts.
- Added ability to text2image to underline words.
- Improved efficiency of image processing for PDF output.
- Added parameter description for each parameter listed with
  'print-parameters' command line option.
- Added font info to hOCR output.
- Enabled streaming input and output of multi-page documents.
- Many bug fixes.

2014-02-04 - V3.03(rc1)
- Added new training tool text2image to generate box/tif file
  pairs from text and truetype fonts.
- Added support for PDF output with searchable text.
- Removed entire IMAGE class and all code in image directory.
- Tesseract executable: support for output to stdout; limited
  support for one-page images from stdin (especially on Windows)
- Added Renderer to API to allow document-level processing and
  output of document formats, like hOCR, PDF.
- Major refactor of word-level recognition, beam search,
  eliminating dead code.
- Refactored classifier to make it easier to add new ones.
- Generalized feature extractor to allow feature extraction from
  greyscale.
- Improved sub/superscript treatment.
- Improved baseline fit.
- Added set_unicharset_properties to training tools.
- Many bug fixes.
- More training source data included.

Revision 1.17, Wed Jan 6 10:46:53 2016 UTC (7 years, 4 months ago) by adam
Branch: MAIN
Changes since 1.16: +2 -2 lines

Revbump after updating graphics/libwebp

Revision 1.16, Wed Oct 7 11:26:22 2015 UTC (7 years, 7 months ago) by fhajny
Branch: MAIN
CVS Tags: pkgsrc-2015Q4-base, pkgsrc-2015Q4
Changes since 1.15: +3 -1 lines

Network libs still needed, fix build on SunOS.

Revision 1.15, Tue Oct 7 16:47:27 2014 UTC (8 years, 7 months ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2015Q3-base, pkgsrc-2015Q3, pkgsrc-2015Q2-base, pkgsrc-2015Q2, pkgsrc-2015Q1-base, pkgsrc-2015Q1, pkgsrc-2014Q4-base, pkgsrc-2014Q4
Changes since 1.14: +2 -1 lines

Revbump after updating libwebp and icu

Revision 1.14, Thu Oct 2 16:06:02 2014 UTC (8 years, 8 months ago) by adam
Branch: MAIN
Changes since 1.13: +21 -28 lines

Changes 3.02.02:
* Moved ResultIterator/PageIterator to ccmain.
* Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic.
* Added paragraph detection in layout analysis/post OCR.
* Fixed inconsistent xheight during training and over-chopping.
* Added simultaneous multi-language capability.
* Refactored top-level word recognition module.
* Added experimental equation detector.
* Improved handling of resolution from input images.
* Blamer module added for error analysis.
* Cleaned up externally used namespace by removing includes from baseapi.h.
* Removed dead memory management code.
* Tidied up constraints on control parameters.
* Added support for ShapeTable in classifier and training.
* Refactored class pruner.
* Fixed training leaks and randomness.
* Major improvements to layout analysis for better image detection, diacritic detection, better textline finding, better tabstop finding.
* Improved line detection and removal.
* Added fixed pitch chopper for CJK.
* Added UNICHARSET to WERD_CHOICE to make multi-language handling easier.
* Fixed problems with internally scaled images.
* Added page and bbox to string in tr files to identify source of training data better.
* Fixes to Hindi Shiroreka splitter.
* Added word bigram correction.
* Reduced stack memory consumption and eliminated some ugly typedefs.
* Added new uniform classifier API.
* Added new training error counter.
* Fixed endian bug in dawg reader.
* Many other fixes, including the way in which the chopper finds chops and messes with the outline while it does so.

Revision 1.13, Tue Sep 23 19:07:06 2014 UTC (8 years, 8 months ago) by jperkin
Branch: MAIN
CVS Tags: pkgsrc-2014Q3-base, pkgsrc-2014Q3
Changes since 1.12: +3 -1 lines

SunOS needs -lsocket -lnsl.

Revision 1.12, Sat Jan 26 21:38:01 2013 UTC (10 years, 4 months ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2014Q2-base, pkgsrc-2014Q2, pkgsrc-2014Q1-base, pkgsrc-2014Q1, pkgsrc-2013Q4-base, pkgsrc-2013Q4, pkgsrc-2013Q3-base, pkgsrc-2013Q3, pkgsrc-2013Q2-base, pkgsrc-2013Q2, pkgsrc-2013Q1-base, pkgsrc-2013Q1
Changes since 1.11: +2 -2 lines

Revbump after graphics/jpeg and textproc/icu

Revision 1.11, Sat Oct 6 14:11:22 2012 UTC (10 years, 7 months ago) by asau
Branch: MAIN
CVS Tags: pkgsrc-2012Q4-base, pkgsrc-2012Q4
Changes since 1.10: +1 -2 lines

Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days.

Revision 1.10, Mon Feb 6 12:40:37 2012 UTC (11 years, 3 months ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2012Q3-base, pkgsrc-2012Q3, pkgsrc-2012Q2-base, pkgsrc-2012Q2, pkgsrc-2012Q1-base, pkgsrc-2012Q1
Changes since 1.9: +2 -2 lines

Revbump for
a) tiff update to 4.0 (shlib major change)
b) glib2 update 2.30.2 (adds libffi dependency to buildlink3.mk)

Enjoy.

Revision 1.9, Mon Jan 18 09:59:09 2010 UTC (13 years, 4 months ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2011Q4-base, pkgsrc-2011Q4, pkgsrc-2011Q3-base, pkgsrc-2011Q3, pkgsrc-2011Q2-base, pkgsrc-2011Q2, pkgsrc-2011Q1-base, pkgsrc-2011Q1, pkgsrc-2010Q4-base, pkgsrc-2010Q4, pkgsrc-2010Q3-base, pkgsrc-2010Q3, pkgsrc-2010Q2-base, pkgsrc-2010Q2, pkgsrc-2010Q1-base, pkgsrc-2010Q1
Changes since 1.8: +2 -2 lines

Second try at jpeg-8 recursive PKGREVISION bump.

Revision 1.8, Wed Aug 26 19:57:51 2009 UTC (13 years, 9 months ago) by sno
Branch: MAIN
CVS Tags: pkgsrc-2009Q4-base, pkgsrc-2009Q4, pkgsrc-2009Q3-base, pkgsrc-2009Q3
Changes since 1.7: +2 -1 lines

bump revision because of graphics/jpeg update

Revision 1.7, Wed Jul 22 20:57:47 2009 UTC (13 years, 10 months ago) by wiz
Branch: MAIN
Changes since 1.6: +7 -7 lines

Update to 2.04. Set LICENSE.

June 30 2009 - V2.04
	  Integrated bug fixes and patches and misc changes for portability.
	  Integrated a patch to remove some of the "access" macros.
	  Removed dependence on lua from the viewer, speeding it up
	  dramatically.
	  Fixed the viewer so it compiles and runs properly!
	  Specifically fixing issues: 1, 63, 67, 71, 76, 81, 82, 106, 111,
	  112, 128, 129, 130, 133, 135, 142, 143, 145, 147, 153, 154, 160,
	  165, 170, 175, 177, 187, 192, 195, 199, 201, 205, 209, 108, 169

Revision 1.6, Tue Jul 21 16:00:19 2009 UTC (13 years, 10 months ago) by brook
Branch: MAIN
Changes since 1.5: +7 -2 lines

Add language-specific data sets distributed by the project.  The tesseract
distribution itself just creates dummy, placeholder data sets that cannot
be used.

Revision 1.5, Thu Oct 30 22:12:59 2008 UTC (14 years, 7 months ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2009Q2-base, pkgsrc-2009Q2, pkgsrc-2009Q1-base, pkgsrc-2009Q1, pkgsrc-2008Q4-base, pkgsrc-2008Q4
Changes since 1.4: +4 -1 lines

Replace patch-ab with a post-extract rule. No change to the binary package,
just one file less in pkgsrc ;)

Revision 1.4, Fri May 30 13:06:26 2008 UTC (15 years ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2008Q3-base, pkgsrc-2008Q3, pkgsrc-2008Q2-base, pkgsrc-2008Q2, cwrapper, cube-native-xorg-base, cube-native-xorg
Changes since 1.3: +3 -2 lines

Update to 2.03:

January 23 2008 - V2.02
          Improvements to clustering, training and classifier.
          Major internationalization improvements for large-character-set
          languages, eg Kannada.
          Removed some compiler warnings.
          Added multipage tiff support for training and running.
          Updated graphics output to talk to new java-based viewer.
          Added ability to save n-best lists.
          Added leptonica support for more file types.
          Improved Init/End to make them safe.
          Reduced memory use of dictionaries.
          Added some new APIs to TessBaseAPI.
April 21 2008 - V2.02 (again)
          Fixed namespace collisions with jpeg library (INT32).
          Portability fixes for Windows for new code.
          Updates to autoconf system for new code.
April 22 2008 - V2.03
          Fixed crash introduced in 2.02.
          Fixed lack of tessembedded.cpp in distribution.
          Added test for leptonica header files and conditional test for lib.

Revision 1.3, Thu Nov 29 16:42:08 2007 UTC (15 years, 6 months ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2008Q1-base, pkgsrc-2008Q1, pkgsrc-2007Q4-base, pkgsrc-2007Q4
Changes since 1.2: +2 -2 lines

Update to 2.01:

August 27 2007 - V2.01
	  Fixed UTF8 input problems with box file reader.
	  Fixed various infinite loops and crashes in dawg code.
	  Removed include of config_auto.h from host.h.
	  Added automatic wctype encoding to unicharset_extractor.
	  Fixed dawg table too full error.
	  Removed svn files from tarball.
	  Added new functions to tessdll.
	  Increased maximum utf8 string in a classification result to 8.

Revision 1.2, Sat Jul 28 01:02:14 2007 UTC (15 years, 10 months ago) by wiz
Branch: MAIN
CVS Tags: pkgsrc-2007Q3-base, pkgsrc-2007Q3
Changes since 1.1: +2 -3 lines

Update to 2.00, provided by Rumko on pkgsrc-users.

July 02 2007 - V2.00
	  Converted internal character handling to UTF8.
	  Trained with 6 languages.
	  Added unicharset_extractor, wordlist2dawg.
	  Added boxfile creation mode.
	  Added UNLV regression test capability.
	  Fixed problems with copyright and registered symbols.
	  Fixed extern "C" declarations problem.

Revision 1.1.1.1 (vendor branch), Fri May 18 06:39:27 2007 UTC (16 years ago) by wiz
Branch: TNF
CVS Tags: pkgsrc-2007Q2-base, pkgsrc-2007Q2, pkgsrc-20070518
Changes since 1.1: +0 -0 lines

Initial import of tesseract-1.04b from pkgsrc-wip (packaged by heinz@
and myself):

This code is a raw OCR engine. It has NO PAGE LAYOUT ANALYSIS, NO
OUTPUT FORMATTING, and NO UI. It can only process an image of a
single column and create text from it. It can detect fixed pitch
vs proportional text.  Having said that, in 1995, this engine was
in the top 3 in terms of character accuracy, and it compiles and
runs on both Linux and Windows. Another current limitation is that
it only recognizes English and its character set is only US-ASCII.
Training code IS included in the open source release however, and
will be included in a future release.

Revision 1.1, Fri May 18 06:39:27 2007 UTC (16 years ago) by wiz
Branch: MAIN

Initial revision

CVSweb <webmaster@jp.NetBSD.org>