The NetBSD Project

CVS log for pkgsrc/graphics/tesseract/patches/Attic/patch-viewer_scrollview.cpp

[BACK] Up to [] / pkgsrc / graphics / tesseract / patches

Request diff between arbitrary revisions

Default branch: MAIN

Revision 1.3, Sat Nov 3 09:13:07 2018 UTC (5 years, 3 months ago) by adam
Branch: MAIN
Changes since 1.2: +1 -1 lines

tesseract: updated to 4.0.0

New OCR engine
- Added a new OCR engine that uses neural network system based on LSTMs, with major accuracy gains.
- This includes new training tools for the LSTM OCR engine. A new model can be trained from scratch or by fine tuning an existing model.
- Added trained data that includes LSTM models to 123 languages.
- Added optional accelerated code paths for the LSTM recognizer:
  * Using OpenMP
  * Using SIMD: AVX2 / AVX / SSE4.1
- Added a new parameter lstm_choice_mode that allows to include alternative symbol choices in the hOCR output.
- The new LSTM engine still does not support all features from the old legacy engine (see missing features).

Other OCR engines
- The pattern matching OCR engine that was the primary OCR engine in previous versions is still available in this version.
- Removed the 'Cube' OCR engine from the codebase. It was used for Hindi and for Arabic. The New LSTM engine performs much better, thus the Cube engine was no longer needed.

Updated build system
- Tesseract now uses semantic versioning.
- Tesseract now requires Leptonica 1.74.0 or a higher version.
- For building Tesseract from source code, a compiler with good C++ 11 support is required. See here for a list of officially supported compilers.
- Added unit tests to the main repo. The unit tests require Git submodules and the code for training.
- Added an option to compile Tesseract without the code of the legacy OCR engine.
- Update minimum required autoconf version to 2.63.
- Training tools dependencies - Update minimum required versions: ICU 52.1, Pango 1.22.0.
- Reorganized Tesseract's source tree. Most sources are now below the src directory.

Bug fixes and enhancements
- Fixed many issues that triggered compiler warnings.
- Fixed many issues reported by Coverity Scan or LGTM.
- Fixes to trainingdata rendering.
- Fixed damage to binary images when processing PDFs.
- Don't trigger a deliberate segmentation fault for fatal errors in release code.
- Fixed some issues in OpenCL code. OpenCL now works for the legacy Tesseract OCR engine, but does not improve the performance. It is not implemented for the LSTM OCR engine.
- Improved multi-page TIFF handling.
- Improvements to PDF rendering.
- Added version information and improved help texts to the training tools.
- Added faster version of log2().
- Documented in tesseract man page the option to use an input text file which contains lists of images.
- Made 'osd' the default traineddata when psm 0 is requested (currently this feature is only implemented in the command line interface, but not in the API).
- Removed tessedit_pageseg_mode 1 from hocr, pdf, and tsv config files. The user should explicitly use --psm 1 if that is desired.
- The list of available languages and scripts is now sorted alphabetically.
- Parameter unlv_tilde_crunching changed to false, because of default values cause issues in cases of unlv output in Tesseract 4.
- Removed obsolete code.

Revision 1.2 / (download) - annotate - [select for diffs], Thu Jan 25 11:30:34 2018 UTC (6 years, 1 month ago) by adam
Branch: MAIN
CVS Tags: pkgsrc-2018Q3-base, pkgsrc-2018Q3, pkgsrc-2018Q2-base, pkgsrc-2018Q2, pkgsrc-2018Q1-base, pkgsrc-2018Q1
Changes since 1.1: +3 -1 lines
Diff to previous 1.1 (colored)

tesseract: updated tessdata to 4.00

Revision 1.1 / (download) - annotate - [select for diffs], Mon Apr 29 21:31:12 2013 UTC (10 years, 9 months ago) by joerg
Branch: MAIN
CVS Tags: pkgsrc-2017Q4-base, pkgsrc-2017Q4, pkgsrc-2017Q3-base, pkgsrc-2017Q3, pkgsrc-2017Q2-base, pkgsrc-2017Q2, pkgsrc-2017Q1-base, pkgsrc-2017Q1, pkgsrc-2016Q4-base, pkgsrc-2016Q4, pkgsrc-2016Q3-base, pkgsrc-2016Q3, pkgsrc-2016Q2-base, pkgsrc-2016Q2, pkgsrc-2016Q1-base, pkgsrc-2016Q1, pkgsrc-2015Q4-base, pkgsrc-2015Q4, pkgsrc-2015Q3-base, pkgsrc-2015Q3, pkgsrc-2015Q2-base, pkgsrc-2015Q2, pkgsrc-2015Q1-base, pkgsrc-2015Q1, pkgsrc-2014Q4-base, pkgsrc-2014Q4, pkgsrc-2014Q3-base, pkgsrc-2014Q3, pkgsrc-2014Q2-base, pkgsrc-2014Q2, pkgsrc-2014Q1-base, pkgsrc-2014Q1, pkgsrc-2013Q4-base, pkgsrc-2013Q4, pkgsrc-2013Q3-base, pkgsrc-2013Q3, pkgsrc-2013Q2-base, pkgsrc-2013Q2

Add a number of includes hidden by libstdc++'s name space pollution.

This form allows you to request diff's between any two revisions of a file. You may select a symbolic revision name using the selection box or you may type in a numeric name using the type-in text box.

CVSweb <>