Up to [cvs.NetBSD.org] / pkgsrc / converters / orcus
orcus: Update to 0.19.2

Changelog:

0.19.2:
* general
  * fixed a build issue with gcc 14 due to a missing include for std::find_if and std::for_each.
  * fixed a segmentation fault with the orcus-test-xml-mapped test which manifested on hppa hardware, as originally reported on https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054376.
* xls-xml
  * fixed a crash when loading a document that includes a style record referencing an unnamed style record as its parent. In Excel-generated documents, styles only reference named styles as their parents. But in 3rd-party generated documents, styles referencing unnamed styles as their parents can occur.
* gnumeric
  * fixed a crash when the document model returned a null pointer when a reference resolver interface was requested.

0.19.1:
* general
  * implemented orcus::create_filter() which instantiates a filter object of specified type. The returned object is of type orcus::iface::import_filter.
  * moved test cases for format detection to the respective filter test files.
* gnumeric
  * fixed a bug where the import filter did not set the formula grammar prior to importing.

0.19.0:
* general
  * added support for allowing use of std::filesystem, std::experimental::filesystem or boost::filesystem per build configuration.
* xlsx
  * refactored styles import to use style indices returned by the document model implementer rather than using the indices stored in the file. This allows the implementer to aggregate some style records and re-use the same index for records that are stored as different records in the original file.
* xls-xml
  * fixed a bug where column styles were not applied to the correct columns when the starting column index was not 0.
* gnumeric
  * overhauled the Gnumeric import filter to fix many bugs and support many missing features relative to the other filters included in orcus. Most notable mentions are:
    * cell styles
    * rich-text strings
    * named ranges
    * row heights and column widths
    * merged cells
* parquet
  * added partial support for Apache Parquet import filter. This is still heavily experimental.
orcus: Update to 0.18.1

Changelog:

0.18.1
* sax parser
  * added support for optionally skipping multiple BOMs in the beginning of XML stream. This affects all XML-based file format filters such as xls-xml (aka Excel 2003 XML).
* xml-map
  * fixed a bug where XML documents consisting of simple single-column records were not properly converted to sheet data.
* xls-xml
  * fixed a bug where the filter would always pass border color even when it was not set.
* buildsystem
  * added new configure switches --without-benchmark and --without-doc-example to optionally skip building of these two directories.

0.18.0
* general
  * fixed the flat output mode to properly calculate the lengths of UTF-8 encoded strings.
  * replaced all uses of std::strtol() with parse_integer() to properly parse strings that are not necessarily null-terminated.
  * added a new output format type 'debug-state' which dumps the internal state of the populated document model in detail. This can be useful during debugging.
  * separated the import_shared_string interface implementation from the backend shared strings store per separation of responsibility.
  * merged the foo_t and foo_active_t struct pair, such as font_t and font_active_t, in the styles store into a single type using std::optional.
  * revised the documentation and public API and cleaned things up where necessary.
* ods
  * re-implemented the number format styles import to correctly keep track of element stacks and correctly perform structure checks to detect malformed documents.
  * added new interface to import named styles applied to columns.
  * added new interface to import attributes for Asian and complex scripts for the following font attributes:
    * font name
    * font size
    * font style
    * font weight
  * re-designed the styles import interface to make it multi-level.
  * re-worked the import of the style:text-underline-width attribute to make its handling more in line with the specifications.
* xls-xml
  * added support for importing wrap-text and shrink-to-fit cell format attributes.
  * added support for importing cell-hidden and locked attributes.
  * added support for importing direct and named cell formats applied to columns and rows.
* xlsx
  * added support for importing wrap-text and shrink-to-fit cell format attributes.
  * added support for importing direct and named cell formats applied to columns and rows.
* xml-map
  * added a new interface to pass the encoding information to the document model so that it can correctly decode non-UTF-8-encoded string values.
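The 0.18.1 sax parser item about skipping multiple BOMs can be pictured with a small sketch. This is not orcus code: the function name `skip_leading_boms` and its `allow_multiple` parameter are hypothetical, and only the UTF-8 BOM is handled here.

```python
# Illustrative only: strips repeated UTF-8 byte order marks from the start
# of a byte stream. Names are hypothetical, not part of the orcus API.
UTF8_BOM = b"\xef\xbb\xbf"


def skip_leading_boms(stream: bytes, allow_multiple: bool = True) -> bytes:
    """Return `stream` without its leading UTF-8 BOM(s)."""
    offset = 0
    # Advance past each BOM found at the current offset.
    while stream.startswith(UTF8_BOM, offset):
        offset += len(UTF8_BOM)
        if not allow_multiple:
            break  # strict mode: tolerate at most one BOM
    return stream[offset:]


if __name__ == "__main__":
    print(skip_leading_boms(UTF8_BOM * 2 + b"<root/>"))
```

With `allow_multiple=False` only the first BOM is dropped, so a second BOM would still reach the XML parser and most likely be rejected as invalid content.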
orcus: fix build with GCC 13
orcus: Update to 0.17.2

Changelog:

orcus 0.17.2
* ods
  * fixed a bug where the state of style:cell-protect="none" was not explicitly pushed, thereby having had the same effect as not having this attribute. After the fix, style:cell-protect="none" will explicitly push the hidden state to false, locked state to false, and the formula-hidden state to false.

orcus 0.17.1
* general
  * addressed a number of coverity issues.
  * removed a variety of compiler warnings.
* ods
  * re-generated sax parser tokens from ODF v1.3.
  * revised the style import code to only push style attributes that are actually specified in the XML.
* xls-xml
  * revised the XML structure validation strategy to ignore any mis-placed elements and their sub structures rather than aborting the import.

orcus 0.17.0
* general
  * set the baseline C++ version to 17.
  * cleaned up the public API to replace pstring with std::string_view, union with std::variant, and boost::optional with std::optional. With this change, the public API no longer has a dependency on boost.
* spreadsheet document
  * switched to using ixion::model_iterator for horizontal iteration of cells instead of using mdds::mtv::collection.
  * fixed a bug where exporting a spreadsheet document containing adjacent merged cells regions to html incorrectly exported the merged cell areas.
* xlsx
  * cached cell values are now correctly loaded from the file.
* sax parser
  * utf-8 names are now allowed as element and attribute names.
* css parser
  * unquoted utf-8 property values are now allowed.
* orcus-json
  * fixed segmentation fault when using --mode structure with the Windows build.
  * added yaml output option.
* xml-map
  * fixed a bug where mapping of an XML document with namespace aliases sometimes corrupts the alias values.
* python
  * added orcus.FormulaTokenOp enum type which describes formula token operator types in a more fine-grained manner.
* documentation
  * added notes on how to use orcus-xml and orcus-json to map XML and JSON documents to spreadsheet documents.
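The 0.17.2 fix distinguishes an absent style:cell-protect attribute from an explicit "none". A rough sketch of that distinction follows. The function name and the returned dictionary are hypothetical, and the handling of values other than "none" follows a reading of the ODF attribute values rather than the orcus source.

```python
# Illustrative only: maps a style:cell-protect value to three flags.
# `cell_protection_flags` is a hypothetical helper, not an orcus API.
from typing import Dict, Optional


def cell_protection_flags(value: Optional[str]) -> Optional[Dict[str, bool]]:
    """Return the flags to push, or None when the attribute is absent."""
    if value is None:
        return None  # attribute not present: push nothing at all
    flags = {"hidden": False, "locked": False, "formula_hidden": False}
    for token in value.split():
        if token == "none":
            continue  # explicit "none": every flag stays False
        elif token == "hidden-and-protected":
            flags["hidden"] = True
            flags["locked"] = True
        elif token == "protected":
            flags["locked"] = True
        elif token == "formula-hidden":
            flags["formula_hidden"] = True
        else:
            raise ValueError(f"unknown cell-protect token: {token!r}")
    return flags


if __name__ == "__main__":
    print(cell_protection_flags("none"))
    print(cell_protection_flags(None))
```

Returning `None` for a missing attribute and an all-false dictionary for "none" keeps the two cases distinguishable for the caller, which is the behaviour the changelog entry describes.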
converters: Replace RMD160 checksums with BLAKE2s checksums All checksums have been double-checked against existing RMD160 and SHA512 hashes
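For readers unfamiliar with the digests involved, the sketch below computes BLAKE2s and SHA512 for a blob of bytes with Python's hashlib. The `distinfo_lines` helper, the file name, and the exact line layout are assumptions for illustration; they are not taken from the pkgsrc tooling.

```python
# Illustrative only: computes the two digests named in the commit message.
# The output layout approximates a distinfo entry and is an assumption.
import hashlib
from typing import List


def distinfo_lines(name: str, data: bytes) -> List[str]:
    """Build checksum lines for a distfile given its name and contents."""
    blake2s_hex = hashlib.blake2s(data).hexdigest()  # 256-bit digest
    sha512_hex = hashlib.sha512(data).hexdigest()    # 512-bit digest
    return [
        f"BLAKE2s ({name}) = {blake2s_hex}",
        f"SHA512 ({name}) = {sha512_hex}",
        f"Size ({name}) = {len(data)} bytes",
    ]


if __name__ == "__main__":
    # "example-1.0.tar.gz" is a made-up file name.
    for line in distinfo_lines("example-1.0.tar.gz", b"hello"):
        print(line)
```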
converters: Remove SHA1 hashes for distfiles
orcus: Update to 0.16.1

Changelog:

0.16.1
* fixed a build issue on 32-bit linux platforms, which was indirectly caused by ixion.
* fixed json parsing bug caused by an uninitialized variable, which manifested itself on debian 32-bit platform.
* removed compiler warnings on unused variables from the base parser handlers.

0.16.0
* general
  * full formula recalculations are now optional when loading documents. It makes more effective use of cached formula results.
  * added the option of failing on the first faulty cell, or skipping them.
  * fixed a bug that caused the threaded_sax_token_parser to deadlock.
  * added base parser handler classes in the public headers so that they can be sub-classed to overwrite necessary handler methods.
* json-parser
  * parsing of numeric values is now more strict for better conformance to the specs.
* ods
  * added support for loading named expressions from ods documents.
  * fixed an infinite loop when loading one of the attached ods documents from https://bugs.documentfoundation.org/show_bug.cgi?id=82414
* xlsx
  * fixed a segfault when loading the xlsx document from https://bugs.documentfoundation.org/show_bug.cgi?id=83711.
* xls-xml
  * fixed a bug that prevented formulas from referencing cells located in later sheets.
* xml-map
  * adjusted the xml path expressions to be more like XPath. Previously, an attribute was expressed as '@' in the old expression, but XPath uses '/@'. The new expression uses '/@' for an attribute.
  * added the ability to identify and import ranges from XML documents without map file.
  * added the ability to generate map file from XML documents for user customization.
  * added support to specify default namespace in the map file.
* python
  * added orcus.Cell class to represent individual cell values and attributes.
  * fixed several memory leaks in the python binding layer.
  * modified orcus.csv.read() function to take string input, instead of bytes.
  * added __version__ attribute to the orcus module.
  * cleaned up orcus.detect_format function to only take the stream parameter.
  * added named_expressions properties to Document and Sheet class objects.
  * added Python API to bulk-process a number of spreadsheet documents (orcus.tools.file_processor).
  * added Python API to download attachments from bugzilla services via REST API (orcus.tools.bugzilla).
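The 0.16.0 json-parser note about stricter numeric parsing does not list the rejected inputs. As an illustration of what strict conformance means, the sketch below checks a string against the number grammar of RFC 8259; it is a stand-alone regular expression, not the orcus parser, and `is_strict_json_number` is a hypothetical name.

```python
# Illustrative only: validates a token against the JSON number grammar
# (RFC 8259, section 6). Not derived from the orcus implementation.
import re

JSON_NUMBER = re.compile(
    r"-?"                 # optional minus sign (a leading '+' is not allowed)
    r"(?:0|[1-9][0-9]*)"  # integer part with no superfluous leading zeros
    r"(?:\.[0-9]+)?"      # optional fraction, at least one digit required
    r"(?:[eE][+-]?[0-9]+)?"  # optional exponent
)


def is_strict_json_number(text: str) -> bool:
    """Return True if `text` is a valid JSON number and nothing else."""
    return JSON_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["10", "-0.5E+3", "01", "1.", ".5", "+1"]:
        print(sample, is_strict_json_number(sample))
```

Inputs such as `01`, `1.`, `.5` and `+1` are accepted by many lenient number parsers but are not valid JSON numbers.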
orcus: Update to 0.15.4 Release Notes fixed a build error with gcc 10 with LTO. For more details, visit https://bugs.gentoo.org/715154. removed potentially non-free specification and schema files from the package.
orcus: Update to 0.15.3

Changelog:

orcus 0.15.3
* xml-map
  * fixed another bug related to filling of cells down the column in a linked range with nested repeat elements. The bug would occur when the field in a linked range is more than one level deeper than the nearest row group element.
* xls-xml
  * fixed a bug where TopCell and LeftCell attributes of the Table element were not properly honored.

orcus 0.15.2
* xml-map
  * fixed a bug that prevented filling of cells down the column in a linked range with nested repeat elements. The bug would occur when the field in a linked range is associated with an element content rather than an attribute.
* xls-xml
  * added code to properly pick up and pass the number format codes, including named number format values such as 'General Date', 'Long Time', 'Currency' etc.
* fixed a build issue on older macOS environment, related to passing an rvalue to a tuple expecting a const reference. The root cause was a bug in libc++ of LLVM < 7.
* fixed a build issue with gcc5.

orcus 0.15.1
* switched xml_map_tree to using boost::object_pool to manage the life cycles of the objects within xml_map_tree, to avoid memory fragmentation.
* fixed incorrect handling of newly created elements in xml_map_tree.
* fixed segfault caused by double deletion of allocated memory for xml_map_tree::element, which seemed to happen only on 32-bit gcc builds.
* fixed weird test failures related to equality check of two double-precision values, caused probably by aggressive compiler optimization which only seems to get triggered in 32-bit gcc builds.

orcus 0.15.0
* spreadsheet interface
  * import_sheet::fill_down_cells() has been added as a required method, to allow the import filter code to duplicate cell value downward in one step.
* json parser
  * added test cases from JSONTestSuite.
  * fixed a bug on parsing an empty array containing one or more blank characters between the brackets.
* sax parser
  * fixed a bug on parsing an attribute value with encoded character immediately followed by a ';', such as '&;'.
  * fixed a bug on parsing an assignment character '=' that is either preceded or followed by whitespace in attribute definition.
  * optionally use SSE4.2 intrinsics to speed up element name parsing.
* orcus-xml
  * revised its cli interface to make use of boost's program_options.
  * orcus-xml-dump's functionality has been combined into orcus-xml.
  * map mode now supports nested repeat elements to be mapped as range fields.
* orcus-json
  * map mode has been added to allow mapping of JSON documents to spreadsheet document model. This mode either takes explicit mapping rule via map file, or performs automatic mapping by auto-identifying mappable ranges by analyzing the structure of the JSON document.
  * structure mode has been added to display the logical structures of JSON documents.
  * significantly improved performance of json document tree by utilizing object pool to manage the life cycles of json value instances.
* xls-xml
  * added support for importing named color values in the ss:Color attributes.
  * added support for handling UTF-16 streams that contain byte order marks.
* spreadsheet document
  * significantly improved performance of flat format output generation.
* internal
  * string_pool now uses boost's object_pool to manage the instances of stored strings.
  * file_content class has been added to memory-map file contents instead of loading them in-memory.
  * memory_content class has been added to map in-memory buffer with the optional ability to perform unicode conversion.
  * dom_tree has been renamed to dom::document_tree, and its interface has been cleaned up to hide its implementation details.
orcus: Add upstream merge request URI to the patch
Update to 0.14.1

Changelog:

orcus 0.14.1
* addressed a number of coverity issues.
* improved precision of points-to-twips measurement conversions by reducing the number of numeric operations to be performed. This especially helps on i386 platforms.

orcus 0.14.0
* spreadsheet interface
  * import_data_table::set_range() now receives a parameter of type range_t.
  * import_sheet::set_array_formula() interface methods have been removed and replaced with import_sheet::get_array_formula() that returns an interface of type import_array_formula.
  * import_formula interface class has been added to replace the formula related methods of import_sheet. As a result, set_formula(), set_shared_formula(), and set_formula_result() methods have been removed from the import_sheet interface class.
  * import_auto_filter::set_range() now receives a parameter of type range_t, rather than a string value representing a range.
  * import_sheet::set_fill_pattern_type() interface method now takes an enum value of type fill_pattern_t, rather than a string value.
* xls-xml
  * pick up the character set from the XML declaration, and pass it to the client app via import_global_settings interface.
  * support importing of array formulas.
* xlsx
  * support importing of array formulas.
  * fixed a bug where sheet indices being passed to the append_sheet() interface method were incorrect.
  * shared formula handling code has been re-worked.
* spreadsheet::sheet class has been de-coupled from the import and export interfaces.
* the class previously known as import_styles is now split into styles class and import_styles factory wrapper class.
* sax_parser now gracefully ignores any leading whitespace rather than aborting the parse, even though an XML stream with leading whitespace is not strictly valid. In the future we should make this behavior configurable.
* python
  * add orcus.xlsx.read() function that takes a file object to load an xlsx file as a replacement for orcus.xlsx.read_file().
  * add orcus.ods.read(), orcus.xls_xml.read(), orcus.csv.read(), and orcus.gnumeric.read() functions.
  * add orcus.Sheet.write() method which exports sheet content to specified format. For now only the csv format type is supported.
* xml_map_tree no longer requires the source stream persisted in memory between the read and write.
* the sax parser now stores the offset positions of each element rather than their memory positions, in order to make the position values usable between duplicated stream instances.
* xml_structure_tree now supports selection of an element by element path.
* document
  * correctly set the argument separator depending on the formula grammar type. This change fixes loading of ods documents with formula cells.
* fixed a build issue with boost 1.67.
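The 0.14.1 note on points-to-twips precision does not say which operations were removed. As an illustration of the general idea only, the sketch below contrasts a single multiplication (1 point = 20 twips) with a two-step route through inches (72 points and 1440 twips per inch). Both function names are hypothetical.

```python
# Illustrative only: two ways to convert points to twips. Fewer floating
# point operations means fewer places for rounding error to creep in.
TWIPS_PER_POINT = 20
POINTS_PER_INCH = 72
TWIPS_PER_INCH = 1440


def points_to_twips_direct(points: float) -> float:
    """One multiplication: points * 20."""
    return points * TWIPS_PER_POINT


def points_to_twips_via_inches(points: float) -> float:
    """Two operations: divide into inches, then multiply into twips."""
    inches = points / POINTS_PER_INCH
    return inches * TWIPS_PER_INCH


if __name__ == "__main__":
    for pts in (10.0, 12.0, 0.75):
        print(pts, points_to_twips_direct(pts), points_to_twips_via_inches(pts))
```

The two routes agree mathematically. The intermediate division by 72 is generally not exactly representable in binary floating point, so the longer route can differ in the last bits on some inputs and platforms.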
Update to 0.13.4 * Fix build with boost 1.65.0 Changelog: 2018-02-26 Kohei Yoshida <kohei.yoshida@gmail.com> [ef2e27538e335583ef3ff85c4bc4f512efc72eb5] Up the version to 0.13.4. 2018-02-21 Markus Mohrhard <markus.mohrhard@googlemail.com> [13af2fbab2cac1020d6bb840833c0e0efc231bff] protect the self-closing xml element code against self-closing root elements Found by Antti Levomäki and Christian Jalio from Forcepoint. (cherry picked from commit 12e5d89cbd7101c61fbdf063322203a1590a0ef5) 2018-02-19 Kohei Yoshida <kohei.yoshida@gmail.com> [b8848ef7fc6a7d89e3f872574e36cbbab82275b0] xls-xml: Gracefully handle formula cells without cached results. This fixes #51. (cherry picked from commit 32a1b05ffc6edd7d528b6760dab9035252329ab0)
Update to 0.13.3 Changelog: 2018-02-14 Kohei Yoshida <kohei.yoshida@gmail.com> [7ca73a7c83504a30a1d24444a27f57a86451100f] Up the version to 0.13.3. 2018-02-13 Kohei Yoshida <kohei.yoshida@gmail.com> [66bbbd42f5d135b7e7dd57eaa7fdf6fd69c664df] xls-xml: Import hidden row and column flags. (cherry picked from commit 95420c1a1e8c082bb5953b2a49f0d56eef0e5f7e) 2018-02-08 Kohei Yoshida <kohei.yoshida@gmail.com> [0798d81a4c771b69b4b8eade396c88ffb5416b04] xlsx: Remove carriage returns from multi-line strings. Let's try to consistently only use linefeed characters for multi- line strings. (cherry picked from commit 0412bd269983825e5019a8a12267b54f51117aba) 2018-02-08 Kohei Yoshida <kohei.yoshida@gmail.com> [0a4e8c44fc8229818191c6b9b46e4de079d0ca3b] xls-xml: Pick up border colors. (cherry picked from commit e065d26dabafea465ec49e7d79775e62014ac0db) 2018-02-07 Kohei Yoshida <kohei.yoshida@gmail.com> [9662fce62ce77f87a4a8ba61f4507ec08e705b57] xlsx: Let's not forget to apply color for diagonal borders too. (cherry picked from commit c392ea15000b331bb6580b09c1045fd14b449b46) 2018-01-31 Kohei Yoshida <kohei.yoshida@gmail.com> [473526e1ca3a7117e2daf977e1b82a0a3977fc84] We are supposed to use the foreground color for solid fill. (cherry picked from commit f821995022df8dd1e580dd22cf131584b2b1ac4f) 2018-01-31 Kohei Yoshida <kohei.yoshida@gmail.com> [98d2b3377da71b713a37f9004acff3c02c22ce2b] Alpha value of 0 means fully transparent. I'm sure 255 was intended. (cherry picked from commit f7953a814d6a43205791b6cc01c528ef5d4b1ce3) 2018-01-26 Kohei Yoshida <kohei.yoshida@gmail.com> [5aba1df254cf4e052ad013d4b8ac886e274b74fa] Revert "fix automake warning" This reverts commit e4e1e3eb41755a4520a22b904a638da0770836f1. This fixes the breakage on 'make distcheck'.
converters/orcus: import orcus-0.13.2 Standalone file import filter library for spreadsheet documents. This package contains the 0.13 branch of the library.
Remove orcus, unused.
Update to 0.11.2 Changelog: 2016-05-11 Kohei Yoshida <kohei.yoshida@gmail.com> [d6084fe1771052e516ecfb270cb24dd9917a1895] Up the version to 0.11.2. 2016-05-11 Kohei Yoshida <kohei.yoshida@gmail.com> [70fd8327c94b27a99e2c7800e91c13e5099cceda] Make it buildable with mdds-1.2.
Add patches for allowing mdds-1.0 to be detected. Bump PKGREVISION.
Add SHA512 digests for distfiles for converters category Problems found with existing distfile: distfiles/libiconv-1.13-cp932.patch.gz No changes made to the libiconv distinfo file. Otherwise, existing SHA1 digests verified and found to be the same on the machine holding the existing distfiles (morden). All existing SHA1 digests retained for now as an audit trail.
Update to 0.9.2: No Changelog found. Major API change -- 0.10 in directory names instead of 0.8 before.
Update to 0.7.1: This is a maintenance release. It primarily includes bug fixes and build fixes since the 0.7.0 release with no new features. That said, the most notable aspect of this release is that it is buildable with the version 0.9.0 of the Ixion library which was just released a week ago. So, if you are trying to package and distribute the newly-released Ixion library but are unable to do so because of Orcus not being buildable with it, you might be interested in this release.
Update to 0.7.0 * Change to 0.8.0 branch * Change license to mpl-2.0 from mit Changelog: Add some more formats.
add support for mdds and let pkg-config find zlib now. reduce autotools requirements and make sure pthreads are used. TODO: libixion support (once it is added to pkgsrc)
Packaged converters/orcus, a library that deals with spreadsheet documents (libreoffice dependency).