Keyword substitution: kv
Default branch: MAIN
icu: update to 76.1 again, now that fallout is/can be fixed. bulk-test-icu has same failures as with 75.1 except for webkit-gtk, for which I'll commit a fix next.
icu: downgrade to 75.1
The world is not ready for 76.1 yet:
- c++17 requirement (I think)
- more libraries need to be explicitly listed
icu: update to 76.1. ICU 76 updates to Unicode 16 (blog), including new characters and scripts, emoji, collation & IDNA changes, and corresponding APIs and implementations. It also updates to CLDR 46 (beta blog) locale data with new locales, significant updates to existing locales, and various additions and corrections. For example, the CLDR and Unicode default sort orders are now very nearly the same. Most of the java.time (Temporal) types can now be formatted directly using the existing ICU4J date/time formatting classes. There are some new APIs to make ICU easier to use with modern C++ and Java patterns. Most of the C/C++ APIs added for this purpose are implemented as C++ header-only APIs, and usable on top of binary stable C APIs, which is a first for ICU. The Java and C++ technology preview implementations of the (also in tech preview) CLDR MessageFormat 2.0 specification have been updated to match recent changes. ICU 76 and CLDR 46 are major releases, including a new version of Unicode and major locale data improvements.
icu: updated to 75.1 ICU 75.1 Unicode® ICU 75 updates to CLDR 45 (beta blog) locale data with new locales and various additions and corrections. C++ code now requires C++17 and is being made more robust. The CLDR MessageFormat 2.0 specification is now in technology preview, together with a corresponding update of the ICU4J (Java) tech preview and a new ICU4C (C++) tech preview. For details, please see https://icu.unicode.org/download/75.
icu: updated to 74.2 ICU 74.2 updates to CLDR 44.1 locale data. These are maintenance releases for ICU 74 and CLDR 44, with limited sets of bug fixes and no API or structural changes. The CLDR bug fix relevant for ICU is for some formatting patterns that erroneously had two adjacent space characters. These are coalesced into one. (CLDR-17233) Important: DateFormat.getInstanceForSkeleton() and the DateTimePatternGenerator sometimes used the wrong patterns because they failed to use/inherit certain data (ICU-22575 — CLDR 44 had removed some redundant data that ICU relied on)
icu: update to 74.1. ICU 74 is now available. It updates to Unicode 15.1, including new characters, emoji, security mechanisms, and corresponding APIs and implementations. It also updates to CLDR 44 (blog) locale data with new locales and various additions and corrections.
icu: updated to 73.2 ICU 73.2 updates to CLDR 43.1 locale data. These are maintenance releases for ICU 73 and CLDR 43, with limited sets of bug fixes and no API or structural changes.
icu: fix build breakage Commit 2de88f9d9c07f7e693449f94858d96053222acea / issue 21833 changes UChar_t* to char16_t* in ures.h (lines 815, 840, 862, 885 at least, functions ures_getUnicodeString() and friends). This breaks compilation of code using -DUCHAR_TYPE=uint16_t. https://unicode-org.atlassian.net/browse/ICU-22356 Workaround: revert to previous version, like Gentoo does https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5a5db7fc59b5ff77e2cf44f51784ed4e87aeab5b Bump PKGREVISION.
icu: updated to 73.1 ICU 73 improves Japanese and Korean short-text line breaking, reduces C++ memory use in date formatting, and promotes the Java person name formatter from tech preview to draft.
icu: updated to 72.1 ICU 72.1 We are pleased to announce the release of Unicode® ICU 72. It updates to Unicode 15, and to CLDR 42 locale data with various additions and corrections.
icu: install shared libraries correctly
icu: updated to 71.1 ICU 71 is now available. It updates to CLDR 41 locale data with various additions and corrections. ICU 71 adds phrase-based line breaking for Japanese, for short Japanese text, such as in titles and headings; and adds support for Hindi written in Latin letters (hi_Latn), aka “Hinglish”.
icu: updated to 70.1 ICU 70 updates to Unicode 14, including new characters, scripts, emoji, and corresponding API constants. ICU 70 adds support for emoji properties of strings. It also updates to CLDR 40 locale data with many additions and corrections. ICU 70 also includes many other bug fixes and enhancements, especially for measurement unit formatting.
textproc: Replace RMD160 checksums with BLAKE2s checksums All checksums have been double-checked against existing RMD160 and SHA512 hashes Unfetchable distfiles (fetched conditionally?): ./textproc/convertlit/distinfo clit18src.zip
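As a rough illustration of the checksum switch described above, the following sketch computes BLAKE2s and SHA512 digests for a distfile using Python's standard hashlib. The `distinfo_lines` helper, the file name, and the exact line layout are assumptions made for illustration; this is not the pkgsrc tooling itself.

```python
import hashlib

def distinfo_lines(name, data):
    """Return distinfo-style lines for one distfile (layout is an assumption)."""
    blake2s = hashlib.blake2s(data).hexdigest()  # 256-bit digest, 64 hex chars
    sha512 = hashlib.sha512(data).hexdigest()    # 512-bit digest, 128 hex chars
    return [
        f"BLAKE2s ({name}) = {blake2s}",
        f"SHA512 ({name}) = {sha512}",
        f"Size ({name}) = {len(data)} bytes",
    ]

if __name__ == "__main__":
    # Hypothetical distfile name and contents, for demonstration only.
    for line in distinfo_lines("example-1.0.tar.gz", b"hypothetical distfile contents"):
        print(line)
```

Keeping two independent digests per file means a collision in one algorithm alone would not let a tampered distfile pass verification.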
textproc: Remove SHA1 hashes for distfiles
icu: updated to 69.1 ICU 69.1 released. It updates to CLDR 39. This release includes significant improvements for measurement unit formatting and number formatting in general.
icu: Update to 68.2 Changelog: * Fix some bugs.
icu: Update to 68.1 Changelog: Common Changes CLDR 38 Support for units of measurement in inflected languages (phase 1) 14 new measurement units: candela, imperial quart, etc. Improved locale ID canonicalization spec & data New language at Modern coverage: Norwegian Nynorsk New languages at Moderate coverage: Fulah (Adlam), Chakma, Asturian New languages at Basic coverage: Dogri, Sanskrit Measurement unit preferences (ICU-20568) New usage() option on NumberFormatter to select the most appropriate unit for a locale and context New outputUnit() getter on FormattedNumber to get the output unit after resolution In skeletons, specify the context using the "usage" stem Example: locale "en-GB", skeleton "usage/person unit-width-full-name unit/kilogram", input 80 (expressed in kilograms), output "12 stone, 8.4 pounds" Usages are pulled in from CLDR (e.g. CLDR v38 Unit Preferences). PluralRules selection for ranges of numbers (ICU-21190) Locale ID canonicalization now conforms to the CLDR spec including edge cases; co-developed with CLDR spec & data improvements (ICU-21236, ICU-21115 & others) New LocaleMatcher options: custom threshold (ICU-21144), no default locale (ICU-21029) DateIntervalFormat supports output options such as capitalization (ICU-20651) Uppercasing for the Armenian language (hy) now maps ligature to (ICU-13416) Data size reduction: Rule-based segmentation data files (RBBI) use a more compact data format and are now half as large (ICU-13565) Measurement units are normalized in skeleton string output: i.e., calling toSkeleton() on a NumberFormatter returns "unit/meter" instead of "measure-unit/length-meter" The ICU User Guide has been migrated to Markdown format, hosted via GitHub Pages: https://unicode-org.github.io/icu/userguide/ Removed usage of terms like "blacklist" (ICU-21176), "master" (ICU-21242), and "grandfathered" (ICU-21184) as much as possible. 
Time zone data (tzdata) version 2020d (2020-oct-21) PluralRules category for compact notation numbers in French (ICU-13836) ICU4C Specific Changes New C API for number range formatting (unicode/unumberrangeformatter.h), for example "750 m - 1.2 km" (ICU-21182)
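The "usage/person" example in the 68.1 changelog (80 kilograms shown as "12 stone, 8.4 pounds" for en-GB) can be checked with plain arithmetic. This sketch only reproduces the unit conversion; it does not call ICU, and the function name is made up for illustration.

```python
KG_PER_POUND = 0.45359237  # exact definition of the avoirdupois pound
POUNDS_PER_STONE = 14

def kg_to_stone_pounds(kg):
    """Split a mass in kilograms into whole stone plus remaining pounds."""
    total_pounds = kg / KG_PER_POUND
    stone = int(total_pounds // POUNDS_PER_STONE)
    pounds = total_pounds - stone * POUNDS_PER_STONE
    return stone, round(pounds, 1)

print(kg_to_stone_pounds(80))  # (12, 8.4)
```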
icu: updated to 67.1 ICU 67 updates to CLDR 37 locale data with many additions and corrections. This release also includes the updates to Unicode 13, subsuming the special CLDR 36.1 and ICU 66 releases. ICU 67 includes many bug fixes for date and number formatting, including enhanced support for user preferences in the locale identifier. The LocaleMatcher code and data are improved, and number skeletons have a new “concise” form that can be used in MessageFormat strings.
icu: fix build error on sh3. Also remove now duplicated part for ARMEB.
icu: updated to 66.1 ICU 66 It updates to Unicode 13 & CLDR 36.1. New, extra Q1 releases for low-risk integration of Unicode 13. ICU 65 It updates to CLDR 36 locale data with many additions and corrections, and some new measurement units. The Java LocaleMatcher API is improved, and ported to C++. For building ICU data, there are new filtering options, and new tracing support for data loading in ICU4C.
Oops, fix NetBSD RCSID.
Add support for ARMEB.
icu: restore -install_name fix for Darwin
Update to 64.2 Changelog: 2019-04-17: ICU 64.2 released. This maintenance update for ICU 64 includes draft Unicode 12.1 update, CLDR 35.1 locale data and support for new Japanese era Reiwa.
icu: Fix build on SunOS.
Update to 64.1 Changelog: Common Changes Unicode 12: 554 new characters, including 4 new scripts and 61 new emoji characters. CLDR 35 Somali and Javanese data now up to modern level Cebuano, Hausa, Igbo, and Yoruba data now up to basic level 23 additional measurement units Many data additions and corrections in many other languages The following language has been added to ICU: Cebuano This version of ICU does not yet implement the Indic Grapheme Cluster improvements from CLDR 35. New Japanese calendar era from 2019: CLDR and ICU include data for testing that can be enabled. (ICU #12973, CLDR #10750) To enable CLDR new Japanese era placeholder name, set environment variable (and Java system property for ICU4J) ICU_ENABLE_TENTATIVE_ERA=true (This was added in ICU 63). Support added for Gannen year numbering (using 元 for the first year of an era) in the Japanese locale Japanese-calendar full, long, and medium formats. Gannen year support is also automatically added for other non-numeric formats (those containing other kanji characters such as 年) derived from pattern skeletons unless specifically overridden. (ICU #20441, CLDR #11843, CLDR #11819) We are planning for an ICU 64.2 update in 2019-April which will add the new Japanese era with its real name. ICU 64 now uses "rearguard" TZ data. (Recent versions have used "vanguard" data with certain overrides.) (ICU-20398) ICU data filtering: The ICU4C build accepts an optional filter script that specifies a subset of the data to be built, with whitelists and blacklists for locales and for resource bundle paths. (ICU-10923, design doc) See this new documentation page: userguide/icu_data/buildtool.md MessageFormat has new pattern syntax for specifying the style of a date/time argument via a locale-independent skeleton rather than a locale-specific pattern. (ICU-9622) Date/time skeletons use the same "::" prefix as number skeletons. 
Example MessageFormat pattern string: "We close on {closing,date,::MMMMd} at {closing,time,::jm}." Many formatting APIs can now output a new type of result object which is-a FormattedValue (Java & C++), or convertible to a UFormattedValue (C). These combine the result strings with easy iteration over FieldPosition metadata. ICU4C Specific Changes New C++ class LocaleBuilder for building a Locale from subtags, keywords, and extensions. (ICU-20328) Parallel to the existing ICU4J ULocale.Builder class. For C++ MeasureUnit instances, there are now additional factory methods that return units by value, not by pointer-with-ownership. (ICU-20337) Various Out-Of-Memory (OOM) issues have been fixed. (ticket query)
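The Gannen year numbering described in the 64.1 changelog (using 元 for the first year of an era) amounts to a small special case in year formatting. The sketch below shows that rule only; the function name and the era name passed in are illustrative assumptions, and ICU derives the real behavior from CLDR patterns rather than from code like this.

```python
def japanese_era_year(era_name, year_in_era, gannen=True):
    """Format an era year; with gannen=True the first year is written as 元."""
    if year_in_era < 1:
        raise ValueError("era years start at 1")
    year_text = "元" if (gannen and year_in_era == 1) else str(year_in_era)
    return f"{era_name}{year_text}年"

# 平成 (Heisei) is used here purely as an example era name.
print(japanese_era_year("平成", 1))   # 平成元年
print(japanese_era_year("平成", 31))  # 平成31年
```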
Pullup ticket #5909 - requested by spz
textproc/icu: security fix
Revisions pulled up:
- textproc/icu/Makefile 1.121
- textproc/icu/distinfo 1.81
- textproc/icu/patches/patch-CVE-2018-18928 1.1
---
Module Name: pkgsrc
Committed By: spz
Date: Wed Feb 13 20:51:57 UTC 2019
Modified Files: pkgsrc/textproc/icu: Makefile distinfo
Added Files: pkgsrc/textproc/icu/patches: patch-CVE-2018-18928
Log Message: add patch for CVE-2018-18928 from upstream
add patch for CVE-2018-18928 from upstream
Apply ICU-20208 uspoof.cpp function checkImpl should be static from https://github.com/unicode-org/icu/commit/8baff8f03e07d8e02304d0c888d0bb21ad2eeb01 This should fix the firefox build with this version of icu Bump PKGREVISION
icu: updated to 63.1 ICU 63.1: Common Changes - CLDR 34 - Segmentation rules and emoji sort order adjusted for Unicode 11 - Somali and Javanese data now up to moderate level (document content) - Tongan, Konkani, Maori, Dzongkha, Tatar, Kurdish (ku), and Xhosa data now up to basic level - Many data additions and corrections in many other languages - The following languages have been added to ICU: Sindhi, Maori, Turkmen, Javanese, Interlingua, Kurdish (ku), Xhosa - New currency: Venezuela's Bolívar Soberano (VES) - New Japanese calendar era from 2019: CLDR and ICU include data for testing that can be enabled. To enable CLDR new Japanese era placeholder name, set environment variable (and Java system property for ICU4J) ICU_ENABLE_TENTATIVE_ERA=true. - New API for number and currency range formatting Support for additional Unicode properties: Indic_Positional_Category & Indic_Syllabic_Category and Vertical_Orientation - New API for code point maps and tries, mapping Unicode code points (U+0000..U+10FFFF) to integer values. - Java classes CodePointMap, CodePointTrie, MutableCodePointTrie - C types UCPMap, UCPTrie, UMutableCPTrie - New API for getting a UnicodeSet per binary property and a code point map per enumerated/int-value property. - Full conformance with UAX 14 Line Breaking (required BreakIterator feature work). ICU4C Specific Changes - C++ Locale class - Additional functions forLanguageTag()/toLanguageTag(), and functions that are easier and safer to use by using StringPiece and ByteSink rather than raw buffers. - Move semantics. - ICU4C: Various Out-Of-Memory (OOM) issues have been fixed. (ticket query) - The icu-config tool has been deprecated. You can use the --disable-icu-config option to disable icu-config from being installed. Alternately, you can use --enable-icu-config to enable icu-config. In the future, icu-config will be disabled by default
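The 63.1 changelog mentions a new API for code point maps that map U+0000..U+10FFFF to integer values. As a conceptual sketch only, the class below implements such a mapping with sorted ranges and binary search. `RangeCodePointMap` is a hypothetical name, and this is not how ICU's UCPTrie or CodePointTrie is implemented (those are tries tuned for fast lookup).

```python
import bisect

class RangeCodePointMap:
    """Maps every code point U+0000..U+10FFFF to an int using sorted ranges."""
    MAX_CP = 0x10FFFF

    def __init__(self, default=0):
        self._starts = [0]        # first code point of each range, ascending
        self._values = [default]  # value of the range starting at _starts[i]

    def get(self, cp):
        if not 0 <= cp <= self.MAX_CP:
            raise ValueError("not a Unicode code point")
        # The applicable range is the last one starting at or before cp.
        return self._values[bisect.bisect_right(self._starts, cp) - 1]

    def set_range(self, start, end, value):
        if not 0 <= start <= end <= self.MAX_CP:
            raise ValueError("invalid range")
        new_starts, new_values = [start], [value]
        if end < self.MAX_CP:
            # Remember what applied just past the range so it is preserved.
            new_starts.append(end + 1)
            new_values.append(self.get(end + 1))
        lo = bisect.bisect_left(self._starts, start)
        hi = bisect.bisect_right(self._starts, end + 1)
        # Drop every boundary inside [start, end + 1] and splice in the new ones.
        self._starts[lo:hi] = new_starts
        self._values[lo:hi] = new_values

cp_map = RangeCodePointMap(default=0)
cp_map.set_range(0x41, 0x5A, 1)  # mark A..Z with value 1
print(cp_map.get(ord("Q")), cp_map.get(ord("q")))  # 1 0
```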
Rather than playing "reserved define whack-a-mole" on NetBSD by trying to define _ISOC99_SOURCE to mitigate the damage of defining _XOPEN_SOURCE, just don't define _XOPEN_SOURCE - nothing in the source defines it and it breaks at least gcc 6.4 on NetBSD-8.0 No PKGREVISION bump as should not affect any building systems
Update to 62.1 Changelog: Common Changes Unicode 11: 684 new characters, including 7 new scripts, Mtavruli Georgian capital letters, 5 new Han characters, and 66 new emoji characters. CLDR 33.1: Unicode 11 script metadata, collation, Chinese transliteration. Chinese collation stroke order updated from Unicode 7 to Unicode 11 after tooling bug fixes. NumberFormatter A NumberFormatter can now be constructed from a locale-neutral skeleton string (like a DateFormat) (#8610). This is particularly useful in translated messages where placeholder details should not be translated. MessageFormat recognizes the style field as a number skeleton if it is prefixed with "::", as in "Number of files: {num, number, :: round-integer group-min2}." (#13742) New "conversion" functions for getting a NumberFormatter from a DecimalFormat, and a Format from a NumberFormatter. New C API (unicode/unumberformatter.h [permanent API docs link TBD]). (#13597) Currently it supports formatting settings only via a skeleton string. Several still-draft NumberFormatter methods and helper classes have been modified or renamed; the previous versions remain temporarily (as deprecated) for one release, to help with the transition. Break Iterator Rules: "Safe" rules are no longer required for correct break iterator operation. For back compatibility, existing rule sets containing safe rules will continue to work, with the safe rules they contain being ignored. The Break Iterator binary data format has been updated to reflect this change. Line Break: The boundary rules have been updated to reflect the Unicode 11 version of UAX #14. Specifically, the handling of Emoji ZWJ sequences has been improved. ICU4C Specific Changes Under-the-hood overhaul of number parsing. See the design doc for a summary of changes; behavior is mostly compatible with previous versions, but there are some known differences. DecimalFormat now wraps the new NumberFormatter code.
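The 62.1 changelog says MessageFormat treats a style field as a number skeleton when it is prefixed with "::". The sketch below shows how such arguments could be picked out of a pattern string with a regular expression. It is a simplified illustration, not ICU's parser: it ignores nested arguments and apostrophe quoting, and the function name is hypothetical.

```python
import re

# Matches simple {name, type, style} arguments without nested braces.
_ARG = re.compile(r"\{\s*(\w+)\s*,\s*(\w+)\s*,\s*([^{}]*?)\s*\}")

def skeleton_arguments(pattern):
    """Return (name, type, skeleton) for arguments whose style starts with '::'."""
    found = []
    for name, arg_type, style in _ARG.findall(pattern):
        if style.startswith("::"):
            found.append((name, arg_type, style[2:].strip()))
    return found

example = "Number of files: {num, number, :: round-integer group-min2}."
print(skeleton_arguments(example))
# [('num', 'number', 'round-integer group-min2')]
```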
icu: fix build on armeb by adding it to the long list of archs that exist and don't have 80bit float types. PR pkg/53408
textproc/icu: updated to 61.1
61.1:
Common Changes
* CLDR 33:
  - Two additional locales (Odia, Assamese) were brought up to Modern coverage level.
  - 4 new transforms: fa-fa_FONIPA, ha-ha_NE, nv-nv_FONIPA, vec-vec_FONIPA.
  - New currency code MRU for Mauritania.
  - Arabic native vs. ASCII digits.
  - Data additions & bug fixes.
* Many small API additions, improvements, and bug fixes.
ICU4C Specific Changes
* Added Google double-conversion library for formatting doubles. This is the library used in V8 and a number of other projects for converting doubles to decimals. To avoid name collisions, the library is linked internally under the ICU namespace as icu::double_conversion. Our copy of double_conversion is not intended for public usage.
* Re-wrote U8_NEXT macros to eliminate all library function calls.
icu: Revert previous clang patch, clang was changed instead.
icu: Don't perform SunOS _STDC_C99 workaround with clang.
icu: updated to 60.2 60.2: New API for direct-UTF-8 normalization. - It also optionally records changes, for source-to-result index mapping and tracking of text metadata. More convenient case mapping API (StringPiece→ByteSink). ICU now handles ill-formed UTF-8 byte sequences as specified in the W3C Encoding Standard. Bug fixes: CLDR 32.0.1 - Change of some German AM/PM to English strings reverted; will be revisited. - BGN transliterations restored. The Script_Extensions property value for 5 CJK characters is wrong. ICU4J DecimalFormat - getGroupingSize() returns -1 instead of 0 in ICU60 if grouping is disabled - setPositivePrefix also changes negative prefix - unsets maxFrac when minFrac is set on a currency instance DateFormat - Urdu Islamic calendar eras - Narrow format of noon time is used for abbreviated day period pattern letter 'b' and 'bb' Conversion buffer overflow Calendar buffer overrun Windows C++: The header file "stringoptions.h" is not included in the pre-built binary .zip file download. Fix various typos and spelling mistakes.
icu: updated to 60.1 Changes 60.1: * Unicode 10.0: 8,518 new characters, including four new scripts, 7,494 new Han characters, and 56 new emoji characters. - Properties newly supported in ICU: Emoji_Component, Regional_Indicator, Prepended_Concatenation_Mark * CLDR 32: - Data for several (mostly Asian) new languages, date formatting patterns using colloquial day period formats ("h:mm B" → “1:30 in the afternoon”), and many other data improvements. - See the CLDR download page for other CLDR features and migration issues in CLDR 32. * NumberFormatter, a new number formatting API: A long-overdue refresh of number formatting in ICU with a focus on usability, robustness, and performance. The 30+ settings in DecimalFormat are reduced to 8 in NumberFormatter; all NumberFormatter objects are thread-safe and immutable; and the code is efficient in both the client-side (constant locale) and server-side (variable locale) use cases. - New users are encouraged to use the new API for number formatting. However, preexisting code can continue using the old API, which has been partially made into a wrapper over the new API. - Documentation: in Java, see com.ibm.icu.number.NumberFormatter, and in C++, see i18n/unicode/numberformatter.h. * New options for titlecasing: - Sentence titlecasing and whole-string titlecasing without custom BreakIterator instances. - The default index adjustment has been changed from "find first cased character" to "find first letter, number, or symbol"; a new option is available for selecting the previous adjustment behavior. * Smaller data files for BreakIterator. - Reverse rules no longer used: Easier updates, easier to conform to Unicode Standard. - Old source rule files continue to work, reverse rules are ignored. - Rule-based data files: 1.2MB→0.8MB. ICU4C Specific Changes * New API for direct-UTF-8 normalization. - It also optionally records changes, for source-to-result index mapping and tracking of text metadata. 
* More convenient case mapping API (StringPiece→ByteSink). * ICU now handles ill-formed UTF-8 byte sequences as specified in the W3C Encoding Standard.
Pullup ticket #5651 - requested by he
textproc/icu: security fix
Revisions pulled up:
- textproc/icu/Makefile 1.111-1.112
- textproc/icu/distinfo 1.66,1.70
- textproc/icu/patches/patch-config_mh-solaris-gcc 1.4
- textproc/icu/patches/patch-i18n_zonemeta.cpp 1.1
---
Module Name: pkgsrc
Committed By: jperkin
Date: Wed Oct 4 10:52:40 UTC 2017
Modified Files: pkgsrc/textproc/icu: Makefile distinfo pkgsrc/textproc/icu/patches: patch-config_mh-solaris-gcc
Log Message: icu: Remove -nodefaultlibs -nostdlib from SunOS linker args. This prevented GCC libraries from being used and thus disabled SSP and other features. Bump PKGREVISION.
---
Module Name: pkgsrc
Committed By: he
Date: Thu Nov 16 09:58:26 UTC 2017
Modified Files: pkgsrc/textproc/icu: Makefile distinfo
Added Files: pkgsrc/textproc/icu/patches: patch-i18n_zonemeta.cpp
Log Message: Apply a fix for CVE-2017-14952 from http://bugs.icu-project.org/trac/changeset/40324/trunk/icu4c/source/i18n/zonemeta.cpp Bump PKGREVISION.
Apply a fix for CVE-2017-14952 from http://bugs.icu-project.org/trac/changeset/40324/trunk/icu4c/source/i18n/zonemeta.cpp Bump PKGREVISION.
icu: include xlocale.h on all systems other than NetBSD and Linux. NetBSD and Linux do not have it (glibc had it but removed it in 2.26, and locale.h always sufficed there, if the release notes are to be believed). This should cover BSDs other than NetBSD, etc.
Fix building on Darwin
icu: never include xlocale.h, always use locale.h This was a glibc header, whereas locale.h is a POSIX one. glibc went ahead and removed it in the new version. change suggested by Thomas Orgis on tech-pkg but probably not applied exactly.
icu: Remove -nodefaultlibs -nostdlib from SunOS linker args. This prevented GCC libraries from being used and thus disabled SSP and other features. Bump PKGREVISION.
Pullup ticket #5357 - requested by maya textproc/icu: security fix (backported) ICU had a vulnerability (CVE-2017-786[78]) Unfortunately they fixed it by doing a major release and have previously broken other packages at runtime with such updates. I've made backports of all the changesets that were mentioned in any of the links, specifically the oss-fuzz report was somewhat broad and mentioned 39673 which backported several 'crash' changesets: http://bugs.icu-project.org/trac/changeset/39663 http://bugs.icu-project.org/trac/changeset/39669 http://bugs.icu-project.org/trac/changeset/39671 The advisory only references code changes relevant to 39671, we could limit the backport to that. https://www.debian.org/security/2017/dsa-3830 I've run make replace and smoke-tested with midori they have a rather extensive testsuite. I've run it with 'make test' and it didn't show any issues. These are manual backports by myself as the patches did not apply cleanly.
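The identifier "CVE-2017-786[78]" in the pullup message above is shorthand in regular-expression style for two CVE numbers. A small check in Python shows which identifiers it covers; the candidate list is just sample input.

```python
import re

# The bracket expression [78] matches a final digit of either 7 or 8.
CVE_PATTERN = re.compile(r"CVE-2017-786[78]")

candidates = ["CVE-2017-7866", "CVE-2017-7867", "CVE-2017-7868", "CVE-2017-7869"]
matched = [c for c in candidates if CVE_PATTERN.fullmatch(c)]
print(matched)  # ['CVE-2017-7867', 'CVE-2017-7868']
```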
icu: remove part of configure script stripping -std=c++11 on Solaris Blind build fix attempt for SmartOS.
Changes 59.1:
* Emoji 5.0 data
* Includes bidi data files from Unicode 10 beta.
* Includes segmentation data files and rules from Unicode 10 beta and CLDR 31.0.1.
* Does not yet include the Emoji_Component property.
* Otherwise ICU 59 continues to use Unicode 9 data.
CLDR 31.0.1
* Including updates for emoji 5.0, for example local names for England, Scotland, and Wales.
* GMT and UTC are no longer unified, and CLDR provides distinct UTC display names, avoiding confusion with standard (winter) time in Britain.
* See the CLDR download page for other CLDR features and migration issues in CLDR v31.
New case mapping API (C++ & Java classes CaseMap) supports styled text.
Updates in ICU 58.2
Common Changes
* CLDR 30.0.3
* Time zone database version 2016j
* ICU SVN repository structure change. See the note on the Source Code Access page for more information.
ICU4C Fixes
* 12815 uspoof_getSkeleton sets backwards-incompatible illegal argument exception
* 12822 digitlist.cpp won't compile on msvc under Node.js
* 12825 uspoof_check goes into an "infinite loop" when U+30FB is in an input string
* 12832 GreekUpper::toUpper skips the final character on a non-terminated UTF-8 string
* 12849 u_strToTitle returns incorrect length if destination is NULL
* 12868 uprv_convertToPosix() Windows bug
Update to 58.1 * Fix regression with upstream patch, https://ssl.icu-project.org/trac/ticket/12827 Changelog: Common Changes CLDR 30.0.2: For details of the many changes in CLDR, see CLDR 30. Some things to note: For some combinations of numbering system (arab, arabext, latn) and/or locale (ar, fa, he), there were changes to the bidirectional control characters used with certain symbols (percent, minus, plus), and changes to number patterns (currency and/or percent, including addition of bidirectional control characters in some cases). New in this release, the bidirectional controls used for such purposes include U+061C ARABIC LETTER MARK (ALM), which requires use of the bidirectional algorithm from Unicode 6.3 or later. The time separator for Norwegian locales (nb, nn) was changed to be ':' throughout. Unicode 9.0: Version 9.0 adds exactly 7,500 characters, for a total of 128,172 characters. These additions include six new scripts, 19 symbols for the new 4K TV standard, and 72 new emoji characters. Draft Emoji 4.0 data Emoji updates for word & line breaking. (#12664 & Unicode 9 update #12526) UBiDiTransform/BidiTransform API for convenient transformation of text between different Bidi layouts. (#11679) MeasureFormat API for measurement unit display names. (#12029) Most COUNT and LIMIT enum constants have been deprecated. (#12420) SpoofChecker: Handling of "whole script confusables" has been removed from ICU, in accordance with its removal from UTS #39 Version 9.0.0 and the removal of the corresponding Unicode data file. (#12549) Greek uppercasing ("el" locale ID) removes most diacritics. (#5456) More robust locale data loading across ICU implementation code. Reduced heap memory usage in DateTimePatternGenerator. (#11782) ICU4C Specific Changes The layout engine code has been removed; the ParagraphLayout is not deprecated and remains (and must now be built on top of HarfBuzz). 
See http://userguide.icu-project.org/layoutengine (#12708) Windows: Supports & requires Visual Studio 2015.
Update to 57.1 Changelog: Common Changes CLDR 29: For details of the many changes in CLDR, see CLDR 29. Grapheme/word/line breaking for emoji sequences, based on Unicode 9 proposed rules. See the Unicode emoji break proposal and the Unicode Emoji Technical Report Proposed Update describing the new emoji sequences. (#12081). Four new Unicode emoji properties (#11802). DateFormat day period formatting of "noon", "at night", etc. via new pattern characters b & B, and DateTimePatternGenerator support of C for selecting the customary form (#11872). Except: Formatting of "0:00 midnight" has been disabled because it is confusing except for at the end of an interval. RelativeDateTimeFormatter: Simpler formatting API (#12072). More robust CLDR data loading for MeasureFormat (#11986, #12030), RelativeDateTimeFormatter (#12018), and DateIntervalFormat/DateIntervalInfo (#12013). New simple & fast SimpleFormatter class for a trivial subset of MessageFormat as used in CLDR data, e.g., "{0} {1}" (#10896). ICU4C Specific Changes C API support for RelativeDateTimeFormatter (#12072). Clang annotations for intended switch case fallthroughs, can now compile with -Wimplicit-fallthrough (#12166). Internal header files can be compiled by themselves, for simpler alternative build scripts (#12141).
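The 57.1 changelog describes SimpleFormatter as handling a trivial subset of MessageFormat, such as "{0} {1}". To make the idea concrete, here is a minimal Python sketch of numbered-placeholder substitution. It is an illustration under simplifying assumptions (no apostrophe quoting, hypothetical function name), not a port of the ICU class.

```python
import re

_PLACEHOLDER = re.compile(r"\{(\d+)\}")

def simple_format(pattern, *args):
    """Replace {0}, {1}, ... in pattern with the corresponding positional args."""
    def substitute(match):
        index = int(match.group(1))
        if index >= len(args):
            raise IndexError(f"no argument for placeholder {{{index}}}")
        return str(args[index])
    return _PLACEHOLDER.sub(substitute, pattern)

print(simple_format("{0} {1}", "10", "km"))  # 10 km
```

Because only numbered placeholders are recognized, such a formatter can be much simpler and faster than a full MessageFormat parser, which is the trade-off the changelog entry points at.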
Apply patch from upstream to fix compilation on CentOS 7. From Thomas Orgis via mail.
Add SHA512 digests for distfiles for textproc category Problems found locating distfiles: Package cabocha: missing distfile cabocha-0.68.tar.bz2 Package convertlit: missing distfile clit18src.zip Package php-enchant: missing distfile php-enchant/enchant-1.1.0.tgz Otherwise, existing SHA1 digests verified and found to be the same on the machine holding the existing distfiles (morden). All existing SHA1 digests retained for now as an audit trail.
Fix preprocessor logic bug causing __STRICT_ANSI__ to be undefined on all platforms, breaking SunOS/clang's use of GCC headers around __float128.
Use the GCC build file when using SunOS/clang, and patch it to pass the -h linker argument correctly.
Update to 56.1 Changelog: Release Overview The features for this release include support of CLDR 28 and Unicode 8.0. For more details, including migration issues, see below. Common Changes CLDR 28: For details of the many changes in CLDR, see CLDR 28. Unicode data updated to Unicode 8.0: 41 new emoji characters, 5,771 new ideographs for Chinese/Japanese/Korean, 6 new scripts, improved character properties data, etc. ICU data size reduced by about 7.2% (1.8MB) via sharing string values across resource bundles. [#11537] DateIntervalFormat now handles intervals with seconds, and sets FieldPosition more consistently. [#11706, #11726] DateFormat::createInstanceForSkeleton() caches DateFormat patterns rather than DateTimePatternGenerator instances, for better performance (for cache hits) and lower heap memory consumption. [#11780] StringSearch (based on collation) defaults to matches on normalization boundaries rather than grapheme cluster boundaries, which yields more matches on Indic text. [#11750] RuleBasedNumberFormat (spelled-out numbers) now handles rounding (Java only), infinity, NaN. [#11653, #11760, #8223] Most of the old Normalizer/unorm.h had been replaced by (and reimplemented via) Normalizer2, and is now deprecated. [#7303] COLON has been withdrawn as a date pattern character corresponding to the date field [UDAT_]TIME_SEPARATOR_FIELD; there is currently no pattern character corresponding to that field. [#11773] Support for locale key "cf" to specify currency format style, and interaction with NumberFormat values for UNumberFormatStyle: [#11787] For NumberFormat style UNUM_CURRENCY / CURRENCYSTYLE, the default is "standard" currency style (typically using minus sign for negative numbers), but the new locale key "cf" may be used with values "standard" or "account" to specify currency format style ("account" indicates accounting style, often using parentheses for negative numbers). 
For other NumberFormat styles, the locale key "cf" is ignored (they override the locale preference): UNUM_CURRENCY_ISO / ISOCURRENCYSTYLE UNUM_CURRENCY_PLURAL / PLURALCURRENCYSTYLE UNUM_CURRENCY_ACCOUNTING / ACCOUNTINGCURRENCYSTYLE UNUM_CASH_CURRENCY / CASHCURRENCYSTYLE A new NumberFormat style is available to explicitly specify standard style, ignoring the locale key "cf" UNUM_CURRENCY_STANDARD / STANDARDCURRENCYSTYLE ICU4C Specific Changes C API support for CompactDecimalFormat via UNumberFormatStyle additions: UNUM_DECIMAL_COMPACT_SHORT, UNUM_DECIMAL_COMPACT_LONG [#11693] Larger UnicodeString object stores more characters inside the object without heap allocation; the UnicodeString object size is now build-time-configurable. [#11551] On 64-bit machines, increase from object size 40 bytes with 15 internal UChars to a new default of 64 bytes with 27 UChars. Some C++ classes now have swap() and moveFrom() methods, and support C++11 move semantics on compilers that support them. [#10086] UnicodeString, LocalPointer, LocalArray DecimalFormat code refactored to fix bugs, improve maintainability, and improve performance. [#10458] New FilteredBreakIterator suppresses certain segment boundaries. For example, it can suppress the sentence boundary in the middle of "Mr. Smith". [#11248] The internal, shared cache has been changed from unbounded to bounded. [#11767] For [U]BreakIterator with type UBRK_SENTENCE, the locale key "ss" can now be used with value "standard" to specify that standard sentence break suppression data should be used, or with value "none" to indicate that no break suppression data should be used (the default). [#11770] Collator: first-time startup time improved 20% due to precalculated unsafe-backward table [#11886] A number of memory leaks and buffer overruns have been fixed based on static code analysis, mostly in data build tools
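The UnicodeString size figures quoted in the 56.1 changelog (40 bytes with 15 internal UChars, then 64 bytes with 27 UChars) are consistent with each other, which a quick calculation shows. The only assumption here is that a UChar occupies 2 bytes (a 16-bit code unit); how the remaining bytes are laid out is not stated in the changelog.

```python
UCHAR_BYTES = 2  # assumed: a UChar is one 16-bit code unit

def non_text_bytes(object_bytes, inline_uchars):
    """Bytes of the object not used for inline character storage."""
    return object_bytes - inline_uchars * UCHAR_BYTES

old = non_text_bytes(40, 15)
new = non_text_bytes(64, 27)
print(old, new)  # 10 10
```

In both layouts 10 bytes remain for everything else, so all 24 added bytes go to inline storage: 12 more UChars.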
Pullup ticket #4826 - requested by tnn textproc/icu: security fix Revisions pulled up: - textproc/icu/Makefile 1.100 - textproc/icu/distinfo 1.55 - textproc/icu/patches/patch-common_ucnv__io.cpp 1.1 --- Module Name: pkgsrc Committed By: tnn Date: Tue Sep 29 02:15:54 UTC 2015 Modified Files: pkgsrc/textproc/icu: Makefile distinfo Added Files: pkgsrc/textproc/icu/patches: patch-common_ucnv__io.cpp Log Message: Patch CVE-2015-1270. Via Debian.
Patch CVE-2015-1270. Via Debian.
Unbreak on Bitrig by adding necessary parts to autoconf related files Add Bitrig to runConfigure script http://bugs.icu-project.org/trac/ticket/11881 http://bugs.icu-project.org/trac/ticket/11882
Changes 55.1: The features for this release include support of CLDR 27 (with a major cleanup of region locales, among many other improvements), formatting for scientific notation ("1.2 × 10³"), an update to Unicode 7.0 data for spoof-checking, narrow AM/PM markers ("7:45p"), and various performance enhancements. For C/C++, there are new methods for flexible dates ("Nov 10", or "Sept 2015"), named capture groups for regular expressions, formatting of compound units ("3.5 meters per second"), new C wrappers, and independent timezone resource loading. ICU4J has been improved and tested for using ICU4C data and for running on Android.
Pullup ticket #4636 - requested by spz textproc/icu: security patch Revisions pulled up: - textproc/icu/Makefile 1.96 - textproc/icu/distinfo 1.52 - textproc/icu/patches/patch-CVE-2014-7923+7926 1.1 --- Module Name: pkgsrc Committed By: spz Date: Fri Mar 6 14:43:15 UTC 2015 Modified Files: pkgsrc/textproc/icu: Makefile distinfo Added Files: pkgsrc/textproc/icu/patches: patch-CVE-2014-7923+7926 Log Message: add patch for CVE-2014-7923 and CVE-2014-7926 found at https://chromium.googlesource.com/chromium/deps/icu52/+/6242e2fbb36f486f2c0addd1c3cef67fc4ed33fb
add patch for CVE-2014-7923 and CVE-2014-7926 found at https://chromium.googlesource.com/chromium/deps/icu52/+/6242e2fbb36f486f2c0addd1c3cef67fc4ed33fb
Fix compilation on Mac OS 10.4. From Sevan Janiyan in PR pkg/49077.
ICU 54 is a major release of ICU, with new features, new APIs and many bug fixes in data and code. It supports the latest versions of the Unicode locale data (CLDR 26, September 2014) and Unicode Standard (Unicode 7.0, June 2014). The improvements include 72 new measurement units, Unihan radical-stroke collation moved into root, new RBNF PluralFormat syntax, dictionary-based word and line break for Burmese, support for short locale display names, compatibility support for IANA time zone data abbreviations, a tech preview of FilteredBreakIterator using ULI break data, ICU4C thread safety fixes, and the ability to build ICU4C Paragraph Layout with HarfBuzz.
Fix SCO OpenServer 5.0.7/3.2 build. Add configuration for SCOOSR5.
Remove hard-coded RPATH flags from patch and use PKGCONFIG_OVERRIDE instead. Fixes unwanted linker flags for platforms missing rpath support.
Use the Cygwin packaging approach instead of a large number of patches that affect other platforms.
Fix OpenBSD 5.5 build * OpenBSD 5.5 has /usr/include/sys/atomic.h, but it is different from NetBSD's one
Changes 53.1: Data from the CLDR 25 release: Many bug fixes Time zone data: 2014b, including post CLDR 25 time zone data update to CLDR. U+20BD Ruble Sign added (from Unicode 7.0, otherwise ICU 53 still uses Unicode 6.3) MeasureFormat API for new units in CLDR 24 Hoisted setContext/getContext from SimpleDateFormat to DateFormat, implement context-sensitive capitalization of relative dates Added setContext/getContext methods to NumberFormat (and unum_setContext/unum_getContext for UNumberFormat), implement context-sensitive number formatting (for RBNF spellout) Improved lenient date parsing consistency between ICU4C and ICU4J, add finer-grained control of date parsing leniency Fixed numeric rounding in TimeUnitFormat Fixes to Unicode 6.3 bidirectional algorithm implementations to behave exactly like reference implementations Improved UTF-16 charset detection Collation code re-implemented Many bugs fixed, some enhancements implemented (link for ticket query) Passes full UCA conformance tests now Updated to UCA 6.3/CLDR 24 root collation Performance: C++ UTF-8 and Java string comparisons significantly faster (very small reduction for C++ UTF-16) Collation data size (uncompressed) reduced from 4.48MB (ICU 52) to 2.62MB New data format, removed empty files, fixed genrb bug More APIs function when collation rule strings have been omitted from the data files (e.g., getTailoredSet()) Java Collator.compare(Object, Object) now works with CharSequence, not just String Java Collator base class (does not apply to RuleBasedCollator instances): getters for strength, decomposition mode, and locales return hardcoded default values; their setters do nothing Rule syntax and semantics tightened and improved, matching LDML 25 Collation Rule Syntax In particular, rule chains now must start with a reset. 
Setting of variableTop deprecated, and not supported in rule syntax any more Replaced by the new maxVariable setting; see LDML 25 Collation Settings Accounting format supported in NumberFormat RelativeDateTimeFormatter class for formatting relative times such as "3 weeks ago" or "next Tuesday." Updated Spoof Checker for Unicode Security Standard version 6.3.
Add NetBSD MI atomic_ops support. Based on PR pkg/48608 by Izumi Tsutsui.
Pullup ticket #4267 - requested by taca textproc/icu: security patch Revisions pulled up: - textproc/icu/Makefile patch - textproc/icu/distinfo patch - textproc/icu/patches/patch-i18n_csrucode.cpp patch --- Apply patch to fix the security vulnerability reported in CVE-2013-2924.
Fix Solaris build for icu: general problems with CFLAGS/CXXFLAGS, ranging from typos in configure and acinclude.m4 to needing to add the flags to properly generate dependency files with gcc.
Fix MirBSD build by adding a <sys/time.h> include.
Fix build on ARM platform.
Changes 52.1: Unicode 6.3: New bidi control codes, new Bidi_Class property values, two new bidi "bracket" properties; for other property value changes see the UAX 44 summary. The bidi algorithm implementation has also been updated to support the new properties and to match the updated algorithm in the Unicode 6.3 version of UAX 9. Note: ICU 52 still uses collation root data based on Unicode Collation Algorithm 6.2 (UCA 6.2). (However, ICU 52 does use CLDR 24 collation tailoring data.) CLDR 24: Improved coverage for top 70+ languages, fractional plural rules and forms, many new measurement units, major simplification of collation rule syntax, preliminary version of European Ordering Rules, new relative fields such as “last Sunday” and “now”, and much more. Time zone data: 2013g. Support new variants of Islamic calendar: "islamic-umalqura": Umm al-Qura. "islamic-tbla": Tabular (fixed intercalary years), with astronomical epoch. Made Calendar getDayOfWeekType behave as documented. New API for converting between Windows time zone ID and IANA tz database ID. Technology Preview: New API for more granular control of DateFormat parse leniency. DateTimePatternGenerator: Support recently-added time zone pattern characters O, X, x and updated support for V, Z. Support newly-defined skeleton character ‘J’ to generate preferred hour cycle without any day period indicator (such as AM/PM for h). Implement support for plurals that depend on displayed fractional values. MessageFormat and currency formatting etc. select appropriate plural forms for values with decimal digits (after the decimal point). Segmentation: Add dictionary-based word & line break for Lao.
Fix build on OpenBSD.
Fix remaining build problem under Mac OS X: As "DSO_LIBDIR" is now always set (and must be always set because of all the changes that refer to it) we cannot use it to check for the Cygwin case anymore. Instead check whether "OPSYS" is (not) equal to "Cygwin".
Make sure that the target directory for shared libraries is always defined. This fixes one of the build problems under NetBSD and Mac OS X introduced by the Cygwin patches.
Correctly install DLLs for Cygwin. The dynamic data library build is still a bit fishy, but this gets it to a mostly usable state at least.
Changes 51.2: Bug fixes: * fix for enumset.h not being installed on Windows * zOS pkgdata fix * Test fixes * Region enumeration fix * make stable sort faster * host failures for DateFormatTest * LayoutEngine security patches (see above) * ubrk fix for word_POSIX infinite loop * fix memory leak/crash in LayoutEngine * fix header guard typo in layout/TibetanReordering.h
Changes 51.1:

Common Changes
==============
* CLDR 23: Collation tailorings put native script first; non-Gregorian calendar formats are more consistent; much improved data for Armenian (hy), Georgian (ka), Mongolian (mn), and Welsh (cy); …
* Time zone data: 2013b
* Date format/parse now supports CLDR short weekday names ("EEEEEE", "cccccc").
* Support DisplayContext for date formatting, locale display names.
* DateTimePatternGenerator behavior is now much more consistent between C and J.
* Support new timezone pattern characters in LDML spec: X+, x+, O, OOOO, V, VV, VVV.
* Updated SpoofChecker for v5 of UTS39.
* AlphabeticIndex enhancements:
  * New thread-safe ImmutableIndex sub-API
  * Build an index for a custom Collator.
  * Make data-driven for Chinese collations.
* New API for CLDR script metadata.

ICU4C Specific Changes
======================
* Support for “dangi” Korean luni-solar calendar (already in ICU4J).
* Add CompactDecimalFormat (already in ICU4J).
* Add TerritoryContainment APIs (already in ICU4J).
* UnicodeString default constructor and destructor now inline.
* Layout engine now supports 'morx' tables.
* Fixed some ICU 50 regressions:
  * Affixes set with e.g. DecimalFormat::setPositivePrefix were ignored for parse.
  * UNUM_PARSE_INT_ONLY no longer handled grouping separator.
* Add ucal_getTimeZoneID.
* The C++ AlphabeticIndex implementation is now on par with Java, including full support for all Chinese collation tailorings.
* U8_NEXT() and similar low-level macros now support NUL-terminated UTF-8 strings.
* New macros like U8_NEXT_OR_FFFD() return U+FFFD for an ill-formed sequence.
* Conversion: New "good one-way" mapping type, for example for Variation Selector sequences.
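The U8_NEXT_OR_FFFD() behaviour mentioned above (yield U+FFFD instead of an error for an ill-formed sequence) can be sketched in Python as follows. This is an illustration of the idea only, not a port of the ICU macro: the helper names are hypothetical, and advancing by a single byte after an ill-formed sequence is a simplification that may differ from how ICU skips bytes.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER

def next_or_fffd(data: bytes, i: int):
    """Return (code_point, next_index) for the UTF-8 sequence starting at data[i]."""
    lead = data[i]
    if lead < 0x80:
        length = 1
    elif 0xC2 <= lead <= 0xDF:
        length = 2
    elif 0xE0 <= lead <= 0xEF:
        length = 3
    elif 0xF0 <= lead <= 0xF4:
        length = 4
    else:
        # Stray trail byte or a byte that can never start a sequence.
        return REPLACEMENT, i + 1
    chunk = data[i:i + length]
    try:
        # Strict decoding rejects truncated, overlong and surrogate sequences.
        return ord(chunk.decode("utf-8")), i + length
    except UnicodeDecodeError:
        return REPLACEMENT, i + 1  # simplification: skip one byte only

def decode_all(data: bytes):
    """Decode a whole byte string into a list of code points."""
    out, i = [], 0
    while i < len(data):
        cp, i = next_or_fffd(data, i)
        out.append(cp)
    return out
```

For example, decode_all(b"a\xffb") yields the code points for "a", U+FFFD and "b", so a caller can keep iterating without a separate error path.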
Changes 50.1.2: This is a maintenance release affecting only the Layout Engine ABI. It only incorporates bug 9826 which fixes a regression in ICU4C 50.1.1.
Changes 50.1.1: * 9306 Layout Engine changes for harfbuzz integration * 9677 Affixes set with e.g. DecimalFormat::setPositivePrefix now ignored for parse * 9714 OS/400 test failures * 9728 Fail building icu4c with mingw-w64 * 9737 Locale::GetDefault() in locid.cpp is not thread-safe * 9771 Updated Currency from/to data (CLDR 5470) * 9748 Visual Studio 2010/2012 issues * 9780 UNUM_PARSE_INT_ONLY no longer handles grouping sep * 9783 New Turkish Lira symbol * 9789 Date format parsing problem with new CLDR data * 9793 Currency data integration issue with CLDR 5470 changes * 9801 UCONFIG_NO_CONVERSION test failure * 9802 No data test failure
Changes 50.1: * Unicode 6.2: Turkish Lira Sign, improved word & line segmentation (BreakIterator) for symbols * CLDR 22.1: Data coverage & quality improved across all major languages; new short width type for weekday names; new zhuyin (Bopomofo) collation for Chinese; improved data for CompactDecimalFormat & RBNF * Time zone data: 2012h * Ordinal-number support in MessageFormat & PluralRules * Deprecate setLocale(locale) in PluralFormat * Dictionary-based break iterators (word segmentation): * Support Chinese & Japanese, use more compact dictionary format, port all but Khmer support to Java * Update Khmer dictionary * Change Java util.ListFormat to text.ListFormatter and other updates, use CLDR data, port to C++ * Add updated IBM-eucJP and IBM-5233 converter * Improve number formatting performance * C++ GenderInfo: Effective combined gender of a list of people's genders (ported from Java) * Thread safety support cannot be removed (see the Readme) * Default compilers: Clang is now used if available (see the Readme) * C++ Collator API cleanup, subclassing-API-breaking changes (see the Readme) * Add option to genrb tool for writing java resource bundle files * Time zone format APIs
Add MirBSD support and unbreak build on MirOS. No-op on other platforms so no PKGREVISION bump.
Functions with empty args should be declared (void), not (). Fixes build of lang/parrot. PKGREVISION -> 1.
Changes 49.1.2: * 9242 ICU4C fails to parse pattern containing EEE properly whilst ICU4J parses it successfully * 9258 Number format performance * 9283 uregex_open fails for look-behind assertion + case-insensitive * 9284 Date format roundtrip test failure * 9295 HPPA endianness detection * 9313 Problem building ICU4C with Cygwin/MSVC * 9332 Linux s390 endianness detection * 9336 Problem building ICU4C 49.1.1 on zOS
Avoid C99/XOPEN_SOURCE confusion. Fixes build on SunOS.
On BSD use <sys/endian.h> to derive endianness, instead of always defaulting to little endian.
Changes 49.1.1: * Unicode 6.1: New scripts & blocks; changes to grapheme break & line break property values; some characters change from symbol to Po or No; etc. * CLDR 21.0.1: Changes in segmentation data to match Unicode 6.1; new structures for support of Chinese calendar, for context-dependent capitalization, for gender of lists of people, for ordinal categories, and for multiple number systems per locale; deprecation of "commonlyUsed" element in timezone names; removal of "whole-locale" aliases; major cleanups of timezone names, delimiter data, abbreviated number data. * Normalizer2 API additions * Easier-to-use getInstance() variants; e.g., getNFDInstance() * Getter for the combining-class value for a code point * Getter for the raw Decomposition_Mapping * Pairwise composition * TimeZone class: (C++) Getter for unknown time zone, (Java) fields for GMT & unknown zone * Support for deprecation of the "commonlyUsed" element for CLDR metazones * DateTimePatternGenerator can now use separate patterns for skeletons that differ only in MMM vs MMMM or EEE vs EEEE, etc. * Support for custom DecimalFormatSymbols in RuleBasedNumberFormat * Format and parse Chinese calendar dates including support for intercalary months * Context Transforms for context-dependent capitalization behavior * APIs for TimeZoneNames and TimeZoneFormat * Support for new date format pattern "ZZZZZ" for ISO 8601 zone format * Options for ambiguous local time resolution in Calendar * Support for ISO 4217 numeric currency code
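The three Normalizer2 getters listed above (combining class, raw Decomposition_Mapping, pairwise composition) correspond to Unicode properties that Python's standard unicodedata module also exposes. The sketch below uses that module to show what each one returns; the wrapper names are hypothetical, none of this is the ICU API, and compose_pair approximates pairwise composition by running NFC over the two-character string.

```python
import unicodedata

ACUTE = "\u0301"    # COMBINING ACUTE ACCENT
E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

def combining_class(ch):
    """Canonical combining class of a single character (0 for starters)."""
    return unicodedata.combining(ch)

def raw_decomposition(ch):
    """Code points of the character's Decomposition_Mapping, without any <tag>."""
    mapping = unicodedata.decomposition(ch)
    return [int(part, 16) for part in mapping.split() if not part.startswith("<")]

def compose_pair(a, b):
    """Composite of a + b, or None if NFC does not merge them into one character."""
    composed = unicodedata.normalize("NFC", a + b)
    return composed if len(composed) == 1 else None

# combining_class(ACUTE)      is 230
# raw_decomposition(E_ACUTE)  is [0x65, 0x301]
# compose_pair("e", ACUTE)    is "\u00e9"
```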
Fix pkg/43187 * Add ECHO_C to end of version string. This removes an unnecessary newline. Thank you, obache@.
Fix pkg-config files to include library paths. Bump PKGREVISION. Generate PKGNAME from DISTNAME while here.
add patch from upstream Ticket #8984 to fix possible out-of-bounds array access, bump PKGREV
Changes 4.8.1: This is a maintenance release of ICU 4.8. No new APIs were added.
Changes 4.8: * CLDR 2.0: The CLDR 2.0 release contains numerous improvements and bug fixes approved by the CLDR committee, including much additional data for many languages. * Explicit parent locale support in data imported from CLDR. * MessageFormat and related classes (choice/plural/select) have been reimplemented, with several improvements and some incompatible changes. * Extended PluralFormat pattern syntax supports explicit-value forms and offsets. * Utility APIs in PluralRules (get some/all/unique keyword values) * Time zone API to return a list of available canonical system time zone IDs. * Time zone API to return a region. * Collation: Full implementation & public API for script reordering * Dictionary-type trie * GB18030-2005 update
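The extended PluralFormat syntax mentioned above combines explicit-value forms such as "=0" with an offset. A minimal Python sketch of that selection order follows: explicit values are matched against the raw number first, otherwise the plural keyword is chosen for the number minus the offset, and "#" is replaced by the number minus the offset. The function names and the simplified English-only rule are assumptions for illustration; ICU takes its plural rules from CLDR data.

```python
def english_plural_keyword(n):
    """Simplified English cardinal rule: 'one' for exactly 1, otherwise 'other'."""
    return "one" if n == 1 else "other"

def select_plural(n, forms, offset=0):
    """Pick a sub-message for n from forms, honouring explicit values and the offset."""
    explicit = f"={n}"
    if explicit in forms:
        template = forms[explicit]  # explicit values match the raw number
    else:
        keyword = english_plural_keyword(n - offset)
        template = forms.get(keyword, forms["other"])
    return template.replace("#", str(n - offset))

guests = {
    "=0": "Nobody is coming.",
    "=1": "Alice is coming.",
    "one": "Alice and one other person are coming.",
    "other": "Alice and # others are coming.",
}
# select_plural(5, guests, offset=1) gives "Alice and 4 others are coming."
```

The offset lets the message name one participant explicitly and count only the remaining ones.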
Changes 4.6.1: * Common Locale Data Repository (CLDR) 1.9.1 * Update timezone data support to Olson 2011c * 8271 UCOL_RUNTIME_VERSION should be updated for 4.6 * 8277 Collation Reordering Use Of USCRIPT_UNKNOWN * 8290 Can't find Hangul with search coll (usearch doesn't handle CE iter behavior) * 8303 ULocale#toLanguageTag() should not supply "und" as language when the locale has only private use * 8341 USpoof uses NFKD, should be NFD
Changes 4.6: CLDR 1.9, Unicode 6.0, UTS #46 support, collation enhancements, alternate number symbols
update to 4.2.1 major changes: Locale Data: ICU uses and supports data from Common Locale Data Repository (CLDR) 1.7, which includes data for 146 languages, 159 territories, 468 locales, 21% more locale data than the previous release. Number system support and the number keyword. Number system override in DateFormat Numerics used by Hebrew Calendar date in Hebrew locale BCP47 (language tag) / Locale transformation BCP47 mapping of LDML keywords Encoding selector: Return a list of charsets that can handle the input text Simple duration: Implementation of CLDR duration format Available/Preferred keywords for a locale (Calendar, Collation, and Currency) StringPrep standard profiles: RFC3491 NAMEPREP, RFC3530 NFS4, RFC3722 iSCSI, RFC3920 NodePrep/ResourcePrep, RFC4011 MIB, RFC4013 SASLprep, RFC4505 trace and RFC4518 LDAPprep Miscellaneous Arabic shaping enhancements UTF-8 friendly internal data structure for Unicode data lookup API to get CLDR version used by ICU ISCII charset converter updates (added Gurumukhi, other updates) Performance improvements in Time Zone Name format/parse, and in DateIntervalFormat construction
Update from version 3.6nb2 to 4.0.1. Pkgsrc changes: o New MASTER_SITE o Adjust PLIST o Remove no-longer-needed patches, since corresponding changes have been adopted upstream o BUILDLINK_ABI_DEPENDS bumped to >=4.0, since a new shared library version is installed o Fixes security vulnerability, ref. below. Dependent pkgsrc packages will have their revisions bumped shortly due to the (possibly/probably) changed ABI. Upstream changes: 4.0.1: ICU4C 4.0.1 is a maintenance release of ICU4C 4.0. The primary changes of this release were: * Updated time zone data to 2008i * Technical preview of string search implementation using Boyer-Moore algorithm (#6286). For detailed information, please see the tech note here. * #5691 Conversion: consistent illegal sequences * #6435 Bad @stable ICU4.0 tags * #6597 TestDisplayNamesMeta failure * #6670 Test failure in format/TimeZoneTest/TestShortZoneIDs 4.0: Major changes in ICU 4.0 include the following: * Common Changes o Unicode 5.1 (#5696) o Locale Data: ICU uses and supports data from Common Locale Data Repository (CLDR) 1.6, which includes many improvements in quality and quantity of data. o add/removeLikelySubtags (#6124) o Charset converter file size improvement (#5987) o Date Interval Formatting (#6157) Note: Calendar type supported by this feature is Gregorian only in this release. 
o Improved Plural support * ICU4C Specific Changes Additional Calendars + Chinese (#4081) + Coptic/Ethiopic (#4571) * ICU4J Specific Changes o Charset + Graduated from Technology Preview status + ICU2022 Converter (#5791) + HZ Converter (#6128) + SCSU/BOCU-1 Converter (#2147) + Charset Converter Callback (#6144) o Thai Dictionary break iterator (#5385) o JDK TimeZone support (#5975) o Locale Service Provider (#5976) o More convenient formatting of year+month, day+month, and other combinations (#6304) o Simple Duration Formatting (#6303) * ICU4C Security Fixes ICU4C 4.0 resolves the vulnerabilities CVE-2007-4770 and CVE-2007-4771 which were found in earlier versions of ICU. The standard ICU tests verify that these have been corrected, however, the updated versions of the previous tests may be run by applying the following patch to ICU 4.0: r24324. As well, ICU4C and ICU4J 4.0 resolve the issue underlying CVE-2008-1036.
fix RE vulnerabilities (CVE-2007-(4770|4771)), patch from redhat via Gentoo bug #208001, bump PKGREVISION
update to ICU 3.6

Major changes in ICU 3.6 include the following:

- Unicode: ICU uses and supports Unicode 5.0, which is the latest major release of Unicode. Unicode 5.0 will be used in many operating systems and applications, and this version of ICU is important to maintain interoperability with these new operating systems and applications. More information about Unicode 5.0 can be found in the Unicode press release.
- Locale Data: ICU uses and supports data from Common Locale Data Repository (CLDR) 1.4, which includes many improvements in quality and quantity of data. There is 25% more CLDR locale data in 245 locales in ICU.
- ICU4C Specific Changes
  - Charset Detection: A charset detection framework was added, which provides heuristics for detecting the charset for unlabeled sequences of bytes.
  - Layout: The font layout engine has support added for Tibetan, Sinhala and Old Hangul.
  - BiDi: The BiDi algorithm was enhanced to be more flexible and efficient.
  - ICU Data Management: The new icupkg tool provides an easier way to manage ICU's data library. This tool allows you to add, update or remove data from ICU's data archive.
  - Time Zones: The time zone data is modularized to allow easier building and updating of the data.
  - Word Boundaries: The Thai word break iteration was improved to be more accurate. Also dictionary based detection of Thai word boundaries is now active for all locales.
  - UText
    - The BreakIterator uses UText for abstract text processing.
    - 64-bit indexing is now used to allow access to larger chunks of text.
    - API for read-only locking for security and robustness was added.
  - Performance
    - The u_sprintf/u_sscanf performance from the icuio library has been improved for number formatting/parsing.
    - Constructing a DateFormat is significantly faster for many locales.
    - Opening and closing a charset converter is significantly faster.
    - The UTF-8 transformation functions and macros are faster.
    - The UText API was improved for performance.
    - The collation open and close functions have a small performance improvement.
unlimit datasize, to make it build on amd64/3.0. While here, update to 3.4.1. changes: -Updated timezone data -Improved portability -Improved default codepage and default locale detection. -A number of collation bug fixes.
Update to 3.4: New Features: Major changes in ICU 3.4 include the following: Updates to conform to Unicode 4.1, including new characters properties and values, text segmentation, plus collation updated for Unicode Technical Standard #10 (UCA) and regex updated for Unicode Technical Standard #18. * Updates to conform to the Common Locale Data Repository (CLDR), Version 1.3 for the latest locale data. This includes: * New data to support localization of timezones, United Nations M.49 regions (including continents and regions), mappings from language to script and territory. * Consolidation of inherited data and improved resource aliasing for smaller data footprint * Additional locales, and many other fixes and additions of locale data. * POSIX migration support: direct API support for all POSIX character classes, implemented according to Unicode recommendations
Add DragonFly support.
ICU 3.2 includes the latest bug fixes, locale/charset updates, and performance/build/porting enhancements. The following list summarizes the main new features in this release. CLDR 1.2. This is the main new feature in the release. ICU locale data is now completely built from the CLDR 1.2 data, which contains data for 232 locales, covering 72 languages and 108 territories. Many translated names for languages, territories, and scripts have been added, as well as for time zones, calendars, and other named items such as collation. For more information, see http://www.unicode.org/press/pr-cldr1.2.html. Miscellaneous Universal Timescale conversions. ICU now provides mechanisms for quickly and reliably converting between the different binary representations of date/time used on different platforms. Accept-Language. ICU provides a mechanism for matching Accept-Language against a list of locales. DateFormat and Calendar Performance. Object construction performance has been significantly improved. Footprint. The size of executables that statically link to ICU has been reduced. Stdin. The icuio library can now read from stdin. UnicodeSet C API. More uset_* C APIs were added. i5/OS (os/400). Building ICU has been simplified to allow more configure options to work. POSIX. Default codepage determination has been fixed.
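To make the "Universal Timescale conversions" item concrete, here is a small Python sketch converting between two common platform representations: Unix time (seconds since 1970-01-01) and Windows FILETIME (100-nanosecond ticks since 1601-01-01). These two scales are chosen as an example; the function names are hypothetical and ICU's own conversion API is not shown.

```python
from datetime import datetime

TICKS_PER_SECOND = 10_000_000  # FILETIME counts 100-nanosecond ticks

# Seconds between the FILETIME epoch (1601-01-01) and the Unix epoch (1970-01-01).
EPOCH_GAP_SECONDS = int((datetime(1970, 1, 1) - datetime(1601, 1, 1)).total_seconds())

def unix_to_filetime(unix_seconds: int) -> int:
    """Convert whole Unix seconds to FILETIME ticks."""
    return (unix_seconds + EPOCH_GAP_SECONDS) * TICKS_PER_SECOND

def filetime_to_unix(filetime_ticks: int) -> int:
    """Convert FILETIME ticks to whole Unix seconds (sub-second ticks are dropped)."""
    return filetime_ticks // TICKS_PER_SECOND - EPOCH_GAP_SECONDS
```

Doing the arithmetic in integers avoids the rounding problems that floating-point timestamps introduce at 100-nanosecond resolution.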
Add RMD160 digests to the SHA1 ones.
update to icu-3.0 major changes: ICU 3.0 includes the latest bug fixes, locale/charset updates, and performance/build/porting enhancements. - Collation Collation data is in a separate data tree, allowing for easier modularization and maintenance. getFunctionalEquivalent API allows for better caching and UI support. - Unicode 4.0.1 ICU is updated to the latest version of Unicode standard, which had significant property changes. - CLDR 1.1 Updates to CLDR 1.1, with many updates to locale data, and special emphasis on collation data. - Formatting As an aid to migration of traditional C (stdio) and C++ (iostream) formatting, the POSIX-like input/output library, icuio, is officially supported. Significant digits now supported in DecimalFormat, for general use and %g support. - RFC822 time zone format support in DateFormat for compatibility. - Currency formatting/parsing improvements Allows parsing multiple currencies with one formatter, without knowing the currency in advance. Much cleaner design allowing extensibility to other measurement units in the future. - Regular expressions (C) The regular expressions framework now features a C API, instead of just C++. - Locales Locale canonicalization spec defined and implemented. Provides interoperability with POSIX and .NET locale IDs, more RFC 3066 support. - Layout engine Layout engine now supports using different canonically-equivalent Unicode forms of the same text: e.g. a + ´ or á. This is especially important for non-Latin scripts. - Build Environment ICU can now build its data library much faster on most platforms. For a complete list see: http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html?tag=release-3-0
update to 2.8 Lots of changes and fixes. For example: # Number Formatting ICU4C adds support for formatting and parsing of 64-bit integers. # Text Analysis (Break Iterators) Full conformance with Unicode Consortium UAX 29 and UAX 14 definitions for text boundary positions. Significantly improved performance for reverse direction iteration and isBoundary tests of arbitrary string positions. # StringPrep ICU 2.8 adds APIs and a tool for generic support of StringPrep profiles such as those used in NFS 4. For a complete list see: http://oss.software.ibm.com/icu/download/2.8/index.html
update to 2.6.1 Lots of changes/fixes, eg. Unicode 4.0 support. See http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html?rev=1.141.2.1#News for details. ok'ed by wiz@
Update textproc/icu to 2.6. This is a major reference release with new features and new and modified APIs from version 2.4: * Added support for Unicode 4.0 * Added support for Unicode regular expressions * Enhanced sorting * Added support for international domain names * Added service registration for pluggable ICU modules * Added layout engine API for language-specific glyphs * Separated currencies from locales * Added POSIX-like API for message catalogs * Added new charset converters
Make this compile by a non-root user. Fix confusion on NetBSD PECOFF environment
Update to version 2.4. Based on a PR pkg/20825 by Hiramatsu Yoshifumi, modified by me. - follow PKG_SYSCONFDIR List of major changes for this release: * Regular Expressions Phase 1 ICU 2.4 introduces a Regular Expression C++ API that is modeled after the JDK 1.4 API. ICU 2.4's Regular Expression API supports Unicode level 1 regular expressions (see Unicode Regular Expression Guidelines) but not all pattern metacharacters and features are supported yet. Regular expressions leverage all of the UnicodeSet support, including all Unicode 3.2 property names and property value names. Future ICU releases will complete the pattern support, add support for higher Unicode regex levels, and improve performance. For more details see the API References and the User Guide. * Modularized ICU library building ICU 2.4 provides build-time switches to prune parts of the library code, for smaller custom distributions. For details see the readme file. * Character set alias management support Additional APIs map alias+standard to a unique charset name (e.g., "Shift-JIS"+"IANA"->"ibm-943_P14A-2000") and enumerate all charset names in the alias table, not just the installed ones. See convrtrs.txt and ucnv.h. These APIs allow programmers to avoid data corruption problems when different platforms use the same names for different character conversion mappings. * EBCDIC-z/OS converter option The EBCDIC converter now handles swapped LF/NL mappings algorithmically instead of with modified .ucm/.cnv conversion table files. This makes this behavior available for all supported EBCDIC conversions without adding to the data package size. See "swaplfnl" in convrtrs.txt. * Additional converter A new converter implementation has been added for the encoding of IMAP mailbox names. See RFC 2060/5.1.3. Mailbox International Naming Convention and "IMAP-mailbox-name" in convrtrs.txt. * Customizable break iteration ICU 2.4 allows registration of a BreakIterator with a locale ID. 
This allows applications to provide more sophisticated word/sentence break engines and use them seamlessly with the ICU APIs. In future releases, this registration mechanism will be extended to all relevant ICU services. If you are interested in ICU customization, please try out this feature. * Collation performance ICU 2.4 collation was improved in several areas, with an emphasis on performance: * Latin-1: Improved performance of u_strcoll(). * Russian/Cyrillic: Improved performance by tailoring collation for cyrillic-script languages, removing UCA contractions that are not used for modern Russian (this uses the [suppressContractions] tailoring option). * Korean: Improved performance by resolving collation elements for modern Hangul syllables at build time (this uses the [optimize] tailoring option). * Japanese: The default strength for Japanese was reduced from quaternary to tertiary as in all other locales. * UnicodeSet performance UnicodeSet performance is significantly improved, especially for add(codePoint) and contains(codePoint). * Unicode property aliases ICU 2.4 introduces APIs for mapping between all appropriate Unicode property aliases and property value aliases and ICU property enumeration constants. See u_getPropertyName() etc. in uchar.h. * Unicode string functions * There are new C functions for searching for last occurrences of characters and partial strings. See u_strrstr(), u_strrchr32() etc. * New C/C++/Java functions for efficient checking if a string contains more than a certain number of code points. See hasMoreChar32Than(). * Copying UnicodeStrings via the standard assignment operator and copy constructor does not preserve readonly aliasing any more because this can sometimes have unexpected and dangerous effects. A new fastCopyFrom() member function provides the old copy semantics. See Jitterbug 1794 for more details. 
* UTF macros simplified The low-level C macros for handling code points in 8-bit and 16-bit Unicode strings have been replaced by a simpler, more consistent set with more concise names. For details see utf_old.h and utf.h. Similarly, ICU 2.4 defines the UChar32 consistently (now always as int32_t) and adds a U_SENTINEL non-code point value for new APIs. * Performance tests ICU 2.4 has a new performance test framework and additional performance tests using this framework. This is not currently documented, but it is available as part of the source distribution at source/test/perf/.
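The hasMoreChar32Than() check mentioned above answers "does this string contain more than N code points?" without necessarily counting all of them. The Python sketch below shows one way such a check can work on UTF-16 code units: two length-based shortcuts first, then a count that stops early. It is an illustration under those assumptions, with hypothetical helper names, and not the ICU implementation.

```python
def utf16_units(text: str):
    """Return the UTF-16 code units of a Python string as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

def has_more_char32_than(units, number: int) -> bool:
    """True if the UTF-16 sequence holds more than `number` code points."""
    if len(units) <= number:
        return False  # even one code point per unit is not enough
    if len(units) > 2 * number:
        return True   # a code point uses at most two units
    count, i = 0, 0
    while i < len(units):
        lead = units[i]
        is_pair = (0xD800 <= lead <= 0xDBFF and i + 1 < len(units)
                   and 0xDC00 <= units[i + 1] <= 0xDFFF)
        i += 2 if is_pair else 1  # a surrogate pair is one code point
        count += 1
        if count > number:
            return True  # stop as soon as the answer is known
    return False
```

For the four code units of "a", U+1F600 and "b" (three code points), the check returns True for a limit of 2 and False for a limit of 3.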
Move to sha1 digests, and add distfile sizes.
+ move the distfile digest/checksum value from files/md5 to distinfo + move the patch digest/checksum values from files/patch-sum to distinfo