The NetBSD Project

CVS log for src/lib/libedit/chartype.h


Default branch: MAIN


Revision 1.34 / (download) - annotate - [select for diffs], Mon May 9 21:46:56 2016 UTC (6 months, 3 weeks ago) by christos
Branch: MAIN
CVS Tags: pgoyette-localcount-base, pgoyette-localcount-20161104, pgoyette-localcount-20160806, pgoyette-localcount-20160726, pgoyette-localcount, localcount-20160914, HEAD
Changes since 1.33: +8 -8 lines

s/protected/libedit_private/g

Revision 1.33 / (download) - annotate - [select for diffs], Mon May 2 16:48:34 2016 UTC (7 months ago) by christos
Branch: MAIN
Changes since 1.32: +2 -2 lines

eliminate static buffer with custom resizing code.

Revision 1.32 / (download) - annotate - [select for diffs], Mon May 2 16:35:17 2016 UTC (7 months ago) by christos
Branch: MAIN
Changes since 1.31: +3 -3 lines

fix typos from Pedro Giffuni @FreeBSD

Revision 1.31 / (download) - annotate - [select for diffs], Mon Apr 11 18:56:31 2016 UTC (7 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.30: +3 -3 lines

Get rid of private/public; keep protected (Ingo Schwarze)

Revision 1.30 / (download) - annotate - [select for diffs], Mon Apr 11 16:06:52 2016 UTC (7 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.29: +3 -18 lines

chartype cleanups from Ingo Schwarze:

 - The file tokenizer.c no longer uses chartype.h,
   so don't include the header.

 - The dummy definitions of ct_{de,en}code_string() for the
   NARROWCHAR case are only used in history.c, so move them there.

 - Now the whole content of chartype.h is for the wide character
   case only.  So remove the NARROWCHAR ifdef and include the
   header only in the wide character case.

 - In chartype.h, move ct_encode_char() below the comment explaining it.

 - No more need for underscores before ct_{de,en}code_string().

 - Make the conversion buffer resize functions private.
   They are only called from the decoding and encoding functions
   inside chartype.c, and no need can possibly arise to call them
   from anywhere else.

Revision 1.29 / (download) - annotate - [select for diffs], Mon Apr 11 00:50:13 2016 UTC (7 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.28: +13 -20 lines

Char -> wchar_t from Ingo Schwarze.

Revision 1.28 / (download) - annotate - [select for diffs], Mon Apr 11 00:22:48 2016 UTC (7 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.27: +1 -28 lines

more macro WIDECHAR undoing from Ingo Schwarze.

Revision 1.27 / (download) - annotate - [select for diffs], Sat Apr 9 18:43:17 2016 UTC (7 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.26: +1 -43 lines

More WIDECHAR elimination (Ingo Schwarze)

Revision 1.26 / (download) - annotate - [select for diffs], Wed Mar 23 22:27:48 2016 UTC (8 months, 1 week ago) by christos
Branch: MAIN
Changes since 1.25: +3 -42 lines

Start removing the WIDECHAR ifdefs; building without it has stopped working
anyway. (Ingo Schwarze)

Revision 1.25 / (download) - annotate - [select for diffs], Mon Mar 7 00:05:20 2016 UTC (8 months, 4 weeks ago) by christos
Branch: MAIN
Changes since 1.24: +1 -8 lines

Remove advertising clause.

Revision 1.24 / (download) - annotate - [select for diffs], Wed Mar 2 19:24:20 2016 UTC (9 months ago) by christos
Branch: MAIN
Changes since 1.23: +3 -1 lines

PR/50880: David Binderman: Remove redundant code.
While here, fix all debugging formats.

Revision 1.23 / (download) - annotate - [select for diffs], Wed Feb 24 17:20:01 2016 UTC (9 months, 1 week ago) by christos
Branch: MAIN
Changes since 1.22: +3 -4 lines

Tuck in mbstate_t to the wide char version only to avoid exposing the zeroing
hack and doing it in the narrow case.

Revision 1.22 / (download) - annotate - [select for diffs], Wed Feb 24 17:13:22 2016 UTC (9 months, 1 week ago) by christos
Branch: MAIN
Changes since 1.21: +2 -2 lines

Make the read_char function always take a wchar_t * argument (Ingo Schwarze)

Revision 1.21 / (download) - annotate - [select for diffs], Wed Feb 17 19:47:49 2016 UTC (9 months, 2 weeks ago) by christos
Branch: MAIN
Changes since 1.20: +6 -6 lines

whitespace and header sorting changes (Ingo Schwarze). No functional changes.

Revision 1.20 / (download) - annotate - [select for diffs], Sun Feb 14 17:06:24 2016 UTC (9 months, 2 weeks ago) by christos
Branch: MAIN
Changes since 1.19: +3 -1 lines

From Ingo Schwarze:

el_getc() for the WIDECHAR case, that is, the version in eln.c.
For a UTF-8 locale, it is broken in four ways:

 1. If the character read is outside the ASCII range, the function
    does an undefined cast from wchar_t to char.  Even if wchar_t
    is internally represented as UCS-4, that is wrong and dangerous
    because characters beyond codepoint U+00FF get their high bits
    truncated, meaning that perfectly valid printable Unicode
    characters get mapped to arbitrary bytes, even the ASCII escape
    character for some Unicode characters.  But wchar_t need not
    be implemented in terms of UCS-4, so the outcome of this function
    is undefined for any and all input.

 2. If insufficient space is available for the result, the function
    fails to detect failure and returns garbage rather than -1 as
    specified in the documentation.

 3. The documentation says that errno will be set on failure, but
    that doesn't happen either in the above case.

 4. Even for ASCII characters, the results may be wrong if wchar_t
    is not using UCS-4.
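
The failure handling asked for in points 2 and 3 can be sketched as
follows.  This is an illustration only: narrow_from_wide() is a
hypothetical helper, not a libedit function, and the choice of ERANGE
is an assumption.  The sketch relies on the standard wctob(3), which
returns EOF when a wide character has no single-byte representation
in the current locale, so no cast from wchar_t to char is needed.

```c
#include <errno.h>
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libedit): store wc in *cp if it
 * can be represented as a single byte in the current locale.
 * Returns 1 on success.  On failure, stores '\0' in *cp, sets errno,
 * and returns -1 instead of truncating the value with a cast.
 */
static int
narrow_from_wide(wchar_t wc, char *cp)
{
	int byte = wctob((wint_t)wc);

	if (byte == EOF) {
		*cp = '\0';
		errno = ERANGE;	/* error code assumed for this sketch */
		return -1;
	}
	*cp = (char)byte;
	return 1;
}
```

A caller can then tell "no narrow representation" apart from a
successful read by checking for -1 and inspecting errno.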

Revision 1.19 / (download) - annotate - [select for diffs], Sun Feb 14 14:49:34 2016 UTC (9 months, 2 weeks ago) by christos
Branch: MAIN
Changes since 1.18: +1 -5 lines

From Ingo Schwarze:

As we have seen before, "histedit.h" can never get rid of including
the <wchar.h> header because using the data types defined there is
deeply ingrained in the public interfaces of libedit.

Now POSIX unconditionally requires that <wchar.h> defines the type
wint_t.  Consequently, it can be used unconditionally, no matter
whether WIDECHAR is active or not.  Consequently, the #define Int
is pointless.

Note that removing it is not gratuitous churn.  Auditing for
integer signedness problems is already hard when only fundamental
types like "int" and "unsigned" are involved.  It gets very hard
when types come into the picture that have platform-dependent
signedness, like "char" and "wint_t".  Adding yet another layer
on top, changing both the signedness and the width in a platform-
dependent way, makes auditing yet harder, which IMHO is really
dangerous.  Note that while removing the #define, i already found
one bug caused by this excessive complication - in the function
re_putc() in refresh.c.  If WIDECHAR was defined, it printed an
Int = wint_t value with %c.  Fortunately, that bug only affects
debugging, not production.  The fix is contained in the patch.

With WIDECHAR, this doesn't change anything.  For the case without
WIDECHAR, i checked that none of the places wants to store values
that might not fit in wint_t.

This only changes internal interfaces; public ones remain unchanged.
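
The kind of format mismatch mentioned above for re_putc() can be
illustrated with a small sketch.  format_wint() is a hypothetical
helper, not the code from refresh.c; it only shows that a wint_t
value is formatted with "%lc", whereas "%c" expects an int and may
not match, depending on how the platform defines wint_t.

```c
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical debug helper (not part of libedit): format a single
 * wint_t value into buf.  Returns what snprintf(3) returns.
 */
static int
format_wint(char *buf, size_t buflen, wint_t c)
{
	/* "%lc" takes a wint_t argument; "%c" would take an int. */
	return snprintf(buf, buflen, "%lc", c);
}
```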

Revision 1.18 / (download) - annotate - [select for diffs], Sun Feb 14 14:47:48 2016 UTC (9 months, 2 weeks ago) by christos
Branch: MAIN
Changes since 1.17: +2 -4 lines

From Ingo Schwarze:

Next step:  Remove #ifdef'ing in read_char(), in the same style
as we did for setlocale(3) in el.c.

A few remarks are required to explain the choices made.

 * At first sight, handling mbrtowc(3) seems a bit less trivial
   than handling setlocale(3) because its prototype uses the data
   type mbstate_t from <wchar.h>.  However, it turns out that
   "histedit.h" already includes <wchar.h> unconditionally (i don't
   like headers including other headers, but that ship has sailed,
   people are by now certainly used to the fact that including
   "histedit.h" doesn't require including <wchar.h> before), and
   "histedit.h" is of course included all over the place.  So from
   that perspective, there is no problem with using mbrtowc(3)
   unconditionally even for !WIDECHAR.

 * However, <wchar.h> also defines the mbrtowc(3) prototype,
   so we cannot just #define mbrtowc away, or including the header
   will break.  It would also be a bad idea to provide a local
   implementation of mbrtowc() and hope that it overrides the one
   in libc.  Besides, the required prototype is subtly different:
   While mbrtowc(3) takes "wchar_t *" as its first argument, we
   need a function that takes "Char *".  So unfortunately, we have
   to keep a ct_mbrtowc #define, at least until we can maybe get
   rid of "Char *" in the more remote future.

 * After getting rid of the #else clause in read_char(), we can
   pull "return 1;" into the default: clause.  After that, we can
   get rid of the ugly "goto again_lastbyte;" and just "break;".
   As a bonus, that also gets rid of the ugly CONSTCOND.

 * While here, delete the unused ct_mbtowc() from chartype.h.

Revision 1.17 / (download) - annotate - [select for diffs], Thu Feb 11 19:10:18 2016 UTC (9 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.16: +1 -3 lines

remove unused wrapper (Ingo Schwarze)

Revision 1.16 / (download) - annotate - [select for diffs], Mon Feb 8 17:18:43 2016 UTC (9 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.15: +3 -1 lines

UTF-8 fixes from Ingo Schwarze:

 1. Assume that errno is non-zero when entering read_char()
    and that read(2) returns 0 (indicating end of file).
    Then, the code will clear errno before returning.
    (Obviously, the statement "errno = 0" is almost always
     a bug unless there is save_errno = errno right before it
     and the previous value is properly restored later,
     in all reachable code paths.)

 2. When encountering an invalid byte sequence, the code discards
    all following bytes until MB_LEN_MAX overflows; consider, for
    example, 0xc2 immediately followed by a few valid ASCII bytes.
    Three of those ASCII bytes will be discarded.

 3. On a POSIX system, EILSEQ will always be set after reading a
    valid (yes, valid, not invalid!) UTF-8 character.  The reason
    is that mbtowc(3) will first be called with a length limit
    (third argument) of 1, which will fail, return -1, and - on
    a POSIX system - set errno to EILSEQ.
    This third bug is mitigated a bit because i couldn't find any
    system that actually conforms to POSIX in this respect:  None
    of OpenBSD, NetBSD, FreeBSD, Solaris 11, and glibc set errno
    when an incomplete character is passed to mbtowc(3), even though
    that is required by POSIX.
    Anyway, that mbtowc(3) bug will be fixed at least in OpenBSD
    after release unlock, so it would be good to fix this bug in
    libedit before fixing the bug in mbtowc(3).

How can these three bugs be fixed?

 1. As far as i understand it, the intention of the bogus errno = 0
    is to undo the effects of failing system calls in el_wset(),
    sig_set(), and read__fixio() if the subsequent read(2) indicates
    end of file.  So, restoring errno has to be moved right after
    read__fixio().  Of course, neither 0 nor e is the right value
    to restore: 0 is wrong if errno happened to be set on entry, e
    would be wrong because if one read(2) fails but a second attempt
    succeeds after read__fixio(), errno should not be touched.  So,
    the errno to be restored in this case has to be saved before
    calling read(2) for the first time.

 2. Solving the second issue requires distinguishing invalid and
    incomplete characters, but that is impossible with the function
    mbtowc(3) because it returns -1 in both cases and sets errno
    to EILSEQ in both cases (once properly implemented).

    It is vital that each input character is processed right away.
    It is not acceptable to wait for the next input character before
    processing the previous one because this is an interactive
    library, not a batch system.  Consequently, the only situation
    where it is acceptable to wait for the next byte without first
    processing the previous one(s) is when the previous one(s) form
    an incomplete sequence that can be continued to form a valid
    character.

    Consequently, short of reimplementing a full UTF-8 state machine
    by hand, the only correct way forward is to use mbrtowc(3).
    Even then, care is needed to always have the state object
    properly initialized before using it, and to not discard a valid
    ASCII or UTF-8 lead byte if it happens to follow an invalid
    sequence.

 3. Fortunately, solution 2. also solves issue 3. as a side effect,
    by no longer using mbtowc(3) in the first place.

Revision 1.15 / (download) - annotate - [select for diffs], Sun May 17 13:14:41 2015 UTC (18 months, 2 weeks ago) by christos
Branch: MAIN
Changes since 1.14: +2 -2 lines

add FreeBSD

Revision 1.14 / (download) - annotate - [select for diffs], Thu May 14 10:44:15 2015 UTC (18 months, 3 weeks ago) by christos
Branch: MAIN
Changes since 1.13: +3 -1 lines

fix warnings on ubuntu 32 bit (Miki Rozloznik)

Revision 1.10.18.2 / (download) - annotate - [select for diffs], Wed May 13 13:33:55 2015 UTC (18 months, 3 weeks ago) by martin
Branch: netbsd-7
CVS Tags: netbsd-7-nhusb-base, netbsd-7-nhusb, netbsd-7-0-RELEASE, netbsd-7-0-RC3, netbsd-7-0-RC2, netbsd-7-0-RC1, netbsd-7-0-2-RELEASE, netbsd-7-0-1-RELEASE, netbsd-7-0
Changes since 1.10.18.1: +2 -2 lines

Sync lib/libedit with head, requested by christos in #753:

	lib/libedit/Makefile 1.53
	lib/libedit/chartype.h 1.13
	lib/libedit/editline.3 1.83-1.84
	lib/libedit/editrc.5 1.28-1.29
	lib/libedit/eln.c 1.18
	lib/libedit/filecomplete.c 1.33-1.34
	lib/libedit/readline.c 1.112-1.115

Man page improvements, fix overlapping strcpy, improve readline
compatibility, clang build fix.

Revision 1.10.18.1 / (download) - annotate - [select for diffs], Tue Apr 14 05:30:24 2015 UTC (19 months, 3 weeks ago) by snj
Branch: netbsd-7
Changes since 1.10: +5 -3 lines

Pull up following revision(s) (requested by christos in ticket #679):
	lib/libedit/chartype.c: revisions 1.11, 1.12
	lib/libedit/chartype.h: revisions 1.12, 1.13
PR/49683: Amir Plivatsky: Off-by-one comparison in ct_decode_string() leading
to out of bounds reference.
--
split the allocation functions, their mixed usage was too confusing.

Revision 1.13 / (download) - annotate - [select for diffs], Sun Feb 22 02:16:19 2015 UTC (21 months, 1 week ago) by christos
Branch: MAIN
Changes since 1.12: +5 -3 lines

split the allocation functions, their mixed usage was too confusing.

Revision 1.12 / (download) - annotate - [select for diffs], Sun Feb 22 00:46:58 2015 UTC (21 months, 1 week ago) by christos
Branch: MAIN
Changes since 1.11: +2 -2 lines

PR/49683: Amir Plivatsky: Off-by-one comparison in ct_decode_string() leading
to out of bounds reference.
XXX: pullup-7
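
The log does not show the comparison that was fixed.  The following
is therefore only a generic illustration of this class of bug, not
the ct_decode_string() code.  copy_wide() is a hypothetical helper;
it shows why a length check against a buffer size has to leave room
for the terminator.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libedit): copy src, including
 * its terminator, into dst, which has room for dstlen elements.
 * Returns the number of characters copied, not counting the
 * terminator, or (size_t)-1 if src does not fit.
 */
static size_t
copy_wide(wchar_t *dst, size_t dstlen, const wchar_t *src)
{
	size_t len = wcslen(src);

	/*
	 * Using ">" instead of ">=" here would accept len == dstlen
	 * and write the terminator to dst[dstlen], one element past
	 * the end of the buffer.
	 */
	if (len >= dstlen)
		return (size_t)-1;
	wmemcpy(dst, src, len + 1);
	return len;
}
```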

Revision 1.11 / (download) - annotate - [select for diffs], Tue Feb 17 22:49:26 2015 UTC (21 months, 2 weeks ago) by christos
Branch: MAIN
Changes since 1.10: +2 -2 lines

OpenBSD is like us.

Revision 1.8.2.1 / (download) - annotate - [select for diffs], Tue Apr 17 00:05:27 2012 UTC (4 years, 7 months ago) by yamt
Branch: yamt-pagecache
CVS Tags: yamt-pagecache-tag8
Changes since 1.8: +7 -2 lines

sync with head

Revision 1.10 / (download) - annotate - [select for diffs], Wed Nov 16 01:45:10 2011 UTC (5 years ago) by christos
Branch: MAIN
CVS Tags: yamt-pagecache-base9, yamt-pagecache-base8, yamt-pagecache-base7, yamt-pagecache-base6, yamt-pagecache-base5, yamt-pagecache-base4, tls-maxphys-base, tls-maxphys, tls-earlyentropy-base, tls-earlyentropy, riastradh-xf86-video-intel-2-7-1-pre-2-21-15, riastradh-drm2-base3, riastradh-drm2-base2, riastradh-drm2-base1, riastradh-drm2-base, riastradh-drm2, netbsd-7-base, netbsd-6-base, netbsd-6-1-RELEASE, netbsd-6-1-RC4, netbsd-6-1-RC3, netbsd-6-1-RC2, netbsd-6-1-RC1, netbsd-6-1-5-RELEASE, netbsd-6-1-4-RELEASE, netbsd-6-1-3-RELEASE, netbsd-6-1-2-RELEASE, netbsd-6-1-1-RELEASE, netbsd-6-1, netbsd-6-0-RELEASE, netbsd-6-0-RC2, netbsd-6-0-RC1, netbsd-6-0-6-RELEASE, netbsd-6-0-5-RELEASE, netbsd-6-0-4-RELEASE, netbsd-6-0-3-RELEASE, netbsd-6-0-2-RELEASE, netbsd-6-0-1-RELEASE, netbsd-6-0, netbsd-6, matt-nb6-plus-nbase, matt-nb6-plus-base, matt-nb6-plus, agc-symver-base, agc-symver
Branch point for: netbsd-7
Changes since 1.9: +3 -3 lines

easier with an int for now.

Revision 1.9 / (download) - annotate - [select for diffs], Tue Nov 15 23:54:14 2011 UTC (5 years ago) by christos
Branch: MAIN
Changes since 1.8: +7 -2 lines

Since Width() is used only for display purposes we don't want to pass -1 for
unprintable characters.

Revision 1.8 / (download) - annotate - [select for diffs], Fri Jul 29 23:44:44 2011 UTC (5 years, 4 months ago) by christos
Branch: MAIN
CVS Tags: yamt-pagecache-base3, yamt-pagecache-base2, yamt-pagecache-base
Branch point for: yamt-pagecache
Changes since 1.7: +3 -3 lines

pass -Wconversion

Revision 1.7 / (download) - annotate - [select for diffs], Thu Dec 16 17:42:28 2010 UTC (5 years, 11 months ago) by wiz
Branch: MAIN
CVS Tags: matt-mips64-premerge-20101231, cherry-xenmp-base, cherry-xenmp, bouyer-quota2-nbase, bouyer-quota2-base, bouyer-quota2
Changes since 1.6: +3 -3 lines

Observe the following spelling:
- wide character (noun)
- wide-character (adjective)

Inspired by jmc@OpenBSD.

Revision 1.6 / (download) - annotate - [select for diffs], Tue Apr 20 02:01:13 2010 UTC (6 years, 7 months ago) by christos
Branch: MAIN
Changes since 1.5: +2 -2 lines

Use the same hack for Solaris and MacOS/X. This is not right, we only really
support UTF-8, but it will get us going until this is fixed properly.
From Jess Thrysoee

Revision 1.5 / (download) - annotate - [select for diffs], Thu Apr 15 00:55:57 2010 UTC (6 years, 7 months ago) by christos
Branch: MAIN
Changes since 1.4: +2 -1 lines

From Jess Thrysoee
	expose ct_enc_width()

Revision 1.4 / (download) - annotate - [select for diffs], Sun Jan 3 18:27:10 2010 UTC (6 years, 11 months ago) by christos
Branch: MAIN
Changes since 1.3: +2 -2 lines

rename historyw -> history_w for consistency.
add wide tst code and make it the default.

Revision 1.3 / (download) - annotate - [select for diffs], Thu Dec 31 18:32:37 2009 UTC (6 years, 11 months ago) by christos
Branch: MAIN
Changes since 1.2: +5 -3 lines

expose the encode and decode string functions for the benefit of history
and readline.

Revision 1.2 / (download) - annotate - [select for diffs], Wed Dec 30 23:54:52 2009 UTC (6 years, 11 months ago) by christos
Branch: MAIN
Changes since 1.1: +6 -1 lines

Fix wide build, test it, but don't turn it on yet.

Revision 1.1 / (download) - annotate - [select for diffs], Wed Dec 30 22:37:40 2009 UTC (6 years, 11 months ago) by christos
Branch: MAIN

Wide character support (UTF-8) from Johny Mattsson; currently disabled.
