Keyword substitution: kv
Default branch: MAIN
Fix processing of unknown variable expansion types. Our shell is (was) one of the last not to do this correctly. Expansions are supposed to happen only when the command in which they occur is being executed, not while it is being parsed. If the expansion only happens then, errors should only be detected then. Make it work like that (I saw after I fixed this that FreeBSD had done it, long ago, almost the same way - it is kind of an obvious thing to do). This will allow code like "if test it is shell X; then commands using shell X specific expansion ops; else if it is shell Y; then commands using shell Y specific expansion ops; else ...; fi". Previously expansion errors were detected while parsing, so if we're not shell X, and don't implement something that it does (some extension to the standard), that would have generated a parser syntax error, and the script could not be executed (despite the line with the error never being executed). Note that this change does not handle all such possible extensions, just this one. Others are much harder. One side effect of this change is that sh will now continue reading a variable expansion until it locates the terminating '}' (in ${var} forms) regardless of how broken it obviously is (to our shell), whereas previously it would have bailed out as soon as an oddity was spotted.
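A minimal sketch of the situation described above, not taken from the commit. The `${x@Q}` operator is only an illustrative stand-in for "an extension this sh may not implement", and the variable names are hypothetical. Whether a given sh accepts it, defers the error, or rejects the script while parsing varies, so the probe runs in a child `sh` and records either outcome.

```shell
# Probe, in a child shell, whether an unknown expansion inside a branch
# that is never executed prevents the rest of the script from running.
# ${x@Q} is only a stand-in for a non-standard expansion operator.
probe='if false; then : ${x@Q}; fi; echo reached'

if out=$(sh -c "$probe" 2>/dev/null) && [ "$out" = reached ]; then
    verdict='script ran past the unexecuted expansion'
else
    verdict='script rejected before running'
fi
echo "$verdict"
```

With the change described in this entry, a shell that does not know the operator would be expected to report the first outcome, since the branch is never executed.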
bin: fix lint warning "effectively discards 'const'" For example: src/bin/ed/io.c(339): warning: call to 'strchr' effectively discards 'const' from argument [346] No binary change.
Pull up following revision(s) (requested by kre in ticket #1787): bin/sh/eval.c: revision 1.191 bin/sh/expand.c: revision 1.144 PR bin/57773 Fix a bug reported by Jarle Fredrik Greipsland in PR bin/57773, in a substring expansion where the substring to be removed from a variable expansion is itself a var expansion whose value contains one (or more) of sh's CTLxxx chars - the pattern had CTLESC inserted, the string to be matched against did not. Fail. We fix that by always inserting CTLESC in var assign expansions. See the PR for all the gory details. Thanks for the PR. PR bin/57773 Fix another bug reported by Jarle Fredrik Greipsland and added to PR bin/57773, which relates to calculating the length of a positional parameter which contains CTL chars -- yes, this one really is that specific, though it would also affect the special param $0 if it were to contain CTL chars, and its length was requested - that is fixed with the same change. And note: $0 is not affected because it looks like a positional param (it isn't, ${00} would be, but is always unset, ${0} isn't); all special params would be affected the same way, but the only one that can ever contain a CTL char is $0 I believe. ($@ and $* were affected, but just because they're expanding the positional params ... ${#@} and ${#*} are both technically unspecified expansions - and different shells produce different results.) See the PR for the details of this one (and the previous). Thanks for the PR.
Pull up following revision(s) (requested by kre in ticket #535): bin/sh/eval.c: revision 1.191 bin/sh/expand.c: revision 1.144 PR bin/57773 Fix a bug reported by Jarle Fredrik Greipsland in PR bin/57773, in a substring expansion where the substring to be removed from a variable expansion is itself a var expansion whose value contains one (or more) of sh's CTLxxx chars - the pattern had CTLESC inserted, the string to be matched against did not. Fail. We fix that by always inserting CTLESC in var assign expansions. See the PR for all the gory details. Thanks for the PR. PR bin/57773 Fix another bug reported by Jarle Fredrik Greipsland and added to PR bin/57773, which relates to calculating the length of a positional parameter which contains CTL chars -- yes, this one really is that specific, though it would also affect the special param $0 if it were to contain CTL chars, and its length was requested - that is fixed with the same change. And note: $0 is not affected because it looks like a positional param (it isn't, ${00} would be, but is always unset, ${0} isn't); all special params would be affected the same way, but the only one that can ever contain a CTL char is $0 I believe. ($@ and $* were affected, but just because they're expanding the positional params ... ${#@} and ${#*} are both technically unspecified expansions - and different shells produce different results.) See the PR for the details of this one (and the previous). Thanks for the PR.
PR bin/57773 Fix another bug reported by Jarle Fredrik Greipsland and added to PR bin/57773, which relates to calculating the length of a positional parameter which contains CTL chars -- yes, this one really is that specific, though it would also affect the special param $0 if it were to contain CTL chars, and its length was requested - that is fixed with the same change. And note: $0 is not affected because it looks like a positional param (it isn't, ${00} would be, but is always unset, ${0} isn't); all special params would be affected the same way, but the only one that can ever contain a CTL char is $0 I believe. ($@ and $* were affected, but just because they're expanding the positional params ... ${#@} and ${#*} are both technically unspecified expansions - and different shells produce different results.) See the PR for the details of this one (and the previous). Thanks for the PR. XXX pullup to everything.
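A small sketch of how the length case might be checked, assuming (as other entries in this log state) that 0x81 is the CTLESC byte in this sh. It is not the test from the PR. The byte count from `wc -c` serves as a reference taken outside the shell's own length code.

```shell
# Put a value containing byte 0x81 (octal 201) into $1.
set -- "$(printf 'a\201b')"

# Reference count, computed by wc rather than by the shell.
bytes=$(( $(printf '%s' "$1" | wc -c) ))

# In a byte-oriented locale such as C, ${#1} should agree with it.
printf '%s\n' "wc -c: $bytes, \${#1}: ${#1}"
```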
Correct a bizarre piece of source formatting that crept in by accident several years ago (change a space into newline tab). NFC
Adjust tilde expansion as will be documented in the forthcoming version of the POSIX standard (Issue 8). I believe we were already compliant with what is to be required, but POSIX is now encouraging (and will likely require in a later version) that if a tilde expansion produces a string which ends in a '/' and the '~' that was expanded is immediately followed by a '/' in the input word, that one of those two slashes be omitted. The worst (current) example of this is when HOME=/ and we expand ~/foo - previously producing //foo which is (in POSIX) a path with implementation defined semantics, and so not what we should be generating by accident. Change that, so now if the ~ prefix expansion ends in a '/' and there is a '/' following immediately after, the resulting word contains only one of those chars (in the example just given, we will now produce /foo instead). POSIX is also making it clear that the expansion that results from the tilde expansion is treated as quoted (not subject to pathname expansion, or field splitting, or any var/arith/command substitutions) and that if HOME="" the expansion of ~ must generate "" (not nothing). Our implementation did all of that already (though older versions used to treat an empty expansion of HOME the same as if HOME was unset - that was fixed some time ago). The actual modification made here is probably smaller than this log entry, and without added comments, certainly is!
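An illustrative sketch of the HOME=/ case, run in a subshell so the caller's HOME is left alone. The variable name `path` is hypothetical. Other shells may still produce the double slash, so both results are possible depending on which sh runs this.

```shell
# With HOME=/ the word ~/foo used to expand to //foo in this sh;
# after the change described above it should expand to /foo.
path=$(HOME=/; echo ~/foo)
printf '%s\n' "$path"
```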
PR bin/53550 Here we go again... One more time to redo how here docs are processed (it has been a few years since the last time!) This is actually a relatively minor change, mostly to timing (to just when things happen). Now here docs are expanded at the same time the "filename" word in a redirect is expanded, rather than later when the heredoc was being sent to its process. This actually makes things more consistent - but does break one of the ATF tests which was testing that we were (effectively) internally inconsistent in this area. Not all shells agree on the context in which redirection expansions should happen, some make any side effects visible to the parent shell (the majority do), others do the redirection expansions in a subshell so any side effects are lost. We used to have a foot in each camp, with the majority for everything but here docs, and the minority for here docs. Now we're all the way with LBJ ... (or something like that).
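A sketch of the kind of side effect in question, using hypothetical variable names. The here-doc body contains an assigning expansion, and `read` is a builtin, so the redirection belongs to the current shell. The last line reports whether the assignment is still visible afterwards, which is the point on which shells differ.

```shell
unset v

# The here-doc text itself is the same in every shell ...
read -r line <<EOF
${v:=assigned}
EOF

# ... but whether v stays set afterwards depends on where the shell
# performed the here-doc expansion (current shell or a subshell).
printf '%s\n' "here-doc text: $line"
printf '%s\n' "v afterwards:  ${v-still unset}"
```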
DEBUG mode changes only. NFC (NC) for any normally compiled shell. Mostly adding DEBUG mode tracing (when appropriate verbose tracing is enabled generally) whenever a shell (including subshell) process exits, so that the tracing should indicate why shells that vanish did that. Note for future investigators: if the relevant tracing is enabled, and a (sub-)shell still simply seems to have vanished without trace, the likely cause is that it was killed by a signal - and of those, the most common that occurs is SIGPIPE.
bin: remove unnecessary lint comment CONSTCOND Since 2021-01-31, lint no longer warns about 'do ... while (0)'. No functional change.
Remove a redundant set of parentheses that were added (along with an extra && or || or something ... forgotten now) as part of a failed attempt to fix an earlier bug (later fixed a better way) - when the extra test (never committed) was removed, the now-redundant parentheses got forgotten... NFC.
Pull up following revision(s) (requested by kre in ticket #940): bin/sh/expand.h: revision 1.25 bin/sh/expand.c: revision 1.134 bin/sh/expand.c: revision 1.135 bin/sh/expand.c: revision 1.136 bin/sh/expand.c: revision 1.137 Remove a (completely harmless) duplicate assignment introduced in a code merge from FreeBSD in 2017. NFC. Pointed out by Roland Illig. prevent sign extension from making expression always false. remove masking and cast (requested by kre@) When expanding a here-doc (NXHERE - the type with an unquoted end delim) the output will not be further processed (at all) so there is no need to escape magic chars in the output, and doing so leaves stray CTLESC chars in the here doc text. Not good. So don't do that... To save a strlen() of the result, to determine the size of the here doc, make rmescapes() return the length of the resulting string (this isn't needed for other uses, so didn't happen previously). Reported on current-users@ (2020-02-06) by Jun Ebihara XXX pullup -9
Oops, restore accidentally removed files from merge mishap
Sync with HEAD
Mostly merge changes from HEAD up to 20200411
Merge changes from current as of 20200406
When expanding a here-doc (NXHERE - the type with an unquoted end delim) the output will not be further processed (at all) so there is no need to escape magic chars in the output, and doing so leaves stray CTLESC chars in the here doc text. Not good. So don't do that... To save a strlen() of the result, to determine the size of the here doc, make rmescapes() return the length of the resulting string (this isn't needed for other uses, so didn't happen previously). Reported on current-users@ (2020-02-06) by Jun Ebihara XXX pullup -9
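As a rough illustration (not the reporter's test case), a here-doc with an unquoted end delimiter expands variables, but the expanded text is not processed any further: characters that are magic elsewhere, such as `*` and `\`, should arrive unchanged. The variable names are hypothetical.

```shell
star='*'
slash='a\b'

# Unquoted delimiter: $star and $slash are expanded, but the result
# is neither globbed nor stripped of backslashes.
out=$(cat <<EOF
$star $slash
EOF
)
printf '%s\n' "$out"
```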
Pull up following revision(s) (requested by kre in ticket #467): bin/sh/expand.c: revision 1.133 Open code the validity test & copy of the character class name in a bracket expression in a pattern (ie: [[:THISNAME:]]). Previously the code used strspn() to look for invalid chars in the name, and then memcpy(), now we do the test and copy a character at a time. This might, or might not, be faster, but it now correctly handles \ quoted characters in the name (' and " quoting were already dealt with, \ was too in an earlier version, but when the \ handling changes were made, this piece of code broke). Not exactly a vital bug fix (who writes [[:\alpha:]] or similar?) but it should work correctly regardless of how obscure the usage is. Problem noted by Harald van Dijk XXX pullup -9
remove masking and cast (requested by kre@)
prevent sign extension from making expression always false.
Remove a (completely harmless) duplicate assignment introduced in a code merge from FreeBSD in 2017. NFC. Pointed out by Roland Illig.
Open code the validity test & copy of the character class name in a bracket expression in a pattern (ie: [[:THISNAME:]]). Previously the code used strspn() to look for invalid chars in the name, and then memcpy(), now we do the test and copy a character at a time. This might, or might not, be faster, but it now correctly handles \ quoted characters in the name (' and " quoting were already dealt with, \ was too in an earlier version, but when the \ handling changes were made, this piece of code broke). Not exactly a vital bug fix (who writes [[:\alpha:]] or similar?) but it should work correctly regardless of how obscure the usage is. Problem noted by Harald van Dijk XXX pullup -9
Sync with HEAD
PR bin/54112 Fix handling of "$@" (that is, double quoted dollar at), when it appears in a string which will be subject to field splitting. Eg: ${0+"$@" } More common usages, like the simple "$@" or ${0+"$@"} end up being entirely quoted, so no field splitting happens, and the problem was avoided. See the PR for more details. This ends up making a bunch of old hack code (and some that was relatively new) vanish - for now it is just #if 0'd or commented out. Cleanups of that stuff will happen later. That some of the worst $@ hacks are now gone does not mean that processing of "$@" does not retain a very special place in every hackers heart. RIP extreme ugliness - long live the merely ordinary ugly. Added a new bin/sh ATF test case to verify that all this remains fixed.
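A sketch contrasting the common, fully quoted forms with the form this entry fixes. The helper `count` is hypothetical and just reports how many fields it received. The last case is printed without a stated expectation here, since that is the one shells have historically disagreed on; see the PR for the intended result.

```shell
count() { echo $#; }
set -- 'a b' c

n_plain=$(count "$@")          # entirely quoted: no field splitting
n_guard=$(count ${1+"$@"})     # also ends up entirely quoted
n_odd=$(count ${0+"$@" })      # quoted "$@" followed by an unquoted space

printf '%s\n' "plain=$n_plain guard=$n_guard odd=$n_odd"
```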
Finish the fixes from Feb 4 for handling of random data that matches the internal CTL* chars. The earlier fixes handled CTL* char values in var expansions, but not in various other places they can occur (positional parameters, $@ $* -- even potentially $0 and ~ expansions, as well as byte strings generated from a \u in a $'' string). These should all be correctly handled now. There is a new ISCTL() macro to make the test, rather than using the old BASESYNTAX[c]==CCTL form (which is still a viable alternative) as the new way allows compiler optimisations, and fewer mem references, so it should be smaller and faster. Also, be sure in all cases to remove any CTLESC (or other) CTL* chars from all strings before they are made available for any external use (there was one case missed - which didn't matter when we weren't bothering to escape the CTL* chars at all.) XXX pullup-8 (will need to be via a patch) along with the Feb 4 fixes.
Fix an old bug (very old) that was made worse in 1.128 (the "${1+$@}" fixes) where a variable containing a CTL char (the only possibility used to be CTLESC (0x81)) would lose that character if the variable was expanded when "set -f" (noglob) was in effect. 1.128 made this worse by adding more 0x8z values (a couple more) which would see the same behaviour, and one of those was noticed by Martijn Dekker. The reasoning was that when noglob is on, when a var is expanded, there are no magic chars, so (apparently) no need to escape anything. Hence nothing was escaped .. including any CTL chars that happened to be present. When we later rmescapes() the CTL chars that we expect might occur are summarily removed - even if they weren't really CTL chars, but just data masquerading. We must *always* escape any CTL char clones that are in the var value, no matter what other conditions apply, and what we expect to happen next. While here, fix rmescapes() (and its $(()) clone, rmescapes_nl()) to be more robust, less likely to forget to delete anything (which was not the issue here, just the reverse) and in a DEBUG shell, have the shell abort() if it encounters something in rmescapes() it is not anticipating, so the code can be made to handle it, or if it should not happen, we can find out why it did. XXX pullup -8 (but will need to be via patch, code is quite different).
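A minimal sketch of the noglob case described above, with hypothetical variable names. It expands a value containing byte 0x81 unquoted while `set -f` is in effect, then checks that the byte survived.

```shell
v=$(printf 'a\201b')      # 0x81 is the byte this sh uses for CTLESC

set -f                    # noglob: the condition under which the byte was lost
set -- $v                 # unquoted expansion of the variable
set +f

fields=$#
bytes=$(( $(printf '%s' "$1" | wc -c) ))
printf '%s\n' "fields: $fields, bytes in first field: $bytes"
```

A shell with the old bug would report 2 bytes here rather than 3.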
Sync with HEAD, resolve a few conflicts
Yet another foray into the mysterious world of $@ -- this time to fix the (unusual) idiom "${1+$@}" (the quotes are part of it). This seems to have broken about 5 or 6 years ago (somewhere between -6 and -7), I believe. Note this is not the same as "$@" and also not the same as ${1+"$@"} (much more common idioms) which both worked. Also attempt to deal with "" more correctly, especially when it appears adjacent to "$@" (or one of the similar constructs.) This stuff is still all as ugly and hackish (and fragile) as is possible to imagine, but in an effort to allow some of the weirdness to eventually go away, the parser output has been made more regular and all quoted (parts of) words always now start with CTLQUOTEMARK and end with CTLQUOTEEND regardless of where the quotes appear. This allows us to tell the difference between """$@" and "$@" which was impossible before - yet they are required to generate different output when there are no args (when "$@" simply vanishes). Needless to say that change had ramifications all over the place. To simplify any similar change in the future, there are some new macros that can generally be used to detect the "noise" data when processing words, rather than open coding that every time (which meant that there would *always* be one which missed getting updated...) Several other bugs (of my making, and older ones) are also fixed. The aim is that (aside from anything that is detecting the cases that were broken before - which were all unlikely uses of sh syntax) these changes should have no external visible impact. Sure...
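A sketch of the distinction between `"$@"` and `"""$@"` when there are no arguments. The helper `count` is hypothetical. Per the entry above, the second form is required to produce different output (the adjacent `""` should remain as one empty field); the value is printed rather than assumed, since older shells could not tell the two apart.

```shell
count() { echo $#; }
set --                          # no positional parameters

n_bare=$(count "$@")            # "$@" simply vanishes
n_adjacent=$(count """$@")      # the leading "" is a separate quoted part

printf '%s\n' "bare=$n_bare adjacent=$n_adjacent"
```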
Sync with HEAD, resolve a couple of conflicts
Rationalise (slightly) the way that expansions are processed to hide meta-characters in the result when the expansion was in (double) quotes, and so should not be further processed. Most of this has been OK for a long while, but \ needs hiding as well, which complicates things, as \ cannot simply be hidden in the syntax tables as one of the group of random special characters. This was fixed earlier for simple variable expansions, but every variety has its own code path ($var uses different code than $n which is different than $(...), which is different again from ~ expansions, and also from what $'...' produces). This could be fixed by moving them all to a common code path, but that's harder than it seems. The form in which the data is made available differs, so one common routine would need a whole bunch of different "get the next char or indicate end" methods - probably via passing in an accessor function. That's all a lot of churn, and would probably slow the shell. Instead, just make macros for doing the standard tests, and use those instead of open coding (differently) each time. This way some of the code paths don't end up forgetting to handle '\' (which is different than all the others). This removes one optimisation ... when no escaping is needed (like just $var (unquoted) where magic chars (think '*') in the value are intended to remain magic), the code avoided doing two tests for each char ("do we need escapes" and "is this char one that needs escaping") by choosing two different syntax tables (choice made outside the loop) - one of which never returns the magic "needs escaping" result, and the other does when appropriate, and then just avoiding the "do we need escapes" test for each character processed. Then when '\' was fixed, there needed to be another test for it, as it cannot (for other reasons) be the same as all the others for which "this char need escaping" is true. So that added a 2nd test for each char... Not all the code paths were updated. 
Hence the bugs... nb: this is all rarely seen in the wild, so it is no big surprise that no-one ever noticed. Now the "use two different syntax tables" is gone (the two returned the same for '\' which is why '\' needed special processing) - and in order to avoid two tests for each char (plus the \ test) we duplicate the loops, one of which tests each char to see if it needs an escape, the 2nd just copies them. This should be faster in the "no escapes" code path (though that is not the point) and perhaps also in the "escapes needed" path (no indirect reference to the syntax table - though that would probably be in a register) but makes the code slightly bigger. For /bin/sh the text segment (on amd64) has grown by 48 bytes. But it still uses the same number of 512 byte pages (and hence also any bigger page size). The resulting file size (/bin/sh) is identical before and after. So is /rescue/sh (or /rescue/anything-else).
Pull up following revision(s) via patch (requested by kre in ticket #1015): bin/sh/expand.c: revision 1.124 bin/sh/expand.c: revision 1.127 bin/sh/parser.c: revision 1.148 bin/sh/parser.c: revision 1.149 bin/sh/syntax.c: revision 1.6 bin/sh/syntax.h: revision 1.9 (partial) First pass at fixing some of the more arcane pattern matching possibilities that we do not currently handle all that well. This mostly means (for now) making sure that quoted pattern magic characters (as well as quoted sh syntax magic chars) are properly marked, so they remain known as being quoted, and do not turn into pattern magic. Also, make sure that an unquoted \ in a pattern always quotes whatever comes next (which, unlike in regular expressions, includes inside [] matches). Part 2 of pattern matching (glob etc) fixes. Attempt to correctly deal with \ (both when it is a literal, in appropriate cases, and when it appears as CTLESC when it was detected as a quoting character during parsing). In a pattern, in sh, no quoted character can ever be anything other than a literal character. This is quite different than regular expressions, and even different than other uses of glob matching, where shell quoting is not an issue. In something like ls ?\*.c the ? is a meta-character, the * is a literal (it was quoted). This is nothing new, sh has handled that properly for ever. But the same happens with VAR='?\*.c' and ls $VAR which has not always been handled correctly. Of course, in ls "$VAR" nothing in VAR is a meta-character (the entire expansion is quoted) so even the '\' must match literally (or more accurately, no matching happens - VAR simply contains an "unusual" filename). But if it had been ls *"$VAR" then we would be looking for filenames that end with the literal 5 characters that make up $VAR. The same kinds of things are required of matching patterns in case statements, and sub-strings with the % and # operators in variable expansions. While here, the final remnant of the ancient !!
pattern matching hack has been removed (the code that actually implemented it was long gone, but one small piece remained, not doing any real harm, but potentially wasting time - if someone gave a pattern which would once have invoked that hack.)
Sync with HEAD
Part 2 of pattern matching (glob etc) fixes. Attempt to correctly deal with \ (both when it is a literal, in appropriate cases, and when it appears as CTLESC when it was detected as a quoting character during parsing). In a pattern, in sh, no quoted character can ever be anything other than a literal character. This is quite different than regular expressions, and even different than other uses of glob matching, where shell quoting is not an issue. In something like ls ?\*.c the ? is a meta-character, the * is a literal (it was quoted). This is nothing new, sh has handled that properly for ever. But the same happens with VAR='?\*.c' and ls $VAR which has not always been handled correctly. Of course, in ls "$VAR" nothing in VAR is a meta-character (the entire expansion is quoted) so even the '\' must match literally (or more accurately, no matching happens - VAR simply contains an "unusual" filename). But if it had been ls *"$VAR" then we would be looking for filenames that end with the literal 5 characters that make up $VAR. The same kinds of things are required of matching patterns in case statements, and sub-strings with the % and # operators in variable expansions. While here, the final remnant of the ancient !! pattern matching hack has been removed (the code that actually implemented it was long gone, but one small piece remained, not doing any real harm, but potentially wasting time - if someone gave a pattern which would once have invoked that hack.)
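The same rules can be sketched with case patterns instead of ls, which avoids depending on what files exist. The result variables r1 to r5 are hypothetical names. The last case, where the pattern comes from an unquoted expansion, is the one this entry says was not always handled correctly, so its result is only printed.

```shell
# Literal pattern: ? is a meta-character, \* is a literal asterisk.
case 'a*.c'  in ?\*.c) r1=match ;; *) r1=nomatch ;; esac
case 'abc.c' in ?\*.c) r2=match ;; *) r2=nomatch ;; esac

# Quoted expansion: nothing in $VAR is a meta-character, so only
# the identical string matches.
VAR='?\*.c'
case 'a*.c'  in "$VAR") r3=match ;; *) r3=nomatch ;; esac
case '?\*.c' in "$VAR") r4=match ;; *) r4=nomatch ;; esac

# Unquoted expansion: should behave like the literal pattern above.
case 'a*.c'  in $VAR) r5=match ;; *) r5=nomatch ;; esac

printf '%s\n' "$r1 $r2 $r3 $r4 $r5"
```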
NFC: Whitespace cleanups
DEBUG mode only change (ie: no effect to any normal shell). Add tracing of pattern matching (aid in debugging various issues.)
First pass at fixing some of the more arcane pattern matching possibilities that we do not currently handle all that well. This mostly means (for now) making sure that quoted pattern magic characters (as well as quoted sh syntax magic chars) are properly marked, so they remain known as being quoted, and do not turn into pattern magic. Also, make sure that an unquoted \ in a pattern always quotes whatever comes next (which, unlike in regular expressions, includes inside [] matches).
Pull up following revision(s) (requested by kre in ticket #907): bin/sh/expand.c: revision 1.122 When matching a char class ([[:name:]]) in a pattern (for filename expansion, case patterns, etc) do not force '[' to be a member of every class. Before this fix, try: case [ in [[:alpha:]]) echo Huh\?;; esac XXX pullup-8 (Perhaps -7 as well, though that shell version has much more relevant bugs than this one.) This bug is not in -6 as that has no charclass support.
Sync with HEAD
When processing character classes ([:xxx:] inside []), treat a class name that is longer than we can handle the same way we treat an unknown class name (as a valid char class which contains nothing, so never matches). Previously a "too long" class name invalidated the class, so [:very-long-name:] would match any of '[' ':' 'v' ... (note: "very-long-name" is not long enough to trigger this, but you get the idea!) However, the name itself has a restricted syntax ([[:***:]] is not a character class, it is a match for one of a '[' ':' or '*', followed by a ']') which we did not implement - check the syntax of the name before treating it as a character class (but we do add '_' to alphanumerics as legal class name characters).
When matching a char class ([[:name:]]) in a pattern (for filename expansion, case patterns, etc) do not force '[' to be a member of every class. Before this fix, try: case [ in [[:alpha:]]) echo Huh\?;; esac XXX pullup-8 (Perhaps -7 as well, though that shell version has much more relevant bugs than this one.) This bug is not in -6 as that has no charclass support.
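The one-liner above can be expanded into a small check that records both outcomes. The variable names are hypothetical. A fixed shell should say that '[' is not alphabetic while an ordinary letter still is.

```shell
# '[' must not be treated as a member of [:alpha:] ...
case '[' in [[:alpha:]]) bracket=member ;; *) bracket='not a member' ;; esac

# ... while a real letter still is.
case 'x' in [[:alpha:]]) letter=member ;; *) letter='not a member' ;; esac

printf '%s\n' "'[': $bracket; 'x': $letter"
```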
Pull up following revision(s) (requested by kre in ticket #310): bin/sh/expand.c: revision 1.121 bin/sh/sh.1: revision 1.167 via patch Three fixes and a change to ~ expansions 1. A serious bug introduced 3 1/2 months ago (approx) (rev 1.116) which broke all but the simple cases of ~ expansions is fixed (amazingly, given the magnitude of this problem, no-one noticed!) 2. An ancient bug (probably from when ~ expansion was first added in 1994, and certainly is in NetBSD-6 vintage shells) where ${UnSeT:-~} (and similar) does not expand the ~ is fixed (note that ${UnSeT:-~/} does expand, this should give a clue to the cause of the problem). 3. A fix/change to make the effects of ~ expansions on ${UnSeT:=whatever} identical to those in UnSeT=whatever In particular, with HOME=/foo ${UnSeT:=~:~} now assigns, and expands to, /foo:/foo rather than ~:~ just as VAR=~:~ assigns /foo:/foo to VAR. Note this is even after the previous fix (ie: appending a '/' would not change the results here.) It is hard to call this one a bug fix for certain (though I believe it is) as many other shells also produce different results for the ${V:=...} expansions than they do for V=... (though not all the same as we did). POSIX is not clear about this, expanding ~ after : in VAR=whatever assignments is clear, whether ${U:=whatever} assignments should be treated the same way is not stated, one way or the other. 4. Change to make ':' terminate the user name in a ~ expansion in all cases, not only in assignments. This makes sense, as ':' is one character that cannot occur in user names, no matter how otherwise weird they become. bash (incl in posix mode) ksh93 and bosh all act this way, whereas most other shells (and POSIX) do not. Because this is clearly an extension to POSIX, do this one only when not in posix mode (not set -o posix).
Three fixes and a change to ~ expansions 1. A serious bug introduced 3 1/2 months ago (approx) (rev 1.116) which broke all but the simple cases of ~ expansions is fixed (amazingly, given the magnitude of this problem, no-one noticed!) 2. An ancient bug (probably from when ~ expansion was first added in 1994, and certainly is in NetBSD-6 vintage shells) where ${UnSeT:-~} (and similar) does not expand the ~ is fixed (note that ${UnSeT:-~/} does expand, this should give a clue to the cause of the problem). 3. A fix/change to make the effects of ~ expansions on ${UnSeT:=whatever} identical to those in UnSeT=whatever In particular, with HOME=/foo ${UnSeT:=~:~} now assigns, and expands to, /foo:/foo rather than ~:~ just as VAR=~:~ assigns /foo:/foo to VAR. Note this is even after the previous fix (ie: appending a '/' would not change the results here.) It is hard to call this one a bug fix for certain (though I believe it is) as many other shells also produce different results for the ${V:=...} expansions than they do for V=... (though not all the same as we did). POSIX is not clear about this, expanding ~ after : in VAR=whatever assignments is clear, whether ${U:=whatever} assignments should be treated the same way is not stated, one way or the other. 4. Change to make ':' terminate the user name in a ~ expansion in all cases, not only in assignments. This makes sense, as ':' is one character that cannot occur in user names, no matter how otherwise weird they become. bash (incl in posix mode) ksh93 and bosh all act this way, whereas most other shells (and POSIX) do not. Because this is clearly an extension to POSIX, do this one only when not in posix mode (not set -o posix).
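Point 3 can be sketched as follows, in a subshell so the caller's HOME is untouched. The plain assignment is the case POSIX specifies; the `:=` form is the one this entry changes, and other shells may leave it as `~:~`, so only the first value is something to rely on elsewhere.

```shell
# VAR=~:~ expands ~ at the start and after the colon. The commit makes
# the := form assign the same value in this sh.
result=$(
    HOME=/foo
    VAR=~:~
    unset UnSeT
    : ${UnSeT:=~:~}
    echo "VAR=$VAR UnSeT=$UnSeT"
)
printf '%s\n' "$result"
```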
Add support for $'...' quoting (based upon C "..." strings, with \ expansions.) Implementation largely obtained from FreeBSD, with adaptations to meet the needs and style of this sh, some updates to agree with the current POSIX spec, and a few other minor changes. The POSIX spec for this ( http://austingroupbugs.net/view.php?id=249 ) [see note 2809 for the current proposed text] is yet to be approved, so might change. It currently leaves several aspects as unspecified, this implementation handles those as: Where more than 2 hex digits follow \x this implementation processes the first two as hex, the following characters are processed as if the \x sequence was not present. The value obtained from a \nnn octal sequence is truncated to the low 8 bits (if a bigger value is written, eg: \456.) Invalid escape sequences are errors. Invalid \u (or \U) code points are errors if known to be invalid, otherwise can generate a '?' character. Where any escape sequence generates nul ('\0') that char, and the rest of the $'...' string is discarded, but anything remaining in the word is processed, ie: aaa$'bbb\0ccc'ddd produces the same as aaa'bbb'ddd. Differences from FreeBSD: FreeBSD allows only exactly 4 or 8 hex digits for \u and \U (as does C, but the current sh proposal differs.) FreeBSD also continues consuming as many hex digits as exist after \x (permitted by the spec, but insane), and rejects \u0000 as invalid. Some of this is possibly because their implementation is based upon an earlier proposal, perhaps note 590 - though that has been updated several times. Differences from the current POSIX proposal: We currently always generate UTF-8 for the \u & \U escapes. We should generate the equivalent character from the current locale's character set (and UTF8 only if that is what the current locale uses.) If anyone would like to correct that, go ahead. 
We (and FreeBSD) generate (X & 0x1F) for \cX escapes where we should generate the appropriate control character (SOH for \cA for example) with whatever value that has in the current character set. Apart from EBCDIC, which we do not support, I've never seen a case where they differ, so ...
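The two numeric rules in this entry can be worked through with plain shell arithmetic, without relying on $'...' support in whichever sh runs the sketch. The variable names are hypothetical, and ASCII is assumed for the character codes.

```shell
# \cX is described as generating (X & 0x1F). For \cA: 'A' is 65 in
# ASCII, and 65 & 0x1F is 1, which is SOH.
code_A=$(printf '%d' "'A")
ctrl_A=$(( $code_A & 0x1F ))

# \456 is truncated to the low 8 bits: octal 456 is 302 decimal,
# and 302 & 0xFF is 46, the code for '.'.
octal_456=$(( 0456 & 0xFF ))

printf '%s\n' "ctrl-A: $ctrl_A, octal 456 truncated: $octal_456"
```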
Pull up following revision(s) (requested by kre in ticket #103): bin/kill/kill.c: 1.28 bin/sh/Makefile: 1.111-1.113 bin/sh/arith_token.c: 1.5 bin/sh/arith_tokens.h: 1.2 bin/sh/arithmetic.c: 1.3 bin/sh/arithmetic.h: 1.2 bin/sh/bltin/bltin.h: 1.15 bin/sh/cd.c: 1.49-1.50 bin/sh/error.c: 1.40 bin/sh/eval.c: 1.142-1.151 bin/sh/exec.c: 1.49-1.51 bin/sh/exec.h: 1.26 bin/sh/expand.c: 1.113-1.119 bin/sh/expand.h: 1.23 bin/sh/histedit.c: 1.49-1.52 bin/sh/input.c: 1.57-1.60 bin/sh/input.h: 1.19-1.20 bin/sh/jobs.c: 1.86-1.87 bin/sh/main.c: 1.71-1.72 bin/sh/memalloc.c: 1.30 bin/sh/memalloc.h: 1.17 bin/sh/mknodenames.sh: 1.4 bin/sh/mkoptions.sh: 1.3-1.4 bin/sh/myhistedit.h: 1.12-1.13 bin/sh/nodetypes: 1.16-1.18 bin/sh/option.list: 1.3-1.5 bin/sh/parser.c: 1.133-1.141 bin/sh/parser.h: 1.22-1.23 bin/sh/redir.c: 1.58 bin/sh/redir.h: 1.24 bin/sh/sh.1: 1.149-1.159 bin/sh/shell.h: 1.24 bin/sh/show.c: 1.43-1.47 bin/sh/show.h: 1.11 bin/sh/syntax.c: 1.4 bin/sh/syntax.h: 1.8 bin/sh/trap.c: 1.41 bin/sh/var.c: 1.56-1.65 bin/sh/var.h: 1.29-1.35 An initial attempt at implementing LINENO to meet the specs. Aside from one problem (not too hard to fix if it was ever needed) this version does about as well as most other shell implementations when expanding $((LINENO)) and better for ${LINENO} as it retains the "LINENO hack" for the latter, and that is very accurate. Unfortunately that means that ${LINENO} and $((LINENO)) do not always produce the same value when used on the same line (a defect that other shells do not share - aside from the FreeBSD sh as it is today, where only the LINENO hack exists and so (like for us before this commit) $((LINENO)) is always either 0, or at least whatever value was last set, perhaps by LINENO=${LINENO} which does actually work ... for that one line...) 
This could be corrected by simply removing the LINENO hack (look for the string LINENO in parser.c) in which case ${LINENO} and $((LINENO)) would give the same (not perfectly accurate) values, as do most other shells. POSIX requires that LINENO be set before each command, and this implementation does that fairly literally - except that we only bother before the commands which actually expand words (for, case and simple commands). Unfortunately this forgot that expansions also occur in redirects, and the other compound commands can also have redirects, so if a redirect on one of the other compound commands wants to use the value of $((LINENO)) as a part of a generated file name, then it will get an incorrect value. This is the "one problem" above. (Because the LINENO hack is still enabled, using ${LINENO} works.) This could be fixed, but as this version of the LINENO implementation is just for reference purposes (it will be superseded within minutes by a better one) I won't bother. However should anyone else decide that this is a better choice (it is probably a smaller implementation, in terms of code & data space than the replacement, but also I would expect, slower, and definitely less accurate) this defect is something to bear in mind, and fix. This version retains the *BSD historical practice that line numbers in functions (all functions) count from 1 from the start of the function, and elsewhere, start from 1 from where the shell started reading the input file/stream in question. In an "eval" expression the line number starts at the line of the "eval" (and then increases if the input is a multi-line string). Note: this version is not documented (beyond as much as LINENO was before) hence this slightly longer than usual commit message. A better LINENO implementation. This version deletes (well, #if 0's out) the LINENO hack, and uses the LINENO var for both ${LINENO} and $((LINENO)). 
(Code to invert the LINENO hack when required, like when de-compiling the execution tree to provide the "jobs" command strings, is still included, that can be deleted when the LINENO hack is completely removed - look for refs to VSLINENO throughout the code. The var funclinno in parser.c can also be removed, it is used only for the LINENO hack.) This version produces accurate results: $((LINENO)) was made as accurate as the LINENO hack made ${LINENO} which is very good. That's why the LINENO hack is not yet completely removed, so it can be easily re-enabled. If you can tell the difference when it is in use, or not in use, then something has broken (or I managed to miss a case somewhere.) The way that LINENO works is documented in its own (new) section in the man page, so nothing more about that, or the new options, etc, here. This version introduces the possibility of having a "reference" function associated with a variable, which gets called whenever the value of the variable is required (that's what implements LINENO). There is just one function pointer however, so any particular variable gets at most one of the set function (as used for PATH, etc) or the reference function. The VFUNCREF bit in the var flags indicates which func the variable in question uses (if any - the func ptr, as before, can be NULL). I would not call the results of this perfect yet, but it is close. Unbreak (at least) i386 build .... I have no idea why this built for me on amd64 (problem was missing prototype for snprintf without <stdio.h>) While here, add some (DEBUG mode only) tracing that proved useful in solving another problem. Set the line number before expanding args, not after. As the line_number would have usually been set earlier, this change is mostly an effective no-op, but it is better this way (just in case) - not observed to have caused any problems. 
Undo some over-aggressive fixes for a (pre-commit) bug that did not need these changes to be fixed - and these cause problems in another absurd use case. Either of these issues is unlikely to be seen by anyone who isn't an idiot masochist... PR bin/52280 removescapes_nl in expari() even when not quoted, CTLNONL's appear regardless of quoting (unlike CTLESC). New sentence, new line. Whitespace. Improve the (new) LINENO section, markup changes (with thanks to wiz@ for assistance) and some better wording in a few places. I am an idiot... revert the previous unintended commit. Remove some left over baggage from the LINENO v1 implementation that didn't get removed with v2, and should have. This would have had (I think, without having tested it) one very minor effect on the way LINENO worked in the v2 implementation, but my guess is it would have taken a long time before anyone noticed... Correct spelling in comments of DEBUG only code... (Perhaps) temporary fix to pkgtools (cwrappers) build (configure). Expanding `` containing \ \n sequences looks to have been giving problems. I don't think this is the correct fix, but it will do no worse harm than (perhaps) incorrectly calculating LINENO in this kind of (rare) circumstance. I'll look and see if there should be a better fix later. s/volatile/const/ -- wonderful how opposites attract like this. NFC (normal use) - DEBUG only change, when showing empty arg list don't omit terminating \n. Free stack memory in a couple of obscure cases where it wasn't being done (one in probably dead code that is never compiled, the other in a very rare error case.) Since it is stack memory it wasn't lost in any case, just held longer than needed. Many internal memory management type fixes. PR bin/52302 (core dump with interactive shell, here doc and error on same line) is fixed. (An old bug.) 
echo "$( echo x; for a in $( seq 1000 ); do printf '%s\n'; done; echo y )" consistently prints 1002 lines (x, 1000 empty ones, then y) as it should (And you don't want to know what it did before, or why.) (Another old one.) (Recently added) Problems with ~ expansion fixed (mem management related). Proper fix for the cwrappers configure problem (which includes the quick fix that was done earlier, but extends upon that to be correct). (This was another newly added problem.) And the really devious (and rare) old bug - if STACKSTRNUL() needs to allocate a new buffer in which to store the \0, calculate the size of the string space remaining correctly, unlike when SPUTC() grows the buffer, there is no actual data being stored in the STACKSTRNUL() case - the string space remaining was calculated as one byte too few. That would be harmless, unless the next buffer also filled, in which case it was assumed that it was really full, not one byte less, meaning one junk char (a nul, or anything) was being copied into the next (even bigger buffer) corrupting the data. Consistent use of stalloc() to allocate a new block of (stack) memory, and grabstackstr() to claim a block of (stack) memory that had already been occupied but not claimed as in use. Since grabstackstr is implemented as just a call to stalloc() this is a no-op change in practice, but makes it much easier to comprehend what is really happening. Previous code sometimes used stalloc() when the use case was really for grabstackstr(). Change grabstackstr() to actually use the arg passed to it, instead of (not much better than) guessing how much space to claim, More care when using unstalloc()/ungrabstackstr() to return space, and in particular when the stack must be returned to its previous state, rather than just returning no-longer needed space, neither of those work. They also don't work properly if there have been (really, even might have been) any stack mem allocations since the last stalloc()/grabstackstr(). 
(If we know there cannot have been then the alloc/release sequence is kind of pointless.) To work correctly in general we must use setstackmark()/popstackmark() so do that when needed. Have those also save/restore the top of stack string space remaining. [Aside: for those reading this, the "stack" mentioned is not in any way related to the thing used for maintaining the C function call state, ie: the "stack segment" of the program, but the shell's internal memory management strategy.] More comments to better explain what is happening in some cases. Also cleaned up some hopelessly broken DEBUG mode data that were recently added (no effect on anyone but the poor semi-human attempting to make sense of it...). User visible changes: Proper counting of line numbers when a here document is delimited by a multi-line end-delimiter, as in cat << 'REALLY END' here doc line 1 here doc line 2 REALLY END (which is an obscure case, but nothing says should not work.) The \n in the end-delimiter of the here doc (the last one) was not incrementing the line number, which from that point on in the script would be 1 too low (or more, for end-delimiters with more than one \n in them.) With tilde expansion: unset HOME; echo ~ changed to return getpwuid(getuid())->pw_home instead of failing (returning ~) POSIX says this is unspecified, which makes it difficult for a script to compensate for being run without HOME set (as in env -i sh script), so while not able to be used portably, this seems like a useful extension (and is implemented the same way by some other shells). Further, with HOME=; printf %s ~ we now write nothing (which is required by POSIX - which requires ~ to expand to the value of $HOME if it is set) previously if $HOME (in this case) or a user's directory in the passwd file (for ~user) were a null STRING, We failed the ~ expansion and left behind '~' or '~user'. 
Changed the long name for the -L option from lineno_fn_relative to local_lineno as the latter seemed to be marginally more popular, and perhaps more importantly, is the same length as the previously existing quietprofile option, which means the man page indentation for the list of options can return to (about) what it was before... (That is, less indented, which means more data/line, which means fewer lines of man page - a good thing!) Cosmetic changes to variable flags - make their values more suited to my delicate sensibilities... (NFC). Arrange not to barf (ever) if some turkey makes _ readonly. Do this by adding a VNOERROR flag that causes errors in var setting to be ignored (intended use is only for internal shell var setting, like of "_"). (nb: invalid var name errors ignore this flag, but those should never occur on a var set by the shell itself.) From FreeBSD: don't simply discard memory if a variable is not set for any reason (including because it is readonly) if the var's value had been malloc'd. Free it instead... NFC - DEBUG changes, update this to new TRACE method. KNF - white space and comment formatting. NFC - DEBUG mode only change - convert this to the new TRACE() format. NFC - DEBUG mode only change - complete a change made earlier (marking the line number when included in the trace line tag to show whether it comes from the parser, or elsewhere, as they tend to be quite different). Initially only one case was changed, while I pondered whether I liked it or not. Now it is all done... Also when there is a line tag at all, always include the root/sub-shell indicator character, not only when the pid is included. NFC: DEBUG related comment change - catch up with reality. NFC: DEBUG mode only change. Fix botched cleanup of one TRACE(). Be more forgiving when sorting options to allow reasonable (and intended) flexibility in option.list format. Changes nothing for current option.list. 
Now that excessive use of STACKSTRNUL has served its purpose (well, accidental purpose) in exposing the bug in its implementation, go back to not using it when not needed for DEBUG TRACE purposes. This change should have no practical effect on either a DEBUG shell (where the STACKSTRNUL() calls remain) or a non DEBUG shell where they are not needed. Correct the initial line number used for processing -c arg strings. (It was inheriting the value from end of profile file processing) - I didn't notice before as I usually test with empty or no profile files to avoid complications. Trivial change which should have very limited impact. Fix from FreeBSD (applied there in July 2008...) Don't dump core with input like sh -c 'x=; echo >&$x' - that is where the word after a >& or <& redirect expands to nothing at all. Another fix from FreeBSD (this one from April 2009). When processing a string (as in eval, trap, or sh -c) don't allow trailing \n's to destroy the exit status of the last command executed. That is: sh -c 'false ' echo $? should produce 1, not 0. It is amazing what nonsense appears to work sometimes... (all my nonsense too!) Two bugs here, one benign because of the way the script is used. The other hidden by NetBSD's sort being stable, and the data not really requiring sorting at all... So as it happens these fixes change nothing, but they are needed anyway. (The contents of the generated file are only used in DEBUG shells, so this is really even less important than it seems.) Another ancient (highly improbable) bug bites the dust. This one caused by incorrect macro usage (ie: using the wrong one) which has been in the sources since version 1.1 (ie: forever). Like the previous (STACKSTRNUL) bug, the probability of this one actually occurring has been infinitesimal but the LINENO code increases that to infinitesimal and a smidgen... (or a few, depending upon usage). 
Still, apparently that was enough, Kamil Rytarowski discovered that the zsh configure script (damn competition!) managed to trigger this problem. source .editrc after we initialize so that commands persist! Make arg parsing in kill compatible with POSIX (XBD 2.12) by parsing the way getopt(3) would, if only it could handle the (required) -signumber and -signame options. This adds two "features" to kill, -ssigname and -lstatus now work (ie: one word with all of the '-', the option letter, and its value) and "--" also now works (kill -- -pid1 pid2 will not attempt to send the pid1 signal to pid2, but rather SIGTERM to the pid1 process group and pid2). It is still the case that (apart from --) at most 1 option is permitted (-l, -s, -signame, or -signumber.) Note that we now have an ambiguity, -sname might mean "-s name" or send the signal "sname" - if one of those turns out to be valid, that will be accepted, otherwise the error message will indicate that "sname" is not a valid signal name, not that "name" is not. Keeping the "-s" and signal name as separate words avoids this issue. Also caution: should someone be weird enough to define a new signal name (as in the part after SIG) which is almost the same name as an existing name that starts with 'S' by adding an extra 'S' prepended (eg: adding a SIGSSYS) then the ambiguity problem becomes much worse. In that case "kill -ssys" will be resolved in favour of the "-s" flag being used (the more modern syntax) and would send a SIGSYS, rather than a SIGSSYS. So don't do that. While here, switch to using signalname(3) (bye bye NSIG, et al.), add some constipation, and show a little pride in formatting the signal names for "kill -l" (and in the usage when appropriate -- same routine.) 
Respect COLUMNS (POSIX XBD 8.3) as primary specification of the width (terminal width, not number of columns to print) for kill -l, a very small value for COLUMNS will cause kill -l output to list signals one per line, a very large value will cause them all to be listed on one line.) (eg: "COLUMNS=1 kill -l") TODO: the signal printing for "trap -l" and that for "kill -l" should be switched to use a common routine (for the sh builtin versions.) All changes of relevance here are to bin/kill - the (minor) changes to bin/sh are only to properly expose the builtin version of getenv(3) so the builtin version of kill can use it (ie: make its prototype available.) Properly support EDITRC - use it as (naming) the file when setting up libedit, and re-do the config whenever EDITRC is set. Get rid of workarounds for ancient groff html backend. Simplify macro usage. Make one example more like a real world possibility (it still isn't, but is closer) - though the actual content is irrelevant to the point being made. Add literal prompt support this allows one to do: CA="$(printf '\1')" PS1="${CA}$(tput bold)${CA}\$${CA}$(tput sgr0)${CA} " Now libedit supports embedded mode switch sequence, improve sh support for them (adds PSlit variable to set the magic character). NFC: DEBUG only change - provide an externally visible (to the DEBUG sh internals) interface to one of the internal (private to trace code) functions Include redirections in trace output from "set -x" Implement PS1, PS2 and PS4 expansions (variable expansions, arithmetic expansions, and if enabled by the promptcmds option, command substitutions.) Implement a bunch of new shell environment variables. many mostly useful in prompts when expanded at prompt time, but all available for general use. 
Many of the new ones are not available in SMALL shells (they work as normal if assigned, but the shell does not set or use them - and there is no magic in a SMALL shell (usually for install media.)) Omnibus manual update for prompt expansions and new variables. Throw in some random cleanups as a bonus. Correct a markup typo (why did I not see this before the prev commit??) Sort options (our default is 0..9AaBbZz). Fix markup problems and a typo. Make $- list flags in the same order they appear in sh(1) Do a better job of detecting the error in pkgsrc/devel/libbson-1.6.3's configure script, ie: $(( which is intended to be a sub-shell in a command substitution, but is an arith subst instead, it needs to be written $( ( to do as intended. Instead of just blindly carrying on to find the missing )) somewhere, anywhere, give up as soon as we have seen an unbalanced ')' that isn't immediately followed by another ')' which in a valid arith subst it always would be. While here, there has been a comment in the code for quite a while noting a difference in the standard between the text descr & grammar when it comes to the syntax of case statements. Add more comments to explain why parsing it as we do is in fact definitely the correct way (ie: the grammar wins arguments like this...). DEBUG and white space changes only. Convert TRACE() calls for DEBUG mode to the new style. NFC (when not debugging sh). Mostly DEBUG and white space changes. Convert DEBUG TRACE() calls to the new format. Also #if 0 a function definition that is used nowhere. While here, change the function of pushfile() slightly - it now sets the buf pointer in the top (new) input descriptor to NULL, instead of simply leaving it - code that needs a buffer always (before and after) must malloc() one and assign it after the call. But code which does not (which will be reading from a string or similar) now does not have to explicitly set it to NULL (cleaner interface.) NFC intended (or observed.) 
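The `$((` versus `$( (` distinction mentioned above is easy to show in isolation. A minimal sketch, not taken from the libbson configure script: a command substitution whose first command is a subshell needs a space between the two opening parentheses, otherwise the shell starts parsing an arithmetic substitution.

```sh
# Two subshells inside one command substitution; note the space in "$( (".
out=$( (echo sub1); (echo sub2) )
printf '%s\n' "$out"
```

Written as `$((echo sub1); (echo sub2))`, the same text is ambiguous at best, and how a given shell recovers from it varies.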
DEBUG changes: convert DEBUG TRACE() calls to new format. Also, cause exec failures to always cause the shell to exit with status 126 or 127, whatever the cause. 127 is intended for lookup failures (and is used that way), 126 is used for anything else that goes wrong (as in several other shells.) We no longer use 2 (more easily confused with an exit status of the command exec'd) for shell exec failures. DEBUG only changes. Convert the TRACE() calls in the remaining files that still used it to the new format. NFC. Fix a reference after free (and consequent nonsense diagnostic for attempts to set readonly variables) I added in 1.60 by incompletely copying the FreeBSD fix for the lost memory issue.
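Two of the exit-status changes in the pull-up above (126/127 for exec failures, and trailing newlines in a -c string not resetting the status) can be checked with a short script. This is a sketch under assumptions: `/nonexistent/command` is assumed not to exist, mktemp(1) is assumed to be available, and `sh` is whatever shell is installed under that name.

```sh
# Lookup failure: the command does not exist.
missing=0
sh -c 'exec /nonexistent/command' 2>/dev/null || missing=$?

# Some other exec failure: here, a file without execute permission.
tmp=$(mktemp)
noexec=0
sh -c 'exec "$1"' sh "$tmp" 2>/dev/null || noexec=$?
rm -f "$tmp"

# Trailing newlines in a -c string must not hide the status of "false".
trailing=0
sh -c 'false

' || trailing=$?

printf 'missing=%s noexec=%s trailing=%s\n' "$missing" "$noexec" "$trailing"
```

With the behaviour described in the commit messages, the three values are 127, 126 and 1 respectively.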
Implement PS1, PS2 and PS4 expansions (variable expansions, arithmetic expansions, and if enabled by the promptcmds option, command substitutions.)
Now that excessive use of STACKSTRNUL has served its purpose (well, accidental purpose) in exposing the bug in its implementation, go back to not using it when not needed for DEBUG TRACE purposes. This change should have no practical effect on either a DEBUG shell (where the STACKSTRNUL() calls remain) or a non DEBUG shell where they are not needed.
NFC: DEBUG mode only change. Fix botched cleanup of one TRACE().
Many internal memory management type fixes. PR bin/52302 (core dump with interactive shell, here doc and error on same line) is fixed. (An old bug.) echo "$( echo x; for a in $( seq 1000 ); do printf '%s\n'; done; echo y )" consistently prints 1002 lines (x, 1000 empty ones, then y) as it should (And you don't want to know what it did before, or why.) (Another old one.) (Recently added) Problems with ~ expansion fixed (mem management related). Proper fix for the cwrappers configure problem (which includes the quick fix that was done earlier, but extends upon that to be correct). (This was another newly added problem.) And the really devious (and rare) old bug - if STACKSTRNUL() needs to allocate a new buffer in which to store the \0, calculate the size of the string space remaining correctly, unlike when SPUTC() grows the buffer, there is no actual data being stored in the STACKSTRNUL() case - the string space remaining was calculated as one byte too few. That would be harmless, unless the next buffer also filled, in which case it was assumed that it was really full, not one byte less, meaning one junk char (a nul, or anything) was being copied into the next (even bigger buffer) corrupting the data. Consistent use of stalloc() to allocate a new block of (stack) memory, and grabstackstr() to claim a block of (stack) memory that had already been occupied but not claimed as in use. Since grabstackstr is implemented as just a call to stalloc() this is a no-op change in practice, but makes it much easier to comprehend what is really happening. Previous code sometimes used stalloc() when the use case was really for grabstackstr(). Change grabstackstr() to actually use the arg passed to it, instead of (not much better than) guessing how much space to claim, More care when using unstalloc()/ungrabstackstr() to return space, and in particular when the stack must be returned to its previous state, rather than just returning no-longer needed space, neither of those work. 
They also don't work properly if there have been (really, even might have been) any stack mem allocations since the last stalloc()/grabstackstr(). (If we know there cannot have been then the alloc/release sequence is kind of pointless.) To work correctly in general we must use setstackmark()/popstackmark() so do that when needed. Have those also save/restore the top of stack string space remaining. [Aside: for those reading this, the "stack" mentioned is not in any way related to the thing used for maintaining the C function call state, ie: the "stack segment" of the program, but the shell's internal memory management strategy.] More comments to better explain what is happening in some cases. Also cleaned up some hopelessly broken DEBUG mode data that were recently added (no effect on anyone but the poor semi-human attempting to make sense of it...). User visible changes: Proper counting of line numbers when a here document is delimited by a multi-line end-delimiter, as in cat << 'REALLY END' here doc line 1 here doc line 2 REALLY END (which is an obscure case, but nothing says should not work.) The \n in the end-delimiter of the here doc (the last one) was not incrementing the line number, which from that point on in the script would be 1 too low (or more, for end-delimiters with more than one \n in them.) With tilde expansion: unset HOME; echo ~ changed to return getpwuid(getuid())->pw_home instead of failing (returning ~) POSIX says this is unspecified, which makes it difficult for a script to compensate for being run without HOME set (as in env -i sh script), so while not able to be used portably, this seems like a useful extension (and is implemented the same way by some other shells). 
Further, with HOME=; printf %s ~ we now write nothing (which is required by POSIX - which requires ~ to expand to the value of $HOME if it is set). Previously, if $HOME (in this case) or a user's directory in the passwd file (for ~user) was a null string, we failed the ~ expansion and left behind '~' or '~user'.
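The 1002-line command substitution quoted in this commit message makes a convenient regression check. A sketch, assuming seq(1) is available (as the original example does):

```sh
# x, then 1000 empty lines, then y, all inside one command substitution.
out="$( echo x; for a in $( seq 1000 ); do printf '%s\n'; done; echo y )"

# Count the lines; tr strips the padding some wc implementations add.
lines=$(printf '%s\n' "$out" | wc -l | tr -d ' ')
printf '%s\n' "$lines"
```

The command substitution strips only the trailing newline after `y`, so the final printf emits exactly 1002 lines.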
PR bin/52280 removescapes_nl in expari() even when not quoted, CTLNONL's appear regardless of quoting (unlike CTLESC).
Set the line number before expanding args, not after. As the line_number would have usually been set earlier, this change is mostly an effective no-op, but it is better this way (just in case) - not observed to have caused any problems.
A better LINENO implementation. This version deletes (well, #if 0's out) the LINENO hack, and uses the LINENO var for both ${LINENO} and $((LINENO)). (Code to invert the LINENO hack when required, like when de-compiling the execution tree to provide the "jobs" command strings, is still included, that can be deleted when the LINENO hack is completely removed - look for refs to VSLINENO throughout the code. The var funclinno in parser.c can also be removed, it is used only for the LINENO hack.) This version produces accurate results: $((LINENO)) was made as accurate as the LINENO hack made ${LINENO} which is very good. That's why the LINENO hack is not yet completely removed, so it can be easily re-enabled. If you can tell the difference when it is in use, or not in use, then something has broken (or I managed to miss a case somewhere.) The way that LINENO works is documented in its own (new) section in the man page, so nothing more about that, or the new options, etc, here. This version introduces the possibility of having a "reference" function associated with a variable, which gets called whenever the value of the variable is required (that's what implements LINENO). There is just one function pointer however, so any particular variable gets at most one of the set function (as used for PATH, etc) or the reference function. The VFUNCREF bit in the var flags indicates which func the variable in question uses (if any - the func ptr, as before, can be NULL). I would not call the results of this perfect yet, but it is close.
Pull up following revision(s) (requested by kre in ticket #7): bin/sh/expand.c: revisions 1.111, 1.112 PR bin/52272 - fix an off-by-one that broke ~ expansions. -- Another arithmetic expansion recordregion() fix, this time calculate the length (used to calculate the end) based upon the correct starting point. Thanks to John Klos for finding and reporting this one.
Another arithmetic expansion recordregion() fix, this time calculate the length (used to calculate the end) based upon the correct starting point. Thanks to John Klos for finding and reporting this one.
PR bin/52272 - fix an off-by-one that broke ~ expansions.
DEBUG mode only change. Convert old trace style to new, and add some more. NFC for any non-DEBUG shell.
NFC: Code style only. Rather than being perverse and adding the negative of a negative number, just add a positive number instead... (the previous version came about purely as an accident of the way the relevant piece of code was added and debugged.... that's my story anyway!)
The correct usage of recordregion() is (begin, end) not (begin, length). Fixing this fixes a regression introduced earlier today (UTC) where arithmetic expressions would be split correctly when the arithmetic started at the beginning of a word: echo $(( expression )) where "begin" is 0, and so (begin, length) is the same as (begin, begin+length) (aka: (begin, end) - and yes, "end" means 1 after last to consider), but did not work correctly when the usage was echo XXX$(( expression )) (begin != 0) and would only split (some part of) the result of the expression. This regression was also found by the new t_fsplit:split_arith test case added earlier to the ATF tests for sh.
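The case that regressed (an arithmetic expansion that does not start at the beginning of its word) can be exercised from a script. A sketch only: `split_count` is a hypothetical name, and IFS is set to the digit 0 purely so that the arithmetic result contains a separator.

```sh
# Runs in a subshell (the function body is "( ... )") so IFS stays local.
split_count() (
    IFS=0
    # The expansion yields 101; field splitting on "0" applies to that
    # result only, giving the fields "XXX1" and "1".
    set -- XXX$(( 100 + 1 ))
    printf '%s\n' "$#"
)

split_count
```

A shell that records the wrong region for the expansion would split the wrong part of the word, or not split at all, and report a different field count.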
Fixes to shell expand (that is, $ stuff) from FreeBSD (implemented differently...) In particular ${01} is now $1 not $0 (for ${0any-digits}) ${4294967297} is most probably now "" (unless you have a very large number of params) it is no longer an alias for $1 (4294967297 & 0xFFFFFFFF) == 1 $(( expr $(( more )) stuff )) is no longer the same as $(( expr (( more )) stuff )) which was sometimes OK, as in: $(( 3 + $(( 2 - 1 )) * 3 )) but not always as in: $(( 1$((1 + 1))1 )) which should be 121, but was an arith syntax error as 1((1 + 1))1 is meaningless. Probably some more. This also sprinkles a little const, splits a big func that had 2 (kind of unrelated) purposes into two simpler ones, and avoids some (semi-dubious) modifications (and restores) in the input string to insert \0's when they were needed.
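The nested-arithmetic examples in this message can be run as they stand. A minimal sketch; the last line assumes the shell does its arithmetic in at least 64 bits, and only shows the masking that used to turn ${4294967297} into $1.

```sh
# The inner $(( )) is expanded first, then the outer expression is evaluated.
a=$(( 3 + $(( 2 - 1 )) * 3 ))
b=$(( 1$((1 + 1))1 ))

# The old aliasing: truncating the parameter number to 32 bits.
c=$(( 4294967297 & 0xFFFFFFFF ))

printf 'a=%s b=%s c=%s\n' "$a" "$b" "$c"
```

Here `a` is 6 either way, while `b` is 121 only when the inner expansion is substituted as text rather than treated as a parenthesised sub-expression.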
Arrange for set -o and $- output to be sorted, rather than more or less random (and becoming worse as more options are added.) Since the data is known at compile time, sort at compile time, rather than at run time.
Sync with HEAD - tag prg-localcount2-base1
Convert the pattern matcher from recursive to backtracking (from FreeBSD).
Sync with HEAD
Sync with HEAD
Pull up following revision(s) (requested by kre in ticket #1388): bin/sh/expand.c: revision 1.102 Fix for the "${unset-var#$(cmd1)}$(cmd2)" runs the wrong command bug. ... From FreeBSD
PR bin/52090 - fix expansion of unquoted $*
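For reference, the expected field counts for the different spellings of $* and $@ with default IFS are sketched below. This does not reproduce the specific failure in the PR, whose details are not given here; `count_args` and `demo_star` are hypothetical helper names.

```sh
count_args() { printf '%s\n' "$#"; }

demo_star() {
    set -- a "b c"
    count_args $*      # unquoted: each parameter is split again
    count_args "$*"    # quoted: one field, parameters joined
    count_args "$@"    # quoted: one field per parameter
}

demo_star
```

The three calls report 3, 1 and 2 fields respectively.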
Finish support for all required $(( )) (shell arithmetic) operators, closing PR bin/50958 That meant adding the assignment operators ('=', and all of the +=, *= ...) Currently, ++, --, and ',' are not implemented (none of those are required by posix) but support for them (most likely ',' first) might be added later. To do this, I removed the yacc/lex arithmetic parser completely, and replaced it with a hand written recursive descent parser, that I obtained from FreeBSD, who earlier had obtained it from dash (Herbert Xu). While doing the import, I cleaned up the sources (changed some file names to avoid requiring a clean build, or significant surgery to the obj directories if "build.sh -u" was to be used - "build.sh -u" should work fine as it is now) removed some dashisms, applied some KNF, ...
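A short sketch of the assignment operators this commit adds, using only forms POSIX requires (no ++, -- or comma):

```sh
x=5
: $(( x += 3 ))        # x is now 8
: $(( x *= 2 ))        # x is now 16
y=$(( x = x - 6 ))     # assigns 10 to x; the expression's value is also 10

printf 'x=%s y=%s\n' "$x" "$y"
```

The `:` builtin is used only to evaluate the expansion for its side effect on `x`.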
Sync with HEAD
Fix for the "${unset-var#$(cmd1)}$(cmd2)" runs the wrong command bug. ... From FreeBSD
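The shape of the expansion named in this fix can be reproduced with harmless commands standing in for cmd1 and cmd2. A sketch: the variable `v` is deliberately unset, so the pattern part has nothing to trim and only the second command substitution should contribute output.

```sh
unset v
out="${v#$(echo cmd1)}$(echo cmd2)"
printf '%s\n' "$out"
```

A correct shell prints `cmd2`; the bug described above is that the wrong command was run for the second substitution.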
Implement the NETBSD_SHELL readonly unexportable unimportable variable (with its current value set at 20160401) as discussed on current-users and tech-userlevel. This also includes the necessary support to implement it properly (particularly the unexportable part) and adds options to the export command to support unexportable variables. Also implement the "posix" option (no single letter equivalent) which gets its default value from whether or not POSIXLY_CORRECT is set in the environment when the shell starts (but can be changed just like any other option using -o and +o on the command line, or the set builtin command.) While there, fix all uses of options so it is possible to have options that have a short (one char) name, and no long name, just as it has been possible to have options with a long name and no short name, though there are currently none (with no long name). For now, the only use of the posix option is to control whether ${ENV} is read at startup by a non-interactive shell, so changing it with set is not useful - that might change in the future. (from kre@)
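One way a script might use NETBSD_SHELL is as a feature test. The sketch below is an assumption about usage, not something the commit prescribes; `shell_kind` is a hypothetical function name.

```sh
shell_kind() {
    # ${NETBSD_SHELL-} expands to nothing when the variable is unset,
    # so this is safe even under "set -u".
    if [ -n "${NETBSD_SHELL-}" ]; then
        printf 'NetBSD sh, version %s\n' "$NETBSD_SHELL"
    else
        printf 'some other shell\n'
    fi
}

shell_kind
```

Because the variable can be neither exported nor imported, a non-empty value should mean the script really is running under this shell rather than having inherited the variable from the environment.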
After discussions with Jilles Tjoelker (FreeBSD shell) and following a suggestion from him, the way the fix to PR bin/50993 was implemented has changed a little. There are three steps involved in processing a here document, reading it, parsing it, and then evaluating it before applying it to the correct file descriptor for the command to use. The third of those is not related to this problem, and has not changed. The bug was caused by combining the first two steps into one (and not doing it correctly - which would be hard that way.) The fix is to split the first two stages into separate events. The original fix moved the 2nd stage (parsing) to just immediately before the 3rd stage (evaluation.) Jilles pointed out some unwanted side effects from doing it that way, and suggested moving the 2nd stage to immediately after the first. This commit makes that change. The effect is to revert the changes to expand.c and parser.h (which are no longer needed) and simplify slightly the change to parser.c. (from kre@)
PR bin/50993 - this is a significant rewrite of the way that here documents are processed. Now, when first detected, they are simply read; the only change made to the text is to join lines ended with a \ to the subsequent line, as otherwise end marker detection does not work correctly (for here docs with an unquoted end marker only, of course). This patch also moves the "internal subroutine" for looking for the end marker out of readtoken1() (which had to happen as readtoken1 is no longer reading the here doc when it is needed) - that uses code mostly taken from FreeBSD's sh (thanks!) and along the way results in some restrictions on what the end marker can be being removed. We still do not allow all we should. (from kre@)
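The line joining mentioned above can be observed from a script. This is only an illustration of the standard behaviour, not of the parser internals; the function names joined and literal are made up for the example.

```sh
# Unquoted end marker: a backslash-newline pair joins the two lines.
joined() {
cat <<EOF
one \
two
EOF
}

# Quoted end marker: the body is taken literally, backslash included.
literal() {
cat <<'EOF'
one \
two
EOF
}

joined     # prints "one two"
literal    # prints "one \" and then "two"
```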
General KNF and source code cleanups, avoid scattering the magic string " \t\n" all over the place, slightly improved syntax error messages, restructured some of the code for clarity, don't allow IFS to be imported through the environment, and remove the (never) conditionally compiled ATTY option. Apart from one or two syntax error messages, and ignoring IFS if present in the environment, this is intended to have no user visible changes. (from kre@)
PR/19832, PR/35423: Fix handling 0x81 and 0x82 characters in expansions ($VAR etc) that are used to generate filenames for redirections. (from kre)
PR bin/50834o: fix expansions of (unquoted) ${unset_var-} and ""$@ (from kre)
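A small sketch of the two expansions named above, as a POSIX shell is expected to treat them. The helper count_args is hypothetical and simply reports how many arguments it received.

```sh
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

unset unset_var
set --                               # no positional parameters

none=$(count_args ${unset_var-})     # unquoted and empty: no field at all
one=$(count_args ""$@)               # the quoted "" survives as one empty field

echo "$none $one"                    # prints "0 1"
```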
remove useless casts
PR bin/43469 - correctly handle quoting of the pattern part of ${var%pat} type expansions. (from kre)
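As an illustration of why quoting in the pattern matters (a sketch of the expected behaviour, with made-up variable names): a quoted pattern is matched literally, while an unquoted one is treated as a glob pattern.

```sh
var='abc*'
pat='*'

quoted=${var%"$pat"}    # "$pat" is literal: the trailing asterisk is removed
unquoted=${var%$pat}    # $pat is a glob: its shortest suffix match is empty

echo "$quoted"          # prints "abc"
echo "$unquoted"        # prints "abc*"
```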
PR/50179: Timo Buhrmester: sh(1) variable expansion bug
Use an explicit body for a "until not EINTR" loop.
Rebase to HEAD as of a few days ago.
sync with head. for a reference, the tree before this commit was tagged as yamt-pagecache-tag8. this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
Add wctype(3) support to Shell Patterns. Obtained from FreeBSD.
Fix PR bin/48202 [non-critical/low]: sh +nounset and `for X; do` iteration fails if parameter set empty by applying and testing FreeBSD's patch of Oct 24 2009 for this; see http://svnweb.freebsd.org/base/head/bin/sh/expand.c?r1=198453&r2=198454 Also created an ATF test in tests/bin/sh/t_expand.sh for this error and corrected a space->tabs problem there as well.
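A sketch of the case the PR describes, assuming the intended behaviour is that an empty parameter list under nounset is not an error. The function name count_params is hypothetical; its body is a subshell so that set -u does not leak into the caller.

```sh
# Hypothetical helper: count positional parameters with nounset enabled.
count_params() (
    set -u
    n=0
    for x; do
        n=$((n + 1))
    done
    echo "$n"
)

count_params         # prints 0: the loop body never runs
count_params a b     # prints 2
```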
add crude $LINENO support for FreeBSD
Pull up following revision(s) (requested by gdt in ticket #1851): bin/sh/expand.c: revision 1.88 bin/sh/expand.h: revision 1.19 Fix the expansion of "${foo-$bar}" so that IFS isn't applied when expanding $bar. Noted by Greg Troxel on tech-userlevel running some 'git' tests. Should fix PR bin/47361
resync with head
Pull up the following revision(s) (requested by dsl in ticket #773): bin/sh/expand.c: revision 1.88 bin/sh/expand.h: revision 1.19 Fix the expansion of "${foo-$bar}" so that IFS isn't applied when expanding $bar. Should fix PR bin/47361
sync with head
Fix the expansion of "${foo-$bar}" so that IFS isn't applied when expanding $bar. Noted by Greg Troxel on tech-userlevel running some 'git' tests. Should fix PR bin/47361
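A sketch of the behaviour this fix is about, assuming the expansion meant is the double-quoted default-value form. The helper count_args is hypothetical.

```sh
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

unset foo
bar='a b'

quoted=$(count_args "${foo-$bar}")   # inside double quotes: no IFS splitting
unquoted=$(count_args ${foo-$bar})   # unquoted: split on the space

echo "$quoted $unquoted"             # prints "1 2"
```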
sync with head
include <limits.h> for CHAR_MIN/CHAR_MAX
Pull up following revision(s) (requested by christos in ticket #1665): bin/sh/expand.c: revision 1.85 PR/45269: Andreas Gustafsson: Instead of falling off the edge when eating trailing newlines if the block has moved, arrange so that trailing newlines are never placed in the string in the first place, by accumulating them and adding them only after we've encountered a non-newline character. This allows also for more efficient appending since we know how much we need beforehand. From FreeBSD.
NULL does not need a cast
PR/45269: Andreas Gustafsson: Instead of falling off the edge when eating trailing newlines if the block has moved, arrange so that trailing newlines are never placed in the string in the first place, by accumulating them and adding them only after we've encountered a non-newline character. This allows also for more efficient appending since we know how much we need beforehand. From FreeBSD.
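The user-visible rule behind this change can be checked with a short sketch: command substitution drops every trailing newline but keeps the interior ones. The variable name out is arbitrary.

```sh
# Three trailing newlines are removed; the blank line in the middle stays.
out=$(printf 'first\n\nlast\n\n\n')

echo "${#out}"    # prints 11: "first" + two newlines + "last"
```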
Catchup with rmind-uvmplock merge.
PR/45069: Henning Petersen: Use prototypes from builtins.h .
Use %zu in printf format for size_t value.
fix -Wsign-compare issues
use EXP_CASE only when trimming and unquoted.
PR/36954: Roland Illig: don't eat backslash escapes in variable patterns. Makes ${line%%\**} work.
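For illustration, the expansion quoted above strips everything from the first literal asterisk onwards. The sample value of line is made up.

```sh
line='a*b*c'

# \* is a literal asterisk; the final * is a wildcard; %% takes the longest suffix.
head=${line%%\**}

echo "$head"    # prints "a"
```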
The field width passed for a %.*s printf format is supposed to be int, not ptrdiff_t; on 64-bit platforms the latter will be too wide. Adjust accordingly.
Pull up following revision(s) (requested by apb in ticket #570): bin/sh/expand.c: revision 1.78 bin/sh/arith.y: revision 1.18 bin/sh/expand.h: revision 1.17 regress/bin/sh/expand.sh: revision 1.4 bin/sh/sh.1: revision 1.86 bin/sh/arith_lex.l: revision 1.14 Make /bin/sh use intmax_t (instead of int) for arithmetic in $((...)).
Make /bin/sh use intmax_t (instead of int) for arithmetic in $((...)).
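A quick check of the wider arithmetic, assuming a shell whose intmax_t is at least 64 bits (a shell limited to 32-bit int arithmetic would wrap around instead).

```sh
# One past the largest 32-bit signed value.
big=$((2147483647 + 1))

echo "$big"    # prints 2147483648 when arithmetic is wider than 32 bits
```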
s/apparant/apparent/, from Zafer.
Pull up following revision(s) (requested by dsl in ticket #1488): bin/sh/expand.c: revision 1.76 Set the 'not a parameter' flag when we skip initial whitespace. Otherwise: ./sh -c 'x=" "; for a in $x; do echo a${a}a; done' is processed as a single empty parameter (instead of no parameters). Should fix the breakage I introduced in rev 1.75 and PR/34256 and PR/34254
Pull up following revision(s) (requested by dsl in ticket #1487): bin/sh/expand.c: revision 1.75 Rework the code changes from revisions 1.69, 1.70 and 1.74 so that the code behaves correctly. As far as I can tell, "x$@y" now expands correctly, as does IFS=:; set -$IFS. Fixes PR/33472 (again) and PR/33956
Pull up following revision(s) (requested by dsl in ticket #84): bin/sh/expand.c: revision 1.76 Set the 'not a parameter' flag when we skip initial whitespace. Otherwise: ./sh -c 'x=" "; for a in $x; do echo a${a}a; done' is processed as a single empty parameter (instead of no parameters). Should fix the breakage I introduced in rev 1.75 and PR/34256 and PR/34254
Pull up following revision(s) (requested by dsl in ticket #83): bin/sh/expand.c: revision 1.75 Rework the code changes from revisions 1.69, 1.70 and 1.74 so that the code behaves correctly. As far as I can tell, "x$@y" now expands correctly, as does IFS=:; set -$IFS. Fixes PR/33472 (again) and PR/33956
Set the 'not a parameter' flag when we skip initial whitespace. Otherwise: ./sh -c 'x=" "; for a in $x; do echo a${a}a; done' is processed as a single empty parameter (instead of no parameters). Should fix the breakage I introduced in rev 1.75 and PR/34256 and PR/34254
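The one-liner in this message can be turned into a small check. It counts loop iterations instead of echoing, and the counter name n is arbitrary.

```sh
x=" "
n=0

# $x contains only IFS whitespace, so it expands to no fields at all.
for a in $x; do
    n=$((n + 1))
done

echo "$n"    # prints 0
```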
Rework the code changes from revisions 1.69, 1.70 and 1.74 so that the code behaves correctly. As far as I can tell, "x$@y" now expands correctly, as does IFS=:; set -$IFS. Fixes PR/33472 (again) and PR/33956
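A sketch of what a correct expansion of "x$@y" looks like: the prefix attaches to the first parameter and the suffix to the last. The helper show_args is hypothetical.

```sh
# Hypothetical helper: print each argument wrapped in angle brackets.
show_args() {
    for arg in "$@"; do
        printf '<%s>' "$arg"
    done
    echo
}

set -- a b
show_args "x$@y"    # prints "<xa><by>"
```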
Pull up following revision(s) (requested by dsl in ticket #1336): bin/sh/expand.c: revision 1.74 When expanding "$@" add a \0 byte after the last argument (as well as all the earlier ones) so that a separator is added before it when it is empty. This wasn't needed before a recent change that changed the behaviour of trailing whitespace IFS characters. Fixed PR/33472
Pull up following revision(s) (requested by dsl in ticket #1336): bin/sh/expand.c: revision 1.74 When expanding "$@" add a \0 byte after the last argument (as well as all the earlier ones) so that a separator is added before it when it is empty. This wasn't needed before a recent change that changed the behaviour of trailing whitespace IFS characters. Fixed PR/33472
When expanding "$@" add a \0 byte after the last argument (as well as all the earlier ones) so that a separator is added before it when it is empty. This wasn't needed before a recent change that changed the behaviour of trailing whitespace IFS characters. Fixed PR/33472
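Seen from a script, the requirement is that an empty last argument is still a separate field of "$@". A minimal sketch, with a hypothetical count_args helper:

```sh
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

set -- a ""
count_args "$@"    # prints 2: the empty last argument is preserved

set -- ""
count_args "$@"    # prints 1
```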
Coverity CID 620: Remove dead code.
TOG require that 'set +o' output the options in a form suitable for restoring them - make it so.
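A sketch of the save-and-restore idiom this enables. It assumes pathname expansion was enabled (noglob off) when the snippet started, and uses noglob only as a convenient option to toggle.

```sh
saved=$(set +o)     # capture the current option settings as commands
set -f              # change something: turn pathname expansion off
eval "$saved"       # replay the captured commands

case $- in
*f*) echo "noglob still set" ;;
*)   echo "noglob restored" ;;
esac
```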
Pull up following revision(s) (requested by martin in ticket #1418): bin/sh/expand.c: revision 1.68 expbackq() was incorrectly backing up a temporary buffer when removing \n from the end of output of commands inside $(...) substitutions. If the program output is n*128+1 bytes long (ending in a \n) then the code checks buf[-1] for another \n - looking at uninitialised stack. On a big-endian system an integer of value 10 will satisfy this (unlikely on little endian) and can happen depending on the last code path to use a lot of stack! This caused the problem with newvers.sh on sparc64 after ', 2005' was added to the date list. Fixed PR/28852
appease gcc -Wuninitialized
Pull up revision 1.68 (requested by martin in ticket #1418): expbackq() was incorrectly backing up a temporary buffer when removing \n from the end of output of commands inside $(...) substitutions. If the program output is n*128+1 bytes long (ending in a \n) then the code checks buf[-1] for another \n - looking at uninitialised stack. On a big-endian system an integer of value 10 will satisfy this (unlikely on little endian) and can happen depending on the last code path to use a lot of stack! This caused the problem with newvers.sh on sparc64 after ', 2005' was added to the date list. Fixed PR/28852
Pull up revision 1.70 (requested by dsl in ticket #119): Check quoting before merging ifs regions. sh -c 'set -- a; x="b c"; set -- "$@"$x' now correctly gives $1=ab, $2=c
Pull up revision 1.69 (requested by dsl in ticket #118): Don't merge ifs regions with different quoting requirements
Check quoting before merging ifs regions. sh -c 'set -- a; x="b c"; set -- "$@"$x' now correctly gives $1=ab, $2=c
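The example in this message, written out as a script that prints the resulting parameters:

```sh
set -- a
x="b c"

# "$@" is quoted, $x is not: the last field of "$@" joins the first field of $x.
set -- "$@"$x

echo "$#: $1 $2"    # prints "2: ab c"
```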
Don't merge ifs regions with different quoting requirements
expbackq() was incorrectly backing up a temporary buffer when removing \n from the end of output of commands inside $(...) substitutions. If the program output is n*128+1 bytes long (ending in a \n) then the code checks buf[-1] for another \n - looking at uninitialised stack. On a big-endian system an integer of value 10 will satisfy this (unlikely on little endian) and can happen depending on the last code path to use a lot of stack! This caused the problem with newvers.sh on sparc64 after ', 2005' was added to the date list. Fixed PR/28852
Add new builtin `wordexp' to support wordexp(3). From FreeBSD. Provided in PR lib/26123. Approved by kleink@.
Correctly apply IFS to unquoted text in ${x-text}. Fixes PR/26058 and the 'for i in ${x-a b c}; do ...' and ${x-'a b' c}. I can't find a PR for the latter problem. Regression test going in shortly.
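A sketch of the two cases named above, counting the fields each expansion produces. The helper count_args is hypothetical.

```sh
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

unset x
count_args ${x-a b c}      # prints 3: unquoted default text is split on IFS
count_args ${x-'a b' c}    # prints 2: the quoted part stays one field
```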
Remove a broken optimisation that crept in earlier today.
Kill a diagnostic I accidentally left in.
No functional changes (intended). Rename some variables, add some comments, and restructure a little. In preparation for fixing "set ${x-a b c}" and friends.
Undo previous fix, breaks: #!/bin/sh echo ${1+"$@"} ./sh.new foo.sh a b c a b c b c I'll revisit this when I have some more time.
"for i in ${x-a b c}; do echo $i; done" should print "a\nb\nc\n" not "a b c\n" like other shells do. mark the expansion for ifs splitting. XXX: linux has a very complicated fix for this. I wonder why.
minor optimization in evalvar() change sent in bin/23813 by VaX#n8
Fix 'set "*" b; case "* b" in "$@") ...' and 'set "*"; case 1 in "${#1}") ...' Which got broken by the previous fix.
PR/22640: Paul Jarc: sh mishandles positional parameters in case. Fixed from FreeBSD PR 56147.
Move UCB-licensed code from 4-clause to 3-clause licence. Patches provided by Joel Baker in PR 22249, verified by myself.
Fixes from David Laight: - ansification - format of output of jobs command (etc) - job identifiers %+, %- etc - $? and $(...) - correct quoting of output of set, export -p and readonly -p - differentiation between normal and 'posix special' builtins - correct behaviour (posix) for errors on builtins and special builtins - builtin printf and kill - set -o debug (if compiled with DEBUG) - cd src obj (as ksh - too useful to do without) - unset -e name, remove non-readonly variable from export list. (so I could unset -e PS1 before running the test shell...)
Revert previous change. No need to save rootshell. It is only affecting the non-vfork case. Having said that, it would be nice if pipelines of simple commands were vforked too. Right now they are not. Explain that setpgid() might fail because we are doing it both in the parent and the child case, because we don't know which one will come first. Suspending a pipeline prints %1 Suspended n times where n is the number of processes, but that was there before. It is easy to fix, but I'll leave the code alone for now.
Deal with rootshell not being maintained correctly in the vfork() case. Propagate isroot, throughout the eval process and maintain it properly. Fixes sleep 10 | cat^C not exiting because sleep and cat ended up in their own process groups, because wasroot was always true in the children.
Implement unset variable error messages from Ben Harris.
Pull up revision 1.52 (requested by itojun): Do not truncate expr > 10 digits. Fixes PR#13943.
Pull up revision 1.52 (requested by itojun): Do not truncate expr > 10 digits. Fixes PR#13943.
make sure we do not truncate arith expression > 10 digits. FreeBSD bin/sh/expand.c revision 1.15. NetBSD PR 13943.
Globbing should match broken symlinks. stat()->lstat() to fix this.
remove redundant declarations and nested externs.
Fix doubled 'the' in comment.
compile with WARNS = 2
Fix for bin/7502, from Tor Egge / FreeBSD. Their commit message: > During variable expansion, the internal representation of the expression > might be relocated. Handle this case.
Pull up 1.45-1.46. Corrects what's obviously a typo.
Correct a rather obvious typo (once Tor Egge pointed it out to me) in the last change.
PR/7231: Havard Eidnes: Shell quoting/trimming problem
Fix off-by-one error in the starting point to search for an arithmetic expression.
PR/5577: Craig M. Chase: sh does not build with PARALLEL set. - Added YHEADER in Makefile, removed arith.h and adjusted the sources.
Patches from Tor Egge (via Havard Eidnes) to fix various bugs in field splitting and combining. (Note: Some of these are not strictly bugs, but differences between traditional Bourne shell and POSIX.)
Be more retentive about use of NOTREACHED and noreturn.
const poisoning.
Sync with trunk, per request of christos.
- change "extern" variables into int's - remove extern'd variables not actually referenced - don't use char as an array index
Fix the VSTRIMRIGHT* bugs... The problem was not the string length computation, but lack of '\0' termination. Factor this segment out as common code too, while I am there.
off by one error in ${%%}
Previous fix broke $var quoting. Try again differently :-)
Fix bug introduced by EXP_RECORD, where in case there was a variable expansion involved in the `for' list, the list was recorded twice, leading to incorrect argument expansion. Introduce ifsfree() function that frees the IFS region list, GC'ing duplicated code.
PR/4851: Benjamin Lorenz: In the "for <var> in <args>" construct <args> was not marked as a region to be handled by ifsbreakup. Add EXP_RECORD to indicate that the argument string needs to be recorded.
Unfortunately (as I expected) the previous change broke: sleep cmd='set `type "sleep"`; eval echo \$$#' which=`eval $cmd` echo $which because the region did not get recorded at all, and it was interpreted as a single word. I modified the code to keep track when the result of a backquote expansion has been recorded to avoid recording it twice. I still feel that this is not the right fix... More to come.
PR/4547: Joern Clausen: Incorrect argument expansion in backquote variable assignment. E.g. echo ${foo:=`echo 1 2 3 4`} prints: 1 2 3 1 2 3 4 because when the arguments are not quoted, the backquote result gets recorded twice. The fix right now is to comment out the record_region() call in expbackq(). I hope that it does not break anything else.
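A sketch of the expected result for the expansion in this PR, using $(...) in place of backquotes for readability. The helper count_args is hypothetical.

```sh
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

unset foo
count_args ${foo:=$(echo 1 2 3 4)}    # prints 4: each word appears once
echo "$foo"                           # prints "1 2 3 4"
```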
Make code agree with man page in processing expansion of "$*". Fix from PR 2647.
Fix compiler warnings.
PR/3352: From Hiroyuki Ito: ${#1} was not being expanded properly if there was a need to allocate another stack block.
Pull up latest sh(1). Fixes yet more bugs.
varisset fixes: - treat $0 specially since it is not in shellparams - check the number of parameters instead of walking the parameters array to avoid checking against the null terminated element.
Pull up off-by-one fix.
PR/3269: Off by one in varisset(), caused variable substitution not to count the last positional parameter as set.
Update /bin/sh from trunk per request of Christos Zoulas. Fixes many bugs.
- varisset(): In positional arguments, take into account VSNUL so that: set -- ""; echo ${1:-wwww} works. - when expanding arithmetic, discard previous ifs recorded regions, since we are doing our own scanning. x=ab; echo $((${#x}+1)) now works. - in ${var#word} fix two bugs: * if there was an exact match, there was an off-by-one bug in the comparison of the words. x=abcd; echo ${x#abcd} * if there was no match, the stack region was not adjusted and the rest of the word was getting written in the wrong place. x=123; echo ${x#abc}X
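The one-line test cases quoted in this message, collected into a runnable sketch with the output a correct shell should give:

```sh
set -- ""
echo "${1:-wwww}"        # prints "wwww": $1 is set but null

x=ab
echo "$((${#x} + 1))"    # prints 3

x=abcd
echo "[${x#abcd}]"       # prints "[]": an exact match removes everything

x=123
echo "${x#abc}X"         # prints "123X": no match leaves the value intact
```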
kill 'register'
A correction to the previous patch from Todd Miller.
echo ${1:-empty} did not do the substitution; from Todd Miller (OpenBSD)
PR/2808: Fix parsing of $n where n > 9 (from FreeBSD)
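The message does not say which spelling was affected. As a sketch of the standard rule, braces are needed to reach a parameter above 9, while $10 is read as $1 followed by a literal 0.

```sh
set -- a b c d e f g h i j

echo "${10}"    # prints "j": the tenth positional parameter
echo "$10"      # prints "a0": $1 followed by the character 0
```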
Don't infinite loop with: unset var echo ${var:=}
Fix PR/2070: Ksh style variable modifiers were broken in /bin/sh, from enami tsugutomo
Fixed new bug the previous fix introduced: false foo=bar echo $? would print 1 Also fixed the long standing bug: false echo `echo $?` would print 0 The exitstatus needs rethinking and rewriting. The trial and error method is not very efficient
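The cases discussed here and in the entry below can be sketched as follows, with the values a correct shell should produce. The set +e line is an addition for the example only, so that the deliberately failing commands do not end a script run with errexit; the variable names are arbitrary.

```sh
set +e                   # example only: let the failing commands below continue

false
foo=bar
after_assign=$?          # 0: a plain assignment succeeds

false
inner=$(echo $?)         # 1: the substitution sees the status of false

x=$(false)
after_subst=$?           # 1: the assignment reports the substitution's status

echo "$after_assign $inner $after_subst"    # prints "0 1 1"
```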
Merge in my changes from vangogh, and fix the x=`false`; echo $? == 0 bug.
convert to new RCS id conventions.
Oops... typo in the IFS previous fix.
Changed so that backquote expansion eats all trailing newlines, not just the last one. Reported by guido@gvr.win.tue.nl (Guido van Rooij). Repeat By: echo "`cat file-with-many-newlines`"
Changed IFS string-splitting so that it breaks spaces even when IFS does not begin with a space, but contains one. Fixes PR bin/809. #!/bin/sh list="a b c " echo "With ordinary IFS" for i in $list;do echo $i done IFS=":${IFS}" echo "With changed IFS" for i in $list;do echo $i done Note that before the fix ":${IFS}" behaved differently than "${IFS}:".
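A compact version of the check in this message, assuming IFS holds its default value (space, tab, newline) at the start. Each test runs in a subshell so the modified IFS does not persist; count_args is a hypothetical helper.

```sh
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

list="a b c "

( IFS=":$IFS"; count_args $list )    # prints 3
( IFS="$IFS:"; count_args $list )    # prints 3: position in IFS does not matter
```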
Added the variable expansions that were documented in the manual but not implemented: ${#WORD} ${WORD%PAT} ${WORD%%PAT} ${WORD#PAT} ${WORD##PAT}
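For reference, a short sketch of what each of the five forms yields. The sample value of WORD is made up for the illustration.

```sh
WORD=src/bin/sh/expand.c

echo "${#WORD}"       # prints 19: length of the value
echo "${WORD%/*}"     # prints "src/bin/sh": shortest matching suffix removed
echo "${WORD%%/*}"    # prints "src": longest matching suffix removed
echo "${WORD#*/}"     # prints "bin/sh/expand.c": shortest matching prefix removed
echo "${WORD##*/}"    # prints "expand.c": longest matching prefix removed
```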
from James Jegers <jimj@miller.cs.uwm.edu>: quiet -Wall, and squelch some of the worst style errors.
update from trunk
Fix problem with character classes matching a terminating NUL, from Henry Spencer.
Add RCS ids.
Include appropriate header files to bring function prototypes into scope.
sync with 4.4lite
44lite code
Last patch was wrong; just save argbackq around the argstr() call.
evalvar(): If subtype is VSASSIGN (or VSQUESTION), argstr() already rolled forward the backquote queue. If VSQUESTION it doesn't matter because we already exited with an error.
Add RCS identifiers.
Jim "wilson@moria.cygnus.com" Wilson's patches to make C News (and other things) work.
changed "Id" to "Header" for rcsids
added rcs ids to all files
initial import of 386bsd-0.1 sources
Initial revision