Up to [cvs.NetBSD.org] / src / sys / ufs / lfs
Request diff between arbitrary revisions
Keyword substitution: kv
Default branch: MAIN
Round of uvm.h cleanup. The poorly named uvm.h is generally supposed to be for uvm-internal users only. - Narrow it to files that actually need it -- mostly files that need to query whether curlwp is the pagedaemon, which should maybe be exposed by an external header. - Use uvm_extern.h where feasible and uvm_*.h for things not exposed by it. We should split up uvm_extern.h but this will serve for now to reduce the uvm.h dependencies. - Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use UVMHIST(ubchist), since ubchist is declared in uvm.h but the reference evaporates if UVMHIST is not defined, so we reduce header file dependencies. - Make uvm_device.h and uvm_swap.h independently includable while here. ok chs@
Merge changes from current as of 20200406
Sync with head.
UVM locking changes, proposed on tech-kern: - Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock. - Break v_interlock and vmobjlock apart. v_interlock remains a mutex. - Do partial PV list locking in the x86 pmap. Others to follow later.
Sync with head.
Merge from yamt-pagecache (after much testing): - Reduce unnecessary page scan in putpages esp. when an object has a ton of pages cached but only a few of them are dirty. - Reduce the number of pmap operations by tracking page dirtiness more precisely in uvm layer.
- Add and use wrapper functions that take and acquire page interlocks, and pairs of page interlocks. Require that the page interlock be held over calls to uvm_pageactivate(), uvm_pagewire() and similar. - Solve the concurrency problem with page replacement state. Rather than updating the global state synchronously, set an intended state on individual pages (active, inactive, enqueued, dequeued) while holding the page interlock. After the interlock is released put the pages on a 128 entry per-CPU queue for their state changes to be made real in batch. This results in in a ~400 fold decrease in contention on my test system. Proposed on tech-kern but modified to use the page interlock rather than atomics to synchronise as it's much easier to maintain that way, and cheaper.
Break the global uvm_pageqlock into a per-page identity lock and a private lock for use of the pagedaemon policy code. Discussed on tech-kern. PR kern/54209: NetBSD 8 large memory performance extremely low PR kern/54210: NetBSD-8 processes presumably not exiting PR kern/54727: writing a large file causes unreasonable system behaviour
update from HEAD
Pull up following revision(s) (requested by pgoyette in ticket #335): share/man/man9/kernhist.9: 1.5-1.8 sys/arch/acorn26/acorn26/pmap.c: 1.39 sys/arch/arm/arm32/fault.c: 1.105 via patch sys/arch/arm/arm32/pmap.c: 1.350, 1.359 sys/arch/arm/broadcom/bcm2835_bsc.c: 1.7 sys/arch/arm/omap/if_cpsw.c: 1.20 sys/arch/arm/omap/tiotg.c: 1.7 sys/arch/evbarm/conf/RPI2_INSTALL: 1.3 sys/dev/ic/sl811hs.c: 1.98 sys/dev/usb/ehci.c: 1.256 sys/dev/usb/if_axe.c: 1.83 sys/dev/usb/motg.c: 1.18 sys/dev/usb/ohci.c: 1.274 sys/dev/usb/ucom.c: 1.119 sys/dev/usb/uhci.c: 1.277 sys/dev/usb/uhub.c: 1.137 sys/dev/usb/umass.c: 1.160-1.162 sys/dev/usb/umass_quirks.c: 1.100 sys/dev/usb/umass_scsipi.c: 1.55 sys/dev/usb/usb.c: 1.168 sys/dev/usb/usb_mem.c: 1.70 sys/dev/usb/usb_subr.c: 1.221 sys/dev/usb/usbdi.c: 1.175 sys/dev/usb/usbdi_util.c: 1.67-1.70 sys/dev/usb/usbroothub.c: 1.3 sys/dev/usb/xhci.c: 1.75 sys/external/bsd/drm2/dist/drm/i915/i915_gem.c: 1.34 sys/kern/kern_history.c: 1.15 sys/kern/kern_xxx.c: 1.74 sys/kern/vfs_bio.c: 1.275-1.276 sys/miscfs/genfs/genfs_io.c: 1.71 sys/sys/kernhist.h: 1.21 sys/ufs/ffs/ffs_balloc.c: 1.63 sys/ufs/lfs/lfs_vfsops.c: 1.361 sys/ufs/lfs/ulfs_inode.c: 1.21 sys/ufs/lfs/ulfs_vnops.c: 1.52 sys/ufs/ufs/ufs_inode.c: 1.102 sys/ufs/ufs/ufs_vnops.c: 1.239 sys/uvm/pmap/pmap.c: 1.37-1.39 sys/uvm/pmap/pmap_tlb.c: 1.22 sys/uvm/uvm_amap.c: 1.108 sys/uvm/uvm_anon.c: 1.64 sys/uvm/uvm_aobj.c: 1.126 sys/uvm/uvm_bio.c: 1.91 sys/uvm/uvm_device.c: 1.66 sys/uvm/uvm_fault.c: 1.201 sys/uvm/uvm_km.c: 1.144 sys/uvm/uvm_loan.c: 1.85 sys/uvm/uvm_map.c: 1.353 sys/uvm/uvm_page.c: 1.194 sys/uvm/uvm_pager.c: 1.111 sys/uvm/uvm_pdaemon.c: 1.109 sys/uvm/uvm_swap.c: 1.175 sys/uvm/uvm_vnode.c: 1.103 usr.bin/vmstat/vmstat.c: 1.219 Reorder to test for null before null deref in debug code -- Reorder to test for null before null deref in debug code -- KNF -- No need for '\n' in UVMHIST_LOG -- normalise a BIOHIST log message -- Update the kernhist(9) kernel history code to address issues identified in PR kern/52639, as well as some general cleaning-up... (As proposed on tech-kern@ with additional changes and enhancements.) Details of changes: * All history arguments are now stored as uintmax_t values[1], both in the kernel and in the structures used for exporting the history data to userland via sysctl(9). This avoids problems on some architectures where passing a 64-bit (or larger) value to printf(3) can cause it to process the value as multiple arguments. (This can be particularly problematic when printf()'s format string is not a literal, since in that case the compiler cannot know how large each argument should be.) * Update the data structures used for exporting kernel history data to include a version number as well as the length of history arguments. * All [2] existing users of kernhist(9) have had their format strings updated. Each format specifier now includes an explicit length modifier 'j' to refer to numeric values of the size of uintmax_t. * All [2] existing users of kernhist(9) have had their format strings updated to replace uses of "%p" with "%#jx", and the pointer arguments are now cast to (uintptr_t) before being subsequently cast to (uintmax_t). This is needed to avoid compiler warnings about casting "pointer to integer of a different size." * All [2] existing users of kernhist(9) have had instances of "%s" or "%c" format strings replaced with numeric formats; several instances of mis-match between format string and argument list have been fixed. * vmstat(1) has been modified to handle the new size of arguments in the history data as exported by sysctl(9). * vmstat(1) now provides a warning message if the history requested with the -u option does not exist (previously, this condition was silently ignored, with only a single blank line being printed). * vmstat(1) now checks the version and argument length included in the data exported via sysctl(9) and exits if they do not match the values with which vmstat was built. * The kernhist(9) man-page has been updated to note the additional requirements imposed on the format strings, along with several other minor changes and enhancements. [1] It would have been possible to use an explicit length (for example, uint64_t) for the history arguments. But that would require another "rototill" of all the users in the future when we add support for an architecture that supports a larger size. Also, the printf(3) format specifiers for explicitly-sized values, such as "%"PRIu64, are much more verbose (and less aesthetically appealing, IMHO) than simply using "%ju". [2] I've tried very hard to find "all [the] existing users of kernhist(9)" but it is possible that I've missed some of them. I would be glad to update any stragglers that anyone identifies. -- For some reason this single kernel seems to have outgrown its declared size as a result of the kernhist(9) changes. Bump the size. XXX The amount of increase may be excessive - anyone with more detailed XXX knowledge please feel free to further adjust the value appropriately. -- Misssed one cast of pointer --> uintptr_t in previous kernhist(9) commit -- And yet another one. :( -- Use correct mark-up for NetBSD version. -- More improvements in grammar and readability. -- Remove a stray '"' (obvious typo) and add a couple of casts that are probably needed. -- And replace an instance of "%p" conversion with "%#jx" -- Whitespace fix. Give Bl tag table a width. Fix Xr.
Pull up following revision(s) (requested by maya in ticket #330): sbin/fsck_lfs/inode.c: 1.69 sbin/fsck_lfs/lfs.c: 1.73 sbin/fsck_lfs/pass6.c: 1.50 sbin/fsck_lfs/segwrite.c: 1.46 sys/ufs/lfs/lfs.h: 1.202-1.203 sys/ufs/lfs/lfs_accessors.h: 1.48 sys/ufs/lfs/lfs_alloc.c: 1.136-1.137 sys/ufs/lfs/lfs_balloc.c: 1.94 sys/ufs/lfs/lfs_bio.c: 1.141 sys/ufs/lfs/lfs_extern.h: 1.113 sys/ufs/lfs/lfs_inode.c: 1.156-1.157 sys/ufs/lfs/lfs_inode.h: 1.20, 1.21, 1.23 sys/ufs/lfs/lfs_itimes.c: 1.20 sys/ufs/lfs/lfs_pages.c: 1.13-1.15 sys/ufs/lfs/lfs_rename.c: 1.22 sys/ufs/lfs/lfs_segment.c: 1.270-1.275 sys/ufs/lfs/lfs_subr.c: 1.94-1.97 sys/ufs/lfs/lfs_syscalls.c: 1.175 sys/ufs/lfs/lfs_vfsops.c: 1.360 sys/ufs/lfs/lfs_vnops.c: 1.316-1.321 sys/ufs/lfs/ulfs_inode.c: 1.20 sys/ufs/lfs/ulfs_inode.h: 1.24 sys/ufs/lfs/ulfs_lookup.c: 1.41 sys/ufs/lfs/ulfs_quota2.c: 1.31 sys/ufs/lfs/ulfs_readwrite.c: 1.24 sys/ufs/lfs/ulfs_vnops.c: 1.49-1.50 Update inode member i_flag --> i_state to keep up with kernel changes Move definition of IN_ALLMOD near the flag it's a mask for. Now we can see that it doesn't match all the flags, but changing that will require more careful thought. Correct confusion between i_flag and i_flags These will have to be renamed. Spotted by Riastradh, thanks! Add an XXX about the missing flags so it's not buried in a commit message. now the XXX count for LFS is 260 Rename i_flag to i_state. The similarity to i_flags has previously caused errors. Use continue to denote the no-op loop to match netbsd style newline for extra clarity. It isn't safe to drain dirops with seglock held, it'll deadlock if there are any dirops. drain before grabbing seglock. lfs_dirops == 0 is always true (as we already drained dirops), so omit that part of the comparison. Fixes a lot of LFS deadlocks. PR kern/52301 Many thanks to dholland for help analyzing coredumps Ifdef out KDASSERT which fires on my machine. Deduplicate sanity check that seglock is held on segunlock Revert r1.272 fix to PR kern/52301, the performance hit is making things unusable. change lfs_nextsegsleep and lfs_allclean_wakeup to use condvar XXX had to use lfs_lock in lfs_segwait, removed kernel_lock, is this appropriate? fix buffer overflow/KASSERT when cookies are supplied lfs no longer uses the ffs-style struct direct, use the correct minimum size from dholland XXX more wrong Consistently use {,UN}MARK_VNODE macros rather than function calls. Not much point doing anything after a panic call Ask some question about the code in a XXX comment XXX question our double-flushing of dirops Fix typo in comment
Update the kernhist(9) kernel history code to address issues identified in PR kern/52639, as well as some general cleaning-up... (As proposed on tech-kern@ with additional changes and enhancements.) Details of changes: * All history arguments are now stored as uintmax_t values[1], both in the kernel and in the structures used for exporting the history data to userland via sysctl(9). This avoids problems on some architectures where passing a 64-bit (or larger) value to printf(3) can cause it to process the value as multiple arguments. (This can be particularly problematic when printf()'s format string is not a literal, since in that case the compiler cannot know how large each argument should be.) * Update the data structures used for exporting kernel history data to include a version number as well as the length of history arguments. * All [2] existing users of kernhist(9) have had their format strings updated. Each format specifier now includes an explicit length modifier 'j' to refer to numeric values of the size of uintmax_t. * All [2] existing users of kernhist(9) have had their format strings updated to replace uses of "%p" with "%#jx", and the pointer arguments are now cast to (uintptr_t) before being subsequently cast to (uintmax_t). This is needed to avoid compiler warnings about casting "pointer to integer of a different size." * All [2] existing users of kernhist(9) have had instances of "%s" or "%c" format strings replaced with numeric formats; several instances of mis-match between format string and argument list have been fixed. * vmstat(1) has been modified to handle the new size of arguments in the history data as exported by sysctl(9). * vmstat(1) now provides a warning message if the history requested with the -u option does not exist (previously, this condition was silently ignored, with only a single blank line being printed). * vmstat(1) now checks the version and argument length included in the data exported via sysctl(9) and exits if they do not match the values with which vmstat was built. * The kernhist(9) man-page has been updated to note the additional requirements imposed on the format strings, along with several other minor changes and enhancements. [1] It would have been possible to use an explicit length (for example, uint64_t) for the history arguments. But that would require another "rototill" of all the users in the future when we add support for an architecture that supports a larger size. Also, the printf(3) format specifiers for explicitly-sized values, such as "%"PRIu64, are much more verbose (and less aesthetically appealing, IMHO) than simply using "%ju". [2] I've tried very hard to find "all [the] existing users of kernhist(9)" but it is possible that I've missed some of them. I would be glad to update any stragglers that anyone identifies.
Sync with HEAD
Rename i_flag to i_state. The similarity to i_flags has previously caused errors.
Eliminate crusty debugging sludge. We have a mostly sane vnode lifecycle now. If this needs debugging, it should be done once at the call site of VOP_RECLAIM.
Sync with HEAD
Sync with HEAD
Make VOP_INACTIVE preserve vnode lock on return. Discussed on tech-kern: https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html Ride 7.99.68, a bumpy bus of incremental vfs improvements!
Remove now redundant calls to fstrans_start()/fstrans_done(). Add fstrans_start()/fstrans_done() to lfs_putpages().
Sync with HEAD
Remove now obsolete operation vcache_remove(). Welcome to 7.99.36
Pull up following revision(s) (requested by dholland in ticket #1188): sys/ufs/lfs/ulfs_inode.c: revision 1.14 Merge ufs_inode.c 1.93: missing unlock on error path.
Sync with HEAD
One more batch of already-synced ufs changes: ufs_extern.h 1.79 is equivalent to ulfs_extern.h 1.14 ufsmount.h 1.43 is (roughly) equivalent to lfs_extern.h 1.102 ufs_inode.c 1.94 does not apply to lfs ufs_inode.c 1.95 does not apply to lfs either ufs_readwrite.c 1.108 is equivalent to ulfs_readwrite.c 1.8 ufs_readwrite.c 1.109 is equivalent to ulfs_readwrite.c 1.9 ufs_readwrite.c 1.110 is equivalent to ulfs_readwrite.c 1.10 ufs_readwrite.c 1.111 does not apply to lfs ufs_readwrite.c 1.112 is equivalent to ulfs_readwrite.c 1.11 ufs_readwrite.c 1.113 is equivalent to ulfs_readwrite.c 1.13 ufs_readwrite.c 1.114 is equivalent to ulfs_readwrite.c 1.14 ufs_readwrite.c 1.115 is equivalent to ulfs_readwrite.c 1.15 ufs_readwrite.c 1.116-1.118 does not apply to lfs ufs_readwrite.c 1.119-1.120 are equivalent to ulfs_readwrite.c 1.16 ufs_rename.c 1.12 is equivalent to lfs_rename.c 1.8 ufs_vnops.c 1.226 is equivalent to ulfs_vnops.c 1.22 and lfs_vnops.c 1.270 ufs_vnops.c 1.227 is equivalent to ulfs_vnops.c 1.23 ufs_vnops.c 1.228-1.229 are equivalent to ulfs_vnops.c 1.24 ufs_vnops.c 1.230 is equivalent to ulfs_vnops.c 1.25 and lfs_vnops.c 1.271 ufs_vnops.c 1.231 originated in lfs ufs_vnops.c 1.232 does not apply to lfs
Merge ufs_inode.c 1.93: missing unlock on error path.
Note more already-merged versions: inode.h 1.68 is subsumed by ulfs_inode.h 1.19 inode.h 1.69-1.72 do not apply to lfs ufs_extern.h 1.74 was covered when lfs was moved to the new vnode cache ufs_extern.h 1.75 is equivalent to ulfs_extern.h 1.13 ufs_extern.h 1.76-1.77 do not apply to lfs ufsmount.h 1.42 does not apply to lfs ufs_inode.c 1.90 is subsumed by ulfs_inode.c 1.10 ufs_inode.c 1.91-1.92 do not apply to lfs ufs_lookup.c 1.130 is subsumed by ulfs_lookup.c 1.24 ufs_lookup.c 1.131 is equivalent to ulfs_lookup.c 1.20 ufs_lookup.c 1.132 is equivalent to ulfs_lookup.c 1.21 ufs_lookup.c 1.133 is equivalent to ulfs_lookup.c 1.22 ufs_lookup.c 1.134 is equivalent to ulfs_lookup.c 1.23 ufs_lookup.c 1.135 is equivalent to ulfs_lookup.c 1.25 ufs_quota2.c 1.38 is equivalent to ulfs_quota2.c 1.17 ufs_quota2.c 1.39 is equivalent to ulfs_quota2.c 1.16 ufs_quota2.c 1.40 is equivalent to ulfs_quota2.c 1.18 ufs_vfsops.c 1.53 is subsumed by lfs_vfsops.c 1.324 ufs_vfsops.c 1.54 is subsumed by lfs_vfsops.c 1.324 ufs_vnops.c 1.223-1.224 do not apply to lfs
Sync with HEAD (as of 26th Dec)
Remove historic references to wapbl.
Sync with HEAD
Add new accessors for the d_type and d_namlen fields of struct lfs_direct. Napalm the old byteswap access logic for these.
Sync with HEAD
Change lfs from hash table to vcache. - Change lfs_valloc() to return an inode number and version instead of a vnode and move lfs_ialloc() and lfs_vcreate() to new lfs_init_vnode(). - Add lfs_valloc_fixed() to allocate a known inode, used by kernel roll forward. - Remove lfs_*ref(), these functions cannot coexist with vcache and their commented behaviour is far away from their implementation. - Add the cleaner lwp and blockinfo to struct ulfsmount so lfs_loadvnode() may use hints from the cleaner. - Remove vnode locks from ulfs_lookup() like we did with ufs_lookup().
Rebase to HEAD as of a few days ago.
sync with head. for a reference, the tree before this commit was tagged as yamt-pagecache-tag8. this commit was splitted into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
sync with head
file ulfs_inode.c was added on branch yamt-pagecache on 2014-05-22 11:41:19 +0000
Remove the now-pointless ulfs ops macros.
Get rid of the ulfs_ops table as we only have one fs in here now.
resync from head
file ulfs_inode.c was added on branch tls-maxphys on 2013-06-23 06:18:39 +0000
There is no WAPBL in LFS.
mp->mnt_wapbl and mp->mnt_wapbl_replay are always NULL in here.
Add lfs_ or ulfs_ in front of extern symbols lacking them, mostly quota-related (and particularly quota2-related) stuff.
Split lfs from ufs step 4: Massedit all ufs symbols to be "ulfs" instead, to make sure there are no conflicts with ufs. Confirmed with grep. (This required changing a few comments that maybe should have been left alone to say "ulfs", but we'll survive that.)
Split lfs from ufs step 3: rearrange config stuff. Add new options: LFS_EI LFS_DIRHASH LFS_EXTATTR LFS_EXTATTR_AUTOSTART LFS_QUOTA LFS_QUOTA2 and update code referring to the corresponding FFS and UFS config symbols to use the LFS versions. Disable the one extant reference to APPLE_UFS in the ulfs files. Use opt_lfs.h only, not opt_ffs.h.
Split lfs from ufs, part 2: Change all <ufs/ufs/foo.h> includes to <ufs/lfs/ulfs_foo.h>.
Split lfs from ufs, part 1: cut and paste 15000 lines of ufs as "ulfs". These are verbatim copies except that I've preserved the ufs rcsids for reference. Also, ufs/quota.h -> ulfs_quotacommon.h ufs/ufs_quota.h -> ulfs_quota.h Splitting lfs from ufs was ok'd by core some years ago. This is not from my original tree, which became unmergeable after the several sets of quota changes; I've done the work over again over the last couple days.