History log of /freebsd-head/sys/cddl/compat/opensolaris/kern/opensolaris_kmem.c
Revision Date Author Comments
5451b35f06da65eff6282c76a7a1061fb222f9ae 01-Sep-2019 markj <markj@FreeBSD.org> Extend uma_reclaim() to permit different reclamation targets.

The page daemon periodically invokes uma_reclaim() to reclaim cached
items from each zone when the system is under memory pressure. This
is important since the size of these caches is unbounded by default.
However it also results in bursts of high latency when allocating from
heavily used zones as threads miss in the per-CPU caches and must
access the keg in order to allocate new items.

With r340405 we maintain an estimate of each zone's usage of its
(per-NUMA domain) cache of full buckets. Start making use of this
estimate to avoid reclaiming the entire cache when under memory
pressure. In particular, introduce TRIM, DRAIN and DRAIN_CPU
verbs for uma_reclaim() and uma_zone_reclaim(). When trimming, only
items in excess of the estimate are reclaimed. Draining a zone
reclaims all of the cached full buckets (the previous behaviour of
uma_reclaim()), and may further drain the per-CPU caches in extreme
cases.

Now, when under memory pressure, the page daemon will trim zones
rather than draining them. As a result, heavily used zones do not incur
bursts of bucket cache misses following reclamation, but large, unused
caches will be reclaimed as before.
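
As a rough illustration of the difference between trimming and draining, the
following is a minimal sketch using a toy model. The names toy_zone,
toy_zone_reclaim, RECLAIM_TRIM and RECLAIM_DRAIN are hypothetical and are not
the UMA data structures or API; the DRAIN_CPU verb is not modelled.

```c
#include <stddef.h>

/* Hypothetical verbs, named after the TRIM and DRAIN verbs described above. */
enum reclaim_verb { RECLAIM_TRIM, RECLAIM_DRAIN };

/* Toy stand-in for a zone's cache of full buckets (not the real UMA type). */
struct toy_zone {
	size_t cached;	/* items currently held in the cache */
	size_t wss;	/* estimate of how many cached items the zone uses */
};

/* Reclaim from the toy zone and return the number of items released. */
static size_t
toy_zone_reclaim(struct toy_zone *z, enum reclaim_verb verb)
{
	size_t keep, freed;

	if (verb == RECLAIM_DRAIN)
		keep = 0;		/* drain: release everything */
	else
		keep = z->cached < z->wss ? z->cached : z->wss;
					/* trim: keep up to the estimate */
	freed = z->cached - keep;
	z->cached = keep;
	return (freed);
}
```

In this model a busy zone with 100 cached items and an estimate of 80 gives up
only 20 items when trimmed, while an unused zone with an estimate of 0 gives up
all of them, which matches the behaviour described above.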

Reviewed by: jeff
Tested by: pho (an earlier version)
MFC after: 2 months
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D16667
5683a21d5850eeee8f514313275f285608155d68 21-Feb-2018 mav <mav@FreeBSD.org> 9018 Replace kmem_cache_reap_now() with kmem_cache_reap_soon()


To prevent kmem_cache reaping from blocking other system resources, turn
kmem_cache_reap_now() (which blocks) into kmem_cache_reap_soon(). Callers
to kmem_cache_reap_soon() should use kmem_cache_reap_active(), which
exploits #9017's new taskq_empty().

Reviewed by: Bryan Cantrill <bryan@joyent.com>
Reviewed by: Dan McDonald <danmcd@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Yuri Pankov <yuripv@yuripv.net>
Author: Tim Kordas <tim.kordas@joyent.com>

FreeBSD does not use taskqueue for kmem caches reaping, so this change
is less dramatic than it is on Illumos, just limiting reaping to 1 time
per second. It may possibly be improved later, if needed.
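
Limiting reaping to once per second could look roughly like the sketch below.
This is an assumption-laden illustration, not the committed code: the
reap_limiter structure and reap_allowed() function are hypothetical, and the
current time is passed in as a parameter so the logic can be followed without
a clock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state; zero-initialise before first use. */
struct reap_limiter {
	int64_t	last_reap;	/* second at which the last reap ran */
	bool	ever_reaped;	/* false until the first reap */
};

/*
 * Return true if a reap may run at time now_sec (in whole seconds),
 * allowing at most one reap per second.
 */
static bool
reap_allowed(struct reap_limiter *rl, int64_t now_sec)
{
	if (rl->ever_reaped && now_sec - rl->last_reap < 1)
		return (false);	/* already reaped within this second */
	rl->last_reap = now_sec;
	rl->ever_reaped = true;
	return (true);
}
```

A caller would check reap_allowed() before draining a cache and skip the reap
when it returns false.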
57634430237527708cb45463c5630be1cb51c9e0 17-Apr-2017 glebius <glebius@FreeBSD.org> All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.
00d578928eca75be320b36d37543a7e2a4f9fbdb 27-May-2016 grehan <grehan@FreeBSD.org> Create branch for bhyve graphics import.
227e0641478d6ec14ae84c216f886f45300c123d 10-Oct-2014 smh <smh@FreeBSD.org> MFC r270759:
Refactor ZFS ARC reclaim logic to be more VM cooperative

MFC r270861:
Ensure that ZFS ARC free memory checks include cached pages

MFC r272483:
Refactor ZFS ARC reclaim checks and limits

Sponsored by: Multiplay
f2543cb01cf389f602fb09b5ecdead5b9354a916 03-Oct-2014 smh <smh@FreeBSD.org> Refactor ZFS ARC reclaim checks and limits

Remove previously added kmem methods in favour of defines which
allow diff minimisation against the upstream code base.

Rebalance ARC free target to be vm_pageout_wakeup_thresh by default,
which eliminates an issue where ARC gets minimised instead of balancing
with VM pageout. This restores the target point prior to r270759.

Bring in missing upstream only changes which move unused code to
further eliminate code differences.

Add additional DTRACE probe to aid monitoring of ARC behaviour.

Enable upstream i386 code paths on platforms which don't define

Fix mixture of byte and page values in arc_memory_throttle i386 code
path value assignment of available_memory.

PR: 187594
Review: D702
Reviewed by: avg
MFC after: 1 week
X-MFC-With: r270759 & r270861
Sponsored by: Multiplay
8d9d31d78637e445e67f416145121f559e461120 30-Aug-2014 smh <smh@FreeBSD.org> Ensure that ZFS ARC free memory checks include cached pages

Also restore kmem_used() check for i386 as it has KVA limits that the raw
page counts above don't consider

PR: 187594
Reviewed by: peter
X-MFC-With: r270759
Review: D700
Sponsored by: Multiplay
502601a54088ae2edc419343befadcf273f55be7 28-Aug-2014 smh <smh@FreeBSD.org> Refactor ZFS ARC reclaim logic to be more VM cooperative

Prior to this change we triggered ARC reclaim when kmem usage passed 3/4
of the total available, as indicated by vmem_size(kmem_arena, VMEM_ALLOC).

This could lead to large amounts of unused RAM, e.g. on a 192GB machine with
ARC the only major RAM consumer, 40GB of RAM would remain unused.

The old method has also been seen to result in extreme RAM usage under
certain loads, causing poor performance and stalls.

We now trigger ARC reclaim when the number of free pages drops below the
value defined by the new sysctl vfs.zfs.arc_free_target, which defaults
to the value of vm.v_free_target.
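
The two triggers can be contrasted with a small sketch. The function and
parameter names below are hypothetical, and the inputs are plain integers
rather than the kernel's vmem and page counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old trigger: reclaim once kmem usage passes 3/4 of the total available. */
static bool
old_needs_reclaim(uint64_t kmem_used, uint64_t kmem_total)
{
	return (kmem_used > (kmem_total * 3) / 4);
}

/* New trigger: reclaim once free pages drop below the configured target. */
static bool
new_needs_reclaim(uint64_t free_pages, uint64_t arc_free_target)
{
	return (free_pages < arc_free_target);
}
```

The old check depends only on how much kmem is in use, so it can fire while
plenty of pages are still free; the new check looks at free pages directly.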

Credit to Karl Denninger for the original patch on which this update was
based.

PR: 191510 and 187594
Tested by: dteske
MFC after: 1 week
Relnotes: yes
Sponsored by: Multiplay
6fcf6199a4a9aefe9f2e59d947f0e0df171367b5 22-Mar-2014 bdrewery <bdrewery@FreeBSD.org> Rename global cnt to vm_cnt to avoid shadowing.

To reduce the diff, the struct pcpu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.

Bump __FreeBSD_version in case some out-of-tree module/code relies on the
global cnt variable.

Exp-run revealed no ports using it directly.

No objection from: arch@
Sponsored by: EMC / Isilon Storage Division
eb1a5f8de9f7ea602c373a710f531abbf81141c4 21-Feb-2014 gjb <gjb@FreeBSD.org> Move ^/user/gjb/hacking/release-embedded up one directory, and remove
^/user/gjb/hacking since this is likely to be merged to head/ soon.

Sponsored by: The FreeBSD Foundation
6b01bbf146ab195243a8e7d43bb11f8835c76af8 27-Dec-2013 gjb <gjb@FreeBSD.org> Copy head@r259933 -> user/gjb/hacking/release-embedded for initial
inclusion of (at least) arm builds with the release.

Sponsored by: The FreeBSD Foundation
de4ecca21340ce4d0bf9182cac133c14e031218e 07-Aug-2013 jeff <jeff@FreeBSD.org> Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

- Normalize functions that allocate memory to use kmem_*
- Those that allocate address space are named kva_*
- Those that operate on maps are named kmap_*
- Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by: alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division
1b03c5bf41222b723415638f03e00ed12cac076a 27-Feb-2011 pjd <pjd@FreeBSD.org> Finally... Import the latest open-source ZFS version - (SPA) 28.

Few new things available from now on:

- Data deduplication.
- Triple parity RAIDZ (RAIDZ3).
- zfs diff.
- zpool split.
- Snapshot holds.
- zpool import -F. Allows rewinding a corrupted pool to an earlier
transaction group.
- Possibility to import pool in read-only mode.

MFC after: 1 month
09f9c897d33c41618ada06fbbcf1a9b3812dee53 19-Oct-2010 jamie <jamie@FreeBSD.org> A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.
86ef76003513bdab705bb75c5b1eed24a4cb1a76 07-Oct-2010 avg <avg@FreeBSD.org> opensolaris_kmem kmem_size(): report lesser of vm_kmem_size and available
physical memory

This is needed to correctly autotune ZFS ARC size when vm_kmem_size is
set to value larger than available physical memory.
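
The idea of reporting the lesser of the two values can be sketched as follows.
toy_kmem_size() and its parameters are hypothetical stand-ins for illustration
and do not reproduce the actual kmem_size() body.

```c
#include <stdint.h>

/*
 * Return the smaller of the configured kmem size and the physical memory
 * size, both expressed in bytes. Physical memory is given as a page count
 * and a page size.
 */
static uint64_t
toy_kmem_size(uint64_t vm_kmem_size, uint64_t phys_pages, uint64_t page_size)
{
	uint64_t phys_bytes = phys_pages * page_size;

	return (vm_kmem_size < phys_bytes ? vm_kmem_size : phys_bytes);
}
```

With vm_kmem_size set to 8 GiB on a machine with 4 GiB of physical memory,
this reports 4 GiB, so a consumer that sizes itself from the result is not
tuned against memory that does not exist.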

MFC after: 2 weeks
f1216d1f0ade038907195fc114b7e630623b402c 19-Mar-2010 delphij <delphij@FreeBSD.org> Create a custom branch where I will be able to do the merge.
bbe899b96e388a8b82439f81ed3707e0d9c6070d 17-Nov-2008 pjd <pjd@FreeBSD.org> Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.

This brings a huge amount of changes; I'll enumerate only user-visible changes:

- Delegated Administration

Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.


- L2ARC

Level 2 cache for ZFS - allows using additional disks for cache.
Huge performance improvements mostly for random read of mostly
static content.

- slog

Allows using additional disks for the ZFS Intent Log to speed up
operations like fsync(2).

- vfs.zfs.super_owner

Allows regular users to perform privileged operations on files stored
on ZFS file systems they own. Be very careful with this one.

- chflags(2)

Not all the flags are supported. This still needs work.

- ZFSBoot

Support to boot off of ZFS pool. Not finished, AFAIK.

Submitted by: dfr

- Snapshot properties

- New failure modes

Previously, if a write request failed, the system panicked. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests

- Refquota, refreservation properties

Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.

- Sparse volumes

ZVOLs that don't reserve space in the pool.

- External attributes

Compatible with extattr(2).

- NFSv4-ACLs

Not sure about the status, might not be complete yet.

Submitted by: trasz

- Creation-time properties

- Regression tests for zpool(8) command.

Obtained from: OpenSolaris
63117b74b14bcc55f35b13d1ad2a0d48521e8b74 05-Nov-2008 rodrigc <rodrigc@FreeBSD.org> Remove definition of KMEM_DEBUG accidentally brought in by latest DTrace

Noticed by: thompsa
8cd2060f99cc7bcf056f242b0b5056d3057ae325 05-Nov-2008 rodrigc <rodrigc@FreeBSD.org> Merge latest DTrace changes from Perforce.
cf5320822f93810742e3d4a1ac8202db8482e633 19-Oct-2008 lulf <lulf@FreeBSD.org> - Import the HEAD csup code which is the basis for the cvsmode work.
c472c0eeeee1fa13d31f15c91688990d4d353753 01-Sep-2008 jb <jb@FreeBSD.org> Disable debug mode.

This is the likely cause of the performance degradation noticed by ZFS
users after the DTrace merge.
165851adb536053ee0b5bb3ccd3e4492ce4a48a3 27-Aug-2008 jb <jb@FreeBSD.org> MFC

DTrace support.

Note that this defaults the 'make buildkernel' to build with CTF data so
that the release kernel and modules are DTrace-able.
439c12d3f064ea4f607b085318836105cce1b889 24-May-2008 bz <bz@FreeBSD.org> Remove redundant redeclaration of 'zone_drain'.
d129981bf536936dfdec28f2b0c70ee1a24b9550 17-Apr-2008 jb <jb@FreeBSD.org> MFC. The great CDDL file move.

These files were repocopied for HEAD. The repo copy process renames
tags, adding a prefix of 'old_', so the history for these files is
in old_RELENG_7 etc.
6ebfa61b436e196e3c0b88601973cbcfb6d6cf1d 01-Apr-2008 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'RELENG_7'.
435a09e625aa62f4e1a89ab7741d374e4c4d0954 24-Jan-2008 pjd <pjd@FreeBSD.org> Change type of kmem_used() and kmem_size() functions to uint64_t, so it
doesn't overflow in arc.c in this check:

if (kmem_used() > (kmem_size() * 4) / 5)
return (1);

With this bug ZFS almost doesn't cache.

Only 32bit machines are affected that have vm.kmem_size set to values >=1GB.
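
The overflow can be reproduced in isolation. In this sketch uint32_t models
the 32-bit arithmetic of the affected machines; check32() and check64() are
hypothetical helpers that mirror the shape of the check quoted above.

```c
#include <stdint.h>

/* The check with 32-bit arithmetic: size * 4 wraps around at 2^32. */
static int
check32(uint32_t used, uint32_t size)
{
	return (used > (uint32_t)(size * 4) / 5);
}

/* The same check with 64-bit arithmetic: no wraparound at these sizes. */
static int
check64(uint64_t used, uint64_t size)
{
	return (used > (size * 4) / 5);
}
```

With size equal to 1GB (2^30 bytes), size * 4 is 2^32, which wraps to 0 in 32
bits. The threshold becomes 0, so check32() reports memory pressure for any
non-zero usage, while check64() compares against the intended 4/5 of 1GB.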

Reported by: David Taylor <davidt@yadt.co.uk>
e7776e4739489b0048ace6aebd4e4f385cc17b80 10-Oct-2007 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'RELENG_7'.
648f58f532a7d0f21a8b274e2caec17c98a880ff 10-Apr-2007 pjd <pjd@FreeBSD.org> Try to stabilize ZFS with regard to memory consumption:
- Allow shrinking the ARC down to 16MB (instead of 64MB).
- Set arc_max to 1/2 of kmem_map by default.
- Start freeing things earlier when low memory situation is detected.
- Serialize execution of arc_lowmem().

I decided to set the minimum ZFS memory requirements to 512MB of RAM and 256MB of
kmem_map size. If there is less RAM or kmem_map, a warning will be printed.
World is cruel, be no better. In other words: a modern file system requires
modern hardware:)

From ZFS administration guide:

"Currently the minimum amount of memory recommended to install a Solaris
system is 512 Mbytes. However, for good ZFS performance, at least one
Gbyte or more of memory is recommended."
3b005d330261f33318ca1ee3fef1940237fd788b 06-Apr-2007 pjd <pjd@FreeBSD.org> Please welcome ZFS - The last word in file systems.

The ZFS file system was ported from the OpenSolaris operating system. The code is
under the CDDL license.

I'd like to thank all SUN developers that created this great piece of software.

Supported by: Wheel LTD (http://www.wheel.pl/)
Supported by: The FreeBSD Foundation (http://www.freebsdfoundation.org/)
Supported by: Sentex (http://www.sentex.net/)