History log of /freebsd-head/sys/compat/
Revision Date Author Comments
9404710737f949ae8df728be34065daefc6d08e9 07-Oct-2020 bz <bz@FreeBSD.org> LinuxKPI: add a bitfield.h implementation.

This code was implemented iteratively during work on various WiFi
drivers, growing from individual functions into macro-generated
implementations for the various bit sizes needed (and then extended to
more for completeness). Some of the bit combinations do not seem to make
sense, so they are left out.

The __bf_shf(x) was obtained from D26681 [1].

Requested by: manu [1]
Reviewed by: hselasky, manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26708
inuxkpi/common/include/linux/bitfield.h
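
As an illustration of what mask-driven bitfield helpers of this kind do, here is a minimal userland sketch. It is not the committed header: the names BF_SHF, SKETCH_FIELD_PREP and SKETCH_FIELD_GET are hypothetical, and deriving the shift with the compiler builtin __builtin_ffsll is an assumption made for the example (the mask must be non-zero).

```c
#include <stdint.h>

/*
 * Hypothetical helper: bit position of the lowest set bit of a non-zero
 * mask. This plays the role that __bf_shf(x) is described as playing above.
 */
#define BF_SHF(mask)	(__builtin_ffsll((long long)(mask)) - 1)

/* Shift a value into the field selected by mask and drop excess bits. */
#define SKETCH_FIELD_PREP(mask, val) \
	(((uint64_t)(val) << BF_SHF(mask)) & (uint64_t)(mask))

/* Extract the field selected by mask from a register value. */
#define SKETCH_FIELD_GET(mask, reg) \
	(((uint64_t)(reg) & (uint64_t)(mask)) >> BF_SHF(mask))
```

With a mask of 0x00f0, the value 0xa is placed at bits 4..7 by the prep macro and read back by the get macro, so a driver only has to name the mask and never the shift.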
448f16239ea1ecdfa2b549802831431d0523d2ef 06-Oct-2020 manu <manu@FreeBSD.org> linuxkpi: Add pagemap.h

Add release_pages, needed by drm, which simply calls put_page on
each of the pages provided.

Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26680
inuxkpi/common/include/linux/pagemap.h
229b923b22f11c0cba10a7c9c3c01dd7b894ffa8 06-Oct-2020 manu <manu@FreeBSD.org> linuxkpi: Add power_supply.h

Add power_supply_is_system_supplied which is needed by drm.

Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26679
inuxkpi/common/include/linux/power_supply.h
69b75883275a6eb647d19336bee0d3280e92a189 06-Oct-2020 manu <manu@FreeBSD.org> linuxkpi: Add prefetch.h

Only add prefetchw, as it is the only function used by drm.
Simply use __builtin_prefetch, which has been available in all
compilers for a long time.

Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26677
inuxkpi/common/include/linux/prefetch.h
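
The approach described above, wrapping __builtin_prefetch with the write hint, can be sketched as follows. This is an assumption-laden illustration rather than the committed header: sketch_prefetchw and sketch_fill are hypothetical names, and the prefetch distance of 8 elements is arbitrary.

```c
#include <stddef.h>

/* Hint that the cache line holding *ptr is about to be written. */
static inline void
sketch_prefetchw(const void *ptr)
{
	__builtin_prefetch(ptr, 1);	/* second argument 1 = prefetch for write */
}

/* Hypothetical consumer: fill a buffer while prefetching a few slots ahead. */
static void
sketch_fill(int *buf, size_t n, int value)
{
	for (size_t i = 0; i < n; i++) {
		if (i + 8 < n)		/* stay inside the array */
			sketch_prefetchw(&buf[i + 8]);
		buf[i] = value;
	}
}
```

The prefetch is only a hint, so the result of the fill is the same with or without it.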
b924a56ecdfcdee392b6331043173184e98dc3e8 06-Oct-2020 manu <manu@FreeBSD.org> linuxkpi: Add numa.h

It only contains NUMA_NO_NODE, which is needed by drm.

Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26676
inuxkpi/common/include/linux/numa.h
24103a673e85520329d4b86d4838a33153095509 06-Oct-2020 manu <manu@FreeBSD.org> linuxkpi: Add gcd function

This computes the greatest common divisor.
Taken from OpenBSD.

Reviewed by: bz, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26674
inuxkpi/common/include/linux/gcd.h
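
For reference, a greatest common divisor can be computed with the classic Euclidean algorithm. The sketch below is a generic version under the hypothetical name sketch_gcd; it is not the code imported from OpenBSD, which the log does not show.

```c
/* Euclidean algorithm: repeatedly replace (a, b) with (b, a mod b). */
static unsigned long
sketch_gcd(unsigned long a, unsigned long b)
{
	unsigned long r;

	while (b != 0) {
		r = a % b;
		a = b;
		b = r;
	}
	return (a);	/* gcd(a, 0) == a by convention */
}
```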
da71c38944fc6d3f172feca61e59ecfeb6c164d0 04-Oct-2020 hselasky <hselasky@FreeBSD.org> Populate the acquire context field of a ww_mutex in the LinuxKPI.
Bump the FreeBSD version to force recompilation of external kernel modules.

MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D26657
Submitted by: greg_unrelenting.technology (Greg V)
Sponsored by: Mellanox Technologies // NVIDIA Networking
inuxkpi/common/include/linux/ww_mutex.h
inuxkpi/common/src/linux_lock.c
af0253266939778b4ad41531e6bb3198729d70ff 02-Oct-2020 manu <manu@FreeBSD.org> linuxkpi: Add dmi_* function

The dmi functions are used to get SMBIOS values.
The DRM subsystem and drivers use them to enable (or not) quirks.

Reviewed by: hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26046
inuxkpi/common/include/linux/dmi.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/mod_devicetable.h
inuxkpi/common/src/linux_dmi.c
ecd88b99aeaa22f234e01dcd7b4c2c70e0319906 02-Oct-2020 manu <manu@FreeBSD.org> linuxkpi: Add backlight support

Add backlight functions to linuxkpi.
Graphics drivers expose the backlight of the panel directly, so allow them to use the backlight subsystem so that
users can use backlight(8) to configure them.

Reviewed by: hselasky
Relnotes: yes
Differential Revision: The FreeBSD Foundation
inuxkpi/common/include/linux/backlight.h
inuxkpi/common/include/linux/device.h
inuxkpi/common/src/linux_kmod.c
inuxkpi/common/src/linux_pci.c
96407d0cdcc3fa4007d5aea02e623aea54b8bf68 25-Sep-2020 trasz <trasz@FreeBSD.org> Regen after r366145.

Sponsored by: DARPA
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_sysent.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_sysent.c
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_sysent.c
d246d508c5578a77dcbf2d5c0e4d425a2ff33484 23-Sep-2020 kib <kib@FreeBSD.org> Do not leak oldvmspace if image activation failed

and current address space is already destroyed, so kern_execve()
terminates the process.

While there, clean up some internals of post_execve() inlined in init_main.

Reported by: Peter <pmc@citylink.dinoex.sub.org>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26525
loudabi/cloudabi_proc.c
reebsd32/freebsd32_misc.c
inux/linux_emul.c
7fbac817ea4993423702bc4bc6a669e000cf7b1c 17-Sep-2020 trasz <trasz@FreeBSD.org> Reduce code duplication by introducing linux_copyout_sockaddr()
helper function. No functional changes.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25804
inux/linux_socket.c
4054bb11ea2f6875400f8fdc5bd7c3c3b91bd492 17-Sep-2020 trasz <trasz@FreeBSD.org> Add support for SOUND_MIXER_WRITE_MONITOR ioctl. Fixes alsamixer(1)
on my x220.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25806
inux/linux_ioctl.c
inux/linux_ioctl.h
371726d1752f5ce493011c64efd99ea737a379b2 17-Sep-2020 trasz <trasz@FreeBSD.org> Get rid of sv_errtbl and SV_ABI_ERRNO().

Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26388
a32/ia32_sysvec.c
inux/linux.h
inux/linux_errno.c
inux/linux_socket.c
4f67284a103d25c93dde61db7056cc0509cfca8c 04-Sep-2020 markj <markj@FreeBSD.org> Add emulation support for the Linux kcov(4) ioctl API.

This makes it possible to run an unmodified Linux syzkaller executor
against the Linuxulator, and have it gather code coverage information.

Sponsored by: The FreeBSD Foundation
indebugfs/lindebugfs.c
insysfs/linsysfs.c
inux/linux_ioctl.c
inux/linux_ioctl.h
6265a56fda2455ca5973c47b3d0d702ed6d409ec 01-Sep-2020 mjg <mjg@FreeBSD.org> compat: clean up empty lines in .c and .h files
loudabi32/cloudabi32_proto.h
loudabi64/cloudabi64_proto.h
reebsd32/freebsd32_ipc.h
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_proto.h
a32/ia32_signal.h
indebugfs/lindebugfs.c
inprocfs/linprocfs.c
inux/linux_common.c
inux/linux_event.c
inux/linux_file.c
inux/linux_fork.c
inux/linux_futex.c
inux/linux_ioctl.c
inux/linux_ipc.c
inux/linux_misc.c
inux/linux_misc.h
inux/linux_mmap.c
inux/linux_signal.c
inux/linux_socket.c
inux/linux_stats.c
inux/linux_time.c
inux/linux_timer.c
inux/linux_vdso.c
inuxkpi/common/include/asm/atomic-long.h
inuxkpi/common/include/asm/atomic.h
inuxkpi/common/include/asm/atomic64.h
inuxkpi/common/include/linux/compat.h
inuxkpi/common/include/linux/dma-attrs.h
inuxkpi/common/include/linux/dma-mapping.h
inuxkpi/common/include/linux/dmapool.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/io.h
inuxkpi/common/include/linux/jhash.h
inuxkpi/common/include/linux/kmod.h
inuxkpi/common/include/linux/kref.h
inuxkpi/common/include/linux/list.h
inuxkpi/common/include/linux/pci.h
inuxkpi/common/include/linux/scatterlist.h
inuxkpi/common/include/linux/sysfs.h
inuxkpi/common/include/net/ipv6.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_hrtimer.c
inuxkpi/common/src/linux_idr.c
inuxkpi/common/src/linux_kmod.c
inuxkpi/common/src/linux_pci.c
inuxkpi/common/src/linux_radix.c
inuxkpi/common/src/linux_rcu.c
inuxkpi/common/src/linux_seq_file.c
inuxkpi/common/src/linux_usb.c
dis/kern_ndis.c
dis/kern_windrv.c
dis/ndis_var.h
dis/ntoskrnl_var.h
dis/pe_var.h
dis/subr_hal.c
dis/subr_ndis.c
dis/subr_ntoskrnl.c
dis/subr_usbd.c
1ba6953720e56ecab6fb4047b9b0f827203bbadd 29-Aug-2020 wulf <wulf@FreeBSD.org> LinuxKPI: Implement ksize() function.

In Linux, ksize() gets the actual amount of memory allocated for a given
object. This commit adds malloc_usable_size() to the FreeBSD KPI, which does
the same. It also maps the LinuxKPI ksize() to the newly created function.

The ksize() function is used by drm-kmod.

Reviewed by: hselasky, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D26215
inuxkpi/common/include/linux/slab.h
43c4791db68bf98ca89824b74030959ca1433cd2 27-Aug-2020 hselasky <hselasky@FreeBSD.org> Implement extensible arrays API using the existing radix tree implementation
in the LinuxKPI.

Differential Revision: https://reviews.freebsd.org/D25101
Reviewed by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/xarray.h
inuxkpi/common/src/linux_xarray.c
a0d6ab1e696dce76058e99a44a553fecb1adee95 24-Aug-2020 mjg <mjg@FreeBSD.org> cache: drop the always curthread argument from reverse lookup routines

Note VOP_VPTOCNP keeps getting it as temporary compatibility for zfs.

Tested by: pho
inprocfs/linprocfs.c
inux/linux_getcwd.c
ecce8eaad7858a86d46abb23eceb8810ced96775 19-Aug-2020 mjg <mjg@FreeBSD.org> vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error

Most consumers pass NULL.
inux/linux_stats.c
3501867eb29102709ef9ea10cffa124f61b33247 18-Aug-2020 mjg <mjg@FreeBSD.org> linux: add sysctl compat.linux.use_emul_path

This is a step towards facilitating jails with only Linux binaries.
Supporting emul_path adds path lookups which are completely spurious
if the binary at hand runs in a Linux-based root directory.

It defaults to on (== current behavior).

make -C /root/linux-5.3-rc8 -s -j 1 bzImage:

use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total
use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total
inux/linux_file.c
inux/linux_mib.c
inux/linux_misc.c
inux/linux_stats.c
inux/linux_uid16.c
inux/linux_util.h
adaa7ce8e90388e6481d918fd12ef528eda58f4b 18-Aug-2020 markj <markj@FreeBSD.org> Fix handling of ancillary data on non-AF_UNIX Linux sockets.

After r340674, the "continue" would restart the loop without having
updated clen, resulting in an infinite loop. Restore the old behaviour
of simply ignoring all control messages on such sockets, since we
currently only implement handling for AF_UNIX-specific messages.

Reported by: syzkaller
Reviewed by: tijl
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26093
inux/linux_socket.c
9fc2a101135102074391b6ff28b825b082410ffa 17-Aug-2020 markj <markj@FreeBSD.org> Remove "emulation" of clone(CLONE_PARENT | CLONE_THREAD).

On Linux this is supposed to result in EINVAL.

Reported by: syzkaller
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
inux/linux_fork.c
a9a7211e27bfe2bec89b08899b977a4677728d97 17-Aug-2020 markj <markj@FreeBSD.org> Fix a lock leak when emulating futex(FUTEX_WAIT_BITSET).

Reported by: syzkaller
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
inux/linux_futex.c
6d9aafc6a0a9ce25aa9fc269d5914a932c11f670 17-Aug-2020 markj <markj@FreeBSD.org> Skip Linux madvise(MADV_DONTNEED) on unmanaged objects.

vm_object_madvise() is a no-op for unmanaged objects, but we should also
limit the scope of mappings on which pmap_remove() is called. In
particular, with the WIP largepage shm objects patch the kernel must
remove mappings of such objects along superpage boundaries, and without
this check Linux madvise(MADV_DONTNEED) could violate that requirement.

Reviewed by: alc, kib
MFC with: r362631
Sponsored by: Juniper Networks, Klara Inc.
Differential Revision: https://reviews.freebsd.org/D26084
inux/linux_mmap.c
0c7391ed923d3907ca1502e633f0fdbc05e5fcdf 16-Aug-2020 mjg <mjg@FreeBSD.org> vfs: remove the thread argument from vget

It was already asserted to be curthread.

Semantic patch:

@@

expression arg1, arg2, arg3;

@@

- vget(arg1, arg2, arg3)
+ vget(arg1, arg2)
inuxkpi/common/include/linux/fs.h
5000b42e448eb645c66c1bab5d0aff953da06319 14-Aug-2020 manu <manu@FreeBSD.org> linuxkpi: Add a few wait_bit functions

The Linux function does a lot more than that, as multiple waitqueues could be fetched
from a static table based on the hash of the argument, but since in DRM it's only used
in one place, just add a single variable.
We will probably need to change that in the future, but it's OK with DRM even with current
Linux.

Reviewed by: hselasky
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26054
inuxkpi/common/include/linux/wait.h
inuxkpi/common/include/linux/wait_bit.h
inuxkpi/common/src/linux_compat.c
5210997c9cae5d6c222eab77dd599143dd232a87 12-Aug-2020 markj <markj@FreeBSD.org> linprocfs: Fix some inaccuracies in meminfo.

- Fill out MemFree correctly. Delete an ancient comment suggesting that
we don't want to advertise the true quantity of free memory.
- Populate the Buffers field by reading vfs.bufspace.
- The page cache consists of all pages in page queues, not just the
inactive queue.

PR: 248463
Reported and tested by: danfe
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
inprocfs/linprocfs.c
4986d5a3ba546d083a447e5cb11975ad3ccd131d 11-Aug-2020 markj <markj@FreeBSD.org> Remove sys/compat/netbsd.

It contained only a header used by ncv(4), which was mainly used on pc98
systems. Both ncv(4) and pc98 support have long been removed.
etbsd/dvcfg.h
4c1b6ff62178ca67a86b97cae09abe00ebe1122b 11-Aug-2020 hselasky <hselasky@FreeBSD.org> Use atomic_clear_rel_long() to implement clear_bit_unlock() in the LinuxKPI
after r363842.

Suggested by: alc@
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitops.h
e42d18e18380a1560536f954ed8bf1bc0dada1af 11-Aug-2020 hselasky <hselasky@FreeBSD.org> Need to clone the task struct fields related to RCU as well in the
LinuxKPI after r359727. This fixes a minor regression issue. Else the
priority tracking won't work properly when both sleepable and
non-sleepable RCU is in use on the same thread.

Bump the __FreeBSD_version to force recompilation of external kernel
modules.

PR: 242272
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_rcu.c
96c29bfcbb56e21e7862b31ca0ae4e0360f88eea 07-Aug-2020 mjg <mjg@FreeBSD.org> vfs: add VOP_STAT

The current scheme of calling VOP_GETATTR adds avoidable overhead.

An example with tmpfs doing fstat (ops/s):
before: 7488958
after: 7913833

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D25910
inuxkpi/common/src/linux_compat.c
e7d1fc81c06e8c3fd1658d481f5f534f5fb19d04 07-Aug-2020 hselasky <hselasky@FreeBSD.org> Implement radix_tree_store() in the LinuxKPI for use with the coming
extensible arrays implementation.

While at it add some more comments explaining the current
radix_tree_insert() function and make sure to clean the root node when
the radix tree reaches the maximum height. This can happen if the
index passed is too big when the tree is empty.

The radix_tree_store() function is basically a copy of the
radix_tree_insert() function with some added functionality.

The radix_tree_store() function is local to FreeBSD and does not yet
exist in Linux.

Reviewed by: kib
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/radix-tree.h
inuxkpi/common/src/linux_radix.c
3846ff841491a990212681f423ac8e26c39032b9 05-Aug-2020 markj <markj@FreeBSD.org> Fix a TOCTOU vulnerability in freebsd32_copyin_control().

PR: 248257
Reported by: m00nbsd working with Trend Micro Zero Day Initiative
Reviewed by: kib
Security: SA-20:23.sendmsg
Security: CVE-2020-7460
Security: ZDI-CAN-11543
reebsd32/freebsd32_misc.c
b500756c345d333c9b7506e503ec300cffc11798 04-Aug-2020 manu <manu@FreeBSD.org> linuxkpi: Add time_after32 and time_before32

This compares two 32-bit times.

Sponsored by: The FreeBSD Foundation
Reviewed by: kib, hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25700
inuxkpi/common/include/linux/jiffies.h
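
Wraparound-safe comparison of 32-bit tick counters is usually done by looking at the sign of the unsigned difference. The following is a sketch of that technique with hypothetical function names; it is not copied from the committed jiffies.h, and it relies on the usual two's-complement conversion from uint32_t to int32_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if timestamp a is later than timestamp b, modulo 2^32. */
static inline bool
sketch_time_after32(uint32_t a, uint32_t b)
{
	/* b - a wraps; a negative signed result means a is ahead of b. */
	return ((int32_t)(b - a) < 0);
}

/* True if timestamp a is earlier than timestamp b, modulo 2^32. */
static inline bool
sketch_time_before32(uint32_t a, uint32_t b)
{
	return (sketch_time_after32(b, a));
}
```

The comparison stays correct across a counter wrap as long as the two timestamps are less than 2^31 ticks apart.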
512c4325e3057aa2c57488a89abe060052110241 04-Aug-2020 manu <manu@FreeBSD.org> linuxkpi: Add clear_bit_unlock

This calls clear_bit and adds a memory barrier.

Sponsored by: The FreeBSD Foundation

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25943
inuxkpi/common/include/linux/bitops.h
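
The idea of clearing a bit with release ordering, so that earlier writes become visible before the bit is seen as clear, can be sketched in portable C11. This is a userland analogue only: the name sketch_clear_bit_unlock and the use of <stdatomic.h> are assumptions for illustration, whereas the kernel code uses its own atomic primitives.

```c
#include <stdatomic.h>

/*
 * Clear one bit in *word with release semantics: stores made before this
 * call cannot be reordered after the bit becomes visible as cleared.
 */
static inline void
sketch_clear_bit_unlock(unsigned int bit, _Atomic unsigned long *word)
{
	atomic_fetch_and_explicit(word, ~(1UL << bit), memory_order_release);
}
```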
6c5625f47a2660a86da4a6effae2275fc666f5a3 04-Aug-2020 manu <manu@FreeBSD.org> Re-apply r363564.

We now have linux/sizes.h in the tree.
inuxkpi/common/include/linux/dma-mapping.h
19a710eb94737f481ec8db6b727b37847c2fc71e 04-Aug-2020 manu <manu@FreeBSD.org> linuxkpi: Add nested variant of mutex_lock_interruptible

We don't do anything with the _nested variant, so just call mutex_lock_interruptible.

Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25944
inuxkpi/common/include/linux/mutex.h
86104a890bf4be19467fcbeeb1f7aa7b5c31aa30 04-Aug-2020 manu <manu@FreeBSD.org> linuxkpi: Add kref_put_lock

Same as kref_put but in addition to calling the rel function it will
acquire the lock first.

Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky, emaste
Differential Revision: https://reviews.freebsd.org/D25942
inuxkpi/common/include/linux/kref.h
12b89d5c9552dd402aa72ee76cbbbeaa611ac71b 04-Aug-2020 manu <manu@FreeBSD.org> linuxkpi: Add linux/sizes.h

This file contains some defines for common sizes.

Sponsored by: The FreeBSD Foundation
Reviewed by: hselasky, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25941
inuxkpi/common/include/linux/sizes.h
529e1e37819cc132fcd624bc4f5fc2bcbf33ad3b 26-Jul-2020 manu <manu@FreeBSD.org> Fix r363565

lockdep.h needs sys/lock.h for LOCK_CLASS
inuxkpi/common/include/linux/lockdep.h
2986ce4dee0b0743ec2131b3b975eab7f147cb2c 26-Jul-2020 manu <manu@FreeBSD.org> Revert r363564

linux/sizes.h doesn't exist in base ... sorry.
inuxkpi/common/include/linux/dma-mapping.h
06a0f46796746dfd151754d1d6721b23159104b5 26-Jul-2020 manu <manu@FreeBSD.org> linuxkpi: Add taint* defines

This isn't used by us but allows us to port drivers more easily.

Reviewed by: hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25703
inuxkpi/common/include/linux/kernel.h
ccf1cfa59d454cbab20e0d773effaca85ff96e40 26-Jul-2020 manu <manu@FreeBSD.org> linuxkpi: Include hardirq.h in preempt.h and lockdep.h in hardirq.h

Linux does the same; this avoids ifdefs or extra includes in ported drivers.

Reviewed by: emaste, hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25702
inuxkpi/common/include/linux/hardirq.h
inuxkpi/common/include/linux/preempt.h
93097499a76026f2b996b583265713d1dc3ee562 26-Jul-2020 manu <manu@FreeBSD.org> linuxkpi: Include linux/sizes.h in dma-mapping.h

Linux does the same; this avoids ifdefs or extra includes in ported drivers.

Reviewed by: emaste, hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25701
inuxkpi/common/include/linux/dma-mapping.h
59b94fa393b4f71988054746e3b05647838c619c 22-Jul-2020 markj <markj@FreeBSD.org> usb(4): Stop checking for failures from malloc(M_WAITOK).

Handle the fact that parts of usb(4) can be compiled into the boot
loader, where M_WAITOK does not guarantee a successful allocation.

PR: 240545
Submitted by: Andrew Reiter <arr@watson.org> (original version)
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25706
inuxkpi/common/src/linux_usb.c
3e67e9ff3640c06fc90d0a112105eea78db2ea4f 19-Jul-2020 trasz <trasz@FreeBSD.org> Make linux(4) support the BLKPBSZGET ioctl. Oracle uses it.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25694
inux/linux_ioctl.c
inux/linux_ioctl.h
21f89ef3a197c5683cee6a94cd4eee72f7975ae1 18-Jul-2020 trasz <trasz@FreeBSD.org> Make linux fallocate(2) return EOPNOTSUPP, not ENOSYS, on unsupported mode,
as documented in the man page.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_file.c
e85fdba1aa1bdb8230da4438b7c1a50588880407 18-Jul-2020 trasz <trasz@FreeBSD.org> Bump the default linux version from 3.2.0 to 3.10.0, which corresponds
to RHEL 7. Required for DB2.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25656
inux/linux_mib.h
3e5218c3622c7c0955f8374027e9d0e0f92df2e4 18-Jul-2020 trasz <trasz@FreeBSD.org> Add a trivial linux(4) splice(2) implementation, which simply
returns EINVAL. Fixes grep (grep-3.1-2build1).

PR: kern/218699
Reported by: avos
Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25636
inux/linux_file.c
7c522535a62aad9903b8a9fed58065dbf2241eeb 18-Jul-2020 trasz <trasz@FreeBSD.org> Add missing SysV IPC stats to linprocfs(4). Fixes 'ipcs -l',
and also helps Oracle.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25669
inprocfs/linprocfs.c
90fb98b2929503309c945f7f51a5e63877b3f5e1 18-Jul-2020 trasz <trasz@FreeBSD.org> Fix bogomips calculation. Previously it was off by half. This was
verified under VMWare Fusion, comparing to what's reported under CentOS,
and by comparing numbers reported by linuxulator on T420 with a googled
up Linux cpuinfo (https://lkml.org/lkml/2011/11/29/116).

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20693
inprocfs/linprocfs.c
a83cb2c31a434c2339ea1e37a38e1490a6a5c210 18-Jul-2020 trasz <trasz@FreeBSD.org> Fix two typos in flag names in /proc/cpuinfo.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25695
inprocfs/linprocfs.c
513365d5848b50beafac005d73ad671c721ca15a 14-Jul-2020 wulf <wulf@FreeBSD.org> linuxkpi: Ignore NULL pointers passed to string parameter of kstr(n)dup

That follows Linux and fixes related drm-kmod-5.3 panic.

Reviewed by: imp, hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25657
inuxkpi/common/include/linux/string.h
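
The NULL-tolerant behaviour described above can be shown with a small userland analogue. The function sketch_strdup_nullok is hypothetical and uses malloc(3) in place of the kernel allocator; it only demonstrates the early return for a NULL source string.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a string, but pass a NULL input straight through. */
static char *
sketch_strdup_nullok(const char *s)
{
	size_t len;
	char *dst;

	if (s == NULL)
		return (NULL);	/* nothing to copy; do not dereference */
	len = strlen(s) + 1;	/* include the terminating NUL */
	dst = malloc(len);
	if (dst != NULL)
		memcpy(dst, s, len);
	return (dst);
}
```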
698f5c2b8cbfbe4a27a8c62341e8e6a3a9389a00 12-Jul-2020 netchild <netchild@FreeBSD.org> Fix r363125 (Implement CLOCK_MONOTONIC_RAW (linux >= 2.6.28)),
by really using the MONOTONIC version and not the REALTIME version.

Noticed by: myfreeweb at github
inux/linux_time.c
936a891b0779e5bb0733d129e570e10e3924c95c 12-Jul-2020 netchild <netchild@FreeBSD.org> Implement CLOCK_MONOTONIC_RAW (linux >= 2.6.28).

It is documented as a raw hardware-based clock not subject to NTP or
incremental adjustments. With this "not as precise as CLOCK_MONOTONIC"
description in mind, map it to our CLOCK_MONOTONIC_FAST (the same
mapping as for the linux CLOCK_MONOTONIC_COARSE).

This is needed for the webcomponent of steam (chromium) and some
other steam component or game.

The linux-steam-utils port contains a LD_PRELOAD based fix for this.
There this is mapped to CLOCK_MONOTONIC.
As an untrained ear/eye (= the majority of people) normally does not
notice a difference in jitter in the 10-20 ms range, especially
when not paying attention, for example in a browser session
while watching a video stream, the mapping to CLOCK_MONOTONIC_FAST
seems more appropriate than to CLOCK_MONOTONIC.
inux/linux_time.c
3bdb1095f1456ff379e89d5ced1c1862acfedaab 11-Jul-2020 trasz <trasz@FreeBSD.org> Make linprocfs(5) report correct tty number in /proc/<PID>/stat.
Fixes sudo (sudo-1.8.21p2-3ubuntu1.2); previously would fail
with "sudo: no tty present and no askpass program specified".

Reviewed by: kib, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25588
inprocfs/linprocfs.c
edfea716900a1d92a9fbfd51ed501b4bfd465663 11-Jul-2020 trasz <trasz@FreeBSD.org> Make linux stat(2) return the same st_dev for every devfs instance.
The reason for this is to work around an idiosyncrasy of glibc
getttynam(3) implementation: it checks whether st_dev returned for
fd 0 is the same as st_dev returned for the target of /proc/self/fd/0
symlink, and with linux chroots having their own devfs instance,
the check will fail if you chrooted into it.

PR: kern/240767
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25559
inux/linux_stats.c
8f75ec4950bdb53418a2c5e71cb1fbf8d9d501a0 10-Jul-2020 trasz <trasz@FreeBSD.org> Don't emit warnings on MADV_HUGEPAGE; Firefox uses it a lot.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_mmap.c
7ecaecc7c22f657e6130b271f2308cac3be0fa25 10-Jul-2020 hselasky <hselasky@FreeBSD.org> Implement the bitmap_subset() function in the LinuxKPI. This function
checks if the bitmap pointed to by the first argument is a subset of
the bitmap pointed to by the second argument. The function returns one
on success and zero on failure.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitmap.h
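
The subset test described above amounts to checking that no bit is set in the first bitmap without also being set in the second. A minimal sketch, assuming bitmaps stored as arrays of unsigned long and using the hypothetical names sketch_bitmap_subset and SKETCH_BITS_PER_LONG:

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if every bit set in a[0..nbits) is also set in b, else 0. */
static int
sketch_bitmap_subset(const unsigned long *a, const unsigned long *b,
    unsigned int nbits)
{
	unsigned int full = nbits / SKETCH_BITS_PER_LONG;
	unsigned int rem = nbits % SKETCH_BITS_PER_LONG;
	unsigned int i;

	for (i = 0; i < full; i++) {
		if ((a[i] & ~b[i]) != 0)
			return (0);
	}
	if (rem != 0) {
		/* Only the low 'rem' bits of the last word are in range. */
		unsigned long mask = (1UL << rem) - 1;

		if ((a[full] & ~b[full] & mask) != 0)
			return (0);
	}
	return (1);
}
```

Bits beyond nbits in the final word are masked off, so stray high bits do not affect the answer.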
dee503b4a7185545e7e570c08c911d7ea34524eb 10-Jul-2020 hselasky <hselasky@FreeBSD.org> Implement the array_size() function in the LinuxKPI. This function
basically multiplies its two arguments and returns SIZE_MAX if the
result overflows the size_t type. Else the product of the two
arguments is returned.

Bump the FreeBSD_version to mitigate issues with existing
implementation of array_size() in drm-devel-kmod.

Discussed with: manu@
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/overflow.h
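
The saturating multiplication described above can be sketched as follows. The name sketch_array_size is hypothetical, and the division-based overflow check is one common way to detect the overflow; the committed code may detect it differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply a by b, returning SIZE_MAX if the product overflows size_t. */
static inline size_t
sketch_array_size(size_t a, size_t b)
{
	if (b != 0 && a > SIZE_MAX / b)
		return (SIZE_MAX);	/* saturate instead of wrapping */
	return (a * b);
}
```

Saturating to SIZE_MAX means a later allocation of that size fails cleanly instead of succeeding with a buffer that is too small.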
54d3d7e40e31fc52af17f8f4f4dac58c4c769ba1 10-Jul-2020 kevans <kevans@FreeBSD.org> memfd_create: turn on SHM_GROW_ON_WRITE

memfd_create fds will no longer require an ftruncate(2) to set the size;
they'll grow (to the extent that it's possible) upon write(2)-like syscalls.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25502
inux/linux_file.c
96097a6b56a2d3af99e73263e7bbf14a1ae7a215 06-Jul-2020 markj <markj@FreeBSD.org> Regenerate.

Sponsored by: The FreeBSD Foundation
reebsd32/freebsd32_sysent.c
bf3101e83a23e004f9be9058d37a6e6b8f1a865f 05-Jul-2020 hselasky <hselasky@FreeBSD.org> Fix include file order in io.h in the LinuxKPI.
Make sure sys/types.h is included before machine/vm.h.

PR: 247775
Submitted by: pkubaj@
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/io.h
3b2643a2c3b9475bec202de255ac5110937aa937 05-Jul-2020 trasz <trasz@FreeBSD.org> Fix Linux recvmsg(2) when msg_namelen returned is 0. Previously
it would fail with EINVAL, breaking some of the Python regression
tests.

While here, cap the user-controlled message length.

Note that the code doesn't seem to be copying out the new length
in either (success or failure) case. This will be addressed separately.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25392
inux/linux_socket.c
ce3cf0a9a7e1405490d05552c1f260c8d7134231 04-Jul-2020 trasz <trasz@FreeBSD.org> Add /proc/sys/kernel/tainted to linprocfs(5). Helps LTP.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25556
inprocfs/linprocfs.c
4e5e35021fb910eab0c282b0ef59ff920fcefcf5 04-Jul-2020 trasz <trasz@FreeBSD.org> Make linprocfs(5) create /proc/bus/pci/devices/, and linsysfs(5)
create /sys/class/power_supply/. This silences some warnings
from biology/linux-foldingathome.

Reported by: 0mp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25557
inprocfs/linprocfs.c
insysfs/linsysfs.c
0c55a7efbc3483a699a2c56512195e1d6b91f223 04-Jul-2020 mjg <mjg@FreeBSD.org> linux: fix ioctl performance for termios

TCGETS et al are frequently issued by Linux binaries while the previous code
avoidably ping-pongs a global sx lock and serializes on Giant.

Note that even with the fix the common case will serialize on a per-tty lock.
inux/linux_ioctl.c
e5fef1d9618514e2154f5971f10c6e330a7e5b31 02-Jul-2020 kib <kib@FreeBSD.org> linuxkpi: improvements for linux_pid_task() and linux_get_pid_task().

Unify functions bodies.
Do not call tdfind() if pid is passed, and do not call pfind() if tid
is supplied.

Reviewed by: hselasky
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25534
inuxkpi/common/src/linux_current.c
967382a845cf780b5c0ef0535a40e4193e8b66bb 01-Jul-2020 trasz <trasz@FreeBSD.org> Rework linux accept(2). This makes the code flow easier to follow,
and fixes a bug where calling accept(2) could result in closing fd 0.

Note that the code still contains a number of problems: it makes
assumptions about l_sockaddr_in being the same as sockaddr_in,
the EFAULT-related code looks like it doesn't work at all, and the
socket type check is racy. Those will be addressed later on;
I'm trying to work in small steps to avoid breaking one thing while
fixing another.

It fixes Redis, among other things.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25461
inux/linux_socket.c
e478a9ec1d778e93d2d98ecafe17c709640c65a5 01-Jul-2020 hselasky <hselasky@FreeBSD.org> The "pid" field in the LinuxKPI task struct is typically set to the thread ID
and not the process ID. Make sure the linux_task_exiting() function uses tdfind()
to look up the BSD procedure structure pointer by the "pid" field, and only
falls back to pfind() when no match is found! This makes linux_task_exiting()
consistent with the rest of the code.

Differential Revision: https://reviews.freebsd.org/D25509
Submitted by: Greg V <greg@unrelenting.technology>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_current.c
fd0a5790f3472758a4dbf2b3bfac68761cda921b 30-Jun-2020 trasz <trasz@FreeBSD.org> Make linprocfs(5) create the /proc/<PID>/task/ directories.
This is to silence some Chromium assertions.

PR: kern/240991
Analyzed by: Alex S <iwtcex@gmail.com>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25256
inprocfs/linprocfs.c
c47e3b89d9934496ec133a04d39a643b3d74bf4b 30-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) ignore SA_INTERRUPT. The zsh(1) binary from Bionic uses it.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25499
inux/linux_signal.c
8eb8b905bc462109f4693c876535ca2cca36f6f0 30-Jun-2020 hselasky <hselasky@FreeBSD.org> Document the is_signed(), type_max() and type_min() function macros in the
LinuxKPI. Try to make the function argument more readable.

Suggested by: several
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
283c1fe7cf3e8deb86ec76e0ff582b33e2bce398 29-Jun-2020 kevans <kevans@FreeBSD.org> linux: reposition the comment for bsd_to_linux_bits/linux_to_bsd_bits

rpokala notes that splitting the definitions like this is kind of silly,
since the comment applies to both. Move the comment up (or the definition
down, depending on your perspective on life) accordingly.

Reported by: rpokala
inux/linux.h
8ddfaf22d9b01a66ffcbda5a623c563302fffa1b 29-Jun-2020 hselasky <hselasky@FreeBSD.org> Implement is_signed(), type_max() and type_min() function macros in the
LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
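
Type-generic macros of this kind are commonly built from a signedness test plus shifts that never overflow the type. The sketch below follows that well-known pattern under hypothetical SKETCH_ names; the log does not show the committed definitions, so treat it as an illustration for integer types only.

```c
#include <limits.h>

/* 1 if T is a signed integer type: (T)-1 is negative only when signed. */
#define SKETCH_IS_SIGNED(T)	(((T)-1) < (T)1)

/* Half of (max + 1), computed without overflowing T. */
#define SKETCH_HALF_MAX(T) \
	((T)1 << (sizeof(T) * CHAR_BIT - 1 - SKETCH_IS_SIGNED(T)))

/* Largest value of T: (half - 1) + half. */
#define SKETCH_TYPE_MAX(T) \
	((T)((SKETCH_HALF_MAX(T) - 1) + SKETCH_HALF_MAX(T)))

/* Smallest value of T: -max - 1 when signed, 0 when unsigned. */
#define SKETCH_TYPE_MIN(T) \
	((T)((T)-SKETCH_TYPE_MAX(T) - (T)1))
```

For int this yields INT_MAX and INT_MIN; for unsigned types the minimum is 0 because the negation wraps modulo 2^N.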
145ddba8bee2bb9e4affa5e3b9acdf21c6c7283d 29-Jun-2020 kevans <kevans@FreeBSD.org> linuxolator: implement memfd_create syscall

This effectively mirrors our libc implementation, but with minor fudging --
name needs to be copied in from userspace, so we just copy it straight into
stack-allocated memfd_name into the correct position rather than allocating
memory that needs to be cleaned up.

The sealing-related fcntl(2) commands, F_GET_SEALS and F_ADD_SEALS, have
also been implemented now that we support them.

Note that this implementation is still not quite at feature parity w.r.t.
the actual Linux version; some caveats, from my foggy memory:

- Need to implement SHM_GROW_ON_WRITE, default for memfd (in progress)
- LTP wants the memfd name exposed to fdescfs
- Linux allows open() of an fdescfs fd with O_TRUNC to truncate after dup.
(?)

Interested parties can install and run LTP from ports (devel/linux-ltp) to
confirm any fixes.

PR: 240874
Reviewed by: kib, trasz
Differential Revision: https://reviews.freebsd.org/D21845
inux/linux.c
inux/linux.h
inux/linux_file.c
inux/linux_file.h
42b6300a9718b3cf32404882dec5630f26a467fb 28-Jun-2020 markj <markj@FreeBSD.org> Remove some redundant assignments and computations.

Reported by: alc
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25400
inuxkpi/common/src/linux_page.c
4ee3a6b8384dfbb5ffa727cc174f88475e119bec 28-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) support SO_PROTOCOL. Running Python test suite
with python3.8 from Focal triggers those.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25491
inux/linux_socket.c
inux/linux_socket.h
6fa24b7c79c9a90093a3b10e2322b2687c501319 27-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) warn about unsupported SA_ flags.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25453
inux/linux_signal.c
b6713bd00a7a74f6ec3adf202671a7d1ab53988a 25-Jun-2020 markj <markj@FreeBSD.org> Implement an approximation of Linux MADV_DONTNEED semantics.

Linux MADV_DONTNEED is not advisory: it has side effects for anonymous
memory, and some system software depends on that. In particular,
MADV_DONTNEED causes anonymous pages to be discarded. If the mapping is
a private mapping of a named object then subsequent faults are to
repopulate the range from that object, otherwise pages will be
zero-filled. For mappings of non-anonymous objects, Linux MADV_DONTNEED
can be implemented in the same way as our MADV_DONTNEED.

This implementation differs from Linux semantics in its handling of
private mappings, inherited through fork(), of non-anonymous objects.
After applying MADV_DONTNEED, subsequent faults will repopulate the
mapping from the parent object rather than the root of the shadow chain.

PR: 230160
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25330
inux/linux_mmap.c
0392561a1671a34d38104454162b6c119000031b 23-Jun-2020 dougm <dougm@FreeBSD.org> In r362552, RB_SET_PARENT is defined, and used in parens in
RB_CLEAR_NODE. But it is not an expression, and ought not to be
enclosed in parens. Remove them.

Approved by: markj
Differential Revision: https://reviews.freebsd.org/D25421
inuxkpi/common/include/linux/rbtree.h
8dd8f8bf0c4a7cc85ee380358abad88868004f3c 23-Jun-2020 dougm <dougm@FreeBSD.org> Define RB_SET_PARENT to do all assignments to rb parent
pointers. Define RB_SWAP_CHILD to replace the child of a parent with
its twin, and use it in 4 places. Use RB_SET in rb_link_node to remove
the only linuxkpi reference to color, and then drop color- and
parent-related definitions that are defined and used only in rbtree.h.

This is intended to be entirely cosmetic, with no impact on program
behavior, and leave RB_PARENT and RB_SET_PARENT as the only ways to
read and write rb parent pointers.

Reviewed by: markj, kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D25264
inuxkpi/common/include/linux/rbtree.h
61e7df6764f6a627f15b05a275b6206389d6cf7e 21-Jun-2020 tmunro <tmunro@FreeBSD.org> vfs: track sequential reads and writes separately

For software like PostgreSQL and SQLite that sometimes reads sequentially
while also writing sequentially some distance behind with interleaved
syscalls on the same fd, performance is better on UFS if we do
sequential access heuristics separately for reads and writes.

Patch originally by Andrew Gierth in 2008, updated and proposed by me with
his permission.

Reviewed by: mjg, kib, tmunro
Approved by: mjg (mentor)
Obtained from: Andrew Gierth <andrew@tao11.riddles.org.uk>
Differential Revision: https://reviews.freebsd.org/D25024
loudabi/cloudabi_file.c
d8f6e5f667a1da216d9997a246b8fd66af464d26 20-Jun-2020 trasz <trasz@FreeBSD.org> Add linux_madvise(2) instead of having Linux apps call the native
FreeBSD madvise(2) directly. While some of the flag values match,
most don't.

PR: kern/230160
Reported by: markj
Reviewed by: markj
Discussed with: brooks, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25272
inux/linux_mmap.c
inux/linux_mmap.h
112ec374678ceb774ecef610111f2128a93ee99f 19-Jun-2020 trasz <trasz@FreeBSD.org> Add warnings for unsupported Linux clockids.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25322
inux/linux_time.c
492642d553223098378f60ba95bdcb21b2416477 19-Jun-2020 markj <markj@FreeBSD.org> Add a helper function for validating VA ranges.

Functions which take untrusted user ranges must validate against the
bounds of the map, and also check for wraparound. Instead of having the
same logic duplicated in a number of places, add a function to check.

Reviewed by: dougm, kib
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25328
inuxkpi/common/src/linux_page.c
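A minimal userland sketch of the kind of check described here: bounds against the map plus a wraparound test. The function names and parameters are hypothetical and do not reproduce the helper added by the commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: accept [start, end) only if it lies inside
 * [map_min, map_max] and did not wrap (end < start indicates wrap).
 */
static bool
range_in_bounds(uintptr_t start, uintptr_t end, uintptr_t map_min,
    uintptr_t map_max)
{
	return (start >= map_min && end <= map_max && start <= end);
}

/* Hypothetical caller: validate an untrusted (addr, len) pair. */
static bool
user_range_ok(uintptr_t addr, size_t len, uintptr_t map_min,
    uintptr_t map_max)
{
	/* Unsigned addition may wrap; that is well defined in C. */
	uintptr_t end = addr + len;

	return (range_in_bounds(addr, end, map_min, map_max));
}
```

Keeping the wraparound test next to the bounds test means a caller cannot forget one of the two.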
00d3dff5885a318c5b95781aac0bcf09dfd0ac41 18-Jun-2020 kib <kib@FreeBSD.org> Fix execution of linux binary from multithreaded non-Linux process.

If a multithreaded non-Linux process execs a Linux binary, then the
non-Linux threads other than the one that is execing are cleared by
single-threading at the boundary and then terminated in
post_execve(). Since at that time the process has already switched to
the Linux ABI, the linuxolator is involved in clearing the threads at
the boundary, but cannot find the emul data.

Handle it by pre-creating emuldata for all threads in the execing process.

Also remove the code in the linux_proc_exec() handler that cleared emul
data for other threads when execing from a multithreaded Linux process.
It is excessive.

PR: 247020
Reported by: Martin Filla <freebsd@sysctl.cz>
Reported by: Henrique L. Amorim, Independent Security Researcher
Reported by: Rodrigo Rubira Branco (BSDaemon), Amazon Web Services
Reviewed by: markj
Tested by: trasz
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25293
inux/linux_emul.c
6b8f9f044da8bd328562ea261af0eef82cb7178d 15-Jun-2020 trasz <trasz@FreeBSD.org> Make Linux uname(2) return x86_64 to 32-bit apps. This helps Steam.

PR: kern/240432
Analyzed by: Alex S <iwtcex@gmail.com>
Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25248
inux/linux_misc.c
405e9d0fd06c8491a135a21c8832fef041faa5fa 14-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) warn about unsupported CMSG level/type.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25255
inux/linux_socket.c
fc9d80af5fbac8b9d2ad219c01a37f0ba99d9e33 13-Jun-2020 dougm <dougm@FreeBSD.org> Linuxkpi uses the rb-tree structures without using their interfaces,
making them break when the representation changes. Revert changes that
eliminated the color field from rb-trees, leaving everything as it was
before.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25250
inuxkpi/common/include/linux/rbtree.h
3befce834a140d7ab58556c72ffd3585b9b480d2 12-Jun-2020 dougm <dougm@FreeBSD.org> Revert r362108, as it breaks compilation.
inuxkpi/common/include/linux/rbtree.h
2651dedf7070baf6f9960f81478da997c56d73fe 12-Jun-2020 dougm <dougm@FreeBSD.org> The linuxkpi code accesses left/right rb tree pointers without using
RB_LEFT or RB_RIGHT, so they aren't stripping off the color bit
encoded there. Strip off that bit for linuxkpi.

Reported by: dch
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25245
inuxkpi/common/include/linux/rbtree.h
513932b64a4df9f296d46672a7f741638b62157c 12-Jun-2020 trasz <trasz@FreeBSD.org> Add compat.linux.debug sysctl, to make it possible to silence down
the debug messages. While here, clean up some variable naming.

Reviewed by: bcr (manpages), emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25230
inux/linux_mib.c
inux/linux_mib.h
inux/linux_util.c
1e3f9796b747b306c622ca516b781d65f464e907 12-Jun-2020 trasz <trasz@FreeBSD.org> Fix naming clash.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_socket.c
bfb3a8aa90aa14053bacaf8d9fbdb2ac260835a1 12-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) warn about unsupported fcntls.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25231
inux/linux_file.c
1d5d0f32a0d1cda88e33d8121d1b5b2de77c104d 12-Jun-2020 trasz <trasz@FreeBSD.org> Minor code cleanup; no functional changes.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25232
inux/linux_socket.c
inux/linux_socket.h
70a7cb5859eb5e44797a6e11caae28fee04e689b 11-Jun-2020 trasz <trasz@FreeBSD.org> Don't use newlines with linux_msg(). No functional changes.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux.c
inux/linux_event.c
inux/linux_futex.c
6c476b95480dd4bee088528b2b947e8e3c2ca406 11-Jun-2020 trasz <trasz@FreeBSD.org> Replace LINUX_FASYNC with LINUX_O_ASYNC; no functional changes.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25218
inux/linux_file.c
inux/linux_file.h
993de18d60c10b494364f8c72f2c5a9cd9c1ba6e 11-Jun-2020 trasz <trasz@FreeBSD.org> Improve the warnings.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_futex.c
d9119d071e4c677da4ab156bfbe265ae2fe7feb2 11-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) handle SO_REUSEPORT.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25216
inux/linux_socket.c
inux/linux_socket.h
ddaa4980ddf24903909aae9b79ac6ae7618984c8 10-Jun-2020 markj <markj@FreeBSD.org> Fix a couple of nits in Linux sysinfo(2) emulation.

- Use the same definition of free memory as Linux.
- Rename the totalbig and freebig fields to match the corresponding
names on Linux.

Discussed with: alc
MFC after: 1 week
inux/linux_misc.c
08d6602d0a718cb289f459b0bc0796095828a5fa 10-Jun-2020 markj <markj@FreeBSD.org> Add a comment reflecting the commit log for r361945.

Suggested by: alc
Reviewed by: alc
MFC with: r361945
inux/linux_misc.c
4289b15fd7c94d2797acacbe78d3df2c1f797d3c 10-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) set the openfiles soft resource limit to 1024 for Linux
applications, which often depend on this being the case. There's a new
sysctl, compat.linux.default_openfiles, to control this behaviour.

Reviewed by: kevans, emaste, bcr (manpages)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25177
inux/linux_emul.c
inux/linux_mib.c
inux/linux_mib.h
d9496f6f43e4aab10dd14b2b121108e1c75a7c43 10-Jun-2020 trasz <trasz@FreeBSD.org> Support SO_SNDBUFFORCE/SO_RCVBUFFORCE by aliasing them to the
standard SO_SNDBUF/SO_RCVBUF. Mostly cosmetics, to get rid
of the warning during 'apt upgrade'.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25173
inux/linux_socket.c
inux/linux_socket.h
16c1ffb4ffeabf05974c53135f5358ac46739db2 09-Jun-2020 dougm <dougm@FreeBSD.org> To reduce the size of an rb_node, drop the color field. Set the least
significant bit in the pointer to the node from its parent to indicate
that the node is red. Have the tree rotation macros leave the
old-parent/new-child node red and the new-parent/old-child node black.

This change makes RB_LEFT and RB_RIGHT no longer assignable, and
RB_COLOR no longer defined. Any code that modifies the tree or
examines a node color would have to be modified after this change.

Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D25105
inuxkpi/common/include/linux/rbtree.h
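The encoding described above, with the colour stored in the least significant bit of the parent's pointer to the child, can be sketched as follows. All names are hypothetical; this is not the tree.h or rbtree.h code, only an illustration of why code that reads the raw link without masking the bit gets a wrong pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical node: child links are integers carrying a tag bit. */
struct node {
	uintptr_t left;
	uintptr_t right;
	int key;
};

#define RED_BIT	((uintptr_t)1)

/* Build a link to 'child', setting the low bit when the child is red. */
static uintptr_t
link_make(struct node *child, bool red)
{
	return ((uintptr_t)child | (red ? RED_BIT : 0));
}

/* Recover the real pointer by masking off the colour bit. */
static struct node *
link_ptr(uintptr_t link)
{
	return ((struct node *)(link & ~RED_BIT));
}

/* Read the colour of the node a link points to. */
static bool
link_is_red(uintptr_t link)
{
	return ((link & RED_BIT) != 0);
}
```

The scheme relies on nodes being aligned to at least two bytes, so the low bit of a valid node address is always zero.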
3688da8f6e8275a190c490934fab93e5194b5b56 09-Jun-2020 jhb <jhb@FreeBSD.org> Refactor ptrace() ABI compatibility.

Add a freebsd32_ptrace() and move as many freebsd32 shims as possible
to freebsd32_ptrace(). Aside from register sets, freebsd32 passes
pointers to native structures to kern_ptrace() and converts to/from
native/32-bit structure formats in freebsd32_ptrace() outside of
kern_ptrace().

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25195
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
reebsd32/syscalls.master
b217ce65e8d9b56e34b817db7d0ba03f22b06371 08-Jun-2020 markj <markj@FreeBSD.org> Stop computing a "sharedram" value when emulating Linux sysinfo(2).

The previous code was computing an incorrect value in a very expensive
manner. "sharedram" is supposed to be the amount of memory used by
named swap objects, which on FreeBSD basically corresponds to memory
usage by shared memory objects (including, for example, GEM objects) and
tmpfs. We currently have no cheap way to count such pages. The
previous code tried to determine the number of copy-on-write pages
shared between processes.

Just replace the computed value with 0. illumos reportedly does the
same thing. Linux itself did not populate this field until a 2014
commit, "mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces".

Reported by: mjg
MFC after: 1 week
inux/linux_misc.c
a0df95e7f1f5ea4b236ca240245cf7765453b3a5 05-Jun-2020 hselasky <hselasky@FreeBSD.org> Ensure pci_channel_offline() actually queries the PCI register space,
and not only the software cache of that register. Else
pci_channel_offline() won't detect that the PCI device is gone when
using the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
f27f87cbc2441536e608dbbd346d2b86d08ea267 02-Jun-2020 hselasky <hselasky@FreeBSD.org> Implement __is_constexpr() function macro in the LinuxKPI.
Bump the FreeBSD version.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
8c32f0d8ad0f08d817f38545bf588f07be793d11 02-Jun-2020 hselasky <hselasky@FreeBSD.org> Implement struct_size() function macro in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
7f08027678877f2270fef378fc52c477d011d9aa 02-Jun-2020 hselasky <hselasky@FreeBSD.org> Implement BUILD_BUG_ON_ZERO() in the LinuxKPI.
Tested using gcc and clang.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
f3a2c56f6e30521a2d0a363a89f7312cf21e1b8c 28-May-2020 rmacklem <rmacklem@FreeBSD.org> Update the files created from the new syscalls.master from r361599.

Reviewed by: brooks
Differential Revision: https://reviews.freebsd.org/D24949
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
4dbd471a29a6bfe63ac152bee9d7fe869e89a7c7 28-May-2020 rmacklem <rmacklem@FreeBSD.org> Add a syscall for the nfs-over-tls daemons to use.

The nfs-over-tls daemons need a system call to perform operations such as
associate a file descriptor with a krpc socket.
The daemons will not be in head for some time, but it will make it
easier for testers of nfs-over-tls to do testing if the system call
is in head (basically the stub for libc which will be committed soon).

Reviewed by: brooks
Differential Revision: https://reviews.freebsd.org/D24949
reebsd32/syscalls.master
7254c6c4256d8680584e9316b6af1a241f493911 27-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add kstrtou16

This function converts a char * to a u16.
Simply use strtoul and cast to compare for ERANGE.

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24996
inuxkpi/common/include/linux/kernel.h
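The "strtoul, then cast and compare" approach can be sketched in userland C as below. The function name parse_u16 and its exact error handling are assumptions for illustration; they are not the LinuxKPI kstrtou16 implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: parse 's' as an unsigned 16-bit value.
 * Returns 0 on success, -EINVAL for malformed input and -ERANGE
 * when the value does not fit in 16 bits.
 */
static int
parse_u16(const char *s, unsigned int base, uint16_t *res)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &end, (int)base);
	/* Reject empty input and trailing garbage. */
	if (end == s || *end != '\0')
		return (-EINVAL);
	/* Narrow and compare: a mismatch means the value was truncated. */
	if (errno == ERANGE || v != (unsigned long)(uint16_t)v)
		return (-ERANGE);
	*res = (uint16_t)v;
	return (0);
}
```

The cast-and-compare step catches values that strtoul accepts but that exceed the 16-bit range, without a separate constant for the limit.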
ac39e7b5dc44018a3745c102b123c6905ce4b1d2 27-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add rcu_swap_protected

This macro swaps an rcu pointer with a normal pointer.
The condition only seems to be used for debug/warning under linux, ignore
for now.

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24954
inuxkpi/common/include/linux/rcupdate.h
8a31db9e3b5bd6af291201bb7938e6c151b0c844 27-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add overflow.h

Only add check_add_overflow and check_mul_overflow as those are the only
two functions needed by DRM v5.3.
Both gcc and clang have builtins to do this check, so use them directly
but throw an error if the compiler/code checker doesn't support these builtins.

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D25015
inuxkpi/common/include/linux/overflow.h
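A sketch of the approach described here: map the two macros onto the GCC/clang overflow builtins and fail the build when they are unavailable. The feature test below is an assumption about how such a guard could be written, not a copy of overflow.h.

```c
#include <limits.h>	/* INT_MAX and UINT_MAX for callers */

/* Older GCC lacks __has_builtin; treat it as "unknown" there. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_add_overflow) || \
    (defined(__GNUC__) && __GNUC__ >= 5)
/* Both evaluate to true on overflow and store the wrapped result. */
#define check_add_overflow(a, b, res)	\
	__builtin_add_overflow((a), (b), (res))
#define check_mul_overflow(a, b, res)	\
	__builtin_mul_overflow((a), (b), (res))
#else
#error "compiler does not provide the overflow-checking builtins"
#endif
```

A caller would typically write `if (check_mul_overflow(n, size, &bytes)) return (-EOVERFLOW);` before using `bytes` as an allocation size.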
3fc1420eac76eb8ddf28d6b0715b2f2fe933f805 25-May-2020 manu <manu@FreeBSD.org> linuxkpi: Fix mod_timer and del_timer_sync

mod_timer is supposed to return 1 if the modified timer was pending, which
is exactly what callout_reset does, so return the value after checking
that it is a valid one in case the API changes.
del_timer_sync returns int so add a function and handle that.

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24983
inuxkpi/common/include/linux/timer.h
inuxkpi/common/src/linux_compat.c
1c2e377244323f7f5d948156debded63dc826588 25-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add refcount.h

Implement some refcount functions needed by drm.
Just use the atomic_t struct and functions from linuxkpi for simplicity.

Sponsored-by: The FreeBSD Foundation

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24985
inuxkpi/common/include/linux/refcount.h
c8a4308a6ee7fab2293c8a63dbb18cd527e2cc00 25-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add __same_type and __must_be_array macros

The __same_type macro simply wraps __builtin_types_compatible_p, which
exists for both GCC and clang and returns 1 if both types are the same.
The __must_be_array macro returns 1 if the argument is an array.

This is needed for DRM v5.3

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24953
inuxkpi/common/include/linux/compiler.h
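The idea behind both macros can be sketched with the compiler builtin directly. The names same_type and is_array are hypothetical stand-ins chosen to avoid reserved identifiers; the definitions are an illustration and not the compiler.h code.

```c
/*
 * Hypothetical sketch using GCC/clang extensions.
 * same_type: 1 when the two expressions have compatible types.
 */
#define same_type(a, b)	\
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/*
 * is_array: an array's type differs from the type of a pointer to
 * its first element, while a plain pointer's type does not.
 */
#define is_array(a)	(!same_type((a), &(a)[0]))
```

This is the usual guard behind element-counting macros, where passing a pointer instead of an array would silently produce a wrong count.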
f2cb13b0f9f3ebbebcd46ccf4d4ed827ee0408b0 23-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add prandom_u32_max

This is just a wrapper around arc4random_uniform
Needed by DRM v5.3

Sponsored-by: The FreeBSD Foundation
Reviewed by: cem, hselasky
Differential Revision: https://reviews.freebsd.org/D24961
inuxkpi/common/include/linux/random.h
f264053afd4e5748e78ee31bd6a3467f34dbd079 21-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add rcu_work functions

The rcu_work function helps to queue some work after waiting for a grace
period.
This is needed by DRM drivers.

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24942
inuxkpi/common/include/linux/workqueue.h
inuxkpi/common/src/linux_work.c
682ad0da3508416180cc7b25bdcfb3278e9f6cfe 19-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add irq_work.h

Since handlers are called in a thread context we can simply use a workqueue
to emulate those functions.
The DRM code was patched to do that already, having it in linuxkpi allows us
to not patch the upstream code.

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24859
inuxkpi/common/include/linux/irq_work.h
9de6703dcab2797ea0ccbd33ea6c9979dc652b0a 19-May-2020 manu <manu@FreeBSD.org> linuxkpi: add pci_dev_present

pci_dev_present shows if a set of pci ids are present in the system.
It just wraps pci_find_device.
Needed by DRMv5.2

Submitted by: Austin Shafer (ashafer@badland.io)
Differential Revision: https://reviews.freebsd.org/D24796
inuxkpi/common/include/linux/pci.h
d9b85b320f349850951874c14b74cc349a08decd 19-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add __init_waitqueue_head

The only difference with init_waitqueue_head is that the name and the
lock class key are provided but we don't use those so use init_waitqueue_head
directly.

Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24861
inuxkpi/common/include/linux/wait.h
3375c2e5710eb6b87c91de865bcf43a41cf59139 17-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add offsetofend macro

This calculates the offset of the end of the member in the given struct.
Needed by DRM in Linux v5.3

Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24849
inuxkpi/common/include/linux/kernel.h
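The computation is the member's offset plus the member's size. A minimal sketch follows; struct hdr is a made-up example type, and the macro body is the conventional formulation rather than a quote of kernel.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past 'member' inside 'type'. */
#define offsetofend(type, member)	\
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical example layout with fixed-width fields. */
struct hdr {
	uint32_t len;		/* bytes 0..3 */
	uint16_t flags;		/* bytes 4..5 */
	uint16_t pad;		/* bytes 6..7 */
	uint64_t cookie;	/* bytes 8..15 */
};
```

A common use is checking that a caller-supplied structure size covers at least the fields up to and including a given member.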
badd1fc826038ea4a9a6262bb042fd3ad80d0d53 17-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add __mutex_init

Same as mutex_init, the lock_class_key argument seems to be only used for
debug in Linux, simply ignore it for now.
Needed by DRM in Linux v5.3

Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24848
inuxkpi/common/include/linux/mutex.h
09a0740f402c8c31efa99e4c77f4c363054d2259 17-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add atomic_dec_and_mutex_lock

This function decrements the counter and if the result is 0 it acquires
the mutex and returns 1, if not it simply returns 0.
Needed by DRM from Linux v5.3

Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24847
inuxkpi/common/include/linux/mutex.h
0b66cf5b978e6bf9d2d1cbbd0ddae4dc0ebee817 16-May-2020 hselasky <hselasky@FreeBSD.org> Implement synchronize_srcu_expedited() in the LinuxKPI.

Differential Revision: https://reviews.freebsd.org/D24798
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/srcu.h
e11456daa04e6949d231cc2599d95c0a63048874 13-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add EBADRQC to errno.h

This is used in the amdgpu driver from Linux 5.2

Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24807
inuxkpi/common/include/linux/errno.h
b2b3f1ded8cf7d8a251dedca841f90ee5cd27b8b 13-May-2020 avg <avg@FreeBSD.org> linuxkpi: print stack trace in WARN_ON macros

Reviewed by: hselasky, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24779
inuxkpi/common/include/linux/kernel.h
31ef7223f864a1e3c2e97561200045ad0abb3b44 10-May-2020 manu <manu@FreeBSD.org> linuxkpi: Really add bitmap_alloc and bitmap_zalloc

This was missing in r360870

Sponsored-by: The FreeBSD Foundation
inuxkpi/common/include/linux/bitmap.h
f7dc2f857c5837cd29926e8e920d6d27a4637c62 10-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add bitmap_alloc and bitmap_free

This is a simple call to kmalloc_array/kfree, therefore include linux/slab.h as
this is where the kmalloc_array/kfree definition is.

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24794
inuxkpi/common/include/linux/bitmap.h
9130c9b4e2f1373db0075b2c06c2bd7901a9fbc6 09-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add bitmap_copy and bitmap_andnot

bitmap_copy simply copies the bitmaps, no idea why it exists.
bitmap_andnot is similar to bitmap_and but uses ~src2.

Sponsored-by: The FreeBSD Foundation
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D24782
inuxkpi/common/include/linux/bitmap.h
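A word-at-a-time sketch of the two operations: copy, and AND with the bitwise complement of the second source. The _sketch suffixes mark these as hypothetical; the boolean return value and the lack of masking for a partial last word are simplifications of this example, not properties of the LinuxKPI functions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG_SK	(sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS_SK(n)	\
	(((n) + BITS_PER_LONG_SK - 1) / BITS_PER_LONG_SK)

/* Copy 'nbits' bits, rounded up to whole words. */
static void
bitmap_copy_sketch(unsigned long *dst, const unsigned long *src,
    size_t nbits)
{
	for (size_t i = 0; i < BITS_TO_LONGS_SK(nbits); i++)
		dst[i] = src[i];
}

/*
 * dst = src1 & ~src2, word by word. Returns true when any bit
 * remains set in the result.
 */
static bool
bitmap_andnot_sketch(unsigned long *dst, const unsigned long *src1,
    const unsigned long *src2, size_t nbits)
{
	unsigned long any = 0;

	for (size_t i = 0; i < BITS_TO_LONGS_SK(nbits); i++) {
		dst[i] = src1[i] & ~src2[i];
		any |= dst[i];
	}
	return (any != 0);
}
```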
f1dfccbb20cf22802fc3817a4b3a091a2c149b92 07-May-2020 manu <manu@FreeBSD.org> linuxkpi: Add pci_iomap and pci_iounmap

Those functions are used to map/unmap I/O regions of a PCI device.
Different resources can be mapped depending on the BAR, so use a
tailq to store them all.

Sponsored-by: The FreeBSD Foundation

Reviewed by: emaste, hselasky
Differential Revision: https://reviews.freebsd.org/D24696
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
cc53a082c73bcad2d0be721d645b21e1b22b3534 04-May-2020 hselasky <hselasky@FreeBSD.org> Optimise use of sg_page_count() in __sg_page_iter_next() in the LinuxKPI.
No need to compute value twice.

No functional change intended.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/scatterlist.h
bae439410db8ac66455699e36bf259cf3ab4f561 04-May-2020 hselasky <hselasky@FreeBSD.org> Implement more scatter and gather functions in the LinuxKPI.

Differential Revision: https://reviews.freebsd.org/D24611
Submitted by: ashafer_badland.io (Austin Shafer)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/scatterlist.h
ecf2e61cc2185185640d2cad280b710ff61f9674 04-May-2020 hselasky <hselasky@FreeBSD.org> Fix warning about sleeping with non-sleepable lock when allocating
"current" from linux_cdev_pager_populate() in the LinuxKPI:

Backtrace:
witness_debugger()
witness_warn()
uma_zalloc_arg()
malloc()
linux_alloc_current()
linux_cdev_pager_populate()
vm_fault()
vm_fault_trap()
trap_pfault()
trap()
calltrap()

Suggested by: avg@
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
e02ffa084b188db18fe5f729c59bb9ef79c731a3 01-May-2020 hselasky <hselasky@FreeBSD.org> Implement more PCI-express bandwidth functions in the LinuxKPI.

Submitted by: ashafer_badland.io (Austin Shafer)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
648e93eab7b7247cfe0cd9603b3bc6a730bb89c1 01-May-2020 hselasky <hselasky@FreeBSD.org> Implement mutex_lock_killable() in the LinuxKPI.

Submitted by: ashafer_badland.io (Austin Shafer)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mutex.h
c71ffa7939fbbe076e37c5cf936d77c4cd666035 01-May-2020 hselasky <hselasky@FreeBSD.org> Implement DIV64_U64_ROUND_UP() in the LinuxKPI.

Submitted by: ashafer_badland.io (Austin Shafer)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/math64.h
0ed5725388a38cdf86ad4eb0aeb030b08a2b5cfe 01-May-2020 hselasky <hselasky@FreeBSD.org> Implement more lockdep macros in the LinuxKPI.

Submitted by: ashafer_badland.io (Austin Shafer)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/lockdep.h
d12c7ea104c61b58389d4699b6fab53d2199c0ba 01-May-2020 hselasky <hselasky@FreeBSD.org> Implement kstrtou64() in the LinuxKPI.

Submitted by: ashafer_badland.io (Austin Shafer)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
ef59c9d4aaa1acd5b9a83ab6cc249a37c674842c 24-Apr-2020 kevans <kevans@FreeBSD.org> sysent: re-roll after 360236 (AUE_CLOSERANGE used)
reebsd32/freebsd32_sysent.c
d854168dac1c0dcdb005964cbf5d69b26c406503 24-Apr-2020 kevans <kevans@FreeBSD.org> close_range(2): use newly assigned AUE_CLOSERANGE
reebsd32/syscalls.master
1cb4ecfafa66e5fbc6943ce712762ce3515d0a6f 22-Apr-2020 hselasky <hselasky@FreeBSD.org> Factor code in LinuxKPI to allow attach and detach using any BSD device.
This allows non-LinuxKPI based infiniband device drivers to attach
correctly to ibcore.

No functional change intended.

Reviewed by: np @
Differential Revision: https://reviews.freebsd.org/D24514
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
e04a8bb8e5be9f3c5e4f70a8bcad029632d9cf9d 20-Apr-2020 hselasky <hselasky@FreeBSD.org> Implement the atomic fetch add unless functions for the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic-long.h
inuxkpi/common/include/asm/atomic.h
inuxkpi/common/include/asm/atomic64.h
4b2e8f2d696a7a9d22f0348d33597d3c9b0e6bca 20-Apr-2020 hselasky <hselasky@FreeBSD.org> Implement aligned LinuxKPI types for u16, u32 and u64.
Makes a difference for 32-bit platforms mostly.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/types.h
ba6fb22df4f592ea3e593b3035e1e998778cb152 20-Apr-2020 hselasky <hselasky@FreeBSD.org> Allow test_bit() in the LinuxKPI to accept a const pointer.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitops.h
602f9bb9d872e1d49f19c56d5d7c9648dd009d01 20-Apr-2020 hselasky <hselasky@FreeBSD.org> Allow the ERR_CAST() function in the LinuxKPI to take a const void pointer.
No functional change.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/err.h
01036d2005cf36c2ad2592ea5117e6bceac6d16d 17-Apr-2020 markj <markj@FreeBSD.org> Remove a vestigial reference to kmem_object.

kmem_object has been an alias of kernel_object for a while.

MFC after: 1 week
inuxkpi/common/src/linux_page.c
92f82df12b2680225b1e7827584d1cd2628eaa44 16-Apr-2020 brooks <brooks@FreeBSD.org> Convert canary, execpathp, and pagesizes to pointers.

Use AUXARGS_ENTRY_PTR to export these pointers. This is a followup to
r359987 and r359988.

Reviewed by: jhb
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24446
reebsd32/freebsd32_misc.c
3bc86c9ae7f722f71566933fdf158eead7a68790 15-Apr-2020 brooks <brooks@FreeBSD.org> Export argc, argv, envc, envv, and ps_strings in auxargs.

This simplifies discovery of these values, potentially reducing the
number of syscalls we need to make at runtime. Longer term, we wish to
convert the startup process to pass an auxargs pointer to _start() and
use that rather than walking off the end of envv. This is cleaner,
more C-friendly, and for systems with strong bounds (e.g. CHERI)
necessary.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24407
reebsd32/freebsd32_misc.c
27c07b76c4be70d3244359ffc5ae93122c419757 15-Apr-2020 brooks <brooks@FreeBSD.org> Make ps_strings in struct image_params into a pointer.

This is a preparatory commit for D24407.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
reebsd32/freebsd32_misc.c
d11edfe286ab878db5612f2b8335f8f5bada62cc 14-Apr-2020 brooks <brooks@FreeBSD.org> Remove bogus use of useracc() in (clock_)nanosleep.

There's no point in pre-checking that we can access the user's rmtp
pointer before we do it in copyout().

While here, improve style(9) compliance.

Reviewed by: imp
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24409
reebsd32/freebsd32_misc.c
cfb2be0cff25af0150a209059991a791ac7c42fa 14-Apr-2020 brooks <brooks@FreeBSD.org> Centralize compatibility translation macros.

Copy the CP, PTRIN, etc macros from freebsd32.h into a sys/abi_compat.h
and replace existing definitions with includes where required. This
eliminates duplicate code and allows Linux and FreeBSD compatibility
headers to be included in the same files.

Input from: cem, jhb
Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24275
reebsd32/freebsd32.h
inux/linux_ioctl.c
inux/linux_timer.h
ee46db7e3b54c37294cd68d10f3d54ee1ea68ad3 14-Apr-2020 kevans <kevans@FreeBSD.org> sysent: re-roll after r359930
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
79165c9642bbe7e5407803bbcb74a2d71e54f2a9 14-Apr-2020 kevans <kevans@FreeBSD.org> Mark closefrom(2) COMPAT12, reimplement in libc to wrap close_range

Include a temporary compatibility shim as well for kernels predating
close_range, since closefrom is used in some critical areas.

Reviewed by: markj (previous version), kib
Differential Revision: https://reviews.freebsd.org/D24399
reebsd32/syscalls.master
a9a4eb77203e9a68117a75a0cf8d241c049de368 12-Apr-2020 kevans <kevans@FreeBSD.org> sysent: re-roll after introduction of close_range in r359836
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
6371039d47fcde7da8301b6243867c1bbfe0793e 12-Apr-2020 kevans <kevans@FreeBSD.org> Implement a close_range(2) syscall

close_range(min, max, flags) allows for a range of descriptors to be
closed. The Python folk have indicated that they would much prefer this
interface to closefrom(2), as the case may be that they/someone have special
fds dup'd to higher in the range and they can't necessarily closefrom(min)
because they don't want to hit the upper range, but relocating them to lower
isn't necessarily feasible.

sys_closefrom has been rewritten to use kern_close_range() using ~0U to
indicate closing to the end of the range. This was chosen rather than
requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the
call to kern_close_range for simplicity.

The flags argument of close_range(2) is currently unused, so a call with any
flag set fails with EINVAL. The argument was added to the interface in Linux
so that future flags could be added for, e.g., "halt on first error" and
things of this nature.
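
The contract described above can be sketched in userspace. This is only a
model under stated assumptions, not the kernel implementation:
model_close_range() and fd_is_open() are hypothetical names, the rejection of
min > max is an assumption, and clamping an open-ended max such as ~0U to the
descriptor limit is just one way to express "close to the end of the range".

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <unistd.h>

/* Hypothetical helper: report whether a descriptor is currently open. */
static int
fd_is_open(int fd)
{
	return (fcntl(fd, F_GETFD) != -1);
}

/*
 * Hypothetical userspace model of close_range(min, max, flags).
 * It closes every descriptor in [min, max] and leaves the rest alone.
 */
static int
model_close_range(unsigned int min, unsigned int max, int flags)
{
	long limit;
	unsigned int fd;

	/* No flags are defined yet, so any set bit is rejected. */
	if (flags != 0 || min > max) {	/* min > max check is an assumption */
		errno = EINVAL;
		return (-1);
	}

	/* Clamp an open-ended request (e.g. ~0U) to the descriptor limit. */
	limit = sysconf(_SC_OPEN_MAX);
	if (limit <= 0 || limit > INT_MAX)
		limit = INT_MAX;
	if (max >= (unsigned int)limit)
		max = (unsigned int)limit - 1;

	/* Closing a descriptor that is not open fails harmlessly. */
	for (fd = min; fd <= max; fd++)
		(void)close((int)fd);
	return (0);
}
```

With descriptors 40, 41 and 45 open, model_close_range(40, 41, 0) closes the
first two and leaves 45 usable, which is the case closefrom(40) cannot
express.
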

This patch is based on a syscall of the same design that is expected to be
merged into Linux.

Reviewed by: kib, markj, vangyzen (all slightly earlier revisions)
Differential Revision: https://reviews.freebsd.org/D21627
reebsd32/syscalls.master
0df2441972c98d9f8e55b76296a0634f2dffdd82 08-Apr-2020 hselasky <hselasky@FreeBSD.org> Clone the RCU interface into a sleepable and a non-sleepable part
in the LinuxKPI.

This allows synchronize RCU to be used inside a SRCU read section.
No functional change intended.

Bump the __FreeBSD_version to force recompilation of external kernel modules.

PR: 242272
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rcupdate.h
inuxkpi/common/src/linux_rcu.c
d83216138fcd6fefd1d06eb36ac23742d7da685a 08-Apr-2020 hselasky <hselasky@FreeBSD.org> Some fixes for SRCU in the LinuxKPI.

- Make sure to use READ_ONCE() when deferring variables.
- Remove superfluous zero initializer.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/srcu.h
efd93357ab78846fb426a7d9dda6f986d9680281 01-Apr-2020 jhb <jhb@FreeBSD.org> Retire procfs-based process debugging.

Modern debuggers and process tracers use ptrace() rather than procfs
for debugging. ptrace() has a superset of functionality available
via procfs and new debugging features are only added to ptrace().
While the two debugging services share some fields in struct proc,
they each use dedicated fields and separate code. This results in
extra complexity to support a feature that hasn't been enabled in the
default install for several years.

PR: 244939 (exp-run)
Reviewed by: kib, mjg (earlier version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D23837
a32/ia32_sysvec.c
cdba6f91c09527532a2c9078a05ed70c965da529 26-Mar-2020 markj <markj@FreeBSD.org> compat/linux/linux.h depends on queue.h since r353725.

Sponsored by: The FreeBSD Foundation
inux/linux.h
edebfddc2ea4e0832eaa64dc5095e3cc9facaf5d 20-Mar-2020 imp <imp@FreeBSD.org> Implement a workaround for kms-drm modules

pci_iov_if.h was added to pci.h, but none of the kms-drm branches have
that. Rather than play whack a mole with the branches, move its inclusion to
linux_pci.c which is the only part of the code that needs it now.

Longer term, other solutions will be needed, but this gives us time to get those
deployed on all the supported versions.
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
615850eb4b3ada021c8937e68d92210b4ddbb136 18-Mar-2020 kib <kib@FreeBSD.org> linuxkpi: Add infrastructure to pass FreeBSD IOV method calls into
pci_driver methods.

Reviewed by: hselasky
Sponsored by: Mellanox Technologies
MFC after: 2 weeks
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
f510b8c08005ebe91d3bb528667afad305820627 10-Mar-2020 hselasky <hselasky@FreeBSD.org> Add support for the device statistics IOCTL, needed by the coming
linux_libusb upgrade.

MFC after: 3 days
Sponsored by: Mellanox Technologies
inux/linux_ioctl.c
inux/linux_ioctl.h
ca63d0921691a8bea4366dd0594e43fc9525f9d5 05-Mar-2020 tijl <tijl@FreeBSD.org> Move compat.linux.map_sched_prio sysctl definition to linux_mib.c so it is
only defined by linux_common kernel module and not both linux and linux64
modules.

Reported by: Yuri Pankov <ypankov@fastmail.com>
inux/linux_mib.c
inux/linux_mib.h
inux/linux_misc.c
81196567353a55686c7209030536708aa34cba1f 04-Mar-2020 brooks <brooks@FreeBSD.org> Introduce kern_mmap_req().

This presents an extensible interface to the generic mmap(2)
implementation via a struct pointer intended to use a designated
initializer or compound literal. We take advantage of the mandatory
zeroing of fields not listed in the initializer.
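
The calling pattern can be illustrated with a small sketch. The structure and
function names below (struct map_req, map_req_pages, map_two_pages) are
hypothetical and are not the kernel's definitions; the point is only that
fields omitted from a designated initializer or compound literal are zeroed,
so new fields can be added later without touching existing callers.

```c
#include <stddef.h>

/* Hypothetical request structure; field names are illustrative only. */
struct map_req {
	void	*hint;		/* preferred address, NULL for "anywhere" */
	size_t	 len;		/* length of the mapping in bytes */
	int	 prot;		/* protection bits */
	int	 flags;		/* mapping flags */
	long	 pos;		/* offset into the backing object */
};

/*
 * Stand-in for the consumer: returns the number of 4096-byte pages the
 * request covers, or -1 for an empty request.
 */
static long
map_req_pages(const struct map_req *req)
{
	if (req->len == 0)
		return (-1);
	return ((long)((req->len + 4095) / 4096));
}

/* The caller names only the fields it cares about; the rest are zero. */
static long
map_two_pages(void)
{
	return (map_req_pages(&(struct map_req){
		.len = 8192,
		.prot = 3,
	}));
}
```
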

Remove kern_mmap_fpcheck() and use kern_mmap_req().

The motivation for this change is a desire to keep the core
implementation from growing an ever-increasing number of arguments
that must be specified in the correct order for the lowest-level
implementations. In CheriBSD we have already added two more arguments.

Reviewed by: kib
Discussed with: kevans
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D23164
inux/linux_mmap.c
4fb65f540aeeb842908b8449c162398bf24c78de 03-Mar-2020 hselasky <hselasky@FreeBSD.org> When closing a LinuxKPI file always use the real release function to avoid
resource leakage when destroying a LinuxKPI character device.

Submitted by: Andrew Boyer <aboyer@pensando.io>
Reviewed by: kib@
PR: 244572
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
17e6de13da30c67328dd108a50e1df85dc18e8b2 01-Mar-2020 mjg <mjg@FreeBSD.org> fd: move vnodes out of filedesc into a dedicated structure

The new structure is copy-on-write. With the assumption that path lookups are
significantly more frequent than chdirs and chrooting this is a win.

This provides stable root and jail root vnodes without the need to reference
them on lookup, which in turn means less work on globally shared structures.
Note this also happens to fix a bug where jail vnode was never referenced,
meaning subsequent access on lookup could run into use-after-free.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23884
inprocfs/linprocfs.c
ad1708e8876c9eb63c19a1c401d4c2654463e52d 01-Mar-2020 tijl <tijl@FreeBSD.org> linuxulator: Map scheduler priorities to Linux priorities.

On Linux the valid range of priorities for the SCHED_FIFO and SCHED_RR
scheduling policies is [1,99]. For SCHED_OTHER the single valid priority is
0. On FreeBSD it is [0,31] for all policies. Programs are supposed to
query the valid range using sched_get_priority_(min|max), but of course some
programs assume the Linux values are valid.

This commit adds a tunable compat.linux.map_sched_prio. When enabled
sched_get_priority_(min|max) return the Linux values and sched_setscheduler
and sched_(get|set)param translate between FreeBSD and Linux values.

Because there are more Linux levels than FreeBSD levels, multiple Linux
levels map to a single FreeBSD level, which means pre-emption might not
happen as it does on Linux, so the tunable allows this behaviour to be disabled.
It is enabled by default because I think it is unlikely that anyone runs
real-time software under Linux emulation on FreeBSD that critically relies
on correct pre-emption.
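
The translation between the two ranges can be sketched as a linear scaling.
This is an illustration only: the function and macro names are hypothetical
and the arithmetic is an assumed scaling, not necessarily the formula used in
linux_misc.c. It does show why several Linux levels collapse onto one FreeBSD
level.

```c
/* Ranges quoted in the commit message. */
#define LINUX_RT_PRIO_MIN	1
#define LINUX_RT_PRIO_MAX	99
#define FBSD_RT_PRIO_MIN	0
#define FBSD_RT_PRIO_MAX	31

#define LINUX_LEVELS	(LINUX_RT_PRIO_MAX - LINUX_RT_PRIO_MIN + 1)	/* 99 */
#define FBSD_LEVELS	(FBSD_RT_PRIO_MAX - FBSD_RT_PRIO_MIN + 1)	/* 32 */

/* Linux [1,99] -> FreeBSD [0,31]; several Linux levels share one result. */
static int
linux_to_fbsd_prio(int lprio)
{
	return ((lprio - LINUX_RT_PRIO_MIN) * FBSD_LEVELS / LINUX_LEVELS +
	    FBSD_RT_PRIO_MIN);
}

/* FreeBSD [0,31] -> the lowest Linux priority that maps back to it. */
static int
fbsd_to_linux_prio(int fprio)
{
	return (((fprio - FBSD_RT_PRIO_MIN) * LINUX_LEVELS + FBSD_LEVELS - 1) /
	    FBSD_LEVELS + LINUX_RT_PRIO_MIN);
}
```

Under this scaling, Linux priorities 1 through 4 all become FreeBSD priority
0, so two Linux threads that differ only within that band would no longer
pre-empt one another.
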

This fixes FMOD, a commercial sound library used by several games.

PR: 240043
Tested by: Alex S <iwtcex@gmail.com>
Reviewed by: dchagin
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D23790
inux/linux_misc.c
inux/linux_misc.h
74c388693eb07ef738f06dff754a8e1b816da414 27-Feb-2020 trasz <trasz@FreeBSD.org> Make linuxulator warn about unsupported getsockopt/setsockopt flags.

MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D23791
inux/linux_socket.c
76f8079137fb5ee3684a84e92796790207491c1d 27-Feb-2020 hselasky <hselasky@FreeBSD.org> Extend the range of the return value from nsecs_to_jiffies64() to support
Mesa's drm_syncobj usage, in the LinuxKPI.

While at it optimise the jiffies conversion functions to avoid repeated
and constant calculations.

Submitted by: Greg V <greg@unrelenting.technology>
Differential Revision: https://reviews.freebsd.org/D23846
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/jiffies.h
inuxkpi/common/src/linux_compat.c
ad355b0a9dbd6a8aabe7c081a731d24904a0f2c1 26-Feb-2020 kaktus <kaktus@FreeBSD.org> Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
a32/ia32_sysvec.c
inux/linux_mib.c
dis/subr_ntoskrnl.c
86bios/x86bios.c
939cf3d6d51350ce8137b81233f7018d9a6e99cd 21-Feb-2020 manu <manu@FreeBSD.org> linuxkpi: Move shmem related functions into their own file

For drmkpi (D23085) we don't want the Linux struct file as we don't emulate
everything. Also the prototypes should be in shmem_fs.h to have 100%
compatibility with Linux.

Reviewed by: hselasky
MFC after: Maybe
Differential Revision: https://reviews.freebsd.org/D23764
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/shmem_fs.h
inuxkpi/common/src/linux_page.c
inuxkpi/common/src/linux_shmemfs.c
354a0e9fa387fa11cdc65d587b4b1537503de4a1 20-Feb-2020 manu <manu@FreeBSD.org> linuxkpi: Add str_has_prefix

This function tests whether the string str begins with the string pointed
at by prefix.
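
A minimal sketch of such a helper follows. The name sketch_str_has_prefix is
hypothetical, and the return convention (the prefix length on a match, 0
otherwise) is assumed from the Linux helper of the same name rather than
taken from this commit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns strlen(prefix) when str starts with prefix, otherwise 0.
 * An empty prefix therefore also yields 0.
 */
static size_t
sketch_str_has_prefix(const char *str, const char *prefix)
{
	size_t len = strlen(prefix);

	return (strncmp(str, prefix, len) == 0 ? len : 0);
}
```

Returning the length lets a caller skip past the prefix in one step, for
example by adding the result to the string pointer.
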

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23767
inuxkpi/common/include/linux/string.h
df7aea4cf7da31e2bb7988907f2053b61d134351 20-Feb-2020 manu <manu@FreeBSD.org> linuxkpi: Add list_is_first function

This function just tests whether the element is the first of the list.
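
On a circular doubly linked list the check reduces to one pointer comparison.
The sketch below uses a minimal stand-in for the list type and hypothetical
function names; it illustrates the idea and is not the header's actual code.

```c
#include <stdbool.h>

/* Minimal stand-in for a Linux-style circular list node. */
struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

static void
sketch_init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Append entry just before head, i.e. at the tail of the list. */
static void
sketch_list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* The first element is the one whose predecessor is the list head. */
static bool
sketch_list_is_first(const struct list_head *entry,
    const struct list_head *head)
{
	return (entry->prev == head);
}
```
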

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23766
inuxkpi/common/include/linux/list.h
ddfc0f914b4734e60d01b54dbec314b3535d8d9a 20-Feb-2020 mjg <mjg@FreeBSD.org> make sysent for r358172 ("vfs: add realpathat syscall")
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
54c053b80c20e1e4fd353cec4e086a7b22cb9734 20-Feb-2020 mjg <mjg@FreeBSD.org> vfs: add realpathat syscall

realpath(3) is used a lot e.g., by clang and is a major source of getcwd
and fstatat calls. This can be done more efficiently in the kernel.

This works by performing a regular lookup while saving the name and found
parent directory. If the terminal vnode is a directory we can resolve it using
usual means. Otherwise we can use the name saved by lookup and resolve the
parent.

See the review for sample syscall counts.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23574
reebsd32/syscalls.master
3c6c1e149e7803fd3cd64357975a7be43bd2319b 15-Feb-2020 kaktus <kaktus@FreeBSD.org> Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (2 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked). Use it in
preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Reviewed by: hselasky, kib, zeising
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D23631
inuxkpi/common/src/linux_compat.c
0fc62af53167cbfffc5e6dc4b20daaa74bdcf995 15-Feb-2020 mjg <mjg@FreeBSD.org> cloudabi: use new capsicum helpers
loudabi/cloudabi_file.c
c769587177ed3087233e9a5bf8a5119e0f53b430 12-Feb-2020 emaste <emaste@FreeBSD.org> regen sysent after r357831, r357838

Capability mode changes allowing fdatasync and getloginclass.

Sponsored by: The FreeBSD Foundation
reebsd32/freebsd32_sysent.c
1c619be1ef0d9bff14d975c26f78d901ed799461 10-Feb-2020 trasz <trasz@FreeBSD.org> Make linux(4) use kern_socketpair(9) instead of going through
sys_socketpair(). It's a cleanup; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22814
inux/linux_socket.c
625da238b06451dd5d9b4bf5e31ffdc30e4eca66 09-Feb-2020 kib <kib@FreeBSD.org> Regen.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
c3e1a2bd2c4c568d00c4d31dbeb3fdbd2456198b 09-Feb-2020 kib <kib@FreeBSD.org> Add a way to manage thread signal mask using shared word, instead of syscall.

A new syscall sigfastblock(2) is added which registers a uint32_t
variable as containing the count of blocks for signal delivery. Its
content is read by kernel on each syscall entry and on AST processing,
a non-zero count of blocks is interpreted the same as a signal mask
blocking all signals.

The biggest downside of the feature that I see is that memory
corruption that affects the registered fast sigblock location, would
cause quite strange application misbehavior. For instance, the process
would be immune to ^C (but killable by SIGKILL).

With consumers (rtld and libthr added), benchmarks do not show a
slow-down of the syscalls in micro-measurements, and macro benchmarks
like buildworld do not demonstrate a difference. Part of the reason is
that buildworld time is dominated by compiler, and clang already links
to libthr. On the other hand, small utilities typically used by shell
scripts have the total number of syscalls cut by half.

The syscall is not exported from the stable libc version namespace on
purpose. It is intended to be used only by our C runtime
implementation internals.

Tested by: pho
Discussed with: cem, emaste, jilles
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D12773
reebsd32/syscalls.master
2bed614f8f9100a2e468a4f8dadb270782be2f9c 07-Feb-2020 kib <kib@FreeBSD.org> linux futex_put(): do not touch futex after dropping our reference.

Reported and tested by: Steve Roome <me@stephenroome.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
inux/linux_futex.c
77a92d8644e7b1b97832b4a058653519c0b56830 05-Feb-2020 emaste <emaste@FreeBSD.org> linuxulator: implement sendfile

Submitted by: Bora Özarslan <borako.ozarslan@gmail.com>
Submitted by: Yang Wang <2333@outlook.jp>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19917
inux/linux_socket.c
inux/linux_socket.h
f56916862be3112585c04d186c1b8b662257adb1 04-Feb-2020 kib <kib@FreeBSD.org> Add sys/systm.h to several places that use vm headers.

It is needed (but not enough) to use e.g. KASSERT() in inline functions.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week
loudabi/cloudabi_vdso.c
965d75c8d2c09c07b8aaf22e4f8476761faab78a 04-Feb-2020 dchagin <dchagin@FreeBSD.org> Fix clock_gettime() and clock_getres() for cpu clocks:
- handle the CLOCK_{PROCESS,THREAD}_CPUTIME_ID specified directly;
- fix thread id calculation as in the Linuxulator we should
convert the user supplied thread id to struct thread * by linux_tdfind();
- fix CPUCLOCK_SCHED case by using kern_{process,thread}_cputime()
directly as native get_cputime() used by kern_clock_gettime() uses
native tdfind()/pfind() to find process/thread.

PR: 240990
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23341
MFC after: 2 weeks
inux/linux_time.c
fd5e34749f462d234da9bdf64480b78175d9614b 04-Feb-2020 dchagin <dchagin@FreeBSD.org> linux_to_native_clockid() properly initializes nwhich variable (or returns an error),
so don't initialize nwhich in declaration and remove stale comment from r161304.

Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D23339
MFC after: 2 weeks
inux/linux_time.c
ecb991e675416271c1bf87cc6f33567fd19f536e 03-Feb-2020 mjg <mjg@FreeBSD.org> fd: remove the seq argument from fget_unlocked

It is almost always NULL.
inuxkpi/common/include/linux/file.h
609d31f8f4c4dbd0349ab3654f70d4204e2838df 01-Feb-2020 mjg <mjg@FreeBSD.org> cache: replace kern___getcwd with vn_getcwd

The previous routine was resulting in extra data copies most notably in
linux_getcwd.
inux/linux_getcwd.c
76f0f7625b8d9e62e5b5319f24973624ad63bd84 28-Jan-2020 trasz <trasz@FreeBSD.org> Add TCP_CORK support to linux(4). This fixes one of the things Nginx
trips over.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23171
inux/linux_socket.c
inux/linux_socket.h
75529ff6f1a46c5067a8bf2b91f4db777f1d0549 28-Jan-2020 trasz <trasz@FreeBSD.org> Add compat.linux.ignore_ip_recverr sysctl. This is a workaround
for missing IP_RECVERR setsockopt(2) support. Without it, DNS
resolution is broken for glibc >= 2.30 (glibc BZ #24047).

From the user point of view this fixes "yum update" on recent
CentOS 8.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23234
inux/linux_mib.c
inux/linux_mib.h
inux/linux_socket.c
inux/linux_socket.h
e1e9560efac282c57a82172c797eef365e8c0b00 28-Jan-2020 kib <kib@FreeBSD.org> Provide support for fdevname(3) on linuxkpi-backed devices.

Reported and tested by: manu
Reviewed by: hselasky, manu
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23386
inuxkpi/common/src/linux_compat.c
d22d1cb9474999d990548f144611b39ae0618a11 24-Jan-2020 hselasky <hselasky@FreeBSD.org> Implement mmget_not_zero() in the LinuxKPI.

Submitted by: Austin Shafer <ashafer@badland.io>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm_types.h
c384fe093689b789cfaff7b5caedb9d1777ae4ce 24-Jan-2020 trasz <trasz@FreeBSD.org> Make linux(4) handle MAP_32BIT.

This unbreaks Mono (mono-devel-4.6.2.7+dfsg-1ubuntu1 from Ubuntu Bionic);
previously it would crash on "amd64_is_imm32" assert.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23306
inux/linux_mmap.c
inux/linux_mmap.h
286137aef5e0eef6a6f551818032744b2f1a794e 24-Jan-2020 trasz <trasz@FreeBSD.org> Add kern_unmount() and use in Linuxulator. No functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22646
inux/linux_file.c
d25f72a6e48f5abd336ea894dd1423dc0509fbe9 22-Jan-2020 glebius <glebius@FreeBSD.org> Remove comment that no longer describes reality.
dis/ntoskrnl_var.h
377f319c60f3033b874d9f2f5d1abfa4925d87c5 21-Jan-2020 trasz <trasz@FreeBSD.org> Revert r356948; breaks build somehow.
inux/linux_mmap.c
inux/linux_mmap.h
59855ad8c2ea6c3c9b1f9d547da5ad0d7b4153e3 21-Jan-2020 trasz <trasz@FreeBSD.org> Make linux(4) handle MAP_32BIT.

This unbreaks Mono (mono-devel-4.6.2.7+dfsg-1ubuntu1 from Ubuntu Bionic);
previously it would crash on "amd64_is_imm32" assert.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_mmap.c
inux/linux_mmap.h
293becab6beff84a6495641cb82ed6d4e2fac509 21-Jan-2020 markj <markj@FreeBSD.org> Fix 64-bit syscall argument fetching in 32-bit Linux syscall handlers.

The Linux32 system call argument fetcher places each argument (passed in
registers in the Linux x86 system call convention) into an entry in the
generic system call args array. Each member of this array is 8 bytes
wide, so this approach is broken for system calls that take off_t
arguments.

Fix the problem by splitting l_loff_t arguments in the 32-bit system
call descriptions, the same as we do for FreeBSD32. Change entry points
to handle this using the PAIR32TO64 macro.
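
Reassembling a split offset amounts to a shift and an OR. The helper below is
a hypothetical sketch, not the PAIR32TO64 macro itself; it assumes the low
half is passed first, as on a little-endian ABI.

```c
#include <stdint.h>

/*
 * Combine the two 32-bit halves of a 64-bit argument. Both halves are
 * widened before shifting so that no bits are lost.
 */
static int64_t
pair32_to_64(uint32_t lo, uint32_t hi)
{
	return ((int64_t)(((uint64_t)hi << 32) | (uint64_t)lo));
}
```

An entry point that receives an offset as two halves would call such a helper
once and then pass the resulting 64-bit value on to the common
implementation.
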

Move linux_ftruncate64() into compat/linux.

PR: 243155
Reported by: Alex S <iwtcex@gmail.com>
Reviewed by: kib (previous version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23210
inux/linux_file.c
d99963e18dc2ec41fd0968abf19da4549a30b954 20-Jan-2020 trasz <trasz@FreeBSD.org> Properly translate MNT_FORCE flag to Linux umount2(2). Previously
it worked by accident.

MFC after: 2 weeks
Sponsored by: DARPA
inux/linux_file.c
inux/linux_file.h
20614c0f40a14194eb6654b1fafd2f8fb311c71e 18-Jan-2020 kevans <kevans@FreeBSD.org> sysent targets: further cleanup and deduplication

r355473 vastly improved the readability and cleanliness of these Makefiles.
Every single one of them follows the same pattern and duplicates the exact
same logic.

Now that we have GENERATED/SRCS, split SRCS up into the two parameters we'll
use for ${MAKESYSCALLS} rather than assuming a specific ordering of SRCS and
include a common sysent.mk to handle the rest. This makes it less tedious to
make sweeping changes.

Some default values are provided for GENERATED/SYSENT_*; almost all of these
just use a 'syscalls.master' and 'syscalls.conf' in cwd, and they all use
effectively the same filenames with an arbitrary prefix. Most ABIs will be
able to get away with just setting GENERATED_PREFIX and including
^/sys/conf/sysent.mk, while others only need light additions. kern/Makefile
is the notable exception, as it doesn't take a SYSENT_CONF and the generated
files are spread out between ^/sys/kern and ^/sys/sys, but it otherwise fits
the pattern enough to use the common version.

Reviewed by: brooks, imp
Nice!: emaste
Differential Revision: https://reviews.freebsd.org/D23197
loudabi32/Makefile
loudabi64/Makefile
reebsd32/Makefile
46085d4187d07e17855920adaea1da443859bd33 15-Jan-2020 markj <markj@FreeBSD.org> Handle a NULL thread pointer in linux_close_file().

This can happen if a file is closed during unix socket GC. The same bug
was fixed for devfs descriptors in r228361.

PR: 242913
Reported and tested by: iz-rpi03@hs-karlsruhe.de
Reviewed by: hselasky, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23178
inuxkpi/common/src/linux_compat.c
a85eb0d89bc06626ad19b453635a881137a55e3f 14-Jan-2020 trasz <trasz@FreeBSD.org> Make linux(4) use kern_setsockopt(9) instead of going through
sys_setsockopt. Just a cleanup; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22812
inux/linux_socket.c
2e64d12585dd8d20fcdaec96fb2415fcb1528ee7 14-Jan-2020 trasz <trasz@FreeBSD.org> Make linux(4) use kern_getsockopt(9) instead of going through
sys_getsockopt(). It's a cleanup; no functional changes.

Reviewed by: kib (earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22813
inux/linux_socket.c
76f76693cf6dcc24db2646514a5e7ff85fab75ce 14-Jan-2020 trasz <trasz@FreeBSD.org> Make linux getcpu(2) report the domain.

Submitted by: markj
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23144
inux/linux_misc.c
7171f03dc53a0b7f1d5219f765743e7f09b47d98 13-Jan-2020 kib <kib@FreeBSD.org> Code must not unlock a mutex while owning the thread lock.

Reviewed by: hselasky, markj
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23150
inuxkpi/common/src/linux_rcu.c
42dd32c02cf7ba7fe50b863c8a0dc99adf08f95b 12-Jan-2020 trasz <trasz@FreeBSD.org> Add kern_getpriority(), make Linuxulator use it.

Reviewed by: kib, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22842
inux/linux_misc.c
1eb0b59f78c2e8b828a4210dca0d7eb35ce080a4 12-Jan-2020 trasz <trasz@FreeBSD.org> Add kern_setpriority(), use it in Linuxulator.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22841
inux/linux_misc.c
f813cffaf1153922051989adeefaeeaed66311ea 10-Jan-2020 kevans <kevans@FreeBSD.org> Set .ORDER for makesyscalls generated files

When either makesyscalls.lua or syscalls.master changes, all of the
${GENERATED} targets are now out-of-date. With make jobs > 1, this means we
will run the makesyscalls script in parallel for the same ABI, generating
the same set of output files.

Prior to r356603, there was a large window for interlacing output for some
of the generated files that we were generating in-place rather than staging
in a temp dir. After that, we still shouldn't need to run the script more
than once per-ABI as the first invocation should update all of them. Add
.ORDER to do so cleanly.

Reviewed by: brooks
Discussed with: sjg
Differential Revision: https://reviews.freebsd.org/D23099
loudabi32/Makefile
loudabi64/Makefile
reebsd32/Makefile
ff34767453e2b7e288f385b8c480695f25517047 08-Jan-2020 markj <markj@FreeBSD.org> linprocfs: Fix some bugs in the maps file implementation.

- Export the offset into the backing object, not the object size.
- Fix a bug where we would print the previous entry's "offset" when a
map_entry has no object.
- Try to identify shared mappings. Linux prints "s" when the mapping
"may be shared". This attempt is not perfect, for example, we print
"p" for anonymous memory that may be shared via
minherit(INHERIT_SHARE).

PR: 240992
Reviewed by: kib
MFC after: 1 week
MFC note: no OBJ_ANON in stable/12
Differential Revision: https://reviews.freebsd.org/D23062
inprocfs/linprocfs.c
1204b9c8821b43d6b09eabf97a1c61a4ecb14711 07-Jan-2020 mjg <mjg@FreeBSD.org> vfs: reimplement deferred inactive to use a dedicated flag (VI_DEFINACT)

The previous behavior of leaving VI_OWEINACT vnodes on the active list without
a hold count is eliminated. Hold count is kept and inactive processing gets
explicitly deferred by setting the VI_DEFINACT flag. The syncer is then
responsible for vdrop.

Reviewed by: kib (previous version)
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D23036
inux/linux_stats.c
24ac2fb6ec0610e60ac18d2cf8334d2a608fe234 05-Jan-2020 kevans <kevans@FreeBSD.org> shm: correct KPI mistake introduced around memfd_create

When file sealing and shm_open2 were introduced, we should have grown a new
kern_shm_open2 helper that did the brunt of the work with the new interface
while kern_shm_open remains the same. Instead, more complexity was
introduced to kern_shm_open to handle the additional features and consumers
had to keep changing in somewhat awkward ways, and a kern_shm_open2 was
added to wrap kern_shm_open.

Backpedal on this and correct the situation: kern_shm_open returns to the
interface it had prior to file sealing being introduced, and neither
function needs an initial_seals argument anymore as it's handled in
kern_shm_open2 based on the shmflags.
loudabi/cloudabi_fd.c
a004277319953ab1dffe5c9473cc41d1e45edab1 04-Jan-2020 kevans <kevans@FreeBSD.org> kern_mmap: add a variant that allows caller to inspect fp

Linux mmap rejects mmap() on a write-only file with EACCES.
linux_mmap_common currently does a fun dance to grab the fp associated with
the passed in fd, validates it, then drops the reference and calls into
kern_mmap(). Doing so is perhaps both fragile and premature; there's still
plenty of chance for the request to get rejected with a more appropriate
error, and it's prone to a race where the file we ultimately mmap has
changed after it drops its reference.

This change alleviates the need to do this by providing a kern_mmap variant
that allows the caller to inspect the fp just before calling into the fileop
layer. The callback takes flags, prot, and maxprot as one could imagine
scenarios where any of these, in conjunction with the file itself, may
influence a caller's decision.

The file type check in the linux compat layer has been removed; EINVAL is
seemingly not an appropriate response to the file not being a vnode or
device. The fileop layer will reject the operation with ENODEV if it's not
supported, which more closely matches the common linux description of
mmap(2) return values.

If we discover that we're allowing an mmap() on a file type that Linux
normally wouldn't, we should restrict those explicitly.

Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22977
inux/linux_mmap.c
f121d45000fd1c42611ca1e54872bd4545398933 03-Jan-2020 mjg <mjg@FreeBSD.org> vfs: drop the mostly unused flags argument from VOP_UNLOCK

Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427
loudabi/cloudabi_file.c
inux/linux_misc.c
inuxkpi/common/src/linux_compat.c
dis/subr_ndis.c
fc5f50a0912549b98a15afd7b0b94109aa007a8f 02-Jan-2020 markj <markj@FreeBSD.org> Remove set_page_dirty_lock().

Its use of the page lock is incorrect, and it is not used by the DRM
modules.

Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D23002
inuxkpi/common/include/linux/mm.h
f870efbd57f42155d26b2bc7fb95846897449dca 31-Dec-2019 trasz <trasz@FreeBSD.org> Add basic getcpu(2) support to linuxulator. The purpose of this
syscall is to query the CPU number and the NUMA domain the calling
thread is currently running on. The third argument is ignored.
It doesn't do anything regarding scheduling - it's literally
just a way to query the current state, without any guarantees
you won't get rescheduled an opcode later.

This unbreaks Java from CentOS 8
(java-11-openjdk-11.0.5.10-0.el8_0.x86_64).

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22972
inux/linux_misc.c
121a4336abf995fd8768d9aa2594e5e07f380420 30-Dec-2019 kaktus <kaktus@FreeBSD.org> linux(4): implement copy_file_range(2)

copy_file_range(2) is implemented natively since r350315, make it available
for Linux binaries too.

Reviewed by: kib (mentor), trasz (previous version)
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D22959
inux/linux_file.c
4b3d6ef9bc03864312621102fed65161b024931d 29-Dec-2019 trasz <trasz@FreeBSD.org> Implement Linux syslog(2) syscall; just enough to make Linux dmesg(8)
utility work.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22465
inux/linux_misc.c
inux/linux_misc.h
e2c0d08b70cb40c87ad4dd8828306a95f92eb675 29-Dec-2019 trasz <trasz@FreeBSD.org> Make linprocfs(5) provide an empty /proc/modules. This should silence
some warnings.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inprocfs/linprocfs.c
15928a798d4c96a006cd67c71fbd5b4de9ee4a40 29-Dec-2019 trasz <trasz@FreeBSD.org> Make Linux stat(2) et al distinguish between block and character
devices. It's required for LTP, among other things. It's not
complete, but good enough for now.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22950
inux/linux_stats.c
95b9a558893ad6e491d6999d96d79af914090212 29-Dec-2019 trasz <trasz@FreeBSD.org> Implement Linux BLKGETSIZE64 ioctl.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_ioctl.c
inux/linux_ioctl.h
e849a32c2f78afc9ffece1f54df954327464f8f2 28-Dec-2019 trasz <trasz@FreeBSD.org> Make linux mount(2) tolerate NULL 'from' argument, and fix flag
handling.

This should unbreak access04, acct01, chmod06, creat06,
and fchmod06 LTP tests.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_file.c
a212d0b2e84d9c6aac527a5d831a60dac57faa14 26-Dec-2019 cem <cem@FreeBSD.org> random(9): Deprecate random(9), remove meaningless srandom(9)

srandom(9) is meaningless on SMP systems or any system with, say,
interrupts. One could never rely on random(9) to produce a reproducible
sequence of outputs on the basis of a specific srandom() seed because the
global state was shared by all kernel contexts. As such, removing it is
literally indistinguishable to random(9) consumers (as compared with
retaining it).

Mark random(9) as deprecated and slated for quick removal. This is not to
say we intend to remove all fast, non-cryptographic PRNG(s) in the kernel.
It/they just won't be random(9), as it exists today, in either name or
implementation.

Before random(9) is removed, a replacement will be provided and in-tree
consumers will be converted.

Note that despite the name, the random(9) interface does not bear any
resemblance to random(3). Instead, it is the same crummy 1988 Park-Miller
LCG used in libc rand(3).
dis/subr_ntoskrnl.c
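
For readers unfamiliar with the generator named in the random(9) entry above, here is a minimal sketch of the textbook Park-Miller "minimal standard" LCG. It is an illustration only, not the kernel's code: the function names are hypothetical, and the 64-bit multiply is a convenience that the in-tree implementation may not use.

```c
#include <stdint.h>

/* Textbook Park-Miller "minimal standard" constants (1988 paper). */
#define PM_MULTIPLIER	16807u
#define PM_MODULUS	2147483647u	/* 2^31 - 1 */

/* Advance the generator one step; state must be in [1, PM_MODULUS - 1]. */
static uint32_t
park_miller_next(uint32_t state)
{
	/* A 64-bit product avoids overflow before the reduction. */
	return ((uint32_t)(((uint64_t)state * PM_MULTIPLIER) % PM_MODULUS));
}

/* Hypothetical helper: the state reached after n steps from seed. */
static uint32_t
park_miller_nth(uint32_t seed, unsigned int n)
{
	while (n-- > 0)
		seed = park_miller_next(seed);
	return (seed);
}
```

Because the whole state is a single shared word, any other caller that advances it changes what the next caller sees, which is why a seed cannot guarantee a reproducible sequence when the state is global.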
bdb3ec248ab1cfb97dafd8f477fb1da0941987e9 22-Dec-2019 jeff <jeff@FreeBSD.org> Make page busy state deterministic on free. Pages must be xbusy when
removed from objects including calls to free. Pages must not be xbusy
when freed and not on an object. Strengthen assertions to match these
expectations. In practice very little code had to change busy handling
to meet these rules but we can now make stronger guarantees to busy
holders and avoid conditionally dropping busy in free.

Refine vm_page_remove() and vm_page_replace() semantics now that we have
stronger guarantees about busy state. This removes redundant and
potentially problematic code that has proliferated.

Discussed with: markj
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D22822
inuxkpi/common/src/linux_compat.c
e158d1984da30f42bdd1139c05d884840527c211 18-Dec-2019 hselasky <hselasky@FreeBSD.org> Restore important comment in RCU/EPOCH support in FreeBSD after r355784.

Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_rcu.c
802e5163ef7391f6390d3c59ef1333fd7acedf37 16-Dec-2019 trasz <trasz@FreeBSD.org> Add a hack to make ^T work for Linux binaries, enabled with
'compat.linux.preserve_vstatus=1' sysctl.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21967
inux/linux_ioctl.c
inux/linux_ioctl.h
inux/linux_mib.c
inux/linux_mib.h
e41f6b35b706a65fc5243073451db8cc3214dbc0 16-Dec-2019 trasz <trasz@FreeBSD.org> Add compat.linux.emul_path, so it can be set to something other
than "/compat/linux". Useful when you have several compat directories
with different Linux versions and you don't want to clash with files
installed by linux-c7 packages.

Reviewed by: bcr (manpages)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22574
inux/linux_util.c
inux/linux_util.h
a329a5f6804f6be8c30748323b16021ebc153215 16-Dec-2019 trasz <trasz@FreeBSD.org> Don't use K&R definitions. No functional changes.

Reported by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inux/linux_file.c
506c867c6e0e0ef3477c28e18b97f50207c7b230 15-Dec-2019 jeff <jeff@FreeBSD.org> schedlock 4/4

Don't hold the scheduler lock while doing context switches. Instead we
unlock after selecting the new thread and switch within a spinlock
section leaving interrupts and preemption disabled to prevent local
concurrency. This means that mi_switch() is entered with the thread
locked but returns without. This dramatically simplifies scheduler
locking because we will not hold the schedlock while spinning on
blocked lock in switch.

This change has not been made to 4BSD but in principle it would be
more straightforward.

Discussed with: markj
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22778
inuxkpi/common/src/linux_rcu.c
bf925a1e495ba90a3064aa882619930b782ba9ab 15-Dec-2019 jeff <jeff@FreeBSD.org> schedlock 1/4

Eliminate recursion from most thread_lock consumers. Return from
sched_add() without the thread_lock held. This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks. This will eventually allow for lockless remote adds.

Discussed with: kib
Reviewed by: jhb
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22626
inux/linux_fork.c
inuxkpi/common/src/linux_kthread.c
5b3e917aa9bd77a7b6e81eff6478393f82ee6034 15-Dec-2019 cem <cem@FreeBSD.org> Revert r355760, r355759

And remove the inline/deprecated attribute use entirely in stdlib.h, from
r355747. The intent was to provide a buildable API transitionary period, but
clearly that was counter-productive.

Reported by: delphij, imp, others
inuxkpi/common/include/linux/compiler.h
039992743e2c116f2bb7f98f3d0e63de527fb714 14-Dec-2019 cem <cem@FreeBSD.org> linuxkpi: Drop incompatible __deprecated definition

Probably all of these linuxkpi stubs should be '#ifndef' guarded, but maybe
that would prevent people from noticing when they are defined.

Introduced in r355759. For some reason I only ran a buildworld and not a
kernel. Mea culpa.

Reported by: Mark Millard
X-MFC-with: r355759
inuxkpi/common/include/linux/compiler.h
820308e362842a8bf5db2d437e6ee913c7919889 14-Dec-2019 trasz <trasz@FreeBSD.org> Add sync_file_range(2) implementation to linux(4); it's a thin wrapper
over the usual fsync(2).

This silences some warnings when running "apt-get upgrade".

Reviewed by: brooks, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22371
inux/linux_file.c
inux/linux_file.h
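
The "thin wrapper over fsync(2)" idea from the sync_file_range(2) entry above can be sketched in userland C as follows. The function name is hypothetical and the argument handling is an assumption: the range and flags are simply ignored here, whereas the in-tree code may validate them first.

```c
#include <sys/types.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical userland sketch: ignore the requested range and flags
 * and flush the whole file. Returns 0 on success or an errno value.
 */
static int
sync_file_range_sketch(int fd, off_t offset, off_t nbytes, unsigned int flags)
{
	(void)offset;	/* the range is not honoured in this sketch */
	(void)nbytes;
	(void)flags;

	if (fsync(fd) == -1)
		return (errno);
	return (0);
}
```

Flushing more than was asked for is always safe, only potentially slower, so a wrapper of this shape trades performance for simplicity.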
afefb77b29b866280269455e2c5702859fe652c9 13-Dec-2019 trasz <trasz@FreeBSD.org> Add kern_kill() and use it in Linuxulator. It's just a cleanup,
no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22645
inux/linux_signal.c
787237cc3fa549221581aad8595fda9c2481c9e7 13-Dec-2019 trasz <trasz@FreeBSD.org> Add kern_getsid() and use it in Linuxulator; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22647
inux/linux_misc.c
4459aedbdc7ef2215f488ce43f8119620605025d 09-Dec-2019 jhb <jhb@FreeBSD.org> Copy out aux args after the argument and environment vectors.

Partially revert r354741 and r354754 and go back to allocating a
fixed-size chunk of stack space for the auxiliary vector. Keep
sv_copyout_auxargs but change it to accept the address at the end of
the environment vector as an input stack address and no longer
allocate room on the stack. It is now called at the end of
copyout_strings after the argv and environment vectors have been
copied out.

This should fix a regression in r354754 that broke the stack alignment
for newer Linux amd64 binaries (and probably broke Linux arm64 as
well).

Reviewed by: kib
Tested on: amd64 (native, linux64 (only linux-base-c7), and i386)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22695
reebsd32/freebsd32_misc.c
dfa2e15cbee94059564bd5ac5d6225fe20a0fb01 06-Dec-2019 brooks <brooks@FreeBSD.org> sysent: Reduce duplication and improve readability.

Use the power of variables to avoid spelling out source and generated
files too many times. The previous Makefiles were hard to read, hard to
edit, and badly formatted.

Reviewed by: kevans, emaste
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22714
loudabi32/Makefile
loudabi64/Makefile
reebsd32/Makefile
0d8d23a6a3f467016a24a8eea6ca95087ba82581 03-Dec-2019 jhb <jhb@FreeBSD.org> Use uintptr_t instead of register_t * for the stack base.

- Use ustringp for the location of the argv and environment strings
and allow destp to travel further down the stack for the stackgap
and auxv regions.
- Update the Linux copyout_strings variants to move destp down the
stack as was done for the native ABIs in r263349.
- Stop allocating a space for a stack gap in the Linux ABIs. This
used to hold translated system call arguments, but hasn't been used
since r159992.

Reviewed by: kib
Tested on: amd64 (amd64, i386, linux64), i386 (i386, linux)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22501
loudabi32/cloudabi32_module.c
loudabi32/cloudabi32_util.h
loudabi64/cloudabi64_module.c
loudabi64/cloudabi64_util.h
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_util.h
a32/ia32_signal.h
18bccfabd0737fe0a2e47705caf2b45789059e34 02-Dec-2019 jeff <jeff@FreeBSD.org> Fix the last few cases that grab without busy or valid. The grab functions must
return the page in some held state for consistency elsewhere.

Reviewed by: alc, kib, markj
Differential Revision: https://reviews.freebsd.org/D22610
loudabi/cloudabi_vdso.c
inux/linux_vdso.c
a5c27e7b2262e51030c526401d9b0471963c9c9b 24-Nov-2019 wulf <wulf@FreeBSD.org> Linux epoll: Allow passing of any negative timeout value to epoll_wait

Linux epoll allows any negative timeout value to be passed to epoll_wait()
to block indefinitely.

Reviewed by: emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22517
inux/linux_event.c
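
The negative-timeout rule in the epoll_wait entry above can be sketched as a small conversion helper. The function name is hypothetical, and returning NULL to mean "no deadline" is an assumption modelled on how timeout pointers are commonly passed to blocking calls; it is not taken from linux_event.c.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: translate an epoll_wait() timeout in milliseconds
 * into a timespec. Any negative value means "block indefinitely" and is
 * reported as NULL; zero and positive values are converted as given.
 */
static const struct timespec *
epoll_timeout_to_ts(int timeout_ms, struct timespec *ts)
{
	if (timeout_ms < 0)
		return (NULL);		/* not only -1: every negative value */

	ts->tv_sec = timeout_ms / 1000;
	ts->tv_nsec = (long)(timeout_ms % 1000) * 1000000L;
	return (ts);
}
```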
b52750f436314f5d8b0984eed74504b47c2360ab 24-Nov-2019 wulf <wulf@FreeBSD.org> Linux epoll: Register events with zero event mask

Such events are legal and should be interpreted as EPOLLERR | EPOLLHUP.
Register a disabled kqueue event in that case as we do not support EPOLLHUP yet.

Required by Linux Steam client.

PR: 240590
Reported by: Alex S <iwtcex@gmail.com>
Reviewed by: emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22516
inux/linux_event.c
inux/linux_event.h
73dfdf541df287556bb21078a8926b58d4ecea24 24-Nov-2019 wulf <wulf@FreeBSD.org> Linux epoll: Check both read and write kqueue events existence in EPOLL_CTL_ADD

The Linux epoll EPOLL_CTL_ADD op handler should always check registration
of both EVFILT_READ and EVFILT_WRITE kevents to decide if the supplied
file descriptor fd is already registered with the epoll instance.

Reviewed by: emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22515
inux/linux_event.c
f94c7eb751d7171839e7a292bafdeb7ca7d912b9 24-Nov-2019 wulf <wulf@FreeBSD.org> Linux epoll: Don't deregister file descriptor after EPOLLONESHOT is fired

Linux epoll does not remove the descriptor after a one-shot event has been
triggered. Set the EV_DISPATCH kqueue flag rather than EV_ONESHOT to get the
same behavior.

Required by Linux Steam client.

PR: 240590
Reported by: Alex S <iwtcex@gmail.com>
Reviewed by: emaste, imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22513
inux/linux_event.c
788b844f427a42fdee70870d3f4f0857b524d95d 20-Nov-2019 mjg <mjg@FreeBSD.org> linux: avoid overhead of P_CONTROLT checks if possible

Sponsored by: The FreeBSD Foundation
inux/linux_file.c
b317a3c030bf4508c3a9948626073e8e2216609a 18-Nov-2019 kevans <kevans@FreeBSD.org> sysent: regenerate after r354835

The lua-based makesyscalls produces slightly different output than its
makesyscalls.sh predecessor, all whitespace differences more closely
matching the source syscalls.master.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_systrace_args.c
60027726b9e566af064789b976c245b18a5b6f07 18-Nov-2019 kevans <kevans@FreeBSD.org> Convert in-tree sysent targets to use new makesyscalls.lua

flua is bootstrapped as part of the build for those on older
versions/revisions that don't yet have flua installed. Once upgraded past
r354833, "make sysent" will again naturally work as expected.

Reviewed by: brooks
Differential Revision: https://reviews.freebsd.org/D21894
loudabi32/Makefile
loudabi64/Makefile
reebsd32/Makefile
81f62ee15e1ce883c7d2ffc22b18422c08483012 18-Nov-2019 jhb <jhb@FreeBSD.org> Check for errors from copyout() and suword*() in sv_copyout_args/strings.

Reviewed by: brooks, kib
Tested on: amd64 (amd64, i386, linux64), i386 (i386, linux)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22401
loudabi32/cloudabi32_module.c
loudabi32/cloudabi32_util.h
loudabi64/cloudabi64_module.c
loudabi64/cloudabi64_util.h
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_util.h
4faee8fc9d06b64dbc8feb25b3a84b269b99723f 18-Nov-2019 dab <dab@FreeBSD.org> Jail and capability mode for shm_rename; add audit support for shm_rename

Co-mingling two things here:

* Addressing some feedback from Konstantin and Kyle re: jail,
capability mode, and a few other things
* Adding audit support as promised.

The audit support change includes a partial refresh of OpenBSM from
upstream, where the change to add shm_rename has already been
accepted. Matthew doesn't plan to work on refreshing anything else to
support audit for those new event types.

Submitted by: Matthew Bryan <matthew.bryan@isilon.com>
Reviewed by: kib
Relnotes: Yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22083
reebsd32/freebsd32_sysent.c
reebsd32/syscalls.master
e480233bc966bf7e338428723317dd8956305ba8 18-Nov-2019 trasz <trasz@FreeBSD.org> Make linux(4) open(2)/openat(2) return ELOOP instead of EMLINK,
when being passed O_NOFOLLOW. This fixes LTP testcase openat02:5.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22384
inux/linux_file.c
3f50cb74914f1f9f9cf2930d182207512a7033fc 15-Nov-2019 jhb <jhb@FreeBSD.org> Add a sv_copyout_auxargs() hook in sysentvec.

Change the FreeBSD ELF ABIs to use this new hook to copyout ELF auxv
instead of doing it in the sv_fixup hook. In particular, this new
hook allows the stack space to be allocated at the same time the auxv
values are copied out to userland. This allows us to avoid wasting
space for unused auxv entries as well as not having to recalculate
where the auxv vector is by walking back up over the argv and
environment vectors.

Reviewed by: brooks, emaste
Tested on: amd64 (amd64 and i386 binaries), i386, mips, mips64
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22355
reebsd32/freebsd32_misc.c
a32/ia32_sysvec.c
9044e67b6e8ebc05e3559c38ab3bb4ec70b72e61 15-Nov-2019 trasz <trasz@FreeBSD.org> Support O_CLOEXEC in linux(4) open(2) and openat(2).

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21966
inux/linux_file.c
7f81c60b0a14e68e7bb8a18edea81a1e6c545b94 14-Nov-2019 brooks <brooks@FreeBSD.org> Tidy syscall declarations.

Pointer arguments should be of the form "<type> *..." and not "<type>* ...".

No functional change.

Reviewed by: kevans
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22373
reebsd32/syscalls.master
474b97ac77ea7d0b657eb91b6347843e5e660279 11-Nov-2019 cognet <cognet@FreeBSD.org> linprocfs: Make sure to report -1 as tty when we have no controlling tty.

When reporting a process' stats, we can't just provide the tty as an
unsigned long, as if we have no controlling tty, the tty would be NODEV, or
-1. Instead, just special-case NODEV.

Submitted by: Juraj Lutter <otis@sk.FreeBSD.org>
MFC after: 1 week
inprocfs/linprocfs.c
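
The NODEV special case from the linprocfs entry above can be sketched as follows. Everything here is illustrative: SKETCH_NODEV is a stand-in for the kernel's NODEV, and the function name and output format are assumptions rather than the actual linprocfs code.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's NODEV ("no device") value; an assumption. */
#define SKETCH_NODEV	((uint64_t)-1)

/*
 * Hypothetical formatter for the tty field. Printing NODEV as an
 * unsigned long would yield a huge positive number, so it is
 * special-cased and reported as -1 instead.
 */
static const char *
tty_field(uint64_t dev)
{
	static char buf[32];

	if (dev == SKETCH_NODEV)
		snprintf(buf, sizeof(buf), "%d", -1);
	else
		snprintf(buf, sizeof(buf), "%lu", (unsigned long)dev);
	return (buf);
}
```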
af3a431a84bbe0d7da58a8776466c94be50c34af 07-Nov-2019 emaste <emaste@FreeBSD.org> linux_renameat2: improve flag checks

In the cases where Linux returns an error (e.g. passing in an undefined
flag) there's no need for us to emit a message. (The target of this
message is a developer working on the linuxulator, not the author of
presumably broken Linux software).

Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21606
inux/linux_file.c
inux/linux_file.h
7b2d70eda730b3e45377e5d272168685e7ec07bd 06-Nov-2019 trasz <trasz@FreeBSD.org> Make linux(4) create /dev/shm. Linux applications often expect
a tmpfs to be mounted there, and because they like to verify it's
actually a mountpoint, a symlink won't do.

Reviewed by: dchagin (earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20333
inux/linux.c
inux/linux.h
inux/linux_common.c
6698b2cae647bf633a91ec639c63c30619e4ec14 04-Nov-2019 hselasky <hselasky@FreeBSD.org> Enable device class group attributes in the LinuxKPI.

Bump the __FreeBSD_version to force recompilation of
external kernel modules due to structure change.

Differential Revision: https://reviews.freebsd.org/D21564
Submitted by: Greg V <greg@unrelenting.technology>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/sysfs.h
3878ab63de93acf0c095efb5a2896a346f76e7e5 23-Oct-2019 rstone <rstone@FreeBSD.org> Add missing M_NOWAIT flag

The LinuxKPI linux_dma code calls PCTRIE_INSERT with a
mutex held, but does not set M_NOWAIT when allocating
nodes, leading to a potential panic. All of this code
can handle an allocation failure here, so prefer an
allocation failure to sleeping on memory.

Also fix a related case where NOWAIT/WAITOK was not
specified. In this case it's not clear whether sleeping
is allowed so be conservative and assume not. There are
a lot of other paths in this code that can fail due to
a lack of memory anyway.

Differential Revision: https://reviews.freebsd.org/D22127
Reviewed by: imp
Sponsored by: Dell EMC Isilon
MFC After: 1 week
inuxkpi/common/src/linux_pci.c
ad16acfadb9196e6d3f626287de2461710b871bb 18-Oct-2019 yuripv <yuripv@FreeBSD.org> linux: futex_mtx should follow futex_list

Move futex_mtx to linux_common.ko for amd64 and aarch64 along
with respective list/mutex init/destroy.

PR: 240989
Reported by: Alex S <iwtcex@gmail.com>
inux/linux.c
inux/linux.h
inux/linux_common.c
inux/linux_futex.c
inux/linux_futex.h
fb90dea0cc91f07e9dcb8678bbfc47c7c9d98aa9 18-Oct-2019 yuripv <yuripv@FreeBSD.org> linux: provide just one instance of futex_list

Move futex_list definition to linux.c which is included once
in linux.ko (i386) and in linux_common.ko (amd64 and aarch64)
allowing 32/64 bit linux programs to access the same futexes
in the latter case.

PR: 240989
Reviewed by: dchagin
Differential Revision: https://reviews.freebsd.org/D22073
inux/linux.c
inux/linux.h
inux/linux_futex.c
inux/linux_futex.h
a229d895ec68cbfbaabee3f8f4e692ec3568f548 15-Oct-2019 hselasky <hselasky@FreeBSD.org> Fix missing epochification of the LinuxKPI after r353292.

Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/netdevice.h
e249e932a57cd3328696318135104b9216431a0a 15-Oct-2019 jeff <jeff@FreeBSD.org> (4/6) Protect page valid with the busy lock.

Atomics are used for page busy and valid state when the shared busy is
held. The details of the locking protocol and valid and dirty
synchronization are in the updated vm_page.h comments.

Reviewed by: kib, markj
Tested by: pho
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D21594
inuxkpi/common/src/linux_compat.c
918670a5ed80338daea27cde4f58668164c8b676 08-Oct-2019 dougm <dougm@FreeBSD.org> Define macro VM_MAP_ENTRY_FOREACH for enumerating the entries in a vm_map.
In case the implementation ever changes from using a chain of next pointers,
then changing the macro definition will be necessary, but changing all the
files that iterate over vm_map entries will not.

Drop a counter in vm_object.c that would have an effect only if the
vm_map entry count was wrong.

Discussed with: alc
Reviewed by: markj
Tested by: pho (earlier version)
Differential Revision: https://reviews.freebsd.org/D21882
inprocfs/linprocfs.c
f73831afa6248c0178e8a3af627f5950f7caa97f 30-Sep-2019 brooks <brooks@FreeBSD.org> Regen after r347228 and r352693.

No functional change.
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_syscall.h
loudabi32/cloudabi32_syscalls.c
loudabi32/cloudabi32_sysent.c
loudabi32/cloudabi32_systrace_args.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_syscall.h
loudabi64/cloudabi64_syscalls.c
loudabi64/cloudabi64_sysent.c
loudabi64/cloudabi64_systrace_args.c
2f6a23158ab4fe77ad6b8f3b4e1196cfec669e12 30-Sep-2019 kaktus <kaktus@FreeBSD.org> linux_renameat2: don't add extra \n on error.

linux_msg() already adds \n at the end of all messages.

Reported by: emaste, kib (mentor), mjg (mentor)
Reviewed by: kib (mentor), mjg (mentor)
Differential Revision: https://reviews.freebsd.org/D21852
inux/linux_file.c
c7fb4709b22ec656abc34ebc8c4320c451f972dd 26-Sep-2019 dab <dab@FreeBSD.org> sysent: regenerate after r352747.

Sponsored by: Dell EMC Isilon
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
edad331b448bc301ded004217132a964a8da896e 26-Sep-2019 dab <dab@FreeBSD.org> Add an shm_rename syscall

Add an atomic shm rename operation, similar in spirit to a file
rename. Atomically unlink an shm from a source path and link it to a
destination path. If an existing shm is linked at the destination
path, unlink it as part of the same atomic operation. The caller needs
the same permissions as shm_unlink to the shm being renamed, and the
same permissions for the shm at the destination which is being
unlinked, if it exists. If those fail, EACCES is returned, as with the
other shm_* syscalls.

truss support is included; audit support will come later.

This commit includes only the implementation; the sysent-generated
bits will come in a follow-on commit.

Submitted by: Matthew Bryan <matthew.bryan@isilon.com>
Reviewed by: jilles (earlier revision)
Reviewed by: brueffer (manpages, earlier revision)
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D21423
reebsd32/syscalls.master
28b3990ee569112a299f686aefe38304314a7fe6 25-Sep-2019 kevans <kevans@FreeBSD.org> compat/freebsd32: restore style after r352705 (no functional change)

The escaped newlines haven't been necessary since r339624, but this file has
not been reformatted. Restore the style.
reebsd32/syscalls.master
1d9983b2214d02ca823999ac0edbf34a06e6bb6b 25-Sep-2019 kevans <kevans@FreeBSD.org> sysent: regenerate after r352705

This also implements it, fixes kdump, and removes no longer needed bits from
lib/libc/sys/shm_open.c for the interim.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
df8ec3c1558e8c7bbc43e9a02836831190cd9c2c 25-Sep-2019 kevans <kevans@FreeBSD.org> Mark shm_open(2) as COMPAT12, succeeded by shm_open2

Implementation and regenerated files will follow.
reebsd32/syscalls.master
dd20cd52c2f73e4a7fec2b8474ca6ab0fa4bbf73 25-Sep-2019 kevans <kevans@FreeBSD.org> sysent: regenerate after r352700
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
61785cc3d49cae6919ce530cd287da9bf157e634 25-Sep-2019 kevans <kevans@FreeBSD.org> Add a shm_open2 syscall to support upcoming memfd_create

shm_open2 allows a little more flexibility than the original shm_open.
shm_open2 doesn't enforce CLOEXEC on its callers, and it has a separate
shmflag argument that can be expanded later. Currently the only shmflag is
to allow file sealing on the returned fd.

shm_open and memfd_create will both be implemented in libc to use this new
syscall.

__FreeBSD_version is bumped to indicate the presence.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D21393
reebsd32/syscalls.master
48e2e866d5d3e1ce14f29caa1e7b27832a561dad 25-Sep-2019 kevans <kevans@FreeBSD.org> [2/3] Add an initial seal argument to kern_shm_open()

Now that flags may be set on posixshm, add an argument to kern_shm_open()
for the initial seals. To maintain past behavior where callers of
shm_open(2) are guaranteed to not have any seals applied to the fd they're
given, apply F_SEAL_SEAL for existing callers of kern_shm_open. A special
flag could be opened later for shm_open(2) to indicate that sealing should
be allowed.

We currently restrict initial seals to F_SEAL_SEAL. We cannot error out if
F_SEAL_SEAL is re-applied, as this would easily break shm_open() twice to a
shmfd that already existed. A note's been added about the assumptions we've
made here as a hint towards anyone wanting to allow other seals to be
applied at creation.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D21392
loudabi/cloudabi_fd.c
199eacdc019891902a20afc188b8cadf038d7882 25-Sep-2019 kevans <kevans@FreeBSD.org> sysent: regenerate after r352693
reebsd32/freebsd32_proto.h
12110b80859e4d9a92f0b515b9229367ed6121ff 25-Sep-2019 kevans <kevans@FreeBSD.org> Add COMPAT12 support to makesyscalls.sh

Reviewed by: kib, imp, brooks (all without syscalls.master edits)
Differential Revision: https://reviews.freebsd.org/D21366
reebsd32/syscalls.master
bee4305eb617c95a636245f57f2d5f7952b8c4f1 23-Sep-2019 tijl <tijl@FreeBSD.org> Create a "drm" subdirectory for drm devices in linsysfs. Recent versions of
linux libdrm check for the existence of this directory:

https://cgit.freedesktop.org/mesa/drm/commit/?id=f8392583418aef5e27bfed9989aeb601e20cc96d

MFC after: 2 weeks
insysfs/linsysfs.c
c1fe73ee39192b29843ecfb1e8c14d6b100f8fcb 11-Sep-2019 emaste <emaste@FreeBSD.org> linux: add trivial renameat2 implementation

Just return EINVAL if flags != 0. The Linux man page documents one
case of EINVAL as "The filesystem does not support one of the flags in
flags."

After r351723 userland binaries will try using new system calls.

Reported by: mjg
Reviewed by: mjg, trasz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21590
inux/linux_file.c
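
The "trivial" renameat2 behavior described in the entry above can be sketched in userland C. This is a simplified illustration with a hypothetical function name: it omits the directory file descriptors that the real system call takes and falls back to plain rename(2) when no flags are set.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical sketch: reject every flag with EINVAL, as if the
 * filesystem supported none of them; otherwise behave like rename(2).
 * Returns 0 on success or an errno value.
 */
static int
renameat2_sketch(const char *from, const char *to, unsigned int flags)
{
	if (flags != 0)
		return (EINVAL);
	if (rename(from, to) == -1)
		return (errno);
	return (0);
}
```

Returning EINVAL for unknown flags lets callers fall back to their own handling instead of silently getting plain rename semantics.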
39803421e203f1f25b6be7d7504eb0d4431b6df2 11-Sep-2019 hselasky <hselasky@FreeBSD.org> Use true and false when dealing with bool type in the LinuxKPI.
No functional change.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_work.c
f2bc2e1d05ef7cb79d31b541c6645bc0f708346f 11-Sep-2019 hselasky <hselasky@FreeBSD.org> Fix synchronous work drain issue in the LinuxKPI.

A work callback may restart itself. Loop in the drain function to see if the
work has been rescheduled and stop the subsequent reschedules, if any.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_work.c
9bf06d7d6ba02a59a9a1b1b42a0ba251dc53e32f 11-Sep-2019 hselasky <hselasky@FreeBSD.org> Fix broken DECLARE_TASKLET() macro after r347852.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/interrupt.h
90a6cf3d6dfb9bf96ab91a9de665f819c94ca325 10-Sep-2019 jeff <jeff@FreeBSD.org> Replace redundant code with a few new vm_page_grab facilities:
- VM_ALLOC_NOCREAT will grab without creating a page.
- vm_page_grab_valid() will grab and page in if necessary.
- vm_page_busy_acquire() automates some busy acquire loops.

Discussed with: alc, kib, markj
Tested by: pho (part of larger branch)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21546
inuxkpi/common/src/linux_page.c
ccbfa8304f694595b78758646a54dbeee04f4be7 09-Sep-2019 markj <markj@FreeBSD.org> Change synchronization rules for vm_page reference counting.

There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator. In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficient as
well. These references are protected by the page lock, which must
therefore be acquired for many per-page operations. This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter. A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held. As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed. The former
requires that either the object lock or the busy lock is held. The
latter no longer has a return value and may free the page if it releases
the last reference to that page. vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold(). It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler. vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state). In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths. In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock. In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped. The DRM ports have been updated to
accommodate the KPI changes.

Reviewed by: jeff (earlier version)
Tested by: gallatin (earlier version), pho
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20486
inuxkpi/common/include/linux/mm.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_page.c
29ada79727993b8cca1cf24d52832ea0d2d460c6 06-Sep-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Improve sysfs support.

- Add functions for creating and merging sysfs groups.
- Add sysfs_streq function to compare strings ignoring newline from the
sysctl userland call.
- Add a call to sysfs_create_groups in device_add.
- Remove duplicate header include.
- Bump __FreeBSD_version.

Reviewed by: hselasky
Approved by: imp (mentor), hselasky
MFC after: 4 days
Differential Revision: D21542
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/sysfs.h
798fb0eec91b8f9f5400eb42b2b1e6420936eaf7 04-Sep-2019 trasz <trasz@FreeBSD.org> Fix /proc/mounts for autofs(5) mounts.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inprocfs/linprocfs.c
a788c25f19002e83cc351db8ac25bd05d3144d2d 03-Sep-2019 kib <kib@FreeBSD.org> Add procctl(PROC_STACKGAP_CTL)

It allows a process to request, retroactively, that the stack gap not be
applied to its stacks. It is also possible to control the gaps in the
process after exec.

PR: 239894
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21352
reebsd32/freebsd32_misc.c
a15e2ccf7a9de598a14390781644ed80e364241b 03-Sep-2019 trasz <trasz@FreeBSD.org> Make linprocfs(4) report Tgid, Linux ltrace(1) needs it.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
inprocfs/linprocfs.c
286ae5bd6b5d86148741a589f277ac299e3cff4e 03-Sep-2019 mjg <mjg@FreeBSD.org> Add sysctlbyname system call

Previously userspace would issue one syscall to resolve the sysctl and then
another one to actually use it. Do it all in one trip.

Fallback is provided in case newer libc happens to be running on an older
kernel.

Submitted by: Pawel Biernacki
Reported by: kib, brooks
Differential Revision: https://reviews.freebsd.org/D17282
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
reebsd32/syscalls.master
bce030131257645d83be6f0144a2cf68dafbf813 02-Sep-2019 trasz <trasz@FreeBSD.org> Bump Linux version to 3.2.0. Without it, binaries linked against
glibc 2.24 and up (eg Ubuntu 19.04) fail with "FATAL: kernel too old".

This alone is not enough to make newer binaries actually work;
fix/hack/workaround is pending review at https://reviews.freebsd.org/D20687.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20757
inux/linux_mib.h
efac9cd1ecb130fea28d2212d8fc206c61ccada8 02-Sep-2019 trasz <trasz@FreeBSD.org> Relax compat.linux.osrelease checks. This way one can do eg
'compat.linux.osrelease=3.10.0-957.12.1.el7.x86_64', which
corresponds to CentOS 7.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20685
inux/linux_mib.c
dc4227bd608327508538504339cd2b1df0cd9459 02-Sep-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add sysfs create/remove functions that handle multiple files in one call.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
Differential Revision: D21475
inuxkpi/common/include/linux/sysfs.h
f2d1759c50fb28fa0e2e1bd65e7bd49ee7ed4693 02-Sep-2019 hselasky <hselasky@FreeBSD.org> Use DEVICE memory instead of UNCACHEABLE on aarch64 in ioremap() in the LinuxKPI.
This fixes system hangs on reading device registers on aarch64.

Tested with: Marvell MACCHIATObin (Armada8k) + mlx4en, amdgpu
Submitted by: Greg V <greg@unrelenting.technology>
Differential Revision: https://reviews.freebsd.org/D20789
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/io.h
90c17c9d31fa70bd20b9240e0833019c14cab26c 18-Aug-2019 kib <kib@FreeBSD.org> Change locking requirements for VOP_UNSET_TEXT().

Require the vnode to be locked for the VOP_UNSET_TEXT() call. This
will be used by the following bug fix for a tmpfs issue.

Tested by: sbruno, pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
inux/linux_misc.c
bf49531f4c5727c4b7d3d45e8b8049762a657e65 17-Aug-2019 cem <cem@FreeBSD.org> Linuxkpi: Prevent easy generated ctor name conflicts with prefix

Sponsored by: Dell EMC Isilon
inuxkpi/common/include/linux/workqueue.h
b07319c849a174a85472144014aa3462f90343e0 14-Aug-2019 hselasky <hselasky@FreeBSD.org> Implement pci_enable_msi() and pci_disable_msi() in the LinuxKPI.
This patch makes the DRM graphics driver in ports usable on aarch64.

Submitted by: Greg V <greg@unrelenting.technology>
Differential Revision: https://reviews.freebsd.org/D21008
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/interrupt.h
inuxkpi/common/include/linux/pci.h
d093f3cf9534e01dcb95aa1169b343ba98a0816d 13-Aug-2019 jhb <jhb@FreeBSD.org> Fix build with DRM and INVARIANTS enabled.

The DRM drivers use the lockdep assertion macros with spinlock_t locks
which are backed by mutexes, not sx locks. This causes compile
failures since you can't use sx_assert with a mutex. Instead, change
the lockdep macros to use lock_class methods. This works by assuming
that each LinuxKPI locking primitive embeds a FreeBSD lock as its
first structure and uses a cast to get to the underlying 'struct
lock_object'.

Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20992
inuxkpi/common/include/linux/lockdep.h
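
The first-member cast described in the commit above can be illustrated in isolation. This is a minimal userspace sketch, not the LinuxKPI code: every type and macro name below (sketch_lock_object, sketch_mtx, sketch_sx, sketch_spinlock, sketch_rwsem, sketch_lockdep_is_held) and the lo_owned field are hypothetical stand-ins. It assumes only what the commit message states, namely that each wrapper embeds its backing lock as the first member, so a pointer to the wrapper can be converted to a pointer to the common lock header regardless of the lock class.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the common header at the start of every lock. */
struct sketch_lock_object {
	const char *lo_name;
	int lo_owned;		/* nonzero when the lock is held (sketch only) */
};

/* Two different hypothetical lock classes sharing the same leading header. */
struct sketch_mtx { struct sketch_lock_object lock_object; };
struct sketch_sx  { struct sketch_lock_object lock_object; };

/* Hypothetical Linux-style wrappers; the backing lock is the first member. */
struct sketch_spinlock { struct sketch_mtx m; };
struct sketch_rwsem    { struct sketch_sx sx; };

/* The cast below is only valid if the embedded lock sits at offset 0. */
static_assert(offsetof(struct sketch_spinlock, m) == 0, "lock must be first");
static_assert(offsetof(struct sketch_rwsem, sx) == 0, "lock must be first");
static_assert(offsetof(struct sketch_mtx, lock_object) == 0, "header must be first");
static_assert(offsetof(struct sketch_sx, lock_object) == 0, "header must be first");

/* One class-agnostic check that works for either wrapper type. */
#define	sketch_lockdep_is_held(lock)					\
	(((const struct sketch_lock_object *)(const void *)(lock))->lo_owned != 0)
```

Because the macro never names a concrete lock type, the same assertion compiles for a mutex-backed and an sx-backed primitive, which is the property the build fix relies on.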
b27c98fb1e944170d8d7ca8baa34dda103176df2 11-Aug-2019 kib <kib@FreeBSD.org> compat/linux: Remove obsoleted and somewhat confusing comments related to COMPAT_43.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21200
inux/linux_misc.c
inux/linux_uid16.c
6f1b496fece9a974843c18a031f3ed690a09300b 04-Aug-2019 jhibbits <jhibbits@FreeBSD.org> Fix 32-bit build again, post r350570.

Missed this part with my testing as well. Pass the right type to
BUS_TRANSLATE_RESOURCE().
inuxkpi/common/src/linux_pci.c
5a2f340a9e89b0fc6ff4d4c5acd26d7e8dac541b 04-Aug-2019 jhibbits <jhibbits@FreeBSD.org> Fix 32-bit build post-r350570

The error message prints a rman_res_t, which is an uintmax_t. Explicitly
cast, just for future-proofing, and use the correct format.
inuxkpi/common/src/linux_pci.c
1153a377f7638ed43b848fba7d5c25fe45aab378 04-Aug-2019 jhibbits <jhibbits@FreeBSD.org> Add necessary bits for Linux KPI to work correctly on powerpc

PowerPC, and possibly other architectures, use different address ranges for
PCI space vs physical address space, which is only mapped at resource
activation time, when the BAR gets written. The DRM kernel modules do not
activate the rman resources, so as not to waste KVA, instead only mapping
parts of the PCI memory at a time. This introduces a
BUS_TRANSLATE_RESOURCE() method, implemented in the Open Firmware/FDT PCI
driver, to perform this necessary translation without activating the
resource.

In addition to system KPI changes, LinuxKPI is updated to handle a
big-endian host, by adding proper endian swaps to the I/O functions.

Submitted by: mmacy
Reported by: hselasky
Differential Revision: https://reviews.freebsd.org/D21096
inuxkpi/common/include/linux/gfp.h
inuxkpi/common/include/linux/io.h
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
3df08381ed7e96b2d3bcac00c9bd296e4c8966ae 31-Jul-2019 kib <kib@FreeBSD.org> Make randomized stack gap between strings and pointers to argv/envs.

This effectively makes the stack base on the csu _start entry
randomized.

The gap is enabled if ASLR for the ABI is enabled, and then
kern.elf{64,32}.aslr.stack_gap specify the max percentage of the
initial stack size that can be wasted for gap. Setting it to zero
disables the gap, and max is capped at 50%.

Only amd64 for now.

Reviewed by: cem, markj
Discussed with: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21081
reebsd32/freebsd32_misc.c
a32/ia32_sysvec.c
4168d73dc2403a1bbe6fccc5e9910fd0a91afd07 31-Jul-2019 kib <kib@FreeBSD.org> Regen.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
3ebafdf353596f0654a6b70c5e0b8a335e9ce7ea 31-Jul-2019 kib <kib@FreeBSD.org> freebsd32 shims for copy_file_range(2).

Reviewed by: brooks, rmacklem (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21092
reebsd32/syscalls.master
fb8c9ef8339d80750d77ecc6f6a5d84ea9d129b2 31-Jul-2019 kevans <kevans@FreeBSD.org> kern_shm_open: push O_CLOEXEC into caller control

The motivation for this change is to allow wrappers around shm to be written
that don't set CLOEXEC. kern_shm_open currently accepts O_CLOEXEC but sets
it unconditionally. kern_shm_open is used by the shm_open(2) syscall, which
is mandated by POSIX to set CLOEXEC, and CloudABI's sys_fd_create1().
Presumably O_CLOEXEC is intended in the latter caller, but it's unclear from
the context.

sys_shm_open() now unconditionally sets O_CLOEXEC to meet POSIX
requirements, and a comment has been dropped in to kern_fd_open() to explain
the situation and add a pointer to where O_CLOEXEC setting is maintained for
shm_open(2) correctness. CloudABI's sys_fd_create1() also unconditionally
sets O_CLOEXEC to match previous behavior.

This also has the side-effect of making flags correctly reflect the
O_CLOEXEC status on this fd for the rest of kern_shm_open(), but a
glance-over leads me to believe that it didn't really matter.

Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21119
loudabi/cloudabi_fd.c
18afe5991f141b1a05b4318dc71579ffd5d1d1ac 29-Jul-2019 markj <markj@FreeBSD.org> Avoid relying on header pollution from sys/refcount.h.

MFC after: 3 days
Sponsored by: The FreeBSD Foundation
reebsd32/freebsd32_capability.c
9e1efe06c391682f56c405bc9676ec599bfc694c 10-Jul-2019 avg <avg@FreeBSD.org> linuxcommon: add module version

MFC after: 2 weeks
inux/linux_common.c
5c4a9b0e32c1f9c47d5b687d6036bb03c3cc071c 10-Jul-2019 tijl <tijl@FreeBSD.org> Let linuxulator mprotect mask unsupported bits before calling kern_mprotect.

After r349240 kern_mprotect returns EINVAL for unsupported bits in the prot
argument. Linux rtld uses PROT_GROWSDOWN and PROT_GROWSUP when marking the
stack executable. Mask these bits like kern_mprotect used to do. For other
unsupported bits EINVAL is returned like Linux does.

Reviewed by: trasz, brooks
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20864
inux/linux_mmap.c
inux/linux_mmap.h
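
The masking behaviour described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the code from linux_mmap.c: the function name sketch_linux_prot and the LINUX_PROT_* macro names are hypothetical, the numeric values are the ones Linux uses for these flags, and the sketch assumes that the read, write and execute bits have the same values on both systems.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Linux protection bits; macro names are local to this sketch. */
#define	LINUX_PROT_READ		0x1
#define	LINUX_PROT_WRITE	0x2
#define	LINUX_PROT_EXEC		0x4
#define	LINUX_PROT_GROWSDOWN	0x01000000
#define	LINUX_PROT_GROWSUP	0x02000000

/*
 * Hypothetical helper: silently drop the two stack-growth hints, then
 * reject anything that is not read/write/execute with EINVAL.
 */
static int
sketch_linux_prot(int prot, int *native_prot)
{
	assert(native_prot != NULL);

	/* Mask the bits that rtld passes when marking the stack executable. */
	prot &= ~(LINUX_PROT_GROWSDOWN | LINUX_PROT_GROWSUP);

	/* Any other unsupported bit is an error. */
	if ((prot & ~(LINUX_PROT_READ | LINUX_PROT_WRITE | LINUX_PROT_EXEC)) != 0)
		return (EINVAL);

	*native_prot = prot;
	return (0);
}
```

A caller would pass the resulting value on to the native mprotect path only when the helper returns 0.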
039f74039e57a5d79cda830161af71a1e9c4050e 08-Jul-2019 markj <markj@FreeBSD.org> Merge the vm_page hold and wire mechanisms.

The hold_count and wire_count fields of struct vm_page are separate
reference counters with similar semantics. The remaining essential
differences are that holds are not counted as a reference with respect
to LRU, and holds have an implicit free-on-last unhold semantic whereas
vm_page_unwire() callers must explicitly determine whether to free the
page once the last reference to the page is released.

This change removes the KPIs which directly manipulate hold_count.
Functions such as vm_fault_quick_hold_pages() now return wired pages
instead. Since r328977 the overhead of maintaining LRU for wired pages
is lower, and in many cases vm_fault_quick_hold_pages() callers would
swap holds for wirings on the returned pages anyway, so with this change
we remove a number of page lock acquisitions.

No functional change is intended. __FreeBSD_version is bumped.

Reviewed by: alc, kib
Discussed with: jeff
Discussed with: jhb, np (cxgbe)
Tested by: pho (previous version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19247
inuxkpi/common/src/linux_page.c
f6ef3ada2d6cadd27d5dc8aa067345b7e07f5a2a 04-Jul-2019 emaste <emaste@FreeBSD.org> Update Linux compat version to 2.6.36

New system calls between 2.6.32 and 2.6.36 are already implemented.

This should be mostly NFC as far as contemporary Linux applications are
concerned though, as Linux kernel 3.2 is the oldest supported by a
number of popular distros today; work is in progress by others to enable
support for those applications.

Discussed with: trasz
MFC after: 1 month
inux/linux_mib.h
fded509cb939268bdee5894b68865239c5c419f3 04-Jul-2019 trasz <trasz@FreeBSD.org> Return ENOTSUP for Linux FS_IOC_FIEMAP ioctl.

Linux man(1) calls it for no good reason; this avoids the console spam
(eg '(man): ioctl fd=4, cmd=0x660b ('f',11) is not implemented').

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20690
inux/linux_ioctl.c
inux/linux_ioctl.h
f343e8b139f9bfd4888874147083fd447c97fe84 04-Jul-2019 trasz <trasz@FreeBSD.org> Fix linuxulator prlimit64(2) with pid == 0. This makes 'ulimit -a'
return something reasonable, and helps linux binaries which attempt
to close all the files, eg apt(8).

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20692
inux/linux_misc.c
e3f6cb45c48c127a2c51f889f8ec2862511c281b 03-Jul-2019 hselasky <hselasky@FreeBSD.org> Remove dead code added after r348743 in the LinuxKPI. The
LINUXKPI_VERSION macro is not defined for any compiled LinuxKPI code
which basically means __GFP_NOTWIRED is never checked when allocating
pages. This should work fine with the existing external DRM code as
long as the page wiring and unwiring is balanced.

MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/gfp.h
inuxkpi/common/src/linux_page.c
6bcf6e3fcb7911683307dc5d033108ed0f66e97c 03-Jul-2019 markj <markj@FreeBSD.org> Remove the CDIOCREADSUBCHANNEL_SYSSPACE ioctl.

This was added for emulation of Linux's CDROMSUBCHNL, but allows
users with read access to a cd(4) device to overwrite kernel memory
provided that the driver detects some media present.

Reimplement CDROMSUBCHNL by bouncing the data from CDIOCREADSUBCHANNEL
through the linux_cdrom_subchnl structure passed from userspace.

admbugs: 768
Reported by: Alex Fortune
Security: CVE-2019-5602
Security: FreeBSD-SA-19:11.cd_ioctl
inux/linux_ioctl.c
5144f6086b02fd30a2fc7268dd0b960fdc215587 02-Jul-2019 kib <kib@FreeBSD.org> Control implicit PROT_MAX() using procctl(2) and the FreeBSD note
feature bit.

In particular, allocate the bit to opt-out the image from implicit
PROTMAX enablement. Provide procctl(2) verbs to set and query
implicit PROTMAX handling. The knobs mimic the same per-image flag
and per-process controls for ASLR.

Reviewed by: emaste, markj (previous version)
Discussed with: brooks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D20795
reebsd32/freebsd32_misc.c
f814ce2d88b2b64a56d0bd434cc986790e0892ee 21-Jun-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Additions to rcu list.

- Add rcu list functions.
- Make rcu hlist's foreach macro use rcu calls instead of the non-rcu macro.
- Bump FreeBSD version so we have a checkpoint for the vboxvideo drm driver.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
Differential Revision: D20719
inuxkpi/common/include/linux/rculist.h
1eab0f965fe9447bf06de9309ac8c517577b3a44 21-Jun-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add atomic_long_sub macro.

Reviewed by: imp (mentor), hps
Approved by: imp (mentor), hps
MFC after: 1 week
Differential Revision: D20718
inuxkpi/common/include/asm/atomic-long.h
6a9f104098c14ce29580a3e6f8462e98f31a6f68 07-Jun-2019 markj <markj@FreeBSD.org> Replace uses of vm_page_unwire(m, PQ_NONE) with vm_page_unwire_noq(m).

These calls are not the same in general: the former will dequeue the
page if it is enqueued, while the latter will just leave it alone. But,
all existing uses of the former apply to unmanaged pages, which are
never enqueued in the first place. No functional change intended.

Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20470
inuxkpi/common/src/linux_page.c
8393cc2acf7b2dac6b3645f040fd105e99678901 06-Jun-2019 markj <markj@FreeBSD.org> Make the linuxkpi's alloc_pages() consistently return wired pages.

Previously it did this only on platforms without a direct map. This
also more closely matches Linux's semantics.

Since some DRM v5.0 code assumes the old behaviour, use a
LINUXKPI_VERSION guard to preserve that until the out-of-tree module
is updated.

Reviewed by: hselasky, kib (earlier versions), johalun
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20502
inuxkpi/common/include/linux/gfp.h
inuxkpi/common/src/linux_page.c
786a38578631be34dbc23e792c51aaff8fce6532 30-May-2019 brooks <brooks@FreeBSD.org> makesyscalls.sh: always use absolute path for syscalls.conf

syscalls.conf is included using "." which per the Open Group:

If file does not contain a <slash>, the shell shall use the search
path specified by PATH to find the directory containing file.

POSIX shells don't fall back to the current working directory.

Submitted by: Nathaniel Wesley Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: bdrewery
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D20476
loudabi32/Makefile
loudabi64/Makefile
reebsd32/Makefile
a25b408b04599c6fce2193dbdb311213402b0efa 30-May-2019 dchagin <dchagin@FreeBSD.org> Complete LOCAL_PEERCRED support. Cache pid of the remote process in the
struct xucred. Do not bump XUCRED_VERSION as struct layout is not changed.

PR: 215202
Reviewed by: tijl
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20415
inux/linux_socket.c
3a2fd1de63ae212f9419b87ba5a38e09da93f996 30-May-2019 dchagin <dchagin@FreeBSD.org> Linux does not support MSG_OOB for unix(4) or non-stream oriented sockets,
return EOPNOTSUPP as Linux does.

Reviewed by: tijl
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20409
inux/linux_socket.c
bc0cda5ebb04dab606bf2f24a53eacb1a88527aa 21-May-2019 dchagin <dchagin@FreeBSD.org> Do not leak sa in linux_recvmsg() call if kern_recvit() fails.

MFC after: 1 week
inux/linux_socket.c
de381845cd93ffd3bb9642989bbf7295c74e6a99 21-May-2019 dchagin <dchagin@FreeBSD.org> Do not use uninitialised sa.

Reported by: tijl@
MFC after: 1 week
inux/linux_socket.c
05ec103068d451f2e42090eda46b5f674dbabc18 21-May-2019 dchagin <dchagin@FreeBSD.org> Do not leak sa in linux_recvfrom() call if kern_recvit() fails.

MFC after: 1 week
inux/linux_socket.c
0e503ca5778b5d477811921ec8f87ad96abe8dfb 21-May-2019 cem <cem@FreeBSD.org> Include eventhandler.h in more compilation units

This was enumerated with exhaustive search for sys/eventhandler.h includes,
cross-referenced against EVENTHANDLER_* usage with the comm(1) utility. Manual
checking was performed to avoid redundant includes in some drivers where a
common os_bsd.h (for example) included sys/eventhandler.h indirectly, but it is
possible some of these are redundant with driver-specific headers in ways I
didn't notice.

(These CUs did not show up as missing eventhandler.h in tinderbox.)

X-MFC-With: r347984
inuxkpi/common/src/linux_compat.c
80233392cc6c811031aec38f2576c93b1605fe87 19-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Finalize move of lindebugfs from ports to base.

The source file was moved to base earlier and also improved upon,
but never compiled in. This patch will:
- Make a module in sys/modules
- Make lindebugfs depend on linuxkpi (for seq_file)
- Check if read/write functions are set before calling, DRM drivers
don't always set both of them.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
indebugfs/lindebugfs.c
23bce359918b195450af3119dd3aa8e802b20727 19-May-2019 trasz <trasz@FreeBSD.org> Implement PTRACE_O_TRACESYSGOOD. This makes Linux strace(1) work.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20200
inux/linux_emul.h
inux/linux_misc.c
inux/linux_misc.h
7217903de21b1d0323d452adf11df7307a4cd878 19-May-2019 dchagin <dchagin@FreeBSD.org> Linux send() call returns EAGAIN instead of ENOTCONN when the
socket is non-blocking and connect() has not finished yet.

Initial patch developed by Steven Hartland in 2008 and adopted by me.

PR: 129169
Reported by: smh@
MFC after: 2 weeks
inux/linux_socket.c
50552705f26b85b6b77ddb014b91407788484f5d 16-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Finalize import of seq_file.

seq_file.h and linux_seq_file.c were imported from ports earlier but
linux_seq_file.c was never compiled in with the module. With this
commit base seq_file will replace ports seq_file and it required a
few modifications to not break functionality and build.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/seq_file.h
aa109efed9eec761009ceba4ce3d045107ab888f 16-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add in_task macro.

This patch is part of D19565

Reviewed by: hps, bwidawsk
Approved by: imp (mentor), hps
Obtained from: bwidawsk
MFC after: 1 week
inuxkpi/common/include/linux/preempt.h
19259f7af949ed5cf8bc17660b3fdb4f5790a5f0 16-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Fix build on powerpc/sparc.

Use cmpset instead of testandset in tasklet lock code.

Reviewed by: hps
Approved by: imp (mentor), hps
Obtained from: hps
MFC after: 1 week
inuxkpi/common/src/linux_tasklet.c
43a8682d315aad176a96290abc98605a100309a2 16-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Updates to tasklets for Linux 5.0.

DRM drivers expect tasklets to have a counter for enable/disable calls.
Also, add a few more tasklet locking functions.

This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/interrupt.h
inuxkpi/common/src/linux_tasklet.c
0efe35a512928d3fd127274649f8b88df0d58729 16-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add group_leader member to struct task_struct.

Assign self as group leader at creation to act as the only member of a
new process group.
This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_current.c
3a9128f648daca0ee750871be69fde2ae6bdf617 16-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Update access_ok macro for v5.0.

Check LINUXKPI_VERSION macro for backwards compatibility.
It's recommended to update any drivers that depend on the older KPI
so we can deprecate < 5.0 code as we update to newer Linux version.
This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/uaccess.h
inuxkpi/common/src/linux_compat.c
b267afaa6b3bc38dd70df8cbcf0fdb6e28d4d085 16-May-2019 tychon <tychon@FreeBSD.org> Allow loading the same DMA address multiple times without any prior
unload for the LinuxKPI.

Reviewed by: kib, zeising
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20181
inuxkpi/common/src/linux_pci.c
edc77597c1be66d77328ad3d1d4884786dac9b47 15-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add helper macros IS_ALIGNED and DIV_ROUND_DOWN_ULL.

This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/kernel.h
f301f6321ce2ab17d9e03b682bc21a6b6ad0b468 15-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Move {lower|upper}_32_bits macros from port to base.

This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/compiler.h
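
As an illustration of what the {lower|upper}_32_bits helpers named above are typically used for, here is a minimal sketch that splits a 64-bit bus address into two 32-bit register values. It is not the LinuxKPI definition: the sketch_ names and the descriptor layout are hypothetical, and inline functions are used instead of macros for clarity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch equivalents of lower_32_bits() and upper_32_bits(). */
static inline uint32_t
sketch_lower_32_bits(uint64_t n)
{
	return ((uint32_t)n);		/* truncation keeps bits 0..31 */
}

static inline uint32_t
sketch_upper_32_bits(uint64_t n)
{
	return ((uint32_t)(n >> 32));	/* bits 32..63 */
}

/* Hypothetical descriptor with a 64-bit address stored as two halves. */
struct sketch_desc {
	uint32_t addr_lo;
	uint32_t addr_hi;
};

static void
sketch_write_addr(struct sketch_desc *d, uint64_t addr)
{
	assert(d != NULL);
	d->addr_lo = sketch_lower_32_bits(addr);
	d->addr_hi = sketch_upper_32_bits(addr);
}
```

Taking a uint64_t argument keeps the shift well defined; a macro that accepts arbitrary integer types has to avoid shifting a 32-bit operand by 32.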
7c170b59dbd1113717548f313afc92223ded911f 15-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Include asm/atomic-long.h from atomic.h.

This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/atomic.h
dfdde018eb5788cc043a8ded41745746d3e179ec 15-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add get_random_u32 function.

This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/random.h
e4aa9931ce5c77cf042bb12d43bda0847f37ddd2 15-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Update user_access_begin for Linux v5.0.

Check the new LINUXKPI_VERSION macro for backwards compatibility.
This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/asm/uaccess.h
5478f0858cf55ce83f465d7a68462c7d35079ad4 15-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Expand ktime functionality.

Also, make ktime_get_raw call getnanouptime instead of getnanotime
to match (the correct) ktime_get_raw_ns.
This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/ktime.h
9d5b96beeb2bb72ed10918f5dd2408057af8814f 14-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add prepare to pm_ops and bump FreeBSD version.

This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/device.h
15e23b1a4ab585692bb2f7102af2e0ab2b34bbd7 14-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add vm_fault_t type.

This patch is part of D19565

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/mm_types.h
3ec4fe1cd6efa61600286bd4a0725dbe9040fa67 14-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Add context member to ww_mutex and bump FreeBSD version.

This patch is part of https://reviews.freebsd.org/D19565.

Reviewed by: hps
Approved by: imp (mentor), hps
inuxkpi/common/include/linux/ww_mutex.h
ce5e6d4d794b7883864b9711fc3497684c345e8d 14-May-2019 johalun <johalun@FreeBSD.org> LinuxKPI: Let del_timer return a value to match Linux.

This patch is part of https://reviews.freebsd.org/D19565.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/timer.h
inuxkpi/common/src/linux_compat.c
f7e99603a761273593b99380eb79e865cd274413 13-May-2019 dchagin <dchagin@FreeBSD.org> The Linuxulator depends on fundamental kernel settings such as SMP. Many
of them are listed in opt_global.h, which is not generated when building
modules outside of a kernel, so such modules never match the real configured
kernel.

So, we should prevent our users from building obviously defective modules.

Therefore, remove the root cause of building modules outside of a
kernel: the possibility of building modules with DEBUG or KTR flags.
Also remove all of the DEBUG printfs, as they are incomplete and not
informative in threaded programs; moreover, half of the system calls do not
have a DEBUG printf. For debugging Linux programs we have dtrace, ktr and ktrace.

PR: 222861
Reviewed by: trasz
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20178
inux/linux_file.c
inux/linux_fork.c
inux/linux_getcwd.c
inux/linux_ioctl.c
inux/linux_misc.c
inux/linux_misc.h
inux/linux_signal.c
inux/linux_stats.c
c8001997cef4aca8409da5342de76fde94dceb96 13-May-2019 dchagin <dchagin@FreeBSD.org> Linuxulator getpeername() returns EINVAL when namelen is less than 0.

MFC after: 2 weeks
inux/linux_socket.c
57102bcf40779603c9e7548e77d75bc66b7514a6 13-May-2019 dchagin <dchagin@FreeBSD.org> Our bsd_to_linux_sockaddr() and linux_to_bsd_sockaddr() functions
alter the userspace sockaddr to convert the format between the Linux and BSD versions.
That means a minimum of 3 copyin/copyout operations for one syscall.

Also, some syscalls use linux_sa_put() and linux_getsockaddr() when copying a
sockaddr to userspace or from userspace, respectively.

To avoid this chaos, especially converting the sockaddr in userspace,
rewrite these 4 functions to convert the sockaddr only in the kernel and leave
only 2 of these functions.

Also, in order to reduce duplication between MD parts of the Linuxulator, move
the struct sockaddr conversion functions that are MI out into the linux_common module.

PR: 232920
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20157
inux/linux.c
inux/linux.h
inux/linux_common.h
inux/linux_socket.c
inux/linux_socket.h
853098f8376d01476e29682d813d36628a131c81 10-May-2019 johalun <johalun@FreeBSD.org> Implement linux_pci_unregister_drm_driver in linuxkpi so that drm drivers
can be unloaded.

This patch is a part of D19565.

Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
8b55b827e2531072b51d9a1de1a6eac45464ad16 09-May-2019 hselasky <hselasky@FreeBSD.org> Fix memory leak of PCI BUS structure in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_pci.c
2b471c6a15529b215aecdbe4fb2fa2ea87f68500 09-May-2019 hselasky <hselasky@FreeBSD.org> Fix regression issue after r346645 in the LinuxKPI.
Make sure LinuxKPI PCI devices get a default BUSDMA tag.

Found by: Thomas Laus <lausts@acm.org>
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_pci.c
6cc096bffa360f9f99af05b558252c45bbe1d809 08-May-2019 hselasky <hselasky@FreeBSD.org> Add support for Dynamic Interrupt Moderation, DIM, in mlx5en(4).

Add support for DIM based on Linux,
with some minor adaptions specific to FreeBSD.

Linux commit
f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33

MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/net_dim.h
206ba42431462826952c1e457d0b3a3405931a13 07-May-2019 emaste <emaste@FreeBSD.org> make sysent after r347228

Regenerate to add @generated tag in generated files.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
ff74a853f511b8d9f96e533b28fadbbc7dff313d 07-May-2019 dchagin <dchagin@FreeBSD.org> Remove wrong copyright line. Discussed with Carlos Neira.

Reported by: Rodney W. Grimes
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13656
insysfs/linsysfs.c
2c9faf1048125eb00f2662bf826b10eea67f1c80 06-May-2019 dchagin <dchagin@FreeBSD.org> Adds sys/class/net devices to linsysfs.

Only two interfaces are created eth0 and lo and they expose
the following properties:
address, addr_len, flags, ifindex, mtu, tx_queue_len and type.

Initial patch developed by Carlos Neira in 2017 and finished by me.

PR: 223722
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13656
insysfs/linsysfs.c
b83adbe5912c68336dd5681ac6de5b308f366275 06-May-2019 dchagin <dchagin@FreeBSD.org> Rewrite linux_ifflags() in more readable Linuxulator style.

Reviewed by: emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20146
inux/linux.c
inux/linux.h
df8d6fbda3f397170b8bfa52ea790a7ef3d390dd 06-May-2019 dchagin <dchagin@FreeBSD.org> Complete r347052 (https://reviews.freebsd.org/D20137) as it was not
a final revision.

Fix style issues and change bool-like variables from int to bool.

Reviewed by: emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20141
inux/linux.c
97403087df1a430914780eba43d50b137fe86c82 06-May-2019 hselasky <hselasky@FreeBSD.org> Use PCIV_INVALID in pci_channel_offline() in the LinuxKPI.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: slavash@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
6e5bade211398c61685ebd1fb5ed806235d43e80 06-May-2019 hselasky <hselasky@FreeBSD.org> Disabling a PCI device should only disable busmaster in the LinuxKPI.

As the Linux comment for this function points out:
Signal to the system that the PCI device is not in use by the system
anymore. This only involves disabling PCI bus-mastering, if active.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: slavash@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
ec094306821fb0c8c621e34bdbbebdbc36aa9ccb 06-May-2019 hselasky <hselasky@FreeBSD.org> Implement print_hex_dump_debug() function macro in the LinuxKPI.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: slavash@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/printk.h
f4b58c1701bb309b65c9947a642d824934da27a6 06-May-2019 hselasky <hselasky@FreeBSD.org> Allow controlling pr_debug at runtime in the LinuxKPI.

Turning on pr_debug at compile time makes it non-optional at runtime.
This often means that the amount of debugging output is unbearable.
Allow the developer to turn on pr_debug output only when needed.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: kib@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/src/linux_compat.c
2dc0d9edaa7487c11806a0ea8cae77e3a4a79785 05-May-2019 kib <kib@FreeBSD.org> Switch to use shared vnode locks for text files during image activation.

kern_execve() locks text vnode exclusive to be able to set and clear
VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0
condition.

The change removes VV_TEXT, replacing it with the condition
v_writecount <= -1, and puts v_writecount under the vnode interlock.
Each text reference decrements v_writecount. To clear the text
reference when the segment is unmapped, it is recorded in the
vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and
v_writecount is incremented on the map entry removal.

The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that
v_writecount does not contradict the desired change. vn_writecheck()
is now racy and its use was eliminated everywhere except access.
Atomic check for writeability and increment of v_writecount is
performed by the VOP. vn_truncate() now increments v_writecount
around VOP_SETATTR() call, lack of which is arguably a bug on its own.

nullfs bypasses v_writecount to the lower vnode always, so nullfs
vnode has its own v_writecount correct, and lower vnode gets all
references, since object->handle is always lower vnode.

On the text vnode's vm object dealloc, the v_writecount value is reset
to zero, and deadfs vop_unset_text short-circuits the operation.
Reclamation of lowervp always reclaims all nullfs vnodes referencing
lowervp first, so no stray references are left.

Reviewed by: markj, trasz
Tested by: mjg, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D19923
inux/linux_misc.c
9d04d3b1a9db7fd550096049e831b9e1fdb4c65f 04-May-2019 hselasky <hselasky@FreeBSD.org> Fix regression issue after r346645 in the LinuxKPI.

The S/G list must be mapped AS-IS without any optimisations.
This also implies that sg_dma_len() must be equal to sg->length.
Many Linux drivers assume this and this fixes some DRM issues.

Put the BUS DMA map pointer into the scatter-gather list to
allow multiple mappings on the same physical memory address.

The FreeBSD version has been bumped to force recompilation of
external kernel modules.

Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/scatterlist.h
inuxkpi/common/src/linux_pci.c
11c67cb03f715c9a98b9fd8644ce774077ff2f71 04-May-2019 hselasky <hselasky@FreeBSD.org> Fix regression issue after r346645 in the LinuxKPI.
Properly handle error case when mapping DMA address fails.

Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_pci.c
90cdaa0665b339584f85be1ce3b5a4ae4db3a185 03-May-2019 dchagin <dchagin@FreeBSD.org> In order to reduce duplication between MD parts of the Linuxulator
move bits that are MI out into the headers in compat/linux.
For that remove bogus _packed attribute from struct l_sockaddr
and use MI types for struct members.

Also continue moving code that is intended for both Linuxulator modules
(both instruction sets, 32- and 64-bit) or for external modules like
linsysfs or linprocfs into the linux_common module.

To avoid header pollution introduce new sys/compat/linux_common.h header.

Reviewed by: emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20137
inux/linux.c
inux/linux.h
inux/linux_common.h
inux/linux_ioctl.c
88a2bb70029b4927fa2432bb4e833ebdf6a78112 03-May-2019 trasz <trasz@FreeBSD.org> Decode more CPU flags in cpuinfo.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20145
inprocfs/linprocfs.c
4496859fd5f9007de44dd1229b2d8143dc4df070 02-May-2019 trasz <trasz@FreeBSD.org> Fix flags in cpuinfo.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20139
inprocfs/linprocfs.c
39598707e2fd41a532a1c5379665778f78a649f9 02-May-2019 dchagin <dchagin@FreeBSD.org> Remove unneeded includes.

MFC after: 2 weeks
insysfs/linsysfs.c
c5cd7639a03a170b2f78a6383357a709704b2131 02-May-2019 trasz <trasz@FreeBSD.org> Add sys/devices/system/cpu/{possible,present} to linsysfs(5).
That makes Linux lscpu(1) work.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20131
insysfs/linsysfs.c
219791b23e761720c839a7d786b32533fd3c8056 30-Apr-2019 dchagin <dchagin@FreeBSD.org> Follow the FreeBSD and implement PDEATH_SIG prctl ops in the Linuxulator.
It was first introduced in r163734 and missed by me in r283383.

MFC after: 1 week
inux/linux_emul.c
inux/linux_emul.h
inux/linux_misc.c
57d474cf20eb2a0a4234a4e28a0c8a766663ab46 30-Apr-2019 hselasky <hselasky@FreeBSD.org> Reduce the number of mutexes after r346645 in the LinuxKPI.
Make function macro wrappers for locking and unlocking to ease readability.

No functional change.

Discussed with: kib@, tychon@ and zeising@
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_pci.c
c00a464559a9c0f7435c992deadb5ef09940b5d9 30-Apr-2019 hselasky <hselasky@FreeBSD.org> Make the dma_pool structure private to the LinuxKPI similar to Linux.

No functional change.

Discussed with: kib@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/dmapool.h
inuxkpi/common/src/linux_pci.c
97e3e4a704599f5282621726743528fc7a97dca2 30-Apr-2019 hselasky <hselasky@FreeBSD.org> Store a pointer to the device instead of the PCI device in the DMA pool
implementation in the LinuxKPI. This avoids use of container_of().

No functional change.

Discussed with: kib@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/dmapool.h
inuxkpi/common/src/linux_pci.c
722f733b1d9a8413168fd3f37c6aa318d3b4f7ec 26-Apr-2019 emaste <emaste@FreeBSD.org> make sysent after r346273 (readlinkat arg correction)

PR: 197915
Reminded by: dchagin
reebsd32/freebsd32_systrace_args.c
2a1acc988569555ec02e73603fdf1129679df33c 25-Apr-2019 johalun <johalun@FreeBSD.org> Don't call cdev_init where cdev_alloc is called. cdev_alloc already
handles initialization.

Reported by: johalun
Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19565
inuxkpi/common/src/linux_compat.c
730162b79254d39e6c7bf22c39096fcfc4cacbd5 25-Apr-2019 tychon <tychon@FreeBSD.org> LinuxKPI buildfix for ppc64 after r346645.

Proposed by: hselasky
Sponsored by: Dell EMC Isilon
inuxkpi/common/src/linux_pci.c
070cf1ede1850d8c1824181e258b6ec1ac293255 25-Apr-2019 hselasky <hselasky@FreeBSD.org> LinuxKPI buildfix for 32-bit DMA architectures after r346645.

The <sys/pctrie.h> APIs expect a 64-bit DMA key.
This is fine as long as the DMA is less than or equal to 64 bits, which
is currently the case.

Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_pci.c
6dbde879b1f13d5397bbde3b8d6f75b898fd2bed 24-Apr-2019 tychon <tychon@FreeBSD.org> LinuxKPI should use bus_dma(9) to be compatible with an IOMMU

Reviewed by: hselasky, kib
Tested by: greg@unrelenting.technology
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D19845
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/dma-mapping.h
inuxkpi/common/include/linux/dmapool.h
inuxkpi/common/include/linux/pci.h
inuxkpi/common/include/linux/scatterlist.h
inuxkpi/common/src/linux_pci.c
3ea680f19f6fceabd76f61ec9dc5c22393f46982 20-Apr-2019 emaste <emaste@FreeBSD.org> Enable ioremap for aarch64 in the LinuxKPI

Required for Mellanox drivers (e.g. on Ampere eMAG at Packet.com).

PR: 237055
Submitted by: Greg V <greg@unrelenting.technology>
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D19987
inuxkpi/common/include/linux/io.h
inuxkpi/common/src/linux_compat.c
0ae08aa89a52eb75657a0d5b83607afe03df7a71 16-Apr-2019 emaste <emaste@FreeBSD.org> correct readlinkat(2) return type

r176215 corrected readlink(2)'s return type and the type of the last
argument. readlinkat(2) was introduced in r177788 after being developed
as part of Google Summer of Code 2007; it appears to have inherited the
wrong return type.

Man pages and header files were already ssize_t; update syscalls.master
to match.

PR: 197915
Submitted by: Henning Petersen <henning.petersen@t-online.de>
MFC after: 2 weeks
reebsd32/syscalls.master
def45a363e51d353f063c8876ad00ddac259c019 06-Apr-2019 oshogbo <oshogbo@FreeBSD.org> Regen after r345982.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
20d273b44bfbde962a17360eb1be449211af58d6 06-Apr-2019 oshogbo <oshogbo@FreeBSD.org> Introduce funlinkat syscall that allows us to check if we are removing
the file associated with the given file descriptor.

Reviewed by: kib, asomers
Reviewed by: cem, jilles, brooks (they reviewed previous version)
Discussed with: pjd, and many others
Differential Revision: https://reviews.freebsd.org/D14567
loudabi/cloudabi_file.c
reebsd32/syscalls.master
inux/linux_file.c
33133b3b41a3dfd6204155c22b7d0d14039152ed 04-Apr-2019 cem <cem@FreeBSD.org> Replace read_random(9) with more appropriate arc4rand(9) KPIs

Reviewed by: ae, delphij
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D19760
inuxkpi/common/include/linux/etherdevice.h
inuxkpi/common/include/linux/random.h
05de3ee1690369c9b643c2a992c6fb23511b976a 30-Mar-2019 jah <jah@FreeBSD.org> freebsd32: fix padding of computed control message length for recvmsg()

Each control message region must be aligned on a 4-byte boundary on 32-bit
architectures. The 32-bit compat shim for recvmsg() gets the actual layout
right, but doesn't pad the payload length when computing msg_controllen for
the output message header. If a control message contains an unaligned
payload, such as the 1-byte TTL field in the example attached to PR 236737,
this can produce control message payload boundaries that extend beyond
the boundary reported by msg_controllen.

PR: 236737
Reported by: Yuval Pavel Zholkover <paulzhol@gmail.com>
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19768
reebsd32/freebsd32_misc.c
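
The length mismatch described in the commit above can be reproduced with plain arithmetic. The sketch below is a hypothetical userspace model, not the freebsd32 shim: the toy_* names are invented, and a 12-byte 32-bit control message header (three 4-byte fields) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 32-bit header: cmsg_len, cmsg_level, cmsg_type, 4 bytes each. */
#define TOY_CMSG_HDRLEN	12u
#define TOY_CMSG_ALIGN	4u

/* Round len up to a multiple of align; align must be a power of two. */
static size_t
toy_roundup(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((len + align - 1) & ~(align - 1));
}

/* Buffer space one control message occupies, trailing padding included. */
static size_t
toy_cmsg_space(size_t payload)
{
	return (TOY_CMSG_HDRLEN + toy_roundup(payload, TOY_CMSG_ALIGN));
}

/* Reported total for n equal messages when the payload is NOT padded. */
static size_t
toy_controllen_unpadded(size_t payload, size_t n)
{
	return (n * (TOY_CMSG_HDRLEN + payload));
}

/* Reported total for n equal messages when each payload IS padded. */
static size_t
toy_controllen_padded(size_t payload, size_t n)
{
	return (n * toy_cmsg_space(payload));
}

/* Offset one past the last payload byte of message n in the aligned layout. */
static size_t
toy_last_payload_end(size_t payload, size_t n)
{
	return ((n - 1) * toy_cmsg_space(payload) + TOY_CMSG_HDRLEN + payload);
}
```

Under these assumptions, two messages with a 1-byte payload are laid out at offsets 0 and 16, so the second payload ends at offset 29, while an unpadded sum reports only 26 bytes. Padding each payload length gives 32, which covers the whole layout.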
7dbe184dfcec8a18e09d1cb79a18461017ea718f 24-Mar-2019 dchagin <dchagin@FreeBSD.org> Whitespace cleanup (annoying).

MFC after: 1 month
inux/linux_fork.c
82329819bc3ea56d144c869a9dca495101ccc2ad 24-Mar-2019 dchagin <dchagin@FreeBSD.org> Update syscall.master to 5.0.

For 32-bit Linuxulator, ipc() syscall was historically
the entry point for the IPC API. Starting in Linux 4.18, direct
syscalls are provided for the IPC. Enable it.

MFC after: 1 month
inux/linux_ipc.h
70fa6829e28c8766721c842e96baed99573de08b 24-Mar-2019 dchagin <dchagin@FreeBSD.org> Linux between 4.18 and 5.0 split IPC system calls.
In preparation for doing this in the Linuxulator modify our linux_shmat()
to match actual Linux shmat() system call.

MFC after: 1 month
inux/linux_ipc.c
inux/linux_ipc.h
71140c5be468621cc47b728e33e8b875b0c0b608 16-Mar-2019 kib <kib@FreeBSD.org> amd64 KPTI: add control from procctl(2).

Add the infrastructure to allow MD procctl(2) commands, and use it to
introduce amd64 PTI control and reporting. PTI mode cannot be
modified for an existing pmap; the knob controls PTI of the new vmspace
created on exec.

Requested by: jhb
Reviewed by: jhb, markj (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D19514
reebsd32/freebsd32_misc.c
2aa7ec640fc5a900a9d55358dfeba825c52b8e10 14-Mar-2019 hselasky <hselasky@FreeBSD.org> Revert r345102 until the DRM next port issues are resolved.

Requested by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
b3ba8cfe40d20e300c81f4179bbed09a5fee9b75 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Resolve duplicate symbol name conflict after r345095, when building LINT.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/idr.h
inuxkpi/common/src/linux_idr.c
76039ff57cc7b342ddb0cecd888ec67ff58a4a82 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement sg_virt() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/scatterlist.h
aeef74ece71ea120d7ee9b251d43d3a82f0f2996 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Define SG_CHAIN and SG_END in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/scatterlist.h
f5fba2286861b9aaf71b843cb49264406b94b628 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement pr_info_ratelimited() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/printk.h
e4dfde8a8f6691b14a618d4226d51e387c7566a2 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Define some RCU debug macros in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rcupdate.h
633e7b731a2604622ef93083d372948bfc947cb8 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Honor SYSCTL function return values when creating sysfs nodes in the LinuxKPI.
Return proper error code upon failure.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sysfs.h
b424ea3997a6b476d73e1c08524a3c758e698673 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement more malloc function macros in the LinuxKPI.
Fix arguments for currently unused kvmalloc().

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/slab.h
24fd7ba4a1be1dde551ca2b7a15cc857d7d3f08b 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement more PCI speed related functions and macros in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
e491bd45abf10d6a302c6630f48da459a2ce969f 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement IS_ALIGNED() and DIV_ROUND_DOWN_ULL() function macros in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
36d6e32f9056e81611f22090c0d6d9683a003f73 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement si_meminfo() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm.h
inuxkpi/common/src/linux_page.c
0f69aa3ba105c6c65690a42aae2a02785a6954e9 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement task_euid() and get_task_state() function macros in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
41178626f0a966e96f72c4fac2689a7d40bde2ab 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement get_task_comm() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
2615457a0aca8bf7816a0ece58b1cac2251c78f5 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement current_exiting() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_current.c
06133477d68ea0a7a17608711b6bc943c748aa15 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement list_for_each_entry_from_reverse() and
list_bulk_move_tail() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/list.h
d7a752008139512d278da651136ecafa7efa44fe 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement dma_map_page_attrs() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/dma-mapping.h
13e21f62ea85125d5adfb26cbe06963fe849f524 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement ida_free() and ida_alloc_max() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/idr.h
inuxkpi/common/src/linux_idr.c
eb6b83e1b35134bf5dc22209b0a33a4f376c5b5d 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement DEFINE_STATIC_SRCU() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/srcu.h
635b9987ad4aeb30da957a4096c08028359b1dea 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement BITS_PER_TYPE() function macro in the LinuxKPI.
Fix some style while at it.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitops.h
24e1a96ba75182b5d04c07e06ca96a2afdfa6aba 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Properly define the DMA attribute values in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/dma-attrs.h
575d275104b8b51dccea034ad5189a7b252bb1c9 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement dev_err_once() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
59c1132d33582a60a5929c1e55bf2220f9488131 13-Mar-2019 hselasky <hselasky@FreeBSD.org> Implement dma_set_mask_and_coherent() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/dma-mapping.h
74f072217c40efdb995453ab6c01bf669f7e2d93 12-Mar-2019 imp <imp@FreeBSD.org> Kill tz_minuteswest and tz_dsttime.

Research Unix, 7th Edition introduced TIMEZONE and DSTFLAG
compile-time constants in sys/param.h to communicate these values for
the machine. 4.2BSD moved from compile-time to run-time, introduced
these variables, and used them in localtime() to return the right
offset from UTC (sometimes referred to as GMT; for this purpose they
are the same). 4.4BSD migrated to using the tzdata code/database and
these variables were basically unused.

FreeBSD removed the real need for these with adjkerntz in
1995. However, some RTC clocks continued to use these variables,
though they were largely unused otherwise. Later, phk centralized
most of the uses in utc_offset, but left it using both tz_minuteswest
and adjkerntz.

POSIX (IEEE Std 1003.1-2017) states in the gettimeofday specification
"If tzp is not a null pointer, the behavior is unspecified" so there's
no standards reason to retain it anymore. In fact, gettimeofday has
been marked as obsolescent, meaning it could be removed from a future
release of the standard. It is the only interface defined in POSIX
that references these two values. All other references come from the
tzdata database via tzset().

These were used to more faithfully implement early unix ABIs which
have been removed from FreeBSD. NetBSD completely eliminated
these variables years ago. Linux has migrated to tzdata as well,
though these variables technically still exist for compatibility
with unspecified older programs.

So, there's no real reason to have them these days. They are a
historical vestige that's no longer used in any meaningful way.

Reviewed By: jhb@, brooks@
Differential Revision: https://reviews.freebsd.org/D19550
reebsd32/freebsd32_misc.c
1a536e4f2450047129536463dda63fcd52ff7319 01-Mar-2019 trasz <trasz@FreeBSD.org> Remove sv_pagesize, originally introduced with r100384.

In all of the architectures we have today, we always use PAGE_SIZE.
While in theory one could define different things, none of the
current architectures do, even the ones that have transitioned from
32-bit to 64-bit like i386 and arm. Some ancient mips binaries on
other systems used 8k instead of 4k, but we don't support running
those and likely never will due to their age and obscurity.

Reviewed by: imp (who also contributed the commit message)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D19280
a32/ia32_sysvec.c
f1e006906859eeffe75fbf490b6d3313f209fa07 01-Mar-2019 bz <bz@FreeBSD.org> Add ushort and ulong to linux/types.h.

When porting code once written for Linux we find not only uints but also ushort and ulong.
Provide central typedefs as part of the linuxkpi for those as well.

Reviewed by: hselasky, emaste
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19405
inuxkpi/common/include/linux/types.h
d37c02f8a905cf4425adb3c049974c81df479d6d 23-Feb-2019 mmacy <mmacy@FreeBSD.org> import linux debugfs support

Reviewed by: hps@
MFC after: 1 week
Sponsored by: iX Systems
Differential Revision: https://reviews.freebsd.org/D19258
indebugfs/lindebugfs.c
inuxkpi/common/include/linux/debugfs.h
inuxkpi/common/include/linux/seq_file.h
inuxkpi/common/src/linux_seq_file.c
d75b15dcb36d80d1e7670a341dd73fbd680c4714 23-Feb-2019 mmacy <mmacy@FreeBSD.org> linux/fs: simplify interop and correct definition of loff_t

- offsets can be negative, so loff_t needs to be signed; using off_t rather
than the actual linux definition "long long" also simplifies interop with
the rest of the code base
- don't rely on defining "file" to "linux_file" in interface definitions
as that causes heartache with includes

Reviewed by: hps@
MFC after: 1 week
Sponsored by: iX Systems
Differential Revision: https://reviews.freebsd.org/D19274
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/types.h
cd0e324cdca133eaa3b5224f137e4d1995eb3625 22-Feb-2019 mmacy <mmacy@FreeBSD.org> lkpi: allow late binding of linux_alloc_current

Some consumers may be loosely coupled with the lkpi.
This allows them to call linux_alloc_current without
having a static dependency.

Reviewed by: hps@
MFC after: 1 week
Sponsored by: iX Systems
Differential Revision: https://reviews.freebsd.org/D19257
inuxkpi/common/include/linux/compat.h
inuxkpi/common/src/linux_current.c
68256995cdc49503948193b26e6f7a0d662b47c7 12-Feb-2019 marius <marius@FreeBSD.org> Make taskqgroup_attach{,_cpu}(9) work across architectures

So far, intr_{g,s}etaffinity(9) take a single int for identifying
a device interrupt. This approach doesn't work on all architectures
supported, as a single int isn't sufficient to globally specify a
device interrupt. In particular, with multiple interrupt controllers
in one system as found on e.g. arm and arm64 machines, an interrupt
number as returned by rman_get_start(9) may be only unique relative
to the bus and, thus, interrupt controller, a certain device hangs
off from.
In turn, this makes taskqgroup_attach{,_cpu}(9) and - internal to
the gtaskqueue implementation - taskqgroup_attach_deferred{,_cpu}()
not work across architectures. Yet in turn, iflib(4) as gtaskqueue
consumer so far doesn't fit architectures where interrupt numbers
aren't globally unique.
However, at least for intr_setaffinity(..., CPU_WHICH_IRQ, ...) as
employed by the gtaskqueue implementation to bind an interrupt to a
particular CPU, using bus_bind_intr(9) instead is equivalent from
a functional point of view, with bus_bind_intr(9) taking the device
and interrupt resource arguments required for uniquely specifying a
device interrupt.
Thus, change the gtaskqueue implementation to employ bus_bind_intr(9)
instead and intr_{g,s}etaffinity(9) to take the device and interrupt
resource arguments required respectively. This change also moves
struct grouptask from <sys/_task.h> to <sys/gtaskqueue.h> and wraps
struct gtask along with the gtask_fn_t typedef into #ifdef _KERNEL
as userland likes to include <sys/_task.h> or indirectly drags it
in - for better or worse also with _KERNEL defined -, which with
device_t and struct resource dependencies otherwise is no longer
as easily possible now.
The userland inclusion problem probably can be improved a bit by
introducing a _WANT_TASK (as well as a _WANT_MOUNT) akin to the
existing _WANT_PRISON etc., which is orthogonal to this change,
though, and likely needs an exp-run.

While at it:
- Change the gt_cpu member in the grouptask structure to be of type
int as used elsewhere for specifying CPUs (an int16_t may be too
narrow sooner or later),
- move the gtaskqueue_enqueue_fn typedef from <sys/gtaskqueue.h> to
the gtaskqueue implementation as it's only used and needed there,
- change the GTASK_INIT macro to use "gtask" rather than "task" as
argument given that it actually operates on a struct gtask rather
than a struct task, and
- let subr_gtaskqueue.c consistently use __func__ to print function
names.

Reported by: mmel
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D19139
inuxkpi/common/src/linux_tasklet.c
08849e56bae3a92bfbfee3bc1cbd6628dbd685f6 10-Feb-2019 kib <kib@FreeBSD.org> Implement Address Space Layout Randomization (ASLR)

With this change, randomization can be enabled for all non-fixed
mappings. It means that the base address for the mapping is selected
with a guaranteed amount of entropy (bits). If the mapping was
requested to be superpage aligned, the randomization honours the
superpage attributes.

Although the value of ASLR is diminishing over time as exploit authors
work out simple ASLR bypass techniques, it eliminates the trivial
exploitation of certain vulnerabilities, at least in theory. This
implementation is relatively small and happens at the correct
architectural level. Also, it is not expected to introduce
regressions in existing cases when turned off (default for now), or
cause any significant maintenance burden.

The randomization is done on a best-effort basis - that is, the
allocator falls back to a first fit strategy if fragmentation prevents
entropy injection. It is trivial to implement a strong mode where
failure to guarantee the requested amount of entropy results in
mapping request failure, but I do not consider that to be usable.

I have not fine-tuned the amount of entropy injected right now. It is
only a quantitative change that will not change the implementation. The
current amount is controlled by aslr_pages_rnd.

To not spoil coalescing optimizations, to reduce the page table
fragmentation inherent to ASLR, and to keep the transient superpage
promotion for the malloced memory, locality clustering is implemented
for anonymous private mappings, which are automatically grouped until
fragmentation kicks in. The initial location for the anon group range
is, of course, randomized. This is controlled by vm.cluster_anon,
enabled by default.

The default mode keeps the sbrk area unpopulated by other mappings,
but this can be turned off, which gives much more breathing bits on
architectures with small address space, such as i386. This is tied
with the question of following an application's hint about the mmap(2)
base address. Testing shows that ignoring the hint does not affect the
function of common applications, but I would expect more demanding
code could break. By default sbrk is preserved and mmap hints are
satisfied, which can be changed by using the
kern.elf{32,64}.aslr.honor_sbrk sysctl.

ASLR is enabled on per-ABI basis, and currently it is only allowed on
FreeBSD native i386 and amd64 (including compat 32bit) ABIs. Support
for additional architectures will be added after further testing.

Both per-process and per-image controls are implemented:
- procctl(2) adds PROC_ASLR_CTL/PROC_ASLR_STATUS;
- NT_FREEBSD_FCTL_ASLR_DISABLE feature control note bit makes it possible
to force ASLR off for the given binary. (A tool to edit the feature
control note is in development.)
Global controls are:
- kern.elf{32,64}.aslr.enable - for non-fixed mappings done by mmap(2);
- kern.elf{32,64}.aslr.pie_enable - for PIE image activation mappings;
- kern.elf{32,64}.aslr.honor_sbrk - allow to use sbrk area for mmap(2);
- vm.cluster_anon - enables anon mapping clustering.

PR: 208580 (exp runs)
Exp-runs done by: antoine
Reviewed by: markj (previous version)
Discussed with: emaste
Tested by: pho
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D5603
reebsd32/freebsd32_misc.c
a32/ia32_sysvec.c
32c9348f1fd290e782fabf2f9c388acf1ba03dc4 09-Feb-2019 kib <kib@FreeBSD.org> Normalize the declaration of i386_read_exec variable.

It is currently re-declared in sys/sysent.h, which is the wrong place for
an MD variable and causes a redeclaration error with gcc when both
sys/sysent.h and machine/md_var.h are included.

Remove it from sys/sysent.h and instead include machine/md_var.h when
needed, under #ifdef for both i386 and amd64.

Reported and tested by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
reebsd32/freebsd32_misc.c
c52e7b8bebbbd47ab75eeb31d70399099d2901fa 30-Jan-2019 avos <avos@FreeBSD.org> Fix compilation with 'option NDISAPI + device ndis' and
without 'device pccard' in the kernel config file.

PR: 171532
Reported by: Robert Bonomi <bonomi@host128.r-bonomi.com>
MFC after: 1 week
dis/ndis_var.h
dec5165be206583d7a4fb285706e8831b5d7aaf2 25-Jan-2019 hselasky <hselasky@FreeBSD.org> Add full support for PCI_ANY_ID when matching PCI IDs in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
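
Wildcard ID matching of the kind this commit refers to can be sketched as follows. This is a hypothetical model, not the LinuxKPI implementation: struct toy_pci_id, TOY_PCI_ANY_ID and the toy_* functions are invented for illustration, an all-ones value is assumed as the wildcard, and the log does not say which fields the real code compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed wildcard value: all bits set means "match anything". */
#define TOY_PCI_ANY_ID	UINT32_MAX

/* Hypothetical ID tuple, used both for table entries and for devices. */
struct toy_pci_id {
	uint32_t vendor;
	uint32_t device;
	uint32_t subvendor;
	uint32_t subdevice;
};

/* A table field matches if it is the wildcard or equals the device's value. */
static bool
toy_field_match(uint32_t want, uint32_t have)
{
	return (want == TOY_PCI_ANY_ID || want == have);
}

/* An entry matches a device only if every field matches. */
static bool
toy_pci_match(const struct toy_pci_id *entry, const struct toy_pci_id *dev)
{
	assert(entry != NULL && dev != NULL);
	return (toy_field_match(entry->vendor, dev->vendor) &&
	    toy_field_match(entry->device, dev->device) &&
	    toy_field_match(entry->subvendor, dev->subvendor) &&
	    toy_field_match(entry->subdevice, dev->subdevice));
}
```

Applying the wildcard test per field lets a driver table entry pin only the fields it cares about, for example a fixed vendor with any device and any subsystem IDs.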
f59c76260c1af80572faec13deb624cccb46a0af 22-Jan-2019 gonzo <gonzo@FreeBSD.org> [ndis] Fix unregistered use of FPU by NDIS in kernel on amd64

amd64 miniport drivers are allowed to use FPU which triggers "Unregistered use
of FPU in kernel" panic.

Wrap all variants of MSCALL with fpu_kern_enter/fpu_kern_leave. To reduce
amount of allocations/deallocations done via
fpu_kern_alloc_ctx/fpu_kern_free_ctx maintain cache of fpu_kern_ctx elements.

Based on the patch by Paul B Mahol

PR: 165622
Submitted by: Vlad Movchan <vladislav.movchan@gmail.com>
MFC after: 1 month
dis/kern_windrv.c
dis/pe_var.h
3bb289571c987005b1c9e3c7bbb0cb1a040ec873 21-Jan-2019 emaste <emaste@FreeBSD.org> linuxulator: fix stack memory disclosure in linux_sigaltstack

Most siginfo_to_lsiginfo callers already zeroed the l_siginfo_t before
calling it, but linux_waitid did not. Instead of zeroing in the called
function to address linux_waitid (as in commit 2e6ebe70), just do it in
linux_waitid.

admbugs: 765
Reported by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Reviewed by: Andrew
MFC after: 1 day
Security: Kernel stack memory disclosure
Sponsored by: The FreeBSD Foundation
inux/linux_misc.c
938cf74229bd8637bc490522bdf402a392b0d89f 21-Jan-2019 emaste <emaste@FreeBSD.org> linuxulator: fix stack memory disclosure in linux_ioctl_termio

admbugs: 765
Reported by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Reviewed by: andrew
MFC after: 1 day
Security: Kernel stack memory disclosure
Sponsored by: The FreeBSD Foundation
inux/linux_ioctl.c
294f2877ce2453446464e39509f419fbe1fd6122 21-Jan-2019 emaste <emaste@FreeBSD.org> linuxulator: fix stack memory disclosure in linux_ioctl_v4l

admbugs: 765
Reported by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Reviewed by: andrew
MFC after: 1 day
Security: Kernel stack memory disclosure
Sponsored by: The FreeBSD Foundation
inux/linux_ioctl.c
72a21ba0f62da5e86a1c0b462aeb3f5ff849a1b7 17-Jan-2019 mckusick <mckusick@FreeBSD.org> Create new EINTEGRITY error with message "Integrity check failed".

An integrity check such as a check-hash or a cross-correlation failed.
The integrity error falls between EINVAL that identifies errors in
parameters to a system call and EIO that identifies errors with the
underlying storage media. EINTEGRITY is typically raised by intermediate
kernel layers such as a filesystem or an in-kernel GEOM subsystem when
they detect inconsistencies. Uses include allowing the mount(8) command
to return a different exit value to automate the running of fsck(8)
during a system boot.

These changes make no use of the new error, they just add it. Later
commits will be made for the use of the new error number and it will
be added to additional manual pages as appropriate.

Reviewed by: gnn, dim, brueffer, imp
Discussed with: kib, cem, emaste, ed, jilles
Differential Revision: https://reviews.freebsd.org/D18765
loudabi/cloudabi_errno.c
inux/linux_errno.inc
37fd65a0e194bf556bf96f05bdf495e586f43a45 15-Jan-2019 glebius <glebius@FreeBSD.org> Fix compilation failures on different arches that have vm_machdep.c not
aware of counter_u64_t by including counter.h into uma_int.h. I'm not
happy about this inclusion, but it fixes compilation ASAP.
inuxkpi/common/src/linux_page.c
60d5d98bc3734af2d40d4dfccb5d4031a65f6ccc 15-Jan-2019 glebius <glebius@FreeBSD.org> Make uz_allocs, uz_frees and uz_fails counter(9). This removes some
atomic updates and reduces amount of data protected by zone lock.

During startup point these fields to EARLY_COUNTER. After startup
allocate them for all early zones.

Tested by: pho
inuxkpi/common/src/linux_page.c
170373a6332918aa78faf6aebfd2be28f98c01fc 13-Jan-2019 cognet <cognet@FreeBSD.org> Regenerate sysent files after having modified syscalls.master.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
cf385242dd146dcf456e4c25e52b17fb22b86cc0 13-Jan-2019 cognet <cognet@FreeBSD.org> amd64 is the only arch that doesn't require padding for 32bits syscalls, so
instead of listing every arch that requires it, just exclude amd64.
reebsd32/syscalls.master
6d8cc191f953b3680c5e5911afc66b7c1f8e6c4b 09-Jan-2019 glebius <glebius@FreeBSD.org> Mechanical cleanup of epoch(9) usage in network stack.

- Remove macros that covertly create epoch_tracker on thread stack. Such
macros are quite unsafe, e.g. they will produce buggy code if the same
macro is used in embedded scopes. Explicitly declare epoch_tracker always.

- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
locking macros to what they actually are - the net_epoch.
Keeping them as is is very misleading. They all are named FOO_RLOCK(),
while they no longer have lock semantics. Now they allow recursion and
what's more important they now no longer guarantee protection against
their companion WLOCK macros.
Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.

This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.

Discussed with: jtl, gallatin
inuxkpi/common/include/linux/inetdevice.h
9e861e433f49c93ac3cea38dff4568d208690357 08-Jan-2019 markj <markj@FreeBSD.org> Specify the correct option level when emulating SO_PEERCRED.

Our equivalent to SO_PEERCRED, LOCAL_PEERCRED, is implemented at
socket option level 0, not SOL_SOCKET.

PR: 234722
Submitted by: Dániel Bakai <bakaidl@gmail.com>
MFC after: 2 weeks
inux/linux_socket.c
7263d0bea2c295671a9549a4690f1ffb208e29e0 01-Jan-2019 cem <cem@FreeBSD.org> linuxkpi: Remove extraneous NULL check on M_WAITOK allocation

The check was not introduced in r342628, but the subsequent unchecked access to
refs was added then, prompting a Coverity warning about "Null pointer
dereferences (FORWARD_NULL)." The warning is bogus due to M_WAITOK, but so is
the NULL check that hints it, so just remove it.

CID: 1398588
Reported by: Coverity
inuxkpi/common/include/linux/cdev.h
67253db60f21526a486cc0ee5e8313eda1376c32 30-Dec-2018 kib <kib@FreeBSD.org> Fix 32bit gcc builds after r342625.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
855498f6718967f982b6ea980db99b0f567da2ae 30-Dec-2018 kib <kib@FreeBSD.org> Fix linux_destroy_dev() behaviour when there are still files open from
the destroying cdev.

Currently linux_destroy_dev() waits for the reference count on the
linux cdev to drain, and each open file holds a reference.
Practically it means that linux_destroy_dev() is blocked until all
userspace processes that have the cdev open, exit. FreeBSD devfs does
not have such problem, because device refcount only prevents freeing
of the cdev memory, and separate 'active methods' counter blocks
destroy_dev() until all threads leave the cdevsw methods. After that,
attempts to enter cdevsw methods are refused with an error.

Implement somewhat similar mechanism for LinuxKPI cdevs. Demote cdev
refcount to only mean a hold on the linux cdev memory. Add sirefs
count to track both number of threads inside the cdev methods, and for
single-bit indicator that cdev is being destroyed. In the latter case,
the call is redirected to the dummy cdev.

Reviewed by: markj
Discussed with: hselasky
Tested by: zeising
MFC after: 1 week
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D18606
inuxkpi/common/include/linux/cdev.h
inuxkpi/common/src/linux_compat.c
75010b9ec7383db3c352e3c129c9860d81ddf122 30-Dec-2018 kib <kib@FreeBSD.org> Implement zap_vma_ptes() for managed device objects.

Reviewed by: markj
Discussed with: hselasky
Tested by: zeising
MFC after: 1 week
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D18606
inuxkpi/common/include/linux/mm.h
inuxkpi/common/src/linux_compat.c
97e77a56ba39fb634fbac4c0a2ac9c6c0be94658 30-Dec-2018 kib <kib@FreeBSD.org> Use IDX_TO_OFF().

Reviewed by: markj
Discussed with: hselasky
Tested by: zeising
MFC after: 1 week
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D18606
inuxkpi/common/src/linux_compat.c
c39e5a04869dd7380aba830f3e95d7d999d87c5a 19-Dec-2018 mjg <mjg@FreeBSD.org> Remove iBCS2, part2: general kernel

Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation
a32/ia32_sysvec.c
a3c153e5af20f7fb650e0fc7aa819b8ca11c8590 18-Dec-2018 brooks <brooks@FreeBSD.org> const poison the `new` pointer of __sysctl.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18444
reebsd32/syscalls.master
7e31d1de7edc43eb4db8f4e6248b0441a44c1e5e 11-Dec-2018 mjg <mjg@FreeBSD.org> Remove unused argument to priv_check_cred.

Patch mostly generated with Coccinelle:

@@
expression E1,E2;
@@

- priv_check_cred(E1,E2,0)
+ priv_check_cred(E1,E2)

Sponsored by: The FreeBSD Foundation
inux/linux_misc.c
inux/linux_uid16.c
b2b1b7040b4d191f6aff6b2ef9a0671d29a6a4b5 10-Dec-2018 hselasky <hselasky@FreeBSD.org> Remove no longer needed ifdefs in the LinuxKPI, after r341787.

Differential Revision: https://reviews.freebsd.org/D18450
Reviewed by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic-long.h
inuxkpi/common/include/asm/atomic.h
0f30bb9a10aecf878378bafd36c7ac6e27cd2fc1 07-Dec-2018 kib <kib@FreeBSD.org> Regen.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
48d91fd889f7286f7b0099adb2eaeb8056eed5b4 07-Dec-2018 kib <kib@FreeBSD.org> Add new file handle system calls.

Namely, getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2). The
syscalls are provided for a NFS userspace server (nfs-ganesha).

Submitted by: Jack Halford <jack@gandi.net>
Sponsored by: Gandi.net
Tested by: pho
Feedback from: brooks, markj
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18359
reebsd32/syscalls.master
9de02645349d7a6bc189732f324882c38fcfe98d 05-Dec-2018 hselasky <hselasky@FreeBSD.org> Remove redundant declaration after r341517.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/idr.h
324b1064530fb57a799a25a5077ccc625e39c042 05-Dec-2018 hselasky <hselasky@FreeBSD.org> Fix build of LinuxKPI on some platforms after r341518.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic-long.h
ba01a3ba32e67f638fd976c0b5bdfeeae305da37 05-Dec-2018 slavash <slavash@FreeBSD.org> mlx5: Fix driver version location

Driver description should be set by core and not by the Ethernet driver.

Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_pci.c
5413daa1a0268094815c248f4da063012224f1e4 05-Dec-2018 slavash <slavash@FreeBSD.org> ibcore: ip6_dev_find() needs to know the scope ID.

Else the wrong network device can be returned for link-local addresses.

Submitted by: hselasky@
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/inetdevice.h
5a511ab0cd2f64e6ab03b194b8b34f17dc44e61b 05-Dec-2018 slavash <slavash@FreeBSD.org> linuxkpi: Really check if PCI is offline

Currently we always return false for the PCI offline query.
Try to read the PCI config; if the return value is 0xffff, the
PCI device is probably offline.

Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
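
The offline check described in the entry above can be sketched as follows. This is a minimal illustration, not the LinuxKPI code: every name is hypothetical, the config-space read is replaced by a function pointer, and the choice of the vendor ID register is an assumption (the commit message only says that a config read returning 0xffff probably means the device is offline).

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not LinuxKPI or FreeBSD APIs. */
#define	SKETCH_VENDOR_REG	0x00	/* assumed register: vendor ID offset */
#define	SKETCH_ALL_ONES		0xffff	/* value read back from an absent device */

/* Stand-in for a 16-bit PCI config-space read. */
typedef uint16_t (*sketch_cfg_read16)(int reg);

/* Simulated device that answers config reads with a valid-looking ID. */
static uint16_t
sketch_read_present(int reg)
{
	return (reg == SKETCH_VENDOR_REG ? 0x15b3 : 0);
}

/* Simulated device that has dropped off the bus: every read is all ones. */
static uint16_t
sketch_read_absent(int reg)
{
	(void)reg;
	return (SKETCH_ALL_ONES);
}

/* Report the device as offline when the config read returns all ones. */
static bool
sketch_pci_offline(sketch_cfg_read16 read16)
{
	return (read16(SKETCH_VENDOR_REG) == SKETCH_ALL_ONES);
}
```

The check is a heuristic, which matches the "probably" in the commit message: all ones is what a read typically returns when no device responds, so it is taken as evidence that the device is gone.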
b2f6cad4de352970a0c625bd3505f453513dbf31 05-Dec-2018 slavash <slavash@FreeBSD.org> linuxkpi: properly implement netif_carrier_ok().

Submitted by: kib@
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/netdevice.h
de304ffb7658522150f447a8a214c2d318f38ef9 05-Dec-2018 slavash <slavash@FreeBSD.org> linuxkpi: Fix for use-after-free when tearing down character devices.

Make sure we hold a reference on the character device for every opened file
to prevent the character device from being freed prematurely.

Submitted by: hselasky@
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/cdev.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/src/linux_compat.c
94db4c71a7221273bd9597fce818143e0d96af7c 05-Dec-2018 slavash <slavash@FreeBSD.org> linuxkpi: implement idr_is_empty() and ida_is_empty().

Submitted by: kib@
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/idr.h
abd55a40797a2b6256970eed31ea1cbcee814a97 03-Dec-2018 kib <kib@FreeBSD.org> Improve procstat reporting for the linux cdev file descriptors.

If there is a vnode attached to the linux file, use it to fill
kinfo_file. Otherwise, report a new KF_TYPE_DEV file type, without
supplying any type-specific information.

KF_TYPE_DEV is supposed to be used by most devfs-specific file types.

Sponsored by: Mellanox Technologies
MFC after: 1 week
inuxkpi/common/src/linux_compat.c
54c8f3c8e75f259594c8e1f3ee85f3ec7de2a1d8 29-Nov-2018 brooks <brooks@FreeBSD.org> Add helper functions to copy strings into struct image_args.

Given a zeroed struct image_args with an allocated buf member,
exec_args_add_fname() must be called to install a file name (or NULL).
Then zero or more calls to exec_args_add_arg() followed by zero or
more calls to exec_args_add_env(). exec_args_adjust_args() may be
called after args and/or env to allow an interpreter to be prepended to
the argument list.

To allow code reuse when adding arg and env variables, begin_envv
should be accessed with the accessor exec_args_get_begin_envv()
which handles the case when no environment entries have been added.

Use these functions to simplify exec_copyin_args() and
freebsd32_exec_copyin_args().

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15468
loudabi32/cloudabi32_module.c
loudabi64/cloudabi64_module.c
reebsd32/freebsd32_misc.c
c293a729d8e1847878c5ecd993cd8938e5e3bc3b 24-Nov-2018 markj <markj@FreeBSD.org> Pass malloc flags directly through kevent(2) subroutines.

Some kevent functions have a boolean "waitok" parameter for use when
calling malloc(9). Replace them with the corresponding malloc() flags:
the desired behaviour is known at compile-time, so this eliminates a
couple of conditional branches, and makes the code easier to read.

No functional change intended.

Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18318
inux/linux_event.c
11416ef4a8a41630a61b02e681c9e35d4ea07a7f 21-Nov-2018 bwidawsk <bwidawsk@FreeBSD.org> linuxkpi: Use pageproc instead of vmproc

According to markj@:
pageproc contains the page daemon and laundry threads, which are
responsible for managing the LRU page queues and writing back dirty
pages. vmproc's main task is to swap out kernel stacks when the system
is under memory pressure, and swap them back in when necessary. It's a
somewhat legacy component of the system and isn't required. You can
build a kernel without it by specifying "options NO_SWAPPING" (which is
a somewhat misleading name), in which vm_swapout_dummy.c is compiled
instead of vm_swapout.c.

Based on this, we want pageproc to emulate kswapd, not vmproc.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D18061
inuxkpi/common/include/linux/swap.h
0c21a36e53111f8800d6a5002f8710f9605817ab 20-Nov-2018 bwidawsk <bwidawsk@FreeBSD.org> linuxkpi: Remove duplicated text

Somehow this got botched while moving from git -> svn
inuxkpi/common/include/linux/swap.h
82426acbddc9a9f22233d0df60d47464af45fc22 20-Nov-2018 bwidawsk <bwidawsk@FreeBSD.org> linuxkpi: Add some basic swap functions

These are used by kms-drm to determine various heuristics related to
memory conditions.

The number of free swap pages is just a variable, and it can be
much cheaper by either adding a new getter, or simply extern'ing
swap_total. However, this patch opts to use the more expensive,
existing interface - since this isn't an operation in a high performance
path.

This allows us to remove some more gpl linuxkpi and do the following in
kms-drm:
git rm linuxkpi/gplv2/include/linux/swap.h

Reviewed by: mmacy, Johannes Lundberg <johalun0@gmail.com>
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D18052
inuxkpi/common/include/linux/swap.h
679845ea2018bae1ea99ead6d34fbdec75863ee3 20-Nov-2018 tijl <tijl@FreeBSD.org> Fix another user address dereference in linux_sendmsg syscall.

This was hidden behind the LINUX_CMSG_NXTHDR macro which dereferences its
second argument. Stop using the macro as well as LINUX_CMSG_FIRSTHDR. Use
the size field of the kernel copy of the control message header to obtain
the next control message.

PR: 217901
MFC after: 2 days
X-MFC-With: r340631
inux/linux_socket.c
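
The idea in the entry above, stepping to the next control message using only the length held in the kernel copy of the header, can be sketched as below. This is an illustration under stated assumptions, not the linux_socket.c code: the structure layout, the 8-byte alignment, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control message header; the real Linux layout differs by ABI. */
struct sketch_cmsghdr {
	uint32_t	cmsg_len;	/* header plus payload, in bytes */
	int32_t		cmsg_level;
	int32_t		cmsg_type;
};

/* Assumed alignment of 8 bytes between consecutive messages. */
#define	SKETCH_CMSG_ALIGN(n)	(((n) + 7u) & ~(size_t)7u)
#define	SKETCH_NO_MORE		((size_t)-1)

/*
 * Given the offset of the current header inside a control buffer of
 * "total" bytes and a kernel copy of that header, return the offset of
 * the next header, or SKETCH_NO_MORE.  Only the kernel copy is read; the
 * user buffer is never dereferenced here.
 */
static size_t
sketch_next_cmsg(size_t cur, size_t total, struct sketch_cmsghdr khdr)
{
	size_t step;

	if (cur > total)
		return (SKETCH_NO_MORE);
	/* Reject lengths shorter than a header or longer than what is left. */
	if (khdr.cmsg_len < sizeof(khdr) || khdr.cmsg_len > total - cur)
		return (SKETCH_NO_MORE);
	step = SKETCH_CMSG_ALIGN((size_t)khdr.cmsg_len);
	/* No next message unless a whole header fits after this one. */
	if (step > total - cur || total - cur - step < sizeof(khdr))
		return (SKETCH_NO_MORE);
	return (cur + step);
}
```

A caller would copy each header in at the returned offset before calling the helper again, so every length it acts on has already been validated in kernel memory.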
823217168b94e653cfc59ce6051b436c359e2344 19-Nov-2018 tijl <tijl@FreeBSD.org> Do proper copyin of control message data in the Linux sendmsg syscall.

Instead of calling m_append with a user address, allocate an mbuf cluster
and copy data into it using copyin. For the SCM_CREDS case, instead of
zeroing a stack variable and appending that to the mbuf, zero part of the
mbuf cluster directly. One mbuf cluster is also the size limit used by
the FreeBSD sendmsg syscall (uipc_syscalls.c:sockargs()).

PR: 217901
Reviewed by: kib
MFC after: 3 days
inux/linux_socket.c
4493b1d3a8f47ad6c92e200da5e09186519767e0 16-Nov-2018 mjg <mjg@FreeBSD.org> proc: always store parent pid in p_oppid

Doing so removes the dependency on proctree lock from sysctl process list
export which further reduces contention during poudriere -j 128 runs.

Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17825
inux/linux_fork.c
c67a6ac9325ec9e1704d21cbdfbb909c4d8d7a6b 16-Nov-2018 hselasky <hselasky@FreeBSD.org> Define asm macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
77f03b40f3c6e75b21517da0018cc2c05c8271a2 16-Nov-2018 hselasky <hselasky@FreeBSD.org> Implement ktime_get_ts64() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/ktime.h
d696b58dd0563d9553cee8b9564bbc3140bfd116 14-Nov-2018 brooks <brooks@FreeBSD.org> Use the main capabilities.conf for freebsd32.

Allow the location of capabilities.conf to be configured.

Also allow a per-abi syscall prefix to be configured with the
abi_func_prefix syscalls.conf variable and check syscalls against
entries in capabilities.conf with and without the prefix amended.

Take advantage of these two features to allow use of a shared
capabilities.conf between the default syscall vector and the freebsd32
compatibility layer. We've been inconsistent about keeping the two in sync as
evidenced by the bugs fixed in r340294. This eliminates that problem
going forward.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17932
reebsd32/Makefile
reebsd32/capabilities.conf
reebsd32/syscalls.conf
860e8821627f0f861a9ed714161eab77eb618115 09-Nov-2018 brooks <brooks@FreeBSD.org> Regen after r340302: Fix freebsd32 mknod(at).

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17928
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
3f1281ac33901874986a1c751c678bed1459c052 09-Nov-2018 brooks <brooks@FreeBSD.org> Fix freebsd32 mknod(at).

As dev_t is now a 64-bit integer, it requires special handling as a
system call argument. 64-bit arguments are split between two 64-bit
integers due to the way arguments are promoted to allow reuse of most
system call implementations. They must be reassembled before use.
Further, 64-bit arguments at an odd offset (counting from zero) are
padded and slid to the next slot on powerpc and mips. Fix the
non-COMPAT11 system call by adding a freebsd32_mknodat() and
appropriately padded declarations.

The COMPAT11 system calls are fully compatible with the 64-bit
implementations so remove the freebsd32_ versions.

Use uint32_t consistently as the type of the old dev_t. This matches
the old definition.

Reviewed by: kib
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17928
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
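
The reassembly and padding described in the mknod(at) entry above can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the freebsd32 code. Which half arrives first depends on the ABI's byte order, so the caller is assumed to know which slot holds the low half.

```c
#include <stdint.h>

/*
 * Hypothetical helper: rebuild a 64-bit value (such as the new dev_t)
 * from the two 32-bit halves it was split into for the 32-bit ABI.
 */
static uint64_t
sketch_join_halves(uint32_t lo, uint32_t hi)
{
	return (((uint64_t)hi << 32) | lo);
}

/*
 * Hypothetical helper: on ABIs that pad (the commit names powerpc and
 * mips), a 64-bit argument that would start at an odd slot, counting
 * from zero, is moved to the next slot.  Other ABIs leave it in place.
 */
static unsigned
sketch_slot_for_64bit_arg(unsigned slot, int abi_pads)
{
	if (abi_pads && (slot & 1) != 0)
		return (slot + 1);
	return (slot);
}
```

For example, a 64-bit argument that would start at slot 3 begins at slot 4 on a padding ABI, which is why the entry above mentions appropriately padded declarations.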
76d653ddc281e7a56a48e0fb523c6a2ff3e88d0c 09-Nov-2018 brooks <brooks@FreeBSD.org> Regen after r340294: Fix a number of bugs in freebsd32's capabilities.conf.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17925
reebsd32/freebsd32_sysent.c
f0dc7b8bf0f34e6b134e4d43d053c30efa7ed5d3 09-Nov-2018 brooks <brooks@FreeBSD.org> Fix a number of bugs in freebsd32's capabilities.conf.

Bugs range from failure to update after changing syscall implementation
names to using the wrong name. Somewhat confusingly, the name in
capabilities.conf is exactly the string that appears in syscalls.master,
not the name with a COMPAT* prefix which is the actual function name.

Found while making a change to use the default capabilities.conf.

Fixes: r335177, r336980, r340272, r340274, others
Reviewed by: kib, emaste
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17925
reebsd32/capabilities.conf
86a1796939b6290122eb377d198882b99fc5088f 09-Nov-2018 brooks <brooks@FreeBSD.org> Regen after r340274: Make freebsd32_umtx_op follow the freebsd32_foo
convention.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
c1262215a1962f5ca94b2c1da50f0cf49215bae3 09-Nov-2018 brooks <brooks@FreeBSD.org> Make freebsd32_umtx_op follow the freebsd32_foo convention.

Sponsored by: DARPA, AFRL
reebsd32/syscalls.master
847735d1681b56700d0c24f92e6f41bd82ec786f 09-Nov-2018 brooks <brooks@FreeBSD.org> Regen after 340272: Make __sysctl follow the freebsd32_foo convention

Sponsored by: DARPA, AFRL
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
cdf64d808c86c7ec5cfdcd9dc73781088fdbd30d 09-Nov-2018 brooks <brooks@FreeBSD.org> Make __sysctl follow the freebsd32_foo convention.

Sponsored by: DARPA, AFRL
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
aece9729f02f718536d4f3a688f9f170109883c6 07-Nov-2018 brooks <brooks@FreeBSD.org> Regen after r340221: allow pointer return types.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17873
reebsd32/freebsd32_systrace_args.c
b47f3a58882f07c9974bf1237373367ef9d5b04e 07-Nov-2018 brooks <brooks@FreeBSD.org> makesyscalls.sh: allow pointer return types.

The previous code required that the return type be a single word. This
allows it to be a pointer without using a typedef.

Update the return types of break, mmap, and shmat to be void * as
declared. This only affects systrace output in-tree, but can aid in
generating system call wrappers from syscalls.master.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17873
reebsd32/syscalls.master
c2135dd9fd45a9788c7605a2e2a4fc20240cd4b2 06-Nov-2018 brooks <brooks@FreeBSD.org> Regen after r340199: Use declared types for caddr_t arguments.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17852
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_systrace_args.c
a7587890604d97703f514d1afb4cd660fc4e4192 06-Nov-2018 brooks <brooks@FreeBSD.org> Use declared types for caddr_t arguments.

Leave ptrace(2) alone for the moment as it's defined to take a caddr_t.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17852
reebsd32/syscalls.master
inux/linux_socket.c
2e8a4f13901604b26d9fce432137b129bf21cb6e 06-Nov-2018 oshogbo <oshogbo@FreeBSD.org> Remove ppoll. freebsd32 doesn't define a ppoll syscall.

Reported by: jhb
reebsd32/capabilities.conf
748e69ed71c783051df7d7eb132ba8972c51bdca 06-Nov-2018 oshogbo <oshogbo@FreeBSD.org> Regenerate after r340195.
reebsd32/freebsd32_sysent.c
66c6b4665c6c91e008a2f785b994a4ac251c187f 06-Nov-2018 oshogbo <oshogbo@FreeBSD.org> capsicum: Add ppoll and freebsd32_ppoll to compat32.

PR: 232495
Pointed out by: brooks
MFC after: 2 weeks
reebsd32/capabilities.conf
a184b5f4e929ff3c60f618b335cc54e0d2602496 06-Nov-2018 tijl <tijl@FreeBSD.org> On amd64 both Linux compat modules, linux.ko and linux64.ko, provide
linux_ioctl_(un)register_handler that allows other driver modules to
register ioctl handlers. The ioctl syscall implementation in each Linux
compat module iterates over the list of handlers and forwards the call to
the appropriate driver. Because the registration functions have the same
name in each module it is not possible for a driver to support both 32 and
64 bit linux compatibility.

Move the list of ioctl handlers to linux_common.ko so it is shared by
both Linux modules and all drivers receive both 32 and 64 bit ioctl calls
with one registration. These ioctl handlers normally forward the call
to the FreeBSD ioctl handler which can handle both 32 and 64 bit.

Keep the special COMPAT_LINUX32 ioctl handlers in linux.ko in a separate
list for now and let the ioctl syscall iterate over that list first.
Later, COMPAT_LINUX32 support can be added to the 64 bit ioctl handlers
via a runtime check for ILP32 like is done for COMPAT_FREEBSD32 and then
this separate list would disappear again. That is a much bigger effort
however and this commit is meant to be MFCable.

This enables linux64 support in x11/nvidia-driver*.

PR: 206711
Reviewed by: kib
MFC after: 3 days
inux/linux_common.c
inux/linux_ioctl.c
inux/linux_ioctl.h
36dddf6dc2ed6e70b85781bf5a2f129a52b018bc 02-Nov-2018 brooks <brooks@FreeBSD.org> Regen after r340080: Add const to input-only char * arguments.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17812
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_systrace_args.c
f101633ffdf59a145eec1b915fb7ed2eff27be9a 02-Nov-2018 brooks <brooks@FreeBSD.org> Add const to input-only char * arguments.

These arguments are mostly paths handled by NAMEI*() macros which already
take const char * arguments.

This change improves the match between syscalls.master and the public
declarations of system calls.

Reviewed by: kib (prior version)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17812
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_util.h
reebsd32/syscalls.master
164b2a66dd843fc98b48f3c4d1a552506a8143cd 01-Nov-2018 brooks <brooks@FreeBSD.org> Regen after r340034: Use mode_t when the documented signature does.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17784
reebsd32/freebsd32_systrace_args.c
5ce65aa394482102251292bf34ca155c858f17d1 01-Nov-2018 brooks <brooks@FreeBSD.org> Use mode_t when the documented signature does.

This is more clear and produces better results when generating function
stubs from syscalls.master.

Reviewed by: kib, emaste
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17784
reebsd32/syscalls.master
bfe3a86c2fc58a4edadb432c242449e1bac91b7b 01-Nov-2018 bwidawsk <bwidawsk@FreeBSD.org> linuxkpi: Add GFP flags needed for ttm drivers

Submitted by: Johannes Lundberg <johalun0@gmail.com>
Requested by: bwidawsk
MFC after: 3 days
Approved by: emaste (mentor)
inuxkpi/common/include/linux/gfp.h
b485fc68f9d029ff455285b4af09d523c56c23e4 30-Oct-2018 hselasky <hselasky@FreeBSD.org> Implement the dump_stack() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/src/linux_compat.c
d95a6e938d4ffcb53ad009a4bd931f6010e49d9c 30-Oct-2018 hselasky <hselasky@FreeBSD.org> Implement __KERNEL_DIV_ROUND_UP() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
7e348cdc514bb7473105380cb925ef77c4f5024f 29-Oct-2018 hselasky <hselasky@FreeBSD.org> Implement dma_pool_zalloc() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/dmapool.h
9867e6d8fe1cac34121e5f2df9357363f5659617 26-Oct-2018 brooks <brooks@FreeBSD.org> Move 32-bit compat support for FIODGNAME to the right place.

ioctl(2) commands only have meaning in the context of a file descriptor
so translating them in the syscall layer is incorrect.

The new handler uses an accessor to retrieve/construct a pointer from
the last member of the passed structure and relies on type punning to
access the other member which requires no translation.

Unlike r339174 this change supports both places FIODGNAME is handled.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17475
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
de9d57cf384b5a6c821d06102c467a23ca8a0691 25-Oct-2018 kib <kib@FreeBSD.org> Implement O_BENEATH and AT_BENEATH.

Flags prevent open(2) and *at(2) vfs syscalls name lookup from
escaping the starting directory. Supposedly the interface is similar
to the same proposed Linux flags.

Reviewed by: jilles (code, previous version of manpages), 0mp (manpages)
Discussed with: allanjude, emaste, jonathan
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D17547
loudabi/cloudabi_file.c
inux/linux_file.c
94492cb893e8c9640c2c7debc8d4ece7f7b358ec 22-Oct-2018 brooks <brooks@FreeBSD.org> Remove __restrict qualifiers from syscalls.master.

The restrict qualifier is intended to aid code generation in the
compiler, but the only access to storage through these pointers is via
structs using copyin/copyout and the like which can not be written in C
or C++ and thus the compiler gains nothing from the qualifiers.

As such, the qualifiers add no value in current usage.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17574
reebsd32/syscalls.master
babda82d29e6cab6d0c7de82d29bfc115343397c 22-Oct-2018 tijl <tijl@FreeBSD.org> Define linuxkpi readq for 64-bit architectures. It is used by drm-kmod.
Currently the compiler picks up the definition in machine/cpufunc.h.

Add compiler memory barriers to read* and write*. The Linux x86
implementation of these functions uses inline asm with "memory" clobber.
The Linux x86 implementation of read_relaxed* and write_relaxed* uses the
same inline asm without "memory" clobber.

Implement ioread* and iowrite* in terms of read* and write* so they also
have memory barriers.

Qualify the addr parameter in write* as volatile.

Like Linux, define macros with the same name as the inline functions.

Only define 64-bit versions on 64-bit architectures because generally
32-bit architectures can't do atomic 64-bit loads and stores.

Regroup the functions a bit and add brief comments explaining what they do:
- __raw_read*, __raw_write*: atomic, no barriers, no byte swapping
- read_relaxed*, write_relaxed*: atomic, no barriers, little-endian
- read*, write*: atomic, with barriers, little-endian

Add a comment that says our implementation of ioread* and iowrite*
only handles MMIO and does not support port IO.

Reviewed by: hselasky
MFC after: 3 days
inuxkpi/common/include/linux/io.h
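
The difference between the barrier and relaxed variants described in the entry above can be sketched as below. This is an illustration only, assuming a GCC- or Clang-compatible compiler for the empty inline asm with a "memory" clobber. The function names are hypothetical, only the 32-bit width is shown, and the little-endian conversion mentioned in the commit is omitted.

```c
#include <stdint.h>

/* Compiler-only barrier: an empty asm statement with a "memory" clobber. */
#define	sketch_barrier()	__asm__ __volatile__("" ::: "memory")

/* Relaxed read: a single volatile load, no compiler barrier. */
static inline uint32_t
sketch_readl_relaxed(const volatile void *addr)
{
	return (*(const volatile uint32_t *)addr);
}

/* Read with compiler barriers on both sides of the load. */
static inline uint32_t
sketch_readl(const volatile void *addr)
{
	uint32_t v;

	sketch_barrier();
	v = *(const volatile uint32_t *)addr;
	sketch_barrier();
	return (v);
}

/* Write with compiler barriers; addr is volatile-qualified as in the commit. */
static inline void
sketch_writel(uint32_t v, volatile void *addr)
{
	sketch_barrier();
	*(volatile uint32_t *)addr = v;
	sketch_barrier();
}
```

The barrier stops the compiler from reordering other memory accesses across the register access. It does not emit any CPU fence instruction.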
cbbb57703a710565ca8b0fbececd730054ed9931 15-Oct-2018 kevans <kevans@FreeBSD.org> Correct COMPAT* macro names in syscalls.master

Both ^/sys/compat/freebsd32/syscalls.master and ^/sys/kern/syscalls.master
cited "COMPAT[n] #ifdef" instead of "COMPAT_FREEBSD[n] #ifdef" in places.

Approved by: re (glebius)
reebsd32/syscalls.master
6711c221836029e8edb828ba05fe96f1ca16a8d5 09-Oct-2018 brooks <brooks@FreeBSD.org> Regenerated assorted syscall related files after:
- r327895: Implement 'domainset'...
- r329876: Use linux types for linux-specific syscalls

Diff generated with:
find . -name syscalls.conf | xargs dirname | \
xargs -n1 -I DIR make -C DIR sysent

Approved by: re (kib)
Sponsored by: DARPA, AFRL
loudabi32/cloudabi32_proto.h
loudabi64/cloudabi64_proto.h
eb4c557ad16aee54a80da44af6f4ea2f7df9d768 04-Oct-2018 brooks <brooks@FreeBSD.org> Revert r339174: Move 32-bit compat support for FIODGNAME to the right place.

A case was missed in this commit which breaks sshing into a 32-bit sshd
on a 64-bit system.

Approved by: re (gjb)
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
e62dcc082e537b58e52d39ee9d0e2b2210cbfcac 03-Oct-2018 brooks <brooks@FreeBSD.org> Move 32-bit compat support for FIODGNAME to the right place.

ioctl(2) commands only have meaning in the context of a file descriptor
so translating them in the syscall layer is incorrect.

The new handler uses an accessor to retrieve/construct a pointer from
the last member of the passed structure and relies on type punning to
access the other member which requires no translation.

Reviewed by: kib
Approved by: re (rgrimes, gjb)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Review: https://reviews.freebsd.org/D17388
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
3a94dca87f22c5a4cf559a126a0c19e16abf4c5c 02-Oct-2018 brooks <brooks@FreeBSD.org> Move 32-bit compat support for CDIOREADTOCENTRYS to the right place.

ioctl(2) commands only have meaning in the context of a file descriptor
so translating them in the syscall layer is incorrect.

The new handler uses an accessor to retrieve/construct a pointer from
the last member of the passed structure and relies on type punning to
access the other members which require no translation.

Reviewed by: kib (prior version), jhb
Approved by: re (rgrimes)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Review: https://reviews.freebsd.org/D17378
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
8b293ebe2a355273f80a5f9944a4ef8e17a5a8c7 28-Sep-2018 jhb <jhb@FreeBSD.org> Regenerate after UNIMPL -> OBSOL changes in r339001.

Approved by: re (gjb)
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
a5ac7b0f85ba434f258d92d53e2cff8f280cd6bf 28-Sep-2018 jhb <jhb@FreeBSD.org> Mark various removed system calls as OBSOL instead of UNIMPL.

This is mostly a cosmetic change except that obsolete system calls are
assigned meaningful names in the names arrays which means that using
tools like kdump or truss against binaries invoking these system calls
will print out the name instead of the number. The script I use to
generate the XML list of syscalls for GDB also ignores UNIMPL but not
OBSOL entries. In general UNIMPL should only be used to reserve
placeholders for system calls that have never been implemented while
system calls that existed at one time in FreeBSD but were removed
should be marked OBSOL instead.

Reviewed by: brooks, kib, imp
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17344
reebsd32/syscalls.master
ef4d131c1225339cf8797ba16c69bb860a1565fe 27-Sep-2018 brooks <brooks@FreeBSD.org> Centralize compat support for PCIOCGETCONF.

The pre-7.x compat for both native and 32-bit code was already in
pci_user.c. Use this infrastructure to implement 32-bit support.
This is more correct as ioctl(2) commands only have meaning in the
context of a file descriptor.

Reviewed by: kib
Approved by: re (gjb)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential revision: https://reviews.freebsd.org/D17324
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
8eec7acbb922b3d99d0e63fc19f66d08d09db874 13-Sep-2018 royger <royger@FreeBSD.org> x86bios: use M_NOWAIT with mallocs

Or else it triggers the following bug:

APIC: CPU 6 has ACPI ID 6
APIC: CPU 7 has ACPI ID 7
panic: vm_wait in early boot
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff826ff8d0
vpanic() at vpanic+0x1a3/frame 0xffffffff826ff930
panic() at panic+0x43/frame 0xffffffff826ff990
vm_wait_domain() at vm_wait_domain+0xf9/frame 0xffffffff826ff9c0
kmem_alloc_contig_domain() at kmem_alloc_contig_domain+0x252/frame 0xffffffff826ffa50
kmem_alloc_contig() at kmem_alloc_contig+0x6c/frame 0xffffffff826ffad0
contigmalloc() at contigmalloc+0x2e/frame 0xffffffff826ffb00
x86bios_modevent() at x86bios_modevent+0x225/frame 0xffffffff826ffb20
module_register_init() at module_register_init+0xc0/frame 0xffffffff826ffb50
mi_startup() at mi_startup+0x118/frame 0xffffffff826ffb70
start_kernel() at start_kernel+0x10

While there also make x86bios_unmap_mem idempotent.

Reviewed by: kib
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D17000
86bios/x86bios.c
760ca794f4e8c761d9a3ce865c577299783daf87 28-Aug-2018 kib <kib@FreeBSD.org> Regen after r338357.

Approved by: re (gjb)
reebsd32/freebsd32_sysent.c
ee03ba29cf46a664e9d152e790f9039e9e8661d5 28-Aug-2018 kib <kib@FreeBSD.org> Fix compat32 ftruncate cap mode after ino64.

Reported by: asomers
PR: 230120
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
reebsd32/capabilities.conf
3799d78beb4cf81baac99c1256126e10696fc4e3 25-Aug-2018 alc <alc@FreeBSD.org> Eliminate the arena parameter to kmem_free(). Implicitly this corrects an
error in the function hypercall_memfree(), where the wrong arena was being
passed to kmem_free().

Introduce a per-page flag, VPO_KMEM_EXEC, to mark physical pages that are
mapped in kmem with execute permissions. Use this flag to determine which
arena the kmem virtual addresses are returned to.

Eliminate UMA_SLAB_KRWX. The introduction of VPO_KMEM_EXEC makes it
redundant.

Update the nearby comment for UMA_SLAB_KERNEL.

Reviewed by: kib, markj
Discussed with: jeff
Approved by: re (marius)
Differential Revision: https://reviews.freebsd.org/D16845
inuxkpi/common/include/linux/dma-mapping.h
inuxkpi/common/src/linux_page.c
4ce21fcbeaf66e2031419afa4737448e99284ce0 21-Aug-2018 alc <alc@FreeBSD.org> Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter
became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)

Reviewed by: kib, markj
Discussed with: jeff, re@
Differential Revision: https://reviews.freebsd.org/D16825
inuxkpi/common/src/linux_page.c
71b5b012c470799caab3128c9b365585d08aa7e2 20-Aug-2018 alc <alc@FreeBSD.org> Eliminate kmem_alloc_contig()'s unused arena parameter.

Reviewed by: hselasky, kib, markj
Discussed with: jeff
Differential Revision: https://reviews.freebsd.org/D16799
inuxkpi/common/include/linux/dma-mapping.h
inuxkpi/common/src/linux_page.c
dis/subr_ntoskrnl.c
2acd1f2a253c451d21abf2cf5c1ff3fb5d8b6645 18-Aug-2018 delphij <delphij@FreeBSD.org> Regen after r337998.
reebsd32/freebsd32_sysent.c
4f62d03ca0c8df743c1995c9bf2c811f403f40e1 18-Aug-2018 delphij <delphij@FreeBSD.org> getrandom(2) should not be restricted in capability mode.
reebsd32/capabilities.conf
6b9aac38ce7615372458edbc16044dd43dac9d0c 16-Aug-2018 jamie <jamie@FreeBSD.org> Revert r337922, except for some documentation-only bits. This needs to wait
until user is changed to stop using jail(2).

Differential Revision: D14791
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
reebsd32/syscalls.master
94a36bb7c1dcba3dbdb1a5df1342fa4788322f29 16-Aug-2018 jamie <jamie@FreeBSD.org> Put jail(2) under COMPAT_FREEBSD11. It has been the "old" way of creating
jails since FreeBSD 7.

Along with the system call, put the various security.jail.allow_foo and
security.jail.foo_allowed sysctls partly under COMPAT_FREEBSD11 (or
BURN_BRIDGES). These sysctls had two disparate uses: on the system side,
they were global permissions for jails created via jail(2) which lacked
fine-grained permission controls; inside a jail, they're read-only
descriptions of what the current jail is allowed to do. The first use
is obsolete along with jail(2), but keep them for the second, read-only use.

Differential Revision: D14791
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
reebsd32/syscalls.master
042e93c61a4eafb50b272d122fd1ebe2330be6cb 09-Aug-2018 hselasky <hselasky@FreeBSD.org> Use atomic_fcmpset_XXX() instead of atomic_cmpset_XXX() when possible
in the LinuxKPI.
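
As a userspace illustration of why the fcmpset form suits retry loops:
atomic_fcmpset_int(9) writes the value it observed back into the
"expected" variable when the exchange fails, so the loop does not need a
separate reload. C11's atomic_compare_exchange_weak() has the same
shape and is used below as a stand-in. The function name
dec_if_positive_sketch is hypothetical, and this is not the LinuxKPI
code.

```c
#include <stdatomic.h>

/*
 * Hypothetical userspace sketch: decrement *v only if the result would
 * stay non-negative. Returns the would-be new value; a negative return
 * means nothing was stored.
 */
static int
dec_if_positive_sketch(atomic_int *v)
{
	int old = atomic_load(v);
	int next;

	do {
		next = old - 1;
		if (next < 0)
			break;
		/* On failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(v, &old, next));
	return (next);
}
```

With a plain cmpset-style primitive the loop body would have to re-read
the variable after every failed attempt.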

Suggested by: mjg @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic-long.h
inuxkpi/common/include/asm/atomic.h
inuxkpi/common/include/asm/atomic64.h
inuxkpi/common/include/linux/bitops.h
b6bc7ec4e1552cc2665102b97158a2789ca07da3 07-Aug-2018 marius <marius@FreeBSD.org> Update the list of architectures having atomic_fcmpset_{8,16,64}(9) and
atomic_swap_{64,int}(9) respectively as of r337433.
inuxkpi/common/include/asm/atomic.h
inuxkpi/common/include/asm/atomic64.h
7a979485ab0753cfd7296338fb6a13f9360df5f8 07-Aug-2018 markj <markj@FreeBSD.org> Improve handling of control message truncation.

If a recvmsg(2) or recvmmsg(2) caller doesn't provide sufficient space
for all control messages, the kernel sets MSG_CTRUNC in the message
flags to indicate truncation of the control messages. In the case
of SCM_RIGHTS messages, however, we were failing to dispose of the
rights that had already been externalized into the recipient's file
descriptor table. Add a new function and mbuf type to handle this
cleanup task, and use it any time we fail to copy control messages
out to the recipient. To simplify cleanup, control message truncation
is now only performed at control message boundaries.

The change also fixes a few related bugs:
- Rights could be leaked to the recipient process if an error occurred
while copying out a message's contents.
- We failed to set MSG_CTRUNC if the truncation occurred on a control
message boundary, e.g., if the caller received two control messages
and provided only the exact amount of buffer space needed for the
first.

PR: 131876
Reviewed by: ed (previous version)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16561
loudabi/cloudabi_sock.c
reebsd32/freebsd32_misc.c
inux/linux_socket.c
4c75978b8f820bd5172ffc683201ba8b98583ac5 06-Aug-2018 hselasky <hselasky@FreeBSD.org> Implement current_work() function in the LinuxKPI.

Tested by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
inuxkpi/common/include/linux/workqueue.h
inuxkpi/common/src/linux_work.c
860046d78abacebe8a8bbb59150d5e5a67c6a847 06-Aug-2018 hselasky <hselasky@FreeBSD.org> Implement atomic_long_cmpxchg() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic-long.h
0c38e4ae83b37893fe55e4a830322f2e3f543bb9 06-Aug-2018 hselasky <hselasky@FreeBSD.org> Define __poll_t type in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/types.h
50ad879db7edfb5b67f6b2f5fe202ed2dea9980a 03-Aug-2018 hselasky <hselasky@FreeBSD.org> Implement ktime_add_ms() and ktime_before() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/ktime.h
0a6bb4dddd38c5e79b65db096c7d24fa9e9b34c5 01-Aug-2018 hselasky <hselasky@FreeBSD.org> Don't refer to non-existing atomic functions, even though not compiled,
in the LinuxKPI.

Found by: rpolka @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic.h
cfe7aa28cb6cecb2eb924b4ad2950fb18a866ce1 01-Aug-2018 kib <kib@FreeBSD.org> Add ioctl to conveniently mmap a PCI device BAR into userspace.

Add the ioctl PCIOCBARMMAP on /dev/pci to conveniently create
userspace mapping of a PCI device BAR. This is enormously superior to
reading the BAR value with PCIOCREAD and then trying to mmap /dev/mem, and
should allow mapped BARs to be activated automatically when needed in
the future.

Current implementation creates new sg pager for each user mmap
request. If the pointer (and reference) to a managed device pager is
stored in pci_map, we would be able to revoke all mappings on the BAR
deactivation or relocation. This is related to the unimplemented BAR
activation on mmap, and is postponed for the future.

Discussed with: imp, jhb
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D15583
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
13ce146f42cf8b7956fabb5e68338c5acfea98c5 31-Jul-2018 kib <kib@FreeBSD.org> Regenerate after r336980.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
756fea054a1f75281e3a30deec10f623cc0fae9e 31-Jul-2018 kib <kib@FreeBSD.org> Provide compat32 shims for sched_rr_get_interval(2).

The interface uses struct timespec, which needs a translation.

Reported and reviewed by: asomers
PR: 230175
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D16525
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
b3776cb8de3ed109d5b9625fc934a7face4f69a7 30-Jul-2018 asomers <asomers@FreeBSD.org> Make timespecadd(3) and friends public

The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.

Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
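
A minimal sketch of the three-argument shape, written as a function
rather than a macro so it can be compiled anywhere. The name ts_add3 is
hypothetical, and the code assumes both inputs are already normalized
(0 <= tv_nsec < 1000000000).

```c
#include <time.h>

/* Hypothetical helper in the three-argument style: *res = *a + *b. */
static void
ts_add3(const struct timespec *a, const struct timespec *b,
    struct timespec *res)
{
	res->tv_sec = a->tv_sec + b->tv_sec;
	res->tv_nsec = a->tv_nsec + b->tv_nsec;
	/* Carry one second if the nanosecond sum overflowed. */
	if (res->tv_nsec >= 1000000000L) {
		res->tv_sec++;
		res->tv_nsec -= 1000000000L;
	}
}
```

The older two-argument form accumulated into its first operand, so
callers that needed to keep both inputs had to copy one of them first.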

Bump _FreeBSD_version due to the breaking KPI change.

Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
inux/linux_event.c
inux/linux_futex.c
inux/linux_misc.c
inux/linux_socket.c
inuxkpi/common/include/linux/time.h
0bdfcd97cbed88d74d43e7834716930fded6f38e 29-Jul-2018 asomers <asomers@FreeBSD.org> freebsd32_getrusage(2): skip freebsd32_rusage_out on error

PR: 230153
Reported by: kib
MFC after: 2 weeks
X-MFC-With: 336871
Differential Revision: https://reviews.freebsd.org/D16500
reebsd32/freebsd32_misc.c
f537dbc2c9ec0d0e8d290a854f4ef1a48c714126 29-Jul-2018 asomers <asomers@FreeBSD.org> getrusage(2): fix return value under 32-bit emulation

According to the man page, getrusage(2) should return EFAULT if the rusage
argument lies outside of the process's address space. But due to an
oversight in r100384, that's never been the case during 32-bit emulation.
Fix it.

PR: 230153
Reported by: tests(7)
Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16500
reebsd32/freebsd32_misc.c
3b45d6cd60275c53788a43d5c95301c79cab4c6a 10-Jul-2018 brooks <brooks@FreeBSD.org> Regen after r336171.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
228301019d26d9a58fe72f692eed22b45d15df3e 10-Jul-2018 brooks <brooks@FreeBSD.org> Get rid of netbsd_lchown and netbsd_msync syscall entries.

No valid FreeBSD binary ever called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

This is a respin of r335983 with a workaround for the ancient BFD linker
in the libc stubs.

Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16193
reebsd32/syscalls.master
ef0811c8894556145e56b806d38a321a840c4548 07-Jul-2018 imp <imp@FreeBSD.org> Create PCI_MATCH and pci_match_device

Create a convenience function to match PCI device IDs. It's about 15
years overdue.

Differential Revision: https://reviews.freebsd.org/D15999
insysfs/linsysfs.c
ae591a440e9e9b2ea0896b3dab1f73b076770da7 05-Jul-2018 andrew <andrew@FreeBSD.org> Create a new macro for static DPCPU data.

On arm64 (and possibly other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.

In preparation to fix this create two macros depending on if the data is
global or static.

Reviewed by: bz, emaste, markj
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D16140
inuxkpi/common/src/linux_idr.c
inuxkpi/common/src/linux_rcu.c
inuxkpi/common/src/linux_tasklet.c
b11bbbe3ee153c26408ba4c2ae843e936baf25db 05-Jul-2018 brooks <brooks@FreeBSD.org> Revert r335983.

The bfd linker in tree doesn't support multiple names for the same
symbol (at least with current flags).
reebsd32/syscalls.master
3585a5abe3a4c7d2aca98b7715c7236af5f72bfd 05-Jul-2018 brooks <brooks@FreeBSD.org> Get rid of netbsd_lchown and netbsd_msync syscall entries.

No valid FreeBSD binary ever called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15814
reebsd32/syscalls.master
0f1e196e487605f0f9a73ce1289e5f3390677f14 03-Jul-2018 oshogbo <oshogbo@FreeBSD.org> Regen after 335900.

PR: 228671
reebsd32/freebsd32_sysent.c
c34b7cd0a2e3e41daf792388c2e6c98786795813 03-Jul-2018 oshogbo <oshogbo@FreeBSD.org> capsicum: add getdirentries to the freebsd32 compat

There is a getdirentries syscall in freebsd32 and it's
capability ready so allow calling it in the capability mode.

PR: 228671
reebsd32/capabilities.conf
ad717aa1587d722c8be48058c7e08f33ddbcb47a 27-Jun-2018 emaste <emaste@FreeBSD.org> Split kern_break from sys_break and use it in linuxulator

Previously the linuxulator's linux_brk invoked the FreeBSD sys_break
syscall implementation directly. Instead, move the bulk of the existing
implementation to kern_break, and call that from both sys_break and
linux_brk.

This also addresses a minor bug in linux_brk in that we now return the
actual (rounded up) break address, rather than the requested value.

Reviewed by: brooks (earlier version)
Sponsored by: Turing Robotic Industries
Differential Revision: https://reviews.freebsd.org/D16019
inux/linux_misc.c
1d6e4247415d264485ee94b59fdbc12e0c566fd0 25-Jun-2018 emaste <emaste@FreeBSD.org> Quiet unused fn warning for linuxulator w/o legacy syscalls

Sponsored by: Turing Robotic Industries
inux/linux_sysctl.c
4063facb4a4b8b60a45eccdf2a4f32b072cf6a53 22-Jun-2018 chuck <chuck@FreeBSD.org> Fix output of linprocfs stat entry

The Linux /proc/stat entry has grown over time

v2.5.41 <
user, nice, system, idle
v2.5.41
user, nice, system, idle, iowait, irq
v2.6.11
user, nice, system, idle, iowait, irq, softirq, steal
v2.6.24
user, nice, system, idle, iowait, irq, softirq, steal, guest
v2.6.32 >
user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice

Some applications (e.g. nodejs) depend on the correct number of entries
and will abort otherwise.

Fix is to print the correct number of entries based on the value of
osrelease set either in sysctl or the jail settings. Change is similar
to approach used by illumos.
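
The selection can be sketched as a lookup on the emulated kernel
version. The macro KVER and the function stat_cpu_field_count are
hypothetical names, and the sketch reads "v2.6.32 >" in the table above
as "newer than 2.6.32", which is an assumption.

```c
/* Linux-style version code, packed as in KERNEL_VERSION(a, b, c). */
#define KVER(a, b, c)	(((a) << 16) + ((b) << 8) + (c))

/* Number of per-CPU columns to print for a given emulated version. */
static int
stat_cpu_field_count(int kver)
{
	if (kver < KVER(2, 5, 41))
		return (4);	/* user, nice, system, idle */
	if (kver < KVER(2, 6, 11))
		return (6);	/* + iowait, irq */
	if (kver < KVER(2, 6, 24))
		return (8);	/* + softirq, steal */
	if (kver <= KVER(2, 6, 32))
		return (9);	/* + guest */
	return (10);		/* + guest_nice */
}
```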

Reviewed by: emaste, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15858
inprocfs/linprocfs.c
360d52c6285e5d81073160f85ebfb0154af6ae32 22-Jun-2018 chuck <chuck@FreeBSD.org> Fix the Linux kernel version number calculation

The Linux compatibility code was converting the version number (e.g.
2.6.32) in two different ways and then comparing the results.

The linux_map_osrel() function converted MAJOR.MINOR.PATCH similar to
what FreeBSD does natively. I.e. where major=v0, minor=v1, and patch=v2
v = v0 * 1000000 + v1 * 1000 + v2;

The LINUX_KERNVER() macro, on the other hand, converted the value with
bit shifts. I.e. where major=a, minor=b, and patch=c
v = (((a) << 16) + ((b) << 8) + (c))

The Linux kernel uses the latter format via the KERNEL_VERSION() macro in
include/generated/uapi/linux/version.h

Fix is to use the LINUX_KERNVER() macro in linux_map_osrel() as well as
in the .trans_osrel functions.
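
To make the mismatch concrete, the two packings described above can be
written side by side. The function names are hypothetical; only the
arithmetic comes from this message.

```c
/* Decimal packing, as described for the old linux_map_osrel(). */
static int
osrel_decimal(int v0, int v1, int v2)
{
	return (v0 * 1000000 + v1 * 1000 + v2);
}

/* Shift packing, as described for LINUX_KERNVER(). */
static int
osrel_shifted(int a, int b, int c)
{
	return ((a << 16) + (b << 8) + c);
}
```

For 2.6.32 the first form yields 2006032 and the second 132640
(0x20620), so comparing a value produced by one against a threshold
produced by the other gives meaningless results.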

PR: 229209
Reviewed by: emaste, cem, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15952
inux/linux_mib.c
8880abe5ff7c958f1a7b20bae3f881fa0b961640 21-Jun-2018 kib <kib@FreeBSD.org> linux_clone_thread: mark new thread as TDB_BORN.

So that the ptrace code will catch it and report it to the attached
debugger. Enables debugging of threaded Linux binaries with the FreeBSD
debugger.

Submitted by: Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D15880
inux/linux_fork.c
c1c1eec65b08ac7fd58c35e855ebe22a91a6bdb8 19-Jun-2018 emaste <emaste@FreeBSD.org> linuxulator: handle V3 capget/capset

Linux 2.6.26 introduced 64-bit capability sets. Extend our stub
implementation to handle both 32- and 64-bit. (We still report no
capabilities in capget, and disallow any in capset.)
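
A sketch of the version dispatch such a stub needs. The version
constants are quoted from memory of Linux's linux/capability.h and
should be verified against that header; the SKETCH_ names and the
function are hypothetical and not taken from the linuxulator.

```c
#include <stdint.h>

/* Assumed values of _LINUX_CAPABILITY_VERSION_{1,2,3}. */
#define SKETCH_CAP_VERSION_1	0x19980330u	/* 32-bit sets */
#define SKETCH_CAP_VERSION_2	0x20071026u	/* 64-bit sets */
#define SKETCH_CAP_VERSION_3	0x20080522u	/* 64-bit sets */

/*
 * Return how many 32-bit data elements accompany the header for a
 * given version, or 0 if the version is not recognized.
 */
static int
cap_u32s_for_version(uint32_t version)
{
	switch (version) {
	case SKETCH_CAP_VERSION_1:
		return (1);
	case SKETCH_CAP_VERSION_2:
	case SKETCH_CAP_VERSION_3:
		return (2);
	default:
		return (0);
	}
}
```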

Reviewed by: chuck
Sponsored by: Turing Robotic Industries Inc.
Differential Revision: https://reviews.freebsd.org/D15887
inux/linux_misc.c
90ed98421e967aa5ee8f93c5554083cbb0537ecd 18-Jun-2018 emaste <emaste@FreeBSD.org> linuxulator: add debugging for invalid capget/capset version

Sponsored by: Turing Robotic Industries Inc.
inux/linux_misc.c
d3d3a0ddfe86ff2d4f860209cfe3e936777c6bd3 18-Jun-2018 emaste <emaste@FreeBSD.org> linsysfs: depend on linux_common module on arm64, as on amd64

Sponsored by: Turing Robotic Industries
insysfs/linsysfs.c
4c5756cf390d34a1ef6b1506a841f4427cc0c58a 17-Jun-2018 dim <dim@FreeBSD.org> Fix build of ndis with base gcc on i386

Casting from rman_res_t to a pointer results in "cast to pointer from
integer of different size" warnings with base gcc on i386, so use an
intermediate cast to uintptr_t to suppress it. In this case, the I/O
port range is effectively limited to the range of 0..65535.
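
The pattern looks roughly like this. res_sketch_t is a hypothetical
64-bit stand-in for rman_res_t, and res_to_ptr is an illustrative name,
not a function from ndis.

```c
#include <stdint.h>

/* Hypothetical stand-in for an integer type wider than a 32-bit pointer. */
typedef uint64_t res_sketch_t;

static void *
res_to_ptr(res_sketch_t r)
{
	/* Narrow to pointer width explicitly, then convert to a pointer. */
	return ((void *)(uintptr_t)r);
}
```

The intermediate cast makes the narrowing explicit, which is only safe
when the value is known to fit, as with I/O port numbers here.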

Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D15746
dis/subr_ndis.c
37f8f0ac3b4c3cda0bc03717521fd5ce30d23c36 15-Jun-2018 chuck <chuck@FreeBSD.org> Add linprocfs support for min_free_kbytes

This adds linprocfs support for proc/sys/vm/min_free_kbytes which the
free program requires for correct operation. The approach mirrors the
approach used in illumos.

Reviewed by: imp (mentor), emaste
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15563
inprocfs/linprocfs.c
bc07d95d9396788fb076a505131afe147ceeee7d 15-Jun-2018 emaste <emaste@FreeBSD.org> linuxulator: do not include legacy syscalls on arm64

Existing linuxulator platforms (i386, amd64) support legacy syscalls,
such as non-*at ones like open, but arm64 and other new platforms do
not.

Wrap these in #ifdef LINUX_LEGACY_SYSCALLS, #defined in the MD linux.h
files. We may need finer grained control in the future but this is
sufficient for now.

Reviewed by: andrew
Sponsored by: Turing Robotic Industries
Differential Revision: https://reviews.freebsd.org/D15237
inux/linux_event.c
inux/linux_file.c
inux/linux_fork.c
inux/linux_misc.c
inux/linux_stats.c
inux/linux_sysctl.c
68ef93383a7540be64cec6a1a969eea7e2ab9da1 15-Jun-2018 emaste <emaste@FreeBSD.org> Correct debug control for linuxulator faccessat

The Linuxulator provides per-syscall debug control via the
compat.linux.debug sysctl. There's generally a 1:1 mapping between
sysctl setting and syscall, but faccessat was controlled by the access
setting, perhaps due to copy-paste.

Sponsored by: Turing Robotic Industries
inux/linux_file.c
a5564ac823134f50402aff3f7d60b90bc3112996 15-Jun-2018 kib <kib@FreeBSD.org> linprocfs: add TracerPid to /proc/pid/status.
Also fix the value of parent pid if the process is traced.

Submitted by: Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after: 1 week
inprocfs/linprocfs.c
2cbaffbe4793e47151b8fc975468558d80d79405 15-Jun-2018 emaste <emaste@FreeBSD.org> Add stubbed arm64 linuxulator /proc/cpuinfo handler

Sponsored by: Turing Robotic Industries
inprocfs/linprocfs.c
8e419faaf8e64f49c901119c7111131de0c83be8 14-Jun-2018 brooks <brooks@FreeBSD.org> Regen after 335177 (rename sys_obreak to sys_break).
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
ad6bae500f06adafe710fb6a27e1a00ae142892d 14-Jun-2018 brooks <brooks@FreeBSD.org> Name the implementation of brk and sbrk sys_break().

The break() system call was renamed (several times) starting in v3
AT&T UNIX when C was invented and break was a language keyword. The
last vestige of a need for it to be called something else (eg obreak)
was removed in r225617 which consistently prefixed all syscall
implementations.

Reviewed by: emaste, kib (older version)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15638
reebsd32/syscalls.master
inux/linux_misc.c
13753d45de65d4f13be748c85f7eaa84a82c0f16 13-Jun-2018 bde <bde@FreeBSD.org> Oops, r335053 had an old version of the comment about 16-bit linux dev_t
translation.
inux/linux_stats.c
8bff5fd9562a73b25b75a806911ab77253643ea1 13-Jun-2018 bde <bde@FreeBSD.org> Fix the encoding of major and minor numbers in 64-bit dev_t by restoring
the old encodings for the lower 16 and 32 bits and only using the
higher 32 bits for unusually large major and minor numbers. This
change breaks compatibility with the previous encoding (which was only
used in -current).

Fix truncation to (essentially) 16-bit dev_t in newnfs v3.

Any encoding of device numbers gives an ABI, so it can't be changed
without translations for compatibility. Extra bits give the much
larger complication that the translations need to compress into fewer
bits. Fortunately, more than 32 bits are rarely needed, so
compression is rarely needed except for 16-bit linux dev_t where it
was always needed but never done.

The previous encoding moved the major number into the top 32 bits.
Almost no translation code handled this, so the major number was blindly
truncated away in most 32-bit encodings. E.g., for ffs, mknod(8) with
major = 1 and minor = 2 gave dev_t = 0x100000002; ffs cannot represent
this and blindly truncated it to 2. But if this mknod was run on any
released version of FreeBSD, it gives dev_t = 0x102. ffs can represent
this, but in the previous encoding it was not decoded, giving major = 0,
minor = 0x102.

The presence of bugs was most obvious for exporting dev_t's from an
old system to -current, since bugs in newnfs augment them. I fixed
oldnfs to support 32-bit dev_t in 1996 (r16634), but this regressed
to 16-bit dev_t in newnfs, first to the old 16-bit encoding and then
further in -current. E.g., old ad0 with major = 234, minor = 0x10002
had the correct (major, minor) number on the wire, but newnfs truncated
this to (234, 2) and then the previous encoding shifted the major
number into oblivion as seen by ffs or old applications.

I first tried to fix this by translating on every ABI/API boundary, but
there are too many boundaries and too many sloppy translations by blind
truncation. So use the old encoding for the low 32 bits so that sloppy
translations work no worse than before provided the high 32 bits are
not set. Add some error checking for when bits are lost. Keep not
doing any error checking for translations for almost everything in
compat/linux.

compat/freebsd32/freebsd32_misc.c:
Optionally check for losing bits after possibly-truncating assignments as
before.

compat/linux/linux_stats.c:
Depend on the representation being compatible with Linux's (or just with
itself for local use) and spell some of the translations as assignments in
a macro that hides the details.

fs/nfsclient/nfs_clcomsubs.c:
Essentially the same fix as in 1996, except there is now no possible
truncation in makedev() itself. Also fix nearby style bugs.

kern/vfs_syscalls.c:
As for freebsd32. Also update the sysctl description to include file
numbers, and change it to describe device ids as device numbers.

sys/types.h:
Use inline functions (wrapped by macros) since the expressions are now
a bit too complicated for plain macros. Describe the encoding and
some of the reasons for it. 16-bit compatibility didn't leave many
reasonable choices for the 32-bit encoding, and 32-bit compatibility
doesn't leave many reasonable choices for the 64-bit encoding. My
choice is to put the 8 new minor bits in the low 8 bits of the top 32
bits. This minimizes discontiguities.
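
A userspace sketch of an encoding with these properties. The dev64_
names are hypothetical, and the placement of the upper 24 major bits in
the top 24 bits of the value is an assumption inferred from the
description above rather than a copy of sys/types.h.

```c
#include <stdint.h>

typedef uint64_t dev64_t;

/* Major: low 8 bits from bits 8-15, upper 24 bits from bits 40-63. */
static uint32_t
dev64_major(dev64_t d)
{
	return ((uint32_t)(((d >> 32) & 0xffffff00) | ((d >> 8) & 0xff)));
}

/* Minor: bits 0-7 and 16-31 in place, bits 8-15 from bits 32-39. */
static uint32_t
dev64_minor(dev64_t d)
{
	return ((uint32_t)(((d >> 24) & 0xff00) | (d & 0xffff00ff)));
}

static dev64_t
dev64_make(uint32_t maj, uint32_t min)
{
	return (((dev64_t)(maj & 0xffffff00) << 32) |
	    ((dev64_t)(maj & 0xff) << 8) |
	    ((dev64_t)(min & 0xff00) << 24) |
	    (dev64_t)(min & 0xffff00ff));
}
```

With this layout, major = 1 and minor = 2 encode to 0x102, matching the
value the message attributes to released versions, and the top 32 bits
stay zero unless a major exceeds 8 bits or the new minor bits are used.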

Reviewed by: kib (except for rewrite of the comment in linux_stats.c)
reebsd32/freebsd32_misc.c
inux/linux_stats.c
216f1ebfa0008a314f14a3ea587b9521d48b4ad0 13-Jun-2018 bde <bde@FreeBSD.org> Fix some bugs found while fixing the representation and translation
of 64-bit dev_t's (but not ones involving dev_t's).

st_size was supposed to be clamped in cvtstat() and linux's copy_stat(),
but the clamping code wasn't aware that st_size is signed, and also had
an obfuscated off-by-1 value for the unsigned limit, so its effect was
to produce a bizarre negative size instead of clamping.
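
A minimal sketch of clamping a 64-bit signed size into a 32-bit signed
field. The name clamp_size32 is hypothetical, and the sketch assumes the
input is non-negative, as a file size should be.

```c
#include <stdint.h>

/* Clamp to the signed 32-bit maximum, not the unsigned one. */
static int32_t
clamp_size32(int64_t size)
{
	if (size > INT32_MAX)
		return (INT32_MAX);
	return ((int32_t)size);
}
```

Comparing against an unsigned limit such as UINT32_MAX instead would let
values between INT32_MAX and UINT32_MAX through, and those turn
negative once stored in the signed field.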

Change freebsd32's copy_ostat() to be no worse than cvtstat(). It was
missing clamping and bzero()ing of padding.

Reviewed by: kib (except a final fix of the clamp to the signed maximum)
reebsd32/freebsd32_misc.c
inux/linux_stats.c
0b8f91e596c7a76e48cff0255dc3a9c6311fb237 12-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the ip_eth_mc_map() function in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/net/ip.h
250cb229a0bbd014c9c70c07006c84ba53599d2d 11-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the kstrtobool() and kstrtobool_from_user() functions
in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/kernel.h
e8a116fa87253aa61a1d74a2a0142cb19490dfbf 11-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the user_access_begin(), user_access_end(), unsafe_get_user() and
unsafe_put_user() function macros in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/asm/uaccess.h
83d5f6a31223c744237963a44a2363041514e4c7 07-Jun-2018 hselasky <hselasky@FreeBSD.org> Define ARCH_KMALLOC_MINALIGN in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/slab.h
66dc0675f5ab446a90c49d115fe2364aa6f40cb1 07-Jun-2018 hselasky <hselasky@FreeBSD.org> Wrap timespec64 into timespec in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/ktime.h
inuxkpi/common/include/linux/time.h
4501f2c51adc638b8552061c2958911e9df111d8 07-Jun-2018 hselasky <hselasky@FreeBSD.org> Move the EXPORT_SYMBOL_XXX() function macros into own header file.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/export.h
inuxkpi/common/include/linux/module.h
0361cfdee1089c4e59d9091bf44d925bf2369776 07-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the dev_pm_set_driver_flags() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/device.h
f9b5ddd4acb3ef9250a13897ddf31a1cd77a23bb 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Make some list functions RCU safe in the LinuxKPI.
While at it rename hlist_add_after() into hlist_add_behind().

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/list.h
208832619f75f8ea623d34c6ba6baefc5968be4d 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Rewrite code using atomic_fcmpset_int() in the LinuxKPI.

Suggested by: mjg@
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/asm/atomic.h
fb0d8899cf0377688d3a56f2b117af6d0c63e437 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the __add_wait_queue_entry_tail() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/wait.h
fca226d193313909a17964dbc531e245db01a0dd 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the might_sleep_if() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/wait.h
0de2147890d2001c323081034e83c7a439fdff68 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Rename two structure field members while keeping backwards compatibility in
the LinuxKPI. Add a comment saying in which Linux version this change was made.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/wait.h
93c0ede5261a024836671cb7549d6ca5d4c945a2 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the init_wait_entry() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/wait.h
inuxkpi/common/src/linux_schedule.c
d4524951a3002331249f23e33fab57cb0358ecfe 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the atomic_dec_if_positive() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/asm/atomic.h
29c0e1b8fd1b104888df35c73fc9aee460e0df52 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the ktime_compare() and ktime_after() functions in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/ktime.h
e32e178ec0f2c1b74067421ccb40cdca5861b169 06-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the rdmsrl_safe() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/asm/msr.h
b69dab33aa901df6c724fd7e26139f7e71541861 05-Jun-2018 hselasky <hselasky@FreeBSD.org> Declare and set the global "system_highpri_wq" workqueue structure pointer
in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/workqueue.h
inuxkpi/common/src/linux_work.c
12f56c9734a9b30ac5e005c9becdecaa2de0bff5 05-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the INIT_DELAYED_WORK_ONSTACK() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/workqueue.h
0a8148581a7b77fe2f63814761664e7249b9aaa8 05-Jun-2018 hselasky <hselasky@FreeBSD.org> Define the __kernel_size_t type in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/types.h
364df1ce069a13fc3bdd9467f16b206454ed8272 05-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the task_pid_vnr() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/sched.h
4527ebf55c7929bed0d4869447ad09e5581c97f3 05-Jun-2018 hselasky <hselasky@FreeBSD.org> Add "access" function pointer to the "vm_operations_struct" structure
in the LinuxKPI. While at it document when to use the "virtual_address" or
the "address" field in the "vm_fault" structure.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/mm.h
204460a40c2f365d8e7525ff40a43b433ef9e716 05-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement mul_u32_u32() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/math64.h
ae83c4132eebe608be393a2ab4cf56974948707d 05-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement timer_setup() and from_timer() function macros in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/timer.h
d11c50b90147fc6588b0347deac461ce58ad0ade 04-Jun-2018 markj <markj@FreeBSD.org> Regen after r334626.
reebsd32/freebsd32_systrace_args.c
9d9fd255d646b6c389fa347cb633c665c4485aa4 04-Jun-2018 markj <markj@FreeBSD.org> Reimplement brk() and sbrk() to avoid the use of _end.

Previously, libc.so would initialize its notion of the break address
using _end, a special symbol emitted by the static linker following
the bss section. Compatibility issues between lld and ld.bfd could
cause the wrong definition of _end (libc.so's definition rather than
that of the executable) to be used, breaking the brk()/sbrk()
interface.

Avoid this problem and future interoperability issues by simply not
relying on _end. Instead, modify the break() system call to return
the kernel's view of the current break address, and have libc
initialize its state using an extra syscall upon the first use of the
interface. As a side effect, this appears to fix brk()/sbrk() usage
in executables run with rtld direct exec, since the kernel and libc.so
no longer maintain separate views of the process' break address.

PR: 228574
Reviewed by: kib (previous version)
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D15663
reebsd32/syscalls.master
438f1662b8fa528d7af09fb3d4a7f0b18f630079 01-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement the __sg_alloc_table_from_pages() function based on the existing
sg_alloc_table_from_pages() function in the LinuxKPI.

This basically allows segments to have a limit, max_segment.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/scatterlist.h
b8c9fa199022ce09415e9f7af33a5ec04a364233 01-Jun-2018 hselasky <hselasky@FreeBSD.org> Implement radix_tree_iter_delete() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/radix-tree.h
inuxkpi/common/src/linux_radix.c
16fd0dadcc672d26b8cc15797405e361af8bddbe 01-Jun-2018 hselasky <hselasky@FreeBSD.org> Improve high resolution timer support in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/hrtimer.h
inuxkpi/common/src/linux_hrtimer.c
5433faa41f34f90bc4a100f051a0cbca836981a4 01-Jun-2018 hselasky <hselasky@FreeBSD.org> Add more GFP macro definitions in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/gfp.h
e398ec6acfcce4cca7cc4bbae2cbd0d7c051e9c6 31-May-2018 hselasky <hselasky@FreeBSD.org> Implement support for the PCI_BUS_NUM() function macro in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
6040841f87391e0e1b9ef08e1c1d3974f18781d3 31-May-2018 hselasky <hselasky@FreeBSD.org> Implement support for the kvmalloc_array() function in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/slab.h
4a5a36369c79286044b935b81ef13f854db59848 31-May-2018 hselasky <hselasky@FreeBSD.org> Correct macro name in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/completion.h
f7a0601b22cf433816b01ee4788408eda49b59bf 31-May-2018 hselasky <hselasky@FreeBSD.org> Define __initconst in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/compiler.h
f280706501d54a36a00a3b8c212c5b86cb9c3610 31-May-2018 hselasky <hselasky@FreeBSD.org> Implement bitmap_complement() in the LinuxKPI.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitmap.h
70dc9766535edda36c038e2c200fa611f754abba 31-May-2018 hselasky <hselasky@FreeBSD.org> Implement idr_is_empty() in the LinuxKPI and make idr_remove() API compatible
with upstream Linux by returning the pointer to the removed element.

Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/idr.h
inuxkpi/common/src/linux_idr.c
e56e8a49a8852415f67546b26f99291b6dcc55cf 30-May-2018 brooks <brooks@FreeBSD.org> Remove alternative names that are identical to the default.

Verified by make sysent producing no changes.
reebsd32/syscalls.master
857ae5a01f4ee31d2a3fe3303b4ad089d07ea476 28-May-2018 hselasky <hselasky@FreeBSD.org> The schedule_timeout_killable() function should listen for signals
in the LinuxKPI.

Found by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
7a086faf53abcf800b07eb5e39b82b0caf5dfa17 28-May-2018 hselasky <hselasky@FreeBSD.org> Implement wait_event_killable() in the LinuxKPI.

Requested by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/wait.h
c4a3a5d220c220c6ad290f848e75334381257452 28-May-2018 hselasky <hselasky@FreeBSD.org> Allow TASK_PARKED bit being set when going to sleep in the LinuxKPI.

Found by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_schedule.c
12a8583fb05e7f7481c6917ba3313145af0828ec 25-May-2018 brooks <brooks@FreeBSD.org> Regen after r334223: make vadvise compat freebsd11.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
747734422491dc3a706e961f09a13c74cf367c06 25-May-2018 brooks <brooks@FreeBSD.org> Make vadvise compat freebsd11.

The vadvise syscall (aka ovadvise) is undocumented and has always been
implemented as returning EINVAL. Put the syscall under COMPAT11 and
provide a userspace implementation.

Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15557
reebsd32/syscalls.master
ecd6e9d3074a783a743434c51dbfd16571c55fa2 23-May-2018 mmacy <mmacy@FreeBSD.org> UDP: further performance improvements on tx

Cumulative throughput while running 64
netperf -H $DUT -t UDP_STREAM -- -m 1
on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps

Single stream throughput increases from 910kpps to 1.18Mpps

Baseline:
https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg

- Protect read access to global ifnet list with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg

- Protect short lived ifaddr references with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg

- Convert if_afdata read lock path to epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg

A fix for the inpcbhash contention is pending sufficient time
on a canary at LLNW.

Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15409
inprocfs/linprocfs.c
inux/linux_ioctl.c
inuxkpi/common/include/linux/inetdevice.h
7aeac9ef1893e0b29408213e3a320d9d1ef28357 18-May-2018 mmacy <mmacy@FreeBSD.org> ifnet: Replace if_addr_lock rwlock with epoch + mutex

Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32
4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32
4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32
4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32
4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32
4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32
4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32

After the patch

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51
5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51
5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51
5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51
5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52
5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366
inux/linux_ioctl.c
5bb6870ecb4be5705224790f5c31d6dc84de7166 15-May-2018 brooks <brooks@FreeBSD.org> Allow freebsd32 __sysctl(2) to return ENOMEM.

This is required by programs like sockstat that read variably sized
sysctls such as kern.file. The normal path has no such restriction and
the restriction was added without comment along with initial support for
freebsd32 in 2002 (r100384).

Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15438
reebsd32/freebsd32_misc.c
73b6d8a09d89f4d261b6b5b99223e5f1cd207afe 09-May-2018 markj <markj@FreeBSD.org> Remove "All rights reserved" from my files.

See r333391 for the rationale.

MFC after: 1 week
inuxkpi/common/include/linux/hrtimer.h
inuxkpi/common/src/linux_hrtimer.c
a0bd5d3d7ffae2d09d6ae3cb12bed3ca80e88928 09-May-2018 mmacy <mmacy@FreeBSD.org> Eliminate the overhead of gratuitous repeated reinitialization of cap_rights

- Add macros to allow preinitialization of cap_rights_t.

- Convert most commonly used code paths to use preinitialized cap_rights_t.
A 3.6% speedup in fstat was measured with this change.

Reported by: mjg
Reviewed by: oshogbo
Approved by: sbruno
MFC after: 1 month
loudabi/cloudabi_file.c
inux/linux_event.c
inux/linux_file.c
inux/linux_ioctl.c
inux/linux_mmap.c
inux/linux_socket.c
inux/linux_stats.c
inuxkpi/common/include/linux/file.h
4bc2d771ed3e895174d432f6ac231a1ba4da26cd 09-May-2018 hselasky <hselasky@FreeBSD.org> Add myself to copyright in the LinuxKPI RCU support layer.

Suggested by: mmacy@
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_rcu.c
1c11f552d63c8d13159b579aed059a7649bbf5aa 04-May-2018 jamie <jamie@FreeBSD.org> Make it easier for filesystems to count themselves as jail-enabled,
by doing most of the work in a new function prison_add_vfs in kern_jail.c.
Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and
the rest is taken care of. This includes adding a jail parameter like
allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed.
Both of these used to be a static list of known filesystems, with
predefined permission bits.

Reviewed by: kib
Differential Revision: D14681
inprocfs/linprocfs.c
insysfs/linsysfs.c
eb861103c8d89a52f8b9dc8c7ec4690ccdbe7220 30-Apr-2018 hselasky <hselasky@FreeBSD.org> Define USEC_PER_MSEC and USEC_PER_SEC in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/time.h
553b9804f0dcf06241de99e2bc1dd09cd0b4a13a 24-Apr-2018 eadler <eadler@FreeBSD.org> [procfs] Split procfs_attr into multiple functions

Reviewed by: des, kib
Discussed with: mmacy
Differential Revision: https://reviews.freebsd.org/D15150
inprocfs/linprocfs.c
a42a93a10254704fd1a8f2cd433594ec4b40a94f 24-Apr-2018 kib <kib@FreeBSD.org> Fix futexes on i386 after the 4/4G split.

Use proper method to access userspace. For now, only the slow copyout
path is implemented.

Reported and tested by: tijl (previous version)
Sponsored by: The FreeBSD Foundation
inux/linux_futex.c
inux/linux_futex.h
2b071b580cccf6b5473991ee3e28a4a944553864 23-Apr-2018 emaste <emaste@FreeBSD.org> Map FreeBSD EDOOFUS to Linux EINVAL

Previously EDOOFUS mapped to EBUSY. EINVAL seems more appropriate.

Discussed with: cem
MFC after: 1 week
Sponsored by: Turing Robotic Industries Inc.
inux/linux_errno.inc
f051bf839c517540b55c247be60496ab325626e6 20-Apr-2018 kib <kib@FreeBSD.org> Rename PROC_PDEATHSIG_SET -> PROC_PDEATHSIG_CTL and PROC_PDEATHSIG_GET
-> PROC_PDEATHSIG_STATUS for consistency with other procctl(2)
operations names.

Requested by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 13 days
reebsd32/freebsd32_misc.c
96955e60da90f2d583d2e469059a2b8832ea9ccb 19-Apr-2018 jhb <jhb@FreeBSD.org> Simplify the code to allocate stack for auxv, argv[], and environment vectors.

Remove auxarg_size as it was only used once right after a confusing
assignment in each of the variants of exec_copyout_strings().

Reviewed by: emaste
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D15123
reebsd32/freebsd32_misc.c
a8cb340144294b1db646271aba2923d36245b030 18-Apr-2018 kib <kib@FreeBSD.org> Add PROC_PDEATHSIG_SET to procctl interface.

Allow processes to request the delivery of a signal upon death of
their parent process. The intended consumer of the feature is PostgreSQL.

Submitted by: Thomas Munro
Reviewed by: jilles, mjg
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D15106
reebsd32/freebsd32_misc.c
c5db8e6d0dae95e287cbece1d8e2c67503924fbb 09-Apr-2018 emaste <emaste@FreeBSD.org> linuxulator: deduplicate linux_exec_imgact_try

Previously linuxulator had three identical copies of
linux_exec_imgact_try. Deduplicate before adding another arch to
linuxulator.

Sponsored by: Turing Robotic Industries Inc
Differential Revision: https://reviews.freebsd.org/D14856
inux/linux_emul.c
inux/linux_emul.h
9d79658aab1a30f34fee169ce74bdff4ca405c18 06-Apr-2018 brooks <brooks@FreeBSD.org> Move most of the contents of opt_compat.h to opt_global.h.

opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compatibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by: kib, cem, jhb, jtl
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14941
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_sysent.c
a32/ia32_genassym.c
a32/ia32_sysvec.c
inux/linux_util.c
53726df9f41ab509cf3e38cfd5b98e1d9f686f0f 05-Apr-2018 markj <markj@FreeBSD.org> Fix the definitions of get_cpu() and put_cpu().

They are supposed to disable preemption.

Reported by: rstone
MFC after: 5 days
inuxkpi/common/include/asm/smp.h
31cf3ac4407818855133b1164847a23d17fd5a85 04-Apr-2018 emaste <emaste@FreeBSD.org> Fix kernel memory disclosure in linux_ioctl_socket

strlcpy is used to copy a string into a buffer to be copied to userland,
previously leaving uninitialized data after the terminating NUL. Zero
the buffer first to avoid a kernel memory disclosure.

admbugs: 765, 811
MFC after: 1 day
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reported by: Vlad Tsyrklevich
Sponsored by: The FreeBSD Foundation
inux/linux_ioctl.c
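
The zero-before-copy pattern described in this entry can be sketched in
userspace C as follows. This is only an illustration, not the code from
linux_ioctl_socket: the helper name fill_name() and the buffer size
IFNAME_LEN are hypothetical, and memcpy() stands in for the kernel's
strlcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IFNAME_LEN 16	/* hypothetical buffer size, for illustration only */

/*
 * Hypothetical helper: fill a fixed-size buffer that will later be copied
 * out in full.  Zeroing the whole buffer first guarantees that the bytes
 * after the terminating NUL never carry stale memory contents.
 */
static void
fill_name(char dst[IFNAME_LEN], const char *src)
{
	size_t n;

	assert(src != NULL);
	memset(dst, 0, IFNAME_LEN);	/* clear everything, including the tail */
	n = strlen(src);
	if (n > IFNAME_LEN - 1)
		n = IFNAME_LEN - 1;	/* truncate, leaving room for the NUL */
	memcpy(dst, src, n);		/* dst[n] is already '\0' */
}
```

Without the memset(), a short name such as "em0" would leave the remaining
bytes of the buffer holding whatever was previously in memory.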
43703b52c4abe71bc0aa3ba80bc44c19577e9ade 04-Apr-2018 emaste <emaste@FreeBSD.org> linux_ioctl_hdio: fix kernel memory disclosure

Stack-allocated struct linux_hd_big_geometry has undeclared padding
copied to userland.

admbugs: 765
Reported by: Vlad Tsyrklevich
MFC after: 1 day
Security: Kernel memory disclosure
Sponsored by: The FreeBSD Foundation
inux/linux_ioctl.c
0d44c798e29f51b227c964d9fdb75f133daceb0d 03-Apr-2018 markj <markj@FreeBSD.org> Wrap long lines.

MFC after: 3 days
inuxkpi/common/src/linux_schedule.c
bf24bc5621c36047d16fc8bebc25267d447023e0 30-Mar-2018 hselasky <hselasky@FreeBSD.org> Optimise use of Giant in the LinuxKPI.

- Make sure Giant is locked when calling PCI device methods.
Newbus currently requires this.

- Avoid unlocking Giant right before acquiring the sleepqueue lock.
This can save a task switch.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/module.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_pci.c
inuxkpi/common/src/linux_rcu.c
inuxkpi/common/src/linux_schedule.c
4b6d4566890e05634e2e859035e5f5504d840ebd 28-Mar-2018 hselasky <hselasky@FreeBSD.org> Swap two instances of regular macros with function macros in the LinuxKPI,
to narrow down the substitution scope.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
06ec69031ce04d7b104d7dc5804bafdc05bb485f 27-Mar-2018 kib <kib@FreeBSD.org> Fix several leaks of kernel stack data through paddings.

It is a random collection of fixes for issues not yet corrected,
reported at https://tsyrklevi.ch/clang_analyzer/freebsd_013017/. Many
issues from that list were already corrected. Most of them are for
compat32, old compat32 or affect both primary host ABI and compat32.

The freebsd32_kldstat(), for instance, was already fixed by using
malloc(M_ZERO). Patch includes correction to report the supplied
version back, which is just pedantic.

Reviewed by: brooks, emaste (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D14868
reebsd32/freebsd32_misc.c
6084a9231e52d475ed39c599dcedba1c25de276c 27-Mar-2018 brooks <brooks@FreeBSD.org> Move 32-bit compat for md(4) ioctls into the md code.

This is more correct in that ioctl commands have no meaning until they
hit the handler associated with the file descriptor.

Add support for MDIOCRESIZE_32 which was missed when it was added.

Reviewed by: cem, kib, markj (various versions)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14714
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
53dacbca2d2e68531a39f5f97c6d5771b783978c 27-Mar-2018 brooks <brooks@FreeBSD.org> Move uio enums to sys/_uio.h.

Include _uio.h instead of uio.h in several headers to reduce header
pollution.

Fix a few places that relied on header pollution to get the uio.h header.

I have not moved struct uio as many more things that use it rely on
header pollution to get other definitions from uio.h.

Reviewed by: cem, kib, markj
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14811
reebsd32/freebsd32_ioctl.c
81de646c9b63ca08c4a8f7f6b158aadcd963bfa6 23-Mar-2018 emaste <emaste@FreeBSD.org> linuxkpi whitespace cleanup

Reviewed by: hselasky, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14807
inuxkpi/common/include/asm/byteorder.h
inuxkpi/common/include/linux/bitops.h
inuxkpi/common/include/linux/cdev.h
inuxkpi/common/include/linux/compiler.h
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/dma-attrs.h
inuxkpi/common/include/linux/dma-mapping.h
inuxkpi/common/include/linux/err.h
inuxkpi/common/include/linux/errno.h
inuxkpi/common/include/linux/etherdevice.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/idr.h
inuxkpi/common/include/linux/if_ether.h
inuxkpi/common/include/linux/if_vlan.h
inuxkpi/common/include/linux/io.h
inuxkpi/common/include/linux/jiffies.h
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/include/linux/kmod.h
inuxkpi/common/include/linux/kobject.h
inuxkpi/common/include/linux/ktime.h
inuxkpi/common/include/linux/list.h
inuxkpi/common/include/linux/log2.h
inuxkpi/common/include/linux/miscdevice.h
inuxkpi/common/include/linux/mutex.h
inuxkpi/common/include/linux/pci.h
inuxkpi/common/include/linux/rwlock.h
inuxkpi/common/include/linux/rwsem.h
inuxkpi/common/include/linux/slab.h
inuxkpi/common/include/linux/spinlock.h
inuxkpi/common/include/linux/sysfs.h
inuxkpi/common/include/linux/usb.h
inuxkpi/common/include/linux/workqueue.h
inuxkpi/common/include/net/if_inet6.h
inuxkpi/common/include/net/ipv6.h
inuxkpi/common/include/net/netevent.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_idr.c
inuxkpi/common/src/linux_radix.c
inuxkpi/common/src/linux_usb.c
4401e4ad390161b8b87a28210f87eaa426eade0a 23-Mar-2018 emaste <emaste@FreeBSD.org> Rationalize license text on Linuxolator files

Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text. To avoid license proliferation switch to
the standard 2-Clause FreeBSD license for those files where I have
permission from each of the listed copyright holders.

Approved by: rdivacky, marcel
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
inux/linux_ioctl.h
inux/linux_ipc.h
inux/linux_mib.h
inux/linux_misc.h
inux/linux_signal.h
ab822fada31cffc20107091ba0b32d57622534aa 22-Mar-2018 hselasky <hselasky@FreeBSD.org> The pci_disable_device() function is also expected to clear the PCI
busmaster. This fixes LinuxKPI compliancy with Linux.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
289be8516c5ec21ec34390cfce6c92e014e9efcf 22-Mar-2018 emaste <emaste@FreeBSD.org> Share Linux errno table with libsysdecode

Requested by: jhb
Reviewed by: jhb
Sponsored by: Turing Robotic Industries Inc.
inux/linux_emul.h
inux/linux_errno.c
inux/linux_errno.inc
7aa86e0b9924d40f1c3a80cf9457bff032372871 22-Mar-2018 hselasky <hselasky@FreeBSD.org> Clear old MSIX IRQ numbers in the LinuxKPI.

When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pci.h
06362ad468fc11671364e34ed413ea9c19d858f1 21-Mar-2018 cem <cem@FreeBSD.org> Regenerate sysent files after r331279.
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
82710b55b67c1740e1fe70da573fddb10075ca1d 21-Mar-2018 cem <cem@FreeBSD.org> Implement getrandom(2) and getentropy(3)

The general idea here is to provide userspace programs with well-defined
sources of entropy, in a fashion that doesn't require opening a new file
descriptor (ulimits) or accessing paths (/dev/urandom may be restricted
by chroot or capsicum).

getrandom(2) is the more general API, and comes from the Linux world.
Since our urandom and random devices are identical, the GRND_RANDOM flag
is ignored.

getentropy(3) is added as a compatibility shim for the OpenBSD API.

truss(1) support is included.

Tests for both system calls are provided. Coverage is believed to be at
least as comprehensive as LTP getrandom(2) test coverage. Additionally,
instructions for running the LTP tests directly against FreeBSD are provided
in the "Test Plan" section of the Differential revision linked below. (They
pass, of course.)

PR: 194204
Reported by: David CARLIER <david.carlier AT hardenedbsd.org>
Discussed with: cperciva, delphij, jhb, markj
Relnotes: maybe
Differential Revision: https://reviews.freebsd.org/D14500
reebsd32/syscalls.master
9fe078da1cf12e830bb3439b8d86faf4e1692c7d 16-Mar-2018 emaste <emaste@FreeBSD.org> linux_errno.c: add newer errno values

Also introduce a static assert to ensure the list is kept up to date.

Sponsored by: Turing Robotic Industries Inc.
inux/linux_errno.c
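
A static assert that keeps a translation table in step with the host errno
list, as this entry describes, might look roughly like the sketch below.
The table contents, the HOST_ELAST limit and the translate_errno() helper
are all hypothetical and are not taken from linux_errno.c.

```c
#include <assert.h>
#include <stddef.h>

#define HOST_ELAST 3	/* hypothetical highest host errno in this toy example */

/* Hypothetical table indexed by host errno; the values are illustrative. */
static const int host_to_guest_errno[] = {
	0,	/* 0: no error */
	-1,	/* 1 */
	-2,	/* 2 */
	-5,	/* 3 */
};

/* Adding a host errno without a matching table entry breaks the build. */
_Static_assert(sizeof(host_to_guest_errno) / sizeof(host_to_guest_errno[0]) ==
    HOST_ELAST + 1, "errno table out of date");

static int
translate_errno(int host)
{
	assert(host >= 0 && host <= HOST_ELAST);
	return (host_to_guest_errno[host]);
}
```

Declaring the table const lets the compiler place it in read-only data.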
566c3d41cc5ef0926a69504a2161b17e5810f75c 16-Mar-2018 emaste <emaste@FreeBSD.org> Share a single bsd-linux errno table across MD consumers

Three copies of the linuxulator linux_sysvec.c contained identical
BSD to Linux errno translation tables, and future work to support other
architectures will also use the same table. Move the table to a common
file to be used by all. Make it 'const int' to place it in .rodata.

(Some existing Linux architectures use MD errno values, but x86 and Arm
share the generic set.)

This change should introduce no functional change; a followup will add
missing errno values.

MFC after: 3 weeks
Sponsored by: Turing Robotic Industries Inc.
Differential Revision: https://reviews.freebsd.org/D14665
inux/linux_emul.h
inux/linux_errno.c
938790ab5777c6fc694669656dd50e924f54dd00 14-Mar-2018 hselasky <hselasky@FreeBSD.org> Fix compliancy of the kstrtoXXX() functions in the LinuxKPI, by skipping
one newline character at the end, if any.

Found by: greg@unrelenting.technology
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
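
The "skip one trailing newline" rule can be sketched in userspace C. The
function parse_long_strict() below is a hypothetical analogue built on
strtol(), not the LinuxKPI implementation; note that strtol() also accepts
leading whitespace, which a kernel parser may not.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical analogue of a kstrtoXXX()-style parser: the whole string
 * must be consumed, except that exactly one trailing newline is tolerated.
 * Returns 0 on success or a negative errno value.
 */
static int
parse_long_strict(const char *s, unsigned int base, long *res)
{
	char *end;
	long val;

	assert(s != NULL && res != NULL);
	errno = 0;
	val = strtol(s, &end, (int)base);
	if (end == s)
		return (-EINVAL);	/* no digits at all */
	if (errno == ERANGE)
		return (-ERANGE);	/* value does not fit in a long */
	if (*end == '\n')
		end++;			/* skip one newline, if any */
	if (*end != '\0')
		return (-EINVAL);	/* trailing garbage */
	*res = val;
	return (0);
}
```

With this rule, input written by "echo 42 > file" (which ends in a newline)
parses cleanly, while "42\n\n" or "42x" is still rejected.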
ac81b47ca6a226f0a9b33f34c506c5a773bf5de4 12-Mar-2018 emaste <emaste@FreeBSD.org> Linuxulator: apply style(9) to return

Sponsored by: Turing Robotic Industries Inc.
inux/linux_ioctl.c
inux/linux_signal.c
inux/linux_stats.c
inux/linux_util.c
58bad4a91d9096d3a013d5a0e01d58f531de68b3 09-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement proper support for complete_all() in the LinuxKPI.

When complete_all() is called there might be multiple waiters. The
current implementation could only handle one waiter. Make sure the
completion is sticky when complete_all() is called to be compatible
with Linux.

Found by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/src/linux_compat.c
7927756d5b1516b34aa0da1897c18b40ebff9a7b 07-Mar-2018 eadler <eadler@FreeBSD.org> sys/cloudabi: Avoid relying on GNU specific extensions

An empty initializer list is not technically valid C grammar.

MFC After: 1 week
loudabi/cloudabi_fd.c
3da6faa563c78531bef1ff7c4c84c24b7de8e77e 06-Mar-2018 ae <ae@FreeBSD.org> Add mapping for several ethernet types used by Linux to FreeBSD
ethernet types.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14594
inuxkpi/common/include/linux/if_ether.h
16ef6542218f056361220d5a729baaed934048ae 05-Mar-2018 brooks <brooks@FreeBSD.org> Regen after r330517.
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
5d5c62475924fdcaa6510a0bcb1c83a3d755bde8 05-Mar-2018 brooks <brooks@FreeBSD.org> Remove remnants of 1990s efforts to let us run Net/OpenBSD binaries.

No functional change (comments change in some generated files.)

Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14571
reebsd32/syscalls.master
f54d4ecc08ab1581128f2fd15e5487cf1da4ff52 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Properly wrap the BUILD_BUG() function macro in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
151894a156e3362380fad8a105433f31bf9d15cb 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Stub kernel_param_lock() and kernel_param_unlock() in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/moduleparam.h
2dab843a2e04937b435dd28388835b703b608d53 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement wait_event_lock_irq() macro function in the LinuxKPI.

MFC after: 1 week
Requested by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/wait.h
7e77289d460b795e99eb8743a851d274e199baae 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Keep the old SLAB_DESTROY_BY_RCU macro definition around in the LinuxKPI
to avoid compilation breakage in external kernel modules.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/slab.h
8e70a8ef19e527d0396224c3c4c60ed345e3b4f8 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement DEFINE_WAIT_FUNC() function macro and default_wake_function()
in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/wait.h
inuxkpi/common/src/linux_schedule.c
9baae572dc0c5a17b125629c4faada5aeee1520f 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement pr_err_ratelimited() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/printk.h
c22224101f2233b52984c4555040e41a2835f737 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement __MODULE_STRING() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/module.h
4e79e1a83be1bb9e2b3454df22a67c8c479c42d4 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement BUILD_BUG() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/kernel.h
00a71092fb5d6f6b6eb32cf1eed507d681193e94 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement writel_relaxed() in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/io.h
0945850a32cf6528551c336c787810f27f790107 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Define noinline and __maybe_unused macros in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/compiler.h
3c7c22a9aac87f6163e45ac6fbeb42d30417bc03 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement for_each_clear_bit() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/bitops.h
1a484bcfce84ba837c5e6375fd7eb091674532d8 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement GENMASK_ULL() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/bitops.h
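
Note (illustrative, not part of the commit): a mask macro of this kind is usually understood as "set bits l through h, inclusive, in a 64-bit word". The sketch below shows that idea as a plain C function. The name sketch_genmask_ull and the assertion are assumptions made for this example; the LinuxKPI macro itself is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the LinuxKPI source: return a 64-bit mask with
 * bits l..h (inclusive) set, using the Linux GENMASK_ULL(h, l) argument
 * order (high bit first).
 */
static inline uint64_t
sketch_genmask_ull(unsigned int h, unsigned int l)
{
	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	assert(l <= h && h < 64);
	return ((~UINT64_C(0) << l) & (~UINT64_C(0) >> (63 - h)));
}
```

For example, sketch_genmask_ull(7, 4) selects the upper nibble of the lowest byte.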
3f07451c558be0d2fc36b7a60f4e48087a9be697 04-Mar-2018 hselasky <hselasky@FreeBSD.org> Rename the SLAB_DESTROY_BY_RCU flag into SLAB_TYPESAFE_BY_RCU in the LinuxKPI
to be compatible with Linux.

MFC after: 1 week
Requested by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/slab.h
inuxkpi/common/src/linux_slab.c
9b775b41410e0556fde3fa9ec343c996e59f90b6 03-Mar-2018 eadler <eadler@FreeBSD.org> sys/linux: Fix a few potential infoleaks in cloudabi

Submitted by: Domagoj Stolfa <domagoj.stolfa@gmail.com>
MFC After: 1 month
Sponsored by: DARPA/AFRL
loudabi/cloudabi_file.c
93e0dd9ce69f8622d04b5d7fb590590ef6fa6d2e 03-Mar-2018 eadler <eadler@FreeBSD.org> sys/linux: Fix a few potential infoleaks in Linux IPC

Submitted by: Domagoj Stolfa <domagoj.stolfa@gmail.com>
MFC After: 1 month
inux/linux_ipc.c
75c93714589a4591786151c40040600dc783c9bd 03-Mar-2018 hselasky <hselasky@FreeBSD.org> Use mstosbt() instead of SBT_1MS in the LinuxKPI to get the last few bits
of precision.

MFC after: 1 week
Suggested by: ian@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/delay.h
inuxkpi/common/src/linux_schedule.c
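
Note (illustrative, not part of the commit): the precision remark can be seen with 32.32 fixed-point arithmetic, where one second is 1 << 32 and a pre-divided "one millisecond" constant has already lost its fractional part. The sketch below compares the two approaches in user space. All sketch_ names are assumptions for this example, and sketch_ms_to_sbt_precise is only one possible precise conversion, not the kernel's mstosbt().

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t sketch_sbt_t;	/* 32.32 fixed-point seconds (assumed layout) */

#define	SKETCH_SBT_1S	((sketch_sbt_t)1 << 32)
#define	SKETCH_SBT_1MS	(SKETCH_SBT_1S / 1000)	/* fraction is truncated here */

/* Multiply by a pre-truncated constant: the error grows with ms. */
static inline sketch_sbt_t
sketch_ms_to_sbt_scaled(int64_t ms)
{
	return (ms * SKETCH_SBT_1MS);
}

/* Multiply first, divide last: keeps the low-order bits. */
static inline sketch_sbt_t
sketch_ms_to_sbt_precise(int64_t ms)
{
	/* Keep ms * 2^32 inside a signed 64-bit value. */
	assert(ms >= 0 && ms < ((int64_t)1 << 31));
	return ((ms * SKETCH_SBT_1S) / 1000);
}
```

With these definitions, converting 1000 ms through the scaled variant falls 296 units short of one full second, while the precise variant yields exactly 1 << 32.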
74843c3f01b7ef2a6cf360a0d46a95cec1ab4ffb 03-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement msleep_interruptible() in the LinuxKPI. While at it use pause_sbt()
instead of pause() in the msleep() function to avoid rounding errors when
converting delay values back and forth. Add a guard for a delay value
of zero milliseconds which is undefined.

MFC after: 1 week
Requested by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/delay.h
inuxkpi/common/src/linux_schedule.c
a0e10aee51ca255faf748a423ab1dd49539018ce 02-Mar-2018 brooks <brooks@FreeBSD.org> Rename kernel-only members of semid_ds and msgid_ds.

This deliberately breaks the API in preparation for future syscall
revisions which will remove these nonstandard members.

In an exp-run a single port (devel/qemu-user-static) was found to
use them, which it did because it emulates system calls. This has
been fixed in the ports tree.

PR: 224443 (exp-run)
Reviewed by: kib, jhb (previous version)
Exp-run by: antoine
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14490
reebsd32/freebsd32_ipc.h
d5842c73935af7f7a7eeaeaec75ad8f3cf6cdf50 02-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement more lockdep stubs in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/lockdep.h
2ddfd536e625fa94c2c5bbecb7a8b82fe87b2afd 02-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement ktime_get_raw() function in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/ktime.h
167a3e9cf657bcfb6cc46cb3c9da5ed8b063f3ac 02-Mar-2018 hselasky <hselasky@FreeBSD.org> Implement wait_on_bit() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/wait.h
c7b982ecff4f1d2ea3b4d852d09dc02bfa0a9e71 02-Mar-2018 hselasky <hselasky@FreeBSD.org> Rename callout member in struct timer_list to match the one in struct
delayed_work in the LinuxKPI. This allows the timer_pending() function
macro to be used with delayed work structures.

No functional nor structural change.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/timer.h
inuxkpi/common/src/linux_compat.c
5c793c607c102ce4937ce05cb449a04dea6d190b 01-Mar-2018 emaste <emaste@FreeBSD.org> Rationalize license text on Linuxolator files

Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text. To avoid license proliferation switch to
the standard 2-clause FreeBSD license for those files where I have
permission from each of the listed copyright holders. Additional files
still waiting on permission from others are listed in review D14210.

Approved by: dchagin, rdivacky, sos
MFC after: 1 week
MFC with: r329370
Sponsored by: The FreeBSD Foundation
inux/linux_emul.c
inux/linux_emul.h
2a01827ff67fbdbcd4381a4fa6f110057f80f4e0 01-Mar-2018 hselasky <hselasky@FreeBSD.org> Correct the return value from flush_work() and flush_delayed_work() in the
LinuxKPI to comply more with Linux. This fixes an issue when these functions
are used in waiting loops.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_work.c
5122d84e6415cd7d8bd3c33822834c488b5b56cc 23-Feb-2018 emaste <emaste@FreeBSD.org> Correct pseudo misspelling in sys/ comments

contrib code and #define in intel_ata.h unchanged.
reebsd32/syscalls.master
fdcd80b4497a24c88924f35ce35efc287f798b1f 22-Feb-2018 hselasky <hselasky@FreeBSD.org> Return correct error code to user-space when a system call receives a
signal in the LinuxKPI.

The read(), write() and mmap() system calls can return either EINTR or
ERESTART upon receiving a signal. Add code to figure out the correct
return value by temporarily storing the return code from the relevant
FreeBSD kernel APIs in the Linux task structure.

MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mutex.h
inuxkpi/common/include/linux/rwsem.h
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_lock.c
inuxkpi/common/src/linux_schedule.c
f4add23ff529590a71dce4bd02f95aaad02173b7 22-Feb-2018 emaste <emaste@FreeBSD.org> Correct proper nouns in the Linuxulator

- Capitalize Linux
- Spell FreeBSD out in full
- Address some style(9) on changed lines

Sponsored by: Turing Robotic Industries Inc.
inux/linux_emul.c
inux/linux_file.c
inux/linux_ioctl.c
inux/linux_mib.c
inux/linux_misc.c
inux/linux_socket.c
ef80f6306de14db235fa591c81fa30438d1df5fe 21-Feb-2018 hselasky <hselasky@FreeBSD.org> Allow LinuxKPI character devices to receive mmap() calls from the Linux
binary mode user-space emulation layer. This is a regression issue after
r328436, when LinuxKPI character devices started to use DTYPE_DEV in
the "f_type" field of the associated file structure(s).

MFC after: 3 days
Found by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inux/linux_mmap.c
3217658e61a1430fc3a44202497a5dea03416682 20-Feb-2018 brooks <brooks@FreeBSD.org> Reduce duplication in dynamic syscall registration code.

Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.

The default and freebsd32 versions differed in which array of struct
sysents they used and a few missing updates to the 32-bit code as
features were added to the main code.

Reviewed by: cem
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14337
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_util.h
ee3d0fb8ef5def49a12c51a3030af447009a1ec5 20-Feb-2018 kib <kib@FreeBSD.org> vm_wait() rework.

Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass. If there is no object
associated with the wait, use curthread's policy domainset. The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.

Eliminate pagedaemon_wait(). vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Sketched and reviewed by: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D14384
inuxkpi/common/src/linux_page.c
468bce151fdd95baa3875a9fa5eb23517d685690 19-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement list_safe_reset_next() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/list.h
472784df8e6a721369d9f18ac645163cd01c123d 19-Feb-2018 hselasky <hselasky@FreeBSD.org> When stepping the radix tree in the LinuxKPI make sure we
clear the least significant bits, so that no entries are
skipped.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_radix.c
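
Note (illustrative, not part of the commit): one way to read "clear the least significant bits" is that, when an iterator jumps to the next subtree, the index must be rounded to the start of that subtree. Otherwise the low bits carried over from the old position would skip the first entries of the new subtree. The sketch below shows that arithmetic only. The function name and the shift parameter are assumptions for this example and do not come from linux_radix.c.

```c
#include <assert.h>

/*
 * Hypothetical helper: advance "index" to the first slot of the next
 * subtree, where each subtree covers (1UL << shift) consecutive indices.
 */
static inline unsigned long
sketch_next_subtree(unsigned long index, unsigned int shift)
{
	unsigned long step;

	assert(shift < sizeof(unsigned long) * 8);
	step = 1UL << shift;
	/* Add one subtree span, then drop the low bits of the old position. */
	return ((index + step) & ~(step - 1));
}
```

With a span of 64 slots (shift of 6), stepping from index 5 lands on 64. Adding the span without masking would land on 69 and pass over slots 64 through 68.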
da1a8ac45b7dadae47b0a9758c5cf1f2ce5dcc3a 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Optimise xchg() to use atomic_swap_32() and atomic_swap_64().

Suggested by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic.h
71bdabd3d38930fc59ab6c660886fa7e02d08909 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Fix implementation of xchg() function macro in the LinuxKPI.
The exchange operation must be atomic.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic.h
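
Note (illustrative, not part of the commit): the two xchg() entries above both rest on the requirement that the store of the new value and the fetch of the old value happen as a single atomic step. The sketch below expresses that requirement with C11 <stdatomic.h> in user space. It is an assumption-laden stand-in for the kernel's atomic_swap_32(), which is not reproduced here, and sketch_xchg32 is a name made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically store "newval" into *p and return the
 * value that was there before, with no window between the load and the
 * store for another thread to slip into.
 */
static inline uint32_t
sketch_xchg32(_Atomic uint32_t *p, uint32_t newval)
{
	assert(p != NULL);
	return (atomic_exchange(p, newval));
}
```

A non-atomic version (read the old value, then write the new one) can lose an update if another thread writes between the two steps, which is the class of problem the fix describes.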
8fd267584d1b7edad5878fe615b302d91b774de9 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement support for radix_tree_for_each_slot() and radix_tree_exception()
in the LinuxKPI and use unsigned long type for the radix tree index.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/radix-tree.h
inuxkpi/common/src/linux_radix.c
dd427b674f15ed720fb62a7c3a237b89a4218a8c 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement the KMEM_CACHE() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/slab.h
09df133de6a924e939a4f19f01d482ac0ae93087 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Make the vm_fault structure in the LinuxKPI compatible with
newer versions of the Linux kernel. No functional change.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/mm.h
inuxkpi/common/src/linux_compat.c
51ba796bc1176e57b1175b48df06cd609d64d477 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement the rcu_dereference_raw() function macro.
Make sure all RCU dereferencing use the READ_ONCE() function macro.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/rcupdate.h
22e6306b4c5bef47f6abaea1332cf4fbdc18768f 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement __GFP_BITS_SHIFT and __GFP_BITS_MASK macros in the LinuxKPI.
Add compile time asserts to catch conflicts with native defines.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/gfp.h
572b6aadd98d52e2a2b312c4b0d223f073fab222 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement __list_del_entry() helper functions in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/list.h
f6937a20e98c2758f61af852708767cdefe74463 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement file_inode() and call_mmap() helper functions in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/fs.h
bbcb79ee0ecae17815c4a96b627304f5624b8500 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Refactor dentry structure into its own header file in the LinuxKPI similarly
to Linux. No functional change. Implement d_inode() helper function.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/dcache.h
inuxkpi/common/include/linux/fs.h
d0ba7ed6c7bee86c37b58d10022e951cb5cddfb3 18-Feb-2018 hselasky <hselasky@FreeBSD.org> Update the ktime type in the LinuxKPI to be a signed 64-bit integer similarly
to Linux, to avoid compilation issues. Implement ktime_get_real_seconds().

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
inuxkpi/common/include/linux/ktime.h
inuxkpi/common/src/linux_hrtimer.c
d433925d6b3a0565bb20aff0ed6d244b518f2d0b 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement spin_trylock_irq() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/spinlock.h
8c5ae2f1a1d17b13033bd7e01af3a0aa1c6d428b 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Stub more lockdep function macros in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/lockdep.h
c36c8dcaaa504a9c76b199e12f8ec2df340282f1 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement get_task_pid() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pid.h
4b7f081188a66c34bf4bc6549d62a368dad48947 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Allow the put_user() function macro to put constant values by using the
existing __put_user() macro.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/uaccess.h
09736ced840a3098b56c77a2b2774823a601589d 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement BUILD_BUG_ON_INVALID() function macro in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
9709dc1e99bd3363b9f9826d047c4e10f45e146b 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Add support for printk_ratelimit() function macro and improve the existing
printk_ratelimited() function macro to return a boolean stating if there
was a printout, true, or not, false.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/printk.h
f83f8b720f2431a989eb0976443ee2e0e8835868 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Add support for kref_read() function in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kref.h
2e7a47da99355c939081758b8c275496a631c164 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Add support for mmgrab() function in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm_types.h
233c5a4e6941743b0464cdee35dc6c5378c1bee9 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Add support for __percpu and __weak macros in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/compiler.h
70e796381361f89b3a483921f48a8fea4cb8dc4e 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Move the IRQ_RETVAL() and irqreturn definitions to irqreturn.h in the
LinuxKPI to be compatible with Linux. No functional change.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/interrupt.h
inuxkpi/common/include/linux/irqreturn.h
49f93b7a0641e08aaeed6c001e0b9b1635e53ad5 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Add checks for valid IRQ tag before setting up or tearing down an interrupt
handler in the LinuxKPI. This is needed when the interrupt handler is disabled
before freeing the interrupt.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/interrupt.h
86896d4142d237a1f129c1b4867e7db88ee1ff23 17-Feb-2018 hselasky <hselasky@FreeBSD.org> Compile fix for GCC in the LinuxKPI.

Older versions of GCC don't allow flexible array members in a union.
Use a zero length array instead.

MFC after: 1 week
Reported by: jbeich@
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic.h
9800e7c71b148e1708fd69b787afabecbc941a4c 16-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement mutex_trylock_recursive() in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mutex.h
2a76180fd73bb46f62cede934eed84755d755394 16-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement memdup_user_nul() in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/string.h
0317e3d27569dc74a549505af00d3764a2c8ba6c 16-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement tasklet_enable() and tasklet_disable() in the LinuxKPI.

MFC after: 1 week
Requested by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/interrupt.h
inuxkpi/common/src/linux_tasklet.c
94a13a99f3a472e9903f49589740fec8d4c2c30b 16-Feb-2018 hselasky <hselasky@FreeBSD.org> Implement enable_irq() and disable_irq() in the LinuxKPI.

MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/interrupt.h
5e016ca4a6d9e385669721d6c0d4458fb0d09aef 16-Feb-2018 hselasky <hselasky@FreeBSD.org> Allow the cmpxchg() macro in the LinuxKPI to work on pointers without
generating -Wint-conversion compiler warnings.

Requested by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic.h
624a2a708bfd9969ddc650db6555b1b3afd1508a 16-Feb-2018 emaste <emaste@FreeBSD.org> Rationalize license text on Linuxolator files

Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text. To avoid license proliferation switch to
the standard 2-clause FreeBSD license for those files where I have
permission from each of the listed copyright holders. Additional files
waiting on permission from others are listed in review D14210.

Approved by: kan, marcel, sos, rdivacky
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
inux/linux_file.c
inux/linux_ioctl.c
inux/linux_ipc.c
inux/linux_mib.c
inux/linux_signal.c
inux/linux_socket.c
inux/linux_stats.c
inux/linux_sysctl.c
b0572451d4fe9c242f091758b743f81442ab992d 15-Feb-2018 brooks <brooks@FreeBSD.org> Regen after r329322.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
657dc3fb700b031a3e43516dbcb00880bee5476c 15-Feb-2018 brooks <brooks@FreeBSD.org> Remove freebsd32_getdirentries(), it will be unused after the next
commit.
reebsd32/freebsd32_misc.c
8bf519dbe47fbaab26ae6b1887300ac4d01a7dfd 15-Feb-2018 brooks <brooks@FreeBSD.org> Revert r329323. I missed something in my testing.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
44981bb96df653cf333dd30bd04b49e273f9c82f 15-Feb-2018 brooks <brooks@FreeBSD.org> Regen after r329322: Fix getdirentries(2) under 32-bit compat.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14379
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
cab3d7b0d92e3f3bff1f66c7ff594f56f9242833 15-Feb-2018 brooks <brooks@FreeBSD.org> Fix getdirentries(2) under 32-bit compat.

The latest version of getdirentries (syscall 554) takes a pointer
to an off_t as the last argument. The old version which copies out
an int32_t was being used instead. Use the standard sys_getdirentries()
implementation instead.

Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14379
reebsd32/syscalls.master
068e7ccc44eca8ab661f920dfdca285d5162aeaf 13-Feb-2018 kib <kib@FreeBSD.org> linuxkpi: Do not leak pages on put.

When the owner of the wire reference releases the last reference, it
might be that the page was already attempted to be freed (but free
cannot be performed at that time due to wire). Check that the page
was removed from the object as the indicator of the free attempt and
finish the free operation if so.

Reported and tested by: Slava Shwartsman
Reviewed by: hselasky
Sponsored by: Mellanox Technologies
MFC after: 1 week
inuxkpi/common/include/linux/mm.h
ba27b5187b587e8884e2d94794d6b136e6ab75dc 12-Feb-2018 jeff <jeff@FreeBSD.org> Make v_wire_count a per-cpu counter(9) counter. This eliminates a
significant source of cache line contention from vm_page_alloc(). Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.

Reviewed by: markj
Discussed with: kib, glebius
Tested by: pho (earlier version)
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14273
inprocfs/linprocfs.c
inux/linux_misc.c
86ffa907ab795c106791efc815c3ac694e9faaef 07-Feb-2018 hselasky <hselasky@FreeBSD.org> Fix implementation of ktime_add_ns() and ktime_sub_ns() in the LinuxKPI to
actually return the computed result instead of the input value.

This is a regression issue after r289572.

Found by: gcc6
MFC after: 3 days
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/ktime.h
e67ec0d694fccc59064b89d121d5944787793e9e 06-Feb-2018 jeff <jeff@FreeBSD.org> Use per-domain locks for vm page queue free. Move paging control from
global to per-domain state. Protect reservations with the free lock
from the domain that they belong to. Refactor to make vm domains more
of a first class object.

Reviewed by: markj, kib, gallatin
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14000
inprocfs/linprocfs.c
5c9ea56c9897f1f23d450e6436fa36167953c3ee 05-Feb-2018 emaste <emaste@FreeBSD.org> Linuxolator whitespace cleanup

A version of each of the MD files by necessity exists for each CPU
architecture supported by the Linuxolator. Clean these up so that new
architectures do not inherit whitespace issues.

Clean up shared Linuxolator files while here.

Sponsored by: Turing Robotic Industries Inc.
inux/check_internal_locks.d
inux/linux_emul.c
inux/linux_event.c
inux/linux_file.h
inux/linux_fork.c
inux/linux_ioctl.c
inux/linux_ioctl.h
inux/linux_ipc.c
inux/linux_ipc.h
inux/linux_ipc64.h
inux/linux_misc.c
inux/linux_persona.h
inux/linux_signal.c
inux/linux_socket.c
inux/linux_socket.h
inux/linux_time.c
inux/linux_util.h
inux/stats_timing.d
inux/trace_futexes.d
59fcf4b24171436faa30ba7102bdee99f7e7b539 02-Feb-2018 brooks <brooks@FreeBSD.org> Add kern.ipc.{msqids,semsegs,sema} sysctls for FreeBSD32.

Stop leaking kernel pointers through these sysctls and make sure that the
padding in the structures is zeroed on allocation to avoid other leaks.

Reviewed by: gordon, kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D13459
reebsd32/freebsd32_ipc.h
c3f6808d105e617e0e556fb9b075798ff6eec42a 01-Feb-2018 hselasky <hselasky@FreeBSD.org> Fix some recent regressions after r328436 in the LinuxKPI:

1) The OPW() function macro should have the same return type as the
function it executes.
2) The DEVFS I/O-limit should be enforced for all character device reads
and writes.
3) The character device file handle should be passable, same as for
DEVFS based file handles.

Reported by: jbeich @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
20b17c69e02f4c56865fc7d917d86ee38f168f53 01-Feb-2018 hselasky <hselasky@FreeBSD.org> Make sure the LinuxKPI's internal ERESTARTSYS error code gets translated
into ERESTART for mmap and page fault calls as well.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
e90574eea90dce61f8d9ed80da30591a2681c801 31-Jan-2018 hselasky <hselasky@FreeBSD.org> Properly implement the cond_resched() function macro in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
59f27534ccb4676b118e2ad85bb1d6de9d7a5285 29-Jan-2018 bdrewery <bdrewery@FreeBSD.org> Don't use an .OBJDIR for 'make sysent'.

Reported by: emaste, jhb
Sponsored by: Dell EMC
loudabi32/Makefile
loudabi64/Makefile
reebsd32/Makefile
e8d8aab8d081f83a6081a8713812851a60106684 26-Jan-2018 hselasky <hselasky@FreeBSD.org> Decouple Linux files from the belonging character device right after open
in the LinuxKPI. This is done by calling finit() just before returning a magic
value of ENXIO in the "linux_dev_fdopen" function.

The Linux file structure should mimic the BSD file structure as much as
possible. This patch decouples the Linux file structure from the belonging
character device right after the "linux_dev_fdopen" function has returned.
This fixes an issue which allows a Linux file handle to exist after a
character device has been destroyed and removed from the directory index
of /dev. Only when the reference count of the BSD file handle reaches zero,
the Linux file handle is destroyed. This fixes use-after-free issues related
to accessing the Linux file structure after the character device has been
destroyed.

While at it add a missing NULL check for non-present file operation.
Calling a NULL pointer will result in a segmentation fault.

Reviewed by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
589a67838d786decf253f5145faa75e6d67a157f 26-Jan-2018 jhibbits <jhibbits@FreeBSD.org> Minimal change to build linuxkpi on architectures with physical addresses larger
than virtual

Summary:
Some architectures have physical/bus addresses that are much larger
than virtual addresses. This change just quiets a warning, as DMAP is not used
on those architectures, and on 64-bit platforms uintptr_t is the same size as
vm_paddr_t and void *.

Reviewed By: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14043
inuxkpi/common/src/linux_page.c
af27af95214e232d8e6deb4f4a32b4f3463f16ea 24-Jan-2018 hselasky <hselasky@FreeBSD.org> Properly implement the "id" callback argument in the "idr_for_each" function
in the LinuxKPI. The old implementation assumed only one IDR layer was present.
Take additional IDR layers into account when computing the "id" value.

MFC after: 1 week
Found by: Karthik Palanichamy <karthikp@chelsio.com>
Tested by: Karthik Palanichamy <karthikp@chelsio.com>
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_idr.c
ced875130d74498a2efed66b4800f0a211ad5993 21-Jan-2018 pfg <pfg@FreeBSD.org> Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by: wosch
PR: 225197
dis/subr_ndis.c
49a1f464122f4f2a83a3e4640a7d98681d507fcc 19-Jan-2018 nwhitehorn <nwhitehorn@FreeBSD.org> Define PHYS_TO_DMAP() and DMAP_TO_PHYS() as panics on the architectures
(i386 and arm) that never implement them. This allows the removal of
#ifdef PHYS_TO_DMAP on code otherwise protected by a runtime check on
PMAP_HAS_DMAP. It also fixes the build on ARM and i386 after I forgot an
#ifdef in r328168.

Reported by: Milan Obuch
Pointy hat to: me
inuxkpi/common/src/linux_page.c
e79f2b9178164cd2a849cf2496f988d5c7d67fa3 19-Jan-2018 nwhitehorn <nwhitehorn@FreeBSD.org> Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by: kib
Suggestions from: marius, alc, kib
Runtime tested on: amd64, powerpc64, powerpc, mips64
inuxkpi/common/src/linux_page.c
f35549b5edcb20b858da9b989c168877cffcd080 15-Jan-2018 pfg <pfg@FreeBSD.org> ndis: make some use of mallocarray(9).

Focus on code where we are doing multiplications within malloc(9). None of
these are likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the surrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
dis/subr_ndis.c
058d03378acc2ff02fb8a89f983fbf3e3e9faea2 12-Jan-2018 jeff <jeff@FreeBSD.org> Regenerate auto-generated files
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
94c7af8ca28aeb166fa5b893e875b3bb50319106 12-Jan-2018 jeff <jeff@FreeBSD.org> Implement 'domainset', a cpuset based NUMA policy mechanism. This allows
userspace to control NUMA policy administratively and programmatically.

Implement domainset based iterators in the page layer.

Remove the now legacy numa_* syscalls.

Clean up some header pollution created by having seq.h in proc.h.

Reviewed by: markj, kib
Discussed with: alc
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D13403
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
4292dd94f2b0920844eb108a095514e0c7926abf 10-Jan-2018 pfg <pfg@FreeBSD.org> linuxkpi: Simplify kmalloc_array.

kmalloc_array appears to be what we call mallocarray(9).
inuxkpi/common/include/linux/slab.h
fd459f1474747c82eeb7be3f8057e81cd0a004bc 07-Jan-2018 ed <ed@FreeBSD.org> Use mallocarray(9) in CloudABI kernel code where possible.

Submitted by: pfg@
loudabi32/cloudabi32_sock.c
loudabi64/cloudabi64_sock.c
51600b48e4df89cc6e65f6d5d4426f7ba0d7845c 07-Jan-2018 kp <kp@FreeBSD.org> linuxkpi: Implement kcalloc() based on mallocarray()

This means we now get integer overflow protection, which Linux code
might expect as it is also provided by kcalloc() in Linux.
inuxkpi/common/include/linux/slab.h
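
Note (illustrative, not part of the commit): the "integer overflow protection" mentioned above refers to checking that the element count times the element size fits in a size_t before allocating. The user-space sketch below shows that check. The name sketch_alloc_array and the choice to return NULL on overflow are assumptions for this example; how mallocarray(9) itself reacts to an overflow is not stated in this entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a zeroed array of "n" elements of "size"
 * bytes each, refusing requests whose total size would wrap around.
 */
static void *
sketch_alloc_array(size_t n, size_t size)
{
	void *p;

	assert(size != 0);		/* element size must be non-zero */
	if (n > SIZE_MAX / size)	/* n * size would overflow size_t */
		return (NULL);
	p = malloc(n * size);
	if (p != NULL)
		memset(p, 0, n * size);
	return (p);
}
```

Without the check, a wrapped product would produce a small allocation that the caller then treats as a large array.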
80081475f13dbec50c9fbf7c4faa1ca2892da501 04-Jan-2018 ed <ed@FreeBSD.org> Allow timed waits with relative timeouts on locks and condvars.

Even though pthreads doesn't support this, there are various alternative
APIs that use this. For example, uv_cond_timedwait() accepts a relative
timeout. So does Rust's std::sync::Condvar::wait_timeout().

Though I personally think that relative timeouts are bad (due to
imprecision for repeated operations), it does seem that people want
this. Extend the existing futex functions to keep track of whether an
absolute timeout is used in a boolean flag.

MFC after: 1 month
loudabi/cloudabi_futex.c
loudabi/cloudabi_util.h
loudabi32/cloudabi32_poll.c
loudabi64/cloudabi64_poll.c
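
The idea of tracking "whether an absolute timeout is used in a boolean flag"
can be sketched as below. The function name, the nanosecond unit and the
saturating behaviour are assumptions made for illustration, not the CloudABI
futex code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn a timeout into an absolute deadline.
 * All values are nanoseconds on the same clock.
 */
uint64_t
timeout_to_deadline(uint64_t now, uint64_t timeout, bool abstime)
{
	if (abstime)
		return (timeout);	/* already a deadline */
	if (timeout > UINT64_MAX - now)
		return (UINT64_MAX);	/* saturate instead of wrapping */
	return (now + timeout);
}
```

Recomputing now + timeout on every retry is what makes relative timeouts drift
in repeated operations, which is the imprecision the commit message refers to.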
0cf83abb130ad2ea55be9b2e87f104283f018776 19-Dec-2017 shurd <shurd@FreeBSD.org> Update Matthew Macy contact info

Email address has changed, uses consistent name (Matthew, not Matt)

Reported by: Matthew Macy <mmacy@mattmacy.io>
Differential Revision: https://reviews.freebsd.org/D13537
inuxkpi/common/src/linux_page.c
inuxkpi/common/src/linux_rcu.c
c6fbed1a3aad789bccfbd38fd9430398031fe4c8 28-Nov-2017 brooks <brooks@FreeBSD.org> Disable vim syntax highlighting.

Vim's default pick doesn't understand that ';' is a comment character
and the result looks horrible.

Reviewed by: emaste
reebsd32/syscalls.master
2a03579eb7145d8a82cc4bedb4307626f7c1b0ba 27-Nov-2017 pfg <pfg@FreeBSD.org> sys/compat: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
reebsd32/freebsd32.h
reebsd32/freebsd32_capability.c
reebsd32/freebsd32_ipc.h
reebsd32/freebsd32_misc.c
reebsd32/freebsd32_misc.h
reebsd32/freebsd32_signal.h
reebsd32/freebsd32_util.h
a32/ia32_signal.h
a32/ia32_sysvec.c
a32/ia32_util.h
insysfs/linsysfs.c
inux/linux_dtrace.h
inux/linux_emul.c
inux/linux_emul.h
inux/linux_file.c
inux/linux_file.h
inux/linux_fork.c
inux/linux_futex.c
inux/linux_futex.h
inux/linux_getcwd.c
inux/linux_ioctl.c
inux/linux_ioctl.h
inux/linux_ipc.c
inux/linux_ipc.h
inux/linux_mib.c
inux/linux_mib.h
inux/linux_misc.c
inux/linux_misc.h
inux/linux_signal.c
inux/linux_signal.h
inux/linux_socket.c
inux/linux_socket.h
inux/linux_stats.c
inux/linux_sysctl.c
inux/linux_sysproto.h
inux/linux_time.c
inux/linux_uid16.c
inux/linux_util.c
inux/linux_util.h
etbsd/dvcfg.h
bac78aa2a49020f1e67b0d1b05e5daf2213879ae 25-Nov-2017 jhb <jhb@FreeBSD.org> Decode kevent structures logged via ktrace(2) in kdump.

- Add a new KTR_STRUCT_ARRAY ktrace record type which dumps an array of
structures.

The structure name in the record payload is preceded by a size_t
containing the size of the individual structures. Use this to
replace the previous code that dumped the kevent arrays for
kevent(). kdump is now able to decode the kevent structures rather
than dumping their contents via a hexdump.

One change from before is that the 'changes' and 'events' arrays are
not marked with separate 'read' and 'write' annotations in kdump
output. Instead, the first array is the 'changes' array, and the
second array (only present if kevent doesn't fail with an error) is
the 'events' array. For kevent(), empty arrays are denoted by an
entry with an array containing zero entries rather than no record.

- Move kevent decoding tables from truss to libsysdecode.

This adds three new functions to decode members of struct kevent:
sysdecode_kevent_filter, sysdecode_kevent_flags, and
sysdecode_kevent_fflags.

kdump uses these helper functions to pretty-print kevent fields.

- Move structure definitions for freebsd11 and freebsd32 kevent
structures to <sys/event.h> so that they can be shared with userland.
The 32-bit structures are only exposed if _WANT_KEVENT32 is defined.
The freebsd11 structures are only exposed if _WANT_FREEBSD11_KEVENT is
defined. The 32-bit freebsd11 structure requires both.

- Decode freebsd11 kevent structures in truss for the compat11.kevent()
system call.

- Log 32-bit kevent structures via ktrace for 32-bit compat kevent()
system calls.

- While here, constify the 'void *data' argument to ktrstruct().

Reviewed by: kib (earlier version)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12470
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
9b06c6070c5c2a454f1e1ae3a3fedad9f8f1ebc4 24-Nov-2017 ed <ed@FreeBSD.org> Don't let cpu_set_syscall_retval() clobber exec_setregs().

Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).

Though this doesn't seem to be problematic for x86 most of the time (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.

Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.

To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.

Reviewed by: kib, jhibbits
Differential Revision: https://reviews.freebsd.org/D13180
inux/linux_emul.c
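
The effect of returning EJUSTRETURN can be sketched with a heavily simplified
return hook. Everything here (struct sk_regs, sk_set_syscall_retval, the
SK_EJUSTRETURN value) is a hypothetical stand-in, not the machine-dependent
cpu_set_syscall_retval() code.

```c
#include <stdbool.h>

/* Assumed value for this sketch only. */
#define	SK_EJUSTRETURN	(-2)

/* Hypothetical, heavily simplified register file. */
struct sk_regs {
	long	ret;		/* return-value register */
	bool	error_flag;	/* "carry" style error indicator */
};

/*
 * Hypothetical analogue of a syscall return hook: store the result in
 * the registers unless the handler asked for them to be left alone.
 */
void
sk_set_syscall_retval(struct sk_regs *r, int error, long retval)
{
	if (error == SK_EJUSTRETURN)
		return;		/* registers were already set up by the handler */
	r->error_flag = (error != 0);
	r->ret = (error != 0) ? error : retval;
}
```

With this shape, whatever an exec-style handler wrote into the registers
survives the return path untouched.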
4736ccfd9c3411d50371d7f21f9450a47c19047e 20-Nov-2017 pfg <pfg@FreeBSD.org> sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
9da7bdde061c43b87cf9bb2852984b78e292b1e6 18-Nov-2017 pfg <pfg@FreeBSD.org> spdx: initial adoption of licensing ID tags.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes: yes
Differential Revision: https://reviews.freebsd.org/D13133
inprocfs/linprocfs.c
dis/cfg_var.h
dis/hal_var.h
dis/kern_ndis.c
dis/kern_windrv.c
dis/ndis_var.h
dis/ntoskrnl_var.h
dis/pe_var.h
dis/resource_var.h
dis/subr_hal.c
dis/subr_ndis.c
dis/subr_ntoskrnl.c
dis/subr_pe.c
dis/subr_usbd.c
dis/usbd_var.h
2d36b76cdfe2dbdc3096b26125e43716b3a5039b 15-Nov-2017 gordon <gordon@FreeBSD.org> Properly bzero kldstat structure to prevent kernel information leak.

Submitted by: kib
Reported by: TJ Corley
Security: CVE-2017-1088
reebsd32/freebsd32_misc.c
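
The general pattern behind this kind of fix, zeroing a structure before
filling it so that no stale bytes are copied out, might look like the sketch
below. struct stat_reply and fill_reply are hypothetical and are not the
kldstat code.

```c
#include <string.h>

/* Hypothetical reply record; most ABIs pad between 'id' and 'size'. */
struct stat_reply {
	char	id;
	long	size;
	char	name[16];
};

/*
 * Fill a reply so that the padding and the unused tail of 'name' are
 * cleared before the structure is handed to the caller.
 */
void
fill_reply(struct stat_reply *r, char id, long size, const char *name)
{
	memset(r, 0, sizeof(*r));	/* clear everything first */
	r->id = id;
	r->size = size;
	strncpy(r->name, name, sizeof(r->name) - 1);
}
```

Without the memset(), the bytes after a short name would carry whatever was
previously in that memory.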
160a26a6fa1ac57b442a8017b3744d576e9816cb 13-Nov-2017 hselasky <hselasky@FreeBSD.org> Properly handle the case where the linux_cdev_handle_insert() function
in the LinuxKPI returns NULL. This happens when the VM area's private
data handle already exists and could cause a so-called NULL pointer
dereferencing issue prior to this fix.

Found by: greg@unrelenting.technology
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
cbe24d82fa72c8a282b46331e2e901e451ccfd10 11-Nov-2017 mjg <mjg@FreeBSD.org> Use pfind_any in linux_rt_sigqueueinfo and kern_sigqueue
inux/linux_signal.c
757e3c2aff052057d541ca42852c09c71161219a 11-Nov-2017 hselasky <hselasky@FreeBSD.org> Remove release and acquire semantics when accessing the "state" field of the
LinuxKPI task struct. Change type of "state" variable from "int" to
"atomic_t" to simplify code and avoid unnecessary casting.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_current.c
inuxkpi/common/src/linux_schedule.c
3cbfbde84ecb1fa57d316500a5f089924bf56c26 11-Nov-2017 hselasky <hselasky@FreeBSD.org> Mask away return codes from del_timer() and del_timer_sync() because
they are not the same as in Linux.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/timer.h
9c20efc8f10bd1ad54d9bb2385b957b612cb3ce2 10-Nov-2017 hselasky <hselasky@FreeBSD.org> Remove some unneeded comments in the LinuxKPI. Use the Linux source tree
to look up documentation for the functions implemented in the LinuxKPI
instead.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/list.h
e31f084c58cfda39c584b5d43f968a47aa4a722f 08-Nov-2017 ed <ed@FreeBSD.org> Upgrade to CloudABI v0.17.

Compared to the previous version, v0.16, there are a couple of minor
changes:

- CLOUDABI_AT_PID: Process identifiers for CloudABI processes.

Initially, BSD process identifiers weren't exposed inside the runtime,
due to them being pretty much useless inside of a cluster computing
environment. When jobs are scheduled across systems, the BSD process
number doesn't act as an identifier. Even on individual systems they
may recycle relatively quickly.

With this change, the kernel will now generate a UUIDv4 when executing
a process. These UUIDs can be obtained within the process using
program_getpid(). Right now, FreeBSD will not attempt to store this
value. This should of course happen at some point in time, so that it
may be printed by administration tools.

- Removal of some unused structure members for polling.

With the polling framework being simplified/redesigned, it turns out
some of the structure fields were not used by the C library. We can
remove these to keep things nice and tidy.

Obtained from: https://github.com/NuxiNL/cloudabi
loudabi32/cloudabi32_module.c
loudabi32/cloudabi32_poll.c
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_systrace_args.c
loudabi64/cloudabi64_module.c
loudabi64/cloudabi64_poll.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_systrace_args.c
33b01cd51dae3938af2658e156e7414220335120 08-Nov-2017 hselasky <hselasky@FreeBSD.org> Make the dma_alloc_coherent() function in the LinuxKPI NULL safe with regard
to the "dev" argument.

Submitted by: Krishnamraju Eraparaju @ Chelsio
Sponsored by: Chelsio Communications
MFC after: 1 week
inuxkpi/common/include/linux/dma-mapping.h
0d7b41d32f33fa5cfbf2c987b92f9b9046b85e84 03-Nov-2017 hselasky <hselasky@FreeBSD.org> Remove redundant dev->si_drv1 NULL checks in the LinuxKPI.
This pointer is checked during the linux_dev_open() callback and does
not need to be NULL checked again. It should always be set for
character devices belonging to the "linuxcdevsw" and technically
there is no need to NULL check this pointer at all.

Suggested by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
ea26035abf9ef05b7cfc6864e2cd8fb3e263de98 01-Nov-2017 hselasky <hselasky@FreeBSD.org> Implement ioread16be() in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/io.h
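
What a big-endian 16-bit read has to produce can be shown with a
byte-by-byte sketch. read16be is a hypothetical name; a real register
accessor would normally perform a single 16-bit access and swap the bytes,
so this only illustrates the byte order.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a big-endian 16-bit read: assemble the value
 * byte by byte so the result is the same on any host byte order.
 */
uint16_t
read16be(const volatile void *addr)
{
	const volatile uint8_t *p = addr;

	return ((uint16_t)((p[0] << 8) | p[1]));
}
```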
0b152443f50c2637d4517fdfc260718e1bdd33ac 01-Nov-2017 hselasky <hselasky@FreeBSD.org> Unconditionally include "opt_inet6.h" in the LinuxKPI.
This makes sure the INET6 macro gets properly defined,
also for kernel module builds.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/in6.h
54594d8cf830a354005c2c1626bd342f5d9de82e 27-Oct-2017 obrien <obrien@FreeBSD.org> Update comment to match r177997 & r178036 changes.
inux/linux_file.c
9f5ab27b8397136386c8923e61657b0f5cedd81c 18-Oct-2017 ed <ed@FreeBSD.org> Import the latest CloudABI definitions, version 0.16.

The most important change in this release is the removal of the
poll_fd() system call; CloudABI's equivalent of kevent(). Though I think
that kqueue is a lot saner than many of its alternatives, our
experience is that emulating this system call on other systems
accurately isn't easy. It has become a complex API, even though I'm not
convinced this complexity is needed. This is why we've decided to take a
different approach, by looking one layer up.

We're currently adding an event loop to CloudABI's C library that is API
compatible with libuv (except when incompatible with Capsicum).
Initially, this event loop will be built on top of plain inefficient
poll() calls. Only after this is finished, we'll work our way backwards
and design a new set of system calls to optimize it.

Interesting challenges will include integrating asynchronous I/O into
such a system call API. libuv currently doesn't use aio(4) on Linux/BSD, due
to it being unreliable and having undesired semantics.

Obtained from: https://github.com/NuxiNL/cloudabi
loudabi/cloudabi_fd.c
loudabi32/cloudabi32_poll.c
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_syscall.h
loudabi32/cloudabi32_syscalls.c
loudabi32/cloudabi32_sysent.c
loudabi32/cloudabi32_systrace_args.c
loudabi64/cloudabi64_poll.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_syscall.h
loudabi64/cloudabi64_syscalls.c
loudabi64/cloudabi64_sysent.c
loudabi64/cloudabi64_systrace_args.c
d0053c9a3497be9b2c7a67d1d4cdecc4372fe9bd 15-Oct-2017 tijl <tijl@FreeBSD.org> Add information needed by Linux libdrm 2.4.74 (shipped with CentOS 7.4).

Create a config file for PCI devices that exposes their configuration
space. Only fields needed by libdrm are filled in (vendor, device,
revision, subvendor and subdevice).

Link /sys/class/drm/card%d/device to the PCI device directory.
insysfs/linsysfs.c
50660d4b5aae676da405a31127aa0bf97fabd822 15-Oct-2017 tijl <tijl@FreeBSD.org> Set DEVNAME to dri/card%d. This works with both in-tree drm and drm-next
and is also the value used on Linux.

Tested by: Greg V <greg@unrelenting.technology>
insysfs/linsysfs.c
62d70cdcc930868b382db4cbe5662e439b0eb3c8 15-Oct-2017 tijl <tijl@FreeBSD.org> Add special handling for current in-tree drm devices, like r323692 added
for drm-next.
inux/linux_util.c
63bd2db3e1329b2f90a1b97e13e50ef6b57812bc 15-Oct-2017 tijl <tijl@FreeBSD.org> Use sizeof instead of strlen on string constants. The compiler doesn't
optimise the strlen calls away with -ffreestanding.
inux/linux_util.c
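
The sizeof-for-strlen substitution works because the size of a string literal
is a compile-time constant that includes the terminating NUL. A minimal
sketch, with the hypothetical names CONST_STRLEN and has_dri_prefix:

```c
#include <stddef.h>
#include <string.h>

/* Length of a string literal, computed at compile time (excludes the NUL). */
#define	CONST_STRLEN(s)	(sizeof(s) - 1)

/* Hypothetical prefix check in the spirit of device-name matching. */
int
has_dri_prefix(const char *path)
{
	return (strncmp(path, "dri/card", CONST_STRLEN("dri/card")) == 0);
}
```

The macro is only valid for literals and arrays; applied to a pointer it
would yield the pointer size minus one.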
e78c994ec82c983af573fbd7f42bde3d0c424350 13-Oct-2017 markj <markj@FreeBSD.org> Make the PHOLD in linux_wait_event_common() unconditional.

After some in-progress work is committed, this would otherwise be the only
instance of #if(n)def NO_SWAPPING in the tree. Moreover, the requisite
opt_vm.h include was missing, so the PHOLD/PRELE calls were always being
compiled in anyway.

MFC after: 1 week
inuxkpi/common/src/linux_schedule.c
573665c410ab11814a14632101c4ef236c7ba849 13-Oct-2017 hselasky <hselasky@FreeBSD.org> Don't call selrecord() outside the select system call in the LinuxKPI, because
then td->td_sel is NULL and this will result in a segfault inside selrecord().
This happens when only using kqueue() to poll for read and write events.
If select() and kqueue() are mixed there won't be a segfault.

Reported by: Johannes Lundberg
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
a443158f576342201c77563c673e6be7d1402655 12-Oct-2017 emaste <emaste@FreeBSD.org> regen freebsd32_sysent.c after r324564 (freebsd32_posix_fallocate)
reebsd32/freebsd32_sysent.c
32eb7d082759ffb518d105ef6c4d8ebeab4cea3a 12-Oct-2017 emaste <emaste@FreeBSD.org> allow posix_fallocate in 32-bit compat capability mode

Reported by: kib
MFC after: 2 weeks
MFC with: r324560
Sponsored by: The FreeBSD Foundation
reebsd32/capabilities.conf
a9152a7f997253f41a4d0ebedb5295e98de2517d 09-Oct-2017 glebius <glebius@FreeBSD.org> Shorten list of arguments to mbuf external storage freeing function.

All of these arguments are stored in m_ext, so there is no reason
to pass them in the argument list. Not all functions need the second
argument, some don't even need the first one. The second argument
lives in the next cache line, so not dereferencing it is a performance
gain. This was discovered in sendfile(2), which will be covered by
next commits.

The second goal of this commit is to bring even more flexibility
to m_ext mbufs, making it possible to create more fields in m_ext, opaque to
the generic mbuf code, and potentially set and dereferenced by
subsystems.

Reviewed by: gallatin, kbowling
Differential Revision: https://reviews.freebsd.org/D12615
dis/kern_ndis.c
dis/ndis_var.h
28838c86830f228ff7206ef020d50418930803a2 04-Oct-2017 markj <markj@FreeBSD.org> Add get_random_{int,long} to the LinuxKPI.

Fix some whitespace bugs while here.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D12588
inuxkpi/common/include/linux/random.h
3a9dfc3d72ce35402718fd013fe67ab13eb3e6fd 04-Oct-2017 hselasky <hselasky@FreeBSD.org> Make sure the timer belonging to the delayed work in the LinuxKPI
gets drained before invoking the work function. Else the timer
mutex may still be in use which can lead to use-after-free situations,
because the work function might free the work structure before returning.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/workqueue.h
inuxkpi/common/src/linux_work.c
33c5477d0edbb3f4e76dee637d969d26c6bd38eb 24-Sep-2017 pfg <pfg@FreeBSD.org> Small style(9) issue: spaces vs TAB.
inux/linux_stats.c
10ef676c4bbe7379de1f3687444e4311a7d872e2 22-Sep-2017 hselasky <hselasky@FreeBSD.org> Add support for 32-bit compatibility IOCTLs in the LinuxKPI.

Bump the FreeBSD version to force recompilation of external
kernel modules due to structure change.

PR: 222504
Submitted by: Greg V <greg@unrelenting.technology>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/fs.h
inuxkpi/common/src/linux_compat.c
10c470434ca41fbb953aa8e6fe2d393c66ce57f6 18-Sep-2017 rlibby <rlibby@FreeBSD.org> linsysfs: quiet gcc -Wformat after r323692

Reviewed by: cem
Sponsored by: Dell EMC Isilon
insysfs/linsysfs.c
59d84271534db60758fbc21d3c96d5d6ca2c0cdd 18-Sep-2017 cem <cem@FreeBSD.org> linsysfs(5): Fix two unrelated issues

1. Swap the order of device_get_ivars with device_get_devclass and devclass
name validation. This bug was introduced in r323692.

2. Error check device_get_children and free the returned list. This bug was
introduced in the original linsysfs commit.

Reported by: Oleg V. Nauman <oleg AT theweb.org.ua>, hselasky (1); hselasky (2)
Reviewed by: hselasky
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12407
insysfs/linsysfs.c
f48e4f4e392226f5f7ee5c2231852af08943aafc 18-Sep-2017 hselasky <hselasky@FreeBSD.org> The LinuxKPI atomics have neither acquire nor release semantics unless
specified. Fix code to use READ_ONCE() and WRITE_ONCE() where appropriate.

Suggested by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic-long.h
inuxkpi/common/include/asm/atomic.h
inuxkpi/common/include/asm/atomic64.h
inuxkpi/common/include/linux/bitops.h
inuxkpi/common/src/linux_tasklet.c
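
A once-only access without any ordering guarantee is commonly written as a
volatile access. The sketch below uses hypothetical SK_ names and the
GCC/Clang __typeof__ extension; it is not the LinuxKPI definition of
READ_ONCE() or WRITE_ONCE().

```c
/* Force exactly one load or store; no acquire or release ordering implied. */
#define	SK_READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define	SK_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

int sk_flag;	/* hypothetical shared variable */

void
sk_publish(int v)
{
	SK_WRITE_ONCE(sk_flag, v);
}

int
sk_peek(void)
{
	return (SK_READ_ONCE(sk_flag));
}
```

Code that needs ordering against other memory accesses still has to add an
explicit barrier or use an acquire/release primitive.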
46b838884250d223d34bf76b82e073d96237df84 18-Sep-2017 hselasky <hselasky@FreeBSD.org> Only wire pages in the LinuxKPI instead of holding and wiring them.
This prevents the page daemon from regularly scanning the held pages.

Suggested by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm.h
inuxkpi/common/src/linux_page.c
cb9941733ef933c78cf2f6984e44144ebdf2a842 18-Sep-2017 hselasky <hselasky@FreeBSD.org> Add support for shared memory functions to the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/fs.h
inuxkpi/common/src/linux_page.c
09ad0b962f3029e47b3f430948933b6fe066ccdf 17-Sep-2017 cem <cem@FreeBSD.org> linsysfs(5): Add support for recent libdrm

Expose more information about PCI devices (and GPUs in particular) via
linsysfs to libdrm.

This allows unmodified modern 64-bit Linux libdrm to work, which allows
modern Linux Mesa to work. The submitter reports that he tested the change
with an Ubuntu 16.04 chroot + amdgpu from graphics/drm-next-kmod.

PR: 222375
Submitted by: Greg V <greg AT unrelenting.technology>
insysfs/linsysfs.c
inux/linux_util.c
5bc2b511b11c99adb86387f5e96a7e6581ebb029 09-Sep-2017 hselasky <hselasky@FreeBSD.org> Only search the scope ID in ip6_find_dev() for IPv6 addresses which
have a scope ID. Change size of the searched scope ID to the full
16 bits. There can typically be more than 255 interfaces.

Suggested by: ae @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/inetdevice.h
a4de8e71eef6b047775edba1dd39c3ea367cafa5 09-Sep-2017 hselasky <hselasky@FreeBSD.org> Resolve IPv6 scope ID issues when using ip6_find_dev() in the LinuxKPI.

Work around the problem that ifa_ifwithaddr() also matches the scope ID of
the IPv6 address when searching for a matching IPv6 address. For now
simply try all valid scope IDs until a match is found.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/inetdevice.h
24eb9f47135d89569c0fa92c46e5e82532249808 09-Sep-2017 hselasky <hselasky@FreeBSD.org> Properly implement poll_wait() in the LinuxKPI. This prevents direct
use of the linux_poll_wakeup() function from unsafe contexts, which
can lead to use-after-free issues.

Instead of calling linux_poll_wakeup() directly use the wake_up()
family of functions in the LinuxKPI to do this.

Bump the FreeBSD version to force recompilation of external kernel modules.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/poll.h
inuxkpi/common/src/linux_compat.c
8d8519606aa28d7705180d8642b5010f9680b363 09-Sep-2017 hselasky <hselasky@FreeBSD.org> Add more sanity checks to linux_fget() in the LinuxKPI. This prevents
returning pointers to file descriptors which were not created by the
LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/file.h
918d59433167f3b0dcbb7954c9181a1a77837f16 08-Sep-2017 sobomax <sobomax@FreeBSD.org> Correct bintime32 declaration: uint32_t sec -> time32_t sec.

Submitted by: jhb
MFC after: 1 month
reebsd32/freebsd32.h
c716ebbc67e045404c1a2c3a0d2e6759cb80d45a 07-Sep-2017 sobomax <sobomax@FreeBSD.org> In the recvmsg32() system call iterate over returned structure(s)
and convert any messages of types SCM_BINTIME, SCM_TIMESTAMP,
SCM_REALTIME and SCM_MONOTONIC from 64-bit to its 32-bit
representation. Otherwise we either run out of user-supplied
buffer to copy those out resulting in the MSG_CTRUNC or simply
return values that the userland 32-bit code is not going
to parse correctly. This fixes at least two regression tests
failing to function properly in 32-bit compat mode:

tools/regression/sockets/udp_pingpong
tools/regression/sockets/unix_cmsg

PR: kern/222039
MFC after: 30 days
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
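
The per-message conversion described above amounts to narrowing each field of
a 64-bit timestamp into its 32-bit counterpart. The struct and function names
below are hypothetical; the real code also has to rewrite the control message
lengths, which this sketch leaves out.

```c
#include <stdint.h>

/* Hypothetical 64-bit and 32-bit timestamp layouts. */
struct tv64 {
	int64_t	tv_sec;
	int64_t	tv_usec;
};

struct tv32 {
	int32_t	tv_sec;
	int32_t	tv_usec;
};

void
tv64_to_tv32(const struct tv64 *in, struct tv32 *out)
{
	out->tv_sec = (int32_t)in->tv_sec;	/* does not fit past 2038 */
	out->tv_usec = (int32_t)in->tv_usec;
}
```

The converted structure is half the size of the original, which is why
copying the 64-bit form into a buffer sized by a 32-bit program can run out
of space.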
d1d571879f9c38f330db21a3cce805c707a2d28c 05-Sep-2017 ed <ed@FreeBSD.org> Merge pipes and socket pairs.

Now that CloudABI's sockets API has been changed to be addressless and
only connected socket instances are used (e.g., socket pairs), they have
become fairly similar to pipes. The only difference on CloudABI is that
socket pairs additionally support shutdown(), send() and recv().

To simplify the ABI, we've therefore decided to remove pipes as a
separate file descriptor type and just let pipe() return a socket pair
of type SOCK_STREAM. S_ISFIFO() and S_ISSOCK() are now defined
identically.
loudabi/cloudabi_fd.c
loudabi/cloudabi_file.c
a6497b6a8cd1cb66c4fda10863178322e327fa4a 30-Aug-2017 sobomax <sobomax@FreeBSD.org> Add proper support for the md_label into md(4) ioctl compat layer.
While I am here, declare struct md_ioctl32 as packed which allows
us to stop playing tricks with sizeof(md_ioctl32)+y as well as
simplifies md_pad handling. Both were necessary because of different
alignment preferences on amd64 vs i386.

MFC after: 4 weeks
reebsd32/freebsd32_ioctl.c
reebsd32/freebsd32_ioctl.h
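
Why packing removes the need for size adjustments can be seen with a
two-member structure. The structures below are hypothetical and use the
GCC/Clang packed attribute; they are not struct md_ioctl32.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the 64-bit member's alignment depends on the ABI. */
struct req_natural {
	uint32_t	version;
	uint64_t	size;
};

/* Packed layout: no padding, so the size is the same on every ABI. */
struct req_packed {
	uint32_t	version;
	uint64_t	size;
} __attribute__((__packed__));

_Static_assert(sizeof(struct req_packed) == 12, "packed size is fixed");
_Static_assert(offsetof(struct req_packed, size) == 4, "no padding");
```

On amd64 the natural layout is 16 bytes, while i386 aligns the 64-bit member
to 4 bytes and yields 12; the packed variant matches the i386 layout on both.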
5301a361dcbfa8f04113c34f90bfa7936aaa7ddb 30-Aug-2017 ed <ed@FreeBSD.org> Complete the CloudABI networking refactoring.

Now that all of the packaged software has been adjusted to either use
Flower (https://github.com/NuxiNL/flower) for making incoming/outgoing
network connections or can have connections injected, there is no longer
need to keep accept() around. It is now a lot easier to write networked
services that are address family independent, dual-stack, testable, etc.

Remove all of the bits related to accept(), but also to
getsockopt(SO_ACCEPTCONN).
loudabi/cloudabi_fd.c
loudabi/cloudabi_sock.c
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_syscall.h
loudabi32/cloudabi32_syscalls.c
loudabi32/cloudabi32_sysent.c
loudabi32/cloudabi32_systrace_args.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_syscall.h
loudabi64/cloudabi64_syscalls.c
loudabi64/cloudabi64_sysent.c
loudabi64/cloudabi64_systrace_args.c
7558b39c55afa3985e60ecfd46209abaa0ad3dda 25-Aug-2017 ed <ed@FreeBSD.org> Sync CloudABI compatibility against the latest upstream version (v0.13).

With Flower (CloudABI's network connection daemon) becoming more
complete, there is no longer any need for creating any unconnected
sockets. Socket pairs in combination with file descriptor passing is all
that is necessary, as that is what is used by Flower to pass network
connections from the public internet to listening processes.

Remove all of the kernel bits that were used to implement socket(),
listen(), bindat() and connectat(). In principle, accept() and
SO_ACCEPTCONN may also be removed, but there are still some consumers
left.

Obtained from: https://github.com/NuxiNL/cloudabi
MFC after: 1 month
loudabi/cloudabi_fd.c
loudabi/cloudabi_sock.c
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_syscall.h
loudabi32/cloudabi32_syscalls.c
loudabi32/cloudabi32_sysent.c
loudabi32/cloudabi32_systrace_args.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_syscall.h
loudabi64/cloudabi64_syscalls.c
loudabi64/cloudabi64_sysent.c
loudabi64/cloudabi64_systrace_args.c
c1973cc94eed9e94518cfc9a8ecbb6585c8808bf 23-Aug-2017 markj <markj@FreeBSD.org> Set the bus number field when attaching a PCI device.

MFC after: 1 week
inuxkpi/common/src/linux_pci.c
e5cef00963873c85139191010bdddc60dd9eb6fe 22-Aug-2017 markj <markj@FreeBSD.org> Add some miscellaneous definitions to support the DRM drivers.

MFC after: 1 week
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/kobject.h
inuxkpi/common/include/linux/lockdep.h
inuxkpi/common/include/linux/module.h
925995d6357e9242603c83b37479ad6079b27232 21-Aug-2017 hselasky <hselasky@FreeBSD.org> Fix for deadlock situation in the LinuxKPI's RCU synchronize API.

Deadlock condition:
The return value of TDQ_LOCKPTR(td) is the same for two threads.

1) The first thread signals a wakeup while keeping the rcu_read_lock().
This invokes sched_add() which in turn will try to lock TDQ_LOCK().

2) The second thread is calling synchronize_rcu() calling mi_switch() over
and over again trying to yield(). This prevents the first thread from running
and releasing the RCU reader lock.

Solution:
Release the thread lock while yielding to allow other threads to acquire the
lock pointed to by TDQ_LOCKPTR(td).

Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_rcu.c
10d10d0edfce04fcd8c60782bc25793816a70d8e 20-Aug-2017 markj <markj@FreeBSD.org> Define prefetch() only if it hasn't already been defined.

MFC after: 1 week
inuxkpi/common/include/linux/list.h
51503f03b1a4af316da103846384be7921b568ba 20-Aug-2017 markj <markj@FreeBSD.org> Add a couple of trivial headers to the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/asm/msr.h
inuxkpi/common/include/linux/atomic.h
e2500ac573cf91d0ab0af3bd9bcb4193b834f92e 18-Aug-2017 cem <cem@FreeBSD.org> Move some other SI_SUB_INIT_IF initializations to SI_SUB_TASKQ

Drop the EARLY_AP_STARTUP gtaskqueue code, as gtaskqueues are now
initialized before APs are started.

Reviewed by: hselasky@, jhb@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12054
inuxkpi/common/src/linux_tasklet.c
inuxkpi/common/src/linux_work.c
c9aaa09a9e018dc9a860fb8d21cf86d5a44b728a 16-Aug-2017 markj <markj@FreeBSD.org> Add device resource management fields to struct device.

MFC after: 1 week
inuxkpi/common/include/linux/device.h
1dd6601de78a8979c244b8d7b4fc2b10b29a06df 11-Aug-2017 hselasky <hselasky@FreeBSD.org> Make sure the "vm_flags" and "vm_page_prot" fields get set correctly
in the VM area structure in the LinuxKPI when doing mmap() and that
unsupported bits are masked away.

While at it, fix some redundant use of parentheses inside some related
macros.

Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/page.h
inuxkpi/common/src/linux_compat.c
5c221064ba549e195663a6d9b809e4ee0b6fedd6 11-Aug-2017 markj <markj@FreeBSD.org> Add a specialized function for DRM drivers to register themselves.

Such drivers attach to a vgapci bus rather than directly to a pci bus. For
the rest of the LinuxKPI to work correctly in this case, we override the
vgapci bus' ivars with those of the grandparent.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11932
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
853c517175dbcb28a5a8bdfd12d63d55a41cdf6f 10-Aug-2017 hselasky <hselasky@FreeBSD.org> Use integer type to pass around jiffies and/or ticks values in the
LinuxKPI because in FreeBSD ticks are 32-bit.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/completion.h
inuxkpi/common/include/linux/jiffies.h
inuxkpi/common/include/linux/timer.h
inuxkpi/common/src/linux_compat.c
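
One place where the width of a tick counter matters is wrap-safe comparison.
The sketch below assumes a 32-bit counter and a two's complement target;
ticks_after is a hypothetical name, not a LinuxKPI function.

```c
#include <stdint.h>

/*
 * Hypothetical wrap-safe comparison for a 32-bit tick counter: true when
 * 'a' is later than 'b', provided the two values are less than 2^31
 * ticks apart.
 */
int
ticks_after(uint32_t a, uint32_t b)
{
	return ((int32_t)(b - a) < 0);
}
```

The subtraction is done at the counter's own width; doing it in a wider type
would not wrap the same way and would misjudge values on either side of a
rollover.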
1ea0a5a734a72532923abab90bd2f1b8cdc7cf85 10-Aug-2017 hselasky <hselasky@FreeBSD.org> Fixes for wait event in the LinuxKPI. These are regression issues
after r319757.

1) Correct the return value from __wait_event_common() from 1 to 0 in
case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other
case __ret is zero and will be substituted in the last part of the
macro with the appropriate value before return.

2) Make sure the "timeout" argument is cast to "int" before
evaluating negativity. Else the signedness of a "long" might be
checked instead of the signedness of an integer.

3) The wait_event() function should not have a return value.

Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/wait.h
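
Item 2 above can be illustrated with a small check. On a platform where
"long" is 64 bits, a value that is negative as a 32-bit integer can look
positive as a "long". timeout_is_negative is a hypothetical helper, and the
narrowing conversion assumes the usual two's complement behaviour of GCC and
Clang.

```c
/*
 * Hypothetical check: judge the sign of a tick count after narrowing it
 * to "int", so that 32-bit negative values are recognised even when they
 * arrive zero-extended in a wider "long".
 */
int
timeout_is_negative(long timeout)
{
	return ((int)timeout < 0);
}
```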
d495afeb1efb2caee9afdd96023596801148ca1e 10-Aug-2017 hselasky <hselasky@FreeBSD.org> Make sure the linux_wait_event_common() function in the LinuxKPI properly
handles a timeout value of MAX_SCHEDULE_TIMEOUT which basically means there
is no timeout. This is a regression issue after r319757.

While at it change the type of returned variable from "long" to "int" to
match the actual return type.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_schedule.c
ffae9ac0c30afa5aae79e991326069939b720a99 08-Aug-2017 mav <mav@FreeBSD.org> Fix a few issues in the LinuxKPI workqueue.

LinuxKPI workqueue wrappers reported "successful" cancellation for works
already completed in normal way. This change brings reported status and
real cancellation fact into sync. This required for drm-next operation.

Reviewed by: hselasky (earlier version)
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11904
inuxkpi/common/src/linux_work.c
f71ca6688908891abaa379672960694be5129d86 08-Aug-2017 markj <markj@FreeBSD.org> Add round_jiffies_up(), local_clock() and __setup_timer() to the LinuxKPI.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11871
inuxkpi/common/include/linux/sched.h
inuxkpi/common/include/linux/timer.h
b45702507383100f5461725164f5bb792ded5d72 08-Aug-2017 markj <markj@FreeBSD.org> Add macros for defining attribute groups and for WO and RW attributes.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11872
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/sysfs.h
04a64ac1b2e9b0da5fc220f236663734e3d33551 07-Aug-2017 mav <mav@FreeBSD.org> Fix hrtimer_active() in case of cancellation.

While there, switch to FreeBSD internal callout active status.

Reviewed by: markj, hselasky
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11900
inuxkpi/common/include/linux/hrtimer.h
inuxkpi/common/src/linux_hrtimer.c
3364e8aea957b8f29913ef5da0c80f4081ffad76 07-Aug-2017 br <br@FreeBSD.org> o Replace __riscv__ with __riscv
o Replace __riscv64 with (__riscv && __riscv_xlen == 64)

This is required to support the new GCC 7.1 compiler.
It remains compatible with the current GCC 6.1 compiler.

RISC-V is an extensible ISA, and the idea here is to have a built-in define
for each extension, so together with __riscv we will have some subset
of these as well (depending on the -march string passed to the compiler):

__riscv_compressed
__riscv_atomic
__riscv_mul
__riscv_div
__riscv_muldiv
__riscv_fdiv
__riscv_fsqrt
__riscv_float_abi_soft
__riscv_float_abi_single
__riscv_float_abi_double
__riscv_cmodel_medlow
__riscv_cmodel_medany
__riscv_cmodel_pic
__riscv_xlen

Reviewed by: ngie
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11901
inuxkpi/common/src/linux_page.c
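
The macro replacement described in the entry above might look as
follows in portable C. This is a minimal sketch; the helper names are
hypothetical and only the __riscv and __riscv_xlen defines come from
the entry.

```c
#include <assert.h>

/* Hypothetical helper: register width in bits on RISC-V, 0 elsewhere. */
int
riscv_xlen_or_zero(void)
{
#if defined(__riscv)
	return (__riscv_xlen);
#else
	return (0);
#endif
}

/* The test that replaces the old __riscv64 check, per the entry above. */
int
is_riscv64(void)
{
#if defined(__riscv) && __riscv_xlen == 64
	return (1);
#else
	return (0);
#endif
}

/* Both helpers must agree on any target, RISC-V or not. */
void
riscv_macro_demo(void)
{
	assert(is_riscv64() == (riscv_xlen_or_zero() == 64));
}
```

On a non-RISC-V host both helpers return 0, so the same source builds
everywhere.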
84319709659d324a2c78b62c9451dfc3ec918971 03-Aug-2017 markj <markj@FreeBSD.org> Add subsystem vendor and device ID fields to struct pci_dev.

MFC after: 1 week
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
4218cff66f3c19fd478fa4301de739ad5458ea45 02-Aug-2017 hselasky <hselasky@FreeBSD.org> Fix LinuxKPI regression after r321920. The mda_unit and si_drv0 fields are not
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.

While at it remove superfluous use of parentheses inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/cdev.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/kdev_t.h
inuxkpi/common/src/linux_compat.c
e032b1d69d9ceac38b3317bac6a75fc7730674df 31-Jul-2017 hselasky <hselasky@FreeBSD.org> Remove cycle_t type from the LinuxKPI similar to Linux upstream.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/clocksource.h
93ea6317f4045ccfd813ad69683d9f801ea02c43 30-Jul-2017 dchagin <dchagin@FreeBSD.org> Avoid using [LINUX_]SHAREDPAGE constant directly in the vdso code.
This is needed for https://reviews.freebsd.org/D11780.

Reported by: kib@
inux/linux_vdso.c
inux/linux_vdso.h
ea2fa60ba904d7490a39e86a14a8f48d0829638a 29-Jul-2017 ian <ian@FreeBSD.org> Add inline functions to convert between sbintime_t and decimal time units.
Use them in some existing code that is vulnerable to roundoff errors.

The existing constant SBT_1NS is a honeypot, luring unsuspecting folks into
writing code such as long_timeout_ns*SBT_1NS to generate the argument for a
sleep call. The actual value of 1ns in sbt units is ~4.3, leading to a
large roundoff error giving a shorter sleep than expected when multiplying
by the truncated value of 4 in SBT_1NS. (The evil honeypot aspect becomes
clear after you waste a whole day figuring out why your sleeps return early.)
inuxkpi/common/src/linux_hrtimer.c
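
The roundoff described in the entry above can be reproduced in user
space. The sketch below assumes the 32.32 fixed-point layout of
sbintime_t; the type, macro and function names are hypothetical, and
the split conversion is one possible way to avoid the error, not
necessarily the committed implementation.

```c
#include <assert.h>
#include <stdint.h>

/* 32.32 fixed point: the upper 32 bits hold whole seconds. */
typedef int64_t sketch_sbt_t;

#define	SKETCH_SBT_1S	((sketch_sbt_t)1 << 32)
/* 2^32 / 10^9 is about 4.29; integer division truncates it to 4. */
#define	SKETCH_SBT_1NS	(SKETCH_SBT_1S / 1000000000)

/* The tempting conversion: multiply by the truncated constant. */
sketch_sbt_t
ns_to_sbt_naive(int64_t ns)
{
	return (ns * SKETCH_SBT_1NS);
}

/*
 * Convert whole seconds exactly and scale only the sub-second
 * remainder, which keeps the intermediate product below 2^62.
 */
sketch_sbt_t
ns_to_sbt_split(int64_t ns)
{
	int64_t sec, rem;

	assert(ns >= 0);
	sec = ns / 1000000000;
	rem = ns % 1000000000;
	return (sec * SKETCH_SBT_1S + (rem * SKETCH_SBT_1S) / 1000000000);
}
```

For one second of nanoseconds the naive form yields 4000000000 units
instead of 4294967296, so a sleep computed that way is roughly 7%
shorter than requested.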
feffd686029c113ca4a4d0d85eaa1dd013bab509 26-Jul-2017 ed <ed@FreeBSD.org> Upgrade to the latest sources generated from the CloudABI specification.

The CloudABI specification has had some minor changes over the last half
year. No substantial features have been added, but some features that
are deemed unnecessary in retrospect have been removed:

- mlock()/munlock():

These calls tend to be used for two different purposes: real-time
support and handling of sensitive (cryptographic) material that
shouldn't end up in swap. The former use case is out of scope for
CloudABI. The latter may also be handled by encrypting swap.

Removing this has the advantage that we no longer need to worry about
having resource limits put in place.

- SOCK_SEQPACKET:

Support for SOCK_SEQPACKET is rather inconsistent across various
operating systems. Some operating systems supported by CloudABI (e.g.,
macOS) don't support it at all. Considering that it is rarely used,
remove support for the time being.

- getsockname(), getpeername(), etc.:

A shortcoming of the sockets API is that it doesn't allow you to
create socket(pair)s, having fake socket addresses associated with
them. This makes it harder to test applications or transparently
forward (proxy) connections to them.

With CloudABI, we're slowly moving networking connectivity into a
separate daemon called Flower. In addition to passing around socket
file descriptors, this daemon provides address information in the form
of arbitrary string labels. There is thus no longer any need for
requesting socket address information from the kernel itself.

This change also updates consumers of the generated code accordingly.
Even though system calls end up getting renumbered, this won't cause any
problems in practice. CloudABI programs always call into the kernel
through a kernel-supplied vDSO that has the numbers updated as well.

Obtained from: https://github.com/NuxiNL/cloudabi
loudabi/cloudabi_fd.c
loudabi/cloudabi_mem.c
loudabi/cloudabi_sock.c
loudabi/cloudabi_util.h
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_sock.c
loudabi32/cloudabi32_syscall.h
loudabi32/cloudabi32_syscalls.c
loudabi32/cloudabi32_sysent.c
loudabi32/cloudabi32_systrace_args.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_sock.c
loudabi64/cloudabi64_syscall.h
loudabi64/cloudabi64_syscalls.c
loudabi64/cloudabi64_sysent.c
loudabi64/cloudabi64_systrace_args.c
0992a81e37caa9e38d46fb7350447b8b40432fa7 22-Jul-2017 rlibby <rlibby@FreeBSD.org> linuxkpi compiler.h: avoid gcc -Wunused-value in dummy expressions

It looks like the __acquire and __release macros are for the consumption
of static analysis tools and have no semantic effect. Transform the
definitions from constant expressions to empty statements in order to
avoid -Wunused-value from gcc.

Likewise avoid future warnings for __chk_{user,io}_ptr, but with a cast
to void, because it looks like some linux kernel code may use those in
expression contexts.

Reviewed by: hselasky, markj
Approved by: markj (mentor)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11695
inuxkpi/common/include/linux/compiler.h
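
The two macro styles described in the entry above can be sketched as
follows. The SKETCH_* names are hypothetical stand-ins for __acquire,
__release and __chk_user_ptr, chosen to avoid reserved identifiers;
the exact definitions in compiler.h are not quoted here.

```c
#include <assert.h>
#include <stddef.h>

/* Empty statements: usable only where a statement is allowed. */
#define	SKETCH_ACQUIRE(x)	do { } while (0)
#define	SKETCH_RELEASE(x)	do { } while (0)
/* Cast to void: no unused-value warning, still valid in expressions. */
#define	SKETCH_CHK_USER_PTR(x)	((void)0)

int
guarded_increment(int *counter, const void *uptr)
{
	int next;

	assert(counter != NULL);
	(void)uptr;	/* the check macro does not evaluate its argument */
	SKETCH_ACQUIRE(counter);
	/* The void cast keeps the macro usable inside a comma expression. */
	next = (SKETCH_CHK_USER_PTR(uptr), *counter + 1);
	*counter = next;
	SKETCH_RELEASE(counter);
	return (next);
}
```

The annotations compile to nothing, so the function behaves like a
plain increment.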
bd6bce15179827418a22303e362e3ce20e7270ee 22-Jul-2017 dchagin <dchagin@FreeBSD.org> Style(9) whitespace fix.

MFC after: 1 week
inux/linux_ioctl.h
2f63b6248f206be1cbc15bb4fbbca7deb69837ae 14-Jul-2017 kib <kib@FreeBSD.org> Correct sysent flags for dynamically loaded syscalls.

Using the https://github.com/google/capsicum-test/ suite, the
PosixMqueue.CapModeForked test was failing due to an ECAPMODE after
calling kmq_notify(). On further inspection, the dynamically
loaded syscall entry was initialized with sy_flags zeroed out, since
SYSCALL_INIT_HELPER() left sysent.sy_flags with the default value.

Add a new helper SYSCALL{,32}_INIT_HELPER_F() which takes an
additional argument to specify the sy_flags value.

Submitted by: Siva Mahadevan <smahadevan@freebsdfoundation.org>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D11576
reebsd32/freebsd32_util.h
6419ee87e8cb35ca2f8536e71cb7e1942cf7f759 13-Jul-2017 markj <markj@FreeBSD.org> Add some functions to jiffies.h.

Also add some checks for overflow to existing functions.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11533
inuxkpi/common/include/linux/jiffies.h
8c46c174e4c256567f56cae3e931caf5d54e0ba8 09-Jul-2017 markj <markj@FreeBSD.org> Add some functions to math64.h in the LinuxKPI, and fix nearby style.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11535
inuxkpi/common/include/linux/math64.h
fa2419e3087cffc7861121cddc370cb9721a70cf 09-Jul-2017 markj <markj@FreeBSD.org> Add a few functions to ktime.h in the LinuxKPI, and fix nearby style.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11534
inuxkpi/common/include/linux/ktime.h
9f5e54b3392a5aeb269fe7f0ae928153451cea53 09-Jul-2017 markj <markj@FreeBSD.org> Free existing per-thread task structs when unloading linuxkpi.ko.

They are otherwise leaked.

Reported and tested by: ae
MFC after: 1 week
inuxkpi/common/src/linux_current.c
13bdb56c14d8bae292772ce8d69e5aa047961c90 08-Jul-2017 markj <markj@FreeBSD.org> Add some helper definitions to fs.h in the LinuxKPI.

Add a field to struct linux_file to allow the creation of anonymous
shmem objects.

MFC after: 1 week
inuxkpi/common/include/linux/fs.h
inuxkpi/common/src/linux_compat.c
07c668198e183b2868ad0e48485400f187828a59 08-Jul-2017 markj <markj@FreeBSD.org> Fix the definitions of pgprot_{noncached,writecombine} after r316562.

MFC after: 1 week
inuxkpi/common/include/linux/page.h
8e37238bb0cb4ac21c72d7fc4e52decb9bfb7127 08-Jul-2017 markj <markj@FreeBSD.org> Add device_is_registered() to the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/linux/device.h
1859895bfc5c386dea7cad076b366dbf87edd3ad 08-Jul-2017 markj <markj@FreeBSD.org> Add TASK_COMM_LEN to the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/linux/sched.h
78f1912247cfd70ff3053d44a1fb5e723303d4be 07-Jul-2017 hselasky <hselasky@FreeBSD.org> Complete r320189 which allows a NULL VM fault handler in the LinuxKPI.
Instead of mapping a dummy page upon a page fault, map the page
pointed to by the physical address given by IDX_TO_OFF(vmap->vm_pfn).
To simplify the implementation use OBJT_DEVICE to implement our own
linux_cdev_pager_fault() instead of using the existing
linux_cdev_pager_populate().

Some minor code factoring while at it.

Reviewed by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
a77f17c5a97882bbcd73e069f5fa865d8646eb57 07-Jul-2017 hselasky <hselasky@FreeBSD.org> Fix a bug in synchronize RCU when the calling thread is bound to a CPU.

Set "td_pinned" to zero after "sched_unbind()" to prevent "td_pinned"
from temporarily becoming negative during "sched_bind()". This can
happen if "sched_bind()" uses "sched_pin()" and "sched_unpin()".

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_rcu.c
22d88a5b36c87b4bd207a4b37d0e9625d528d1a2 04-Jul-2017 markj <markj@FreeBSD.org> Invoke suspend/resume methods from the driver pmops if available.

Obtained from: kmacy (original version)
MFC after: 1 week
inuxkpi/common/src/linux_pci.c
139f23635621e9c7bad5c56a6b9be406cb1acf0b 04-Jul-2017 markj <markj@FreeBSD.org> Add some auxiliary types for device driver support.

MFC after: 1 week
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/pci.h
bf4195749f64d2178657ff7c01134a509c755fc7 04-Jul-2017 markj <markj@FreeBSD.org> Add a field for the class code to struct pci_driver.

Fill out some previously uninitialized fields as well.

MFC after: 1 week
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
2f1b018f95b32a72bd4a82fb6671de081bccfe58 04-Jul-2017 markj <markj@FreeBSD.org> Add some PCI class definitions.

MFC after: 1 week
inuxkpi/common/include/linux/pci.h
557c67212b4cc4515249917756b850caaf100376 04-Jul-2017 markj <markj@FreeBSD.org> Rename the "driver" field to "bsddriver" to avoid a name collision.

MFC after: 1 week
inuxkpi/common/include/linux/pci.h
inuxkpi/common/src/linux_pci.c
8123dae824eb53453766edd8f4e0747f7cbc2b15 04-Jul-2017 markj <markj@FreeBSD.org> Hold the PCI device list lock when removing an element.

MFC after: 1 week
inuxkpi/common/src/linux_pci.c
26ab55861f86807df3271dd4f77eb6815fee7516 03-Jul-2017 markj <markj@FreeBSD.org> Let io_mapping_init_wc() fall back to an uncacheable mapping.

This allows usage of the function on architectures that don't support
write-combining.

Reported by: bz, emaste
X-MFC With: r320196
inuxkpi/common/include/linux/io-mapping.h
54afa4768cd4eb8cb9f296cc31e5599f1e847fb1 01-Jul-2017 kib <kib@FreeBSD.org> Port PowerPC kqueue(2) compat32 fix in r320500 to MIPS.

All 32bit MIPS ABIs align uint64_t on 8-byte boundaries. Since struct kevent32
is defined using 32bit types to avoid extra alignment on amd64/i386,
the layout of the structure needs padding on PowerPC and apparently MIPS.

Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D11434
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
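
The alignment issue in the entry above can be shown with a simplified
layout. The structures below are hypothetical and much smaller than
struct kevent32; they only illustrate why a structure built from 32-bit
types needs explicit padding to match an ABI that aligns uint64_t on
8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Native layout: the compiler aligns "data" as the target ABI requires. */
struct sketch_native {
	uint32_t	ident;
	uint32_t	flags;
	uint32_t	fflags;
	uint64_t	data;
};

/* Compat layout built from 32-bit types only: no implicit padding. */
struct sketch_compat {
	uint32_t	ident;
	uint32_t	flags;
	uint32_t	fflags;
	uint32_t	data_lo;
	uint32_t	data_hi;
};

/* Compat layout with explicit padding, matching an 8-byte-aligned ABI. */
struct sketch_compat_padded {
	uint32_t	ident;
	uint32_t	flags;
	uint32_t	fflags;
	uint32_t	pad0;
	uint32_t	data_lo;
	uint32_t	data_hi;
};

size_t
native_data_offset(void)
{
	return (offsetof(struct sketch_native, data));
}

void
layout_demo(void)
{
	assert(offsetof(struct sketch_compat, data_lo) == 12);
	assert(offsetof(struct sketch_compat_padded, data_lo) == 16);
	/* 12 where uint64_t is 4-byte aligned (i386), 16 where it is 8. */
	assert(native_data_offset() == 12 || native_data_offset() == 16);
}
```

Without pad0 the 64-bit field would sit at offset 12 in the compat
structure but at offset 16 in a 32-bit binary built for an ABI with
8-byte alignment, and every later field would be misread.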
bff8060ac559b7b9f361750f48eae08678dfeaf2 30-Jun-2017 kib <kib@FreeBSD.org> Amend the layout of kevent32 on powerpc where uint64_t has 8-byte
alignment.

Reported, tested and assertion updates by: andreast
Sponsored by: The FreeBSD Foundation
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
131ea5a7f4d2bb55b70318fad89b099a49b4e2f7 29-Jun-2017 jhb <jhb@FreeBSD.org> Store a 32-bit PT_LWPINFO struct for 32-bit process core dumps.

Process core notes for a 32-bit process running on a 64-bit host need to
use 32-bit structures so that the note layout matches the layout of notes
of a core dump of a 32-bit process under a 32-bit kernel.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D11407
reebsd32/freebsd32_signal.h
342f85f1d4b1160fbc654259b1ba83b08ddedd72 27-Jun-2017 jhibbits <jhibbits@FreeBSD.org> Update comments and simplify conditionals for compat32

Only amd64 (because of i386) needs 32-bit time_t compat now, everything else is
64-bit time_t. Rather than checking on all 64-bit time_t archs, only check the
oddball amd64/i386.

Reviewed By: emaste, kib, andrew
Differential Revision: https://reviews.freebsd.org/D11364
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
ff2787fe4828b3c16202b742c9b2464d67bbaef9 26-Jun-2017 markj <markj@FreeBSD.org> Implement parts of the hrtimer API in the LinuxKPI.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11359
inuxkpi/common/include/linux/hrtimer.h
inuxkpi/common/src/linux_hrtimer.c
e647d591f80674fb209ad7dc4fe485148f0e5a52 26-Jun-2017 avg <avg@FreeBSD.org> linux_getdents, linux_readdir: fix mismatch between malloc and free tags

MFC after: 3 days
inux/linux_file.c
11d34fd62ad2616af22ec55f92a5764467faf658 26-Jun-2017 jhibbits <jhibbits@FreeBSD.org> Solve the y2038 problem for powerpc

AKA Make time_t 64 bits on powerpc(32).

Until now, PowerPC was one of two architectures with a 32-bit time_t
on 32-bit archs (the other being i386). This is an ABI breakage, so all ports,
and all local binaries, *must* be recompiled.

Tested by: andreast, others
MFC after: Never
Relnotes: Yes
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
3a714c6e83a001a26446ca1472fbab727e9d3601 25-Jun-2017 markj <markj@FreeBSD.org> Add u64_to_user_ptr() to the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/linux/kernel.h
cc928d211abfd91edbd76f0baaf33adc50dc57d0 25-Jun-2017 markj <markj@FreeBSD.org> Add ns_to_ktime() to the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/linux/ktime.h
7ab36912bcce08042dc5719ff917a08b24d85053 25-Jun-2017 markj <markj@FreeBSD.org> Add a couple of macros to lockdep.h in the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/linux/lockdep.h
9575bf4b0e1a9eb0e6f08c376f12656c3c3556a0 25-Jun-2017 markj <markj@FreeBSD.org> Add the thaw_early method to struct dev_pm_ops in the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/linux/device.h
7ca4ed867f77a10886f791733ee01a1a3edeb77c 25-Jun-2017 markj <markj@FreeBSD.org> Add noop_lseek() to the LinuxKPI.

MFC after: 1 week
inuxkpi/common/include/linux/fs.h
889f8689d9ee280ad230ec9c6b9a47154cc80dda 23-Jun-2017 mmokhi <mmokhi@FreeBSD.org> Fix caveat in new implementation of linprocfs_docpuinfo():
Prevent a kernel panic when extended CPUID isn't supported by the CPU

Reviewed by: kib, ngie, trasz
Approved by: trasz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11294
inprocfs/linprocfs.c
881fc6be99dc2a184409ca173b616185c143e75c 21-Jun-2017 markj <markj@FreeBSD.org> Update io-mapping.h in the LinuxKPI.

Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11286
inuxkpi/common/include/linux/io-mapping.h
4fffa01fff70e6a1cd635b34906298d840be27f5 21-Jun-2017 markj <markj@FreeBSD.org> Add missing lock destructor invocations to the LinuxKPI unload handler.

MFC after: 1 week
inuxkpi/common/src/linux_compat.c
a49c47f0df4e182261551f544d8b8778cc80c1e6 21-Jun-2017 markj <markj@FreeBSD.org> Include kmod.h from the LinuxKPI's module.h.

MFC after: 1 week
inuxkpi/common/include/linux/module.h
0959c781629c4325e7f89efc97c1461dc8b0ad10 21-Jun-2017 markj <markj@FreeBSD.org> Add a lockdep macro to the LinuxKPI.

Also fix some nearby style issues.

MFC after: 1 week
inuxkpi/common/include/linux/lockdep.h
41ece8e7db60f66c2fad6277e5d28f5d95f505fb 21-Jun-2017 hselasky <hselasky@FreeBSD.org> Allow the VM fault handler to be NULL in the LinuxKPI when handling a
memory map request. When the VM fault handler is NULL a return code of
VM_PAGER_BAD is returned from the character device's pager populate
handler. This fixes compatibility with Linux.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
d656eefc9c3c27f843536f6cd4f742367686afac 18-Jun-2017 markj <markj@FreeBSD.org> Add kthread parking support to the LinuxKPI.

Submitted by: kmacy (original version)
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11264
inuxkpi/common/include/linux/kthread.h
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_current.c
inuxkpi/common/src/linux_kthread.c
c8d732dce12599bb992d1545358a02f90c1a40b7 18-Jun-2017 markj <markj@FreeBSD.org> Avoid including list.h in LinuxKPI headers.

list.h includes a number of FreeBSD headers as a workaround for the
LIST_HEAD name collision. To reduce pollution, avoid including list.h
in commonly used headers when it is not explicitly needed.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11249
inuxkpi/common/include/linux/completion.h
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/include/linux/kobject.h
inuxkpi/common/include/linux/mm_types.h
inuxkpi/common/include/linux/sched.h
70be6f4a291d437aaabcd97e26342811ee78b94b 18-Jun-2017 emaste <emaste@FreeBSD.org> Add ZFS to Linux statfs ftype

PR: 220086
Reviewed by: cem
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11252
inux/linux_stats.c
2c739e0e3fd75b75884c9138ba10dc185208cf50 17-Jun-2017 markj <markj@FreeBSD.org> Remove prototypes for unimplemented LinuxKPI functions.

MFC after: 1 week
inuxkpi/common/include/linux/mm.h
f7db07a7156d3ed61c79a3d3cac6909aed16df44 17-Jun-2017 kib <kib@FreeBSD.org> Regen.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
d7f022a3ab2734f7928f69391807dc59d62044fe 17-Jun-2017 kib <kib@FreeBSD.org> Add abstime kqueue(2) timers and expand struct kevent members.

This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which
specifies that the data field contains absolute time to fire the
event.

To make this useful, data member of the struct kevent must be extended
to 64bit. Using the opportunity, I also added ext members. This
changes struct kevent almost to Apple struct kevent64, except I did
not change the type of ident and udata; the latter would cause serious API
incompatibilities.

The type of ident was kept uintptr_t since EVFILT_AIO returns a
pointer in this field, and e.g. CHERI is sensitive to the type
(discussed with brooks, jhb).

Unlike Apple kevent64, symbol versioning allows us to claim ABI
compatibility and still name the new syscall kevent(2). Compat shims
are provided for both host native and compat32.

Requested by: bapt
Reviewed by: bapt, brooks, ngie (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D11025
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
e2a14c603f20af5eaf11dfafcd181d6d82269dd7 12-Jun-2017 kib <kib@FreeBSD.org> Move the struct syscall_args syscall argument container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack. The structure takes 88 bytes on 64bit arches
which is not negligible. Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already. The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by: jhb (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
X-Differential revision: https://reviews.freebsd.org/D11080
a32/ia32_util.h
fa3d0ad78159b5cace42491f69317f055fc0668f 12-Jun-2017 dchagin <dchagin@FreeBSD.org> Remove the outdated definition.

MFC after: 1 week
inux/linux_file.c
703192d28e9b16ca3d33378325b8620e36a7ebe3 12-Jun-2017 dchagin <dchagin@FreeBSD.org> Since r318735 (ino64 project) the size of the native struct dirent is
equal or greater than the size of Linux struct dirent or struct dirent64.
So, remove LINUX_RECLEN_RATIO magic as useless.
inux/linux_file.c
e10054b3aa9548f4e53eaaeca28c75b7331f265a 09-Jun-2017 markj <markj@FreeBSD.org> Implement pci_disable_device() in the LinuxKPI.

Submitted by: kmacy
MFC after: 2 weeks
inuxkpi/common/include/linux/pci.h
d2500bbf9204ac3be025a94212f523033b78ddf0 09-Jun-2017 markj <markj@FreeBSD.org> Augment wait queue support in the LinuxKPI.

In particular:
- Don't evaluate event conditions with a sleepqueue lock held, since such
code may attempt to acquire arbitrary locks.
- Fix the return value for wait_event_interruptible() in the case that the
wait is interrupted by a signal.
- Implement wait_on_bit_timeout() and wait_on_atomic_t().
- Implement some functions used to test for pending signals.
- Implement a number of wait_event_*() variants and unify the existing
implementations.
- Unify the mechanism used by wait_event_*() and schedule() to put the
calling thread to sleep.

This is required to support updated DRM drivers. Thanks to hselasky for
finding and fixing a number of bugs in the original revision.

Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10986
inuxkpi/common/include/linux/sched.h
inuxkpi/common/include/linux/wait.h
inuxkpi/common/src/linux_kthread.c
inuxkpi/common/src/linux_schedule.c
f7034622774f0f9608565c170988a236598feeb5 09-Jun-2017 kib <kib@FreeBSD.org> Enhance vfs.ino64_trunc_error sysctl.

Provide a new mode "2" which returns a special overflow indicator in
the non-representable field instead of the silent truncation (mode
"0") or EOVERFLOW (mode "1").

In particular, the typical use of st_ino to detect hard links with
mode "2" reports false positives, which might be more suitable for
some uses.

Discussed with: bde
Sponsored by: The FreeBSD Foundation
reebsd32/freebsd32_misc.c
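
The three vfs.ino64_trunc_error modes named in the entry above can be
sketched as a single narrowing helper. The function name is
hypothetical, and the indicator value used for mode 2 (UINT32_MAX) is
an assumption made for illustration; the entry does not state it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed overflow indicator for mode 2; not taken from the entry. */
#define	SKETCH_INO_OVERFLOW	UINT32_MAX

/*
 * Narrow a 64-bit inode number for an old 32-bit structure.
 * Returns 0 on success or EOVERFLOW in mode 1 when the value
 * does not fit.
 */
int
sketch_trunc_ino(uint64_t ino, int mode, uint32_t *out)
{
	uint32_t narrow = (uint32_t)ino;

	assert(out != NULL);
	if (narrow == ino) {
		*out = narrow;		/* representable: all modes agree */
		return (0);
	}
	switch (mode) {
	case 1:
		return (EOVERFLOW);	/* report the loss to the caller */
	case 2:
		*out = SKETCH_INO_OVERFLOW;	/* visible overflow marker */
		return (0);
	default:
		*out = narrow;		/* mode 0: silent truncation */
		return (0);
	}
}
```

Because every oversized value maps to the same marker in mode 2, two
unrelated files can report equal st_ino values, which is the source of
the hard link false positives mentioned in the entry.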
29092d1ae99e0f6b9aeee45615e474c88ad86623 08-Jun-2017 jhibbits <jhibbits@FreeBSD.org> Remove ARM and MIPS from linuxkpi ioremap_attr definition

ARM and MIPS fail universe builds.

ARM and MIPS are missing the following:
* VM_MEMATTR_WRITE_THROUGH
* VM_MEMATTR_WRITE_COMBINING

Pointy-hat to: jhibbits
inuxkpi/common/include/linux/io.h
inuxkpi/common/src/linux_compat.c
d485495fa5d08a78d74518045ded41644b3f9ad8 07-Jun-2017 jhibbits <jhibbits@FreeBSD.org> Add more #ifdef arch checks to the linuxkpi

arm, mips, and powerpc all implement pmap_mapdev_attr() and pmap_unmapdev(),
so add those archs to the checks. powerpc also includes the atomic_swap_*()
functions, so add that to the supported list as well. Not tested except by
compiling powerpc.

Reviewed by: markj
inuxkpi/common/include/asm/atomic.h
inuxkpi/common/include/asm/atomic64.h
inuxkpi/common/include/linux/io.h
inuxkpi/common/src/linux_compat.c
9b5b52b6c495d8cc9b7c2e63539a4d1f60e08a11 06-Jun-2017 hselasky <hselasky@FreeBSD.org> Fix init order in the LinuxKPI for IDR support after recent changes.

CPU_FOREACH() is not available until SI_SUB_CPU at SI_ORDER_ANY
when the LinuxKPI is loaded as part of the kernel.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_idr.c
6a43deb7a3a4bcfbcd7043b1fb69e35711e54462 05-Jun-2017 kib <kib@FreeBSD.org> Add sysctl vfs.ino64_trunc_error controlling action on truncating
inode number or link count for the ABI compat binaries.

Right now, and by default after the change, too large 64bit values are
silently truncated to 32 bits. Enabling the knob causes the system to
return EOVERFLOW for stat(2) family of compat syscalls when some
values cannot be completely represented by the old structures. For
getdirentries(2), the knob skips the dirents which would cause non-trivial
truncation of d_ino.

EOVERFLOW error is specified by the X/Open 1996 LFS document
('Adding Support for Arbitrary File Sizes to the Single UNIX
Specification').

Based on the discussion with: bde
Sponsored by: The FreeBSD Foundation
reebsd32/freebsd32_misc.c
1a97fb3a1a7ac966f541b33dcb7e5ca7bd3694d0 04-Jun-2017 dchagin <dchagin@FreeBSD.org> On success, getrandom() Linux system call returns the number of bytes that
were copied to the buffer supplied by the user.

PR: 219464
Submitted by: Maciej Pasternacki
Reported by: Maciej Pasternacki
MFC after: 1 week
inux/linux_misc.c
ff032734d8fa0832a9bb404461b6b34ca7d78899 04-Jun-2017 dchagin <dchagin@FreeBSD.org> Revert r319053 due to lack of sense. As pointed out by kib@ opt_global.h
contains such fundamental settings as e.g. SMP option and fake
opt_global.h almost never match real configured kernels.

Reported by: kib@
inux/linux_misc.c
ab63ba157f2e05e536448c3217655f30392e1039 02-Jun-2017 hselasky <hselasky@FreeBSD.org> Improve kqueue() support in the LinuxKPI. Some applications using
kqueue() do not set non-blocking I/O mode for event driven read of
file descriptors. This means the LinuxKPI internal kqueue read and
write event flags must be updated before the next read and/or write
system call. Else the read and/or write system call may block. This
can happen when there is no more data to read following a previous
read event. Then the application also gets blocked from processing
other events. This situation can also be solved by the applications
setting and using non-blocking I/O mode.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
702f89e7fe9b2ceb4524553c9227cb58d4bf3710 02-Jun-2017 hselasky <hselasky@FreeBSD.org> Add support for setting the non-blocking I/O flag for LinuxKPI
character devices. In Linux the FIONBIO IOCTL is handled by the kernel
and not the drivers. The FIOASYNC ioctl must also return success
because of existing logic in kern_fcntl(), even though it is not
currently supported.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
68f2e5bd3546d43094bdb3715900074701fdc86c 01-Jun-2017 hselasky <hselasky@FreeBSD.org> Make sure the selrecord() function is only called from within system
polling contexts in the LinuxKPI.

After the kqueue() support was added to the LinuxKPI in r319409 the
Linux poll file operation will be used outside the system file polling
callback function, which can cause a NULL-pointer panic inside
selrecord() because curthread->td_sel is set to NULL. This patch moves
the selrecord() call away from poll_wait() and to the system file poll
callback function in the LinuxKPI, which essentially wraps the Linux
one. This is similar to what the cuse(3) module is currently doing.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/poll.h
inuxkpi/common/src/linux_compat.c
d8c3341b93af7921fd1354497630e1495674fe66 01-Jun-2017 hselasky <hselasky@FreeBSD.org> Translate the ERESTARTSYS error code into ERESTART in the LinuxKPI
ioctl(), read() and write() system call handlers. This error code is
internal to the kernel and should not be seen by user-space programs
according to Linux.

Submitted by: Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
63be8505839bfd4e206ae7ed3025c3daf98e1d02 01-Jun-2017 hselasky <hselasky@FreeBSD.org> Add generic kqueue() and kevent() support to the LinuxKPI character
devices. The implementation allows read and write filters to be
created and piggybacks on the poll() file operation to determine when
a filter should trigger. The piggyback mechanism is simply to check
for the EWOULDBLOCK or EAGAIN return code from read(), write() or
ioctl() system calls and then update the kqueue() polling state bits.
The implementation is similar to the one found in the cuse(3) module.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/file.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/include/linux/poll.h
inuxkpi/common/src/linux_compat.c
798a6026c2b86c0933eefbc7617528549ef29aa6 31-May-2017 hselasky <hselasky@FreeBSD.org> Implement print_hex_dump(), print_hex_dump_bytes() and
printk_ratelimited() in the LinuxKPI.

While at it fix the inclusion guard of printk.h to be similar to the
rest of the LinuxKPI header files.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/printk.h
255c5da0b80d636db51ae560d2e192ca9c92058a 31-May-2017 hselasky <hselasky@FreeBSD.org> Properly implement idr_preload() and idr_preload_end() in the
LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/idr.h
inuxkpi/common/src/linux_idr.c
fc09dd7cd5c993969bb59ab732499621ef6ca82f 31-May-2017 hselasky <hselasky@FreeBSD.org> Implement in_atomic() function in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kthread.h
inuxkpi/common/src/linux_compat.c
81d82e5e2691dbec72da0505108c39737b9e3a0d 31-May-2017 hselasky <hselasky@FreeBSD.org> Properly set the .d_name field in the cdevsw structure for the
LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
52f245c3fd1006a5a2a9f8be27981a4a7492fa7c 31-May-2017 hselasky <hselasky@FreeBSD.org> Make sure the VMAP's "vm_file" field is referenced in a Linux
compatible way by the linux_dev_mmap_single() function in the
LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
d59cbd826cb464335bd59b129047807190471f3b 31-May-2017 hselasky <hselasky@FreeBSD.org> Remove the VMA handle from its list before calling the LinuxKPI VMA
close operation to prevent other threads from reusing the VM object
handle pointer.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
5ad34d08ba8b1d7b98ba31258e8a619ba0b7d82f 31-May-2017 hselasky <hselasky@FreeBSD.org> Don't acquire a reference on the VM-space when allocating the LinuxKPI
task structure to avoid deadlock when tearing down the VM object
during a process exit.

Found by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm_types.h
inuxkpi/common/src/linux_current.c
inuxkpi/common/src/linux_page.c
42ce45834fba0b0be8f6215063498b2640194395 31-May-2017 hselasky <hselasky@FreeBSD.org> Fix a reference count leak in the LinuxKPI due to calling VM open when
it shouldn't be called.

Background:
The Linux VM open operation is called when a new VMA is
created on top of the current VMA. This is done through either mremap
flow or split_vma, usually due to mlock, madvise, munmap and so
on. This is currently not supported by the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
f87686da6d3de891808228991d0b50cdf604f1c6 31-May-2017 hselasky <hselasky@FreeBSD.org> Fixes for refcounting "struct linux_file" in the LinuxKPI.

- Allow "struct linux_file" to be refcounted when its "_file" member
is NULL by using its "f_count" field. The reference counts are
transferred to the file structure when the file descriptor is
installed.

- Add missing vdrop() calls for error cases during open().

- Set the "_file" member of "struct linux_file" during open. This
allows use of refcounting through get_file() and fput() with LinuxKPI
character devices.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/file.h
inuxkpi/common/include/linux/fs.h
inuxkpi/common/src/linux_compat.c
b7e7cccc2ecc8f376470a3d54d4770410bb8c628 31-May-2017 hselasky <hselasky@FreeBSD.org> Make sure the thread's priority is restored for all three cases inside
linux_synchronize_rcu_cb() in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_rcu.c
7936fc01922ae326de58a9e79c83d7f21e212ec3 30-May-2017 markj <markj@FreeBSD.org> Add some miscellaneous definitions to support DRM drivers.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10985
inuxkpi/common/include/linux/compiler.h
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/io.h
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/include/linux/module.h
inuxkpi/common/include/linux/pci.h
inuxkpi/common/include/linux/preempt.h
inuxkpi/common/include/linux/types.h
e29481ad14e7ba622055e4100f2cd5171ca62abb 28-May-2017 dchagin <dchagin@FreeBSD.org> On success, getrandom() Linux system call returns the number of bytes that
were copied to the buffer supplied by the user.

Also fix getrandom() if Linuxulator modules are built without the kernel.

PR: 219464
Submitted by: Maciej Pasternacki
Reported by: Maciej Pasternacki
MFC after: 1 week
inux/linux_misc.c
a607ab76a650f3317f7a1e2625ce186619e65a26 24-May-2017 allanjude <allanjude@FreeBSD.org> Followup to r318765 (capsicumize cpuset_*affinity)

Update *sysent files
reebsd32/freebsd32_sysent.c
56f722576fca8e61c913e04aba26c9f4a3195618 24-May-2017 allanjude <allanjude@FreeBSD.org> Allow cpuset_{get,set}affinity in capabilities mode

bhyve was recently sandboxed with capsicum, and needs to be able to
control the CPU sets of its vcpu threads

Reviewed by: emaste, oshogbo, rwatson
MFC after: 2 weeks
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D10170
reebsd32/capabilities.conf
a40411ddce591d2b4b3c863fd1e4750bfad0349c 23-May-2017 kib <kib@FreeBSD.org> Regen.
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
e75ba1d5c4c79376a78351c8544388491db49664 23-May-2017 kib <kib@FreeBSD.org> Commit the 64-bit inode project.

Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64-bits, increase the size of d_namlen to 16-bits, and change
the required alignment. Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.

ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks. Unfortunately, not everything can be
fixed, especially outside the base system. For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.

Kinfo sysctl MIBs ABI is changed in backward-compatible way, but
there is no general mechanism to handle other sysctl MIBS which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.

Struct xvnode changed layout, no compat shims are provided.

For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.

Update note: strictly follow the instructions in UPDATING. Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.

Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver. Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion was done by Konstantin Belousov (kib).

Sponsored by: The FreeBSD Foundation (emaste, kib)
Differential revision: https://reviews.freebsd.org/D10439
reebsd32/capabilities.conf
reebsd32/freebsd32.h
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
inux/linux_file.c
0e65bf42087b1d9428d5f05b489b877f562dbb4f 22-May-2017 glebius <glebius@FreeBSD.org> Fix regression in ndis(4) after r286410. This adds a bunch of checks for
whether this is an Ethernet or 802.11 device and does proper dereferencing.

PR: 213237
Submitted by: <ota j.email.ne.jp>
MFC after: 2 weeks
dis/kern_ndis.c
dis/subr_ndis.c
0048749525d4831924b9f5622813f7a01b25b731 22-May-2017 emaste <emaste@FreeBSD.org> Regen sysent after r318634, no open(2) in capability mode

Sponsored by: The FreeBSD Foundation
reebsd32/freebsd32_sysent.c
41227d9d634afa90136b4abe9d356d1f8d1a40d7 22-May-2017 emaste <emaste@FreeBSD.org> disallow open(2) in capability mode

Previously open(2) was allowed in capability mode, with a comment that
suggested this was likely the case to facilitate debugging. The system
call would still fail later on, but it's better to disallow the syscall
altogether.

We now have the kern.trap_enotcap sysctl or PROC_TRAPCAP_CTL proccontrol
to aid in debugging.

In any case libc has translated open() to the openat syscall since
r277032.

Reviewed by: kib, rwatson
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D10850
reebsd32/capabilities.conf
b4fc0d15e362416a360fe4b6cf282f451e51c656 21-May-2017 markj <markj@FreeBSD.org> Add get_cpu() and put_cpu().

MFC after: 1 week
inuxkpi/common/include/asm/smp.h
91faea9058de7f77b8b6fbacef4c8785818c37c1 18-May-2017 markj <markj@FreeBSD.org> Fix a few uses of kern_yield() in the TTM and the LinuxKPI.

kern_yield(0) effectively causes the calling thread to be rescheduled
immediately since it resets the thread's priority to the highest possible
value. This can cause livelocks when the pattern
"while (!trylock()) kern_yield(0);" is used since the thread holding the
lock may linger on the runqueue for the CPU on which the looping thread is
running.

MFC after: 1 week
inuxkpi/common/src/linux_compat.c
42b8f83c04039b949e0ce2fff60117d8f87a8ca2 09-May-2017 hselasky <hselasky@FreeBSD.org> Fix init order in the LinuxKPI for RCU support.

CPU_FOREACH() is not available until SI_SUB_CPU at SI_ORDER_ANY
when the LinuxKPI is loaded as part of the kernel.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_rcu.c
18a036c89679f4d0d0fe77e5c44792c812b6e8d7 06-May-2017 mmokhi <mmokhi@FreeBSD.org> Fix linprocfs_docpuinfo() output to match what newer Linux apps expect

Reviewed by: trasz
Approved by: trasz
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10274
inprocfs/linprocfs.c
3515bf9113f1f1093aa9c0026bb38065e7191a58 05-May-2017 brooks <brooks@FreeBSD.org> Regen post r317845.

MFC after: 1 week
MFC with: r317845
Sponsored by: DARPA, AFRL
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
9878cf2cb0aa23c0e8b169a3c1d9733348587379 05-May-2017 brooks <brooks@FreeBSD.org> Provide a freebsd32 implementation of sigqueue()

The previous misuse of sys_sigqueue() was sending random register or
stack garbage to 64-bit targets. The freebsd32 implementation preserves
the sival_int member of value when signaling a 64-bit process.
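
A minimal userspace sketch of the idea described above, not the committed kernel code: the 32-bit value is stored through sival_int of a zeroed native union, so no stale register or stack bits reach the target. The names sk_sigval and sk_sigval_from32() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the native union sigval (pointer-sized). */
union sk_sigval {
	int	 sival_int;
	void	*sival_ptr;
};

/*
 * Build a native sigval from the 32-bit value passed by a 32-bit caller.
 * Zeroing first keeps the unused part of the union deterministic; only
 * sival_int is meaningful afterwards.
 */
static union sk_sigval
sk_sigval_from32(int32_t value32)
{
	union sk_sigval sv;

	memset(&sv, 0, sizeof(sv));
	sv.sival_int = value32;
	return (sv);
}
```

Reading sival_ptr from the result is deliberately not shown; the commit message only promises that sival_int survives.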

Document the mixed ABI implementation of union sigval and the
incompatibility of sival_ptr with pointer integrity schemes.

Reviewed by: kib, wblock
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10605
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
15912ab4a72d48a2aa2d1eedb3ea3055699c7c5a 05-May-2017 markj <markj@FreeBSD.org> Use pmap_invalidate_cache() to implement wbinvd_on_all_cpus().

Suggested by: jhb
X-MFC with: r317651
inuxkpi/common/src/linux_compat.c
ea53fefda4176da2e4cba770c8f1cf450f13b973 05-May-2017 hselasky <hselasky@FreeBSD.org> Fix for use after free in the LinuxKPI.

Background:
The same VM object might be shared by multiple processes and the
mm_struct is usually freed when a process exits.

Grab a reference on the mm_struct while the vmap is in the
linux_vma_head list in case the first process which inserted a VM
object has exited.

Tested by: kwm @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
02a7b1908b63a854776b566d351708b57c71c1fa 01-May-2017 markj <markj@FreeBSD.org> Add on_each_cpu() and wbinvd_on_all_cpus().

Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10550
inuxkpi/common/include/asm/smp.h
inuxkpi/common/include/linux/smp.h
inuxkpi/common/src/linux_compat.c
0566831ac85567c2ad0eba57bbd844f73a54b2a5 01-May-2017 dchagin <dchagin@FreeBSD.org> Fix NULL pointer dereference in futex_wake_op() when the same
address is specified for the arguments uaddr and uaddr2.

PR: 218987
Reported by: luke.tw gmail
MFC after: 1 week
inux/linux_futex.c
35b4cd61e917e5e74e69661e77b66166456b559e 30-Apr-2017 dchagin <dchagin@FreeBSD.org> Fix symlinkat(), which used the newdfd argument to look up the old path
when it should use it for the new path instead.

Reported by: trasz@
MFC after: 1 month
inux/linux_file.c
b04416bc6ac4df806f5984c3363956c2fdf8efc4 27-Apr-2017 hselasky <hselasky@FreeBSD.org> Prefer to use real virtual address over direct map address in the
linux_page_address() function in the LinuxKPI. This solves an issue
where the return value from linux_page_address() is passed to
kmem_free().

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_page.c
7756fb1464e56b4b726d9c6707870aea1a019602 23-Apr-2017 dchagin <dchagin@FreeBSD.org> Map Linux CLOCK_BOOTTIME to native CLOCK_UPTIME.

MFC after: 1 week
inux/linux_time.c
f1e6090f95e521b6b197c72c6f099abd4f1ac93d 23-Apr-2017 dchagin <dchagin@FreeBSD.org> Add Evdev ioctl handler to the Linuxulator.

PR: 218627
Submitted by: Jan Kokemüller
Reported by: Jan Kokemüller
MFC after: 1 week
inux/linux_ioctl.c
inux/linux_ioctl.h
554592788d91538f396edfd4e583049aa000bdbe 19-Apr-2017 markj <markj@FreeBSD.org> Drop Giant before sleeping in linux_wait_for_{timeout_,}common().

Reported and tested by: Pete Wright <pete@nomadlogic.org>
Reviewed by: hselasky (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10414
inuxkpi/common/src/linux_compat.c
8ce43aa5c8f2d7e044625797c14e5c117e5fee14 19-Apr-2017 hselasky <hselasky@FreeBSD.org> Use __typeof() instead of typeof() in some RCU related macros in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rcupdate.h
4c92046c3a0cf817a80b86fb809a7a89b3053198 19-Apr-2017 hselasky <hselasky@FreeBSD.org> Fix problem regarding priority inversion when using the concurrency
kit, CK, in the LinuxKPI.

When threads are pinned to a CPU core or when there is only one CPU,
it can happen that a higher priority thread can call the CK
synchronize function while a lower priority thread holds the read
lock. Because the CK's synchronize is a simple wait loop this can lead
to a deadlock situation. To solve this problem use the recently
introduced CK's wait callback function.

When detecting a CK blocking condition figure out the lowest priority
among the blockers and update the calling thread's priority and
yield. If another CPU core is holding the read lock, pin the thread to
the blocked CPU core and update the priority. The calling thread's
priority and CPU bindings are restored before return.

If a thread holding a CK read lock is detected to be sleeping, pause()
will be used instead of yield().

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
inuxkpi/common/include/linux/srcu.h
inuxkpi/common/src/linux_rcu.c
9e81ea884818850902e0944c4e335a93921ae6a6 19-Apr-2017 hselasky <hselasky@FreeBSD.org> Zero number of CPUs should be translated into the default number of
CPUs when allocating a LinuxKPI workqueue. This also ensures that the
created taskqueue always has a non-zero number of worker threads.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_work.c
77758299a31c141f2ecbacb154a124618afd1931 17-Apr-2017 emaste <emaste@FreeBSD.org> Remove trailing whitespace from r317061
inprocfs/linprocfs.c
21ead51d79e6c9c2eb017eff84a3c8ae660a79b5 17-Apr-2017 glebius <glebius@FreeBSD.org> - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter
in place. To do per-cpu stats, convert all fields that previously were
maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
before we have set up UMA and we can do counter_u64_alloc(), provide an
early counter mechanism:
o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
so that at early stages of boot, before counters are allocated we already
point to a counter that can be safely written to.
o For sparc64 that required a whole dummy pcpu[MAXCPU] array.

Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exception.

This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html

Reviewed by: kib, gallatin, marius, lidl
Differential Revision: https://reviews.freebsd.org/D10156
inprocfs/linprocfs.c
57634430237527708cb45463c5630be1cb51c9e0 17-Apr-2017 glebius <glebius@FreeBSD.org> All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.
inuxkpi/common/include/linux/page.h
fef9613ffa23f143ad551bbb23a312b6beece460 17-Apr-2017 glebius <glebius@FreeBSD.org> Remove unneeded include of vm_phys.h.
inuxkpi/common/src/linux_page.c
d63e123e1aa4af79960e09644c5c4438787eead1 13-Apr-2017 cem <cem@FreeBSD.org> linux_ioctl: Refactor some v4l2 struct converters

According to the C standard, it is invalid to copy beyond the end of an
object, even if that object is obviously a member of a larger object (a
struct, in this case).

Appease the standard and Coverity by refactoring the copy in a
straightforward way. No functional change.
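
As an illustration of the refactoring style described above (a sketch with hypothetical structures, not the actual v4l2 converters): each member is copied on its own, so no single copy runs past the end of the member it starts at.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical native and Linux-side layouts, for illustration only. */
struct sk_native {
	uint32_t index;
	uint32_t type;
	uint8_t	 name[8];
	uint32_t flags;
};

struct sk_linux {
	uint32_t index;
	uint32_t type;
	uint8_t	 name[8];
	uint32_t flags;
};

/*
 * Convert member by member.  Every memcpy() is bounded by the size of
 * the member it targets, instead of one large copy that starts at a
 * member and spills into its neighbours.
 */
static void
sk_native_to_linux(const struct sk_native *n, struct sk_linux *l)
{
	l->index = n->index;
	l->type = n->type;
	memcpy(l->name, n->name, sizeof(l->name));
	l->flags = n->flags;
}
```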

Reported by: Coverity (CWE-120)
CIDs: 1007819, 1007820, 1007821, 1007822, 1009668, 1009669
Security: no (false positive detection)
Sponsored by: Dell EMC Isilon
inux/linux_ioctl.c
f8a2a2a85aec2a0af41442288ca040b1a05dfec9 09-Apr-2017 cognet <cognet@FreeBSD.org> Import CK as of commit 6b141c0bdd21ce8b3e14147af8f87f22b20ecf32
This brings us changes we needed in ck_epoch.
inuxkpi/common/src/linux_rcu.c
4a4fa9a158f5307678bde3c6b72784975720dedb 09-Apr-2017 avatar <avatar@FreeBSD.org> Adding SIOCGIFNAME support in Linuxulator. This should silence the console warning associated
with linux-opera:
linux: pid 23492 (opera): ioctl fd=5, cmd=0x8910 ('\M^I',16) is not implemented
linux: pid 23492 (opera): ioctl fd=28, cmd=0x8910 ('\M^I',16) is not implemented
...

Reviewed by: kib, marcel, dchagin
Tested with: linux-opera-12.16_3
MFC after: 1 month
inux/linux_ioctl.c
inux/linux_ioctl.h
c0436ee94dbf665ad1663d8a5bb457faef93886e 09-Apr-2017 hselasky <hselasky@FreeBSD.org> Fix compilation of LinuxKPI for PowerPC.

Found by: emaste @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm.h
inuxkpi/common/include/linux/page.h
inuxkpi/common/src/linux_compat.c
fa31f2c310e827eee49e7eed15eb8db73009da03 07-Apr-2017 hselasky <hselasky@FreeBSD.org> Create the LinuxKPI current task structure on the fly if it doesn't
exist when the current macro is used.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
96c1e7440d257e186d8bb05d44c01cc218cc4d7f 07-Apr-2017 hselasky <hselasky@FreeBSD.org> The __stringify() macro in the LinuxKPI should expand any macros
before stringifying.
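
The usual way to get this behaviour is a two-level macro. The sketch below uses hypothetical names (STRINGIFY_RAW, STRINGIFY, DRIVER_VERSION) rather than the LinuxKPI definitions, and shows why the extra level matters.

```c
#include <string.h>

/* Inner helper: stringifies its argument exactly as written. */
#define	STRINGIFY_RAW(x)	#x
/* Outer macro: the extra level lets the preprocessor expand x first. */
#define	STRINGIFY(x)		STRINGIFY_RAW(x)

/* Hypothetical macro used only to demonstrate the expansion. */
#define	DRIVER_VERSION		42

/*
 * Returns 1 when the two-level form yields "42" and the one-level
 * form yields the unexpanded macro name.
 */
static int
stringify_expands_macros(void)
{
	return (strcmp(STRINGIFY(DRIVER_VERSION), "42") == 0 &&
	    strcmp(STRINGIFY_RAW(DRIVER_VERSION), "DRIVER_VERSION") == 0);
}
```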

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/compiler.h
7e8c37891e835b2136dc82dfdb8e8e0313e33de0 07-Apr-2017 dchagin <dchagin@FreeBSD.org> Prevent overflow of ushort values when converting the new Linux 64-bit ipc
struct to the old Linux ipc struct.

Reported by: PVS-Studio
XMFC with: r314866

MFC after: 3 days
inux/linux_ipc.c
12fc511babfca78424f08ce5ef4b12c4ffa9b2a8 06-Apr-2017 brooks <brooks@FreeBSD.org> Regen after r316594.
reebsd32/freebsd32_systrace_args.c
b62c8e842225f86b915a4ad4a07f2ce0360d2b9d 06-Apr-2017 brooks <brooks@FreeBSD.org> Change the size argument of __getcwd() to size_t.

This matches the getcwd() definition.

This is technically an ABI change, but that would only affect 64-bit
big-endian platforms that pass arguments on the stack. We have none of
those.

Reviewed by: jhb
Obtained from: CheriABI
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9428
reebsd32/syscalls.master
e35a79e6f8ad17d15e01f57815b5a866e3eb3acb 06-Apr-2017 hselasky <hselasky@FreeBSD.org> Cleanup the bitmap_xxx() functions in the LinuxKPI:

- Move all bitmap related functions from bitops.h to bitmap.h, similar
to what Linux does.

- Apply some minor code cleanup and simplifications to optimize the
generated code when using static inline functions.

- Implement the following list of bitmap functions which are needed by
drm-next and ibcore:
- bitmap_find_next_zero_area_off()
- bitmap_find_next_zero_area()
- bitmap_or()
- bitmap_and()
- bitmap_xor()

- Add missing include directives to the qlnxe driver
(davidcs@ has been notified)
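
For readers unfamiliar with these helpers, here is a simplified userspace sketch of two of the operations named in the list above. The sk_ names are hypothetical, the search is a plain bit-by-bit scan rather than the optimized LinuxKPI code, and the "not found" return value (size) is an assumption of this sketch.

```c
#include <limits.h>
#include <stddef.h>

#define	SK_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define	SK_BITS_TO_LONGS(n)	(((n) + SK_BITS_PER_LONG - 1) / SK_BITS_PER_LONG)

/* dst = a | b, word by word, covering nbits bits. */
static void
sk_bitmap_or(unsigned long *dst, const unsigned long *a,
    const unsigned long *b, size_t nbits)
{
	size_t i;

	for (i = 0; i < SK_BITS_TO_LONGS(nbits); i++)
		dst[i] = a[i] | b[i];
}

/* Returns non-zero when the given bit is set. */
static int
sk_test_bit(const unsigned long *map, size_t bit)
{
	return ((map[bit / SK_BITS_PER_LONG] >>
	    (bit % SK_BITS_PER_LONG)) & 1UL);
}

/*
 * Find the first run of nr (>= 1) clear bits at or after start.
 * Returns the index of the first bit in the run, or size when no
 * such run exists.
 */
static size_t
sk_find_zero_area(const unsigned long *map, size_t size, size_t start,
    size_t nr)
{
	size_t i, run = 0;

	for (i = start; i < size; i++) {
		if (sk_test_bit(map, i))
			run = 0;
		else if (++run == nr)
			return (i + 1 - nr);
	}
	return (size);
}
```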

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitmap.h
inuxkpi/common/include/linux/bitops.h
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_idr.c
141f4267e1f18734bff67ed997d60fd003c6c59e 06-Apr-2017 hselasky <hselasky@FreeBSD.org> Define VM_READ, VM_WRITE and VM_EXEC in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm.h
b55b297c0890088d239a24de8a359fb52831c67e 06-Apr-2017 hselasky <hselasky@FreeBSD.org> Implement need_resched() in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
62b5298648a632220c050f06bb5198d94fb5ada0 06-Apr-2017 hselasky <hselasky@FreeBSD.org> Fix implementation of task_pid_group_leader() in the LinuxKPI.

In FreeBSD thread IDs and process IDs have distinct number
spaces. When asking for the group leader task ID in the LinuxKPI,
return the process ID and let this resolve to the first task in the
process having a valid LinuxKPI task structure pointer.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_current.c
dbb3eab1ad8714548aae66e3d19fbd88dac0f12a 06-Apr-2017 hselasky <hselasky@FreeBSD.org> Implement proper support for memory map operations in the LinuxKPI,
like open, close and fault using the character device pager.

Some notes about the implementation:

1) Linux drivers set the vm_ops and vm_private_data fields during a
mmap() call to indicate that the driver wants to use the LinuxKPI VM
operations. Else these operations are not used.

2) The vm_private_data pointer is associated with a VM area structure
and inserted into an internal LinuxKPI list. If the vm_private_data
pointer already exists, the existing VM area structure is used instead
of the allocated one which gets freed.

3) The LinuxKPI's vm_private_data pointer is used as the callback
handle for the FreeBSD VM object. The VM subsystem in FreeBSD has a
similar list to identify equal handles and will only call the
character device pager's close function once.

4) All LinuxKPI VM operations are serialized through the mmap_sem
semaphore, which is per process, which prevents simultaneous access
to the shared VM area structure when receiving page faults.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm.h
inuxkpi/common/include/linux/page.h
inuxkpi/common/src/linux_compat.c
d72deeee12f8f97ec7d6d4905def3e7bbd4e28b6 06-Apr-2017 hselasky <hselasky@FreeBSD.org> Before registering a new mm_struct in the LinuxKPI check if other
tasks in the belonging process already have a valid mm_struct and
reference that instead.

The mm_struct in the LinuxKPI should be shared among all tasks
belonging to the same process. This has to do with the mmap_sem
semaphore which should serialize all VM operations inside a given
process. Linux based drivers depend on this behaviour.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_current.c
87aeccfc861a29b64a658e00411fbe8b1156238f 05-Apr-2017 hselasky <hselasky@FreeBSD.org> Unify error handling when si_drv1 is NULL in the LinuxKPI.

Make sure the character device poll callback function does not return
an error code, but a POLLXXX value, in case of failure.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
cb44dcb8173b778c00ec325d9359d62b3b800dcf 05-Apr-2017 hselasky <hselasky@FreeBSD.org> Implement down_write_killable() in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rwsem.h
f92cfd9faae04c90b5fc72845f87b66848a8185c 02-Apr-2017 dchagin <dchagin@FreeBSD.org> Use the kern_clock_nanosleep() to implement Linux clock_nanosleep() with
the proper handling of the TIMER_ABSTIME flag.

XMFC after: r315526

MFC after: 1 month
inux/linux_time.c
inux/linux_timer.h
512e16501e9417b24e778d64b3e408d80078e4a6 02-Apr-2017 dchagin <dchagin@FreeBSD.org> Remove excess tv_nsec test as this is done by linux_to_native_timespec().

MFC after: 1 week
inux/linux_futex.c
ae75c64bd3c07774f1d87a41c67cd2363ab8f26c 02-Apr-2017 dchagin <dchagin@FreeBSD.org> The value in the tv_nsec field should be in the range 0 to 999999999.
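
A minimal sketch of such a range check, assuming a hypothetical sketch_timespec structure and an EINVAL return for out-of-range values (the real conversion lives in linux_to_native_timespec() and may differ in detail):

```c
#include <errno.h>

/* Hypothetical stand-in for the Linux-side timespec. */
struct sketch_timespec {
	long long tv_sec;
	long	  tv_nsec;
};

/*
 * Accept only 0 <= tv_nsec <= 999999999; anything else, including
 * negative nanoseconds, is rejected with EINVAL.
 */
static int
sketch_timespec_check(const struct sketch_timespec *ts)
{
	if (ts->tv_nsec < 0 || ts->tv_nsec > 999999999L)
		return (EINVAL);
	return (0);
}
```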

Pointed out by: bde@

MFC after: 1 week
inux/linux_time.c
59d61364dd2734b1e3a15ee078d6d7dfb38e4e5b 02-Apr-2017 dchagin <dchagin@FreeBSD.org> As noted by bde@ negative tv_sec values are not checked for overflow,
so overflow can still occur. Fix that. Also remove the extra check for
tv_sec size as under COMPAT_LINUX32 it is always true.

Pointed out by: bde@

MFC after: 1 week
inux/linux_time.c
8660ad94c89572674280ff00b58c45b6acca62d0 30-Mar-2017 dchagin <dchagin@FreeBSD.org> Use kern_mincore() helper instead of abusing syscall entry.

Suggested by: kib@
Reviewed by: kib@
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D10143
inux/linux_misc.c
bff09a4976e25eafbbc34e606a03d12cd6c45eff 29-Mar-2017 rwatson <rwatson@FreeBSD.org> Hook up new audit event identifiers for various non-Orange Book/CAPP
system calls supported by OpenBSM 1.2-alpha5.

Obtained from: TrustedBSD Project
MFC after: 3 weeks
Sponsored by: DARPA, AFRL
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_sysent.c
reebsd32/syscalls.master
72c6038299564c89299fd9329bed3b9495a5a73d 27-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement vmalloc_32() in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/slab.h
a638ba52ba260b5fc009ad99376a3923e9af980e 27-Mar-2017 hselasky <hselasky@FreeBSD.org> Add more platforms supporting the direct map feature in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_page.c
e31f6cc67182b622699924ce2b7c3da50249b572 27-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement a series of physical page management related functions in
the LinuxKPI for accessing user-space memory in the kernel.

Add functions to hold and wire physical page(s) based on a given range
of user-space virtual addresses.

Add functions to get and put a reference on, wire, hold, mark
accessed, copy and dirty a physical page.

Add new VM related structures and defines as a preparation step for
advancing the memory map capabilities of the LinuxKPI.

Add function to figure out if a virtual address was allocated using
malloc().

Add function to convert a virtual kernel address into its physical
page pointer.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/pgtable.h
inuxkpi/common/include/linux/mm.h
inuxkpi/common/include/linux/page.h
inuxkpi/common/include/linux/pfn.h
inuxkpi/common/include/linux/pfn_t.h
inuxkpi/common/include/linux/preempt.h
inuxkpi/common/include/linux/types.h
inuxkpi/common/src/linux_page.c
7e4bbabbeea28b4de17e6a961b63496b6482ba2e 25-Mar-2017 dchagin <dchagin@FreeBSD.org> Implement Linux mincore() system call.
This is necessary for the upcoming drm-next.

Suggested by: hselasky@
MFC after: 1 month
inux/linux_misc.c
212e3c3fc618fb6a66000b9be0cfdd4fff5a7f02 24-Mar-2017 ed <ed@FreeBSD.org> Include <sys/systm.h> to obtain the memcpy() prototype.

I got a report of this source file not building on Raspberry Pi. It's
interesting that this only fails for that target and not for others.
Again, that's no reason not to include the right headers.

PR: 217969
Reported by: Johannes Jost Meixner
MFC after: 1 week
loudabi/cloudabi_clock.c
5305673fcfac30b6681174b5b69d76511111347c 23-Mar-2017 hselasky <hselasky@FreeBSD.org> Use ppsratecheck() for ratelimiting in the LinuxKPI.

Suggested by: cem @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/src/linux_compat.c
98637e5e2e96d72007cf4db68dfc1e481228976c 23-Mar-2017 hselasky <hselasky@FreeBSD.org> Add proper error checking for the string to number conversion
functions in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
a5f46e2bf3f314db28272d5cf3af6757d11c2471 23-Mar-2017 hselasky <hselasky@FreeBSD.org> Function macros are preferred in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic.h
0c29627c59c4816d092621c1f286804bee0c4e54 23-Mar-2017 hselasky <hselasky@FreeBSD.org> Add support for ratelimited printouts in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/src/linux_compat.c
59c1d82cad7834f03a6360f53e4ae0580acfa30e 22-Mar-2017 ed <ed@FreeBSD.org> Make file descriptor passing for CloudABI's recvmsg() work.

Similar to the change for sendmsg(), create a pointer size independent
implementation of recvmsg() and let cloudabi32 and cloudabi64 call into
it. In case userspace requests one or more file descriptors, call
kern_recvit() in such a way that we get the control message headers in
an mbuf. Iterate over all of the headers and copy the file descriptors
to userspace.
loudabi/cloudabi_sock.c
loudabi/cloudabi_util.h
loudabi32/cloudabi32_sock.c
loudabi64/cloudabi64_sock.c
7d0c79b25db7754e1dffa89e6227ee57459891cf 22-Mar-2017 markj <markj@FreeBSD.org> Extend cmpxchg() to support 8- and 16-bit values, and add xchg().

These are needed to support updated revisions of the DRM code.
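
The semantics can be sketched in portable C11 with <stdatomic.h>. This is an illustration only: the sk_ names are hypothetical and the LinuxKPI header is built on FreeBSD's own atomic primitives, not on this code.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Compare-and-exchange on an 8-bit value.  Like Linux cmpxchg(), the
 * return value is what was stored at *p before the call, whether or
 * not the swap happened.
 */
static uint8_t
sk_cmpxchg8(_Atomic uint8_t *p, uint8_t expected, uint8_t desired)
{
	/* On failure, expected is overwritten with the current value. */
	atomic_compare_exchange_strong(p, &expected, desired);
	return (expected);
}

/* Unconditional exchange on a 16-bit value; returns the old value. */
static uint16_t
sk_xchg16(_Atomic uint16_t *p, uint16_t newval)
{
	return (atomic_exchange(p, newval));
}
```

A caller detects a successful swap by comparing the return value of sk_cmpxchg8() with the expected value it passed in.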

Reviewed by: hselasky (previous version)
MFC after: 2 weeks
inuxkpi/common/include/asm/atomic.h
316a5c4d7ff73c8a058256ea1d127673be61bdce 22-Mar-2017 hselasky <hselasky@FreeBSD.org> Add full VNET support to the inet_get_local_port_range() function in
the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/net/ip.h
e7cdb3a370f75c4d939b074737011353ebee36da 22-Mar-2017 hselasky <hselasky@FreeBSD.org> Add support for more IPv4 and IPv6 related macros in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/in.h
inuxkpi/common/include/net/ipv6.h
d2223126c091684f317f871932fce0d52d15d1c6 22-Mar-2017 ed <ed@FreeBSD.org> Make file descriptor passing work for CloudABI's sendmsg().

Reduce the potential amount of code duplication between cloudabi32 and
cloudabi64 by creating a cloudabi_sock_send() utility function. The
cloudabi32 and cloudabi64 modules will then only contain code to convert
the iovecs to the native pointer size.

In cloudabi_sock_send(), we can now construct an SCM_RIGHTS cmsghdr in
an mbuf and pass that on to kern_sendit().
loudabi/cloudabi_sock.c
loudabi/cloudabi_util.h
loudabi32/cloudabi32_sock.c
loudabi64/cloudabi64_sock.c
5dc3189a1b5aef2aae6dcd2f923fd21a4f19f4a4 19-Mar-2017 vangyzen <vangyzen@FreeBSD.org> Regenerate syscall files for r315526

Sponsored by: Dell EMC
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
reebsd32/freebsd32_systrace_args.c
d6de25428dd98776c2d2d099df9c5465e2082e91 19-Mar-2017 vangyzen <vangyzen@FreeBSD.org> Add clock_nanosleep()

Add a clock_nanosleep() syscall, as specified by POSIX.
Make nanosleep() a wrapper around it.

Attach the clock_nanosleep test from NetBSD. Adjust it for the
FreeBSD behavior of updating rmtp only when interrupted by a signal.
I believe this to be POSIX-compliant, since POSIX mentions the rmtp
parameter only in the paragraph about EINTR. This is also what
Linux does. (NetBSD updates rmtp unconditionally.)

Copy the whole nanosleep.2 man page from NetBSD because it is complete
and closely resembles the POSIX description. Edit, polish, and reword it
a bit, being sure to keep any relevant text from the FreeBSD page.

Reviewed by: kib, ngie, jilles
MFC after: 3 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10020
reebsd32/freebsd32_misc.c
reebsd32/syscalls.master
207af3fa683e685d90b8f4baf4ea10e0472ffba6 18-Mar-2017 vangyzen <vangyzen@FreeBSD.org> nanosleep: plug a kernel memory disclosure

nanosleep() updates rmtp on EINVAL. In that case, kern_nanosleep()
has not updated rmt, so sys_nanosleep() updates the user-space rmtp
by copying garbage from its stack frame. This is not only a kernel
memory disclosure, it's also not POSIX-compliant. Fix it to update
rmtp only on EINTR.
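
The rule in the last sentence can be sketched as follows. sk_finish_nanosleep() is a hypothetical userspace helper, and the structure assignment stands in for the kernel's copyout(); it is not the committed code.

```c
#include <errno.h>
#include <stddef.h>
#include <time.h>

/*
 * Propagate the remaining time to the caller only when the sleep was
 * interrupted.  For any other result, rmt may be uninitialized, so
 * the caller's buffer is left untouched.
 */
static int
sk_finish_nanosleep(int error, const struct timespec *rmt,
    struct timespec *user_rmtp)
{
	if (error == EINTR && user_rmtp != NULL)
		*user_rmtp = *rmt;	/* stands in for copyout() */
	return (error);
}
```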

Reviewed by: jilles (via D10020), dchagin
MFC after: 3 days
Security: possibly
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10044
reebsd32/freebsd32_misc.c
inux/linux_time.c
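The rmtp handling described in 207af3fa683e (nanosleep: plug a kernel memory disclosure) can be illustrated with a small user-space sketch. The names fake_kern_nanosleep() and sketch_nanosleep() are hypothetical stand-ins rather than the kernel functions, and memcpy() stands in for copyout(). The only point shown is that the remaining-time buffer is written when, and only when, the error is EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical stand-in for kern_nanosleep(): *rmt is written only on
 * the EINTR path, mirroring the behavior the commit message describes.
 */
static int
fake_kern_nanosleep(const struct timespec *rqt, struct timespec *rmt,
    int simulate_signal)
{
	assert(rqt != NULL && rmt != NULL);
	if (rqt->tv_nsec < 0 || rqt->tv_nsec >= 1000000000L)
		return (EINVAL);	/* rmt is left untouched here */
	if (simulate_signal) {
		*rmt = *rqt;		/* pretend no time has elapsed */
		return (EINTR);
	}
	return (0);
}

/*
 * Hypothetical syscall wrapper.  Copying rmt out unconditionally would
 * leak whatever happened to be on the stack when EINVAL is returned.
 */
static int
sketch_nanosleep(const struct timespec *rqt, struct timespec *user_rmtp,
    int simulate_signal)
{
	struct timespec rmt;	/* deliberately not initialized */
	int error;

	error = fake_kern_nanosleep(rqt, &rmt, simulate_signal);
	if (error == EINTR && user_rmtp != NULL)
		memcpy(user_rmtp, &rmt, sizeof(rmt));	/* copyout() stand-in */
	return (error);
}
```

With an invalid request the caller's buffer keeps its previous contents; with a simulated signal it receives the remaining time.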
5451804107bd7fd8ab0fccba0293f3def6ccf0be 18-Mar-2017 dchagin <dchagin@FreeBSD.org> Glibc's get_nprocs() and get_nprocs_conf() use the sysfs cpu infrastructure
to get the number of processors. Implement /sys/devices/system/cpu/.

MFC after: 1 month
insysfs/linsysfs.c
69ea87350f8a2c20e00cbc8a5a53bd50702d532a 18-Mar-2017 dchagin <dchagin@FreeBSD.org> Implement getrandom() syscall.
Note: the GRND_RANDOM option is not supported for now.

MFC after: 1 month
inux/linux_misc.c
inux/linux_misc.h
ed1e1b1d20f85a3c5df3bc1ba7662f8d3e619e07 18-Mar-2017 dchagin <dchagin@FreeBSD.org> As noted by Roel Bouwman, Linux allows a buffer size larger than the
struct ucred size. Fix this.

PR: 102956
Reported by: Roel Bouwman <roel at qsp nl>
MFC after: 1 week
inux/linux_socket.c
82392d79472a149d66f275247b0f6512453ab138 18-Mar-2017 dchagin <dchagin@FreeBSD.org> To reduce code duplication move socket defines to the MI path.

MFC after: 1 week
inux/linux_socket.h
98c683ad847f6e6e39fae407ccae1545104f3370 18-Mar-2017 dchagin <dchagin@FreeBSD.org> Remove superfluous break statement.

MFC after: 1 week
inux/linux_socket.c
48e1f3e4d58e95117f8b47f1aa78c9e173f71a97 18-Mar-2017 dchagin <dchagin@FreeBSD.org> Check for negative nanoseconds.
Linux does that in timespec_valid().

Reported by: vangyzen@
MFC after: 1 week
inux/linux_time.c
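A minimal sketch of the kind of validity check 48e1f3e4d58e refers to, written as portable user-space C. The function name sketch_timespec_valid() and the constant SKETCH_NSEC_PER_SEC are hypothetical; the rule itself (no negative seconds, nanoseconds within [0, 1e9)) is an assumption modeled on the commit's description, not a copy of the Linuxulator code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC	1000000000L

/* Return true when ts holds a normalized, non-negative time value. */
static bool
sketch_timespec_valid(const struct timespec *ts)
{
	assert(ts != NULL);
	if (ts->tv_sec < 0)
		return (false);
	/* Reject negative nanoseconds as well as values of one second or more. */
	if (ts->tv_nsec < 0 || ts->tv_nsec >= SKETCH_NSEC_PER_SEC)
		return (false);
	return (true);
}
```

A caller would typically return EINVAL when this check fails, before passing the value on to the native sleep or timer code.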
febfcc317878d933fb69150b291aca48e2b0d1e5 17-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement get_pid_task(), pid_task() and some other PID helper
functions in the LinuxKPI. Add a usage atomic to the task_struct
structure to facilitate refcounting the task structure when returned
from get_pid_task(). The get_task_struct() and put_task_struct()
functions are used to manage atomic refcounting. After this change the
task_struct should only be freed through put_task_struct().

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/pid.h
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_current.c
inuxkpi/common/src/linux_kthread.c
fea33a644cd656d028d559e6d053ddb32635e820 17-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement minimalistic memory mapping structure, struct mm_struct, and
some associated helper functions in the LinuxKPI. Let the existing
linux_alloc_current() function allocate and initialize the new
structure and let linux_free_current() drop the refcount on the memory
mapping structure. When the mm_struct's refcount reaches zero, the
structure is freed.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mm.h
inuxkpi/common/include/linux/mm_types.h
inuxkpi/common/include/linux/sched.h
inuxkpi/common/src/linux_current.c
e48c0098d42d6bb6c377f7942042f2b2fa0969aa 17-Mar-2017 hselasky <hselasky@FreeBSD.org> Add comment describing the use of pagefault_disable() and
pagefault_enable() in the LinuxKPI.

Suggested by: rpokala@
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/uaccess.h
d84a164390df0c2565f9ef63069fa27bf0ad7aff 16-Mar-2017 hselasky <hselasky@FreeBSD.org> Use __LP64__ to detect presence of suword64() to fix linking and
loading of the LinuxKPI on 32-bit platforms.

Reported by: lwhsu @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
93619331a44381ab33f8436af77378ecf325a764 16-Mar-2017 hselasky <hselasky@FreeBSD.org> The LinuxKPI pagefault disable and enable functions can only be used
pairwise to support the FreeBSD way of pushing and popping the page
fault flags. Ensure this by requiring every occurrence of pagefault
disable function call to have a corresponding pagefault enable call.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/uaccess.h
55c77b03cd5bd5b2bcfe1abb785b9c3ceb50f95a 16-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement more userspace memory access functions in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/uaccess.h
inuxkpi/common/src/linux_compat.c
1e984aedcc31e3d420ac10a2370ade73bbe75f84 16-Mar-2017 hselasky <hselasky@FreeBSD.org> Define some more LinuxKPI task related macros.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/sched.h
82b75728197e2f969780e916e91d19967c4cfc11 16-Mar-2017 hselasky <hselasky@FreeBSD.org> Add helper function similar to ip_dev_find() to the LinuxKPI to lookup
a network device by its IPv6 address in the given VNET.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/inetdevice.h
f9c6d3f8a5acd9bda0f5bc161fd50f4913e6ffcb 16-Mar-2017 hselasky <hselasky@FreeBSD.org> Add basic support for VIMAGE to the LinuxKPI and ibcore.

Support is implemented by mapping Linux's "struct net" into FreeBSD's
"struct vnet". Currently only vnet0 is supported by ibcore.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/inetdevice.h
inuxkpi/common/include/linux/netdevice.h
inuxkpi/common/src/linux_compat.c
7f7969c35a1724f664abba6fa7800410a1afb7bb 14-Mar-2017 dchagin <dchagin@FreeBSD.org> Fix usage of the same 'i' variable in the external and nested loops.

Submitted by: Svyatoslav <razmyslov at viva64.com>
Sponsored by: PVS-Studio

MFC after: 1 week
inux/linux_vdso.c
716ce7e26a0e84f84c139f5872624ddc9a211245 14-Mar-2017 hselasky <hselasky@FreeBSD.org> Set "current" pointer for LinuxKPI interrupts and timer callbacks.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_compat.c
5ba1b0f9966bcd2148fc453b8d5d43de2cb89b51 14-Mar-2017 kib <kib@FreeBSD.org> Use designated initializers for kevent_copyops.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week
reebsd32/freebsd32_misc.c
99190650780391317a8a28b3da7dfe19200f3c57 09-Mar-2017 hselasky <hselasky@FreeBSD.org> Fix implementation of the DECLARE_WORK() macro in the LinuxKPI to fully
initialize the declared work structure and not only the function callback
pointer.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/workqueue.h
50afb2f8923b52c6c0678a0f34773d608b29a675 09-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement support for mutexes with deadlock avoidance in the LinuxKPI.

When locking a mutex and deadlock is detected the first mutex lock
call that sees the deadlock will return -EDEADLK .

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/ww_mutex.h
inuxkpi/common/src/linux_lock.c
e4190cd84c4b4dfe3eb49320e5bf24fde8cd421f 09-Mar-2017 hselasky <hselasky@FreeBSD.org> Cleanup the LinuxKPI mutex wrappers.

Add support for using mutexes during KDB and shutdown. This is also
required for doing mode-switching during panic for drm-next.

Add new mutex functions mutex_init_witness() and mutex_destroy()
allowing LinuxKPI mutexes to be tracked by witness.

Declare mutex_is_locked() and mutex_is_owned() like inline functions
to get cleaner warnings. These functions are used inside WARN_ON()
statements which might look a bit odd if these functions get fully
expanded.

Give mutexes better debug names through the mutex_name() macro when
WITNESS_ALL is defined. The mutex_name() macro can prefix parts of the
filename and line number before the mutex name.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/mutex.h
7fd3238e5991fcc53a22b7e3c6848df5ffb613f0 09-Mar-2017 hselasky <hselasky@FreeBSD.org> Don't create any threads before SI_SUB_INIT_IF in the LinuxKPI. Else
kthread_add() will assert it is called too soon. This fixes a startup
issue when COMPAT_LINUXKPI is enabled in the kernel configuration
file.

Reported by: Michael Butler <imb@protected-networks.net>
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_work.c
f17e1394c90cf3dde1ba193208c28bb64b05d258 08-Mar-2017 hselasky <hselasky@FreeBSD.org> Fix compilation warning for powerpc64 by not using const keyword in
return types:

Type qualifiers ignored on function return type [-Wreturn-type]

Reported by: andreast @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_work.c
9985dc6fb93aab1c119eda5ceba8ce99b9f21f61 08-Mar-2017 hselasky <hselasky@FreeBSD.org> Cleanup the LinuxKPI slab implementation.

Put large functions into linux_slab.c instead of declaring them static
inline.

Add support for more memory allocation wrappers like kmalloc_array()
and __vmalloc().

Make sure either the M_WAITOK or the M_NOWAIT flag is set and mask
away unused memory allocation flags before calling FreeBSD's malloc()
routine.

Move kmalloc_node() definition to slab.h where it belongs.

Implement support for the SLAB_DESTROY_BY_RCU feature when creating a
kmem_cache which basically means kmem_cache memory is freed using
call_rcu().

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/gfp.h
inuxkpi/common/include/linux/slab.h
inuxkpi/common/src/linux_slab.c
593f90ae1a6faf7f6504cacbc9e04e675cddd770 08-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement eth_zero_addr() in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/etherdevice.h
7c32df5d0377e62aa463d9dd3e9f8758ec6ecf8d 07-Mar-2017 hselasky <hselasky@FreeBSD.org> Add support for constant pointer constructs to READ_ONCE() in the
LinuxKPI. When the type of the argument is constant the temporary
variable cannot be assigned after the barrier. Instead assign the
temporary variable by initialization.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/compiler.h
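The READ_ONCE() change in 7c32df5d0377 turns on a C rule: a const-qualified temporary cannot be assigned after it is declared, but it can be initialized. The sketch below illustrates that idea with hypothetical SKETCH_ macros. It relies on GCC/Clang extensions (statement expressions, __typeof__, inline asm) and is not the LinuxKPI definition.

```c
#include <assert.h>

/* Compiler-only barrier; GCC/Clang syntax is assumed here. */
#define sketch_barrier()	__asm__ __volatile__("" ::: "memory")

/*
 * A form such as
 *	__typeof__(x) var_; sketch_barrier(); var_ = <volatile read of x>;
 * does not compile when x is const-qualified, because var_ inherits the
 * qualifier.  Initializing var_ from a nested statement expression keeps
 * the barrier ahead of the read without ever assigning to var_.
 */
#define SKETCH_READ_ONCE(x) ({					\
	__typeof__(x) var_ = ({					\
		sketch_barrier();				\
		*(const volatile __typeof__(x) *)&(x);		\
	});							\
	sketch_barrier();					\
	var_;							\
})
```

Usage is the same for const and non-const objects, for example `const int *q = SKETCH_READ_ONCE(p);` where p is declared `const int *const`.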
2ee4a4e5acbeeedb5e2098bfd801b5a041355ae4 07-Mar-2017 dchagin <dchagin@FreeBSD.org> The Linux semop system call returns EINVAL when invalid nsops
or semid values are specified.

MFC after: 1 month
inux/linux_ipc.c
f5caa5a507338e1e97d435df73c78a80915fc7a1 07-Mar-2017 dchagin <dchagin@FreeBSD.org> Linux kernel does not export to the user space ipc_perm.mode values
other than S_IRWXUGO (0777).

MFC after: 1 month
inux/linux_ipc.c
d7b4f21065812e15462c754e9c897fbce3d82d05 07-Mar-2017 dchagin <dchagin@FreeBSD.org> Reduce code duplication between MD Linux code by moving SYSV IPC 64-bit
related struct definitions out into the MI path.

Invert the native ipc structs to Linux ipc structs conversion logic.
Since the 64-bit variant of the ipc structs has more precision, convert native ipc
structs to the 64-bit Linux ipc structs and then truncate 64-bit values
into the non-64-bit ones if needed. Unlike Linux, return EOVERFLOW if the
values do not fit.

Fix SYSV IPC for 64-bit Linuxulator which never sets IPC_64 bit.

MFC after: 1 month
inux/linux_ipc.c
inux/linux_ipc64.h
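The "convert wide, then truncate with a range check" approach described in d7b4f2106581 can be sketched as follows. The helper sketch_narrow_u64_to_u16() and the choice of a 16-bit destination field are hypothetical; the actual field widths in the Linux ipc structs are not taken from the source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Narrow a 64-bit value into a 16-bit field.  Instead of silently
 * dropping the upper bits, report EOVERFLOW when the value does not fit.
 */
static int
sketch_narrow_u64_to_u16(uint64_t wide, uint16_t *narrow)
{
	assert(narrow != NULL);
	if (wide > UINT16_MAX)
		return (EOVERFLOW);	/* destination is left unchanged */
	*narrow = (uint16_t)wide;
	return (0);
}
```

Doing the range check in one helper keeps the 64-bit conversion path free of per-field special cases; only the final narrowing step can fail.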
f2edf2f23dec26ffa29387204b4623badac382f8 07-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement time_is_after_eq_jiffies() function in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/jiffies.h
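Jiffies comparisons of the kind f2edf2f23dec adds are usually written so that they survive counter wraparound. The sketch below shows the technique with an explicit "now" argument instead of a global jiffies variable. The function names are hypothetical, and the 32-bit width follows the note elsewhere in this log that jiffies are 32-bit under FreeBSD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a is at or after b, treating the counter as circular.
 * The subtraction is done unsigned (well defined on wraparound) and the
 * result is reinterpreted as signed; this assumes the usual two's
 * complement conversion.
 */
static bool
sketch_time_after_eq(uint32_t a, uint32_t b)
{
	return ((int32_t)(a - b) >= 0);
}

/* True when the deadline is at or after the current tick count. */
static bool
sketch_time_is_after_eq(uint32_t deadline, uint32_t now)
{
	return (sketch_time_after_eq(deadline, now));
}
```

A plain `deadline >= now` would give the wrong answer for a deadline that has wrapped past zero while `now` is still near the top of the range.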
feff0471f337f58d7ae7a2965575befb38005b73 07-Mar-2017 hselasky <hselasky@FreeBSD.org> Fix implementation of the DECLARE_RWSEM() macro in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rwsem.h
01cb21eaf89c280cb2459b45492a35f2b7041438 07-Mar-2017 hselasky <hselasky@FreeBSD.org> Make sure jiffies value is cast to an integer in the LinuxKPI before
doing millisecond conversion. Under FreeBSD jiffies are 32-bit.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/jiffies.h
57cdaa84878cf1c8df8781ed2af15926bf91e8c6 07-Mar-2017 hselasky <hselasky@FreeBSD.org> Use grouptaskqueue for tasklets in the LinuxKPI.

This avoids creating own per-CPU threads and also ensures the tasklet
execution happens on the same CPU core invoking the tasklet.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/src/linux_tasklet.c
ca94e0274e45ec686f56c5cba7949ed6a3d5bd8e 07-Mar-2017 hselasky <hselasky@FreeBSD.org> LinuxKPI workqueue cleanup.

This change makes the workqueue implementation behave more like in
Linux, both functionality wise and structure wise.

All workqueue code has been moved to linux_work.c

Add an atomic based statemachine to the work_struct to ensure proper
operation. Prior to this change work_struct was directly mapped to a
FreeBSD task. When a taskqueue has multiple threads the same task may
end up being executed on more than one worker thread simultaneously.
This might cause problems with code coming from Linux, which expects
serial behaviour, similar to Linux tasklets.

Move all global workqueue function names into the linux_xxx domain to
avoid symbol name clashes in the future.

Implement a few more workqueue related functions and macros.

Create two multithreaded taskqueues for the LinuxKPI during module
load, one for time-consuming callbacks and one for non-time consuming
callbacks.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/workqueue.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_work.c
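The atomic state machine mentioned in ca94e0274e45 is not spelled out in the commit message. The following is a deliberately simplified sketch of the general idea, using C11 atomics: a work item can be queued only from the idle state and run only from the queued state, so two workers cannot execute it at once. All names and the three-state layout are hypothetical and do not reflect the states used in linux_work.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_WORK_IDLE, SKETCH_WORK_QUEUED, SKETCH_WORK_RUNNING };

struct sketch_work {
	atomic_int state;
	void (*fn)(struct sketch_work *);
	int runs;			/* bookkeeping for the example only */
};

static void
sketch_work_init(struct sketch_work *w, void (*fn)(struct sketch_work *))
{
	assert(w != NULL && fn != NULL);
	atomic_init(&w->state, SKETCH_WORK_IDLE);
	w->fn = fn;
	w->runs = 0;
}

/* Returns true only for the caller that moves the item IDLE -> QUEUED. */
static bool
sketch_queue_work(struct sketch_work *w)
{
	int expected = SKETCH_WORK_IDLE;

	return (atomic_compare_exchange_strong(&w->state, &expected,
	    SKETCH_WORK_QUEUED));
}

/* Only one worker can win QUEUED -> RUNNING, so fn never runs twice at once. */
static bool
sketch_run_work(struct sketch_work *w)
{
	int expected = SKETCH_WORK_QUEUED;

	if (!atomic_compare_exchange_strong(&w->state, &expected,
	    SKETCH_WORK_RUNNING))
		return (false);
	w->fn(w);
	atomic_store(&w->state, SKETCH_WORK_IDLE);
	return (true);
}

/* Example callback: count invocations. */
static void
sketch_count(struct sketch_work *w)
{
	w->runs++;
}
```

Unlike real workqueues, this sketch does not allow an item to be requeued while it is running; handling that case needs additional states.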
4c583e7b8e6435812a9897aa4fec32441652534c 06-Mar-2017 mmokhi <mmokhi@FreeBSD.org> Add UNIMPLEMENTED() placeholder macro for
the syscalls that are not implemented in Linux kernel itself.
Cleanup DUMMY() macros.

Reviewed by: dchagin, trasz
Approved by: dchagin
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D9804
inux/linux_util.h
1616caa6f020004a677c657907c43e3a6c3b766c 06-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement add_timer_on() function in the LinuxKPI.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/timer.h
inuxkpi/common/src/linux_compat.c
cb3ef41f4780acc6de2fcb13e2869d77504f1609 06-Mar-2017 hselasky <hselasky@FreeBSD.org> Implement DECLARE_RWSEM() macro in the LinuxKPI to initialize a
Read-Write semaphore during module init time.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rwsem.h
6925c56d7d977dcf74672e7ccde3ebf5f857c08c 06-Mar-2017 hselasky <hselasky@FreeBSD.org> Give LinuxKPI Read-Write semaphores better debug names when
WITNESS_ALL is defined. The lock name is based on the filename and
line number where the initialisation happens.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rwsem.h
e3550c135489f32c9949eef20f22af39688d7a3a 04-Mar-2017 hselasky <hselasky@FreeBSD.org> Remove duplicate prototype in the LinuxKPI to fix compilation warning.

Reported by: emaste @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/srcu.h
b67815020513c72f2b2fdaf8550a817d40f2e73e 04-Mar-2017 dchagin <dchagin@FreeBSD.org> Style(9).

MFC after: 1 month
inux/linux_ipc.c
5d85c8ac9eeee08495fd978c88820d9204d4d63e 04-Mar-2017 dchagin <dchagin@FreeBSD.org> Remove the __packed attribute from some IPC struct definitions since
the Linuxulator is x86 only.
The only notable difference in alignment on an LP64 64-bit system
compared to a 32-bit system is the alignment of types that are eight
bytes or larger.

MFC after: 1 month
inux/linux_ipc.c
94e2ec72e8aa0221720e4ecddfc3a6c0747e0c18 04-Mar-2017 dchagin <dchagin@FreeBSD.org> Hide Linux socketcall constants under corresponding #ifdef since
they are used only in i386 Linuxulator.

MFC after: 1 week
inux/linux_socket.h
4c10a150a60ba299c9e66c2cbaeb6895f2b85d17 03-Mar-2017 hselasky <hselasky@FreeBSD.org> Update the LinuxKPI RCU and SRCU wrappers for the concurrency kit, CK.

- Optimise the RCU implementation to not allocate and free
ck_epoch_records during runtime. Instead allocate two sets of
ck_epoch_records per CPU for general purpose use. The first set is
only used for reader locks and the second set is only used for
synchronization and barriers and is protected with a regular mutex to
prevent simultaneous issues.

- Move the task structure away from the rcu_head structure and into
the per-CPU structures. This allows the size of the rcu_head structure
to be reduced down to the size of two pointers.

- Fix a bug where the linux_rcu_barrier() function only waited for one
per-CPU epoch record to be completed instead of all.

- Use a critical section or a mutex to protect ck_epoch_begin() and
ck_epoch_end() depending on RCU or SRCU type. All the ck_epoch_xxx()
functions, except ck_epoch_register(), ck_epoch_unregister() and
ck_epoch_recycle() are not re-entrant and need a critical section or
a mutex to operate in the LinuxKPI, after inspecting the CK
implementation of the above mentioned functions. The simultaneous
issues arise from per-CPU epoch records being shared between multiple
threads depending on the amount of taskswitching and how many threads
are involved with the RCU and SRCU operations.

- Properly free all epoch records by using safe list traversal at
LinuxKPI module unload. It turns out that ck_epoch_recycle() always
keeps the records on an internal list and uses a flag in the epoch
record to track allocated and free entries. This would lead to a
use-after-free during module unload.

- Remove redundant synchronize_rcu() call from the
linux_compat_uninit() function. Let the linux_rcu_runtime_uninit()
function do the final rcu_barrier() instead.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/srcu.h
inuxkpi/common/include/linux/types.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_rcu.c
b179c90b2b73c9de19377e9e33d7b7b0dd617a15 01-Mar-2017 kib <kib@FreeBSD.org> With the removal of IA64, the only arch which uses ia32 compat is amd64.

Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
a32/ia32_sysvec.c
3cde5ded9b2f956b961e109b35f52ea6162afc35 28-Feb-2017 dchagin <dchagin@FreeBSD.org> Linux epoll returns EEXIST when op is EPOLL_CTL_ADD and the supplied
file descriptor fd is already registered with this epoll instance.

MFC after: 1 month
inux/linux_event.c
6bdeecaaab4f16ef6fba86dc5ad555326809ed11 28-Feb-2017 dchagin <dchagin@FreeBSD.org> Linux epoll returns ENOENT when op is EPOLL_CTL_MOD or
EPOLL_CTL_DEL and fd is not registered with this epoll instance.

MFC after: 1 month
inux/linux_event.c
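The two epoll error rules from 3cde5ded9b2f and 6bdeecaaab4f can be summarized with a toy registry. This is only a sketch: struct sketch_epoll, the SKETCH_CTL_ constants and the fixed-size table are hypothetical, and the EBADF range check is an assumption added so the example is self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_op { SKETCH_CTL_ADD, SKETCH_CTL_MOD, SKETCH_CTL_DEL };

#define SKETCH_MAX_FD	64

struct sketch_epoll {
	bool registered[SKETCH_MAX_FD];	/* one flag per descriptor */
};

static int
sketch_epoll_ctl(struct sketch_epoll *ep, enum sketch_op op, int fd)
{
	assert(ep != NULL);
	if (fd < 0 || fd >= SKETCH_MAX_FD)
		return (EBADF);		/* assumption of this sketch */
	switch (op) {
	case SKETCH_CTL_ADD:
		if (ep->registered[fd])
			return (EEXIST);	/* already registered */
		ep->registered[fd] = true;
		return (0);
	case SKETCH_CTL_MOD:
		return (ep->registered[fd] ? 0 : ENOENT);
	case SKETCH_CTL_DEL:
		if (!ep->registered[fd])
			return (ENOENT);	/* nothing to remove */
		ep->registered[fd] = false;
		return (0);
	}
	return (EINVAL);
}
```

Adding the same descriptor twice yields EEXIST on the second call; modifying or deleting a descriptor that was never added yields ENOENT.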
85c7523219e34dfc616089bd993ad8c9e201e78c 28-Feb-2017 dchagin <dchagin@FreeBSD.org> FreeBSD does not have an analogue for the epoll EPOLLPRI event type.
So, do not set the EPOLLPRI event accidentally.
Also, do not set the EPOLLWRNORM and EPOLLRDNORM events, as epoll
does not set these events.

MFC after: 1 month
inux/linux_event.c
745bcd6fbacaf2197a541b607052a7b399ea8a95 28-Feb-2017 glebius <glebius@FreeBSD.org> Remove SVR4 (System V Release 4) binary compatibility support.

UNIX System V Release 4 is an operating system released in 1988. It ceased
to exist in the early 2000s.
vr4/Makefile
vr4/README
vr4/TO-DO
vr4/imgact_svr4.c
vr4/svr4.h
vr4/svr4_acl.h
vr4/svr4_dirent.h
vr4/svr4_errno.h
vr4/svr4_exec.h
vr4/svr4_fcntl.c
vr4/svr4_fcntl.h
vr4/svr4_filio.c
vr4/svr4_filio.h
vr4/svr4_fuser.h
vr4/svr4_hrt.h
vr4/svr4_ioctl.c
vr4/svr4_ioctl.h
vr4/svr4_ipc.c
vr4/svr4_ipc.h
vr4/svr4_misc.c
vr4/svr4_mman.h
vr4/svr4_proto.h
vr4/svr4_resource.c
vr4/svr4_resource.h
vr4/svr4_siginfo.h
vr4/svr4_signal.c
vr4/svr4_signal.h
vr4/svr4_socket.c
vr4/svr4_socket.h
vr4/svr4_sockio.c
vr4/svr4_sockio.h
vr4/svr4_sockmod.h
vr4/svr4_stat.c
vr4/svr4_stat.h
vr4/svr4_statvfs.h
vr4/svr4_stream.c
vr4/svr4_stropts.h
vr4/svr4_syscall.h
vr4/svr4_syscallnames.c
vr4/svr4_sysconfig.h
vr4/svr4_sysent.c
vr4/svr4_systeminfo.h
vr4/svr4_sysvec.c
vr4/svr4_termios.c
vr4/svr4_termios.h
vr4/svr4_time.h
vr4/svr4_timod.h
vr4/svr4_types.h
vr4/svr4_ucontext.h
vr4/svr4_ulimit.h
vr4/svr4_ustat.h
vr4/svr4_util.h
vr4/svr4_utsname.h
vr4/svr4_wait.h
vr4/syscalls.conf
vr4/syscalls.master
56928ce42b63e1fe4407a601c26fcfea12dab068 27-Feb-2017 dchagin <dchagin@FreeBSD.org> Return EINVAL when an invalid file descriptor is specified.

MFC after: 1 month
inux/linux_event.c
8b75f25a16d52a14a4f14cfee2e9f70cb607607e 27-Feb-2017 dchagin <dchagin@FreeBSD.org> Unify eventfd ioctl method and use it for other similar interfaces.

MFC after: 1 month
inux/linux_event.c
d85177fb6673ba3a8975ec1c79c13fbd04066f0d 27-Feb-2017 hselasky <hselasky@FreeBSD.org> Implement more bit operation functions in the LinuxKPI.
Some minor whitespace nits while at it.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitops.h
ead9f4a7a29a2a30258c181e0562572628efffec 27-Feb-2017 hselasky <hselasky@FreeBSD.org> Define __sum16 type in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/types.h
7ee8ec16cda12e4f48ee09b0348686147842e019 26-Feb-2017 dchagin <dchagin@FreeBSD.org> Return EINVAL when an invalid signal mask size is specified.

MFC after: 1 month
inux/linux_event.c
5023c81790e5474e27f3149ef325975145b8fe2d 26-Feb-2017 dchagin <dchagin@FreeBSD.org> Restore signal mask in epoll_pwait.

MFC after: 1 month
inux/linux_event.c
5f14efc75a4108c9668a7c7eb596afabff7bbc23 26-Feb-2017 dchagin <dchagin@FreeBSD.org> Return EINVAL when an invalid file descriptor is specified.

MFC after: 1 month
inux/linux_event.c
396e17522fb9168762113681cc46d25b7aa307e2 26-Feb-2017 dchagin <dchagin@FreeBSD.org> Implement timerfd family syscalls.

MFC after: 1 month
inux/linux_event.c
inux/linux_event.h
inux/linux_time.c
inux/linux_timer.h
01e88578646b103c8c66e88e079db1daba90f7d7 26-Feb-2017 dchagin <dchagin@FreeBSD.org> Mostly style(9) changes; replace the unused eventfd_truncate()
with the default invfo_truncate() method.

MFC after: 1 month
inux/linux_event.c
860fd9aad77c0ac1dc02eb61093ef4c81ad21ec3 26-Feb-2017 dchagin <dchagin@FreeBSD.org> Return an EOVERFLOW error when the size of the tv_sec field of struct timespec
in the COMPAT_LINUX32 Linuxulator is not equal to the size of the native tv_sec.

MFC after: 1 month
inux/linux_misc.c
inux/linux_time.c
inux/linux_timer.h
aaee2586480086339bf5303ff55432686fe7aeb8 25-Feb-2017 trasz <trasz@FreeBSD.org> Fix linux_fstatfs() to return proper value for f_frsize. Without it,
linux df(1) binary from Xenial shows garbage.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9692
inux/linux_stats.c
bf114a4c4ab96ebd1498400e34b1fd1b5e1bd69e 24-Feb-2017 mmokhi <mmokhi@FreeBSD.org> Add linux_preadv() and linux_pwritev() syscalls to Linuxulator.

Reviewed by: dchagin
Approved by: dchagin, trasz (src committers)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9722
inux/linux_file.c
8f19e3211041026ad4089a82d6a72f993fef1771 24-Feb-2017 dchagin <dchagin@FreeBSD.org> Revert r314217. The commit does not match what I approved.
inux/linux_file.c
0de5ce37241d326ad1dadf9de30696666ceeb33b 24-Feb-2017 mmokhi <mmokhi@FreeBSD.org> Add linux_preadv() and linux_pwritev() syscalls to Linuxulator.

Reviewed by: dchagin
Approved by: dchagin, trasz (src committers)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9722
inux/linux_file.c
60ac5d99b9367a3d5edc5d7364b53d075ee5cff1 24-Feb-2017 hselasky <hselasky@FreeBSD.org> Implement more string functions in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/string.h
1cec1dcad5bc8e3b98b8b70a04624487457ebcec 24-Feb-2017 hselasky <hselasky@FreeBSD.org> Prototype device structure to ensure LinuxKPI header file can be
included standalone.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
d795bd752a6a6cccf6b4c401ce6b3cdf142b11c9 24-Feb-2017 hselasky <hselasky@FreeBSD.org> Implement srcu_dereference() macro in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/srcu.h
2cf5bc511c9aa85db8c0a9570cb43a26ea6b5cdb 24-Feb-2017 hselasky <hselasky@FreeBSD.org> Implement BIT_ULL() macro in the LinuxKPI.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitops.h
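For context on 2cf5bc511c9a, a BIT_ULL()-style macro builds a single-bit mask of type unsigned long long, which matters for bit positions of 32 and above on platforms where long is 32 bits wide. The sketch below uses the hypothetical names SKETCH_BIT_ULL() and sketch_flag_mask(); it shows the conventional definition rather than quoting the LinuxKPI header.

```c
#include <assert.h>
#include <stdint.h>

/* Single-bit mask with a 64-bit type, valid for n in [0, 63]. */
#define SKETCH_BIT_ULL(n)	(1ULL << (n))

/* Combine two flag bits into one 64-bit mask. */
static inline uint64_t
sketch_flag_mask(unsigned int lo, unsigned int hi)
{
	assert(lo < 64 && hi < 64);
	return (SKETCH_BIT_ULL(lo) | SKETCH_BIT_ULL(hi));
}
```

Writing `1UL << 40` instead would shift past the width of the type wherever unsigned long is 32 bits, which is undefined behavior in C.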
deaac43580f4ae4b0f538c74dfd3444777ba2b49 23-Feb-2017 hselasky <hselasky@FreeBSD.org> Implement __test_and_clear_bit() and __test_and_set_bit() in the LinuxKPI.

The clang compiler will optimise these functions down to three AMD64
instructions if the bit argument is a constant during compilation.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitops.h
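The non-atomic bit helpers named in deaac43580f4 follow a simple pattern: locate the word, build the mask, remember the old bit, then update. A minimal sketch under hypothetical names (sketch_test_and_set_bit() and sketch_test_and_clear_bit()) is shown below. The parameter types are an assumption and may differ from the LinuxKPI prototypes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Set the bit and return its previous value (0 or 1).  Not atomic. */
static inline int
sketch_test_and_set_bit(unsigned long bit, unsigned long *addr)
{
	unsigned long *word, mask;
	int old;

	assert(addr != NULL);
	word = addr + bit / SKETCH_BITS_PER_LONG;
	mask = 1UL << (bit % SKETCH_BITS_PER_LONG);
	old = (*word & mask) != 0;
	*word |= mask;
	return (old);
}

/* Clear the bit and return its previous value (0 or 1).  Not atomic. */
static inline int
sketch_test_and_clear_bit(unsigned long bit, unsigned long *addr)
{
	unsigned long *word, mask;
	int old;

	assert(addr != NULL);
	word = addr + bit / SKETCH_BITS_PER_LONG;
	mask = 1UL << (bit % SKETCH_BITS_PER_LONG);
	old = (*word & mask) != 0;
	*word &= ~mask;
	return (old);
}
```

Because these helpers are plain read-modify-write sequences, callers need their own locking when the bitmap is shared between threads.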
6b0ce86f279c8de45fadfdd92a06aed1d3d81632 23-Feb-2017 dchagin <dchagin@FreeBSD.org> The right clock defines are specified in linux_timer.h.
Get rid of spurious clock defines from linux_misc.h.

MFC after: 1 week
inux/linux_misc.h
d7fd8b580c35835090420433c5a42a90a1baf9c2 22-Feb-2017 hselasky <hselasky@FreeBSD.org> Convert magic values into macros in the LinuxKPI scatterlist
implementation.

Suggested by: cem @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/scatterlist.h
760259318e9a7a4d695bf657bdb871f0b25f0f26 22-Feb-2017 hselasky <hselasky@FreeBSD.org> Optimise unmapped LinuxKPI page allocations.

When allocating unmapped pages, take advantage of the direct map on
AMD64 to get the virtual address corresponding to a page. Else all
pages allocated must be mapped because sometimes the virtual address
of a page is requested.

Move all page allocation and deallocation code into its own C file.

Add support for GFP_DMA32, GFP_KERNEL, GFP_ATOMIC and __GFP_ZERO
allocation flags.

Make a clear separation between mapped and unmapped allocations.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/gfp.h
inuxkpi/common/src/linux_page.c
3f3d5ba6f41d43cff6dea653ce7e818a367a874b 22-Feb-2017 hselasky <hselasky@FreeBSD.org> Improve LinuxKPI scatter list support.

The i915kms driver in Linux 4.9 reimplements parts of the scatter list
functions with regards to performance. In other words there is not so
much room for changing structure layouts and functionality if the
i915kms should be built AS-IS. This patch aligns the scatter list
support to what is expected by the i915kms driver. Remove some
comments not needed while at it.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/scatterlist.h
f407fff4dec58b3db28e387d9bae000ae22b720a 21-Feb-2017 hselasky <hselasky@FreeBSD.org> Replace dummy implementation of RCU in the LinuxKPI with one based on
the in-kernel concurrency kit's ck_epoch API. Factor RCU hlist_xxx()
functions into own rculist.h header file.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/rculist.h
inuxkpi/common/include/linux/rcupdate.h
inuxkpi/common/include/linux/srcu.h
inuxkpi/common/include/linux/types.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_rcu.c
7a0e1ff531c8c910508f2b6980ec0565bb7884b6 21-Feb-2017 trasz <trasz@FreeBSD.org> Get rid of foo_sys() in linuxulator code. It was commented out, and it
would be useless anyway - there is no point in pretending to have block
devices; our "block" devices are in fact character ones, and can only
be accessed as such.

Discussed with: dchagin
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
inux/linux_stats.c
7571839786b3a3fec9d4dbf68d23edb8f2837efb 21-Feb-2017 hselasky <hselasky@FreeBSD.org> Streamline the LinuxKPI spinlock wrappers.

1) Add better spinlock debug names when WITNESS_ALL is defined.

2) Make sure that the calling thread gets bound to the current CPU
while a spinlock is locked. Some Linux kernel code depends on the
CPU ID not changing while a spinlock is locked.

3) Add support for using LinuxKPI spinlocks during a panic().

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/spinlock.h
be82accd8211063219986130acc719ce8a4722ac 21-Feb-2017 hselasky <hselasky@FreeBSD.org> Add support for LinuxKPI tasklets.

Tasklets are implemented using a taskqueue and a small statemachine on
top. The additional statemachine is required to ensure all LinuxKPI
tasklets get serialized. FreeBSD taskqueues do not guarantee
serialisation of their tasks, except when there is only one worker
thread configured.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bottom_half.h
inuxkpi/common/include/linux/interrupt.h
inuxkpi/common/src/linux_tasklet.c
5a4680a8656b55439b5118e34eb1e666088674f8 21-Feb-2017 hselasky <hselasky@FreeBSD.org> Make the LinuxKPI task struct persistent across system calls.

A set of helper functions have been added to manage the life of the
LinuxKPI task struct. When an external system call or task is invoked,
a check is made to create the task struct by demand. A thread
destructor callback is registered to free the task struct when a
thread exits to avoid memory leaks.

This change lays the ground for emulating the Linux kernel more
closely, which the code using the LinuxKPI APIs depends on.

A new dedicated td_lkpi_task field has been added to struct thread
instead of abusing td_retval[1].

Fix some header file inclusions to make LINT kernel build properly
after this change.

Bump the __FreeBSD_version to force a rebuild of all kernel modules.

MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/bitops.h
inuxkpi/common/include/linux/compat.h
inuxkpi/common/include/linux/file.h
inuxkpi/common/include/linux/jiffies.h
inuxkpi/common/include/linux/kdev_t.h
inuxkpi/common/include/linux/kernel.h
inuxkpi/common/include/linux/kthread.h
inuxkpi/common/include/linux/rwlock.h
inuxkpi/common/include/linux/rwsem.h
inuxkpi/common/include/linux/sched.h
inuxkpi/common/include/linux/semaphore.h
inuxkpi/common/include/linux/spinlock.h
inuxkpi/common/include/linux/types.h
inuxkpi/common/include/linux/wait.h
inuxkpi/common/src/linux_compat.c
inuxkpi/common/src/linux_current.c
inuxkpi/common/src/linux_kthread.c
inuxkpi/common/src/linux_pci.c
368fd06ff32af6269a5778a96473fa9273225814 20-Feb-2017 trasz <trasz@FreeBSD.org> Add /proc/self/mounts to linprocfs; some linux binaries need it.

MFC after: 2 weeks
Sponsored by: DARPA, AFRL
inprocfs/linprocfs.c
d6f65ff5291b84761946f4a40b4ca56d03603331 19-Feb-2017 trasz <trasz@FreeBSD.org> There are some Linux binaries that expect the system to obey the "addr"
parameter to mmap(2), even if MAP_FIXED is not explicitly specified.
Android ART is one example. Implement bug compatibility for this case
in linuxulator.

Reviewed by: dchagin@
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9373
inux/linux_mmap.c
ca1a1a9d60fc7133fed31bb48241296ad502531f 19-Feb-2017 dchagin <dchagin@FreeBSD.org> Implement rt_tgsigqueueinfo system call used by glibc for pthread_sigqueue(3).

MFC after: 2 weeks
inux/linux_signal.c
554adae91ed8c7d85045eaca2c741966bde44e48 18-Feb-2017 dchagin <dchagin@FreeBSD.org> Style(9), some XXX comments fix. No functional changes.

MFC after: 1 week
inux/linux_socket.c
665056f70d80d2d51b3f17fbde3bb414c79d1dd6 18-Feb-2017 dchagin <dchagin@FreeBSD.org> Initialize cap_rights before use.

MFC after: 1 week
inux/linux_socket.c
bb2f96be4628ab41999f53e7dcc37fc016196386 18-Feb-2017 dchagin <dchagin@FreeBSD.org> Finish r313684.

Convert linux_recv(), linux_send() and linux_accept() system call arguments
to the register_t type too.

PR: 217161
MFC after: 3 days
xMFC with: r313284,r313285,r313684
inux/linux_socket.c
inux/linux_socket.h
1940f5fb0552d50f46146cf80281d2a0928282e2 17-Feb-2017 hselasky <hselasky@FreeBSD.org> Implement GFP_DMA32 flag in the LinuxKPI.
Define all FreeBSD native GFP bits as GFP_NATIVE_MASK.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/dma-mapping.h
inuxkpi/common/include/linux/gfp.h
f2c1ffe7cd0d0ac2063ba76d3879651a1a767ffa 16-Feb-2017 hselasky <hselasky@FreeBSD.org> Allow container_of() to be used with constant data pointers.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/kernel.h
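One common way to let a container_of()-style macro accept pointers to const members, as f2c1ffe7cd0d describes, is to declare the intermediate pointer as pointer-to-const. The sketch below uses the hypothetical name sketch_container_of() and relies on GCC/Clang extensions (statement expressions and __typeof__). It illustrates the technique and is not the text of the LinuxKPI macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * The temporary is a pointer to a const-qualified member, so both
 * "int *" and "const int *" arguments convert to it without a warning.
 */
#define sketch_container_of(ptr, type, member) ({			\
	const __typeof__(((type *)0)->member) *mptr_ = (ptr);		\
	(type *)((uintptr_t)mptr_ - offsetof(type, member));		\
})

/* Example structure used to exercise the macro. */
struct sketch_outer {
	int a;
	int b;
};

/* Recover the enclosing structure from a pointer to its b member. */
static inline const struct sketch_outer *
sketch_outer_from_b(const int *bp)
{
	assert(bp != NULL);
	return (sketch_container_of(bp, struct sketch_outer, b));
}
```

Note that the macro result is a non-const pointer even when the input was const; callers working with constant data should store it in a pointer-to-const, as the helper does.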
9624095350a01027c83805c38ac71855a79a1567 16-Feb-2017 hselasky <hselasky@FreeBSD.org> Implement more LinuxKPI atomic functions and macros.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic-long.h
inuxkpi/common/include/asm/atomic.h
inuxkpi/common/include/asm/atomic64.h
f6311e2737b7897ccbab19eae4d01001d1f9c570 16-Feb-2017 hselasky <hselasky@FreeBSD.org> Allow passing a constant atomic_t to atomic_read().

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/asm/atomic.h
a610c4c939fd39efd8981e21428a714313908a39 16-Feb-2017 hselasky <hselasky@FreeBSD.org> Whitespace fix.

Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
inuxkpi/common/include/linux/device.h
91cb2efc3777ba22ce978f05fe1308b9add9507b 16-Feb-2017 trasz <trasz@FreeBSD.org> Improve debugging output.

MFC after: 2 weeks
Sponsored by: DARPA, AFRL
inux/linux.c
be921ed759376f17b28bc75e4ff910823ae124cb 14-Feb-2017 dchagin <dchagin@FreeBSD.org> Replace the Linuxulator implementation of readdir(), getdents() and
getdents64() with a wrapper over kern_getdirentries().

The patch was originally written by emaste@ and then adapted by trasz@
and me.

Notes:
1. I split linux_getdents() and linux_readdir() because when getdents()
is called with count = 1 (the readdir() case) it can overwrite the
user stack (by writing more than 1 byte through the user buffer pointer).

2. Linux returns EINVAL when the user-supplied buffer is too small
to hold the fetched dirent.

3. Linux returns ENOTDIR when fd does not refer to a directory.

Reviewed by: trasz@
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D2210
inux/linux_file.c
0dac2c595504e696b2f46c683b022e24a6614e42 13-Feb-2017 kib <kib@FreeBSD.org> Rework r313352.

Rename kern_vm_* functions to kern_*. Move the prototypes to
syscallsubr.h. Also change Mach VM types to uintptr_t/size_t as
needed, to avoid headers pollution.

Requested by: alc, jhb
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D9535
loudabi/cloudabi_mem.c
reebsd32/freebsd32_misc.c
inux/linux_misc.c
inux/linux_mmap.c
47bb4eafc2cd231c8df5414cc4cf91f85bd1af9c 13-Feb-2017 kib <kib@FreeBSD.org> Style: wrap long line.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days
reebsd32/freebsd32_misc.c
37b3e9269d80471e7166706bc034099030387d36 12-Feb-2017 dchagin <dchagin@FreeBSD.org> Fix r313284.

Members of the syscall argument structures are padded to a word size. So,
for COMPAT_LINUX32 we should convert the user-supplied system call arguments,
which are 32-bit in that case, to an array of register_t.

Reported by: Oleg V. Nauman
MFC after: 1 week
inux/linux_socket.c
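
The conversion described in this commit can be sketched in userland as follows. This is an assumption-laden illustration, not the code in linux_socket.c: sketch_register_t is a hypothetical stand-in for the kernel's register_t, and widen_args32 is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's register_t (one machine word). */
typedef intptr_t sketch_register_t;

/*
 * Widen "n" 32-bit user-supplied arguments into word-sized slots, one
 * argument per slot, matching the padded layout of an argument structure.
 */
static void
widen_args32(const uint32_t *in, sketch_register_t *out, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = (sketch_register_t)in[i];
}
```

Copying the 32-bit values element by element, rather than reinterpreting the buffer in place, is what keeps each argument aligned with its word-sized structure member.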
e80fc50712d09f2930eba9a68b9783ad7316855c 10-Feb-2017 jhb <jhb@FreeBSD.org> Regenerate all the system call tables to drop "created from" lines.

One of the ibcs2 files contains some actual changes (new headers) as
it hasn't been regenerated after older changes to makesyscalls.sh.
loudabi32/cloudabi32_proto.h
loudabi32/cloudabi32_syscall.h
loudabi32/cloudabi32_syscalls.c
loudabi32/cloudabi32_sysent.c
loudabi64/cloudabi64_proto.h
loudabi64/cloudabi64_syscall.h
loudabi64/cloudabi64_syscalls.c
loudabi64/cloudabi64_sysent.c
reebsd32/freebsd32_proto.h
reebsd32/freebsd32_syscall.h
reebsd32/freebsd32_syscalls.c
reebsd32/freebsd32_sysent.c
vr4/svr4_proto.h
vr4/svr4_syscall.h
vr4/svr4_syscallnames.c
vr4/svr4_sysent.c
322d98e6921174c322eb5ccdb2448fced4e301a0 06-Feb-2017 trasz <trasz@FreeBSD.org> Add kern_vm_mmap2(), kern_vm_mprotect(), kern_vm_msync(), kern_vm_munlock(),
kern_vm_munmap(), and kern_vm_madvise(), and use them in various compats
instead of their sys_*() counterparts.

Reviewed by: ed, dchagin, kib
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9378
loudabi/cloudabi_mem.c
reebsd32/freebsd32_misc.c
inux/linux_misc.c
inux/linux_mmap.c
1ba976c2fa9ab5ed7d079931b65bac52647abf74 05-Feb-2017 dchagin <dchagin@FreeBSD.org> Update syscall.master to 4.10-rc6. Also fix comments, a typo,
and wrong numbering for a few unimplemented syscalls.

For 32-bit Linuxulator, socketcall() syscall was historically
the entry point for the sockets API. Starting in Linux 4.3, direct
syscalls are provided for the sockets API. Enable it.

The initial version of the patch was provided by trasz@ and extended by me.

Submitted by: trasz
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D9381
inux/linux_socket.h
14153f36ce9bc2ea3ed09ed4910b3127608591fe 05-Feb-2017 trasz <trasz@FreeBSD.org> Fix linux_pipe() and linux_pipe2() to close file descriptors on copyout
error.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9425
inux/linux_file.c
2751ab501f395baaab10f6977f435d23bf3aa8ef 05-Feb-2017 trasz <trasz@FreeBSD.org> Add kern_cpuset_getaffinity() and kern_cpuset_setaffinity(),
and use them in compats instead of their sys_*() counterparts.

Reviewed by: ki