History log of /freebsd-head/sys/compat/linux/linux_socket.c
Revision Date Author Comments
7fbac817ea4993423702bc4bc6a669e000cf7b1c 17-Sep-2020 trasz <trasz@FreeBSD.org> Reduce code duplication by introducing linux_copyout_sockaddr()
helper function. No functional changes.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25804
371726d1752f5ce493011c64efd99ea737a379b2 17-Sep-2020 trasz <trasz@FreeBSD.org> Get rid of sv_errtbl and SV_ABI_ERRNO().

Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26388
6265a56fda2455ca5973c47b3d0d702ed6d409ec 01-Sep-2020 mjg <mjg@FreeBSD.org> compat: clean up empty lines in .c and .h files
adaa7ce8e90388e6481d918fd12ef528eda58f4b 18-Aug-2020 markj <markj@FreeBSD.org> Fix handling of ancillary data on non-AF_UNIX Linux sockets.

After r340674, the "continue" would restart the loop without having
updated clen, resulting in an infinite loop. Restore the old behaviour
of simply ignoring all control messages on such sockets, since we
currently only implement handling for AF_UNIX-specific messages.

Reported by: syzkaller
Reviewed by: tijl
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26093
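
The failure mode described in this entry, a "continue" that skips the update of the remaining control length, can be illustrated with a small user-space sketch. This is not the kernel code and not the fix that was committed (the commit restores ignoring all control messages on non-AF_UNIX sockets). The structure layout and the names demo_cmsghdr and demo_count_cmsgs are hypothetical, chosen only to show a walk that advances before any "continue".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified header; not the Linux or FreeBSD cmsghdr layout. */
struct demo_cmsghdr {
	uint32_t len;	/* total length of this message, header included */
	int32_t  level;
	int32_t  type;
};

/*
 * Count the control messages of one level in buf.  The cursor and the
 * remaining length are advanced before any "continue", so skipping an
 * unsupported message cannot restart the loop on the same header.
 * Returns -1 if a message carries an impossible length.
 */
static int
demo_count_cmsgs(const unsigned char *buf, size_t clen, int32_t wanted_level)
{
	struct demo_cmsghdr hdr;
	int count = 0;

	while (clen >= sizeof(hdr)) {
		memcpy(&hdr, buf, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > clen)
			return (-1);

		/* Advance first; every path below may now safely continue. */
		buf += hdr.len;
		clen -= hdr.len;

		if (hdr.level != wanted_level)
			continue;	/* ignore unsupported messages */
		count++;
	}
	return (count);
}
```

Advancing at the top of the loop body is a design choice for the sketch: it keeps the termination argument in one place regardless of how many skip paths are added later.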
3b2643a2c3b9475bec202de255ac5110937aa937 05-Jul-2020 trasz <trasz@FreeBSD.org> Fix Linux recvmsg(2) when msg_namelen returned is 0. Previously
it would fail with EINVAL, breaking some of the Python regression
tests.

While here, cap the user-controlled message length.

Note that the code doesn't seem to be copying out the new length
in either (success or failure) case. This will be addressed separately.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25392
967382a845cf780b5c0ef0535a40e4193e8b66bb 01-Jul-2020 trasz <trasz@FreeBSD.org> Rework linux accept(2). This makes the code flow easier to follow,
and fixes a bug where calling accept(2) could result in closing fd 0.

Note that the code still contains a number of problems: it makes
assumptions about l_sockaddr_in being the same as sockaddr_in,
the EFAULT-related code looks like it doesn't work at all, and the
socket type check is racy. Those will be addressed later on;
I'm trying to work in small steps to avoid breaking one thing while
fixing another.

It fixes Redis, among other things.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25461
4ee3a6b8384dfbb5ffa727cc174f88475e119bec 28-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) support SO_PROTOCOL. Running Python test suite
with python3.8 from Focal triggers those.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25491
405e9d0fd06c8491a135a21c8832fef041faa5fa 14-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) warn about unsupported CMSG level/type.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25255
1e3f9796b747b306c622ca516b781d65f464e907 12-Jun-2020 trasz <trasz@FreeBSD.org> Fix naming clash.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
1d5d0f32a0d1cda88e33d8121d1b5b2de77c104d 12-Jun-2020 trasz <trasz@FreeBSD.org> Minor code cleanup; no functional changes.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25232
d9119d071e4c677da4ab156bfbe265ae2fe7feb2 11-Jun-2020 trasz <trasz@FreeBSD.org> Make linux(4) handle SO_REUSEPORT.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25216
d9496f6f43e4aab10dd14b2b121108e1c75a7c43 10-Jun-2020 trasz <trasz@FreeBSD.org> Support SO_SNDBUFFORCE/SO_RCVBUFFORCE by aliasing them to the
standard SO_SNDBUF/SO_RCVBUF. Mostly cosmetics, to get rid
of the warning during 'apt upgrade'.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25173
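
A minimal sketch of the aliasing idea, assuming the Linux option numbers from asm-generic/socket.h (some Linux architectures use different values). The function name demo_map_sockopt and the DEMO_ constants are hypothetical; the host's own SO_SNDBUF and SO_RCVBUF stand in for the native options, and the real translation function in linux_socket.c is not reproduced here.

```c
#include <sys/socket.h>

/* Linux option numbers as found in asm-generic/socket.h (an assumption here). */
#define DEMO_LINUX_SO_SNDBUF		7
#define DEMO_LINUX_SO_RCVBUF		8
#define DEMO_LINUX_SO_SNDBUFFORCE	32
#define DEMO_LINUX_SO_RCVBUFFORCE	33

/*
 * Map a Linux SOL_SOCKET option to the host's option.  The FORCE variants
 * are simply aliased to the ordinary buffer-size options.  Returns -1 for
 * options this sketch does not know about.
 */
static int
demo_map_sockopt(int linux_opt)
{
	switch (linux_opt) {
	case DEMO_LINUX_SO_SNDBUF:
	case DEMO_LINUX_SO_SNDBUFFORCE:
		return (SO_SNDBUF);
	case DEMO_LINUX_SO_RCVBUF:
	case DEMO_LINUX_SO_RCVBUFFORCE:
		return (SO_RCVBUF);
	default:
		return (-1);
	}
}
```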
74c388693eb07ef738f06dff754a8e1b816da414 27-Feb-2020 trasz <trasz@FreeBSD.org> Make linuxulator warn about unsupported getsockopt/setsockopt flags.

MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D23791
1c619be1ef0d9bff14d975c26f78d901ed799461 10-Feb-2020 trasz <trasz@FreeBSD.org> Make linux(4) use kern_socketpair(9) instead of going through
sys_socketpair(). It's a cleanup; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22814
77a92d8644e7b1b97832b4a058653519c0b56830 05-Feb-2020 emaste <emaste@FreeBSD.org> linuxulator: implement sendfile

Submitted by: Bora Özarslan <borako.ozarslan@gmail.com>
Submitted by: Yang Wang <2333@outlook.jp>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19917
76f0f7625b8d9e62e5b5319f24973624ad63bd84 28-Jan-2020 trasz <trasz@FreeBSD.org> Add TCP_CORK support to linux(4). This fixes one of the things Nginx
trips over.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23171
75529ff6f1a46c5067a8bf2b91f4db777f1d0549 28-Jan-2020 trasz <trasz@FreeBSD.org> Add compat.linux.ignore_ip_recverr sysctl. This is a workaround
for missing IP_RECVERR setsockopt(2) support. Without it, DNS
resolution is broken for glibc >= 2.30 (glibc BZ #24047).

From the user point of view this fixes "yum update" on recent
CentOS 8.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23234
a85eb0d89bc06626ad19b453635a881137a55e3f 14-Jan-2020 trasz <trasz@FreeBSD.org> Make linux(4) use kern_setsockopt(9) instead of going through
sys_setsockopt. Just a cleanup; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22812
2e64d12585dd8d20fcdaec96fb2415fcb1528ee7 14-Jan-2020 trasz <trasz@FreeBSD.org> Make linux(4) use kern_getsockopt(9) instead of going through
sys_getsockopt(). It's a cleanup; no functional changes.

Reviewed by: kib (earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22813
a25b408b04599c6fce2193dbdb311213402b0efa 30-May-2019 dchagin <dchagin@FreeBSD.org> Complete LOCAL_PEERCRED support. Cache pid of the remote process in the
struct xucred. Do not bump XUCRED_VERSION as struct layout is not changed.

PR: 215202
Reviewed by: tijl
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20415
3a2fd1de63ae212f9419b87ba5a38e09da93f996 30-May-2019 dchagin <dchagin@FreeBSD.org> Linux does not support MSG_OOB for unix(4) or non-stream-oriented sockets;
return EOPNOTSUPP as Linux does.

Reviewed by: tijl
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20409
bc0cda5ebb04dab606bf2f24a53eacb1a88527aa 21-May-2019 dchagin <dchagin@FreeBSD.org> Do not leak sa in linux_recvmsg() call if kern_recvit() fails.

MFC after: 1 week
de381845cd93ffd3bb9642989bbf7295c74e6a99 21-May-2019 dchagin <dchagin@FreeBSD.org> Do not use uninitialised sa.

Reported by: tijl@
MFC after: 1 week
05ec103068d451f2e42090eda46b5f674dbabc18 21-May-2019 dchagin <dchagin@FreeBSD.org> Do not leak sa in linux_recvfrom() call if kern_recvit() fails.

MFC after: 1 week
7217903de21b1d0323d452adf11df7307a4cd878 19-May-2019 dchagin <dchagin@FreeBSD.org> Linux send() call returns EAGAIN instead of ENOTCONN when the
socket is non-blocking and connect() has not finished yet.

Initial patch developed by Steven Hartland in 2008 and adopted by me.

PR: 129169
Reported by: smh@
MFC after: 2 weeks
c8001997cef4aca8409da5342de76fde94dceb96 13-May-2019 dchagin <dchagin@FreeBSD.org> Linuxulator getpeername() returns EINVAL when namelen is less than 0.

MFC after: 2 weeks
57102bcf40779603c9e7548e77d75bc66b7514a6 13-May-2019 dchagin <dchagin@FreeBSD.org> Our bsd_to_linux_sockaddr() and linux_to_bsd_sockaddr() functions
alter the userspace sockaddr to convert between the Linux and BSD formats.
That means a minimum of 3 copyin/copyout operations per syscall.

Also, some syscalls use linux_sa_put() and linux_getsockaddr() when copying
a sockaddr to or from userspace, respectively.

To avoid this chaos, especially converting the sockaddr in userspace,
rewrite these 4 functions to convert the sockaddr only in the kernel, and keep
only 2 of these functions.

Also, to reduce duplication between the MD parts of the Linuxulator, move the
struct sockaddr conversion functions that are MI into the linux_common module.

PR: 232920
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20157
9e861e433f49c93ac3cea38dff4568d208690357 08-Jan-2019 markj <markj@FreeBSD.org> Specify the correct option level when emulating SO_PEERCRED.

Our equivalent to SO_PEERCRED, LOCAL_PEERCRED, is implemented at
socket option level 0, not SOL_SOCKET.

PR: 234722
Submitted by: Dániel Bakai <bakaidl@gmail.com>
MFC after: 2 weeks
679845ea2018bae1ea99ead6d34fbdec75863ee3 20-Nov-2018 tijl <tijl@FreeBSD.org> Fix another user address dereference in linux_sendmsg syscall.

This was hidden behind the LINUX_CMSG_NXTHDR macro which dereferences its
second argument. Stop using the macro as well as LINUX_CMSG_FIRSTHDR. Use
the size field of the kernel copy of the control message header to obtain
the next control message.

PR: 217901
MFC after: 2 days
X-MFC-With: r340631
823217168b94e653cfc59ce6051b436c359e2344 19-Nov-2018 tijl <tijl@FreeBSD.org> Do proper copyin of control message data in the Linux sendmsg syscall.

Instead of calling m_append with a user address, allocate an mbuf cluster
and copy data into it using copyin. For the SCM_CREDS case, instead of
zeroing a stack variable and appending that to the mbuf, zero part of the
mbuf cluster directly. One mbuf cluster is also the size limit used by
the FreeBSD sendmsg syscall (uipc_syscalls.c:sockargs()).

PR: 217901
Reviewed by: kib
MFC after: 3 days
a7587890604d97703f514d1afb4cd660fc4e4192 06-Nov-2018 brooks <brooks@FreeBSD.org> Use declared types for caddr_t arguments.

Leave ptrace(2) alone for the moment as it's defined to take a caddr_t.

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17852
7a979485ab0753cfd7296338fb6a13f9360df5f8 07-Aug-2018 markj <markj@FreeBSD.org> Improve handling of control message truncation.

If a recvmsg(2) or recvmmsg(2) caller doesn't provide sufficient space
for all control messages, the kernel sets MSG_CTRUNC in the message
flags to indicate truncation of the control messages. In the case
of SCM_RIGHTS messages, however, we were failing to dispose of the
rights that had already been externalized into the recipient's file
descriptor table. Add a new function and mbuf type to handle this
cleanup task, and use it any time we fail to copy control messages
out to the recipient. To simplify cleanup, control message truncation
is now only performed at control message boundaries.

The change also fixes a few related bugs:
- Rights could be leaked to the recipient process if an error occurred
while copying out a message's contents.
- We failed to set MSG_CTRUNC if the truncation occurred on a control
message boundary, e.g., if the caller received two control messages
and provided only the exact amount of buffer space needed for the
first.

PR: 131876
Reviewed by: ed (previous version)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16561
b3776cb8de3ed109d5b9625fc934a7face4f69a7 30-Jul-2018 asomers <asomers@FreeBSD.org> Make timespecadd(3) and friends public

The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.

Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.

Bump _FreeBSD_version due to the breaking KPI change.

Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
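
For readers unfamiliar with the three-argument form, the following sketch shows the calling convention (two inputs, one result) as a plain function. The name demo_timespecadd is hypothetical and is used to avoid clashing with any timespecadd macro the host headers may already provide. It assumes both inputs are normalized, that is, 0 <= tv_nsec < 1000000000.

```c
#include <time.h>

/*
 * Hypothetical helper mirroring the three-argument convention:
 * *res = *a + *b, with the nanosecond field kept below one second.
 */
static void
demo_timespecadd(const struct timespec *a, const struct timespec *b,
    struct timespec *res)
{
	res->tv_sec = a->tv_sec + b->tv_sec;
	res->tv_nsec = a->tv_nsec + b->tv_nsec;
	if (res->tv_nsec >= 1000000000L) {
		/* Carry one second out of the nanosecond field. */
		res->tv_sec++;
		res->tv_nsec -= 1000000000L;
	}
}
```

With a separate result argument, callers that used the old two-argument form to accumulate in place would pass the same pointer as both the first input and the result.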
a0bd5d3d7ffae2d09d6ae3cb12bed3ca80e88928 09-May-2018 mmacy <mmacy@FreeBSD.org> Eliminate the overhead of gratuitous repeated reinitialization of cap_rights

- Add macros to allow preinitialization of cap_rights_t.

- Convert most commonly used code paths to use preinitialized cap_rights_t.
A 3.6% speedup in fstat was measured with this change.

Reported by: mjg
Reviewed by: oshogbo
Approved by: sbruno
MFC after: 1 month
f4add23ff529590a71dce4bd02f95aaad02173b7 22-Feb-2018 emaste <emaste@FreeBSD.org> Correct proper nouns in the Linuxulator

- Capitalize Linux
- Spell FreeBSD out in full
- Address some style(9) on changed lines

Sponsored by: Turing Robotic Industries Inc.
624a2a708bfd9969ddc650db6555b1b3afd1508a 16-Feb-2018 emaste <emaste@FreeBSD.org> Rationalize license text on Linuxolator files

Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text. To avoid license proliferation switch to
the standard 2-clause FreeBSD license for those files where I have
permission from each of the listed copyright holders. Additional files
waiting on permission from others are listed in review D14210.

Approved by: kan, marcel, sos, rdivacky
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
5c9ea56c9897f1f23d450e6436fa36167953c3ee 05-Feb-2018 emaste <emaste@FreeBSD.org> Linuxolator whitespace cleanup

A version of each of the MD files by necessity exists for each CPU
architecture supported by the Linuxolator. Clean these up so that new
architectures do not inherit whitespace issues.

Clean up shared Linuxolator files while here.

Sponsored by: Turing Robotic Industries Inc.
2a03579eb7145d8a82cc4bedb4307626f7c1b0ba 27-Nov-2017 pfg <pfg@FreeBSD.org> sys/compat: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
ed1e1b1d20f85a3c5df3bc1ba7662f8d3e619e07 18-Mar-2017 dchagin <dchagin@FreeBSD.org> As noted by Roel Bouwman, Linux allows a larger buffer size than the
struct ucred size. Fix this.

PR: 102956
Reported by: Roel Bouwman <roel at qsp nl>
MFC after: 1 week
98c683ad847f6e6e39fae407ccae1545104f3370 18-Mar-2017 dchagin <dchagin@FreeBSD.org> Remove superfluous break statement.

MFC after: 1 week
554adae91ed8c7d85045eaca2c741966bde44e48 18-Feb-2017 dchagin <dchagin@FreeBSD.org> Style(9), some XXX comments fix. No functional changes.

MFC after: 1 week
665056f70d80d2d51b3f17fbde3bb414c79d1dd6 18-Feb-2017 dchagin <dchagin@FreeBSD.org> Initialize cap_rights before use.

MFC after: 1 week
bb2f96be4628ab41999f53e7dcc37fc016196386 18-Feb-2017 dchagin <dchagin@FreeBSD.org> Finish r313684.

Convert linux_recv(), linux_send() and linux_accept() system call arguments
to the register_t type too.

PR: 217161
MFC after: 3 days
X-MFC-With: r313284, r313285, r313684
37b3e9269d80471e7166706bc034099030387d36 12-Feb-2017 dchagin <dchagin@FreeBSD.org> Fix r313284.

Members of the syscall argument structures are padded to a word size. So,
for COMPAT_LINUX32 we should convert the user-supplied system call arguments,
which are 32-bit in that case, to an array of register_t.

Reported by: Oleg V. Nauman
MFC after: 1 week
a5463064988ebab24f5bd2837bb9cb09e4fc97f7 30-Jan-2017 trasz <trasz@FreeBSD.org> Add kern_listen(), kern_shutdown(), and kern_socket(), and use them
instead of their sys_*() counterparts in various compats. The svr4
is left untouched, because there's no point.

Reviewed by: ed@, kib@
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9367
7827ebf36be1800f4828ca0bd7326fc58bd1261a 06-Jan-2017 glebius <glebius@FreeBSD.org> Use getsock_cap() instead of fgetsock().

Reviewed by: dchagin
41574d71106b68fd70cfa13d6e2fae3b82511b89 22-Sep-2016 oshogbo <oshogbo@FreeBSD.org> capsicum: propagate rights on accept(2)

The descriptor returned by accept(2) should inherit capability rights from
the listening socket.

PR: 201052
Reviewed by: emaste, jonathan
Discussed with: many
Differential Revision: https://reviews.freebsd.org/D7724
c04eec0cf0d22693afb178c43250f77699c18964 29-Jun-2016 dchagin <dchagin@FreeBSD.org> MFC r302213:

Fix a bug introduced in r283433.

[1] Remove unneeded sockaddr conversion before kern_recvit() call as the from
argument is used to record result (the source address of the received message) only.

[2] In Linux the type of the msg_namelen member of struct msghdr is signed, but the native
msg_namelen has an unsigned type (socklen_t). So use the proper storage to fetch fromlen
from userspace, then check the user-supplied value and return EINVAL if it is less
than 0, as Linux does.

Reported by: Thomas Mueller <tmueller at sysgo dot com> [1]
Tested by: Thomas Mueller <tmueller at sysgo dot com> [both]
Reviewed by: kib@
c0a24efce9d60b733c0f507e897dfa3e089c59cc 26-Jun-2016 dchagin <dchagin@FreeBSD.org> Fix a bug introduced in r283433.

[1] Remove unneeded sockaddr conversion before kern_recvit() call as the from
argument is used to record result (the source address of the received message) only.

[2] In Linux the type of the msg_namelen member of struct msghdr is signed, but the native
msg_namelen has an unsigned type (socklen_t). So use the proper storage to fetch fromlen
from userspace, then check the user-supplied value and return EINVAL if it is less
than 0, as Linux does.

Reported by: Thomas Mueller <tmueller at sysgo dot com> [1]
Reviewed by: kib@
Approved by: re (gjb, kib)
MFC after: 3 days
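
Point [2] can be sketched as follows: the length is held in a signed 32-bit variable so that a negative value from userspace is still visible as negative, and only then converted to the unsigned native type. The helper name demo_check_namelen is hypothetical; uint32_t stands in for socklen_t so the sketch does not depend on the host's definition.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a Linux-style signed name length and
 * convert it to an unsigned native length.  Had the value been fetched
 * straight into an unsigned variable, -1 would have become 4294967295
 * and the "less than 0" check could never fire.
 */
static int
demo_check_namelen(int32_t linux_namelen, uint32_t *native_namelen)
{
	if (linux_namelen < 0)
		return (EINVAL);
	*native_namelen = (uint32_t)linux_namelen;
	return (0);
}
```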
32dd9edc0555ce09f7ac2a113175c6a42e44348a 05-Jun-2016 dchagin <dchagin@FreeBSD.org> MFC r300431:

Convert proto family in both directions. The linux and native values for
local and inet are identical, but for inet6 values differ.

PR: 155040
Reported by: Simon Walton
e8e944c226a0adc2138f6e72bce80f5e6e162ac9 05-Jun-2016 dchagin <dchagin@FreeBSD.org> MFC r300416:

Add a missing errno translation for SO_ERROR optname.

PR: 135458
Reported by: Stefan Schmidt
00d578928eca75be320b36d37543a7e2a4f9fbdb 27-May-2016 grehan <grehan@FreeBSD.org> Create branch for bhyve graphics import.
aef1c65d598e571ebaebf265ca422509e3297788 22-May-2016 dchagin <dchagin@FreeBSD.org> Convert proto family in both directions. The linux and native values for
local and inet are identical, but for inet6 values differ.

PR: 155040
Reported by: Simon Walton
MFC after: 2 weeks
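
A sketch of a two-way family conversion, assuming the commonly documented values: 1 for local/unix and 2 for inet on both systems, 10 for inet6 on Linux and 28 on FreeBSD. The DEMO_ constants and function names are hypothetical stand-ins, not the functions in the FreeBSD tree, and only these three families are handled.

```c
/* Values assumed for illustration; see the lead-in. */
#define DEMO_AF_LOCAL		1	/* same on Linux and FreeBSD */
#define DEMO_AF_INET		2	/* same on Linux and FreeBSD */
#define DEMO_LINUX_AF_INET6	10
#define DEMO_BSD_AF_INET6	28

/* Linux family to native family; -1 if the sketch does not handle it. */
static int
demo_linux_to_bsd_family(int family)
{
	switch (family) {
	case DEMO_AF_LOCAL:
	case DEMO_AF_INET:
		return (family);
	case DEMO_LINUX_AF_INET6:
		return (DEMO_BSD_AF_INET6);
	default:
		return (-1);
	}
}

/* Native family to Linux family; -1 if the sketch does not handle it. */
static int
demo_bsd_to_linux_family(int family)
{
	switch (family) {
	case DEMO_AF_LOCAL:
	case DEMO_AF_INET:
		return (family);
	case DEMO_BSD_AF_INET6:
		return (DEMO_LINUX_AF_INET6);
	default:
		return (-1);
	}
}
```

Note that 10 must not pass through unchanged in the native-to-Linux direction, which is why each direction gets its own switch rather than a shared identity default.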
97c779b69f971ffb4a3467e8b91db614297bc832 22-May-2016 dchagin <dchagin@FreeBSD.org> Add a missing errno translation for SO_ERROR optname.

PR: 135458
Reported by: Stefan Schmidt @ stadtbuch.de
MFC after: 1 week
a7d40a88c91d105dcfe2f235bc84a522bfea3de2 19-Apr-2016 pfg <pfg@FreeBSD.org> kernel: use our nitems() macro when it is available through param.h.

No functional change; only trivial cases are done in this sweep.

Discussed in: freebsd-current
238b2f4322eb16bdc6bf9b3a137401599a20cd02 03-Apr-2016 dchagin <dchagin@FreeBSD.org> MFC r297309:

Whitespaces and style(9) fix. No functional changes.
a5f7ea1073b23d713cd4d9aa792ec74bec75a3f7 27-Mar-2016 dchagin <dchagin@FreeBSD.org> Revert r297310 as the SOL_XXX are equal to the IPPROTO_XX except SOL_SOCKET.

Pointed out by: ae@
22c1ebea2165a7ab10a96ecfa28eeadff62b87e2 27-Mar-2016 dchagin <dchagin@FreeBSD.org> Convert Linux SOL_IPV6 level.

MFC after: 1 week
5a426e14e9e100d0f6338b574bc8554be94b056d 27-Mar-2016 dchagin <dchagin@FreeBSD.org> Whitespaces and style(9) fix. No functional changes.

MFC after: 1 week
b9ebebfb6dc398f8c372779ec6ade880c8b32ed4 27-Mar-2016 dchagin <dchagin@FreeBSD.org> Revert r297303.
ef3387785250f1e6d21eeb0bd4cdfd134077e40b 27-Mar-2016 dchagin <dchagin@FreeBSD.org> MFC r296503, r296504:

Linux accept() system call returns EOPNOTSUPP errno instead of EINVAL for UDP sockets.
5fc2be082f352cc5c667f8a194ed776916cc1472 23-Mar-2016 ae <ae@FreeBSD.org> MFC r296557:
Add support for IPPROTO_IPV6 socket layer for getsockopt/setsockopt calls.
Also add mapping for several options from RFC 3493 and 3542.
07110a8ca79a6e3690814e52a43796c290d84df6 09-Mar-2016 ae <ae@FreeBSD.org> Add support for IPPROTO_IPV6 socket layer for getsockopt/setsockopt calls.
Also add mapping for several options from RFC 3493 and 3542.

Reviewed by: dchagin
Tested by: Joe Love <joe at getsomwhere dot net>
MFC after: 2 weeks
82003e4255c1c5364d130098d840d1c97cf76aaa 08-Mar-2016 dchagin <dchagin@FreeBSD.org> Do not leak fp. While here, remove bogus cast of fp->f_data.

MFC after: 1 week
51e9cd7c4195705eeeb570effd498dd4a9d7539c 08-Mar-2016 dchagin <dchagin@FreeBSD.org> Linux accept() system call returns EOPNOTSUPP errno instead of EINVAL
for UDP sockets.

MFC after: 1 week
fb5d84effe4e90aaedb5e202e2a44ece749c29b3 21-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC 294233:

Prevent double free of control in common sendmsg path as sosend
already frees it.
43c7490c64e0bb3d034c8d88d519f567e9d0fb4c 17-Jan-2016 dchagin <dchagin@FreeBSD.org> Prevent double free of control in common sendmsg path as sosend
already frees it.
3fca4fe8523b16d9f47fd7eeb14581f9ce96d00e 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r284166 (by jkim):

Properly initialize flags for accept4(2) not to return spurious EINVAL.
Note this fixes a Linuxulator regression introduced in r283490.

PR: 200662
7cbbe6a948f4c6b49f665981024645ae2d8de465 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283497:

Convert SCM_TIMESTAMP in recvmsg().
1e8561a8e3e93cb5e99ec1b6531703316e8b2dd9 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283494:

Fix an mbuf(9) leak in sendmsg() under failure condition and
remove unneeded check for failed M_WAITOK allocation.
b5efd2488f0723cef24e597a690286e3b4c9ec21 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283490:

Since FreeBSD supports SOCK_CLOEXEC & SOCK_NONBLOCK options
remove its emulation via fcntl call from Linuxulator.
bbbcfd1903a96b7d229027efb7e55d76a0601cab 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283488:

Implement recvmmsg() and sendmmsg() system calls.
220551826587846e865838bbffb78ba84fcc0a30 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283437:

To avoid code duplication move open/fcntl definitions to the MI
header file.
c0b5073f8da1bd5f4d69c11f33fb728424c7a053 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283433:

Rewrite linux_recvfrom. To avoid double conversion of sockaddr use
kern_recvit() directly.
And check fromlen parameter before sockaddr copyin and conversion.
9bb36bc01cd74565fb879d43a4431edff14a2741 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283427:

Where possible we will use M_LINUX malloc(9) type.
Move M_FUTEX defines to the linux_common.ko.
0a1120ef093d9730dc72cdda6f4ff89d8c577d0f 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283415:

Disable i386 call for x86-64 Linux.
62726e37ba923515e74d7e361fdf5f17a26872b4 09-Jan-2016 dchagin <dchagin@FreeBSD.org> MFC r283413:

64-bit platforms, like x86_64, do not use multiplexing on
socketcall system calls.
4fefd3d137d34d9b42f8e414efff28e4a015dfd4 08-Jun-2015 jkim <jkim@FreeBSD.org> Properly initialize flags for accept4(2) not to return spurious EINVAL.
Note this fixes a Linuxulator regression introduced in r283490.

PR: 200662
a71dbd5ba1a64532e3e7987a1c96536e6e7ae15d 24-May-2015 dchagin <dchagin@FreeBSD.org> Convert SCM_TIMESTAMP in recvmsg().
706fa727d79be3be62a854dcaa7030b647104808 24-May-2015 dchagin <dchagin@FreeBSD.org> Fix an mbuf(9) leak in sendmsg() under failure condition and
remove unneeded check for failed M_WAITOK allocation.

Found by: Brainy Code Scanner
Reported by: Maxime Villard
796d34d09cdb59afca2d3e36ab48eb6eb39bf7aa 24-May-2015 dchagin <dchagin@FreeBSD.org> Since FreeBSD supports SOCK_CLOEXEC & SOCK_NONBLOCK options
remove its emulation via fcntl call from Linuxulator.
ea835975de15ee9cce8a79e4f9b8aa86cb963d05 24-May-2015 dchagin <dchagin@FreeBSD.org> Implement recvmmsg() and sendmmsg() system calls.
db8a000521e6e8da552691bb097b52a61cc1057d 24-May-2015 dchagin <dchagin@FreeBSD.org> To avoid code duplication move open/fcntl definitions to the MI
header file.

Differential Revision: https://reviews.freebsd.org/D1087
Reviewed by: trasz
c6387d07c9859c7a50097ab80479e89670b48414 24-May-2015 dchagin <dchagin@FreeBSD.org> Rewrite linux_recvfrom. To avoid double conversion of sockaddr use
kern_recvit() directly.
And check fromlen parameter before sockaddr copyin and conversion.

Differential Revision: https://reviews.freebsd.org/D1082
f5eca4c95705929d25f9548a80b5041730c2186f 24-May-2015 dchagin <dchagin@FreeBSD.org> Where possible we will use M_LINUX malloc(9) type.
Move M_FUTEX defines to the linux_common.ko.

Differential Revision: https://reviews.freebsd.org/D1077
Reviewed by: emaste
a92a30f54da0f10ef3807a9ce5639951610c03da 24-May-2015 dchagin <dchagin@FreeBSD.org> Disable i386 call for x86-64 Linux.

Differential Revision: https://reviews.freebsd.org/D1067
Reviewed by: trasz
c0ca16d4f0773f5ddf0a8a12b29a2ba05bdba943 24-May-2015 dchagin <dchagin@FreeBSD.org> 64-bit platforms, like x86_64, do not use multiplexing on
socketcall system calls.

Differential Revision: https://reviews.freebsd.org/D1065
Reviewed by: trasz
6102a34d3875e6b3f22e0245d7698fe549c674a1 19-Mar-2015 rwatson <rwatson@FreeBSD.org> Merge r263233 from HEAD to stable/10:

Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.

Sponsored by: Google, Inc.
3e6f1de34088867465b6a1a87fb10c25fc811e50 08-Jan-2015 dchagin <dchagin@FreeBSD.org> MFC r276512:
Fix Clang -Wpointer-sign warnings.
236e47c874f5ddbdf22a6c2cdb9292c2f081f3dd 01-Jan-2015 dchagin <dchagin@FreeBSD.org> Fix Clang -Wpointer-sign warnings.

MFC after: 1 week
b4ef709604332a259f2a08f546cceec6ab3ecace 13-Nov-2014 kib <kib@FreeBSD.org> Remove the no-at variants of the kern_xx() syscall helpers. E.g., we
have both kern_open() and kern_openat(); change the callers to use
kern_openat().

This removes one (sometimes two) levels of indirection and
consolidates arguments checks.

Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
33fdc14c0cd663baae9fad419e3f9cfe12578196 16-Mar-2014 rwatson <rwatson@FreeBSD.org> Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.

MFC after: 3 weeks
eb1a5f8de9f7ea602c373a710f531abbf81141c4 21-Feb-2014 gjb <gjb@FreeBSD.org> Move ^/user/gjb/hacking/release-embedded up one directory, and remove
^/user/gjb/hacking since this is likely to be merged to head/ soon.

Sponsored by: The FreeBSD Foundation
6b01bbf146ab195243a8e7d43bb11f8835c76af8 27-Dec-2013 gjb <gjb@FreeBSD.org> Copy head@r259933 -> user/gjb/hacking/release-embedded for initial
inclusion of (at least) arm builds with the release.

Sponsored by: The FreeBSD Foundation
2c1ec831c915278de463c18f392c618182c48c35 26-Oct-2013 glebius <glebius@FreeBSD.org> Provide includes that are needed in these files, and before were read
in implicitly via if.h -> if_var.h pollution.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.
029a6f5d92dc57925b5f155d94d6e01fdab7a45d 05-Sep-2013 pjd <pjd@FreeBSD.org> Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

struct cap_rights {
	uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine a few rights, but the rights have to belong
to the same array element, eg:

#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)

#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);

bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

cap_rights_t rights;

cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belong to the same array element this way is
correct, but is not advised. It should only be used for alias definitions.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by: The FreeBSD Foundation
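
The index encoding described in this entry can be sketched in user space. The sketch assumes 64-bit words with the two size bits at positions 62 and 63 and the five one-hot index bits directly below them, at positions 57 to 61; that placement follows from the description but is stated here as an assumption. All DEMO_ names and demo_right_index are hypothetical, and the right values are copied from the examples in the commit message.

```c
#include <stdint.h>

/* One-hot array index in bits 57..61 (placement assumed, see lead-in). */
#define DEMO_CAPRIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define DEMO_IDX_MASK		(0x1fULL << 57)

#define DEMO_CAP_LOOKUP	DEMO_CAPRIGHT(0, 0x0000000000000400ULL)
#define DEMO_CAP_FCHMOD	DEMO_CAPRIGHT(0, 0x0000000000002000ULL)
#define DEMO_CAP_PDKILL	DEMO_CAPRIGHT(1, 0x0000000000000800ULL)

/*
 * Return the array index encoded in a right, or -1 when no index bit or
 * more than one index bit is set.  OR-ing rights from different array
 * elements sets two index bits, which is how the mix-up is detected.
 */
static int
demo_right_index(uint64_t right)
{
	uint64_t idx = (right & DEMO_IDX_MASK) >> 57;
	int i;

	if (idx == 0 || (idx & (idx - 1)) != 0)
		return (-1);
	for (i = 0; (idx & 1) == 0; i++)
		idx >>= 1;
	return (i);
}
```

Here an alias such as DEMO_CAP_FCHMOD | DEMO_CAP_LOOKUP still decodes to index 0, while DEMO_CAP_LOOKUP | DEMO_CAP_PDKILL is rejected.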
a0bd41720afb55b501ddff2c3bbd7f50a7c0ba1a 04-Mar-2013 eadler <eadler@FreeBSD.org> Remove check for NULL prior to free(9) and m_freem(9).

Approved by: cperciva (mentor)
af17a55dfd7a008dea74152e32f5d6c803b46bdd 23-Jan-2013 jhb <jhb@FreeBSD.org> Don't assume that all Linux TCP-level socket options are identical to
FreeBSD TCP-level socket options (only the first two are). Instead,
use a mapping function and fail unsupported options as we do for other
socket option levels.

MFC after: 2 weeks
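
A minimal sketch of the mapping-function approach described in the message
above, not the committed code. The SKETCH_* constants and
sketch_linux_to_bsd_tcp_sockopt() are hypothetical names introduced for
illustration; only the first two options are treated as identical, as the
message states, and everything else is rejected:

```c
/* Hypothetical stand-ins for the Linux and native option numbers. */
#define SKETCH_LINUX_TCP_NODELAY 1
#define SKETCH_LINUX_TCP_MAXSEG  2
#define SKETCH_BSD_TCP_NODELAY   1
#define SKETCH_BSD_TCP_MAXSEG    2

/*
 * Translate a Linux TCP-level option to its native counterpart.
 * Returns -1 for anything not explicitly listed, so the caller can
 * fail the request instead of passing an unknown value through.
 */
static int
sketch_linux_to_bsd_tcp_sockopt(int opt)
{
	switch (opt) {
	case SKETCH_LINUX_TCP_NODELAY:
		return (SKETCH_BSD_TCP_NODELAY);
	case SKETCH_LINUX_TCP_MAXSEG:
		return (SKETCH_BSD_TCP_MAXSEG);
	default:
		return (-1);
	}
}
```

A caller would presumably turn the -1 into an error such as ENOPROTOOPT
before reaching the native setsockopt path; that choice of errno is an
assumption here.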
8e20fa5ae93243e19700ca06c01524b90fe3b784 05-Dec-2012 glebius <glebius@FreeBSD.org> Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
d61d88a310caef24d3eb268f44db0c3ef87c2ece 15-Jan-2012 uqs <uqs@FreeBSD.org> Convert files to UTF-8
f3e29d548396db158b981d33b8e7affeac709ae3 06-Oct-2011 jkim <jkim@FreeBSD.org> Use the calculated length instead of maximum length.
5bf71ef7d883aec32fe869b288317533e4324d3e 06-Oct-2011 jkim <jkim@FreeBSD.org> Remove a now-defunct variable.
f18be120377c0b30a1f1f5f755ea6282b2fa35de 06-Oct-2011 jkim <jkim@FreeBSD.org> Use uint32_t instead of u_int32_t. Fix style(9) nits.
40f14199fbe765dba43b1948da218393e38f8e5f 06-Oct-2011 jkim <jkim@FreeBSD.org> Make sure to ignore the leading NUL byte from Linux abstract namespace.
bd4e9fe2ca235f77b480cd2357ca375a084f9a8b 06-Oct-2011 jkim <jkim@FreeBSD.org> Restore the original socket address length if it was not really AF_INET6.
55a4bbebe3049907c91a6dbf26531e15e11da72a 06-Oct-2011 jkim <jkim@FreeBSD.org> Return a more appropriate errno when the Linux path name is too long.
cf76cae97886d8148e79ce548e4f5a682948dd9e 06-Oct-2011 jkim <jkim@FreeBSD.org> Inline do_sa_get() function and remove an unused return value.
9a911728e76f65555fab4e616679856f71205f36 06-Oct-2011 jkim <jkim@FreeBSD.org> Unroll inlined strnlen(9) and make it easier to read. No functional change.
1485b2ad01c1e0d5adfdbcc6d06114ecaa15a950 04-Oct-2011 cperciva <cperciva@FreeBSD.org> Fix a bug in UNIX socket handling in the linux emulator which was
exposed by the security fix in FreeBSD-SA-11:05.unix.

Approved by: so (cperciva)
Approved by: re (kib)
Security: Related to FreeBSD-SA-11:05.unix, but not actually
a security fix.
99851f359e6f006b3223bb37dbc49e751ca8c13a 16-Sep-2011 kmacy <kmacy@FreeBSD.org> In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by: rwatson
Approved by: re (bz)
4af919b491560ff051b65cdf1ec730bdeb820b2e 11-Aug-2011 rwatson <rwatson@FreeBSD.org> Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc
94ec7d298848badbf8397b4223feb987bc9a8d38 31-Mar-2011 avg <avg@FreeBSD.org> Revert r220032:linux compat: add SO_PASSCRED option with basic handling

I have not properly thought through the commit. After r220031 (linux
compat: improve and fix sendmsg/recvmsg compatibility) the basic
handling for SO_PASSCRED is not sufficient as it breaks recvmsg
functionality for SCM_CREDS messages because now we would need to handle
sockcred data in addition to cmsgcred. And that is not implemented yet.

Pointyhat to: avg
df7a39b1d0adfdfc6402d5739a09ff973e96544a 26-Mar-2011 avg <avg@FreeBSD.org> linux compat: add SO_PASSCRED option with basic handling

This seems to have been a part of a bigger patch by dchagin that either
has not been committed or was committed only partially.

Submitted by: dchagin, nox
MFC after: 2 weeks
a92365858357b636caacb1f576fcb0c622aebbec 26-Mar-2011 avg <avg@FreeBSD.org> linux compat: improve and fix sendmsg/recvmsg compatibility

- implement basic stubs for capget, capset, prctl PR_GET_KEEPCAPS
and prctl PR_SET_KEEPCAPS.
- add SCM_CREDS support to sendmsg and recvmsg
- modify sendmsg to ignore control messages if not using UNIX
domain sockets

This should allow the linux pulse audio daemon and client to work on FreeBSD
and interoperate with native counter-parts modulo the differences in
pulseaudio versions.

PR: kern/149168
Submitted by: John Wehle <john@feith.com>
Reviewed by: netchild
MFC after: 2 weeks
09f9c897d33c41618ada06fbbcf1a9b3812dee53 19-Oct-2010 jamie <jamie@FreeBSD.org> A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.
f1216d1f0ade038907195fc114b7e630623b402c 19-Mar-2010 delphij <delphij@FreeBSD.org> Create a custom branch where I will be able to do the merge.
8abc79c5071df3bedadab6fb81f17e9285be8633 23-Feb-2010 delphij <delphij@FreeBSD.org> MFC r203728:

- Return EAFNOSUPPORT instead of EINVAL for unsupported address family,
this matches the Linux behavior.
- Check if we have sufficient space allocated for socket structure, which
fixes a buffer overflow when wrong length is being passed into the
emulation layer. [1]

PR: kern/138860
Submitted by: Mateusz Guzik <mjguzik gmail com>
Reported by: Alexander Best [1]
54ac71ad0f6c9320cedd699c4dce60be1177e418 09-Feb-2010 delphij <delphij@FreeBSD.org> - Return EAFNOSUPPORT instead of EINVAL for unsupported address family,
this matches the Linux behavior.
- Check if we have sufficient space allocated for socket structure, which
fixes a buffer overflow when wrong length is being passed into the
emulation layer. [1]

PR: kern/138860
Submitted by: Mateusz Guzik <mjguzik gmail com>
Reported by: Alexander Best [1]
MFC after: 2 weeks
fbd7093613c0b2f60e6612613fd0b2c99533fcf6 06-Dec-2009 bz <bz@FreeBSD.org> MFC r198467:

Unconditionally call the setsockopt for IPV6_V6ONLY for v6 linux sockets
no matter whether we are compiled as module or if our default of the
net.inet6.ip6.v6only sysctl already matches what we would set.

This avoids unnecessary complications with modules, VIMAGES, INET6 and
the sysctl value, especially considering that most users will use
linux compat as a module.

Discussed with: kib, rwatson (weeks ago)
Reviewed by: rwatson
b991a4ad124ba6d7146b64c3f5c06a9a799ceeb7 25-Oct-2009 bz <bz@FreeBSD.org> Unconditionally call the setsockopt for IPV6_V6ONLY for v6 linux sockets
no matter whether we are compiled as module or if our default of the
net.inet6.ip6.v6only sysctl already matches what we would set.

This avoids unnecessary complications with modules, VIMAGES, INET6 and
the sysctl value, especially considering that most users will use
linux compat as a module.

Discussed with: kib, rwatson (weeks ago)
Reviewed by: rwatson
MFC after: 6 weeks
fb9ffed6504601ed9da2c6b9a620b133c838964c 01-Aug-2009 rwatson <rwatson@FreeBSD.org> Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h; we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)
57ca4583e728cab422fba8f15de10bd0b637b3dd 14-Jul-2009 rwatson <rwatson@FreeBSD.org> Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing them in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)
8e9d8c289c626522729c4d78e2db40ec72ff1237 01-Jun-2009 dchagin <dchagin@FreeBSD.org> Add the flags argument forgotten in the previous commit.

Approved by: kib (mentor)
MFC after: 1 month
bb8f1f3e67af8643881e71a4d80d6580ba2aa43a 01-Jun-2009 dchagin <dchagin@FreeBSD.org> Implement accept4 syscall.

Approved by: kib (mentor)
MFC after: 1 month
76d24c5be3b678fc63b8b3b0cd9fad664c2303ca 01-Jun-2009 dchagin <dchagin@FreeBSD.org> Implement a variation of the accept_common() which takes
a flags argument.

Do not preserve td_retval before kern_fcntl(F_SETFL) as it is not
changed.

Approved by: kib (mentor)
MFC after: 1 month
0cc88e7ca3ae43c15437a869b4f410740d56a881 01-Jun-2009 dchagin <dchagin@FreeBSD.org> Split the linux_accept() syscall into linux_accept_common(), which should
be used by the linuxulator, and linux_accept() itself.

Approved by: kib (mentor)
MFC after: 1 month
6fb0275352a53e468de58a1124fc5b35e9234143 31-May-2009 dchagin <dchagin@FreeBSD.org> Implement a variation of the socketpair() syscall which takes a flags
argument in addition to the type argument.

Approved by: kib (mentor)
MFC after: 1 month
ab797d42e47bd545852aab8db4db5d162e9a0e38 31-May-2009 dchagin <dchagin@FreeBSD.org> Move new socket flags handling into a separate function as Linux
introduced more syscalls which use these flags.

Approved by: kib (mentor)
MFC after: 1 month
fbb545b6849d6e28632c082abf30bd6b222ed6d3 31-May-2009 dchagin <dchagin@FreeBSD.org> Remove empty lines.

Approved by: kib (mentor)
MFC after: 1 month
56c9819821af0ba40bed6b1d2e8cfb2692657d9c 19-May-2009 dchagin <dchagin@FreeBSD.org> Validate user-supplied arguments values.
Args argument is a pointer to the structure located in user space in
which the socketcall arguments are packed. The structure must be
copied to the kernel instead of direct dereferencing.

Approved by: kib (mentor)
MFC after: 1 week
7316b5296a8c60515aa749726e133fec1df8d8dc 18-May-2009 dchagin <dchagin@FreeBSD.org> Implement MSG_CMSG_CLOEXEC flag for linux_recvmsg().

Approved by: kib (mentor)
MFC after: 1 month
5351e066994a3d70d5c9b6c7a9d290bd517a8966 16-May-2009 dchagin <dchagin@FreeBSD.org> Somewhere between 2.6.23 and 2.6.27, Linux added the SOCK_CLOEXEC and
SOCK_NONBLOCK flags, which make it possible to avoid extra fcntl() calls.

Implement a variation of the socket() syscall which takes a flags
argument in addition to the type argument.

Approved by: kib (mentor)
MFC after: 1 month
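
A minimal sketch of how a combined type argument could be split into a base
socket type and the two flags named above. It is an illustration under
stated assumptions, not the committed code: sketch_split_socket_type() and
the SKETCH_* constants are hypothetical names, and the numeric values are
the ones Linux uses on x86:

```c
#include <errno.h>

/* Linux-side values, assumed here for x86; illustrative only. */
#define SKETCH_LINUX_SOCK_TYPE_MASK 0xf
#define SKETCH_LINUX_SOCK_NONBLOCK  0x800	/* same bit as Linux O_NONBLOCK */
#define SKETCH_LINUX_SOCK_CLOEXEC   0x80000	/* same bit as Linux O_CLOEXEC */

/*
 * Split the Linux "type" argument into the base socket type and the
 * extra flags.  Returns EINVAL if any unknown flag bit is set.
 */
static int
sketch_split_socket_type(int arg, int *type, int *flags)
{
	int f = arg & ~SKETCH_LINUX_SOCK_TYPE_MASK;

	if (f & ~(SKETCH_LINUX_SOCK_NONBLOCK | SKETCH_LINUX_SOCK_CLOEXEC))
		return (EINVAL);
	*type = arg & SKETCH_LINUX_SOCK_TYPE_MASK;
	*flags = f;
	return (0);
}
```

The base type can then be handed to the native socket code, while the flags
are applied to the new descriptor afterwards.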
a0c026b20b499c61df6fa4153a626efb7ccb0e59 16-May-2009 dchagin <dchagin@FreeBSD.org> Return EINVAL when an incorrect or unsupported
type argument is specified.

Do not map type argument value as its Linux values are
identical to FreeBSD values.

Approved by: kib (mentor)
eae11e9cce32ec5994ed146a89e03c915d86051e 16-May-2009 dchagin <dchagin@FreeBSD.org> Use the protocol family constants for the domain argument validation.
Return immediately when socket() fails.

Approved by: kib (mentor)
MFC after: 1 month
bc4e3c1f6d89b977ecd4d226f0cd88dccd1e70a9 16-May-2009 dchagin <dchagin@FreeBSD.org> Emulate SO_PEERCRED socket option.
Temporarily use 0 for the pid member as FreeBSD does not cache the remote
UNIX domain socket peer pid.

PR: kern/102956
Reviewed by: rwatson
Approved by: kib (mentor)
MFC after: 1 month
ebcb20267286e4ffadb3f603af56bd7a1df1f422 11-May-2009 dchagin <dchagin@FreeBSD.org> Translate l_timeval arg to native struct timeval in
linux_setsockopt()/linux_getsockopt() for SO_RCVTIMEO,
SO_SNDTIMEO opts as l_timeval has MD members.

Remove bogus __packed attribute from l_timeval struct on __amd64__.

PR: kern/134276
Submitted by: Thomas Mueller <tmueller sysgo com>
Approved by: kib (mentor)
MFC after: 2 weeks
4f4faf9d43becfa5ced141ee8572bdeb9969b1cf 11-May-2009 dchagin <dchagin@FreeBSD.org> Add the forgotten Linux-to-BSD flags argument mapping to linux_recv().

PR: kern/134276
Submitted by: Thomas Mueller <tmueller sysgo com>
Approved by: kib (mentor)
MFC after: 2 weeks
3ce50871ce9326b57e479bd87070622c222b9519 07-May-2009 dchagin <dchagin@FreeBSD.org> Return EAFNOSUPPORT instead of EINVAL when an incorrect or
unsupported domain argument is specified.

Approved by: kib (mentor)
9f1df514229a703d7db011676bd5a157cb106ddc 07-May-2009 dchagin <dchagin@FreeBSD.org> Rework r191742.
Use the protocol family constants for the domain argument validation.

Return EAFNOSUPPORT when an incorrect domain argument
is specified.

Return EPROTONOSUPPORT instead of passing values that are not 0
to the BSD layer.

Suggested by: rwatson

Approved by: kib (mentor)
MFC after: 1 month
f04150bca8c18c38f8cdd8cc5e33b058eab61b37 02-May-2009 dchagin <dchagin@FreeBSD.org> The Linux socketpair() call expects an explicitly specified protocol for
the AF_LOCAL domain, unlike FreeBSD, which expects 0 in this case.

Approved by: kib (mentor)
MFC after: 1 month
8d976eab5c2cbc080800be588a8572ebec4a7795 26-Apr-2009 zec <zec@FreeBSD.org> In preparation for turning on options VIMAGE in next commits,
rearrange / replace / adjust several INIT_VNET_* initializer
macros, all of which currently resolve to whitespace.

Reviewed by: bz (an older version of the patch)
Approved by: julian (mentor)
604d89458ab94ec81eaefa2d55ef219cba461e31 02-Dec-2008 bz <bz@FreeBSD.org> Rather than using hidden includes (with circular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation
8ffb383318e85cc991be93037b0aba04d0079d92 29-Nov-2008 kib <kib@FreeBSD.org> Make linux_sendmsg() and linux_recvmsg() work on linux32/amd64.
Change types used in the linux' struct msghdr and struct cmsghdr
definitions to the properly-sized architecture-specific types.
Move ancillary data handler from linux_sendit() to linux_sendmsg().

Submitted by: dchagin
19b6af98ec71398e77874582eb84ec5310c7156f 22-Nov-2008 dfr <dfr@FreeBSD.org> Clone Kip's Xen on stable/6 tree so that I can work on improving FreeBSD/amd64
performance in Xen's HVM mode.
66f807ed8b3634dc73d9f7526c484e43f094c0ee 23-Oct-2008 des <des@FreeBSD.org> Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months
cf5320822f93810742e3d4a1ac8202db8482e633 19-Oct-2008 lulf <lulf@FreeBSD.org> - Import the HEAD csup code which is the basis for the cvsmode work.
8797d4caecd5881e312923ee1d07be3de68755dc 02-Oct-2008 zec <zec@FreeBSD.org> Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparison between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
9d63564c1b8efb5c3942a46abbbffb2c270003cb 17-Sep-2008 kib <kib@FreeBSD.org> MFC r182890:
Remove superfluous copyin() of args, structures are already in kernel space.

Approved by: re (kensmith)
626be4984b8725de1d5bfcd3ca1670bff15bf636 09-Sep-2008 kib <kib@FreeBSD.org> Remove superfluous copyin() of args, structures are already in kernel space.

Submitted by: dchagin
MFC after: 1 week
1021d43b569bfc8d2c5544bde2f540fa432b011f 17-Aug-2008 bz <bz@FreeBSD.org> Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch
68fcd79292e6e5bf6d581945fed927092d213709 03-Sep-2007 netchild <netchild@FreeBSD.org> MFC:
- sync linuxulator:
* de-COMPAT_43-ify:
+ socket related ioctl's
This differs from -current, as the kernel ABI is different
(kern_bind() and kern_connect() free the struct sockaddr on -stable
themselves, so two calls to free() are not included in this MFC).
* bug-/compatibility-fixes
* ioctl TIOCGPTN
* 1 style(9)-fix

Tested by: "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
23574c86734ab5cb088584d30345e698cbbeaef2 06-Aug-2007 rwatson <rwatson@FreeBSD.org> Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet. As that
has now been removed, they are no longer required. Removing them
significantly simplifies error-handling in the socket layer, eliminating
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option. Clean up some related gotos for
consistency.

Reviewed by: bz, csjp
Tested by: kris
Approved by: re (kensmith)
962e1e68368724b796f21f5f4a62f9c67448f74c 08-Jul-2007 netchild <netchild@FreeBSD.org> MFC (4 of X):
- don't limit number of syscalls to 255
- handle more socket options
- bug-/compatibility-fixes to linux
* file related (includes fixes which prevent creation of strange files
which can only be removed with a fsck)
* make ping work
* ...
- add devfs to the file system type handling/translation

Compile tested by: scf (i386, as part of a mega-MFC-patch)
Tested by: Arno J. Klaassen <arno@heho.snv.jussieu.fr> (amd64)
34d21d2a2cf7c2511ae8464c114e645196bbbad5 08-Jul-2007 netchild <netchild@FreeBSD.org> MFC (1 of X):
- easy linuxulator style(9) fixes (easy = hand removal of non-style code
change sections in a full diff)

Tested by: scf (i386), Arno J. Klaassen <arno@heho.snv.jussieu.fr> (amd64)
c4b43c46c906a35a302c13d4b22acc1f93e75de1 14-Apr-2007 rwatson <rwatson@FreeBSD.org> Some Linux applications (ping) pass a non-NULL msg_control argument to
sendmsg() while using a 0-length msg_controllen. This isn't allowed in
the FreeBSD system call ABI, so detect this case and set msg_control to
NULL. This allows Linux ping to work.

Submitted by: rdivacky
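
The workaround described in the message above can be sketched as follows.
This is an illustration only: struct sketch_msghdr is a reduced stand-in
that borrows two field names from struct msghdr, and sketch_fixup_control()
is a hypothetical helper, not a function from the tree:

```c
#include <stddef.h>

/* Reduced stand-in for a message header; field names follow msghdr. */
struct sketch_msghdr {
	void	*msg_control;
	size_t	 msg_controllen;
};

/*
 * A non-NULL control pointer with a zero length is not accepted by
 * the native ABI, so drop the pointer before passing the header on.
 */
static void
sketch_fixup_control(struct sketch_msghdr *msg)
{
	if (msg->msg_control != NULL && msg->msg_controllen == 0)
		msg->msg_control = NULL;
}
```

A header that carries real control data is left untouched, so only the
degenerate case is rewritten.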
8f812418c1ae9367d5f65cdf384261133103866a 01-Feb-2007 kib <kib@FreeBSD.org> Introduce some more SO_ option equivalents from Linux to FreeBSD.

The msg variable in linux_recvmsg() was not initialized.
Copy it from userspace.

Submitted by: rdivacky
6ecb474f4fee36803a21cadcee5dc0f6b0e6b4d2 23-Sep-2006 netchild <netchild@FreeBSD.org> MFp4:
- Linux returns ENOPROTOOPT for an unsupported option passed to setsockopt.
- Return EISDIR in pread() when arg is a directory.
- Return EINVAL instead of EFAULT when namelen is not correct in accept().
- Return EINVAL instead of EACCES if an invalid access mode is passed to
access().
- Return EINVAL instead of EADDRNOTAVAIL in a case of bad salen param
to bind().

Submitted by: rdivacky
Tested with: LTP (vfork01 fails now, but it seems to be a race and
not caused by those changes)
MFC after: 1 week
947b8c9fbd64c828ef441411fff4f0ce542a99e2 19-Jul-2006 jhb <jhb@FreeBSD.org> Don't free the sockaddr in kern_bind() and kern_connect() as not all
callers pass a sockaddr allocated via malloc() from M_SONAME anymore.
Instead, free it in the callers when necessary.
e09e5b52dbb8914136f6708a8042007a16277dde 08-Jul-2006 jhb <jhb@FreeBSD.org> Add a kern_close() so that the ABIs can close a file descriptor w/o having
to populate a close_args struct and change some of the places that do.
326540d4d567d7d8120fe40a0512fe88864e7178 10-May-2006 netchild <netchild@FreeBSD.org> Now that we don't have a linuxolator on alpha anymore:
- unifdef __alpha__
- revert rev. 1.66 of linux_socket.c
68ff3be0b395955e8feac72262824b90a155a710 01-Apr-2006 rwatson <rwatson@FreeBSD.org> Annotate uses of fgetsock() with indications that they should rely
on their existing file descriptor references to sockets, rather than
use fgetsock() to retrieve a direct socket reference.

MFC after: 3 months
e29c4e80fd7361560c782a5d2034987165c0395d 21-Mar-2006 netchild <netchild@FreeBSD.org> Fix the LINT build on alpha:
- rename some file local structure definitions, the names clash with
autogenerated names
- on !alpha add some compatibility defines for those renamed structures
- make some functions globally visible on alpha
106242f7bb92f6bf4608b260788264aa932b5fd1 19-Mar-2006 ru <ru@FreeBSD.org> Unbreak COMPAT_LINUX32 option support on amd64.

Broken by: netchild
d1db96cb48e485fb9f14532b1ee1a979b4739114 18-Mar-2006 netchild <netchild@FreeBSD.org> Fixup some problems in my previous commit (COMPAT_43).

Pointyhat to: netchild
c1829f604cdf8a3f393bfa6cb85fe9a6d4908919 18-Mar-2006 netchild <netchild@FreeBSD.org> Get rid of the need of COMPAT_43 in the linuxolator.

Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz>
Obtained from: DragonFly (some parts)
df3c4d7b569bc7c9e3f108f8ede0f8409592e438 10-Jan-2006 glebius <glebius@FreeBSD.org> MFC 1.62:
Add \n to log() message.
3529ccbc92b30787ded7c49ac45b22bd9cf6c91e 27-Dec-2005 glebius <glebius@FreeBSD.org> Add \n to log() message.

Submitted by: Stanislaw Halik <weirdo tehran.lain.pl>
2b01dbdaa07af097e72d1ab008447ee1f3b6bd9b 28-Sep-2005 rwatson <rwatson@FreeBSD.org> Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57,
osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60,
svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81,
svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55,
svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10,
ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58,
unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:

Now that Giant is acquired in uprintf() and tprintf(), the caller no
longer needs to acquire Giant unless it also holds another mutex that
would generate a lock order reversal when calling into these functions.
Specifically not backed out is the acquisition of Giant in nfs_socket.c
and rpcclnt.c, where local mutexes are held and would otherwise violate
the lock order with Giant.

This aligns this code more with the eventual locking of ttys.

Suggested by: bde
c479a90eb8129ad770ff6daba981e9f20af69e6f 19-Sep-2005 rwatson <rwatson@FreeBSD.org> Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),
as they both interact with the tty code (!MPSAFE) and may sleep if the
tty buffer is full (per comment).

Modify all consumers of uprintf() and tprintf() to hold Giant around
calls into these functions. In most cases, this means adding an
acquisition of Giant immediately around the function. In some cases
(nfs_timer()), it means acquiring Giant higher up in the callout.

With these changes, UFS no longer panics on SMP when either blocks are
exhausted or inodes are exhausted under load due to races in the tty
code when running without Giant.

NB: Some reduction in calls to uprintf() in the svr4 code is probably
desirable.

NB: In the case of nfs_timer(), calling uprintf() while holding a mutex,
or even in a callout at all, is a bad idea, and will generate warnings
and potential upset. This needs to be fixed, but was a problem before
this change.

NB: uprintf()/tprintf() sleeping is generally a bad idea, as is having
non-MPSAFE tty code.

MFC after: 1 week
8816876fa9909d3428b22db33b1e6fd8edcdd209 09-Jul-2005 jhb <jhb@FreeBSD.org> Add missing locking to linux_connect() so that it can be marked MP safe:
- Conditionally grab Giant around the EISCONN hack at the end based on
debug.mpsafenet.
- Protect access to so_emuldata via SOCK_LOCK.

Reviewed by: rwatson
Approved by: re (scottl)
fbf7a9b2eeca945d9a6947410d6fa2b1c321d366 23-Mar-2005 das <das@FreeBSD.org> Reject packets larger than IP_MAXPACKET in linux_sendto() for sockets
with the IP_HDRINCL option set. Without this change, a Linux process
with access to a raw socket could cause a kernel panic. Raw sockets
must be created by root, and are generally not consigned to untrusted
applications; hence, the security implications of this bug are
minimal. I believe this only affects 6-CURRENT on or after 2005-01-30.

Found by: Coverity Prevent analysis tool
Security: Local DOS
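
A minimal sketch of the length check described in the message above.
sketch_check_hdrincl_len() and SKETCH_IP_MAXPACKET are hypothetical names,
and the EINVAL return value is an assumption rather than something the
message states; 65535 is the maximum size of an IPv4 packet:

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_IP_MAXPACKET 65535	/* maximum IPv4 packet size */

/*
 * Refuse oversized buffers before any header rewriting takes place.
 */
static int
sketch_check_hdrincl_len(size_t len)
{
	if (len > SKETCH_IP_MAXPACKET)
		return (EINVAL);	/* assumed errno, not from the commit */
	return (0);
}
```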
b795e2430a08adbe58c55709aea6123ff893cd2c 08-Mar-2005 sobomax <sobomax@FreeBSD.org> Add kernel-only flag MSG_NOSIGNAL to be used in emulation layers to suppress
SIGPIPE signal for the duration of the sendto-family syscalls. Use it to
replace previously added hack in Linux layer based on temporarily setting
SO_NOSIGPIPE flag.

Suggested by: alfred
a5d845fec654f4b43610ba376ae17fa9309853b0 07-Mar-2005 sobomax <sobomax@FreeBSD.org> Handle MSG_NOSIGNAL flag in linux_send() by setting SO_NOSIGPIPE on socket
for the duration of the send() call. Such an approach may be less than ideal
in a threaded environment, when several threads share the same socket and it
might happen that several of them are calling linux_send() at the same time
with and without SO_NOSIGPIPE set.

However, such race condition is very unlikely in practice, therefore this
change provides practical improvement compared to the previous behaviour.

PR: kern/76426
Submitted by: Steven Hartland <killing@multiplay.co.uk>
MFC after: 3 days
68d0bd21861da82ecf4f8a65da75bedd481808f7 30-Jan-2005 sobomax <sobomax@FreeBSD.org> Extend kern_sendit() to take another enum uio_seg argument, which specifies
where the buffer to send lies and use it to eliminate yet another stackgap
in the linuxulator.

MFC after: 2 weeks
98e2482a94352fb526a94972423b2c3dc3ebae6f 14-Jan-2005 obrien <obrien@FreeBSD.org> Match the LINUX32's style with existing style
Submitted by: Jung-uk Kim <jkim@niksun.com>

Use positive, not negative logic.
cc23ea84d0ad17e7d69a1539947fdc50a38c6af0 24-Aug-2004 jhb <jhb@FreeBSD.org> Fix the ABI wrappers to use kern_fcntl() rather than calling fcntl()
directly. This removes a few more users of the stackgap and also marks
the syscalls using these wrappers MP safe where appropriate.

Tested on: i386 with linux acroread5
Compiled on: i386, alpha LINT
bf69a165581d2df18a7ae0a951e5879204f1b6fe 23-Aug-2004 des <des@FreeBSD.org> Don't try to translate the control message unless we're certain it's
valid; otherwise a caller could trick us into changing any 32-bit word
in kernel memory to LINUX_SOL_SOCKET (0x00000001) if its previous value
is SOL_SOCKET (0x0000ffff).

MFC after: 3 days
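
The validate-before-translate rule from the message above can be sketched
like this. It is an illustration under stated assumptions, not the
committed code: struct sketch_cmsghdr is a reduced stand-in for a control
message header and sketch_translate_level() is a hypothetical helper. The
two level values are the ones quoted in the message:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SOL_SOCKET	0xffff	/* native value, per the message */
#define SKETCH_LINUX_SOL_SOCKET	1	/* Linux value, per the message */

/* Reduced stand-in for a control message header. */
struct sketch_cmsghdr {
	uint32_t cmsg_len;
	int32_t	 cmsg_level;
	int32_t	 cmsg_type;
};

/*
 * Rewrite the level of the first control message in buf, but only
 * after confirming that a whole header fits inside the buffer and
 * that the claimed length is consistent with it.
 * Returns 1 if the level was translated, 0 otherwise.
 */
static int
sketch_translate_level(void *buf, size_t buflen)
{
	struct sketch_cmsghdr hdr;

	if (buf == NULL || buflen < sizeof(hdr))
		return (0);
	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.cmsg_len < sizeof(hdr) || hdr.cmsg_len > buflen)
		return (0);
	if (hdr.cmsg_level != SKETCH_SOL_SOCKET)
		return (0);
	hdr.cmsg_level = SKETCH_LINUX_SOL_SOCKET;
	memcpy(buf, &hdr, sizeof(hdr));
	return (1);
}
```

Without the bounds checks, the write to cmsg_level would land wherever the
caller-supplied pointer happened to point, which is the problem the message
describes.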
6d0528abdfecb0a45eec1ee51b594803b1e11866 16-Aug-2004 tjr <tjr@FreeBSD.org> Changes to MI Linux emulation code necessary to run 32-bit Linux binaries
on AMD64, and the general case where the emulated platform has different
size pointers than we use natively:
- declare certain structure members as l_uintptr_t and use the new PTRIN
and PTROUT macros to convert to and from native pointers.
- declare some structures __packed on amd64 when the layout would differ
from that used on i386.
- include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h>
if compiling with COMPAT_LINUX32. This will need to be revisited before
32-bit and 64-bit Linux emulation support can coexist in the same kernel.
- other small scattered changes.

This should be a no-op on i386 and Alpha.
85955763051e72f86391c0a9ca1ce1c12ef93a34 18-Jul-2004 dwmalone <dwmalone@FreeBSD.org> I missed two pieces of the commit to this file. Robert has already
added one, this adds the other.
606ea367aee6e42c3e2ef3e78e6a2c9450ca0733 18-Jul-2004 rwatson <rwatson@FreeBSD.org> Remove 'sg' argument to linux_sendto_hdrincl, which is what I think was
intended. This fixes the build, but might require revision.
c8c1b8f415a3f8e57219cf3587707d7d24b8d582 17-Jul-2004 dwmalone <dwmalone@FreeBSD.org> Add a kern_setsockopt and kern_getsockopt which can read the option
values from either user land or from the kernel. Use them for
[gs]etsockopt and to clean up some calls to [gs]etsockopt in the
Linux emulation code that uses the stackgap.
b9f13e4266e1a358f8e0f5ee3542e657eabb8a19 10-Jul-2004 phk <phk@FreeBSD.org> Clean up and wash struct iovec and struct uio handling.

Add copyiniov() which copies a struct iovec array in from userland into
a malloc'ed struct iovec. Caller frees.

Change uiofromiov() to malloc the uio (caller frees) and name it
copyinuio() which is more appropriate.

Add cloneuio() which returns a malloc'ed copy. Caller frees.

Use them throughout.
64c32415ce9ffa8cd1c9daad2badb08c90241e48 08-Jul-2004 phk <phk@FreeBSD.org> Use a couple of regular kernel entry points, rather than COMPAT_43
entry points.
8ce2b349efc45b33216c6f6f2be111c9c99ad0ba 25-Dec-2003 bde <bde@FreeBSD.org> Quick fix for LINT breakage caused by interface changes in accept(2), etc.
The log message for rev.1.160 of kern/uipc_syscalls.c and associated
changes only claimed to add restrict qualifiers (which have no effect in
the kernel so they probably shouldn't be added), but the following
interface changes were also made:
- caddr_t to `void *' and `struct sockaddr_t *'
- `int *' to `socklen_t *'.
These interface changes are not quite null, and this fix is quick (like
the changes in uipc_syscalls 1.160) because it uses bogus casts instead
of complete bounds-checked conversions.

Things should be fixed better when the conversions can be done without
using the stack gap. linux_check_hdrincl() already uses the stack gap
and is fixed completely though the type mismatches in it were not fatal
(there were only fatal type mismatches from unopaquing pointers to
[o]sockaddr't's -- the difference between accept()'s args and oaccept()'s
args is now non-opaque, but this is not reflected in their args structs).
3e991a3b59b5351609ea5c3ca9defdbd375cf5f4 09-Nov-2003 dwmalone <dwmalone@FreeBSD.org> Use kern_sendit rather than sendit for the Linux send* syscalls.
This means we can avoid using the stack gap for most send* syscalls
now (it is still used in the IP_HDRINCL case).
324480cbaf22f812dbbff8d773d2f51c15a59e1d 11-Oct-2003 iwasaki <iwasaki@FreeBSD.org> Fix some problems in linux_sendmsg() and linux_recvmsg().
- Allocate storage for uap->msg always because it is copyin()'ed in
native sendmsg().
- Convert sockopt level from Linux to FreeBSD after native recvmsg() calling.
- Some cleanups.

Tested with: Oracle 9i shared server connection mode.

MFC after: 1 week
f72cbcf20753e9285eb1bfcde01b63b6bc05be01 10-Jun-2003 obrien <obrien@FreeBSD.org> Use __FBSDID().
9468fdaf14ab3e5212aac4e764e4616b726ec850 29-Apr-2003 kan <kan@FreeBSD.org> Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on: standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
3c182bd3cd14b38db69ecb7fbaa4fbad85bf23c4 03-Mar-2003 des <des@FreeBSD.org> Clean up whitespace and remove register keyword.
021faa11ac3e8868e33ee981995319d77cba1b58 03-Mar-2003 des <des@FreeBSD.org> More caddr_t removal, in conjunction with copy{in,out}(9) this time.
Also clean up some egregious casts and incorrect use of sizeof.
5820529758adc03586d4899b86cf35c9ddd95895 20-Feb-2003 ume <ume@FreeBSD.org> Add M_WAITOK
7901f87059611b1f0e7666303597591dad9b5b16 08-Feb-2003 dwmalone <dwmalone@FreeBSD.org> 1) Linux_sendto was trashing the BSD sockaddr it put in the stackgap,
so be more careful about calling stackgap_init.

Tested by: Fred Souza <fred@storming.org>

2) Linux_sendmsg was forgetting to fill out the bsd_args struct.

Reviewed by: ume

3) The args to linux_connect have differently named types on alpha and
i386, so add a cast to stop gcc complaining.

Spotted by: peter
f1aeff9dcbbfd44faf53c5eddf9f6206bd2bdce1 05-Feb-2003 ume <ume@FreeBSD.org> Avoid undefined symbol error with an IPv4 only kernel.

Reported by: "Sergey A. Osokin" <osa@freebsd.org.ru>
9689f0580db3e8ffcfba46b99c0b3a370eb9524c 03-Feb-2003 ume <ume@FreeBSD.org> Add IPv6 support for Linuxlator.

Reviewed by: dwmalone
MFC after: 10 days
7a31c08874a06ca4e5a0b375518b70fef7c57656 24-Sep-2002 mini <mini@FreeBSD.org> Back out last commit. Linux uses the old 4.3BSD sockaddr format.
e206834961edb2d9141a787805ae32d92c3c9877 23-Sep-2002 mini <mini@FreeBSD.org> Don't use compatibility syscall wrappers in emulation code.
This is needed for the COMPAT_FREEBSD3 option split.

Reviewed by: alfred, jake
28bcbfe85d38c560248dd8166be09f8d94775502 02-Jun-2002 schweikh <schweikh@FreeBSD.org> Fix typo in the BSD copyright: s/withough/without/

Spotted and suggested by: des
MFC after: 3 weeks
271e61648420b256374ae031ae2d70a0c59672ba 17-Nov-2001 dillon <dillon@FreeBSD.org> Fix missing holdsock()->fgetsock()

Submitted by: Hisashi Hiramoto <hiramoto@phys.chs.nihon-u.ac.jp>
9246a16af858019b295cc1274132b8af3d4d67a3 26-Oct-2001 fenner <fenner@FreeBSD.org> Force the length of the sockaddr to be correct for AF_INET and AF_INET6
in bind() and connect(). Linux doesn't care if the length of the
sockaddr matches its address family; FreeBSD does. This fixes the
known issues with the resolver in linux_base-7.
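
The length fix-up described in this entry can be sketched as below. This is a minimal userland illustration, not the code from the commit: the helper name fixup_sockaddr_len() is hypothetical, and it assumes that "correct" means the size of the native structure for the address family, with every other family passed through unchanged.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: return the sockaddr length that a strict native
 * bind()/connect() would accept for the given address family. Lengths
 * for families other than AF_INET and AF_INET6 are returned unchanged.
 */
socklen_t
fixup_sockaddr_len(sa_family_t family, socklen_t len)
{
	switch (family) {
	case AF_INET:
		return (sizeof(struct sockaddr_in));
	case AF_INET6:
		return (sizeof(struct sockaddr_in6));
	default:
		return (len);
	}
}
```

A caller would apply the helper to the length supplied by the Linux binary before handing the address to the native call, so an oversized or undersized AF_INET length no longer causes a failure.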
5596676e6c6c1e81e899cd0531f9b1c28a292669 12-Sep-2001 julian <julian@FreeBSD.org> KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha
df61d9eb64550a7afc1b41ccc9e0261af45c98c1 08-Sep-2001 marcel <marcel@FreeBSD.org> Round of cleanups and enhancements. These include (in random order):

o Introduce private types for use in linux syscalls for two reasons:
1. establish type independence for ease in porting and,
2. provide a visual cue as to which syscalls have proper
prototypes to further cleanup the i386/alpha split.
Linuxulator types are prefixed by 'l_'. void and char have not
been "virtualized".

o Provide dummy functions for all syscalls and remove dummy functions
or implementations of truly obsolete syscalls.

o Sanitize the shm*, sem* and msg* syscalls.

o Make a first attempt to implement the linux_sysctl syscall. At this
time it only returns one MIB (KERN_VERSION), but most importantly,
it tells us when we need to add additional sysctls :-)

o Bump the kernel version up to 2.4.2 (this is not the same as the
KERN_VERSION MIB, BTW).

o Implement new syscalls, of which most are specific to i386. Our
syscall table is now up to date with Linux 2.4.2. Some highlights:
- Implement the 32-bit uid_t and gid_t based syscalls.
- Implement a couple of 64-bit file size/offset based syscalls.

o Fix or improve numerous syscalls and prototypes.

o Reduce style(9) violations while I'm here. Especially indentation
inconsistencies within the same file are addressed. Re-indenting
did not obfuscate actual changes to the extent that it could not
be combined.

NOTE: I spent some time testing these changes and found that if there
were regressions, they were not caused by these changes AFAICT.
It was observed that installing a RH 7.1 runtime environment
did make matters worse. Hangs and/or reboots have been observed
with and without these changes, so when it failed to make life
better in cases it doesn't look like it made it worse.
0e6ea63318fe28e058f9968b8be7e22ba6704adc 02-Mar-2001 jlemon <jlemon@FreeBSD.org> Only pick up so_error the first time through with EISCONN, as advertised.
The sense of the test was reversed, so we were returning EISCONN, then 0.

Pointed out and tested by: Martin Blapp <mb@imp.ch>
0bdef2632976d1c6964f12eec3370f9292d21da9 01-Mar-2001 jlemon <jlemon@FreeBSD.org> Correctly emulate linux_connect. For nonblocking sockets, the behavior
is to return EINPROGRESS, EALREADY, (so_error ONCE), EISCONN. Certain
linux applications rely on the so_error (normally 0) being returned in
order to operate properly.

Tested by: Thomas Moestl <tmoestl@gmx.net>
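
The return sequence described in this entry (EINPROGRESS, EALREADY, so_error once, then EISCONN) can be modelled as a small state machine. The sketch below is a userland illustration only, not the kernel code: struct model_sock, model_linux_connect() and model_complete() are hypothetical names, and the model covers only a single connect attempt on a nonblocking socket.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a nonblocking socket; not the kernel's struct socket. */
struct model_sock {
	bool connecting;	/* a connect attempt is in flight */
	bool finished;		/* the attempt has completed */
	bool error_reported;	/* so_error was already handed back once */
	int so_error;		/* asynchronous result, normally 0 */
};

/* Return the value a Linux caller would see from repeated connect() calls. */
int
model_linux_connect(struct model_sock *s)
{
	int error;

	assert(s != NULL);
	if (!s->connecting && !s->finished) {
		/* First call: start the attempt. */
		s->connecting = true;
		return (EINPROGRESS);
	}
	if (s->connecting) {
		/* Attempt still pending. */
		return (EALREADY);
	}
	if (!s->error_reported) {
		/* First call after completion: report so_error exactly once. */
		s->error_reported = true;
		error = s->so_error;
		s->so_error = 0;
		return (error);
	}
	/* Every later call. */
	return (EISCONN);
}

/* Simulate the asynchronous completion of the connect attempt. */
void
model_complete(struct model_sock *s, int error)
{
	s->connecting = false;
	s->finished = true;
	s->so_error = error;
}
```

Keeping the "reported once" flag separate from so_error matters because so_error is normally 0, so its value alone cannot tell the first post-completion call from later ones.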
f587d3bd718045cb14ebccfb171b17da0ae9ef45 19-Dec-2000 assar <assar@FreeBSD.org> translate the flags in recvfrom and recvmsg from linux to bsd ones

Approved by: marcel
561104748931a3f0db188d4465837a7a1ec3ee2e 03-Dec-2000 marcel <marcel@FreeBSD.org> Don't auto-generate the syscalls.
0ee48b4aca419a4b02c82a50299bcd6e23a3a25b 16-Nov-2000 gallatin <gallatin@FreeBSD.org> Use the linux_connect() on alpha rather than passing directly through
to our native connect(). This is required to deal with the differences
in the way linux handles connects on non-blocking sockets.

This gets the private beta of the Compaq Linux/alpha JDK working
on FreeBSD/alpha

Approved by: marcel
c3aea64316011697102c05ece8650ef2bd9fb6d8 10-Nov-2000 marcel <marcel@FreeBSD.org> Revert auto-generation. The Alpha port is broken.
Syncing with it is wrong.
7980b37e09111f86fa7b0583f6a2e79e6434a0b0 09-Nov-2000 marcel <marcel@FreeBSD.org> Sync with Alpha:
Do not use sysent.c, proto.h and syscall.h in source tree;
use auto-generated versions.
c4a9f49ba81d429e0feff56ed0c369ee75bab7b1 01-Nov-2000 obrien <obrien@FreeBSD.org> The MI/MD split wasn't perfect and the MI files need hacks for the
AlphaLinux compat bits. This will be better cleaned up soon.

Agreed to what ever was necessary by: marcel
abf2c201618e161195536aba27ea37a2cf60fad5 26-Aug-2000 marcel <marcel@FreeBSD.org> Whitespace change: (near) KNF
219e29595a8d293c1e81f0136a866f25a69d648e 22-Aug-2000 marcel <marcel@FreeBSD.org> Update include directives.
b42951578188c5aab5c9f8cbcde4a743f8092cdc 02-Apr-2000 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'ALSA'.
39a8b7149f0ab399cdd6a6d5f254c70ba2cf6fa6 28-Feb-2000 marcel <marcel@FreeBSD.org> Fix accept(2) behavior in that accepted sockets don't inherit the
parent's flags.

Note on the PR:
The PR contains another patch that's not being committed without
further background information. The PR stays open for now.

PR: 16946 (Victor A. Salaman <salaman@teknos.com>)
Prompted by: msmith
Indirect/implicit approval: jkh (shoot me if I'm wrong :-)
3b842d34e82312a8004a7ecd65ccdb837ef72ac1 28-Aug-1999 peter <peter@FreeBSD.org> $Id$ -> $FreeBSD$
c319cef85823dfc262035ce8e05b52a1f9730102 11-Jan-1999 msmith <msmith@FreeBSD.org> Fix linux sendmsg() emulation

Submitted by: Brian Feldman <green@unixhelp.org>
241d58f7a246527ab2b3d4696a446eac3c99ff58 30-Dec-1998 sos <sos@FreeBSD.org> Commit patch in

PR: 9232
Submitted by: marcel@scc.nl <Marcel Moolenaar>
cd450d67141a2e84500ff624dc9d39c255a7de77 28-Mar-1998 bde <bde@FreeBSD.org> Moved some #includes from <sys/param.h> nearer to where they are actually
used.
d3baeeda1e135d3e3e0d724edeea82b87846ed9e 07-Feb-1998 msmith <msmith@FreeBSD.org> In the words of the submitter:

----
I've worked to enhance the connect() patches.

I've just tested this with the Linux JDK appletviewer on an applet
that does a lot of connects, and it works as well as during my
previous tests.

The connect() patch is now a merge between my older patch and the
OpenBSD stuff. It ensures that any async error is returned by
connect() instead of getsockopt(SOL_SOCKET, SO_ERROR) as reasonable
systems do.

There are also minor patches to implement IPPROTO_TCP for
get/setsockopt(). These are also tested (with Linux Apache).
----

I would appreciate any feedback regarding these changes, as they'd
be very useful in 2.2.6.

Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)
0506343883d62f6649f7bbaf1a436133cef6261d 11-Jan-1998 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'jb'.
7c6e96080c4fb49bf912942804477d202a53396c 10-Jan-1998 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'JB'.
01dd6091edaa3e5d6ce972956bdaff5e8575d53f 16-Dec-1997 eivind <eivind@FreeBSD.org> Make COMPAT_43 and COMPAT_SUNOS new-style options.
7bbd5d48704e983e22e4ac28475e6ec127347953 14-Dec-1997 msmith <msmith@FreeBSD.org> As described by the submitter:

- emulate Linux IP_HDRINCL behaviour in sendto(): byte order fixed
Note that we do an extra getsockopt() on every sendto()
to check if the option is set because we don't keep state
in the emulator code. Is there a better way to implement
this?
- correct a bug (value of "name" not passed) with
getsockopt()

Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)
4c8218a5c7d132b8ae0bd2a5a677455d69fabaab 06-Nov-1997 phk <phk@FreeBSD.org> Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warnings, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
to be recompiled.
98c28d00a775205ea3df14b0c3bb1a5b984485d0 20-Jul-1997 bde <bde@FreeBSD.org> Removed unused #includes.
94b6d727947e1242356988da003ea702d41a97de 22-Feb-1997 peter <peter@FreeBSD.org> Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
808a36ef658c1810327b5d329469bcf5dad24b28 14-Jan-1997 jkh <jkh@FreeBSD.org> Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
0c67934fafa58a5271cfa2c381752fdeb4696353 03-Dec-1996 fenner <fenner@FreeBSD.org> Add IP_OPTIONS and the multicast-related setsockopts to the
list of IP setsockopts the Linux emulator recognizes.

Explicitly disallow IP_HDRINCL since Linux's handling of
raw output is different than BSD's.

Closes PR#kern/2111.

Submitted by: y-nakaga@ccs.mt.nec.co.jp (Yoshihisa NAKAGAWA)
8465726bdae0892d85c26b32cd01d2f6936303ef 02-Mar-1996 peter <peter@FreeBSD.org> Mega-commit for Linux emulator update.. This has been stress tested under
netscape-2.0 for Linux running all the Java stuff. The scrollbars are now
working, at least on my machine. (whew! :-)

I'm uncomfortable with the size of this commit, but it's too
interdependent to easily separate out.

The main changes:

COMPAT_LINUX is *GONE*. Most of the code has been moved out of the i386
machine dependent section into the linux emulator itself. The int 0x80
syscall code was almost identical to the lcall 7,0 code and a minor tweak
allows them to both be used with the same C code. All kernels can now
just modload the lkm and it'll DTRT without having to rebuild the kernel
first. Like IBCS2, you can statically compile it in with "options LINUX".

A pile of new syscalls implemented, including getdents(), llseek(),
readv(), writev(), msync(), personality(). The Linux-ELF libraries want
to use some of these.

linux_select() now obeys Linux semantics, ie: returns the time remaining
of the timeout value rather than leaving it the original value.

Quite a few bugs removed, including incorrect arguments being used in
syscalls.. eg: mixups between passing the sigset as an int, vs passing
it as a pointer and doing a copyin(), missing return values, unhandled
cases, SIOC* ioctls, etc.

The build for the code has changed. i386/conf/files now knows how
to build linux_genassym and generate linux_assym.h on the fly.

Supporting changes elsewhere in the kernel:

The user-mode signal trampoline has moved from the U area to immediately
below the top of the stack (below PS_STRINGS). This allows the different
binary emulations to have their own signal trampoline code (which gets rid
of the hardwired syscall 103 (sigreturn on BSD, syslog on Linux)) and so
that the emulator can provide the exact "struct sigcontext *" argument to
the program's signal handlers.

The sigstack's "ss_flags" now uses SS_DISABLE and SS_ONSTACK flags, which
have the same values as the re-used SA_DISABLE and SA_ONSTACK which are
intended for sigaction only. This enables the support of a SA_RESETHAND
flag to sigaction to implement the gross SYSV and Linux SA_ONESHOT signal
semantics where the signal handler is reset when it's triggered.

makesyscalls.sh no longer appends the struct sysentvec on the end of the
generated init_sysent.c code. It's a lot saner to have it in a separate
file rather than trying to update the structure inside the awk script. :-)

At exec time, the dozen bytes or so of signal trampoline code are copied
to the top of the user's stack, rather than obtaining the trampoline code
the old way by getting a clone of the parent's user area. This allows
Linux and native binaries to freely exec each other without getting
trampolines mixed up.
efe4e33b175e71e1db7760bd963299b28c614238 15-Dec-1995 peter <peter@FreeBSD.org> Clean up some warnings by using the generated structures in <sys/sysproto.h>
for passing to the bsd system calls, rather than inventing our own
equivalent structures.
825daf3e3289f2ab37f6f075a4eb4270b8f846c2 22-Nov-1995 bde <bde@FreeBSD.org> Completed function declarations and added prototypes.

Removed some unnecessary #includes.

Fixed warnings about nested externs.
86f1bc4514fdcfd255f37f3218fe234bdc3664fc 05-Nov-1995 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'LINUX'.
f14ea10694ed07a2d0627c9efa2b79c03b19639d 25-Jun-1995 sos <sos@FreeBSD.org> First incarnation of our Linux emulator or rather compatibility code.
This first shot only incorporates enough functionality that DOOM
can run (the X version); signal handling is VERY weak, and so are many
other things. But it meets my milestone number one (you guessed it
- running DOOM).

Uses /compat/linux as prefix for loading shared libs, so it won't
conflict with our own libs.

Kernel must be compiled with "options COMPAT_LINUX" for this to work.