History log of /freebsd-head/sys/net/if.h
Revision Date Author Comments
465bfae22845e24b1dbfbb590d749fb63b3c15a0 17-Sep-2019 kib <kib@FreeBSD.org> Add SIOCGIFDOWNREASON.

The ioctl(2) is intended to provide more details about why the link
is down.

Eventually we might define a comprehensive list of codes for the
situations. But the interface also allows the driver to provide a
free-form null-terminated ASCII string carrying arbitrary
non-formalized information. A sample implementation exists for
mlx5(4), where the string is fetched from the firmware controlling the
port.
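
A minimal sketch of how a consumer might handle such a reply, assuming
a record that carries a numeric code plus a fixed-size message buffer.
The struct, the constants and the function name are hypothetical and
are not taken from if.h.

```c
#include <stdint.h>
#include <stdio.h>

#define SK_REASON_MSG_SIZE 64	/* hypothetical buffer size */
#define SK_REASON_MSG      1	/* hypothetical code: free-form text is valid */

/* Hypothetical reply record: a code plus a driver-supplied string. */
struct sk_downreason {
	uint32_t reason;
	char     msg[SK_REASON_MSG_SIZE];
};

/* Render the reason into 'out'; returns the length snprintf() reports. */
static int
sk_downreason_format(const struct sk_downreason *dr, char *out, size_t outlen)
{
	if (dr->reason == SK_REASON_MSG) {
		/* The precision bounds the read if the terminator is missing. */
		return (snprintf(out, outlen, "%.*s", SK_REASON_MSG_SIZE,
		    dr->msg));
	}
	return (snprintf(out, outlen, "reason code %u", (unsigned)dr->reason));
}
```

Bounding the read with a precision keeps the consumer safe even when a
driver fills the whole buffer without a terminator.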

Reviewed by: hselasky, rrs
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21527
1cf31620c802b9e665385827148ab45a22cef571 27-Aug-2019 jhb <jhb@FreeBSD.org> Add kernel-side support for in-kernel TLS.

KTLS adds support for in-kernel framing and encryption of Transport
Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports
offload of TLS for transmitted data. Key negotiation must still be
performed in userland. Once completed, transmit session keys for a
connection are provided to the kernel via a new TCP_TXTLS_ENABLE
socket option. All subsequent data transmitted on the socket is
placed into TLS frames and encrypted using the supplied keys.

Any data written to a KTLS-enabled socket via write(2), aio_write(2),
or sendfile(2) is assumed to be application data and is encoded in TLS
frames with an application data type. Individual records can be sent
with a custom type (e.g. handshake messages) via sendmsg(2) with a new
control message (TLS_SET_RECORD_TYPE) specifying the record type.

At present, rekeying is not supported though the in-kernel framework
should support rekeying.

KTLS makes use of the recently added unmapped mbufs to store TLS
frames in the socket buffer. Each TLS frame is described by a single
ext_pgs mbuf. The ext_pgs structure contains the header of the TLS
record (and trailer for encrypted records) as well as references to
the associated TLS session.

KTLS supports two primary methods of encrypting TLS frames: software
TLS and ifnet TLS.

Software TLS marks mbufs holding socket data as not ready via
M_NOTREADY similar to sendfile(2) when TLS framing information is
added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then
called to schedule TLS frames for encryption. In the sendfile(2)
case, sendfile_iodone() calls ktls_enqueue() instead of pru_ready(),
leaving the mbufs marked M_NOTREADY until encryption is completed. For other
writes (vn_sendfile when pages are available, write(2), etc.), the
PRUS_NOTREADY is set when invoking pru_send() along with invoking

A pool of worker threads (the "KTLS" kernel process) encrypts TLS
frames queued via ktls_enqueue(). Each TLS frame is temporarily
mapped using the direct map and passed to a software encryption
backend to perform the actual encryption.

(Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if
someone wished to make this work on architectures without a direct

KTLS supports pluggable software encryption backends. Internally,
Netflix uses proprietary pure-software backends. This commit includes
a simple backend in a new ktls_ocf.ko module that uses the kernel's
OpenCrypto framework to provide AES-GCM encryption of TLS frames. As
a result, software TLS is now a bit of a misnomer as it can make use
of hardware crypto accelerators.

Once software encryption has finished, the TLS frame mbufs are marked
ready via pru_ready(). At this point, the encrypted data appears as
regular payload to the TCP stack stored in unmapped mbufs.

ifnet TLS permits a NIC to offload the TLS encryption and TCP
segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS)
is allocated on the interface a socket is routed over and associated
with a TLS session. TLS records for a TLS session using ifnet TLS are
not marked M_NOTREADY but are passed down the stack unencrypted. The
ip_output_send() and ip6_output_send() helper functions that apply
send tags to outbound IP packets verify that the send tag of the TLS
record matches the outbound interface. If so, the packet is tagged
with the TLS send tag and sent to the interface. The NIC device
driver must recognize packets with the TLS send tag and schedule them
for TLS encryption and TCP segmentation. If the outbound
interface does not match the interface in the TLS send tag, the packet
is dropped. In addition, a task is scheduled to refresh the TLS send
tag for the TLS session. If a new TLS send tag cannot be allocated,
the connection is dropped. If a new TLS send tag is allocated,
however, subsequent packets will be tagged with the correct TLS send
tag. (This latter case has been tested by configuring both ports of a
Chelsio T6 in a lagg and failing over from one port to another. As
the connections migrated to the new port, new TLS send tags were
allocated for the new port and connections resumed without being

ifnet TLS can be enabled and disabled on supported network interfaces
via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported
across both vlan devices and lagg interfaces using failover or lacp
with flowid enabled.

Applications may request the current KTLS mode of a connection via a
new TCP_TXTLS_MODE socket option. They can also use this socket
option to toggle between software and ifnet TLS modes.

In addition, a testing tool is available in tools/tools/switch_tls.
This is modeled on tcpdrop and uses similar syntax. However, instead
of dropping connections, -s is used to force KTLS connections to
switch to software TLS and -i is used to switch to ifnet TLS.

Various sysctls and counters are available under the kern.ipc.tls
sysctl node. The kern.ipc.tls.enable node must be set to true to
enable KTLS (it is off by default). The use of unmapped mbufs must
also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS.

KTLS is enabled via the KERN_TLS kernel option.

This patch is the culmination of years of work by several folks
including Scott Long and Randall Stewart for the original design and
implementation; Drew Gallatin for several optimizations including the
use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records
awaiting software encryption, and pluggable software crypto backends;
and John Baldwin for modifications to support hardware TLS offload.

Reviewed by: gallatin, hselasky, rrs
Obtained from: Netflix
Sponsored by: Netflix, Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21277
520aafe3ec663aef3c6d2631d6d32bfe470eeb67 29-Jun-2019 jhb <jhb@FreeBSD.org> Add an external mbuf buffer type that holds multiple unmapped pages.

Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages. It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.

For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer
now points to a struct mbuf_ext_pgs structure instead of a data
buffer. This structure contains an array of physical addresses (this
reduces cache misses compared to an earlier version that stored an
array of vm_page_t pointers). It also stores additional fields needed
for in-kernel TLS such as the TLS header and trailer data that are
currently unused. To more easily detect these mbufs, the M_NOMAP flag
is set in m_flags in addition to M_EXT.

Various functions like m_copydata() have been updated to safely access
packet contents (using uiomove_fromphys()), to make things like BPF

NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability. This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is

If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output. If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.
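
The conversion rule above can be sketched as a small predicate. The
flag values and names below are hypothetical stand-ins for M_NOMAP and
IFCAP_NOMAP; this is not the logic from ip_output itself.

```c
#define SK_M_NOMAP     0x2u	/* stand-in for M_NOMAP */
#define SK_IFCAP_NOMAP 0x1u	/* stand-in for IFCAP_NOMAP */

/*
 * Returns 1 when an mbuf must be converted to mapped mbufs before
 * transmit, 0 when it can be handed to the driver as-is.
 */
static int
sk_must_convert(unsigned m_flags, unsigned if_capenable, int needs_sw_csum)
{
	if ((m_flags & SK_M_NOMAP) == 0)
		return (0);	/* already mapped, nothing to do */
	if ((if_capenable & SK_IFCAP_NOMAP) == 0)
		return (1);	/* NIC cannot take unmapped mbufs */
	return (needs_sw_csum != 0);	/* software checksum reads the data */
}
```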

Submitted by: gallatin (earlier version)
Reviewed by: gallatin, hselasky, rrs
Discussed with: ae, kp (firewalls)
Relnotes: yes
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20616
4bb7057f71f2839655c2e1d09705d78ec4300abe 26-Nov-2018 markj <markj@FreeBSD.org> Plug routing sysctl leaks.

Various structures exported by sysctl_rtsock() contain padding fields
which were not being zeroed.
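
A minimal sketch of the usual fix for this class of leak: clear the
whole object before filling in its members, so padding bytes do not
carry stale memory out to userland. The struct and function are
hypothetical and are not the ones touched by this commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; most ABIs insert padding after 'type'. */
struct sk_export_rec {
	uint8_t  type;
	uint64_t value;
};

/* Fill 'out' for export, zeroing padding along with everything else. */
static void
sk_export_rec_fill(struct sk_export_rec *out, uint8_t type, uint64_t value)
{
	memset(out, 0, sizeof(*out));
	out->type = type;
	out->value = value;
}
```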

Reported by: Thomas Barabosch, Fraunhofer FKIE
Reviewed by: ae
MFC after: 3 days
Security: kernel memory disclosure
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18333
0a7aab5128b088dbea782c007891937ec5fb5579 11-May-2018 mmacy <mmacy@FreeBSD.org> iflib(9): Add support for cloning pseudo interfaces

Part 3 of many ...
The VPC framework relies heavily on cloning pseudo interfaces
(vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc).

This pulls in that piece. Some ancillary changes get pulled
in as a side effect.

Reviewed by: shurd@
Approved by: sbruno@
Sponsored by: Joyent, Inc.
Differential Revision: https://reviews.freebsd.org/D15347
0080e81d7cd9056eb8e5f471948492e6b672846c 05-Apr-2018 brooks <brooks@FreeBSD.org> Add 32-bit compat for ioctls that take struct ifgroupreq.

Use an accessor to access ifgr_group and ifgr_groups.

Use a macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such
as "case SIOCAIFGROUP:". This avoids polluting the switch statements
with large numbers of #ifdefs.
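
The macro technique can be sketched as follows. The command number,
the way the compat variant is derived and all names here are
hypothetical; the real ioctl encodings differ.

```c
#define SK_CMD_ADDGROUP   0x10u			/* hypothetical command */
#define SK_CMD_32(cmd)    ((cmd) | 0x100u)	/* hypothetical compat variant */

#define SK_HAVE_COMPAT32  1

#if SK_HAVE_COMPAT32
/* Expands to "compat-value: case" so one label covers both commands. */
#define SK_CASE_CMD_32(cmd)	SK_CMD_32(cmd): case
#else
#define SK_CASE_CMD_32(cmd)
#endif
#define SK_CASE_CMD(cmd)	SK_CASE_CMD_32(cmd) (cmd)

/* Returns 1 for the native command and, if enabled, its compat twin. */
static int
sk_handles_addgroup(unsigned cmd)
{
	switch (cmd) {
	case SK_CASE_CMD(SK_CMD_ADDGROUP):
		return (1);
	default:
		return (0);
	}
}
```

With compat support compiled out, the helper macro expands to nothing
and the switch keeps a single plain case label.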

Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14960
ac0325b4db68e6658c0ca652e4ca905a15b6a026 30-Mar-2018 brooks <brooks@FreeBSD.org> Use an accessor function to access ifr_data.

This fixes 32-bit compat (no ioctl command definitions are required
as struct ifreq is the same size). This is believed to be sufficient to
fully support ifconfig on 32-bit systems.

Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900
0754c526f1f41925b14391ed2c654b1830b39572 27-Mar-2018 brooks <brooks@FreeBSD.org> Fix access to ifru_buffer on freebsd32.

Make all kernel accesses to ifru_buffer go via access functions
which take the process ABI into account and use an appropriate union
to access members in the correct place in struct ifreq.
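
A minimal sketch of an ABI-aware accessor, assuming the same storage
can be viewed through a native layout or a 32-bit layout. The types
and the caller_is_32bit parameter are hypothetical; the kernel derives
the ABI from the calling process.

```c
#include <stddef.h>
#include <stdint.h>

struct sk_buf_native { size_t   length; void    *buffer; };
struct sk_buf_32     { uint32_t length; uint32_t buffer; };

/* One piece of storage, two possible layouts. */
union sk_req_buf {
	struct sk_buf_native native;
	struct sk_buf_32     compat32;
};

/* Read the length through the layout matching the caller's ABI. */
static size_t
sk_req_buf_length(const union sk_req_buf *rb, int caller_is_32bit)
{
	if (caller_is_32bit)
		return (rb->compat32.length);
	return (rb->native.length);
}
```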

Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14846
9de215608cfe3e871e92c6d6444063dd8be2b5c9 27-Mar-2018 kib <kib@FreeBSD.org> Allow specifying PCP on packets not belonging to any VLAN.

According to 802.1Q-2014, VLAN tagged packets with VLAN id 0 should be
considered as untagged, and only PCP and DEI values from the VLAN tag
are meaningful. See for instance

Make it possible to specify PCP value for outgoing packets on an
ethernet interface. When PCP is supplied, the tag is appended, VLAN
id set to 0, and PCP is filled by the supplied value. The code to do
VLAN tag encapsulation is refactored from the if_vlan.c and moved into

Drivers might have issues with filtering VID 0 packets on
receive. This bug should be fixed for each driver.
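
The priority-only tag described above can be sketched with the 802.1Q
Tag Control Information layout (PCP in bits 15..13, DEI in bit 12,
VLAN id in bits 11..0). The function name is hypothetical.

```c
#include <stdint.h>

/* Build a 16-bit TCI in host byte order from its three fields. */
static uint16_t
sk_vlan_tci(unsigned pcp, unsigned dei, unsigned vid)
{
	return ((uint16_t)(((pcp & 0x7u) << 13) | ((dei & 0x1u) << 12) |
	    (vid & 0xfffu)));
}
```

For a packet outside any VLAN, the caller would pass vid 0, so only
the PCP (and DEI) bits carry information.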

Reviewed by: ae (previous version), hselasky, melifaro
Sponsored by: Mellanox Technologies
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D14702
9c54c9c64c2141c0ed2c6aa495402ca1d7c2bb05 06-Dec-2017 glebius <glebius@FreeBSD.org> Garbage collect IFCAP_POLLING_NOCOUNT. It hasn't been used since the very
beginning of polling(4). The module always ignored return value
from driver polling handler.
4736ccfd9c3411d50371d7f21f9450a47c19047e 20-Nov-2017 pfg <pfg@FreeBSD.org> sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
ce9362dfb845a4f4507925622f04413497a85e44 07-Nov-2017 kib <kib@FreeBSD.org> Add a place for a driver to report rx timestamps in nanoseconds from
boot for the received packets.

The rcv_tstmp field overlaps the place of Ln header length indicators,
not used by received packets. The basic pkthdr rearrangement change
in sys/mbuf.h was provided by gallatin.

There are two accompanying M_ flags: M_TSTMP means that there is the
timestamp (and it was generated by hardware).

Another flag M_TSTMP_HPREC indicates that the timestamp is
high-precision. Practically M_TSTMP_HPREC means that hardware
provided additional precision compared with the stamps when the flag
is not set. E.g., for ConnectX all packets are stamped by hardware
when the PCIe transaction to write out the completion descriptor is
performed, but PTP packets are stamped on port. For Intel cards, when
PTP assist is enabled, only PTP packets are stamped in the limited
number of registers, so if Intel cards ever start supporting this
mechanism, they would always set M_TSTMP | M_TSTMP_HPREC if hardware
timestamp is present for the given packet.
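
A consumer of the two flags might classify a received packet as in the
sketch below. The flag values and names are hypothetical stand-ins for
M_TSTMP and M_TSTMP_HPREC.

```c
#define SK_M_TSTMP       0x1u	/* stand-in: hardware timestamp present */
#define SK_M_TSTMP_HPREC 0x2u	/* stand-in: timestamp is high-precision */

enum sk_tstmp_quality { SK_TS_NONE, SK_TS_HW, SK_TS_HW_HPREC };

/* HPREC only qualifies a timestamp, so it is ignored without TSTMP. */
static enum sk_tstmp_quality
sk_tstmp_quality(unsigned m_flags)
{
	if ((m_flags & SK_M_TSTMP) == 0)
		return (SK_TS_NONE);
	if ((m_flags & SK_M_TSTMP_HPREC) != 0)
		return (SK_TS_HW_HPREC);
	return (SK_TS_HW);
}
```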

Add IFCAP_HWRXTSTMP interface capability to indicate the support for
hardware rx timestamping, and ifconfig(8) command to toggle it.

Based on the patch by: gallatin
Reviewed by: gallatin (previous version), hselasky
Sponsored by: Mellanox Technologies
MFC after: 2 weeks (? mbuf KBI issue)
X-Differential revision: https://reviews.freebsd.org/D12638
7d1e282c9d0926edc2ee8a2957e3511663f335ef 05-Sep-2017 sephe <sephe@FreeBSD.org> if: Add ioctls to get RSS key and hash type/function.

It will be needed by hn(4) to configure its RSS key and hash
type/function in the transparent VF mode in order to match VF's
RSS settings. The description of the transparent VF mode and
the RSS hash value issue are here:

These are generic enough to warrant two independent ioctls instead
of abusing SIOCGDRVSPEC.

Setting RSS key and hash type/function is a different story,
which probably requires more discussion.

Comments about UDP_{IPV4,IPV6,IPV6_EX} were only in the patch
in the review request; these hash types are standardized now.

Reviewed by: gallatin
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12174
7e6cabd06e6caa6a02eeb86308dc0cb3f27e10da 28-Feb-2017 imp <imp@FreeBSD.org> Renumber copyright clause 4

Renumber clause 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistence on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
efa6326974ec2cdb6721fec731bcd86758d0877c 18-Jan-2017 hselasky <hselasky@FreeBSD.org> Implement kernel support for hardware rate limited sockets.

- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.

- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API supports rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.

- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().

- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.

- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.

- How rate limiting works:

1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.

2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.

3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. On the next ip_output() call, allocation of a new "snd_tag" is attempted.

4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
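
Steps 2 and 3 above can be sketched from the driver's point of view.
All types and names here are simplified, hypothetical stand-ins for
struct ifnet, struct m_snd_tag and the mbuf packet header. Freeing the
mbuf is left out.

```c
#include <errno.h>
#include <stddef.h>

struct sk_ifnet   { int unit; };		/* stand-in for struct ifnet */
struct sk_snd_tag { struct sk_ifnet *ifp; };	/* stand-in for m_snd_tag */
struct sk_pkt     { struct sk_snd_tag *snd_tag; };	/* stand-in for pkthdr */

enum sk_queue { SK_QUEUE_DEFAULT, SK_QUEUE_RATELIMITED };

/*
 * Pick a send queue for 'pkt' on interface 'ifp'. Returns 0 on
 * success, or EAGAIN when the tag belongs to another interface, which
 * is how a route change shows up in step 3.
 */
static int
sk_transmit(struct sk_ifnet *ifp, const struct sk_pkt *pkt, enum sk_queue *q)
{
	if (pkt->snd_tag == NULL) {
		*q = SK_QUEUE_DEFAULT;
		return (0);
	}
	if (pkt->snd_tag->ifp != ifp)
		return (EAGAIN);
	*q = SK_QUEUE_RATELIMITED;
	return (0);
}
```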

Reviewed by: wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision: https://reviews.freebsd.org/D3687
Sponsored by: Mellanox Technologies
MFC after: 3 months
af4ff984e7cfca491e9ec471fce250785ae30126 06-Jun-2016 araujo <araujo@FreeBSD.org> Add support for priority code point (PCP), a 3-bit field
which refers to IEEE 802.1p class of service and maps to the frame
priority level.

Values in order of priority are: 1 (Background (lowest)),
0 (Best effort (default)), 2 (Excellent effort),
3 (Critical applications), 4 (Video, < 100ms latency),
5 (Video, < 10ms latency), 6 (Internetwork control) and
7 (Network control (highest)).
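
Because value 1 ranks below value 0, PCP values cannot be compared
numerically. A small lookup table, sketched here with a hypothetical
function name, turns a PCP value into a rank that can be compared.

```c
/*
 * Map a PCP value to its priority rank: 0 is lowest, 7 is highest.
 * Returns -1 for values that do not fit the 3-bit field.
 */
static int
sk_pcp_rank(unsigned pcp)
{
	/* Index is the PCP value; 1 (background) ranks below 0 (best effort). */
	static const int rank[8] = { 1, 0, 2, 3, 4, 5, 6, 7 };

	return (pcp < 8 ? rank[pcp] : -1);
}
```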

Example of usage:
root# ifconfig em0.1 create
root# ifconfig em0.1 vlanpcp 3

The review D801 includes the pf(4) part, but as discussed with kristof,
we won't commit the pf(4) bits for now.
Credit for the original code goes to rwatson.

Differential Revision: https://reviews.freebsd.org/D801
Reviewed by: gnn, adrian, loos
Discussed with: rwatson, glebius, kristof
Tested by: many including Matthew Grooms <mgrooms__shrew.net>
Obtained from: pfSense
Relnotes: Yes
00d578928eca75be320b36d37543a7e2a4f9fbdb 27-May-2016 grehan <grehan@FreeBSD.org> Create branch for bhyve graphics import.
a91e3ef58a004aace2995a6dec20856019ce1c7d 15-Aug-2015 melifaro <melifaro@FreeBSD.org> MFC r270064,r270068,r270069,r270115,r270129,r270287,r270822,r271014,

Add support for reading i2c SFP/SFP+ data from NIC driver and
presenting most interesting fields via ifconfig -v.
This version supports Intel ixgbe driver only.

Tested on: Cisco,Intel,Mellanox,ModuleTech,Molex transceivers

* Add new net/sff8436.h containing constants used to access
QSFP+ data via i2c interface. These constants have been taken
from SFF-8436 "QSFP+ 10 Gbs 4X PLUGGABLE TRANSCEIVER" standard
rev 4.8.
* Add support for printing QSFP+ information from 40G NICs
such as Chelsio T5.

cxl1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
ether 00:07:43:28:ad:08
media: Ethernet 40Gbase-LR4 <full-duplex>
status: active
plugged: QSFP+ 40GBASE-LR4 (MPO Parallel Optic)
vendor: OEM PN: OP-QSFP-40G-LR4 SN: 20140318001 DATE: 2014-03-18
module temperature: 64.06 C voltage: 3.26 Volts
lane 1: RX: 0.47 mW (-3.21 dBm) TX: 2.78 mW (4.46 dBm)
lane 2: RX: 0.20 mW (-6.94 dBm) TX: 2.80 mW (4.47 dBm)
lane 3: RX: 0.18 mW (-7.38 dBm) TX: 2.79 mW (4.47 dBm)
lane 4: RX: 0.90 mW (-0.45 dBm) TX: 2.80 mW (4.48 dBm)

Tested on: Chelsio T5
Tested on: Mellanox/Huawei passive/active cables/transceivers.

Sponsored by: Yandex LLC
69a7dea554e8ce785a94c7019ded96d47838221d 29-Aug-2014 melifaro <melifaro@FreeBSD.org> * Add SIOCGI2C driver ioctl used to retrieve i2c info.
* Convert ixgbe to use this ioctl
* Convert ifconfig to use generic i2c handler for "ix" interfaces.

Approved by: Eric Joyner (ixgbe part)
MFC after: 2 weeks
Sponsored by: Yandex LLC
d32e428cc37439544fc5159604ba10ed560a88c1 29-Jul-2014 glebius <glebius@FreeBSD.org> Garbage collect a couple of unused fields from struct ifaddr:
- ifa_claim_addr() unused since removal of NetAtalk
- ifa_metric seems to be never utilized, always a copy of if_metric
1e3b3008927ebb2708c3f6a87ba9f302ad3c0c66 03-Apr-2014 glebius <glebius@FreeBSD.org> o Provide a compatibility shim for netstat(1) to obtain output queue
drops via NET_RT_IFLISTL sysctl. The sysctl handler appends oqdrops
at the end of struct if_msghdrl, and netstat(1) sees that as an
additional field of struct if_data. This allows us to fetch the data
keeping ABI and API compatibility.
This is direct commit to stable/10.

o Merge r263331 from head, to restore printing of queue drops.

Sponsored by: Nginx, Inc.
Sponsored by: Netflix
b38edcd355dfe9c2ac4080b8837687b0dba7dd41 13-Mar-2014 glebius <glebius@FreeBSD.org> Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data fixed machine independent size. The
notion of data (packet counters, etc) is by no means MD. And it is a
bug that on amd64 we've got 64-bit counters, while on i386 32-bit,
which at modern speeds overflow within a second.

This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bits to the ifi_datalen field. This field was provided to
make future changes to if_data less ABI breaking. Unfortunately the
8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.
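
A rough check of the overflow arithmetic, assuming a counter that
counts bytes on a saturated link. The function name is hypothetical.
By this estimate a 32-bit byte counter wraps in about 3.4 seconds at
10 Gbit/s and in under a second at 40 Gbit/s.

```c
/*
 * Seconds until an unsigned byte counter of 'bits' width wraps on a
 * link running flat out at 'link_bits_per_sec'.
 */
static double
sk_counter_wrap_seconds(unsigned bits, double link_bits_per_sec)
{
	double capacity = 1.0;

	/* capacity = 2^bits, computed in floating point */
	for (unsigned i = 0; i < bits; i++)
		capacity *= 2.0;
	return (capacity / (link_bits_per_sec / 8.0));
}
```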

__FreeBSD_version bumped.

Discussed with: emax
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
eb1a5f8de9f7ea602c373a710f531abbf81141c4 21-Feb-2014 gjb <gjb@FreeBSD.org> Move ^/user/gjb/hacking/release-embedded up one directory, and remove
^/user/gjb/hacking since this is likely to be merged to head/ soon.

Sponsored by: The FreeBSD Foundation
6b01bbf146ab195243a8e7d43bb11f8835c76af8 27-Dec-2013 gjb <gjb@FreeBSD.org> Copy head@r259933 -> user/gjb/hacking/release-embedded for initial
inclusion of (at least) arm builds with the release.

Sponsored by: The FreeBSD Foundation
3c1f482e0e0f1e3715112a75435f2e38eeec0519 11-Nov-2013 glebius <glebius@FreeBSD.org> Remove never used ioctls that originate from KAME. The proof
of their zero usage was exp-run from misc/183538.
1dbe9493b0e886302320923ffb8f5b383fa1c051 06-Nov-2013 glebius <glebius@FreeBSD.org> Provide compat layer for OSIOCAIFADDR.
cb6df3f35cfa18fcb27b94a3a42423ae51d713c0 05-Nov-2013 glebius <glebius@FreeBSD.org> Axe IFF_SMART. Fortunately this layering violating flag was never used,
it was just declared.
3b6f8b896cf04f26955153df6e46b94aebfaf707 05-Nov-2013 glebius <glebius@FreeBSD.org> Drop support for historic ioctls and also undefine them, so that code
that checks their presence via ifdef, won't use them.

Bump __FreeBSD_version as safety measure.
c829949efacb31ac7d8dc8d4df99d801a43a2fec 28-Oct-2013 glebius <glebius@FreeBSD.org> - Make the prophecy from 1997 happen and remove if_var.h inclusion
from if.h.
- Remove unnecessary includes and declarations from if.h
- Remove unnecessary includes and declarations from if_var.h [1]
- Mark some declarations that are about to be removed in near
future with comments, explaining why this declaration is still
- Protect eventhandler declarations with #ifdef SYS_EVENTHANDLER_H.

Obtained from: bdeBSD [1]
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
75528d8e36fb23734af42c83fe710155dc3e2d5c 09-Oct-2013 glebius <glebius@FreeBSD.org> There are some high performance NICs that count statistics in hardware,
and there are ifnets, that do that via counter(9). Provide a flag that
would skip cache line trashing '+=' operation in ether_input().

Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Reviewed by: melifaro, adrian
Approved by: re (marius)
e3737c33e77ed583376ee2f2b90d7f232650d182 24-Aug-2013 andre <andre@FreeBSD.org> Restructure the mbuf pkthdr to make it fit for upcoming capabilities and
features. The changes in particular are:

o Remove rarely used "header" pointer and replace it with a 64bit protocol/
layer specific union PH_loc for local use. Protocols can flexibly overlay
their own 8 to 64 bit fields to store information while the packet is
worked on.

o Mechanically convert IP reassembly, IGMP/MLD and ATM to use pkthdr.PH_loc
instead of pkthdr.header.

o Extend csum_flags to 64bits to allow for additional future offload
information to be carried (e.g. iSCSI, IPsec offload, and others).

o Move the RSS hash type enumerator from abusing m_flags to its own 8bit
rsstype field. Adjust accessor macros.

o Add cosqos field to store Class of Service / Quality of Service information
with the packet. It is not yet supported in any drivers but allows us to
get on par with Cisco/Juniper in routing applications (plus MPLS QoS) with
a modernized ALTQ.

o Add four 8 bit fields l[2-5]hlen to store the relative header offsets
from the start of the packet. This is important for various offload
capabilities and to relieve the drivers from having to parse the packet
and protocol headers to find out location of checksums and other
information. Header parsing in drivers is a lot of copy-paste and
unhandled corner cases which we want to avoid.

o Add another flexible 64bit union to map various additional persistent
packet information, like ether_vtag, tso_segsz and csum fields.
Depending on the csum_flags settings some fields may have different usage
making it very flexible and adaptable to future capabilities.

o Restructure the CSUM flags to better signify their outbound (down the
stack) and inbound (up the stack) use. The CSUM flags used to be a bit
chaotic and rather poorly documented leading to incorrect use in many
places. Bring clarity into their use through better naming.
Compatibility mappings are provided to preserve the API. The drivers
can be corrected one by one and MFC'd without issue.

o The size of pkthdr stays the same at 48/56bytes (32/64bit architectures).

Sponsored by: The FreeBSD Foundation
0dfb309a1fc65341261b94a9852bbd1ee0b58577 17-Oct-2012 emax <emax@FreeBSD.org> provide helper if_initbaudrate() to set if_baudrate and if_baudrate_pf.
again, use ixgbe(4) as an example of how to use new helper function.

Reviewed by: jhb
MFC after: 1 week
214df82afacb6e4f782e0d8090c25db6a7230fdf 16-Oct-2012 emax <emax@FreeBSD.org> introduce concept of ifi_baudrate power factor. the idea is to work
around the problem where high speed interfaces (such as ixgbe(4))
are not able to report real ifi_baudrate. basically, take a spare
byte from struct if_data and use it to store ifi_baudrate power
factor. in other words,

real ifi_baudrate = ifi_baudrate * 10 ^ ifi_baudrate power factor

this should be backwards compatible with old binaries. use ixgbe(4)
as an example on how drivers would set ifi_baudrate power factor
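
The formula above can be sketched as a pair of helpers, assuming a
32-bit base field. The names are hypothetical and this is not the
if_initbaudrate() implementation.

```c
#include <stdint.h>

/* real rate = base * 10^pf; overflow of the result is not checked. */
static uint64_t
sk_baudrate_real(uint64_t base, uint8_t pf)
{
	uint64_t rate = base;

	while (pf-- > 0)
		rate *= 10;
	return (rate);
}

/*
 * Split 'rate' into a 32-bit base and a power factor by dividing by
 * 10 until the base fits. Digits divided away are lost.
 */
static uint32_t
sk_baudrate_encode(uint64_t rate, uint8_t *pf)
{
	*pf = 0;
	while (rate > UINT32_MAX) {
		rate /= 10;
		(*pf)++;
	}
	return ((uint32_t)rate);
}
```

A reader that only knows the base field still sees a plausible, if
scaled-down, value.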

Discussed with: kib, scottl, glebius
MFC after: 1 week
722e0571698d84ca79f966be76d161a8617893f2 29-Jun-2012 jhb <jhb@FreeBSD.org> Hold GIF_LOCK() for almost all of gif_start(). It is required to be held
across in_gif_output() and in6_gif_output() anyway, and once it is held
across those it might as well be held for the entire loop. This simplifies
the code and removes the need for the custom IFF_GIF_WANTED flag (which
belonged in the softc and not as an IFF_* flag anyway).

Tested by: Vincent Hoffman vince unsane co uk
8fa5fc067b8255233ff88ab1968833084df48ee4 12-Jun-2012 rrs <rrs@FreeBSD.org> Oops, forgot to commit the flag.
ac429c704460cebd04cbe8314edf44e51038bc3b 28-May-2012 bz <bz@FreeBSD.org> It turns out that too many drivers are not only parsing the L2/3/4
headers for TSO but also for generic checksum offloading. Ideally we
would only have one common function shared amongst all drivers, and
perhaps when updating them for IPv6 we should introduce that.
Eventually we should provide the meta information along with mbufs to
avoid (re-)parsing entirely.

To not break IPv6 (checksums and offload) and to be able to MFC the
changes without risking to hurt 3rd party drivers, duplicate the v4
framework, as other OSes have done as well.

Introduce interface capability flags for TX/RX checksum offload with
IPv6, to allow independent toggling (where possible). Add CSUM_*_IPV6
flags for UDP/TCP over IPv6, and reserve further for SCTP, and IPv6
fragmentation. Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and
add an alias for CSUM_DATA_VALID_IPV6.

This pretty much brings IPv6 handling in line with IPv4.
TSO is still handled in a different way and not via if_hwassist.

Update ifconfig to allow (un)setting of the new capability flags.
Update loopback to announce the new capabilities and if_hwassist flags.

Individual driver updates will have to follow, as will SCTP.

Reported by: gallatin, dim, ..
Reviewed by: gallatin (glanced at?)
MFC after: 3 days
X-MFC with: r235961,235959,235958
d05091db1d58d82a2d4c3cb7c1d505fd42a0a13f 11-Feb-2012 bz <bz@FreeBSD.org> Introduce a new NET_RT_IFLISTL API to query the address list. It works
on extended and extensible structs if_msghdrl and ifa_msghdrl. This
will allow us to extend both the msghdrl structs and eventually if_data
in the future without breaking the ABI.
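
One common way to keep such structs extensible is to have the producer
publish lengths and offsets, so that consumers never rely on sizeof of
a struct that may grow. The sketch below uses a hypothetical header
layout and names; it is not the if_msghdrl definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message header with self-describing offsets. */
struct sk_msg_hdr {
	uint16_t msglen;	/* total length of this message */
	uint16_t hdr_len;	/* header length as the producer knows it */
	uint16_t data_off;	/* offset of the data block from msg start */
	uint16_t data_len;	/* length of the data block */
};

/*
 * Read the first 64-bit field of the data block, locating it through
 * data_off. Returns 0 on success, -1 on a malformed message.
 */
static int
sk_msg_read_counter(const unsigned char *msg, size_t avail, uint64_t *out)
{
	struct sk_msg_hdr h;

	if (avail < sizeof(h))
		return (-1);
	memcpy(&h, msg, sizeof(h));
	if (h.msglen > avail || h.data_off > h.msglen ||
	    h.data_len > h.msglen - h.data_off || h.data_len < sizeof(*out))
		return (-1);
	memcpy(out, msg + h.data_off, sizeof(*out));
	return (0);
}
```

An older consumer keeps working when the producer grows the header or
appends fields to the data block, since it only follows the offsets.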

Bump __FreeBSD_version to allow ports to more easily detect the new API.

Reviewed by: glebius, brooks
MFC after: 3 days
f55d6eed8c8bfbbc9c0a4422abfe0567efed9473 11-Feb-2012 bz <bz@FreeBSD.org> Backout changes from r228571. Remove if_data from struct ifa_msghdr again.
While this breaks carp on HEAD temporarily, it restores the upgrade path from
stable, and head before 20111215.

Reviewed by: glebius, brooks
653f8c5e7181f0fd06ea5451ebb67351c2dd5626 21-Dec-2011 glebius <glebius@FreeBSD.org> Provide ABI compatibility shim to enable configuring of addresses
with ifconfig(8) prior to r228571.

Requested by: brooks
27a36f6ac8242750daa092abd7180b10d16f4508 16-Dec-2011 glebius <glebius@FreeBSD.org> A major overhaul of the CARP implementation. The ip_carp.c was started
from scratch, copying needed functionality from the old implementation
on demand, with a thorough review of all code. The main change is that
interface layer has been removed from the CARP. Now redundant addresses
are configured exactly on the interfaces, they run on.

The CARP configuration itself is, as before, configured and read via
SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or
SIOCAIFADDR_IN6 may now be configured to a particular virtual host id,
which makes the prefix redundant.

ifconfig(8) semantics have been changed too: now one doesn't need
to clone carpXX interface, he/she should directly configure a vhid
on an Ethernet interface.

To supply vhid data from the kernel to an application the getifaddrs(3)
function has been changed to pass ifam_data with each address. [1]

The new implementation definitely closes all PRs related to carp(4)
being an interface, and may close several others. It also allows
to run a single redundant IP per interface.

Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for
idea on using ifam_data and for several rounds of reviewing!

PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448
Reviewed by: bz
Submitted by: bz [1]
b18bd1101c248172695f243f7b5e0684ca86676b 21-Oct-2011 ed <ed@FreeBSD.org> Add missing #includes.

According to POSIX, these two header files should be able to be included
by themselves, not depending on other headers. The <net/if.h> header
uses struct sockaddr when __BSD_VISIBLE=1, while <netinet/tcp.h> uses
integer datatypes (u_int32_t, u_short, etc).

MFC after: 2 months
9cad5bfef3ce97c030d30e66deb6371458c2281b 03-Jul-2011 bz <bz@FreeBSD.org> Add infrastructure to allow all frames/packets received on an interface
to be assigned to a non-default FIB instance.

You may need to recompile world or ports due to the change of struct ifnet.

Submitted by: cjsp
Submitted by: Alexander V. Chernikov (melifaro ipfw.ru)
(original versions)
Reviewed by: julian
Reviewed by: Alexander V. Chernikov (melifaro ipfw.ru)
MFC after: 2 weeks
X-MFC: use spare in struct ifnet
7cd78b912e49a7054d596bcc33452dd4e104da5a 14-Jun-2011 luigi <luigi@FreeBSD.org> Grab one of the ifcap bits for netmap, and enable printing in ifconfig.

Document the fact that we might want an IFCAP_CANTCHANGE mask,
even though the value is not yet used in sys/net/if.c

(asked on -current a week ago, no feedback so i assume no objection).
6dc48cb05c2ff1b9db74aafddb1a26eb6ce7633c 07-Dec-2010 weongyo <weongyo@FreeBSD.org> Adds IFF_CANTCONFIG to IFF_CANTCHANGE that it shouldn't happen through
33417874f42d859e6925c0cad02eb4a0ade247ca 07-Dec-2010 weongyo <weongyo@FreeBSD.org> Introduces IFF_CANTCONFIG interface flag to indicate that the interface
isn't configurable in a meaningful way. This lets ifconfig(8) and
other tools avoid code changes whenever IFT_USB-like interfaces are
registered in the interface list.

Reviewed by: brooks
No objections: gavin, jkim
8950ed8036d593032cf4434f9e6a3289fc3cdb58 21-Oct-2010 pluknet <pluknet@FreeBSD.org> Reshuffle SIOCGIFCONF32 handler from r155224.

- move all the chunks into one file, which also allows hiding the
global SIOCGIFCONF32 definition.
- replace __amd64__ with proper COMPAT_FREEBSD32 around.
- handle 32bit capacity before going into the handler itself instead of
doing internal 32bit specific changes within it (e.g. as it's done for
- use explicitly sized types for ABI compat.

Approved by: kib (mentor)
MFC after: 2 weeks
09f9c897d33c41618ada06fbbcf1a9b3812dee53 19-Oct-2010 jamie <jamie@FreeBSD.org> A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.
103baea9240b5ccaea44e22c2809b2ed736bf095 02-Apr-2010 qingli <qingli@FreeBSD.org> MFC 205222

Verify interface up status using its link state only
if the interface has such capability. The interface
capability flag indicates whether such capability
exists. This approach is much more backward compatible.
Physical device driver changes will be part of another

Also updated the ifconfig utility to show the LINKSTATE
capability if present.

Reviewed by: rwatson, imp, juli
f1216d1f0ade038907195fc114b7e630623b402c 19-Mar-2010 delphij <delphij@FreeBSD.org> Create a custom branch where I will be able to do the merge.
779a6c8b6b0feba7a2b03303706217b5b2ec7eda 18-Mar-2010 yongari <yongari@FreeBSD.org> MFC r204149:
Add TSO support on VLANs. Intentionally separated IFCAP_VLAN_HWTSO
from IFCAP_VLAN_HWTAGGING. I think some hardware may be able to
TSO over VLAN without VLAN hardware tagging.
Driver changes and userland support will follow.
4ff4954e4e9b86c057c52b9d83cd9b7ba9517e0c 16-Mar-2010 qingli <qingli@FreeBSD.org> Verify interface up status using its link state only
if the interface has such capability. The interface
capability flag indicates whether such capability
exists. This approach is much more backward compatible.
Physical device driver changes will be part of another

Also updated the ifconfig utility to show the LINKSTATE
capability if present.

Reviewed by: rwatson, imp, juli
MFC after: 3 days
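
The check described in the entry above (trust the link state only when the interface advertises that capability) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `ex_` and `EX_` names, the struct layout and the bit values are all hypothetical stand-ins.

```c
/* Hypothetical bit values, for illustration only. */
#define EX_IFF_UP          0x1u	/* administratively up */
#define EX_IFCAP_LINKSTATE 0x1u	/* driver reports link state */

enum ex_link_state { EX_LINK_UNKNOWN, EX_LINK_DOWN, EX_LINK_UP };

/* Hypothetical, reduced view of an interface. */
struct ex_ifstate {
	unsigned flags;
	unsigned capabilities;
	enum ex_link_state link;
};

/*
 * Return 1 if the interface should be treated as usable.
 * The link state is consulted only when the capability bit is set;
 * interfaces without it are judged by the administrative flag alone.
 */
static int
ex_is_usable(const struct ex_ifstate *s)
{
	if ((s->flags & EX_IFF_UP) == 0)
		return (0);
	if ((s->capabilities & EX_IFCAP_LINKSTATE) != 0)
		return (s->link == EX_LINK_UP);
	return (1);
}
```

An interface whose driver never reports link state keeps its old behaviour under this scheme, which is presumably what the message means by "backward compatible".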
2c255a85f1a94490d53a0d50a11a7292322fdebb 26-Feb-2010 delphij <delphij@FreeBSD.org> MFC 203052:

Add interface description capability as inspired by OpenBSD. Thanks to
rwatson@, jhb@, brooks@ and others for feedback on the old implementation!

Sponsored by: iXsystems, Inc.
453676a091f9cfbbcbb1bae93154f6b75893d5c0 20-Feb-2010 yongari <yongari@FreeBSD.org> Add TSO support on VLANs. Intentionally separated IFCAP_VLAN_HWTSO
from IFCAP_VLAN_HWTAGGING. I think some hardware may be able to
TSO over VLAN without VLAN hardware tagging.
Driver changes and userland support will follow.

Reviewed by: thompsa
d9a0cd0982402f9faf826972323ba7e2c92d4da2 27-Jan-2010 delphij <delphij@FreeBSD.org> Revised revision 199201 (add interface description capability as inspired
by OpenBSD), based on comments from many, including rwatson, jhb, brooks
and others.

Sponsored by: iXsystems, Inc.
MFC after: 1 month
7bf8a1b9d646bd7856bd06f8cb4b0812a6e8b925 05-Jan-2010 jhb <jhb@FreeBSD.org> MFC 201196:
Change vlan interfaces to cope more usefully with the parent interface being
renamed. Previously the vlan interfaces would lose their configuration as if
the parent interface had been physically removed. Now vlan interfaces ignore
rename events.
- Add a new ifnet flag (IFF_RENAMING) that is set while an ifnet is being
renamed. This flag can be checked in ifnet departure/arrival event
handlers to treat rename events differently.
- Change the ifnet departure event handler in the if_vlan(4) driver to
ignore departure events due to a trunk interface being renamed.
3ce93dcb7c6607b76d019f9acff610d5389f2b5a 29-Dec-2009 jhb <jhb@FreeBSD.org> Change vlan interfaces to cope more usefully with the parent interface being
renamed. Previously the vlan interfaces would lose their configuration as if
the parent interface had been physically removed. Now vlan interfaces ignore
rename events.
- Add a new ifnet flag (IFF_RENAMING) that is set while an ifnet is being
renamed. This flag can be checked in ifnet departure/arrival event
handlers to treat rename events differently.
- Change the ifnet departure event handler in the if_vlan(4) driver to
ignore departure events due to a trunk interface being renamed.

Reviewed by: brooks, rwatson
MFC after: 1 week
8fed657163fb373990aaa15c79b58a7c963373b2 12-Nov-2009 delphij <delphij@FreeBSD.org> Revert revision 199201 for now as it has introduced a kernel vulnerability
and requires more polishing.
13a19ef806aacb68fca8a06969fe760e790cf191 11-Nov-2009 delphij <delphij@FreeBSD.org> Add interface description capability as inspired by OpenBSD.

MFC after: 3 months
5675a54fb1a409499766ce55a009367c043fdc28 15-Jun-2009 jamie <jamie@FreeBSD.org> Manage vnets via the jail system. If a jail is given the boolean
parameter "vnet" when it is created, a new vnet instance will be created
along with the jail. Network interfaces can be moved between prisons
with an ioctl similar to the one that moves them between vimages.
For now vnets will co-exist under both jails and vimages, but soon
struct vimage will be going away.

Reviewed by: zec, julian
Approved by: bz (mentor)
b523608331b881784ac18a7dfcb65c7a679130b0 30-May-2009 attilio <attilio@FreeBSD.org> When user_frac in the polling subsystem is low it is going to busy the
CPU for a longer period than necessary. Additionally, interfaces are kept
polled (in the tick) even if no more packets are available.
To avoid such situations, a new generic mechanism can be
implemented in a proactive way, keeping track of the time spent on any
packet and fragmenting the time for any tick, stopping the processing
as soon as possible.

In order to implement such mechanism, the polling handler needs to
change, returning the number of packets processed.
While the intended logic is not part of this patch, the polling KPI is
broken by this commit, adding an int return value and the new flag
IFCAP_POLLING_NOCOUNT (which will signal that the return value is
meaningless for the installed handler and checking should be skipped).

Bump __FreeBSD_version in order to signal such situation.

Reviewed by: emaste
Sponsored by: Sandvine Incorporated
c797841f0da1dbdd3f6280d977851c0b16d96d9c 23-Apr-2009 rwatson <rwatson@FreeBSD.org> Add a new interface flag, IFF_DYING, which is set when a device driver
calls if_free(), and remains set if the refcount is elevated. IFF_DYING
skips the bit in the if_flags bitmask previously used by IFF_NEEDSGIANT,
so that an MFC can be done without changing which bit is used, as
IFF_NEEDSGIANT is still present in 7.x.

ifnet_byindex_ref() checks for IFF_DYING and returns NULL if it is set,
preventing new references from being acquired by index, preventing
monitoring sysctls from seeing it. Other lookup mechanisms currently
do not check IFF_DYING, but may need to in the future.

MFC after: 3 weeks
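
The lookup behaviour described in the entry above (refuse to hand out new references once an interface is marked dying) might look like this in outline. It is only a sketch under stated assumptions: the table, the `ex_` names and the flag value are hypothetical, and the locking that real kernel code would need is left out.

```c
#include <stddef.h>

#define EX_IFF_DYING 0x1u	/* hypothetical bit value */
#define EX_MAX_IF    8		/* hypothetical table size */

/* Hypothetical, reduced interface record. */
struct ex_if {
	unsigned flags;
	unsigned refcount;
};

static struct ex_if *ex_table[EX_MAX_IF];

/*
 * Look up an interface by index and take a reference on it.
 * Returns NULL for an empty slot, an out-of-range index, or an
 * interface already marked dying, so no new reference can be
 * acquired while existing holders drain.
 */
static struct ex_if *
ex_byindex_ref(unsigned idx)
{
	struct ex_if *ifp;

	if (idx >= EX_MAX_IF)
		return (NULL);
	ifp = ex_table[idx];
	if (ifp == NULL || (ifp->flags & EX_IFF_DYING) != 0)
		return (NULL);
	ifp->refcount++;
	return (ifp);
}
```

Existing references stay valid after the flag is set; only new acquisitions by index fail.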
a4aa8097eaacf8ec63c16fbc4bb688c225bcb8de 18-Apr-2009 rwatson <rwatson@FreeBSD.org> Remove IFF_NEEDSGIANT interface flag: we no longer provide ifnet-layer
infrastructure to support non-MPSAFE network device drivers.
137b1713f41b2cd8bc372d908230e207061dcc1d 16-Feb-2009 luigi <luigi@FreeBSD.org> remove unnecessary forward declaration
d0cece42b9e5bb36b8b2473caed8df147b13c9c5 23-Dec-2008 kmacy <kmacy@FreeBSD.org> IF_RELENG7 185850:186420

merge latest from 7 stable
19b6af98ec71398e77874582eb84ec5310c7156f 22-Nov-2008 dfr <dfr@FreeBSD.org> Clone Kip's Xen on stable/6 tree so that I can work on improving FreeBSD/amd64
performance in Xen's HVM mode.
cf5320822f93810742e3d4a1ac8202db8482e633 19-Oct-2008 lulf <lulf@FreeBSD.org> - Import the HEAD csup code which is the basis for the cvsmode work.
e4ffb4bcce93ec96d96a47298d3ac371133bf089 28-Aug-2008 jfv <jfv@FreeBSD.org> Fix to bug kern/126850. Only dispatch event handler if the
interface had a parent (was attached).

Reviewed by: EvilSam
MFC after: 1 week
a863ae38807422cb9415976fbe9071fd3e71cd36 07-Aug-2008 jhb <jhb@FreeBSD.org> MFC: Trim some noise from some #ifdef's.
e7ce4e83d342a20ffd8fcfcbc8b0a17b52f7a331 07-Aug-2008 jhb <jhb@FreeBSD.org> MFC: Trim some noise from some #ifdef's.
4ffedc310b77717888890137221df8ea55bf7969 30-Jul-2008 jhb <jhb@FreeBSD.org> Trim some noise from some #ifdef's. This had leaked into the compat32
support for bpf(4) due to hacks in the Y! tree for a truss32 binary
(since superseded by native support for 32-bit binaries in truss itself).

MFC after: 1 week
4c59ed5458f44623d32621d5611d1cb36a81e27b 30-Jul-2008 kmacy <kmacy@FreeBSD.org> MFC IFCAP_TOE
02e03a52bf84af52234247a95d7e73575585f700 03-Apr-2008 sam <sam@FreeBSD.org> MFC: WOL infrastructure support
5d497f8d2b934e27aac1fbda1472650c0008b0d3 01-Apr-2008 iedowse <iedowse@FreeBSD.org> MFC 1.113: Add IFF_NEEDSGIANT to IFF_CANTCHANGE to prevent userland
code from clearing IFF_NEEDSGIANT.
8b81d719e16ff5e6da5d41c553b89216e72864fe 27-Mar-2008 iedowse <iedowse@FreeBSD.org> Add IFF_NEEDSGIANT to IFF_CANTCHANGE, to prevent user-level code
from clearing the IFF_NEEDSGIANT flag on Giant-locked interfaces.
In particular, wpa_supplicant was doing this on USB interfaces,
causing panics when Giant-locked code was then called without Giant.

Submitted by: Alexey Popov
Reviewed by: rwatson
MFC after: 3 days
96bf4f52953dddc12e3490919ba932dcfb5bc76d 15-Dec-2007 kmacy <kmacy@FreeBSD.org> fix bonehead cut and paste error in last commit
d568417c8a243d45d370c377e37b6c67272575f5 15-Dec-2007 kmacy <kmacy@FreeBSD.org> Create separate capability flags for TCP over IPv4 and TCP over IPv6
755846c8cc3dfe3b178e2b0feee3359aa63a85d8 15-Dec-2007 kmacy <kmacy@FreeBSD.org> add interface capability for TOE
08954540f840d7da2963bbfea3f07480e5b0b7d2 10-Dec-2007 sam <sam@FreeBSD.org> Wake On Lan (WOL) infrastructure

Submitted by: Stefan Sperling <stsp@stsp.name>
Reviewed by: brooks
1294243b0904bb1392f338c03b0cc4fd9a775519 11-Jun-2007 andre <andre@FreeBSD.org> Add IFCAP_LRO flag for drivers to announce their TCP Large Receive Offload
ec3f5ae79f183ed96738b13d970bdf0219a1eca5 16-May-2007 brooks <brooks@FreeBSD.org> The struct if_data members ifi_recvquota and ifi_xmitquota have been
unused for ages. Rename them to ifi_spare_char1 and ifi_spare_char2
respectively to indicate this fact.
78898ea5a86dc4f5a06f63d1f2c50c2c6497bef9 02-May-2007 yar <yar@FreeBSD.org> Fix a couple of typos in a comment.
cb05913fd251edc3d35bcbeca73a8b681e2e58e8 06-Sep-2006 andre <andre@FreeBSD.org> First step of TSO (TCP segmentation offload) support in our network stack.

o add IFCAP_TSO[46] for drivers to announce this capability for IPv4 and IPv6
o add CSUM_TSO flag to mbuf pkthdr csum_flags field
o add tso_segsz field to mbuf pkthdr
o enhance ip_output() packet length check to allow for large TSO packets
o extend tcp_maxmtu[46]() with a flag pointer to pass interface capabilities
o adjust all callers of tcp_maxmtu[46]() accordingly

Discussed on: -current, -net
Sponsored by: TCP/IP Optimization Fundraise 2005
ae5965062b9bbb95c18a0e4e157f2a1c1247263f 06-Sep-2006 andre <andre@FreeBSD.org> Improve description of if_capabilities, if_capenable and ifi_hwassist.

Sponsored by: TCP/IP Optimization Fundraise 2005
f5cde2819f76cb3f86ff02a0c422b289ce94a096 19-Jun-2006 mlaier <mlaier@FreeBSD.org> Import interface groups from OpenBSD. This allows grouping interfaces in
order to - for example - apply firewall rules to a whole group of
interfaces. This is required for importing pf from OpenBSD 3.9

Obtained from: OpenBSD (with changes)
Discussed on: -net (back in April)
0f921e0992f543c4aafd5604a99a6edaa059ff36 12-May-2006 jhb <jhb@FreeBSD.org> Remove various bits of conditional Alpha code and fixup a few comments.
f2d6dfc1f897f444ba53a5b9295391cff03ad7fc 15-Feb-2006 ps <ps@FreeBSD.org> MFC:
Implement SIOCGIFCONF for 32bit binaries.

Approved by: re
5025ffa6d7b9fd52c83ce82c2301f94b93127a49 02-Feb-2006 ps <ps@FreeBSD.org> Implement SIOCGIFCONF for 32bit binaries.
19f8b36e662bea1b79c01dab7717540417040328 30-Jan-2006 glebius <glebius@FreeBSD.org> Merge the //depot/user/yar/vlan branch into CVS. It contains some collective
work by yar, thompsa and myself. The checksum offloading part also involves
work done by Mihail Balikov.

The most important changes:

o Instead of global linked list of all vlan softc use a per-trunk
hash. The size of hash is dynamically adjusted, depending on
number of entries. This changes struct ifnet, replacing counter
of vlans with a pointer to trunk structure. This change is an
improvement for setups with big number of VLANs, several interfaces
and several CPUs. It is a small regression for a setup with a single
VLAN interface.
An alternative to dynamic hash is a per-trunk static array with
4096 entries, which is a compile time option - VLAN_ARRAY. In my
experiments the array is not an improvement, probably because such
a big trunk structure doesn't fit into CPU cache.
o Introduce an UMA zone for VLAN tags. Since drivers depend on it,
the zone is declared in kern_mbuf.c, not in optional vlan(4) driver.
This change is a big improvement for any setup utilizing vlan(4).
o Use rwlock(9) instead of mutex(9) for locking. We are the first
ones to do this! :)
o Some drivers can do hardware VLAN tagging + hardware checksum
offloading. Add an infrastructure for this. Whenever vlan(4) is
attached to a parent or parent configuration is changed, the flags
on vlan(4) interface are updated.

In collaboration with: yar, thompsa
In collaboration with: Mihail Balikov <mihail.balikov interbgc.com>
97d261903e19a9ccc27633a4c5757894fc682cb5 07-Oct-2005 glebius <glebius@FreeBSD.org> Big overall MFC of polling(4) cleanup:

o First attempt on removing Giant from polling. Details:
o Second attempt, and big polling cleanup including:
- Functional approach to turning polling on/off
- Deprecating of poll_in_trap
- Removal of ifnet knowledge from kern_poll.c
o Improved checking of user configurable sysctls. Details:
o Moving DEVICE_POLLING from opt_global.h to opt_device_polling.h:

o All related documentation fixes.

Approved by: re (kensmith)
Thanks to: everyone, who helped with testing
f41a83bf429b15386f43f43f3f5326d4ece7bfce 01-Oct-2005 glebius <glebius@FreeBSD.org> Big polling(4) cleanup.

o Axe poll in trap.

o Axe IFF_POLLING flag from if_flags.

o Rework revision 1.21 (Giant removal), in such a way that
poll_mtx is not dropped during call to polling handler.
This fixes problem with idle polling.

o Make registration and deregistration from polling in a
functional way, instead of next tick/interrupt.

o Obsolete kern.polling.enable. Polling is turned on/off
with ifconfig.

Detailed kern_poll.c changes:
- Remove polling handler flags, introduced in 1.21. They are not
needed now.
- Forget and do not check if_flags, if_capenable and if_drv_flags.
- Call all registered polling handlers unconditionally.
- Do not drop poll_mtx, when entering polling handlers.
- In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
- In netisr_poll() axe the block, where polling code asks drivers
to unregister.
- In netisr_poll() and ether_poll() do polling always, if any
handlers are present.
- In ether_poll_[de]register() remove a lot of error hiding code. Assert
that arguments are correct, instead.
- In ether_poll_[de]register() use standard return values in case of
error or success.
- Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
poll_switch() goes through interface list and enables/disables polling.
A message that kern.polling.enable is deprecated is printed.

Detailed driver changes:
- On attach driver announces IFCAP_POLLING in if_capabilities, but
not in if_capenable.
- On detach driver calls ether_poll_deregister() if polling is enabled.
- In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
flag. If it is not set, the driver unlocks and returns.
- In ioctl handler driver checks for IFCAP_POLLING flag requested to
be set or cleared. Driver first calls ether_poll_[de]register(), then
obtains driver lock and [dis/en]ables interrupts.
- In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
If present, then returns. This is important to protect from spurious

Reviewed by: ru, sam, jhb
03e9f7fc13724ddac99c2cede9a3ee12665abcda 25-Aug-2005 rwatson <rwatson@FreeBSD.org> Merge if.h:1.98 from HEAD to RELENG_6:

For each interface flag, indicate whether or not it is owned by the
device driver, owned by the network stack, or initialized by the device
driver before attach and read-only from then on.

Not all device drivers and network stack components currently follow
these rules, especially with respect to IFF_UP, and a few exceptions

Approved by: re (scottl)
8f8fa61d9b908491aa25431ee2d79e509e637f4a 25-Aug-2005 rwatson <rwatson@FreeBSD.org> Merge if.c:1.242, if.h:1.97, if_var.h:1.102, rtsock.c:1.125 from HEAD
to RELENG_6:

and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack. The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.

Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names. Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.

When exposing the interface flags to user space, via interface ioctls
or routing sockets, bitwise-OR the two fields together. Since the driver flags
cannot currently be set from user space, no new logic is currently
required to handle this case.

Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.

With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking. Most were likely
never triggered.

Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.

Reviewed by: pjd, bz

Approved by: re (scottl)
76ad033815b8ee606dfb2c8a6cb7f369f869754c 09-Aug-2005 rwatson <rwatson@FreeBSD.org> For each interface flag, indicate whether or not it is owned by the
device driver, owned by the network stack, or initialized by the device
driver before attach and read-only from then on.

Not all device drivers and network stack components currently follow
these rules, especially with respect to IFF_UP, and a few exceptions

MFC after: 7 days
74759aaa78777146f23aa05c856f574efdfb41d9 09-Aug-2005 rwatson <rwatson@FreeBSD.org> Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack. The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.

Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names. Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.

When exposing the interface flags to user space, via interface ioctls
or routing sockets, bitwise-OR the two fields together. Since the driver flags
cannot currently be set from user space, no new logic is currently
required to handle this case.

Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.

With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking. Most were likely
never triggered.

Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.

Reviewed by: pjd, bz
MFC after: 7 days
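
The flag split described in the entry above depends on the stack-owned and driver-owned bit sets never overlapping, so that user space can see them as one word. A minimal sketch of that idea follows; the `ex_`/`EX_` names, the reduced struct and the bit values are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; not copied from if.h. */
#define EX_IFF_UP          0x0001u	/* stack-owned */
#define EX_IFF_DRV_RUNNING 0x0040u	/* driver-owned */
#define EX_IFF_DRV_OACTIVE 0x0400u	/* driver-owned */
#define EX_DRV_MASK        (EX_IFF_DRV_RUNNING | EX_IFF_DRV_OACTIVE)

/* Hypothetical, reduced struct ifnet. */
struct ex_ifnet {
	uint32_t if_flags;	/* owned by the network stack */
	uint32_t if_drv_flags;	/* owned by the device driver */
};

/*
 * Build the single flag word exposed to user space.
 * The assertions document the invariant: a bit lives in exactly
 * one of the two fields, so the OR loses no information.
 */
static uint32_t
ex_user_flags(const struct ex_ifnet *ifp)
{
	assert((ifp->if_flags & EX_DRV_MASK) == 0);
	assert((ifp->if_drv_flags & ~EX_DRV_MASK) == 0);
	return (ifp->if_flags | ifp->if_drv_flags);
}
```

Because the sets are disjoint, a user-space tool can keep testing the old flag names against the combined word without knowing which field a bit came from.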
6c5bdda300f45e4abacd6f3dbf4663bbfdfefa35 05-Jun-2005 thompsa <thompsa@FreeBSD.org> Add hooks into the networking layer to support if_bridge. This changes struct
ifnet so a buildworld is necessary.

Approved by: mlaier (mentor)
Obtained from: NetBSD
e5a9c072c1563b17574a8123220f8773afeedcee 25-Feb-2005 brooks <brooks@FreeBSD.org> Change the definition of struct if_data's member ifi_epoch from wall
clock time to uptime because wall clock time may go backwards.

This is a change in the API which will impact SNMP agents who are using
ifi_epoch to set RFC2233's ifCounterDiscontinuityTime. None are known to
exist today. This will not impact applications that are using the
<index, epoch> tuple to verify interface uniqueness except that it
eliminates a race which could lead to a false assumption of uniqueness.

Because this is a behavior change, bump __FreeBSD_version.

Discussed with: re (jhb, scottl)
MFC after: 3 days
Pointed out by: pkh (way back at EuroBSDCon)
Pointy hat: brooks
a50ffc29129a52835a39bf4868cd5facdc7dce30 07-Jan-2005 imp <imp@FreeBSD.org> /* -> /*- for license, minor formatting changes
f5e433d72b353827585dc716ec1c9bf236d0e3ec 17-Nov-2004 jmg <jmg@FreeBSD.org> sync comment on IFF_OACTIVE with reality.. IFF_OACTIVE is set when the
hardware cannot take any more packets, and so will suppress the calling of
the device's if_start method...

Submitted by: bde
143d77da28b1eae3998141ecdbf493f35562a44e 08-Sep-2004 brooks <brooks@FreeBSD.org> Re-add ifi_epoch, to struct if_data, this time replacing ifi_unused
to avoid ABI changes. It is set to the last time the interface
counters were zeroed, currently the time if_attach() was called. It is
intended to be a valid value for RFC2233's ifCounterDiscontinuityTime
and to make it easier for applications to verify that the interface they
find at a given index is the one that was there last time they looked.

Due to space constraints ifi_epoch is a time_t rather than a struct
timeval. SNMP would prefer higher precision, but this is unlikely to be
useful in practice.
9baee722362720c44fff4fcd638e44879a8bfda6 02-Sep-2004 brooks <brooks@FreeBSD.org> Back out ifi_epoch. The ABI breakage is too disruptive this close to
5-STABLE. ifi_epoch will shortly be reintroduced with less precision
using the space currently allocated to ifi_unused.
ba918da2a51c9e7f2352076e0fbab5f3a82f8104 01-Sep-2004 brooks <brooks@FreeBSD.org> Use a spare byte in struct if_data to store the structure size without
increasing it. Add code to ifconfig to use this size to find the
sockaddr_dl after the struct if_data in the routing message. This
allows struct if_data to grow (up to 255 bytes) without breaking

Submitted by: peter
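
The technique in the entry above, recording the structure's own size in a spare byte so a reader can skip over it and find whatever follows, can be sketched as below. The struct layout, the field names and the helper are hypothetical; only the idea of a one-byte self-describing length (hence the 255-byte ceiling) comes from the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified header; only the length byte matters here. */
struct ex_if_data {
	uint8_t ex_type;
	uint8_t ex_datalen;	/* size of the structure as the sender built it */
	uint8_t ex_rest[6];
};

/*
 * Return a pointer to the data that follows the header in buf,
 * using the length recorded by the sender rather than the reader's
 * sizeof(). Returns NULL if the recorded length is implausible or
 * leaves no bytes after the header.
 */
static const uint8_t *
ex_after_if_data(const uint8_t *buf, size_t buflen)
{
	const size_t off = offsetof(struct ex_if_data, ex_datalen);
	size_t datalen;

	if (buf == NULL || buflen <= off)
		return (NULL);
	datalen = buf[off];
	if (datalen <= off || datalen >= buflen)
		return (NULL);
	return (buf + datalen);
}
```

A reader compiled against the 8-byte layout above still finds the trailing data when the sender's structure has grown to, say, 12 bytes, because it never relies on its own `sizeof`.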
922e581a21e3c43e13da68f2584a1e249ae79fa3 30-Aug-2004 brooks <brooks@FreeBSD.org> Add a new variable, ifi_epoch, to struct if_data. It is set to the last
time the interface counters were zeroed, currently the time if_attach()
was called. It is intended to be a valid value for RFC2233's
ifCounterDiscontinuityTime and to make it easier for applications to
verify that the interface they find at a given index is the one that was
there last time they looked.

An if_epoch "compatibility" macro has not been created as ifi_epoch has
never been a member of struct ifnet.

Approved by: andre, bms, wollman
b463bc6c336f88c5c53b54a13c72ffd11be29e4e 27-Jul-2004 rwatson <rwatson@FreeBSD.org> Add a new network interface flag, IFF_NEEDSGIANT, which will allow
device drivers to declare that the ifp->if_start() method implemented
by the driver requires Giant in order to operate correctly.

Add a 'struct task' to 'struct ifnet' that can be used to execute a
deferred ifp->if_start() in the event that if_start needs to be called
in a Giant-free environment. To do this, introduce if_start(), a
wrapper function for ifp->if_start(). If the interface can run MPSAFE,
it directly dispatches into the interface start routine. If it can't
run MPSAFE, we're running with debug.mpsafenet != 0, and Giant isn't
currently held, the task is queued to execute in a swi holding Giant
via if_start_deferred().

Modify if_handoff() to use if_start() instead of direct dispatch.
Modify 802.11 to use if_start() instead of direct dispatch.

This is intended to provide increased compatibility for non-MPSAFE
network device drivers in the presence of Giant-free operation via
asynchronous dispatch. However, this commit does not mark any network
interfaces as IFF_NEEDSGIANT.
e1dd867b5532da103ae1459a89ca3df2b8b6f0f6 22-Jun-2004 brooks <brooks@FreeBSD.org> Major overhaul of pseudo-interface cloning. Highlights include:

- Split the code out into if_clone.[ch].
- Locked struct if_clone. [1]
- Add a per-cloner match function rather than simply matching names of
the form <name><unit> and <name>.
- Use the match function to allow creation of <interface>.<tag>
vlan interfaces. The old way is preserved unchanged!
- Also use the match function to allow creation of stf(4) interfaces named
stf0, stf, or 6to4. This is the only major user visible change in
that "ifconfig stf" creates the interface stf rather than stf0 and
does not print "stf0" to stdout.
- Allow destroy functions to fail so they can refuse to delete
interfaces. Currently, we forbid the deletion of interfaces which
were created in the init function, particularly lo0, pflog0, and
pfsync0. In the case of lo0 this was a panic implementation so it
does not count as a user visible change. :-)
- Since most interfaces do not need the new functionality, a family of
wrapper functions, ifc_simple_*(), were created to wrap old style
cloner functions.
- The IF_CLONE_INITIALIZER macro is replaced with a new incompatible

Submitted by: Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1]
Reviewed by: andre, mlaier
Discussed on: net
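
The per-cloner match functions described in the entry above can be illustrated with two small matchers: a default one that accepts `<name>` or `<name><unit>`, and a custom one in the spirit of the stf(4) case. These are sketches only; the function names and signatures are hypothetical and are not the if_clone interface.

```c
#include <ctype.h>
#include <string.h>

/*
 * Default rule: accept exactly `base`, or `base` followed only by
 * decimal digits (the unit number). Returns 1 on match, 0 otherwise.
 */
static int
ex_simple_match(const char *base, const char *ifname)
{
	size_t n = strlen(base);
	const char *cp;

	if (strncmp(base, ifname, n) != 0)
		return (0);
	for (cp = ifname + n; *cp != '\0'; cp++) {
		if (!isdigit((unsigned char)*cp))
			return (0);
	}
	return (1);
}

/*
 * Custom rule for a cloner that also answers to an alias, as the
 * message describes for names such as stf0, stf, or 6to4.
 */
static int
ex_stf_match(const char *ifname)
{
	if (strcmp(ifname, "6to4") == 0)
		return (1);
	return (ex_simple_match("stf", ifname));
}
```

Giving each cloner its own matcher is what lets one driver claim names that do not fit the `<name><unit>` pattern, while the default rule still serves the common case.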
25ae331e1251bd9562a10f53fdb2f054f7b21dfe 03-May-2004 andre <andre@FreeBSD.org> Link state change notification of ethernet media to the routing socket.

o Extend the if_data structure with an ifi_link_state field and
provide the corresponding defines for the valid states.

o The mii_linkchg() callback updates the ifi_link_state field
and calls rt_ifmsg() to notify listeners on the routing socket
in addition to the kqueue KNOTE.

o If vlans are configured on a physical interface notify and update
all vlan pseudo devices as well with the vlan_link_state() callback.

No objections by: sam, wpaul, ru, bms
Brucification by: bde
dd9ed984465845ba1807ad894b5cf81130230e40 11-Apr-2004 ru <ru@FreeBSD.org> Added the new interface capability option for drivers that implement
user-configurable polling(4) support. Make ifconfig(8) aware of it.

Suggested by: luigi
b49b7fe7994689a25dfc2162fe02f1d030360089 07-Apr-2004 imp <imp@FreeBSD.org> Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
f1e94c6f29b079e4ad9d9305ef3e90a719bcbbda 31-Oct-2003 brooks <brooks@FreeBSD.org> Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration semantics.

Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
e7ebf2c29a49deea4eaab17c9af60539b957ba24 01-Oct-2003 ru <ru@FreeBSD.org> By popular demand, added the "static ARP" per-interface option.
7092aea8c374b04d088d4a58833928c81805dd36 14-Nov-2002 sam <sam@FreeBSD.org> o add IF_*bps macros for netbsd compatibility
o add interface capabilities for vlan use and to signal jumbo frame support

Reviewed by: many
Approved by: re
73d23540ae1d0a77fa8be27264e676840fd5b15c 02-Oct-2002 mike <mike@FreeBSD.org> style(9):
o Align members of struct if_nameindex.
o Align and sort function prototypes.
7849239d924261f9d43a6afd7a1a628b20a9fbe6 02-Oct-2002 mike <mike@FreeBSD.org> Use standards visibility conditionals to conditionalize most of this
header (details on how the visibility conditionals work are available
in <sys/cdefs.h>). Use standard types instead of BSD specific ones,
so that this header compiles in the standards case (specifically this
means changing `u_int' to `unsigned int').
d61cac74b0dc667b645880b386219cacfb3c53f6 27-Sep-2002 phk <phk@FreeBSD.org> Add the "Monitor" interface flag.

Setting this flag on an ethernet interface blocks transmission of packets
and discards incoming packets after BPF processing.

This is useful if you want to monitor network traffic but not interact
with the network in question.

Sponsored by: http://www.babeltech.dk
fb383aafc75ac488266b7800dad41a7c1ece20f8 28-Aug-2002 sobomax <sobomax@FreeBSD.org> Add IFF_POLLING into the list of flags which are protected from changing via

MFC after: 1 day
e50e3b03ec109b238fb7d960a5be0b08961b41d4 19-Aug-2002 sobomax <sobomax@FreeBSD.org> Implement user-settable promiscuous mode (a new `promisc' flag for ifconfig(8)).
Also, for all interfaces in this mode pass all ethernet frames to upper layer,
even those not addressed to our own MAC, which allows packets encapsulated
in those frames be processed with packet filters (ipfw(8) et al).

Emphatically requested by: Anton Turygin <pa3op@ukr-link.net>
Valuable suggestions by: fenner
f6cebc060671b6c67f52080c35a0e55d5498cbf0 18-Aug-2002 sobomax <sobomax@FreeBSD.org> Increase size of ifnet.if_flags from 16 bits (short) to 32 bits (int). To avoid
breaking application ABI use unused ifreq.ifru_flags[1] for upper 16 bits in

Reviewed by: -hackers, -net
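
The widening described in the entry above keeps the old 16-bit slot for the low half of the flags and uses the previously unused second slot for the high half. A minimal sketch of splitting and recombining such a word follows; the struct and helper names are hypothetical stand-ins, with only the two-element `ifru_flags` layout taken from the message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the flags portion of struct ifreq. */
struct ex_ifreq_flags {
	short ifru_flags[2];	/* [0] = low 16 bits, [1] = high 16 bits */
};

/* Split a 32-bit flag word across the two 16-bit slots. */
static void
ex_store_flags(struct ex_ifreq_flags *r, uint32_t flags)
{
	/* Values above 0x7fff wrap to negative shorts on common compilers. */
	r->ifru_flags[0] = (short)(flags & 0xffff);
	r->ifru_flags[1] = (short)(flags >> 16);
}

/* Recombine the two slots into one 32-bit flag word. */
static uint32_t
ex_load_flags(const struct ex_ifreq_flags *r)
{
	/* Go through uint16_t so a negative short does not sign-extend. */
	return ((uint32_t)(uint16_t)r->ifru_flags[0] |
	    ((uint32_t)(uint16_t)r->ifru_flags[1] << 16));
}
```

An old binary that reads only slot 0 still sees the original 16 flags unchanged, which is how the approach avoids breaking the application ABI.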
6cfd5a5a1d8c4a93c799f5e36a35e82908de3464 25-May-2002 brooks <brooks@FreeBSD.org> Move all unit number management for cloned interfaces into the cloning
code. This reverts the API change which made the <if>_clone_destroy()
functions return an int instead of void bringing us into closer
alignment with NetBSD.

Reviewed by: net (a long time ago)
5e19174e4ef09b116ed7a1c94be4212a11f99492 20-May-2002 iedowse <iedowse@FreeBSD.org> Avoid exposing struct if_clone and the sys/queue.h macros to userland
programs by restricting these to the case where _KERNEL is defined.

Reviewed by: brooks (ages ago)
c9985516e46bc6cccc11eac067da81d7968b7700 19-Mar-2002 alfred <alfred@FreeBSD.org> Remove __P.
9a5e4c88a3c8f122fd226c41249210b7d9a95ab7 11-Mar-2002 mux <mux@FreeBSD.org> Simplify the interface cloning framework by handling
unit allocation with a bitmap in the generic layer. This
allows us to get rid of the duplicated rman code in every
clonable interface.

Reviewed by: brooks
Approved by: phk
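
A minimal sketch of bitmap-based unit allocation, as described in the entry above. This is not the committed code: the type demo_unit_map, the limit DEMO_MAXUNIT and both functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_MAXUNIT 64	/* hypothetical upper bound on unit numbers */

/* One bit per unit number; a set bit means the unit is in use. */
struct demo_unit_map {
	unsigned char bits[DEMO_MAXUNIT / CHAR_BIT];
};

/* Return the lowest free unit and mark it used, or -1 if none is free. */
static int demo_unit_alloc(struct demo_unit_map *m)
{
	assert(m != NULL);
	for (int u = 0; u < DEMO_MAXUNIT; u++) {
		unsigned char mask = (unsigned char)(1u << (u % CHAR_BIT));

		if ((m->bits[u / CHAR_BIT] & mask) == 0) {
			m->bits[u / CHAR_BIT] |= mask;
			return u;
		}
	}
	return -1;
}

/* Mark a previously allocated unit as free again. */
static void demo_unit_free(struct demo_unit_map *m, int u)
{
	assert(m != NULL && u >= 0 && u < DEMO_MAXUNIT);
	m->bits[u / CHAR_BIT] &= (unsigned char)~(1u << (u % CHAR_BIT));
}
```

Keeping the map in one generic layer means each cloner only asks for a unit and hands it back on destroy, rather than carrying its own resource bookkeeping.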
50d3be4c82e6c70eac43734b67628e7a27fa3e24 04-Mar-2002 brooks <brooks@FreeBSD.org> Change the network interface cloning API so the destroy function returns
an int errorcode instead of void in preparation for merging cloning of
the loopback device.

Submitted by: mux
MFC after: 2 weeks
7bbde3fb1fe2ff7b5efd1426bdbd435faa950f70 18-Jan-2002 ru <ru@FreeBSD.org> Introduce an interface announcement message for the routing
socket so that routing daemons and other interested parties
know when an interface is attached/detached.

PR: kern/33747
Obtained from: NetBSD
MFC after: 2 weeks
f8ad22919e217e5aa0f3f7a246fc37aaee182364 14-Dec-2001 luigi <luigi@FreeBSD.org> Device Polling code for -current.

Non-SMP, i386-only, no polling in the idle loop at the moment.

To use this code you must compile a kernel with


and at runtime enable polling with

sysctl kern.polling.enable=1

The percentage of CPU reserved to userland can be set with

sysctl kern.polling.user_frac=NN (default is 50)

while the remainder is used by polling device drivers and netisr's.
These are the only two variables that you should need to touch. There
are a few more parameters in kern.polling but the default values
are adequate for all purposes. See the code in kern_poll.c for
more details on them.

Polling in the idle loop will be implemented shortly by introducing
a kernel thread which does the job. Until then, the amount of CPU
dedicated to polling will never exceed (100-user_frac).
The equivalent (actually, better) code for -stable is at


and also supports polling in the idle loop.

NOTE to Alpha developers:
There is really nothing in this code that is i386-specific.
If you move the 2 lines supporting the new option from
sys/conf/{files,options}.i386 to sys/conf/{files,options} I am
pretty sure that this should work on the Alpha as well, just that
I do not have a suitable test box to try it. If someone feels like
trying it, I would appreciate it.

NOTE to other developers:
sure some things could be done better, and as always I am open to
constructive criticism, which a few of you have already given and
I greatly appreciated.
However, before proposing radical architectural changes, please
take some time to possibly try out this code, or at the very least
read the comments in kern_poll.c, especially re. the reason why I
am using a soft netisr and cannot (I believe) replace it with a
simple timeout.

Quick description of files touched by this commit:

new file kern/kern_poll.c
new option
poll in trap (disabled by default)
initialization and hardclock hooks.
minor swi_net changes
the bulk of the code.
new flag
declaration for functions used in device drivers.
device driver modifications
85e1c0879143bd206c275035a79c73a212654cfa 17-Oct-2001 jlemon <jlemon@FreeBSD.org> Add a SIOCGIFINDEX ioctl, which returns the index of a named interface.
This will be used to more efficiently support if_nametoindex(3).
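
From userland, the name-to-index lookup mentioned above is normally reached through if_nametoindex(3). The wrapper below is a small usage sketch; demo_ifindex is a hypothetical name, and whether the library call is backed by this ioctl on a given system is not something the sketch depends on.

```c
#include <assert.h>
#include <stddef.h>
#include <net/if.h>

/*
 * Return the index of the named interface, or 0 if no interface
 * with that name exists (0 is never a valid interface index).
 */
static unsigned int demo_ifindex(const char *name)
{
	assert(name != NULL);
	return if_nametoindex(name);
}
```

A caller would pass a name such as the system's loopback interface and treat a return value of 0 as "not found".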
28193b25ab7bec0c525aac3dc9957b0199098d20 18-Sep-2001 jlemon <jlemon@FreeBSD.org> Split HWCSUM into two components: RX and TX, for the benefit of drivers
which can only do checksum offloading in one direction.
131e3ad4ce7736c260d62c4f5f2ff41a46dc7de0 18-Sep-2001 jlemon <jlemon@FreeBSD.org> Add two fields to the ifnet structure indicating what extra capabilities
a network device has, and which ones are enabled.
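
The two entries above (a pair of capability fields, and separate RX and TX checksum bits) fit together roughly as in the sketch below. All names and bit values here are hypothetical stand-ins prefixed with demo_ or DEMO_, not the definitions from if.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits: checksum offload split by direction. */
#define DEMO_CAP_RXCSUM	0x1u
#define DEMO_CAP_TXCSUM	0x2u
#define DEMO_CAP_HWCSUM	(DEMO_CAP_RXCSUM | DEMO_CAP_TXCSUM)

/* Hypothetical stand-in for the two fields added to the ifnet structure. */
struct demo_ifnet {
	uint32_t capabilities;	/* what the device is able to do */
	uint32_t capenable;	/* which of those are switched on */
};

/*
 * Enable the requested capabilities, limited to what the device
 * supports. Returns the bits that were actually granted.
 */
static uint32_t demo_cap_enable(struct demo_ifnet *ifp, uint32_t want)
{
	uint32_t granted;

	assert(ifp != NULL);
	granted = want & ifp->capabilities;
	ifp->capenable |= granted;
	/* Invariant: nothing is enabled that the device cannot do. */
	assert((ifp->capenable & ~ifp->capabilities) == 0);
	return granted;
}
```

With separate bits, a driver for hardware that can only checksum on transmit advertises just the TX bit, and a request for both directions is granted only in part.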
5596676e6c6c1e81e899cd0531f9b1c28a292669 12-Sep-2001 julian <julian@FreeBSD.org> KSE Milestone 2
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha
5da97d80e2d7042b9d86959519aca3d58066ca21 02-Jul-2001 brooks <brooks@FreeBSD.org> Add kernel infrastructure for network device cloning.

Reviewed by: ru, ume
Obtained from: NetBSD
MFC after: 1 week
ab5676fc870d2d819cf41120313443182db079cf 21-Feb-2001 rwatson <rwatson@FreeBSD.org> o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritance to be a function of credential inheritance.
o Abstract prison structure reference counting behind pr_hold() and
pr_free(), invoked by the similarly named credential reference
management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
mutex use.


o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
required to protect the reference count plus some fields in the

Reviewed by: freebsd-arch
Obtained from: TrustedBSD Project
7d76aced28ec9c258bd533a4e33516f71f44b6de 06-Feb-2001 asmodai <asmodai@FreeBSD.org> Fix typo: compatability -> compatibility.

Compatability is not an existing English word.
b42951578188c5aab5c9f8cbcde4a743f8092cdc 02-Apr-2000 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'ALSA'.
0dcc5bc0d1cca22e0204f9b9da39474b95100992 27-Mar-2000 jlemon <jlemon@FreeBSD.org> Add support for offloading IP/TCP/UDP checksums to NIC hardware which
supports them.
15b9bcb121e1f3735a2c98a11afdb52a03301d7e 29-Dec-1999 peter <peter@FreeBSD.org> Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistent with the other
BSDs, who made this change quite some time ago. More commits to come.
7efc91cadcfeb421fc4d02ba94db784616f3714c 05-Nov-1999 shin <shin@FreeBSD.org> KAME related header files additions and merges.
(only those which don't affect c source files so much)

Reviewed by: cvs-committers
Obtained from: KAME project
3b842d34e82312a8004a7ecd65ccdb837ef72ac1 28-Aug-1999 peter <peter@FreeBSD.org> $Id$ -> $FreeBSD$
d20b3b092821e3562a5d044181e5c7829276dc91 05-Jul-1999 bde <bde@FreeBSD.org> Fixed English errors, spelling errors and formatting errors in rev.1.51
and rev.1.53.
1048fa73010ca9dfb0cfba83b166fcfa2ef994be 19-Jun-1999 phk <phk@FreeBSD.org> Add a new interface ioctl, to return "aux status".

This is intended to allow ifconfig to print various unstructured
information from an interface.

The data is returned from the kernel in ASCII form, see the comment in
if.h for some technicalities.

Canonical cut&paste example to be found in if_tun.c

Initial use:
Now tun* interfaces tell the PID of the process which opened them.

Future uses could be (volunteers welcome!):
Have ppp/slip interfaces tell which tty they use.
Make sync interfaces return their media state: red/yellow/blue
alarm, timeslot assignment and so on.
Make ethernets warn about missing heartbeats and/or cables
0098c16802b914fdce6d769f5d8614afd9752297 06-Jun-1999 phk <phk@FreeBSD.org> Introduce IFF_SMART bit.

This means that the driver will add/delete routes when it knows it is
up/down, rather than have the generic code believe it is up if configured.

This is probably most useful for serial lines, although many PHY chips
could probably tell us if we're connected to the cable/hub as well.
31167e1a820f19bcb6a06bf97cbead60ae7105d6 08-May-1999 phk <phk@FreeBSD.org> Fix some disordering I introduced with the jail code.
ca21a25f173ed030b0093e4d83140e3b0b43db01 28-Apr-1999 phk <phk@FreeBSD.org> This Implements the mumbled about "Jail" feature.

This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

I have no scripts for setting up a jail, don't ask me for them.

The IP number should be an alias on one of the interfaces.

mount a /proc in each jail, it will make ps more useable.

/proc/<pid>/status tells the hostname of the prison for
jailed processes.

Quotas are only sensible if you have a mountpoint per prison.

There are no provisions for stopping resource-hogging.

Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/
79b9e4e72513965017026325591cc1c2b139deb2 19-Feb-1999 phk <phk@FreeBSD.org> Since ifru_flags is a short, we can fit in a copy of the flags
before they got changed. This can help eliminate much of the
gymnastics drivers do in their ioctl routines to figure this out.

Remove commented out IFF_NOTRAILERS
e4c65b8a7fc8cb3176920d5808c9099c69367508 21-Mar-1998 peter <peter@FreeBSD.org> On most other systems "out there", <net/if.h> does not require the caller
to #include <sys/time.h> first. I've lost count of the number of times
I've had to patch this in porting code. The problem is the
"struct timeval ifi_lastchange" in the mib stats. (most other systems don't
have this, until 4.4bsd anyway).
6ce595a40c4ac6011201f812617639a05193a421 13-Jan-1998 wollman <wollman@FreeBSD.org> Add a macro to accurately calculate the length of a struct ifreq when
it contains an address. This can replace all the myriad (wrong) ways
in which this task is performed in the current system. As an added
bonus, since it's a macro, then third-party software vendors have an easy
way to tell whether it's there or not. (This will become necessary
when sizeof(struct sockaddr) is increased, and also when additional
fields are added to struct ifreq.)
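
The length calculation this entry refers to can be sketched as follows, assuming a BSD-style sockaddr that carries its own length in a leading sa_len byte. The structures are simplified stand-ins, the names are hypothetical, and the logic is written as a function rather than a macro for readability; it is not a copy of the definition in if.h.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a BSD-style sockaddr with a length byte. */
struct demo_sockaddr {
	unsigned char	sa_len;		/* total length of this address */
	unsigned char	sa_family;
	char		sa_data[14];
};

/* Simplified stand-in for struct ifreq holding an address. */
struct demo_ifreq {
	char			ifr_name[16];
	struct demo_sockaddr	ifr_addr;
};

/*
 * Length of an ifreq record that contains an address: the fixed size,
 * unless the address is longer than the embedded sockaddr, in which
 * case the record grows by the excess.
 */
static size_t demo_sizeof_addr_ifreq(const struct demo_ifreq *ifr)
{
	assert(ifr != NULL);
	if (ifr->ifr_addr.sa_len > sizeof(struct demo_sockaddr))
		return sizeof(struct demo_ifreq) -
		    sizeof(struct demo_sockaddr) + ifr->ifr_addr.sa_len;
	return sizeof(struct demo_ifreq);
}
```

Code that walks a buffer of such records would advance by this value instead of by a fixed sizeof, which is the kind of hand-rolled arithmetic the entry calls error-prone.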
0506343883d62f6649f7bbaf1a436133cef6261d 11-Jan-1998 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'jb'.
7c6e96080c4fb49bf912942804477d202a53396c 10-Jan-1998 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'JB'.
36e7a51ea1dedf0fc860ff3106aee1db1ab3b1f5 12-Oct-1997 phk <phk@FreeBSD.org> Last major round (Unless Bruce thinks of something :-) of malloc changes.

Distribute all but the most fundamental malloc types. This time I also
remembered the trick to making things static: Put "static" in front of

A couple of finer points by: bde
97edcdf2f7e8f7cd5dc81bd510475b99b2df57ea 03-May-1997 peter <peter@FreeBSD.org> add SIOC{S,G}IFMEDIA ioctl support
94b6d727947e1242356988da003ea702d41a97de 22-Feb-1997 peter <peter@FreeBSD.org> Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
808a36ef658c1810327b5d329469bcf5dad24b28 14-Jan-1997 jkh <jkh@FreeBSD.org> Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
19e2ac904f46e8d37d7cd3d2b1186258b4e8f73b 13-Jan-1997 wollman <wollman@FreeBSD.org> Use the new if_multiaddrs list for multicast addresses rather than the
previous hackery involving struct in_ifaddr and arpcom. Get rid of the
abominable multi_kludge. Update all network interfaces to use the
new mechanism. Distressingly few Ethernet drivers program the multicast
filter properly (assuming the hardware has one, which it usually does).
4d20b5bdb1ab98e8e30b49f4823ca47016929850 03-Jan-1997 wollman <wollman@FreeBSD.org> Separate kernel-internal data structures from exposed user interface
to interfaces. (Amazing nobody had done this!)

More commits to fix up user-land to follow.
3417f9411098d1cd19c5db539c0768e778b83a1c 13-Dec-1996 wollman <wollman@FreeBSD.org> Convert the interface address and IP interface address structures
to TAILQs. Fix places which referenced these for no good reason
that I can see (the references remain, but were fixed to compile
again; they are still questionable).
9b9067caec777892be0e223bef825ffc72c90754 11-Dec-1996 wollman <wollman@FreeBSD.org> Use queue macros for the list of interfaces. Next stop: ifaddrs!
e1e9e3aa19b00c35016cddf8a23e2b7fc0d9f4e0 11-Dec-1996 wollman <wollman@FreeBSD.org> Include <net/if_arp.h> in the one header that requires it,
<netinet/if_ether.h>, rather than in <net/if.h>, most of whose callers
have no need of it.

Pointed-out-by: bde
944eee9c6bcea962bfaab5753ca059edaf4e87a5 10-Dec-1996 wollman <wollman@FreeBSD.org> Finally, after six years, remove the ``quick hack for SNMP'' that was
``going away soon''.
1665979d2e5ce0eb4858dedb47572346ff1eadef 10-Dec-1996 dg <dg@FreeBSD.org> 1) Implement SIOCSIFMTU in ether_ioctl(), and change ether_ioctl's return
type to be int so that errors can be returned.
2) Use the new SIOCSIFMTU ether_ioctl support in the few drivers that are
using ether_ioctl().
3) In if_fxp.c: treat if_bpf as a token, not as a pointer. Don't bother
testing for FXP_NTXSEG being reached in fxp_start()...just check for
non-NULL 'm'. Change fxp_ioctl() to use ether_ioctl().
e6dd1aae14b363028ee8042c459a446159ed3660 21-Oct-1996 fenner <fenner@FreeBSD.org> Fix comments, which appear to have been mangled long ago and far away.
8c49d9975c6bbd7704d4d1247e57ce623ee4dbf9 12-Oct-1996 bde <bde@FreeBSD.org> Removed nested include of <sys/socket.h> from <net/if.h> and
<net/if_arp.h> and fixed the things that depended on it. The nested
include just allowed unportable programs to compile and made my
simple #include checking program report that networking code doesn't
need to include <sys/socket.h>.
ac4035b489d727d3b8fe4c8d428b63ae1f04a0d2 26-Aug-1996 julian <julian@FreeBSD.org> correct a field comment that someone must have accidentally spammed
as it's still used for what the original BSD4.4 comment says it's for.
b66887b753becbfa885b773ad8eee174d7b7c3a9 04-Aug-1996 phk <phk@FreeBSD.org> Add a callback pointer to the interfaces "init" routine.
Add ether_ioctl() which can take care of the SIOC[SG]IFADDR cases for
most (ethernet) drivers.
50a3e4ed9e414f69f86253ea0bf631d402af6571 30-Jul-1996 wollman <wollman@FreeBSD.org> Add better support for retrieving management information from network
interfaces. This creates two new tables in the net.link.generic branch
of the MIB; one contains (essentially) `ifdata' structures, and the other
contains a blob provided by the interface (and presumably used to
implement link-layer-specific MIB variables). A number of things
have been moved around in the `ifnet' and `ifdata' structures, so
NEW VERSIONS OF ifconfig(8) AND routed(8) ARE REQUIRED. (A simple
recompile is all that's necessary.)

I have a sample program which uses this interface for those interested
in making use of it.
1499b723bf88e94551babdefaccd5fd26e3262bd 23-Jul-1996 wollman <wollman@FreeBSD.org> Fix a spelling error I forgot to bring over from my personal version
of the last (IF_ENQ_DROP) commit.
1e5c7466fc7a8db3a30171d28edf3e2e14a97f6b 22-Jul-1996 wollman <wollman@FreeBSD.org> Add a new, better mechanism for sticking packets onto ifqueues.
The old system had the misfeature that the only policy it could implement
was tail-drop; the new IF_ENQ_DROP macro/function makes it possible
to implement more sophisticated queueing policies on a system-wide
basis. No code actually uses this yet (although on my machine
I have converted the ethernet and (polled) loopback to use it).
3d25650dd7c966ac51f8b3282ebd9e19d4180bba 10-Jun-1996 gpalmer <gpalmer@FreeBSD.org> Change the use of ifnet.if_lastchange to be more in line with
SNMP requirements. Update description of ifnet.if_lastchange in if.h
to indicate this.
25ee6cca2a6483ef8d489b2eb60f0fbe475a32f2 06-Feb-1996 wollman <wollman@FreeBSD.org> Clean up Ethernet drivers:
- fill in and use ifp->if_softc
- use if_bpf rather than private cookie variables
- change bpf interface to take advantage of this
- call ether_ifattach() directly from Ethernet drivers
- delete kludge in if_attach() that did this indirectly
ae614ac2900c22bdd2fe050e5b9afd85ad375e68 26-Jan-1996 wollman <wollman@FreeBSD.org> Delete the if_private[] array in struct ifnet; this turned out to be
of limited utility. In its place, add a bunch of pointers
which will eventually be needed by the polled-interrupt scheme we're working
on here. (It will probably be a while before the code is written and
committed here.) At the same time, a `void *if_softc' field
was added to the beginning of the structure to make certain driver
writers happier.

The practical upshot of all this is that you need to
recompile utilities such as netstat which manipulate struct ifnet.
2dd896405c74abac2e4fc5b59f6e4c0c8510f53f 26-Jan-1996 phk <phk@FreeBSD.org> The last part of the ether_sprint -> %6D change.
Sorry for the delay.
(%D is for hexdumping.)
8cf28016080c1f47c139dd9c75419b6341e3eea6 09-Dec-1995 phk <phk@FreeBSD.org> Staticize, clean lint.
8156a5707a8830d1ce5658e103e6780f22cfc8dd 05-Dec-1995 dg <dg@FreeBSD.org> all:
Removed ifnet.if_init and ifnet.if_reset as they are generally unused.
Change the parameter passed to if_watchdog to be a ifnet * rather than
a unit number. All of this is an attempt to move toward not needing an
array of softc pointers (which is usually static in size) to point to
the driver softc.

Changed some of the argument passing to some functions to make a little
more sense.

if_ep.c, if_vx.c:
Killed completely bogus use of if_timer. It was being set in such a way
that the interface was being reset once per second (blech!).
86f1bc4514fdcfd255f37f3218fe234bdc3664fc 05-Nov-1995 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'LINUX'.
234ef116d975fedba5bd998e69fa71d6a1c0542b 13-Oct-1995 wollman <wollman@FreeBSD.org> Say goodbye to IFF_NOTRAILERS. Support for trailers was officially
dropped for 4.4, but for some reason this flag lived on. (Until
today, that is.)
252487f3dde38c3e41336534067ea6fe9c47d787 03-Oct-1995 wollman <wollman@FreeBSD.org> Import of 4.4-Lite-2 sys/net to make merge and examination easier. Since we
are not on the vendor branch for any of these files, the conflicts shown make
no matter.

Obtained from: 4.4BSD-Lite-2
2807c8cd41fd1112fbdc01fbead9c6ca6c5cd1f3 31-Aug-1995 wollman <wollman@FreeBSD.org> Add a few hooks (in the form of an array of four void *'s) to allow
various bits of software to save some data in the ifnet structure without
having to constantly change the declaration thereof.
005a25e2c241e35a57c76c177e6adcec99a51a17 30-Aug-1995 bde <bde@FreeBSD.org> Fix several sysinit functions that had the wrong type and unnecessarily
external linkage.

Remove useless comments saying that SYSINIT() does system initialization.
b31df0923855dfd9d896af85136b21457656fb94 16-Aug-1995 bde <bde@FreeBSD.org> Make everything except the unsupported network sources compile cleanly
with -Wnested-externs.
14671ce1557e9f7863bc541f0336530f9fa1a0dd 09-Jul-1995 joerg <joerg@FreeBSD.org> Move some struct definitions outside of struct's, so their scopes for
C++ will match the scopes for C.

Submitted by: Warner Losh
c86f0c7a71e7ade3e38b325c186a9cf374e0411e 30-May-1995 rgrimes <rgrimes@FreeBSD.org> Remove trailing whitespace.
321a03d090577e9d6eb90169b5ded161378501ea 26-Apr-1995 pst <pst@FreeBSD.org> Cleanup loopback interface support.
Reviewed by: wollman
289f11acb49b6dbb3081e09bf94a86f008f55814 16-Mar-1995 bde <bde@FreeBSD.org> Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
2e14d9ebc3d3592c67bdf625af9ebe0dfc386653 14-Mar-1995 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'MATT_THOMAS'.
911a1abecea24d45887966c057d21df949527ec2 30-Dec-1994 dg <dg@FreeBSD.org> Moved declaration of ifnet pointer out of the header file and into the
.c file where it belongs. Bezeroed some uninitialized malloc data.
4022b9d7cdddf7f18debd4fab761daad71821177 22-Dec-1994 dg <dg@FreeBSD.org> Removed bogus semicolon at end of a #define line.
7d587b958356f1a73ccf6a7794d5edc094412662 21-Dec-1994 wollman <wollman@FreeBSD.org> Add generic part of generic multiple-physical-interface support (the
successor of IFF_ALTPHYS).
e0e4a54365b008d26c9a64fa37da5a20063a2401 21-Dec-1994 wollman <wollman@FreeBSD.org> Add a #define for if_rawoutput(), which isn't used now, but eventually will
0aecbf065e65ef3dab5ed654f22df43c81011ff4 13-Dec-1994 wollman <wollman@FreeBSD.org> Add support for two separate cloning flags, one set by the lower layers,
and one set by the protocol family. Also add another parameter to
rtalloc1() to allow for any interface flags to be ignored; currently
this is only useful for RTF_PRCLONING. Get rid of rt_prflags and re-unite
with rt_flags. Add T/TCP ``route metrics''.


This also adds a new interface parameter, `ifi_physical', which will
eventually replace IFF_ALTPHYS as the mechanism for specifying the
particular physical connection desired on a multiple-connection card.

53e66dfa73daaa6b76e7ee04e5d621bbce158f31 16-Nov-1994 phk <phk@FreeBSD.org> #include <socket.h> -> <sys/socket.h>
480097ce44462733ea9f5465a4477438202ca04e 15-Nov-1994 bde <bde@FreeBSD.org> Include <sys/socket.h> for declaration of struct sockaddr. This helps
genassym compile when KERNEL is not defined.

Uniformize idempotency ifdef.
37abb2da7af11246870f24a1971e13cf70bb3933 14-Nov-1994 bde <bde@FreeBSD.org> if.h:
Declare a complete prototype for the function pointer *ifa_rtrequest.

Declare a complete prototype for the function pointer *rnh_walktree
and for the function rn_walktree.

Uniformize idempotency ifdef.
7eee0c291493539f62f316a821e23d9129b02c95 01-Oct-1994 wollman <wollman@FreeBSD.org> Define IFF_ALTPHYS to be IFF_LINK2. Gross, but effective. (There aren't any
more bits left in if_flags and I don't want to make it a long this late in
the release cycle.)
34cd81d75f398ee455e61969b118639dacbfd7a6 23-Sep-1994 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'MACKERRAS'.
be1bed59fbc221986b3caa67bfed4cf41d917a58 21-Aug-1994 paul <paul@FreeBSD.org> Make idempotent.

Submitted by: Paul
f9fc827448679cf1d41e56512c34521bf06ce37a 18-Aug-1994 wollman <wollman@FreeBSD.org> Fix up some sloppy coding practices:

- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.
77ebe221fd4422acd9c3807b2488dff5043107c9 08-Aug-1994 dg <dg@FreeBSD.org> Added ioctl support for SIOCGIFMTU and SIOCSIFMTU. These set the per-
interface MTU.
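
Reading the per-interface MTU from userland with SIOCGIFMTU looks roughly like the sketch below. The function name demo_get_mtu is hypothetical; the ioctl, struct ifreq and its ifr_mtu member are the standard interface. The _DEFAULT_SOURCE define is an assumption for glibc builds with a strict -std= mode and is otherwise harmless.

```c
#define _DEFAULT_SOURCE 1	/* glibc only: expose struct ifreq in strict modes */
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>

/* Return the MTU of the named interface, or -1 on any error. */
static int demo_get_mtu(const char *name)
{
	struct ifreq ifr;
	int s, mtu = -1;

	assert(name != NULL);
	/* Any socket will do as a handle for interface ioctls. */
	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0)
		return -1;
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, name, sizeof(ifr.ifr_name) - 1);
	if (ioctl(s, SIOCGIFMTU, &ifr) == 0)
		mtu = ifr.ifr_mtu;
	close(s);
	return mtu;
}
```

Setting the MTU follows the same pattern with SIOCSIFMTU and ifr_mtu filled in before the call, and typically requires privilege.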
e16baf7a5fe7ac1453381d0017ed1dcdeefbc995 07-Aug-1994 cvs2svn <cvs2svn@FreeBSD.org> This commit was manufactured by cvs2svn to create branch 'SUNRPC'.
8d205697aac53476badf354623abd4e1c7bc5aff 02-Aug-1994 dg <dg@FreeBSD.org> Added $Id$
2469c867a164210ce96143517059f21db7f1fd17 25-May-1994 rgrimes <rgrimes@FreeBSD.org> The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman
8fb65ce818b3e3c6f165b583b910af24000768a5 24-May-1994 rgrimes <rgrimes@FreeBSD.org> BSD 4.4 Lite Kernel Sources