History log of /freebsd-head/lib/libc/amd64/string/
Revision Date Author Comments
a4c9264e74eb3e70898c1eccf1cdfc590ecbaaa1 30-Jan-2020 mjg <mjg@FreeBSD.org> amd64: sync up libc memcmp with the kernel version (r357309)
memcmp.S
08a81a607578eafe4ee29c237f217634b2868b59 29-Jan-2020 mjg <mjg@FreeBSD.org> amd64: sync up libc memcmp with the kernel version (r357208)
memcmp.S
cfbc1641e480c0a17dad66128d0b523483821619 01-Dec-2018 mjg <mjg@FreeBSD.org> amd64: align target memmove buffer to 16 bytes before using rep movs

See the review for sample test results.
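
A minimal C sketch of the alignment idea, assuming a forward copy of non-overlapping buffers of at least 16 bytes. The function name copy_with_aligned_dst is hypothetical, plain memcpy stands in for rep movs, and the committed assembly may be structured differently:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libc: forward copy, len >= 16. */
static void *
copy_with_aligned_dst(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t skip;

    assert(len >= 16);

    /* Distance from d to the next 16-byte boundary: 1..16, never 0. */
    skip = 16 - ((uintptr_t)d & 15);

    /* One unaligned 16-byte copy covers every byte before the boundary. */
    memcpy(d, s, 16);

    d += skip;
    s += skip;
    len -= skip;

    /* d is now 16-byte aligned; memcpy stands in for the rep movs bulk copy. */
    assert(((uintptr_t)d & 15) == 0);
    memcpy(d, s, len);

    return (dst);
}
```

Copying a full 16 bytes up front avoids branching on how far the destination is from the boundary; the bulk copy simply rewrites the few bytes that were already stored.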

Reviewed by: kib (kernel part)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18401
memmove.S
f5d5aead70f6fbe199876c3f2088168b42f45602 30-Nov-2018 mjg <mjg@FreeBSD.org> amd64: handle small memmove buffers with overlapping stores

Handling sizes of > 32 backwards will be updated later.
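
As an illustration of overlapping stores in a move routine, the C sketch below handles 8 to 16 bytes by loading the first and last 8 bytes before storing either, so it stays correct when source and destination overlap. The name move_8_to_16 is hypothetical and the size range is chosen for the example; the committed assembly covers other ranges and may differ in detail:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libc: move len bytes, 8 <= len <= 16. */
static void
move_8_to_16(void *dst, const void *src, size_t len)
{
    uint64_t head, tail;

    assert(len >= 8 && len <= 16);

    /* Load both ends first; nothing has been written yet. */
    memcpy(&head, src, 8);
    memcpy(&tail, (const unsigned char *)src + len - 8, 8);

    /* Store both ends; the two stores overlap when len < 16. */
    memcpy(dst, &head, 8);
    memcpy((unsigned char *)dst + len - 8, &tail, 8);
}
```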

Reviewed by: kib (kernel part)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18387
memmove.S
08e0e931c0284aa7be45d66c789160611be2c652 30-Nov-2018 mjg <mjg@FreeBSD.org> amd64: remove stale attribution for memmove work

While the routine started as expanded bcopy, it is now entirely rewritten.

Sponsored by: The FreeBSD Foundation
memmove.S
ffc9789a6e496d2a824aee95d7b485eb2674ad03 30-Nov-2018 mjg <mjg@FreeBSD.org> amd64: tidy up copying backwards in memmove

For the non-ERMS case the code used to handle possible trailing bytes with
movsb first and then followed it up with movsq. This also happened
to alter how calculations were done for other cases.

Handle the tail with regular movs, just like when copying forward.
Use leaq to calculate the right offset from the get go, instead of
doing separate add and sub.

This adjusts the offset for non-rep cases so that they can be used
to handle the tail.

The routine is still a work in progress.

Sponsored by: The FreeBSD Foundation
memmove.S
c0d4f73b46d83953bea762609f492c0c95f7496d 16-Nov-2018 mjg <mjg@FreeBSD.org> amd64: handle small memset buffers with overlapping stores

Instead of jumping to locations which store the exact number of bytes,
use displacement to move the destination.

In particular the following fills an area of 8 to 16 bytes (inclusive)
branch-free:

movq %r10,(%rdi)
movq %r10,-8(%rdi,%rcx)

For instance for rcx of 10 the second line is rdi + 10 - 8 = rdi + 2.
Writing 8 bytes starting at that offset overlaps with 6 bytes written
previously and writes 2 new, giving 10 in total.
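
The same two-store trick can be sketched in C. The helper name fill_8_to_16 is hypothetical, and memcpy of a 64-bit pattern stands in for each movq:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libc: fill len bytes, 8 <= len <= 16. */
static void
fill_8_to_16(void *dst, unsigned char c, size_t len)
{
    unsigned char *p = dst;
    /* Replicate the byte into all eight lanes, as %r10 holds in the commit. */
    uint64_t pattern = (uint64_t)c * UINT64_C(0x0101010101010101);

    assert(len >= 8 && len <= 16);

    /* movq %r10,(%rdi): the first 8 bytes. */
    memcpy(p, &pattern, sizeof(pattern));
    /* movq %r10,-8(%rdi,%rcx): the last 8 bytes, overlapping when len < 16. */
    memcpy(p + len - 8, &pattern, sizeof(pattern));
}
```

With len equal to 10, the second store starts at offset 2, matching the arithmetic in the paragraph above.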

Provides a nice win for smaller stores. Other ones are erratic depending
on the microarchitecture.

General idea taken from NetBSD (restricted use of the trick) and bionic
string functions (use for various ranges like in this patch).

Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17660
memset.S
8b74f82374da01ca7881876293af1208e6e21ab6 15-Nov-2018 mjg <mjg@FreeBSD.org> amd64: sync up libc memset with the kernel version

- tidy up memset to have rax set earlier for small sizes
- finish the tail in memset with an overlapping store
- align memset buffers to 16 bytes before using rep stos

Sponsored by: The FreeBSD Foundation
memset.S
f199f2664f67dac7f9c1bf842731054ef7e60e1d 15-Nov-2018 mjg <mjg@FreeBSD.org> amd64: convert libc bzero to a C func to avoid future bloat

Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17549
Makefile.inc
bzero.S
bzero.c
memset.S
422b26018fdd22bab6be9cec4e47a5af26185f27 13-Oct-2018 mjg <mjg@FreeBSD.org> amd64: convert libc bcopy to a C func to avoid future bloat

The function is of limited use and is almost a direct clone of
memmove/memcpy (with arguments swapped). Introduction of ERMS variants
of string routines would mean avoidable growth of libc.

bcopy will get redefined to a __builtin_memmove later on with this
symbol only left for compatibility.
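
A C version of bcopy along these lines could be as small as the sketch below. The name sketch_bcopy is hypothetical (chosen to avoid clashing with the real symbol), and the actual bcopy.c may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for bcopy: source first, destination second. */
static void
sketch_bcopy(const void *src, void *dst, size_t len)
{
    assert(len == 0 || (src != NULL && dst != NULL));

    /* memmove takes (dst, src); it also handles overlapping buffers. */
    memmove(dst, src, len);
}
```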

Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17539
Makefile.inc
bcopy.S
bcopy.c
fd4ff6330a53337249ece8eb0fa2c46aa2438423 13-Oct-2018 mjg <mjg@FreeBSD.org> amd64: import updated kernel memmove to libc

bcopy is left alone as it is expected to be converted to a C func.

Due to header mess ALIGN_TEXT is temporarily defined explicitly in memmove.S

Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17538
memcpy.S
memmove.S
d769512cfe1c87e42d0f81df51c2125aa9704be2 05-Oct-2018 mjg <mjg@FreeBSD.org> amd64: import updated kernel memset to libc

See r339205 for details.

Unused ERMS support is retained in the macro. It will be activated
after ifunc support lands.

Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17405
memset.S
b6728160ea4af28e7c059cb23a988c973537e3fb 01-Oct-2018 mjg <mjg@FreeBSD.org> amd64: reimplement libc memset and bzero with kernel memset

This is a depessimization, see r334537 for an explanation. Routines
remain significantly slower than they have to be.

bzero was removed from the kernel but remains in libc. Macroify to
accommodate differences to memset (no return value, always setting to 0).
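
The commit does this with an assembler macro; as a rough C analogy only, one shared body can be expanded into both entry points. FILL_BODY, sketch_memset, and sketch_bzero are hypothetical names, and the byte loop is a placeholder for the real fill code:

```c
#include <assert.h>
#include <stddef.h>

/* Shared body: store (byte) into each of the (len) bytes at (dst). */
#define FILL_BODY(dst, byte, len) do {                  \
    unsigned char *p_ = (dst);                          \
    for (size_t i_ = 0; i_ < (len); i_++)               \
        p_[i_] = (unsigned char)(byte);                 \
} while (0)

/* memset flavour: caller-supplied byte, returns the destination. */
static void *
sketch_memset(void *dst, int c, size_t len)
{
    assert(dst != NULL || len == 0);
    FILL_BODY(dst, c, len);
    return (dst);
}

/* bzero flavour: always stores 0 and returns nothing. */
static void
sketch_bzero(void *dst, size_t len)
{
    assert(dst != NULL || len == 0);
    FILL_BODY(dst, 0, len);
}
```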

The bzero.S file is left in place due to libc build magic which pulls in
a C variant if a matching .S file is missing.

Reviewed by: kib
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17355
bzero.S
memset.S
ef3d3aa9c60d53bcbb890dc53c6df5aaa3a076d3 29-Sep-2018 mjg <mjg@FreeBSD.org> amd64: remove unnecessary cld from libc memcpy/bcopy

The ABI specifies the direction forward on function call, making
the cld instruction redundant.

Approved by: re (kib)
bcopy.S
09cca5134de06f43a71cc22120ee9aa1aaaeb472 27-Sep-2018 mjg <mjg@FreeBSD.org> amd64: reimplement libc memcmp and bcmp with kernel memcmp

Both are significantly slower than hand-coded loops. See r338963 for
kernel commit.

bcmp differs from memcmp by always returning 1 when a difference is
found, as opposed to going for a value bigger or lower than 0
depending on what it is. This means it can do less work. For now the
code is duplicated and modified. This will get deduplicated after
another round of optimization when memcmp will get a longer-term form.

Both tested with the glibc suite. While the suite does not have a test
for bcmp, I created a wrapper routine which verified that values match
(0 vs 0, 1 vs non-zero).
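
The wrapper itself is not shown in the log. A check of that kind might look like the following C sketch, where sketch_bcmp is a hypothetical byte-loop reference and check_bcmp_matches_memcmp is an assumed name for the verification step:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference bcmp: 0 if equal, 1 at the first difference. */
static int
sketch_bcmp(const void *a, const void *b, size_t len)
{
    const unsigned char *p = a, *q = b;

    for (size_t i = 0; i < len; i++)
        if (p[i] != q[i])
            return (1);
    return (0);
}

/* Verify the pairing described above: 0 vs 0, 1 vs non-zero. */
static void
check_bcmp_matches_memcmp(const void *a, const void *b, size_t len)
{
    int m = memcmp(a, b, len);
    int c = sketch_bcmp(a, b, len);

    assert((m == 0 && c == 0) || (m != 0 && c == 1));
}
```

Calling check_bcmp_matches_memcmp("abcdef", "abcdeg", 6) passes because memcmp reports a non-zero value and the bcmp sketch reports exactly 1.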

Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17336
bcmp.S
memcmp.S
7864c480b1e3696a157d2bc6bfe87e2b51483e60 17-Sep-2018 mjg <mjg@FreeBSD.org> amd64: depessimize userspace memcpy/memmove/bcopy

The change resembles what was done in r334537 for kernel routines.
While here take care of i386 variants. Note that primitives remain
suboptimal.

Reviewed by: kib (previous version)
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17167
bcopy.S
9014a6e40c74437b39c10b33618d7dfbedea2362 25-Nov-2017 pfg <pfg@FreeBSD.org> libc: further adoption of SPDX licensing ID tags.

Mainly focus on files that use the BSD 2-Clause license; however, the tool I
was using misidentified many licenses, so this was a mostly manual,
error-prone task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well-known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
strcpy.c
94f539fc96d08f1c9a3b92ca02ff31c26bbda0e0 02-Mar-2017 brooks <brooks@FreeBSD.org> Correct MDSRCS use in <arch>/string/Makefile.inc.

- Remove .c files which duplicate entries in MISRCS.
- Use the same, less merge conflict prone style in all cases.
- Use MDSRCS for mips (.c and .S files both ended up in SRCS).
- Remove pointless sparc64 Makefile.inc.
- Remove uninformative foreign VCS ID entries.

Reviewed by: emaste, imp, jhb
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9841
Makefile.inc
69669cbe99c92053594f595bbb8afd89c18a1892 30-Apr-2016 pfg <pfg@FreeBSD.org> libc: spelling fixes.

Mostly in comments.
strcmp.S
25d8bf1959e6a0586d9249fc25a69028fcf00788 21-Jul-2011 gnn <gnn@FreeBSD.org> Remove incorrect attribution.

Approved by: re (kib)
Pointed out by: brueffer
Pointy hat to: gnn
stpcpy.S
c50aa1163b69fcc4a3e96fc50fc21650d7007875 21-Jul-2011 gnn <gnn@FreeBSD.org> Make both stpcpy and strcpy be assembly language implementations
on amd64.

Submitted by: Guillaume Morin (guillaume at morinfr.org)
Reviewed by: kib, jhb
Approved by: re (bz)
MFC after: 1 month
Makefile.inc
stpcpy.S
strcpy.S
strcpy.c
0beb03c7a6efb9f00d850d3e7d11eb9572de1595 04-Feb-2011 kib <kib@FreeBSD.org> Remove duplicate .note.GNU-stack section declaration. bcopy already
made the necessary provisions.

Reported by: arundel
memmove.S
a5e01acec5aad9ddbe58e73c087304a450c8e84e 07-Jan-2011 kib <kib@FreeBSD.org> Add section .note.GNU-stack for assembly files used by 386 and amd64.
bcmp.S
bcopy.S
bzero.S
memcmp.S
memmove.S
memset.S
strcat.S
strcmp.S
strcpy.S
aa63008f13d7a07b62a85c25814374cb77dc7f84 02-Nov-2008 peter <peter@FreeBSD.org> We've been lax about matching END() macros in asm code for some time. This
is used to set the ELF size attribute for functions. It isn't normally
critical but some things can make use of it (gdb for stack traces).
Valgrind needs it so I'm adding it in. The problem is present on all
branches and on both i386 and amd64.
bcmp.S
bcopy.S
bzero.S
memcmp.S
memset.S
strcat.S
strcmp.S
strcpy.S
8710214da3628d020a2de7c10bca22a8b6c94013 23-Apr-2005 alc <alc@FreeBSD.org> Optimize the instruction alignment.
strcpy.S
b2ebe1668997f2f770fe8a122a8614a9648942c4 10-Apr-2005 alc <alc@FreeBSD.org> Add a machine-specific, optimized implementation of strcat.

PR: 73111
Submitted by: Ville-Pertti Keinonen <will@iki.fi> (taken from NetBSD)
MFC after: 3 weeks
Makefile.inc
strcat.S
420be8df92533bfbd56861dada670045720ba633 10-Apr-2005 alc <alc@FreeBSD.org> Eliminate a conditional branch and as a side-effect eliminate a branch to
a return instruction. (The latter is discouraged by the Opteron
optimization manual because it disables branch prediction for the return
instruction.)

Reviewed by: bde
bcmp.S
417aec058f0bd90dd8aa68b96a9faa3c667a4fd8 10-Apr-2005 alc <alc@FreeBSD.org> Add a machine-specific, optimized implementation of strcpy.

PR: 73111
Submitted by: Ville-Pertti Keinonen <will@iki.fi> (taken from NetBSD)
MFC after: 3 weeks
Makefile.inc
strcpy.S
654c522ae858faf02e78aed1727928c971fb8112 09-Apr-2005 alc <alc@FreeBSD.org> Add a machine-specific, optimized implementation of strcmp.

PR: 73111
Submitted by: Ville-Pertti Keinonen <will@iki.fi> (taken from NetBSD)
MFC after: 3 weeks
Makefile.inc
strcmp.S
1ab7d22f97ea0829f4ecd7575ba5e8c1bea12bca 08-Apr-2005 alc <alc@FreeBSD.org> Add machine-specific, optimized implementations of bcmp and memcmp.

PR: 73111
Submitted by: Ville-Pertti Keinonen <will@iki.fi> (taken from NetBSD)
MFC after: 3 weeks
Makefile.inc
bcmp.S
memcmp.S
02ae1b51e33e2dbff717a4bb175ba71166ac3563 08-Apr-2005 alc <alc@FreeBSD.org> Eliminate unneeded instructions that are a vestige of mechanical
translation from i386.
bcopy.S
90a823ed9261212bea6160593fb2c7472be30db4 07-Apr-2005 alc <alc@FreeBSD.org> Eliminate an unneeded instruction that is a vestige of mechanical
translation from i386.
bzero.S
cbb9f3f415d27dacd2fe96602be07cf6ffe152c3 07-Apr-2005 alc <alc@FreeBSD.org> Add machine-specific, optimized implementations of bcopy, bzero, memcpy,
memmove, and memset.

PR: 73111
Submitted by: Ville-Pertti Keinonen <will@iki.fi> (taken from NetBSD)
MFC after: 3 weeks
Makefile.inc
bcopy.S
bzero.S
memcpy.S
memmove.S
memset.S