
\documentclass[11pt,fleqn,hidelinks,oneside]{book} % Default font size and left-justified equations
\newcommand{\insertcode}[2]{\begin{itemize}\item[]\lstinputlisting[caption=#2,label=#1,style=Style1,float=h!]{#1}\end{itemize}} % The first argument is the script location/filename and the second is a caption for the listing
As such, the driver contains both device-specific parts and OS-specific parts.
The Everest architecture, with programmable fastpath processors (Storms), host-based device-dedicated memory (ILT), and minimal on-chip management, presents a device which requires a driver with significant portions of device-specific code.
Implementing the device-specific code again and again in each OS is both wasteful and difficult to maintain.
A large mass of code for operating and interacting with the Everest 4 device, to be incorporated into and used by OS drivers.
In the abstract, the ecore is a layer between the HW/FW and the OS.
It is device-specific and OS-agnostic. When ecore code requires OS services (e.g., memory allocation, PCI configuration space access, etc.) it calls an abstract OS function for that purpose. These are implemented in OS-specific layers.
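For illustration, a minimal sketch of how such an abstraction might look. The \texttt{OSAL\_*} names and the Linux mapping below are hypothetical placeholders, not the actual ecore OSAL definitions:
\begin{lstlisting}[style=Style1,language=C]
/* Hypothetical sketch of the OS-abstraction idea: ecore code calls an
 * abstract OSAL macro; each OS-specific layer maps it to a native
 * service. Names here are illustrative only. */
#if defined(__linux__)
#include <linux/slab.h>
#define OSAL_ALLOC(p_dev, size)  kzalloc((size), GFP_KERNEL)
#define OSAL_FREE(p_dev, ptr)    kfree(ptr)
#else
/* another OS supplies its own implementation */
void *osal_alloc_impl(void *p_dev, unsigned long size);
void osal_free_impl(void *p_dev, void *ptr);
#define OSAL_ALLOC(p_dev, size)  osal_alloc_impl((p_dev), (size))
#define OSAL_FREE(p_dev, ptr)    osal_free_impl((p_dev), (ptr))
#endif
\end{lstlisting}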
\item Slowpath flows tend to reside largely in the ecore and less so in the OS-specific layers. As much of the functionality as possible is placed in the ecore to leverage it across multiple platforms. \\
\item Fastpath flows tend to be in the OS-specific layer, as too much layering and abstraction is out of place in the fastpath.
However, the fastpath would usually be set up by ecore flows; for example, the address where the transmission flow should write a doorbell to the BAR is determined by the ecore at init phase, and this address is supplied by the ecore to the OS-specific layer, as sketched below. \\
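As a hedged illustration of this split, a Linux-flavoured fastpath transmit might use the doorbell address handed over by the ecore at init time roughly as follows (\texttt{fill\_tx\_bds} and the \texttt{txq} fields are hypothetical):
\begin{lstlisting}[style=Style1,language=C]
/* Illustrative OS-layer fastpath: the ecore supplied p_doorbell and
 * the doorbell data at queue-start time; the fastpath only fills the
 * buffer descriptors and rings the doorbell. */
fill_tx_bds(txq, skb);        /* OS-specific BD setup (hypothetical) */
wmb();                        /* BDs must be visible before doorbell */
writel(txq->db_data, txq->p_doorbell);
\end{lstlisting}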
Different drivers in the same OS may each contain the ecore, and may use it for similar or different purposes:
In Linux there will be an Ethernet driver, an FCoE driver, an iSCSI driver, a RoCE driver and also a slim driver for the diag utility.
A storage driver may use the ecore for storage-specific purposes, such as the initialization and allocation of task contexts.
This document strives to define and detail what the ecore is.
The first parts of the document deal with the concept of the ecore, and its place in the software layers between the device and the OS.
This document does not deal with the needs and use cases of any specific OS or tool, but only with the common ground which is the ecore.
\item Chapter \ref{cha:overview}'s introduction and section \ref{sec:overview-api} for a listing of the ecore API files and their locations.
\item Initialization and de-initialization of the HW [\ref{sec:init-init}, \ref{sec:init-de-init}].
\item Status block initialization [\ref{ssec:sb-init}] and interrupt handling flow [\ref{sec:sb-flow}].
\item Files of the format \texttt{ecore\_<module>\_api.h} -- these files are the SW API between the ecore and the upper-layer driver:
\item Files of the format \texttt{ecore\_hsi\_<protocol>.h} -- these files contain the API between FW/HW and the ecore/upper-layer driver:
\item \texttt{ecore\_cxt\_api.[ch]} -- Handles the allocation, configuration and distribution of contexts to the various clients.
\item \texttt{ecore\_hw.[ch], ecore\_gtt\_reg\_addr.h, ecore\_gtt\_values.h} -- Contains the functionality for register access and DMAE. See chapter \ref{cha:reg}.
\item \texttt{ecore.h} -- Contains the definition of the most \textit{elementary} structures in the ecore, the \texttt{ecore\_dev} and the \texttt{ecore\_hwfn}.
\item \texttt{ecore\_init\_defs.h, ecore\_init\_fw\_funcs.[ch], ecore\_init\_ops.[ch], \\ ecore\_init\_values.h, ecore\_rt\_defs} -- Code responsible for initialization and configuration of the HW and loading of the FW, mostly in relation to the init-tool. See chapter \ref{cha:hwinit}.
\item \texttt{ecore\_int.[ch]} -- Handles interrupts and attentions. See chapter \ref{cha:int}.
\item \texttt{ecore\_mcp.[ch]} -- Contains the interface between the ecore and the MFW. See chapter \ref{cha:mfw}.
\item \texttt{ecore\_sp\_commands.[ch], ecore\_spq.[ch]} -- Contains the slowpath logic required for sending ramrods and configuring \& handling the various slowpath events.
\item \texttt{ecore\_sriov.[ch], ecore\_vf.[ch], ecore\_vfpf\_if.h} -- Contains the SRIOV implementation from both the PF and VF sides.
%As the ecore contains most of the lowlevel code operating the non-fastpath parts of the working with the HW and FW, it can be thought of as some sort of library – it contains bits of code meant to be operated from an outside source. Each OS needs to implement its own driver, calling the various functions in the ecore API in a place fitting for that OS driver flows.
%Each OS will independently need to create a driver that incorporates the ecore, both filling the OS dependent callbacks required by the ecore to perform and supply an upper level of abstraction which best suits that OS. Notice this upper layer is sometimes also, mistakenly, referred to as ecore [e.g., bnx2c for linux drivers] but there’s an important distinction:
%It’s possible [and likely] that an operating system will break the various protocols into different sub-drivers, where each sub-driver will be designated for a specific protocol. Notice that if such separation is made, the preferred implementation is that the OS will implement a ‘core’ driver consisting of the Ecore and an upper-layer, and define an API through which the various protocol drivers communicate with the OS core driver\footnote{Although notice there should be no inter-dependencies between HW-functions in the ecore, so the alternative method where each contains the ecore is also feasible}.
\item Basic OS-specific operations that the ecore needs in order to perform its work; e.g., memory allocations – the ecore needs to allocate memory for various reasons, and it needs the upper layer to supply the method by which it can do so.
In order to support this, the verbosity mechanism contains two distinct values \myindex{\texttt{DP\_LEVEL}} and \myindex{\texttt{DP\_MODULE}} [both can be found in \texttt{ecore.h}]. Since the printing scheme in the ecore was defined with the Linux limitations in mind – that is, the API [via ethtool] allowing the setting of the debug message level is only 32 bits long – both \texttt{DP\_MODULE} and \texttt{DP\_LEVEL} together occupy only 32 bits.
The \texttt{DP\_LEVEL} determines which prints will actually reach the logs based on the message urgency, defining 4 levels – verbose, info, notice and error. When a level is set, all prints which are at least as urgent will be printed. Notice this means there's a single level – e.g., you can't have a configuration in which you'll get all the `info' level prints, but not the `notice' level.
The \texttt{DP\_MODULE} is relevant only when the level is set to verbose, and it defines which of the verbose prints should reach the system logs, based mostly on component/flow. When setting the module level, a bit mask of the requested components/flows is set.
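As a hedged sketch of how the two values can share one 32-bit word (the actual bit layout in \texttt{ecore.h} may differ):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: DP_LEVEL in the low bits, DP_MODULE as a bit-mask above it.
 * Illustrative layout only -- consult ecore.h for the real one. */
enum dp_level { DP_VERBOSE = 0, DP_INFO, DP_NOTICE, DP_ERR };

#define DP_LEVEL_MASK   0x3
#define DP_MODULE_SHIFT 2

static int dp_should_print(unsigned int dbg_word, enum dp_level lvl,
                           unsigned int module_bit)
{
        enum dp_level cfg = dbg_word & DP_LEVEL_MASK;

        if (lvl < cfg)
                return 0;      /* less urgent than the configured level */
        if (lvl == DP_VERBOSE) /* verbose prints filtered by component  */
                return !!((dbg_word >> DP_MODULE_SHIFT) & module_bit);
        return 1;              /* at least as urgent -- always printed  */
}
\end{lstlisting}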
\item ASIC\_ONLY -- By default, this is `off'. Setting this would remove content that is relevant only for simulations of the hardware, i.e., emulations and FPGAs.
\item REMOVE\_DBG -- By default, this is `off'. There are several structures and fields in the ecore which aren't functional; their sole purpose is to store interesting data for memory dumps in case of failures. Setting this would remove all such data items.
To access the entire range, windows are defined that can be configured to point to a certain address within the device and allow reading and writing of registers / memory from that address.
There are two types of windows, \textbf{PTT} (per PF Translation Table) and \textbf{GTT} (Global Translation Table).
All register access should be done within the ecore layer; the upper layers are not expected to access registers at all.
For this reason, there is no description here of how to find the register address and how to distinguish whether the address is mapped into a \myindex{GTT} or a \myindex{PTT}.
The implementation should add the offset to the mapped BAR address and call the appropriate OS-specific API.
The PTT is reserved per flow, and it is the responsibility of the upper layer to make sure it does not use the same PTT in flows that can run concurrently. The upper layer requests a PTT entry using \myfunc{ptt\_acquire}{ptt_acquire}.
Using a PTT, the ecore [and upper-driver] can access registers/memories using inner BAR addresses; the ecore is responsible for configuring the memory windows, and translates the inner address into an external address [i.e., one which resides on the actual BAR as seen by the host]. The register access is then made by calling \texttt{ecore\_wr} and \texttt{ecore\_rd}.
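A hedged usage sketch of the acquire/access/release sequence (the register offset is a placeholder, and \texttt{ecore\_ptt\_release} is assumed to be the matching release call):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: acquire a PTT, access a register through the window,
 * release it. REG_EXAMPLE_OFFSET is a placeholder, not a real
 * register address. */
struct ecore_ptt *p_ptt = ecore_ptt_acquire(p_hwfn);
u32 val;

if (!p_ptt)
        return ECORE_BUSY;    /* pool exhausted; retry later */

val = ecore_rd(p_hwfn, p_ptt, REG_EXAMPLE_OFFSET);
ecore_wr(p_hwfn, p_ptt, REG_EXAMPLE_OFFSET, val | 0x1);
ecore_ptt_release(p_hwfn, p_ptt);
\end{lstlisting}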
\item \myindex{ILT} – one of the features of our device is that the memories used by various HW blocks are allocated in host memory [as opposed to a large embedded memory segment on chip]. The driver is responsible for allocating the memory needed for those HW blocks [DMA-coherent memory] and for configuring both the HW blocks themselves and a communal sub-block known as ILT. The ecore contains complicated code that decides exactly how much memory each such block needs, allocates it in an `ilt\_shadow', and then uses that shadow to configure the ILT itself with all the allocated chunks of memory.
\item \myindex{RT array} – when the ecore initializes the HW, it utilizes common, tool-generated code known as the init-tool. Since there are quite a few values which depend upon the actual setup configuration and thus must receive feedback from the ecore during initialization, instead of adding many such hooks there's the concept of the RunTime array – an array of values filled by the ecore prior to the init-tool run, based on the complex ecore logic. The init-tool then utilizes the values in that array to configure the HW in the correct order of configuration [i.e., writing the values set by the ecore in the array at the correct place in the initialization flow where they're required/where the block that contains them is configured].
This section gives a brief description of the functions that need to be called, what they do, their requirements, etc., in order to successfully initialize the ecore structs and load the HW/FW.
\item \myfunc{init\_struct}{init_struct} – After allocating and zeroing the ecore\_dev [the upper layer's responsibility], a pointer to it should be passed to this function for some early initialization of the data structure. \\
\item It enables the ecore to access its BAR, doing things such as enabling the PTT pool and opening the access in the PGLUE\_B block.
Notice this doesn't actually do anything to the PCI BAR itself – the upper layer should have initialized it before calling this function, and must guarantee that its REG\_WR/RD functions actually point to valid, accessible addresses.
\item It learns as much as it can about system configuration from HW and SHMEM.
\item \myfunc{hw\_init}{hw_init} – This function actually initializes the chip, using the init-tool and the runtime array to make the correct configuration.
This is required since, as part of the flow, the \texttt{function\_start} ramrod will be sent to FW; once FW finishes handling it, an \myindex{EQE} [Event Queue Element] will be placed in the slowpath event queue and an interrupt will be fired. The flow is dependent on the EQE being processed.
Once this function returns, the chip is initialized, FW is functional and the slowpath event queues are operational.
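Putting the calls above together, a hedged sketch of the load sequence (exact signatures vary between ecore versions; the init-params struct here is illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch of the load flow described above; error handling is
 * abbreviated and the init-params struct is illustrative. */
static enum _ecore_status_t driver_load(struct ecore_dev *p_dev)
{
        struct ecore_hw_init_params init_params = { 0 };
        enum _ecore_status_t rc;

        ecore_init_struct(p_dev);       /* early struct initialization */

        rc = ecore_hw_prepare(p_dev, ECORE_PCI_DEFAULT);
        if (rc != ECORE_SUCCESS)
                return rc;              /* BAR access / config learning */

        /* init-tool + runtime array; FW becomes functional on success */
        return ecore_hw_init(p_dev, &init_params);
}
\end{lstlisting}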
\section{Zipped and Binary firmware}
\label{sec:init-Zipped and Binary firmware}
Non-zipped firmware [ecore\_init\_values.h and ecore\_init\_values.bin] and zipped firmware [ecore\_init\_values\_zipped.h and
ecore\_init\_values\_zipped.bin] files. Each type of file is generated in two formats, namely a C header file and a binary file,
generated by ecore: C header files [ecore\_init\_values.h and ecore\_init\_values\_zipped.h] and
binary firmware files [ecore\_init\_values.bin and ecore\_init\_values\_zipped.bin]. Either of those file formats
but using binary firmware files has the advantage that the code size is reduced and the FW can be loaded from a file imported by
file into a void* pointer and pass that firmware data buffer pointer to ecore\_hw\_init() as an argument.
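A hedged Linux-flavoured sketch of importing the binary firmware file and handing it to the init flow (\texttt{request\_firmware} is the standard kernel API; the firmware file name and the \texttt{bin\_fw\_data} field name are illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: read the binary FW file into memory and pass its contents
 * to ecore_hw_init(). File name and field name are illustrative. */
const struct firmware *fw;
int rc;

rc = request_firmware(&fw, "qed_init_values_zipped.bin", &pdev->dev);
if (rc)
        return rc;

init_params.bin_fw_data = (void *)fw->data;
rc = ecore_hw_init(p_dev, &init_params);
release_firmware(fw);
\end{lstlisting}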
\item \myfunc{hw\_stop}{hw_stop} – this function notifies the MFW that the HW-functions are unloading, stops the FW/HW for all HW-functions in the \texttt{ecore\_dev} including sending the common PF\_STOP ramrod for each HW-function, and disables the HW-functions in various HW blocks.
Following this function, it is guaranteed that HW will not generate any more slowpath interrupts, so the interrupt handler can be released [and the slowpath DPC context can be stopped]. \\
it describes how firmware status is reflected in host memory via status blocks, and how the firmware initiates an interrupt toward the driver.
There are 288 status blocks per path in Big Bear and 368 in K2.
When one of the indices on a status block is updated (because some event occurred at the device), the status block is copied from internal device memory to host memory, and an interrupt is generated.
The CAU unit may aggregate several events and generate a single update of the status block and a single interrupt, in order to lower the number of interrupts sent to the host CPU.
Multiple indices are used for L2 to differentiate between RX / TX and different class-of-service operations.
There is a dedicated status block for ecore usage which is allocated and maintained by the ecore.
The ecore defines a structure called \texttt{ecore\_sb\_info} which should be allocated by the protocol driver and initialized using the function \myfunc{int\_sb\_init}{int_sb_init}.
This structure is later used for calling the functions \texttt{ecore\_sb\_update\_sb\_idx()} and \texttt{ecore\_sb\_ack()}.
Status blocks need to be allocated and initialized before queues are created.
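A hedged sketch of per-queue status block setup (the DMA-allocation helper is illustrative; \texttt{ecore\_int\_sb\_init} is the function named above):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: allocate DMA-coherent memory for the status block, then
 * register it with the ecore. The allocation helper is illustrative. */
struct ecore_sb_info *sb_info;
dma_addr_t sb_phys;
void *sb_virt;

sb_info = OSAL_ALLOC(p_dev, sizeof(*sb_info));
sb_virt = osal_dma_alloc_coherent(p_dev, &sb_phys, PAGE_SIZE);
rc = ecore_int_sb_init(p_hwfn, p_ptt, sb_info, sb_virt, sb_phys, sb_id);
\end{lstlisting}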
\section{Mode and configuration}
\item MSI – Message signaled interrupts. The device is programmed with one address to write to, and 16-bit data to identify the interrupt.
\item MSIX – A large number of interrupts (up to 2048), where each one gets a separate target address, making it possible to designate different interrupts to different processors.
Enabling and disabling interrupts is OS-specific and done differently by the OS-specific layer of the driver.
The mapping is done inside the IGU CAM and maps a (function, vector) pair to an MSI-X message.
The CAU also handles coalescing of status block writes and interrupt generation.
As noted above, the CAU unit may aggregate several events into a single status block update and a single interrupt, in order to lower the number of interrupts sent to the host CPU.
The flow of handling an interrupt in the device and driver is as follows:
\item The device copies the status block to host memory and generates an interrupt.
\item (Possible upper-half handling and bottom-half scheduling, or other OS-specifics which are outside the scope of this document).
\item The IGU compares the producer and consumer -- if they differ, it will generate an additional interrupt.
Assume an Rx packet is received by the device. After FW places the packet in the Rx rings, it updates the status block of that Rx ring; this in turn is copied into host memory and an MSI-X interrupt for the appropriate Rx queue's status block is triggered.
The driver reads the status block, scanning the indices, and identifies that the interrupt indicates an Rx CQE consumer update, so it handles the incoming packet. Assuming this is the only interrupt source [and there was also a single packet], the driver then acks the status block.
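A hedged sketch of that bottom-half flow, using the two status block helpers named earlier (\texttt{rxq\_has\_new\_cqes} and \texttt{rxq\_handle\_cqe} are hypothetical):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: sample the status block, drain completed work, then ack
 * with the new consumer value and re-enable the IGU line. */
static void rxq_dpc(struct driver_rxq *rxq)
{
        struct ecore_sb_info *sb = rxq->sb_info;

        ecore_sb_update_sb_idx(sb);      /* snapshot FW producers */
        while (rxq_has_new_cqes(rxq))    /* hypothetical helpers  */
                rxq_handle_cqe(rxq);
        ecore_sb_ack(sb, IGU_INT_ENABLE, 1 /* update consumer */);
}
\end{lstlisting}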
The management firmware runs on its own processor on the chip [\myindex{MCP}] and has many responsibilities – it serves as the entity initially configuring the chip [during the BIOS phase], answering the various management protocols, synchronizing between PFs, configuring the physical link, etc.
HW functions and the \myindex{MFW} may interact with each other in both directions – the driver may send messages to the MFW in the form of commands on a buffer, while the MFW generates attentions for the driver and posts messages in a designated mailbox in the SHMEM. The implementation of the interface resides in \texttt{ecore\_mcp.[ch]}, with the addition of \texttt{.h} files generated by the MFW owners, e.g., \texttt{mcp\_public.h} which contains the SHMEM structure and the list of commands.
The interface between driver and MFW is initialized as early as possible in the initial initialization flow [specifically as part of \texttt{ecore\_hw\_prepare()}], as this initializes the driver's access to SHMEM, which is used later during initialization to learn about the chip configuration [which was read from NVRAM by the MFW and written into SHMEM].
\item The MFW fills it with the current HW configuration, either based on the defaults found in the NVRAM or based on some management protocol [e.g., it's possible that VLAN configuration is determined by a switch and communicated to the MFW]. The driver reads those values and decides upon its logical state/configures the HW appropriately. \\
\item It's possible [as in E3] that there will be driver-held information that will be requested by some management protocol, and the driver will have to fill it in at some well-known address in the SHMEM.
An upper-layer driver is not supposed to access the SHMEM directly; it should only do so by using ecore functions and accessing ecore structs. The ecore \textit{mcp\_info} struct contains as one of its fields \textit{func\_info}, which is filled by the ecore during early device initialization with all the function-specific static\footnote{i.e., data that shouldn't change while the driver is running} data. The upper-layer driver can read those values for its own usage.
A message is a u32 consisting of a command bit-mask, which indicates the message the HW-function sends, and a cyclic sequence number.
The driver increases the sequence number, writes the message, and then polls until the MFW writes its response [with the correct sequence number] to another known address in SHMEM\footnote{Obviously, this is a one-pending mechanism.}.
Before sending the attention, the MFW will increment the producer of the message it wishes to inform the driver about, and the driver will recognize the message by noticing the difference in producers.
After handling said message, the driver will ack the message by writing the new producer back to SHMEM and disabling the general HW attention.
\section{API between ecore's MCP interface and upper-layer driver}
\myfunc{mcp\_cmd}{mcp_cmd} -- this is the very core of message-passing from driver to MFW. The upper-layer driver should pass the command (DRV\_MSG\_CODE\_* from \texttt{mcp\_public.h}) and a parameter, as well as pointers for the MFW response and an additional possible parameter. The function will pass the command to the MFW and wait [sleep] for its reply. \\
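A hedged usage sketch (the command shown is one of the \texttt{DRV\_MSG\_CODE\_*} values from \texttt{mcp\_public.h}; its parameter value is illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: send one mailbox command and sleep until the MFW replies.
 * The response lands in mcp_resp (an FW_MSG_CODE_* value). */
u32 mcp_resp = 0, mcp_param = 0;
enum _ecore_status_t rc;

rc = ecore_mcp_cmd(p_hwfn, p_ptt, DRV_MSG_CODE_NIG_DRAIN,
                   100 /* illustrative parameter */,
                   &mcp_resp, &mcp_param);
if (rc != ECORE_SUCCESS)
        return rc;
\end{lstlisting}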
The MFW is used as both a book-keeper and a synchronization mechanism for the loading of PFs, as there are communal resources. The response will be (FW\_MSG\_CODE\_DRV\_LOAD\_<X>), where X can be either ENGINE, PORT or FUNCTION:
Depending on the exact message received from the MFW, it's possible that this will eventually call some OSAL which will need to be implemented by the upper-layer driver, e.g., in case of a link change indication [the upper layer needs to be notified and should decide on its own what to do with that information].
In order to work with the ecore link interface, the upper driver needs to implement an OSAL [\texttt{osal\_link\_update()}] which will be called whenever the link state has changed – this will notify the upper driver that the link has changed and that it should probably read link\_output and act upon it. \\
It's the MFW's duty to decide whether the unloading function is actually the last loaded function on its port, and thus whether to actually reset the link.
The EEE feature enables the device to put its transmit circuitry into sleep mode when there is no data activity on the wire, thus achieving a significant reduction in the device's power consumption. It is a Base-T feature; more details are captured in the IEEE 802.3az standard. The MFW negotiates the EEE parameters with the peer device, and the results are shared with the ecore as part of the link notification. Following are the negotiated parameters, which are encapsulated in the struct \texttt{ecore\_mcp\_link\_state}.
\item eee\_active – EEE is negotiated and is currently operational.
The MFW is responsible for negotiating the dcbx parameters [e.g., per-priority flow control (PFC)] with the peer device. During initialization, the MFW reads the dcbx parameters from NVRAM (called local parameters) and negotiates these with the peer. The negotiated/agreed parameters are called operational dcbx parameters. The MFW provides driver interfaces for querying and configuring the dcbx parameters. The ecore dcbx implementation provides three APIs, one for querying the dcbx parameters and the other two for updating the dcbx configuration.
\item \myfunc{dcbx\_query\_params}{dcbx_query_params} – The API returns the current dcbx configuration. It expects a type (i.e., local/remote/operational) and the buffer for storing the dcbx parameters of that type.\\
\item \myfunc{dcbx\_config\_params}{dcbx_config_params} – The API is used for sending a dcbx parameters update request. The API expects the dcbx parameters to be configured and a flag specifying whether the parameters need to be sent to the hardware or just cached at the ecore. When the driver sends a dcbx config to the hardware, the device initiates the dcbx negotiation with the peer using the lldp protocol. The negotiation takes a few seconds to complete, and the lldp requests are also rate limited (using a predefined credit value). The dcbx API option ``hw\_commit'' specifies whether the dcbx parameters need to be committed to the hardware or just cached at the driver. When the client requests the commit, all the cached parameters are sent to the device and the parameter negotiation is initiated with the peer. \\
The steps for configuring the dcbx parameters are as follows: the upper-layer driver invokes the ecore\_dcbx\_get\_config\_params() API to get the current config parameter set, updates the required parameters, and then invokes the ecore\_dcbx\_config\_params() API.
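That sequence, as a hedged sketch (struct and field names follow the API headers but should be treated as illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch of the get-modify-commit sequence described above. */
struct ecore_dcbx_set dcbx_set;
enum _ecore_status_t rc;

OSAL_MEM_ZERO(&dcbx_set, sizeof(dcbx_set));
rc = ecore_dcbx_get_config_params(p_hwfn, &dcbx_set);
if (rc != ECORE_SUCCESS)
        return rc;

dcbx_set.config.params.pfc.enabled = 1;   /* illustrative change */
rc = ecore_dcbx_config_params(p_hwfn, p_ptt, &dcbx_set,
                              1 /* hw_commit */);
\end{lstlisting}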
The MFW needs various bits of information from the driver, and it gathers those in one of two ways:
\item Pushing – it's the ecore and ecore-client's responsibility to push the data.\\
In some cases, `push' is done without involvement of the ecore-client. Where that's not possible, it becomes riskier, as the responsibility for doing things correctly passes to the ecore-client. The ecore-client shouldn't presume to do `push' only for calls which match the configured management mode; instead it should always do them, and let the ecore be the arbiter of whether those are needed by the MFW or not. Ecore provides the following APIs for updating the configuration attributes; it is the client's responsibility to invoke these APIs at the appropriate time.
\item ACTIVE - Set when the required initialization is done from the driver side and the device is ready for traffic switching.\\
Ecore also provides the TLV request interface, by which the MFW can query driver/device attributes. The MFW uses the mailbox interface to notify the ecore of the required TLV information. The ecore parses the request, populates the required information with the help of the ecore clients, and sends it to the MFW. The ecore client needs to provide the necessary infrastructure and the OSALs for implementing this interface.
\item OSAL\_MFW\_TLV\_REQ - The call indicates that the ecore has received a TLV request notification from the MFW. The execution context is interrupt mode; hence the ecore client needs to schedule a thread/bottom-half context to handle this task and return control immediately. The bottom-half thread will need to invoke \myfunc{mfw\_process\_tlv\_req}{mfw_process_tlv_req} for further processing of the TLV request.\\
\item OSAL\_MFW\_FILL\_TLV\_DATA - The ecore invokes this callback to get the TLV values of a given type. The ecore client needs to fill in the values for all the fields that it's aware of, and also needs to set the flags associated with the respective fields. For instance, if the client sets a value for the `npiv\_enabled' field, it needs to set the flag `npiv\_enabled\_set' to true.\\
This section begins after section \ref{sec:init-init}, i.e., assuming the HW-function has already been initialized by the init tool and the PF\_START ramrod has already been sent.
Although VPORTs' and queues' indices are shared between all HW-functions on the same engine, the resource allocation scheme determines a range of VPORTs for each HW-function to use for configuration [i.e., the developer can assume the starting index is always 0 per HW-function].
\item \myfunc{sp\_vport\_start}{sp_vport_start} -- this function initializes a vport in FW [ETH\_RAMROD\_VPORT\_START will be sent]. The handle for this function is the \texttt{vport\_id} which is passed, and the most `interesting' argument is the MTU for that VPORT.
There are two identifiers for the queue - the queue index to add and the VPORT index to add it to. The queue index should be unique for the Rx-queue; no two Rx-queues of the same PF should use the same id.
In addition, ecore would also fill a \texttt{p\_handle}. This handle is opaque to the ecore-client, and should be passed to other Rx-queue APIs when doing configuration relating to that queue.
Just like for Rx-queues, the ecore would fill the \texttt{p\_ret\_params} with an opaque handle to be used for further calls relating to this queue. In addition, it will provide a \texttt{p\_doorbell} address, which is an address into which a doorbell needs to be written to activate firmware once a packet is placed on this Tx queue and the buffer descriptors are filled.
Doorbell addresses are on a different BAR than that of other memories/registers accessed by the driver, and the PTT/GTT scheme does not apply to it; thus the address can simply be accessed using the necessary memory barriers.
\item \myfunc{sp\_vport\_update}{sp_vport_update} -- This is required to enable the VPORT. It should be called after the Tx/Rx queues were already added, and this will enable the VPORT to send and receive packets\footnote{Notice that without classification configuration Rx won't actually work. Also notice this function can do a lot of things; enabling the VPORT is only one of them.}.
\item Configuration of the \myindex{Rx mode} -- This defines which datagrams [unicast, multicast, broadcast] should be accepted by the VPORT, and whether all such datagrams are accepted or only those for which a filter is configured.
\item ECORE\_FILTER\_MOVE -- removes a filter from one vport and adds it to another simultaneously\footnote{Needed by Windows.}.
This is pretty straightforward, and works in reverse order to the initialization of the L2 device.
After the upper-layer driver guarantees that no new Tx packets will be generated and once the Tx queues are all empty, it should do the following:
\item Close all Tx queues\footnote{Actually, order does not matter between Tx and Rx queues} by calling \myfunc{eth\_tx\_queue\_stop}{eth_tx_queue_stop}.
Following the completion of the \texttt{vport\_stop}, no further traffic should flow. Interrupts can be released, and resources can be freed.
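A hedged sketch of that teardown order (queue handles are the opaque pointers returned at start time; the boolean arguments of the Rx stop call are illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: reverse order of init -- stop queues, then the vport. */
int i;

for (i = 0; i < num_txqs; i++)
        ecore_eth_tx_queue_stop(p_hwfn, txq_handle[i]);
for (i = 0; i < num_rxqs; i++)
        ecore_eth_rx_queue_stop(p_hwfn, rxq_handle[i],
                                false /* eq_completion_only */,
                                false /* cqe_completion */);
ecore_sp_vport_stop(p_hwfn, opaque_fid, vport_id);
\end{lstlisting}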
Our device supports a \myindex{100G} link. However, the fastpath pipeline of each HW engine isn't fast enough for that line-rate. The term `hardware function' is shorthand for the HW resources and identifications normally required by a single PCI function. In 100G mode, the device will enumerate as a single PCI function\footnote{Or more in multi-function mode; but we will stick with single-function mode for simplicity here.}, but the driver running over this PCI function will utilize multiple HW functions.
From a PCI standpoint, the distinction between the HW functions (and thus the HW engines) is done via the BAR address. Access to the first half of each of the PCI function's BARs will be translated into an access to a HW function on the first engine, while access to the second half will be translated into an access to a HW function on the second engine.
After the early initialization phase of the ecore (i.e., following ecore\_hw\_prepare()), the \textit{ecore\_dev} field \myindex{num\_hwfns} will be filled with the correct number of HW-functions under the PCI device. The ecore and its client should access only the first num\_hwfns entries in the \textit{hwfns} array.
Assume a PCI function is in CMT mode. Let $\text{hwfn}_0$ stand for its HW-function under the first engine and $\text{hwfn}_1$ stand for its HW-function under the second engine.
Then for every $n \geq 0$, $\text{MSIX}_{2n}$ is connected to $\text{hwfn}_0$'s status block of index $n$, and $\text{MSIX}_{2n+1}$ is connected to $\text{hwfn}_1$'s status block of index $n$.
Ecore handles almost all the differences between CMT and regular mode on its own, i.e., it reads the number of HW-functions under the device and iterates when needed to configure both engines correctly (whereas in non-CMT mode it would have simply configured one).
following Example [\ref{ex:CMT1}], $\text{MSIX}_0$ should be enabled and connected to the DPC of $\text{hwfn}_0$ and $\text{MSIX}_1$ should be enabled and connected to the DPC of $\text{hwfn}_1$.
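The interleaving can be captured in a one-line helper (illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* For CMT: status block n of hw-function e (0 or 1) is driven by
 * MSI-X vector 2n + e, per the mapping described above. */
static inline int cmt_msix_index(int engine, int sb_index)
{
        return 2 * sb_index + engine;
}
\end{lstlisting}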
When disabling the slowpath, it's important to remember that there were 2 different DPCs allocated and 2 MSI-X vectors configured to support them, as it is the ecore-client's responsibility to disable the interrupts.
Since each HW-function is running on a different path and is an independent entity (as perceived by FW/HW), configuration should be almost symmetric for both HW-functions. E.g., following the flow of section \ref{sec:l2-start}, ecore\_sp\_vport\_start() should be called separately for each HW-function, queues should be opened separately for each, etc.
There is an issue between the user's control of the number of queues and the actual configuration of queues - e.g., assume the user wants $X$ queues. If we use a symmetric configuration, what we actually do is open $X$ queues on each path, meaning we actually open $2X$ queues.
We can either open only $X/2$ queues on each engine, in which case we lose some abilities, e.g., control of the keys of the RSS hash-function, or open $2X$ queues and try to hide this fact from the user; the latter will most likely incur a performance penalty, hard-to-maintain code, or both.
Specifically for iSCSI, before calling \texttt{ecore\_resc\_alloc()}, the upper driver should determine the PF-global parameters, allocate all PF-global queues, and fill the \texttt{iscsi\_pf\_params} part in struct \texttt{ecore\_pf\_params}. \\
\item After the basic initialization process is completed successfully, it is possible to establish the LL2 queue, and send / receive LL2 packets (as described in chapter \ref{cha:ll2}).
\item \myfunc{sp\_iscsi\_func\_start}{sp_iscsi_func_start} -- this function initializes the iSCSI PF, and passes PF-global parameters to FW. This function should be called before offloading any iSCSI connection.
\item \myfunc{iscsi\_acquire\_connection}{iscsi_acquire_connection} -- this function allocates the resources for the connection. \texttt{p\_in\_conn}, which is passed to this function, should be NULL. Note that ecore allocates the struct \texttt{ecore\_iscsi\_conn} by itself, and returns its pointer to the upper driver via \texttt{p\_out\_conn}. Amongst others, ecore initializes in this struct the \texttt{icid} to be used in later task initialization, and the \texttt{conn\_id}, which is a zero-based index.
\item \myfunc{iscsi\_offload\_connection}{iscsi_offload_connection} -- this function offloads the connection to the device, and requests to establish the TCP connection. Before calling this function, the upper driver should determine the connection's TCP parameters, allocate the connection SQ, and fill parameters in the \texttt{ecore\_iscsi\_conn} struct. \\
When this call completes, the connection is offloaded and the 3-way handshake has started. 3-way handshake completion is indicated by an asynchronous call from ecore.
After this call completes (and even before the asynchronous call), the driver can post a Login PDU to the SQ. However, FW will process the SQ only after the 3-way handshake is completed.
\texttt{update\_flag} & The negotiated values for HeaderDigest, DataDigest, InitialR2T and ImmediateData \\ \hline
\item \myfunc{iscsi\_terminate\_connection}{iscsi_terminate_connection} -- this function removes the connection from the device, and requests to close the TCP connection. When this call completes, the connection closure state machine has started, but the connection is still offloaded. Connection closure and removal from the device is indicated by an asynchronous call from ecore.
Specifically for FCoE, before calling \texttt{ecore\_resc\_alloc()}, the upper driver should determine the PF-global parameters, allocate all PF-global queues, and fill the \texttt{fcoe\_pf\_params} part in struct \texttt{ecore\_pf\_params}. \\
\item After the basic initialization process is completed successfully, it is possible to establish the LL2 queue, and send / receive LL2 packets.
\item \myfunc{sp\_fcoe\_func\_start}{sp_fcoe_func_start} -- this function initializes the FCoE PF, and passes PF-global parameters to FW. This function should be called before offloading any FCoE connection.
\item \myfunc{fcoe\_acquire\_connection}{fcoe_acquire_connection} -- this function allocates the resources for the connection. \texttt{p\_in\_conn}, which is passed to this function, should be NULL. Note that ecore allocates the struct \texttt{ecore\_fcoe\_conn} by itself, and returns its pointer to the upper driver via \texttt{p\_out\_conn}. Amongst others, ecore initializes in this struct the \texttt{icid} to be used in later task initialization, and the \texttt{conn\_id}, which is a zero-based index.
\item \myfunc{fcoe\_offload\_connection}{fcoe_offload_connection} -- this function offloads the connection to the device. Before calling this function, the upper driver should allocate the connection SQ, and fill parameters in the \texttt{ecore\_fcoe\_conn} struct. \\
This chapter describes the ecore interface for the upper-layer driver of the RDMA protocol. The interface aims at sharing as much as possible between RoCE and iWARP. This chapter is not complete, and currently only details changes for iWARP (except for dcqcn, which was already detailed before).
For iWARP support, modifications to existing structure and function names will be made to ease distinction between the two, similar to the HSI changes. The following convention will be used:
\item ecore\_rdma\_xxx will be used for common structures and functions
\item ecore\_roce\_xxx will be used for roce specific structures, fields and functions
\item ecore\_iwarp\_xxx will be used for iwarp specific structures, fields and functions
\section{Distinguishing between iWARP and RoCE}
Ecore provides the driver the ability to set the ecore personality through the call to ecore\_hw\_prepare, by passing the personality as a parameter. If the personality passed in the call to ecore\_hw\_prepare is ECORE\_PCI\_DEFAULT, the personality is derived from the NVRAM configuration for protocol and device capability; otherwise the setting passed by the upper driver in the call overrides the NVRAM configuration.
TBD: NVRAM configuration for distinguishing iWARP and RoCE does not exist and is not finalized yet.
Specifically for RDMA, before calling \texttt{ecore\_resc\_alloc()}, the upper driver should determine the PF-global parameters, allocate all PF-global queues, and fill the \texttt{rdma\_pf\_params} part in struct \texttt{ecore\_pf\_params}. \\
\texttt{roce\_enable\_dcqcn} & If enabled, the maximum number of rate limiters will be allocated during hardware initialization; these can later be initialized and configured during roce start. Must be set to enable dcqcn during roce initialization. This field is relevant to RoCE only.\\ \hline
The values of num\_qps and num\_mrs will impact the amount of memory allocated in the ILT. Note that although these parameters are rdma specific, they are actually used during the common hw initialization phase. The amount of ilt memory will differ between RoCE and iWARP, as iWARP requires only one cid per QP and RoCE requires two.
\item \myfunc{rdma\_start}{rdma_start} -- this function initializes the RDMA PF, allocates resources required for RDMA and passes PF-global parameters to FW. This function should be called before performing any other RDMA operations.
\texttt{events} & RoCE - callback functions for affiliated and unaffiliated events.\\ \hline
\texttt{desired\_cnq} & desired number of cnqs to be used. The upper-layer driver needs to make sure enough resources are available for this number (number of MSI-X vectors and CNQ resources).\\ \hline
\item \myfunc{rdma\_query\_device}{rdma_query_device} -- this function returns a struct of type ecore\_rdma\_device which contains the capabilities and set options for the given device.
\item Enable\_dcqcn under rdma\_pf\_params allocates additional hardware resources (rate limiters) which can later be used to enable the DCQCN notification point and reaction point. This must be set prior to calling \texttt{ecore\_resc\_alloc()}.
Notification point and reaction point can be enabled independently.
When configuring the device to act as a notification point, the ecore will initialize the NIG block accordingly and pass the priority vlan and cnp send timeout values to FW. When configuring the device to act as a reaction point, the ecore will send a ramrod to FW that configures the rate limiters allocated for dcqcn support with the values received from the upper-layer driver (such as maximum rate, byte counter limit, active increase rate, etc.; full detail in the ecore\_roce\_api.h file). At this point all rate limiters will be configured with the same values. If in the future there is a need to configure different rate limiters with different values, an additional API function will be provided. During initialization, ecore will map between physical queues used for RoCE and rate limiters. The number of rate limiters allocated is handled by resource management and is currently divided equally between the functions. During modify\_qp, ecore will configure the responder and requester to work with a unique physical queue, which is configured to work with a unique rate limiter. QPs that are opened after the rate limiters are used up will be configured to run on a default physical queue which does not have a rate limiter. FW assumes that the qp\_id is equal to the physical queue id. For simplicity, the implementation assumes that Ethernet is not run simultaneously with RoCE (i.e., a RoCE-only personality). If dcqcn is enabled and Ethernet is run, Ethernet will run on the same physical queue as the first qp that is allocated.
Unlike RoCE, in which connection management is implemented completely in the host, connection management for iWARP, which involves the TCP 3-way handshake and MPA exchanges, is implemented in FW. The host is nevertheless involved in offloading TCP and MPA and in exchanging connection parameters as part of the connection establishment/teardown process.
During connection establishment/teardown, the driver calls ecore connection-related APIs and receives callbacks from ecore for connection-related events. The driver registers its event callbacks by passing them as parameters to the different connection ecore APIs.
For both passive and active connect, basic information on the host and peer is required. We define a structure called \texttt{ecore\_iwarp\_cm\_info} which will be passed between driver and ecore on both downcalls and upcalls. Throughout the rest of the chapter we'll refer to this as the cm\_info.
This function will take care of initiating the TCP 3-way handshake and MPA negotiation. Once the MPA response is received, the event EVENT\_ACTIVE\_COMPLETE will be issued to the upper-layer driver. This function is asynchronous. The function will receive the cm\_info (detailed in \ref{sec:cminfo}), mss, and the local and remote mac addresses. The mac address will be acquired by the upper-layer driver using OS ip routing functions (such as find\_route in linux). In addition, it will require a pointer to the associated QP and a pointer to a callback function and callback context which will be used to indicate events to the driver which are related to this connection. \newline
The ecore will use the ll2 interface for implementing passive-side connection establishment. The upper-layer driver will send 2-tuples and vlan to the ecore layer which the ecore should listen on. Once a SYN packet is received on the ll2 interface, the ecore will search its database to check if a listener was registered with the received 2-tuple and vlan. If a match is found, a tcp offload ramrod will be sent, and once the MPA request is received, the event EVENT\_MPA\_REQUEST will be issued to the upper-layer driver. At this stage it is expected that the upper-layer driver will pass the MPA parameters, such as private data, ord and ird, all the way to the user app, which will in turn create a QP and related objects and later issue a call to ecore\_iwarp\_accept.
This function will receive socket local and remote addresses (port, ip and vlan) and add them to its listening database. In addition, a callback function and callback context will be provided, which will be used by ecore to send events of connection requests to the driver.
If a connection is rejected, the QP will not be associated with the connection request and remains an independent object (if it was created). Calling this function
This function will remove socket local and remote addresses (port, ip and vlan) from its listening database.
The interface into ecore is done with the states of RoCE and translated internally to iwarp states. This was done
to utilize the same interface for RoCE and iWARP. However, in the future this may be changed so that state translation
To initiate a graceful disconnect sequence, the active side will perform a modify\_qp to ECORE\_ROCE\_QP\_STATE\_SQD. This will be translated to ECORE\_IWARP\_QP\_STATE\_CLOSING and initiate a graceful teardown sequence with FW. Currently, due to the existing FW implementation, a modify qp to error will be sent to FW before closing the connection. In the future, the FW HSI will be changed so that a CLOSING state is added to FW as well. Once the disconnect is complete, whether graceful or abortive (in some cases a graceful disconnect will turn into an abortive one: timeouts, errors in close, etc.), an ECORE\_IWARP\_EVENT\_CLOSE event will be sent to the upper-layer driver. Ecore will transition to the ERROR state in any case at the end of the flow.
To initiate an abortive disconnect sequence, the active side will perform a modify\_qp to ECORE\_ROCE\_QP\_STATE\_ERR. This will be translated to ECORE\_IWARP\_QP\_STATE\_ERROR and initiate an abortive teardown sequence with FW. Once the disconnect is completed, an ECORE\_IWARP\_EVENT\_CLOSE event will be sent to the upper-layer driver. Ecore will transition to the ERROR state in any case at the end of the flow.
\item \myfunc{rdma\_create\_qp}{rdma_create_qp} -- This function will create the qp object in ecore, and for iWARP in FW as well. In RoCE no FW ramrods are sent during the call to this function. The main change from the existing create\_qp function for iWARP is that instead of providing addresses to rq and sq separately, and allocating memory for FW queues in ecore, FW requires contiguous memory for the pbl of all FW queues (RQ, SQ, ORQ, IRQ, HQ). Therefore the interface will change: instead of the upper-layer driver providing pbl addresses to create\_qp, these will be provided as out-parameters after being allocated in ecore. The upper-layer driver will be required to pass the number of pages required for the SQ / RQ. Populating the pbls will be done after calling create\_qp and not before, as done today. For ease of code sharing between iWARP and RoCE, FW will modify the RoCE implementation to work the same as iWARP.
\item \myfunc{rdma\_modify\_qp}{rdma_modify_qp} -- The API will remain the same; however, for iWARP not all fields are relevant. The RDMA/iWARP/RoCE naming convention was applied in ecore\_roce\_api to distinguish between what is required and what is not. Modify QP is used in iWARP as part of the teardown flow detailed in \ref{sec:iwarp_teardown}.
The ecore client has the ability to signal ecore that a specific tcp port in an app tlv should be recognized as pertaining to the iwarp offloaded connections. If an app tlv which matches this port is indicated by the MFW, all offloaded iwarp traffic of the PF will abide by this configuration (regardless of the actual tcp port of the offloaded connections). The app tlv can be set by the ecore client via the regular APIs for setting ``locally administered params''. The ecore client communicates the tcp port value via the \texttt{rdma\_pf\_params} structure; the value needs to be populated before invoking \myfunc{resc\_alloc}{resc_alloc}. To configure the iwarp app tlv in the locally administered dcbx parameters, the ecore client needs to use the Dcbx APIs described in the ``Dcbx Interface'' section. The relevant APIs are \myfunc{dcbx\_get\_config\_params}{dcbx_get_config_params} and \myfunc{dcbx\_config\_params}{dcbx_config_params}.
The LL2 is a simplified version of L2 for which both slowpath and fastpath flows reside in ecore, and it is used by the upper-layer drivers of the storage protocols.
\texttt{rx\_vlan\_removal\_en} & can be set if it is desired to get the VLAN stripped and passed out-of-band. \\ \hline
\item \myfunc{ll2\_establish\_connection}{ll2_establish_connection} -- this function offloads the LL2 connection to the device (both Tx and Rx).
\item After establishing the connection, it is possible to post Rx buffers and to send Tx packets.
\item \texttt{complete\_rx\_packet} -- this is a callback function that should be implemented in the upper driver. Ecore calls this function when a packet is received and written to a buffer in the Rx ring. \texttt{cookie} and \texttt{rx\_buf\_addr} are echoed from the call that posted that buffer. \texttt{placement\_offset} is the offset in bytes in the buffer, starting from which the packet was written. \texttt{packet\_length} is the total packet length in bytes. \texttt{opaque\_data\_0/1} and \texttt{b\_last\_packet} can be ignored. \texttt{vlan} is the VLAN tag stripped from the packet, and it is valid only if the PARSING\_AND\_ERR\_FLAGS\_TAG8021QEXIST bit is set in \texttt{parse\_flags}. The \texttt{parse\_flags} field contains additional flags which are mostly not interesting for the upper driver.
\item \texttt{release\_rx\_packet} -- this is a callback function that should be implemented in the upper driver. Ecore calls this function when the connection is terminated and there are still buffers in the Rx ring. In this case it will call this function for each buffer, so the upper driver can free those buffers.
\item \myfunc{ll2\_prepare\_tx\_packet}{ll2_prepare_tx_packet} -- this function adds a new packet to the transmit ring. If the packet is composed of more than a single buffer, then the address and length of the additional buffers are provided to ecore by calling \texttt{ecore\_ll2\_set\_fragment\_of\_tx\_packet} for each additional buffer. \\
\texttt{num\_of\_bds} is the number of buffers that compose the packet (including the first buffer), and is limited to CORE\_LL2\_TX\_LOCAL\_RING\_SIZE.
\texttt{first\_frag} should be a DMA-mapped address, and \texttt{first\_frag\_len} is the buffer length in bytes. \texttt{vlan} is the VLAN tag to insert in the packet (if desired), and in this case the CORE\_TX\_BD\_FLAGS\_VLAN\_INSERTION flag in \texttt{bd\_flags} should be set. \\
For IP checksum and L4 checksum offload, the CORE\_TX\_BD\_FLAGS\_IP\_CSUM and CORE\_TX\_BD\_FLAGS\_L4\_CSUM flags in \texttt{bd\_flags} should be set. \texttt{notify\_fw} should be set.
\item \myfunc{ll2\_set\_fragment\_of\_tx\_packet}{ll2_set_fragment_of_tx_packet} -- this function provides the next buffer of a packet. \texttt{addr} should be a DMA-mapped address, and \texttt{nbytes} is the buffer length in bytes.
\item \texttt{complete\_tx\_packet} -- this is a callback function that should be implemented in the upper driver. Ecore calls this function when the transmission of the packet is completed (it is called once per packet). \texttt{cookie} and \texttt{first\_frag\_addr} are echoed from the call that posted the first fragment of the packet. \texttt{b\_last\_fragment} and \texttt{b\_last\_packet} can be ignored.
\item \texttt{release\_tx\_packet} -- this is a callback function that should be implemented in the upper driver. Ecore calls this function when the connection is terminated and there are still packets in the Tx ring. In this case it will call this function for each packet, so the upper driver can free the associated buffers.
\item \myfunc{ll2\_terminate\_connection}{ll2_terminate_connection} -- this function removes the LL2 connection from the device. When this function is called, ecore checks for non-completed Tx packets / Rx buffers, and calls the \texttt{release\_tx\_packet()} and \texttt{release\_rx\_packet()} callback functions, respectively.
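A hedged sketch of a two-fragment transmit using the calls above (parameter lists vary between ecore versions; the ordering shown is illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: post a packet built from two buffers. The first fragment
 * goes with the prepare call; the second is added afterwards. */
rc = ecore_ll2_prepare_tx_packet(p_hwfn, conn_handle,
                                 2 /* num_of_bds */,
                                 0 /* vlan: none */,
                                 bd_flags,
                                 0 /* l4_hdr_offset_w */,
                                 frag0_phys, frag0_len,
                                 cookie, 1 /* notify_fw */);
if (rc == ECORE_SUCCESS)
        rc = ecore_ll2_set_fragment_of_tx_packet(p_hwfn, conn_handle,
                                                 frag1_phys, frag1_len);
\end{lstlisting}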
\myindex{SRIOV} is a PCIe functionality which allows Physical functions (also termed \myindex{PF}s) to spawn Virtual functions (also termed \myindex{VF}s), with a limited set of registers in their PCI configuration space and bars, but that should supply roughly the same basic functionality
This is an abstraction, and there are quite a few reservations and exceptions, but that is the working model.
Some relevant documents are \cite{doc:iov-lec}, \cite{doc:iov-sys} and \cite{doc:iov-doc}.
\section{IOV-related fields and terminology}
but there is a single field it `owns', \texttt{b\_hw\_channel} -- in most distros VFs will communicate with PFs using the HW-channel [see section \ref{sec:sriov-hw-channel}], and the upper-driver should set it to `true'. However, if the upper-driver utilizes a designated SW-channel which it can use instead of the HW-channel, it should let this field remain `false'. \\
\section{Initializing and Closing VFs}
Following this, the upper-driver can initiate the sequence [usually via an OS API] that would enable the VFs and cause them to be probed.
Afterwards, the upper-driver can initialize the VF the same as it would have the PF, i.e., the difference in initialization logic is `hidden' inside the ecore. Upper-layer code doesn't need to contain all sorts of if-else clauses to differentiate between VF and PF [at least, not as far as the ecore initialization is concerned].
The VF's PCI bar is very different from the PF bar, and with much more limited access toward the chip; see \cite{doc:iov-sys} for details about the VF bar. As a result, most of the slowpath configuration that needs to be done for the VF actually has to be done by the PF.
During \textit{ecore\_hw\_prepare()} ecore gathers information about the chip from various locations - HW, shared memory with the management FW, etc. However, almost all of that information is inaccessible to the VF. Thus the VF has an alternative flow by which it sends an ACQUIRE message to the PF, notifying it that it's up and requesting information about the device - e.g., the number of resources such as status blocks and queues available to the VF.
unaware that inside the ecore this will send a VPORT\_START TLV message from VF to PF, and that the PF will open the vport for the VF as a result.
\item This BAR access to the Ustorm RAM is trapped as an aggregated interrupt and activates a handler in Storm FW.
\item FW identifies the sending VF according to the address and the trigger's content, and derives the parent PF's id. It then triggers an interrupt [event] on the PF, filling the event's cookie with the buffer's address.
\item The PF driver's ISR wakes. It recognizes the message and calls OSAL\_PF\_VF\_MSG to notify the upper-layer driver of the message; this is mostly because the slowpath context isn't the proper place to handle VF messages.
\item The VF wakes from the PF's message and processes the answer.
Since the VF hasn't got such a status block allocated for it, the message passing between the PF and the VF consists of polling on the VF side.
The bulletin board periodic sampling is a policy that needs to be determined and performed by the upper-layer driver. It is done by calling the API function
If such a change occurs - since the bulletin doesn't contain deltas from previous messages but rather the entire data [due to the lack of handshake, the PF can't know if the VF read previous bulletin boards] - the upper-driver has a wide assortment of per-feature functions which are defined in ecore\_vf\_api.h and can be used to learn of the current state. E.g., \myfunc{vf\_get\_link\_state}{vf_get_link_state},
\item Whenever any of the bulletin fields the PF wants to post changes, the PF increments a counter, calculates a CRC and uses DMAE to copy its local buffer into the VF's bulletin buffer.
\item On the VF side, the polled \textit{ecore\_vf\_read\_bulletin()} samples the buffer, verifies the CRC [to make sure it has a consistent image of the buffer], and if the bulletin index has incremented since last seen, gets updated according to the new bulletin board.
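A hedged sketch of the VF-side sampling logic (struct and helper names are illustrative):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: copy the bulletin, verify its CRC to guard against torn
 * writes, and adopt it only if the version advanced. */
struct bulletin_content shadow;            /* illustrative type */
u32 crc;

OSAL_MEMCPY(&shadow, p_bulletin, sizeof(shadow));
crc = OSAL_CRC32(0, (u8 *)&shadow + crc_offset,
                 sizeof(shadow) - crc_offset);
if (crc != shadow.crc)
        return;                /* inconsistent sample; poll again  */
if (shadow.version <= last_seen_version)
        return;                /* no news since the last sample    */
last_seen_version = shadow.version;
/* ...extract link state, MAC, etc. from the new bulletin... */
\end{lstlisting}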
On many OSes this feature is used to reset VFs on certain occasions, such as their physical assignment and de-assignment from VMs.
The FLR flow is a complicated flow which involves the management firmware, storm firmware and driver, all working on cleaning the HW and their own databases
[see \cite{doc:iov-sys} for more details]. From the driver's point of view, the management FW notifies the driver of an FLR after it and the storm FW have already done some work [storm FW has done what's called `initial cleanup'].
Ecore handles the MFW message about the FLR, and eventually notifies the upper-layer driver via \myindex{OSAL\_VF\_FLR\_UPDATE} about the FLR.
The upper-layer driver should clean whatever non-ecore `volatile' information it holds for those VFs, and then call \myfunc{iov\_vf\_flr\_cleanup}{iov_vf_flr_cleanup}, which will continue the FLR process -- send a final cleanup ramrod to FW and notify the MFW that the FLR process is complete. Following this call, the FLRed VFs should be operational and in `clean slate' mode.
SR-IOV is exposed to complex versioning challenges. Specifically, a given PF driver may be working with VF drivers of older and/or newer versions at the same time.
This means that the channel and bulletin board must be forwards and backwards compatible. The bulletin board achieves this by only adding new fields.
The channel achieves compatibility through a TLV interface. Messages will always contain a type, length, value header, and may have multiple such parts.
The receiver of a message (be it a PF receiving a request or a VF receiving a response) will parse the message, process the parts it is aware of, and skip over parts which it doesn't recognize.
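A hedged sketch of such a parse loop (the header struct mirrors the description above; the helpers are hypothetical):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: walk (type, length, value) parts; process known types and
 * skip unknown ones, which is what makes the channel extensible. */
struct channel_tlv { u16 type; u16 length; };

static void parse_tlvs(u8 *buf, u16 total_len)
{
        u16 off = 0;

        while (off + sizeof(struct channel_tlv) <= total_len) {
                struct channel_tlv *tlv =
                        (struct channel_tlv *)(buf + off);

                if (!tlv->length)
                        break;                  /* malformed message */
                if (tlv_is_known(tlv->type))    /* hypothetical      */
                        handle_tlv(tlv);        /* hypothetical      */
                off += tlv->length;             /* skip unknown TLVs */
        }
}
\end{lstlisting}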
This chapter describes the ecore interfaces for selftests. The scope of the selftests is to sample various aspects of device functionality and verify that they are operational. It is not intended to provide, and does not lay claim to, full coverage of any functionality. \\
\myfunc{selftest\_register}{selftest_register} -- this test verifies the data integrity of the registers. It writes a predefined value to a register, reads it back and verifies that the contents are correctly saved. It saves the register's original content before performing the test and restores its value after the test. This test is performed via the MFW and accesses registers from both engines as well as registers from engine-common blocks.
\myfunc{selftest\_interrupt}{selftest_interrupt} -- this test verifies the interrupt path. Ecore employs its most basic flow which exercises interrupts, the heartbeat ramrod. A ramrod is sent and an interrupt is received.
\myfunc{selftest\_memory}{selftest_memory} -- this test samples some of the memories in the device. Ecore employs its most basic flow which exercises memories, again the heartbeat ramrod. In this flow a context is loaded into the context-manager memory and is verified by the storm FW (otherwise the ramrod would fail).
\myfunc{selftest\_nvram}{selftest_nvram} -- this performs the nvram test. It loops through all the nvram partitions, reads the image on each partition and validates its crc.
This chapter provides a high-level overview of PTP and describes the ecore interfaces for it. PTP, also known as Time Sync, allows the synchronization of clocks in distributed systems. The protocol selects one clock in the network as the master clock, and all other clocks (slave clocks) synchronize with the master. The driver's responsibilities include enabling/disabling the PTP feature on the device, registering/un-registering the hardware clock and its operations with the OS, and configuring the required Rx/Tx PTP filters. HW/FW does the timestamping of Tx/Rx PTP packets; the driver needs to read these timestamp values and present them to the upper-layer protocols (e.g., IPv4). Rx timestamps will be available during the Rx interrupt processing of the driver. FW does the Tx timestamping when the first byte of the PTP packet is placed on the wire; the driver has to poll for the availability of this timestamp value when processing the PTP Tx packet. \\
To enable PTP support, the ecore-client should call \myfunc{ptp\_enable}{ptp_enable} and then configure the required PTP filters, which include
enabling the Tx timestamping using \myfunc{ptp\_hwtstamp\_tx\_on}{ptp_hwtstamp_tx_on} and configuring the Rx filter mode
Rx/Tx timestamp values can be read using the APIs \myfunc{ptp\_read\_rx\_ts}{ptp_read_rx_ts} and
The API \myfunc{ptp\_read\_cc}{ptp_read_cc} can be used to read the Phy hardware clock and
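Putting the PTP calls above together, a hedged bring-up sketch (the Rx-filter function name and its argument are illustrative, as the exact API is not quoted in this section):
\begin{lstlisting}[style=Style1,language=C]
/* Sketch: enable PTP, turn on Tx timestamping, configure Rx filters,
 * then read timestamps from the interrupt / polling paths. */
rc = ecore_ptp_enable(p_hwfn, p_ptt);
if (rc != ECORE_SUCCESS)
        return rc;

rc = ecore_ptp_hwtstamp_tx_on(p_hwfn, p_ptt);
if (rc == ECORE_SUCCESS)
        rc = ecore_ptp_cfg_rx_filters(p_hwfn, p_ptt,
                                      ECORE_PTP_FILTER_ALL /* illustrative */);
\end{lstlisting}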