Commit Graph

228 Commits (master)

Author SHA1 Message Date
Ben Hutchings 950c54df1e sfc: Reduce RX scatter buffer size, and reduce alignment if appropriate
efx_start_datapath() asserts that we can fit 2 RX scatter buffers plus
a software structure, each appropriately aligned, into a single page.
Where L1_CACHE_BYTES == 256 and PAGE_SIZE == 4096, which is the case
on s390, this assertion fails.

The current scatter buffer size is also not a multiple of 64 or 128,
which are more common cache line sizes.  If we can make both the start
and end of a scatter buffer cache-aligned, this will reduce the need
for read-modify-write operations on inter- processor links.

Fix the alignment by reducing EFX_RX_USR_BUF_SIZE to 2048 - 256 ==
1792.  (We could use 2048 - L1_CACHE_BYTES, but EFX_RX_USR_BUF_SIZE
also affects user-level networking where a larger amount of
housekeeping data may be needed.  Although this version of the driver
does not support user-level networking, I prefer to keep scattering
behaviour consistent with the out-of-tree version.)

This still doesn't fix the s390 build because like most architectures
it has NET_IP_ALIGN == 2.  When NET_IP_ALIGN != 0 we cannot achieve
cache line alignment at either the start or end of a scatter buffer,
so there is actually no point in padding the buffers to a multiple of
the cache line size.  All we need is 4-byte alignment of the network
header, so do that.

Adjust the assertions accordingly.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-14 11:32:04 -07:00
Ben Hutchings c14ff2ea2d sfc: Delete EFX_PAGE_IP_ALIGN, equivalent to NET_IP_ALIGN
The two architectures that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
(powerpc and x86) now both define NET_IP_ALIGN as 0, so there is no
need for this optimisation any more.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-14 11:32:04 -07:00
Wei Yongjun 155d940a78 sfc: fix return value check in efx_ptp_probe_channel()
In case of error, the function ptp_clock_register() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-08 13:13:30 -07:00
Linus Torvalds 99bece775f Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c changes from Wolfram Sang:

 - an arbitration driver.  While the driver is quite simple, it caused
   discussion if we need additional arbitration on top of the one
   specified in the I2C standard.  Conclusion is that I accept a few
   generic mechanisms, but not very specific ones.

 - the core lost the detach_adapter() call.  It has no users anymore and
   was in the way for other cleanups.  attach_adapter() is sadly still
   there since there are users waiting to be converted.

 - the core gained a bus recovery infrastructure.  I2C defines a way to
   recover if the data line is stalled.  This mechanism is now in the
   core and drivers can now pass some data to make use of it.

 - bigger driver cleanups for designware, s3c2410

 - removing superfluous refcounting from drivers

 - removing Ben Dooks as second maintainer due to inactivity.  Thanks
   for all your work so far, Ben!

 - bugfixes, feature additions, devicetree fixups, simplifications...

* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
  i2c: xiic: must always write 16-bit words to TX_FIFO
  i2c: octeon: use HZ in timeout value
  i2c: octeon: Fix i2c fail problem when a process is terminated by a signal
  i2c: designware-pci: drop superfluous {get|put}_device
  i2c: designware-plat: drop superfluous {get|put}_device
  i2c: davinci: drop superfluous {get|put}_device
  MAINTAINERS: Ben Dooks is inactive regarding I2C
  i2c: mux: Add i2c-arb-gpio-challenge 'mux' driver
  i2c: at91: convert to dma_request_slave_channel_compat()
  i2c: mxs: do error checking and handling in PIO mode
  i2c: mxs: remove races in PIO code
  i2c-designware: switch to use runtime PM autosuspend
  i2c-designware: use usleep_range() in the busy-loop
  i2c-designware: enable/disable the controller properly
  i2c-designware: use dynamic adapter numbering on Lynxpoint
  i2c-designware-pci: use managed functions pcim_* and devm_*
  i2c-designware-pci: use dev_err() instead of printk()
  i2c-designware: move to managed functions (devm_*)
  i2c: remove CONFIG_HOTPLUG ifdefs
  i2c: s3c2410: Add SMBus emulation for block read
  ...
2013-05-02 14:38:53 -07:00
Linus Torvalds 73287a43cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights (1721 non-merge commits, this has to be a record of some
  sort):

   1) Add 'random' mode to team driver, from Jiri Pirko and Eric
      Dumazet.

   2) Make it so that any driver that supports configuration of multiple
      MAC addresses can provide the forwarding database add and del
      calls by providing a default implementation and hooking that up if
      the driver doesn't have an explicit set of handlers.  From Vlad
      Yasevich.

   3) Support GSO segmentation over tunnels and other encapsulating
      devices such as VXLAN, from Pravin B Shelar.

   4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

   5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
      Dukkipati.

   6) In the PHY layer, allow supporting wake-on-lan in situations where
      the PHY registers have to be written for it to be configured.

      Use it to support wake-on-lan in mv643xx_eth.

      From Michael Stapelberg.

   7) Significantly improve firewire IPV6 support, from YOSHIFUJI
      Hideaki.

   8) Allow multiple packets to be sent in a single transmission using
      network coding in batman-adv, from Martin Hundebøll.

   9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

  10) Generalize the VXLAN forwarding tables so that there is more
      flexibility in configurating various aspects of the endpoints.
      From David Stevens.

  11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
      from Dmitry Kravkov.

  12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
      Neira Ayuso.

  13) Start adding networking selftests.

  14) In situations of overload on the same AF_PACKET fanout socket, or
      per-cpu packet receive queue, minimize drop by distributing the
      load to other cpus/fanouts.  From Willem de Bruijn and Eric
      Dumazet.

  15) Add support for new payload offset BPF instruction, from Daniel
      Borkmann.

  16) Convert several drivers over to mdoule_platform_driver(), from
      Sachin Kamat.

  17) Provide a minimal BPF JIT image disassembler userspace tool, from
      Daniel Borkmann.

  18) Rewrite F-RTO implementation in TCP to match the final
      specification of it in RFC4138 and RFC5682.  From Yuchung Cheng.

  19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
      you like netlink, so I implemented netlink dumping of netlink
      sockets.") From Andrey Vagin.

  20) Remove ugly passing of rtnetlink attributes into rtnl_doit
      functions, from Thomas Graf.

  21) Allow userspace to be able to see if a configuration change occurs
      in the middle of an address or device list dump, from Nicolas
      Dichtel.

  22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
      Frederic Sowa.

  23) Increase accuracy of packet length used by packet scheduler, from
      Jason Wang.

  24) Beginning set of changes to make ipv4/ipv6 fragment handling more
      scalable and less susceptible to overload and locking contention,
      from Jesper Dangaard Brouer.

  25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
      instead.  From Hong Zhiguo.

  26) Optimize route usage in IPVS by avoiding reference counting where
      possible, from Julian Anastasov.

  27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

  28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
      Eitzenberger.

  29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
      nfnetlink_log, and nfnetlink_queue.  From Gao feng.

  30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

  31) Support several new r8169 chips, from Hayes Wang.

  32) Support tokenized interface identifiers in ipv6, from Daniel
      Borkmann.

  33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

  34) Add 802.1ad vlan offload support, from Patrick McHardy.

  35) Support mmap() based netlink communication, also from Patrick
      McHardy.

  36) Support HW timestamping in mlx4 driver, from Amir Vadai.

  37) Rationalize AF_PACKET packet timestamping when transmitting, from
      Willem de Bruijn and Daniel Borkmann.

  38) Bring parity to what's provided by /proc/net/packet socket dumping
      and the info provided by netlink socket dumping of AF_PACKET
      sockets.  From Nicolas Dichtel.

  39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
      Poirier"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
  filter: fix va_list build error
  af_unix: fix a fatal race with bit fields
  bnx2x: Prevent memory leak when cnic is absent
  bnx2x: correct reading of speed capabilities
  net: sctp: attribute printl with __printf for gcc fmt checks
  netlink: kconfig: move mmap i/o into netlink kconfig
  netpoll: convert mutex into a semaphore
  netlink: Fix skb ref counting.
  net_sched: act_ipt forward compat with xtables
  mlx4_en: fix a build error on 32bit arches
  Revert "bnx2x: allow nvram test to run when device is down"
  bridge: avoid OOPS if root port not found
  drivers: net: cpsw: fix kernel warn on cpsw irq enable
  sh_eth: use random MAC address if no valid one supplied
  3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
  tg3: fix to append hardware time stamping flags
  unix/stream: fix peeking with an offset larger than data in queue
  unix/dgram: fix peeking with an offset larger than data in queue
  unix/dgram: peek beyond 0-sized skbs
  openvswitch: Remove unneeded ovs_netdev_get_ifindex()
  ...
2013-05-01 14:08:52 -07:00
Linus Torvalds 5d434fcb25 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual stuff, mostly comment fixes, typo fixes, printk fixes and small
  code cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (45 commits)
  mm: Convert print_symbol to %pSR
  gfs2: Convert print_symbol to %pSR
  m32r: Convert print_symbol to %pSR
  iostats.txt: add easy-to-find description for field 6
  x86 cmpxchg.h: fix wrong comment
  treewide: Fix typo in printk and comments
  doc: devicetree: Fix various typos
  docbook: fix 8250 naming in device-drivers
  pata_pdc2027x: Fix compiler warning
  treewide: Fix typo in printks
  mei: Fix comments in drivers/misc/mei
  treewide: Fix typos in kernel messages
  pm44xx: Fix comment for "CONFIG_CPU_IDLE"
  doc: Fix typo "CONFIG_CGROUP_CGROUP_MEMCG_SWAP"
  mmzone: correct "pags" to "pages" in comment.
  kernel-parameters: remove outdated 'noresidual' parameter
  Remove spurious _H suffixes from ifdef comments
  sound: Remove stray pluses from Kconfig file
  radio-shark: Fix printk "CONFIG_LED_CLASS"
  doc: put proper reference to CONFIG_MODULE_SIG_ENFORCE
  ...
2013-04-30 09:36:50 -07:00
David S. Miller 58717686cf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
	drivers/net/ethernet/emulex/benet/be.h
	include/net/tcp.h
	net/mac802154/mac802154.h

Most conflicts were minor overlapping stuff.

The be2net driver brought in some fixes that added __vlan_put_tag
calls, which in net-next take an additional argument.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-30 03:55:20 -04:00
Ben Hutchings 89cc80a44b sfc: Fix naming of MTD partitions for FPGA bitfiles
efx_mcdi_get_board_cfg() uses a buffer for the firmware response that
is only large enough to hold subtypes for the originally defined set
of NVRAM partitions.  Longer responses are truncated, and we may read
off the end of the buffer when copying out subtypes for additional
partitions.  In particular, this can result in the MTD partition for
an FPGA bitfile being named e.g. 'eth5 sfc_fpga:00' when it should be
'eth5 sfc_fpga:01'.  This means the firmware update tool (sfupdate)
can't tell which bitfile should be written to the partition.

Correct the response buffer size.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-25 01:37:00 -04:00
Lars-Peter Clausen bf51a8c5e0 i2c: Ignore return value of i2c_del_adapter()
i2c_del_adapter() always returns 0. So all checks testing whether it will be
non zero will always evaluate to false and the conditional code is dead code.
This patch updates all callers of i2c_del_mux_adapter() to ignore the return
value and assume that it will always succeed (which it will). In a subsequent
patch the return type of i2c_del_adapter() will be made void.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2013-04-02 07:06:03 +02:00
David S. Miller 61816596d1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull in the 'net' tree to get Daniel Borkmann's flow dissector
infrastructure change.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:46:26 -04:00
Jiri Kosina aa1262b387 Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply patch to the newly
added ITG-3200 driver.
2013-03-18 10:57:57 +01:00
Paul Bolle 806b2139db sfc: Fix Kconfig typo "----help---"
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-03-18 10:50:23 +01:00
stephen hemminger debd0034de sfc: make local functions static
Trivial sparse detected functions that should be static.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17 14:26:40 -04:00
Joe Perches 1f9061d27d drivers:net: dma_alloc_coherent: use __GFP_ZERO instead of memset(, 0)
Reduce the number of calls required to alloc
a zeroed block of memory.

Trivially reduces overall object size.

Other changes around these removals
o Neaten call argument alignment
o Remove an unnecessary OOM message after dma_alloc_coherent failure
o Remove unnecessary gfp_t stack variable

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17 12:50:24 -04:00
Wei Yongjun df594563fc sfc: remove duplicated include from efx.c
Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-13 11:31:57 -04:00
Ben Hutchings fae8563b25 sfc: Only use TX push if a single descriptor is to be written
Using TX push when notifying the NIC of multiple new descriptors in
the ring will very occasionally cause the TX DMA engine to re-use an
old descriptor.  This can result in a duplicated or partly duplicated
packet (new headers with old data), or an IOMMU page fault.  This does
not happen when the pushed descriptor is the only one written.

TX push also provides little latency benefit when a packet requires
more than one descriptor.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-12 17:07:56 +00:00
Daniel Pieczko 1648a23fa1 sfc: allocate more RX buffers per page
Allocating 2 buffers per page is insanely inefficient when MTU is 1500
and PAGE_SIZE is 64K (as it usually is on POWER).  Allocate as many as
we can fit, and choose the refill batch size at run-time so that we
still always use a whole page at once.

[bwh: Fix loop condition to allow for compound pages; rebase]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:15 +00:00
Ben Hutchings 179ea7f039 sfc: Replace efx_rx_is_last_buffer() with a flag
This condition is brittle and we have lots of flags to spare.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:14 +00:00
Daniel Pieczko 2768935a46 sfc: reuse pages to avoid DMA mapping/unmapping costs
On POWER systems, DMA mapping/unmapping operations are very expensive.
These changes reduce these costs by trying to reuse DMA mapped pages.

After all the buffers associated with a page have been processed and
passed up, the page is placed into a ring (if there is room).  For
each page that is required for a refill operation, a page in the ring
is examined to determine if its page count has fallen to 1, ie. the
kernel has released its reference to these packets.  If this is the
case, the page can be immediately added back into the RX descriptor
ring, without having to re-map it for DMA.

If the kernel is still holding a reference to this page, it is removed
from the ring and unmapped for DMA.  Then a new page, which can
immediately be used by RX buffers in the descriptor ring, is allocated
and DMA mapped.

The time a page needs to spend in the recycle ring before the kernel
has released its page references is based on the number of buffers
that use this page.  As large pages can hold more RX buffers, the RX
recycle ring can be shorter.  This reduces memory usage on POWER
systems, while maintaining the performance gain achieved by recycling
pages, following the driver change to pack more than two RX buffers
into large pages.

When an IOMMU is not present, the recycle ring can be small to reduce
memory usage, since DMA mapping operations are inexpensive.

With a small recycle ring, attempting to refill the descriptor queue
with more buffers than the equivalent size of the recycle ring could
ultimately lead to memory leaks if page entries in the recycle ring
were overwritten.  To prevent this, the check to see if the recycle
ring is full is changed to check if the next entry to be written is
NULL.

[bwh: Combine and rebase several commits so this is complete
 before the following buffer-packing changes.  Remove module
 parameter.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:13 +00:00
Ben Hutchings 85740cdf0b sfc: Enable RX DMA scattering where possible
Enable RX DMA scattering iff an RX buffer large enough for the current
MTU will not fit into a single page and the NIC supports DMA
scattering for kernel-mode RX queues.

On Falcon and Siena, the RX_USR_BUF_SIZE field is used as the DMA
limit for both all RX queues with scatter enabled.  Set it to 1824,
matching what Onload uses now.

Maintain a statistic for frames truncated due to lack of descriptors
(rx_nodesc_trunc).  This is distinct from rx_frm_trunc which may be
incremented when scattering is disabled and implies an over-length
frame.

Whenever an MTU change causes scattering to be turned on or off,
update filters that point to the PF queues, but leave others
unchanged, as VF drivers assume scattering is off.

Add n_frags parameters to various functions, and make them iterate:
- efx_rx_packet()
- efx_recycle_rx_buffers()
- efx_rx_mk_skb()
- efx_rx_deliver()

Make efx_handle_rx_event() responsible for updating
efx_rx_queue::removed_count.

Change the RX pipeline state to a starting ring index and number of
fragments, and make __efx_rx_packet() responsible for clearing it.

Based on earlier versions by David Riddoch and Jon Cooper.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:12 +00:00
Ben Hutchings b74e3e8cd6 sfc: Update RX buffer address together with length
Adjust rx_buf->page_offset when we eat the RX hash prefix.  Remove
efx_rx_buf_offset(), which is now redundant.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:11 +00:00
Ben Hutchings 5036b7c7b9 sfc: Explicitly prefetch RX hash prefix, not just Ethernet heade
Currently we prefetch from the Ethernet header, but we will also read
the hash prefix.  In practice they should be in the same cache line
and this won't hurt, but it is still pointless to add on the hash
prefix size.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:10 +00:00
Ben Hutchings b184f16b7f sfc: Replace efx_rx_buf_eh() with simpler efx_rx_buf_va()
efx_rx_buf_va() returns the virtual address of the current start of
the buffer.  The callers must add the hash prefix size themselves.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:09 +00:00
Ben Hutchings ff734ef4bc sfc: Wrap __efx_rx_packet() with efx_rx_flush_packet()
The pipeline mechanism will need to change a bit for scattered
packets.  Add a wrapper to insulate efx_process_channel() from this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:08 +00:00
Ben Hutchings 9bc2fc9b52 sfc: Make RX queue descriptor counts unsigned for consistency
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:07 +00:00
Ben Hutchings 272baeeb6a sfc: Properly distinguish RX buffer and DMA lengths
Replace efx_nic::rx_buffer_len with efx_nic::rx_dma_len, the maximum
RX DMA length.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:06 +00:00
Ben Hutchings 80c2e716d5 sfc: Document current usage of efx_rx_buffer::len and efx_nic::rx_buffer_len
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:05 +00:00
Alexandre Rames 626950db84 sfc: Add AER and EEH support for Siena
The Linux side of EEH is triggered by MMIO reads, but this
driver's data path does not issue any MMIO reads (except in
legacy interrupt mode).  Therefore add a monitor function
to poll EEH periodically.

When preparing to reset the device based on our own error
detection, also poll EEH and defer to its recovery mechanism
if appropriate.

[bwh: Use a separate condition for the initial link poll; fix some
 style errors]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:04 +00:00
Ben Hutchings 634ab72c39 sfc: Disable RSS when using SR-IOV and only 1 RX queue on the PF
On Siena, VFs share RSS configuration with the PF.  We attempted to
support configurations where the PF only uses 1 RX queue and VFs use
multiple RX queues, by (1) setting up RSS for the number of RX queues
per VF (2) disabling RSS in the PF's RX default filters.

Unfortunately commit cd2d5b529c ('sfc: Add SR-IOV back-end support
for SFC9000 family') only included (1).  This is (2).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:03 +00:00
Ben Hutchings d9ccfdd4b3 sfc: Fix replacement detection in efx_filter_insert_filter()
efx_filter_insert_filter() uses the first table entry in the hash chain
that either has the same match values or is empty.  This means that
replacement doesn't always work correctly:

1. Insert filter F1 with match values M1, hashing to H1, at first
   possible entry E1.
2. Insert filter F2 with match values M2, hashing to H1, at second
   possible entry E2.
3. Remove filter F1.
4. Insert filter F3 with match values M2, hashing to H1, at first
   possible entry E1.

F3 should have either replaced F2 or been rejected (depending on
priority and the replace_equal parameter).

Instead, search for both a matching filter that the inserted filter
would replace, and an available insertion point, up to the applicable
maximum search depths.  If we insert at lower depth than a replaced
filter, clear the replaced filter.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:02 +00:00
Ben Hutchings 297891cecc sfc: Merge efx_filter_search() into efx_filter_insert()
efx_filter_search() is only called from efx_filter_insert(), and
neither function is very long.  The following bug fix requires a more
sophisticated search with a third result, which is going to be easier
to implement as part of the same function.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:01 +00:00
Ben Hutchings 385904f819 sfc: Don't use efx_filter_{build,hash,increment}() for default MAC filters
These functions happen to work for default MAC filters: they generate
an initial index of 1/0 for unicast/multicast respectively and an
increment of 1 for either, so a search succeeds at depth 2.  But this
is a matter of luck rather than design, and it really won't work well
with the bug fix we're about to do.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:22:00 +00:00
Ben Hutchings e3a699fab3 sfc: Remove redundant parameter to efx_filter_search()
The 'for_insert' parameter is redundant since there are no longer
any other operations that need to search based on a filter spec.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:59 +00:00
Ben Hutchings 7de07a4deb sfc: More sensible semantics for efx_filter_insert_filter() replace flag
The 'replace' flag to efx_filter_insert_filter() controls whether the
new filter may replace *any* filter, and is checked even before
priority comparison.  But lower-priority filters should never
block insertion of higher-priority filters.

Change the priority checking so that lower-priority filters are
replaced regardless of the value of the flag, and rename the
flag to 'replace_equal'.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:58 +00:00
Alexandre Rames 97d48a10c6 sfc: Remove rx_alloc_method SKB
[bwh: Remove more dead code, and make efx_ptp_rx() pull the data it
 needs into the header area.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:57 +00:00
Laurence Evans 9230451af9 sfc: tidy up PTP synchronize function efx_ptp_process_times()
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:56 +00:00
Laurence Evans c939a31645 sfc: PTP changes to support improved UUID filtering mode
There is a long-standing problem with the packet-timestamp matching in
the driver. When a PTP packet is received by the MC, the FPGA
timestamps the packet and the MC sends the timestamp and 6 bytes of
the UUID to the driver. The driver then matches the timestamp against
received packets using the same 6 bytes of UUID.

The problem comes from the choice of which 6 bytes to use. The PTP
spec is slightly contradictory and misleading in one of the two places
where the UUIDs are discussed. From section 7.2.2.2 of the spec, a
PTPD2 UUID can be either a EUI-64 or a EUI-64 constructed from a
EUI-48. The typical ethernet based implementation uses a EUI-64
constructed from a EUI-48. This works by taking the first 3 bytes of
the MAC address of the NIC being used for PTP (the OUI), then
inserting 0xFF, 0xFE, then taking the last 3 bytes of the MAC address
giving
          MAC[0], MAC[1], MAC[2], 0xFF, 0xFE, MAC[3], MAC[4], MAC[5]
The current MC firmware and driver discard the first two bytes of this
UUID and packets are matched against timestamps using bytes 2 to 7 so
there is a small risk that in a deployment of Solarflare PTP NICs used
with other vendors NICs, that a PTP packet could be matched against
the wrong timestamp. This applies to all other organisations whose
third byte of the OUI is 0x53. It's a long list but I notice that it
includes Cisco.

The necessary modifications to use bytes 0-2 and 5-7 of the UUID to
match against are quite small but introduce incompatibility between
older version of the firmware and driver.

When PTP is enabled via SO_TIMESTAMPING specifying PTP V2, the driver
will try to enable PTP in the firmware using the enhanced mode
(above). If the firmware returns an error, the driver will enable PTP
in the firmware using the old mode.

[bwh: Fix some style errors; remove private ioctl bits]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:55 +00:00
Ben Hutchings 4a74dc65e3 sfc: Allow efx_channel_type::receive_skb() to reject a packet
Instead of having efx_ptp_rx() call netif_receive_skb() for an invalid
PTP packet, make it return false for rejected packets and have
efx_rx_deliver() pass them up.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-07 20:21:54 +00:00
Ben Hutchings c73e787a8d sfc: Correct efx_rx_buffer::page_offset when EFX_PAGE_IP_ALIGN != 0
RX DMA buffers start at an offset of EFX_PAGE_IP_ALIGN bytes from the
start of a cache line.  This offset obviously needs to be included in
the virtual address, but this was missed in commit b590ace09d
('sfc: Fix efx_rx_buf_offset() in the presence of swiotlb') since
EFX_PAGE_IP_ALIGN is equal to 0 on both x86 and powerpc.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-06 17:57:25 +00:00
Ben Hutchings 35205b211c sfc: Disable soft interrupt handling during efx_device_detach_sync()
efx_device_detach_sync() locks all TX queues before marking the device
detached and thus disabling further TX scheduling.  But it can still
be interrupted by TX completions which then result in TX scheduling in
soft interrupt context.  This will deadlock when it tries to acquire
a TX queue lock that efx_device_detach_sync() already acquired.

To avoid deadlock, we must use netif_tx_{,un}lock_bh().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-03-06 17:57:24 +00:00
Ben Hutchings 29c69a4882 sfc: Detach net device when stopping queues for reconfiguration
We must only ever stop TX queues when they are full or the net device
is not 'ready' so far as the net core, and specifically the watchdog,
is concerned.  Otherwise, the watchdog may fire *immediately* if no
packets have been added to the queue in the last 5 seconds.

The device is ready if all the following are true:

(a) It has a qdisc
(b) It is marked present
(c) It is running
(d) The link is reported up

(a) and (c) are normally true, and must not be changed by a driver.
(d) is under our control, but fake link changes may disturb userland.
This leaves (b).  We already mark the device absent during reset
and self-test, but we need to do the same during MTU changes and ring
reallocation.  We don't need to do this when the device is brought
down because then (c) is already false.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-02-26 15:00:46 +00:00
Ben Hutchings b590ace09d sfc: Fix efx_rx_buf_offset() in the presence of swiotlb
We assume that the mapping between DMA and virtual addresses is done
on whole pages, so we can find the page offset of an RX buffer using
the lower bits of the DMA address.  However, swiotlb maps in units of
2K, breaking this assumption.

Add an explicit page_offset field to struct efx_rx_buffer.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-02-26 14:57:16 +00:00
Ben Hutchings 3a68f19d7a sfc: Properly sync RX DMA buffer when it is not the last in the page
We may currently allocate two RX DMA buffers to a page, and only unmap
the page when the second is completed.  We do not sync the first RX
buffer to be completed; this can result in packet loss or corruption
if the last RX buffer completed in a NAPI poll is the first in a page
and is not DMA-coherent.  (In the middle of a NAPI poll, we will
handle the following RX completion and unmap the page *before* looking
at the content of the first buffer.)

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-02-26 14:55:49 +00:00
Julia Lawall 56567c6f87 drivers/net/ethernet/sfc/ptp.c: adjust duplicate test
Delete successive tests to the same location.  rc was previously tested and
not subsequently updated.  efx_phc_adjtime can return an error code, so the
call is updated so that is tested instead.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@s exists@
local idexpression y;
expression x,e;
@@

*if ( \(x == NULL\|IS_ERR(x)\|y != 0\) )
 { ... when forall
   return ...; }
... when != \(y = e\|y += e\|y -= e\|y |= e\|y &= e\|y++\|y--\|&y\)
    when != \(XT_GETPAGE(...,y)\|WMI_CMD_BUF(...)\)
*if ( \(x == NULL\|IS_ERR(x)\|y != 0\) )
 { ... when forall
   return ...; }
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21 15:44:58 -05:00
Greg Kroah-Hartman 1dd06ae8db drivers/net: fix up function prototypes after __dev* removals
The __dev* removal patches for the network drivers ended up messing up
the function prototypes for a bunch of drivers.  This patch fixes all of
them back up to be properly aligned.

Bonus is that this almost removes 100 lines of code, always a nice
surprise.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-07 14:22:22 -05:00
Bill Pemberton 87d1fc1130 sfc: remove __dev* attributes
CONFIG_HOTPLUG is going away as an option.  As result the __dev*
markings will be going away.

Remove use of __devinit, __devexit_p, __devinitdata, __devinitconst,
and __devexit.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-03 11:16:46 -08:00
Ben Hutchings b9cc977d9d sfc: Make module parameters really boolean
Most of the module parameters treated as boolean are currently exposed
as type int or uint.  Defining them with the proper type is useful
documentation for both users and developers.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-12-01 02:37:37 +00:00
Ben Hutchings ebf98e797b sfc: Fix timekeeping in efx_mcdi_poll()
efx_mcdi_poll() uses get_seconds() to read the current time and to
implement a polling timeout.  The use of this function was chosen
partly because it could easily be replaced in a co-sim environment
with a macro that read the simulated time.

Unfortunately the real get_seconds() returns the system time (real
time) which is subject to adjustment by e.g. ntpd.  If the system time
is adjusted forward during a polled MCDI operation, the effective
timeout can be shorter than the intended 10 seconds, resulting in a
spurious failure.  It is also possible for a backward adjustment to
delay detection of a areal failure.

Use jiffies instead, and change MCDI_RPC_TIMEOUT to be denominated in
jiffies.  Also correct rounding of the timeout: check time > finish
(or rather time_after(time, finish)) and not time >= finish.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-12-01 02:37:36 +00:00
Daniel Pieczko c2f3b8e3a4 sfc: lock TX queues when calling netif_device_detach()
The assertion of netif_device_present() at the top of
efx_hard_start_xmit() may fail if we don't do this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-12-01 02:37:35 +00:00
Daniel Pieczko 525d9e8240 sfc: Work-around flush timeout when flushes have completed
We sometimes hit a "failed to flush" timeout on some TX queues, but the
flushes have completed and the flush completion events seem to go missing.
In this case, we can check the TX_DESC_PTR_TBL register and drain the
queues if the flushes had finished.

[bwh: Minor fixes to coding style]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-12-01 02:37:27 +00:00