If the RTNL is held when we invoke flush_scheduled_work() we could
deadlock. One such case is linkwatch, it is a work struct which tries
to grab the RTNL semaphore.
The most common case are net driver ->stop() methods. The
simplest conversion is to instead use cancel_{delayed_}work_sync()
explicitly on the various work struct the driver uses.
This is an OK transformation because these work structs are doing
things like resetting the chip, restarting link negotiation, and so
forth. And if we're bringing down the device, we're about to turn the
chip off and reset it anways. So if we cancel a pending work event,
that's fine here.
Some drivers were working around this deadlock by using a msleep()
polling loop of some sort, and those cases are converted to instead
use cancel_{delayed_}work_sync() as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
Because of some board issues, we need to disable parallel detect on
an HP blade. Without this patch, the link state can become stuck
when it goes into parallel detect mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the programmable high/low water marks in 5709 for
802.3 flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CTX_WR macro is unnecessary and obfuscates the code.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The REG_WR_IND/REG_RD_IND macros are unnecessary and obfuscate the
code. Many callers to these macros read and write shared memory from
the bp->shmem_base, so we add 2 similar functions that automatically
add the shared memory base.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the tx coalescing setup code independent of the MSIX vector.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Correct the MII expansion serdes control register definition.
2. Check an additional RUDI_INVALID bit when determining 5706S link.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prefix "bp->phy_flags" names with BNX2_PHY_FLAG_* for consistency.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some blade systems using the 5706 serdes, the hardware sometimes
does not properly generate link down interrupts. We add a workaround
in the driver's timer to force a link-down when some PHY registers
report loss of SYNC.
The parallel detect logic is cleaned up slightly to better integrate
the workaround.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The chip has problem running in this mode and needs to be disabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable new tx ring and add new MSIX handler and NAPI poll function
for the new tx ring. Enable MSIX when the hardware supports it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To separate TX IRQs into a different MSIX vector, we need to
support a new tx ring. The original tx ring will still be used
when not using MSIX.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change bnx2_napi struct into an array and add code to manage multiple
IRQs. MSIX hardware structures and new registers are also added.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rx related fields used in NAPI polling are moved from the main
bnx2 struct to the bnx2_napi struct.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tx related fields used in NAPI polling are moved from the main
bnx2 struct to the bnx2_napi struct.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a bnx2_napi structure that will hold a napi_struct and
other fields to handle NAPI polling for the napi_struct. Various tx
and rx indexes and status block pointers will be moved from the main
bnx2 structure to this bnx2_napi structure.
Most NAPI path functions are modified to be passed this bnx2_napi
struct pointer.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a table to keep track of multiple IRQs and restructure the IRQ
request and free functions so that they can be easily expanded to
handle multiple IRQs.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add function to reuse a page in case of allocation or other errors.
Add code to construct the completed SKB with the additional data in
the pages.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new fields to keep track of the pages and the page rings.
Add functions to allocate and free pages.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Factor out the common functions that will be used to initialize the
normal RX rings and the page rings.
Change the copybreak constant RX_COPY_THRESH to 128. This same
constant will be used for the max. size of the linear SKB when pages
are used. Copybreak will be turned off when pages are used.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define the various ring constants to make the code cleaner.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets can be left in the RX ring if the NAPI budget is reached.
This is caused by storing the latest rx index at the beginning of
bnx2_rx_int(). We may not process all the work up to this index
if the budget is reached and so some packets in the RX ring may rot
when we later check for more work using this stored rx index.
The fix is to not store this latest hw index and only store the
processed rx index. We use a new function bnx2_get_hw_rx_cons()
to fetch the latest hw rx index.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the default WoL setting to match the NVRAM's setting. It
always defaulted to WoL disabled before and caused a lot of confusion
for users.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a follow up to the patches from Denys Vlasenkos
<vda.linux@googlemail.com> to further optimize firmware loading.
1. In bnx2_init_cpus(), we allocate memory for decompression once
and use it repeatedly instead of doing this for every firmware image.
2. We eliminate the BSS and SBSS firmware sections in bnx2_fw*.h since
these are always zeros.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch modifies gzip unpacking code in bnx2 driver so that
it does not depend on bnx2 internals. I will move this code
out of the driver and into zlib in follow-on patch.
It can be useful in other drivers which need to store firmwares
or any other relatively big binary blobs - fonts, cursor bitmaps,
whatever.
Patch is run tested by Michael Chan (driver author).
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.
In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.
The signature of the ->poll() call back goes from:
int foo_poll(struct net_device *dev, int *budget)
to
int foo_poll(struct napi_struct *napi, int budget)
The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract). The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.
The napi_struct is to be embedded in the device driver private data
structures.
Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler. Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.
With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.
Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.
[ Ported to current tree and all drivers converted. Integrated
Stephen's follow-on kerneldoc additions, and restored poll_list
handling to the old style to fix mutual exclusion issues. -DaveM ]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NVRAM interface is slightly modified on the 5709. To properly
support it, we need to change the buffered flag in the flash data
structure into multiple flags to indicate buffered operation, address
translation, and the use of write enable (WREN). The 5709 flash
only requires the buffered operation bit to be set.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In addition to the periodic heartbeat, we're adding a heartbeat
request interrupt when the heartbeat is late. This is needed during
netpoll where the timer is not available. -rt kernels will also
benefit since the timer is not as accurate.
[ We discussed this patch last time and we decided that the -rt
kernel problem alone did not justify this patch. I think the
netpoll problem makes this patch necessary. ]
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new fields in struct bnx2 and other bit definitions in shared
memory to support remote PHY.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing code to enable DMA on 5709 A1. The bit is a no-op on A0
and therefore can be set on all 5709 chips.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the code to print PCI or PCIE bus information for all devices.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 supports the one-shot MSI handler similar to some of the tg3
chips. In this mode, the MSI disables itself automatically until it
is re-enabled at the end of NAPI poll.
Put the request_irq/free_irq logic in common procedures.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indirect register access method will be used by more than one
caller in BH context (NAPI poll and timer), so a spinlock is required.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add PCI ID and code to support the 5709 Serdes PHY.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 Serdes device uses non-standard MII register offsets. This
re-structuring will make it easier to support 5709 Serdes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The internal PCIE-to-PCIX bridge of the 5708 has the same 40-bit DMA
limitation as some of the tg3 chips. Set dma_mask and persistent DMA
mask to 40-bit to workaround.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tweak a register setting to prevent the tx mailbox from halting.
Update version to 1.5.8.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5709 A0 copper devices will not link up with some link partners
without this workaround.
Update driver to 1.5.5.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-organize the firmware handling code and declarations a bit to make
the code more compact.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add code to parallel detect 1Gbps and 2.5Gbps link speeds.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Add support for 2.5Gbps forced speed setting.
2. Remove a long udelay() loop and change to msleep().
3. Other misc. SerDes fixes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a subtle race condition between bnx2_start_xmit() and bnx2_tx_int()
similar to the one in tg3 discovered by Herbert Xu:
CPU0 CPU1
bnx2_start_xmit()
if (tx_ring_full) {
tx_lock
bnx2_tx()
if (!netif_queue_stopped)
netif_stop_queue()
if (!tx_ring_full)
update_tx_ring
netif_wake_queue()
tx_unlock
}
Even though tx_ring is updated before the if statement in bnx2_tx_int() in
program order, it can be re-ordered by the CPU as shown above. This
scenario can cause the tx queue to be stopped forever if bnx2_tx_int() has
just freed up the entire tx_ring. The possibility of this happening
should be very rare though.
The following changes are made, very much identical to the tg3 fix:
1. Add memory barrier to fix the above race condition.
2. Eliminate the private tx_lock altogether and rely solely on
netif_tx_lock. This eliminates one spinlock in bnx2_start_xmit()
when the ring is full.
3. Because of 2, use netif_tx_lock in bnx2_tx_int() before calling
netif_wake_queue().
4. Add memory barrier to bnx2_tx_avail().
5. Add bp->tx_wake_thresh which is set to half the tx ring size.
6. Check for the full wake queue condition before getting
netif_tx_lock in tg3_tx(). This reduces the number of unnecessary
spinlocks when the tx ring is full in a steady-state condition.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>