The head element of rt6_info{} is dst_entry{}, and
IPv6 specific elements follow.
Because elements at the end of dst_entry{} are frequently
updated, it is not good to put frequently-used static
elements, such as rt6i_idev, rt6i_dst or rt6i_flags in the
same cache line.
On the other hand, fib6_table, rt6i_node or rt6i_gateway are
rarely used, so it is okay to stay in the same cache line.
Let's rearrange rt6_info{}.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of my test machine got a deadlock during "tc" sessions,
adding/deleting classes & filters, using traffic estimators.
After some analysis, I believe we have a potential use after free case
in est_timer() :
spin_lock(e->stats_lock); << HERE >>
read_lock(&est_lock);
if (e->bstats == NULL) << TEST >>
goto skip;
Test is done a bit late, because after estimator is killed, and before
rcu grace period elapsed, we might already have freed/reuse memory where
e->stats_locks points to (some qdisc->q.lock)
A possible fix is to respect a rcu grace period at Qdisc dismantle time.
On 64bit, sizeof(struct Qdisc) is exactly 192 bytes. Adding 16 bytes to
it (for struct rcu_head) is a problem because it might change
performance, given QDISC_ALIGNTO is 32 bytes.
This is why I also change QDISC_ALIGNTO to 64 bytes, to satisfy most
current alignment requirements.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows arch code could decide the way to reserve the ibft.
And we should reserve ibft as early as possible, instead of BOOTMEM
stage, in case the table is in RAM range and is not reserved by BIOS
(this will often be the case.)
Move to just after find_smp_config().
Also when CONFIG_NO_BOOTMEM=y, We will not have reserve_bootmem() anymore.
-v2: fix typo about ibft pointed by Konrad Rzeszutek Wilk <konrad@darnok.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB510FB.80601@kernel.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Jones <pjones@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
CC: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (76 commits)
drm/radeon/kms: enable ACPI powermanagement mode on radeon gpus.
drm/radeon/kms: rs400/480 should set common registers.
drm/radeon/kms: add sanity check to wptr.
drm/radeon/kms/evergreen: get DP working
drm/radeon/kms: add hw_i2c module option
drm/radeon/kms: use new pre/post_xfer i2c bit algo hooks
drm/radeon/kms: disable MSI on IGP chips
drm/radeon/kms: display watermark updates (v2)
drm/radeon/kms/dp: disable training pattern on the sink at the end of link training
drm/radeon/kms: minor fixes for eDP with LCD* device tags (v2)
drm/radeon/kms/dp: remove extraneous training complete call
drm/radeon/kms/atom: minor fixes to transmitter setup
drm/radeon/kms: Only restrict BO to visible VRAM size when pinning to VRAM.
drm: fix build error when SYSRQ is disabled
drm/radeon/kms: fix macbookpro connector quirk
drm/radeon/r6xx/r7xx: further safe reg clean up
drm/radeon: bump the UMS driver version for r6xx/r7xx const buffer support
drm/radeon/kms: bump the version for r6xx/r7xx const buffer support
drm/radeon/r6xx/r7xx: CS parser fixes
drm/radeon/kms: fix some typos in r6xx/r7xx hpd setup
...
Fix up MSI-related conflicts in drivers/gpu/drm/radeon/radeon_irq_kms.c
I noticed that my KVM virtual machines were experiencing IDE
issues resulting in processes stuck on waiting for buffers to
complete.
The root cause is of course race conditions in the ancient qemu
backend that I'm using. However, the fact that the guest isn't
recovering is a bug.
I've tracked it down to the change made last year to dequeue
requests at the start rather than at the end in the IDE layer.
commit 8f6205cd57
Author: Tejun Heo <tj@kernel.org>
Date: Fri May 8 11:53:59 2009 +0900
ide: dequeue in-flight request
The problem is that the function ide_dma_timeout_retry does not
requeue the current request, causing one request to be lost for
each DMA timeout.
This patch fixes this by requeueing the request.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scheduler's task migration events don't work because they always
pass NULL regs perf_sw_event(). The event hence gets filtered
in perf_swevent_add().
Scheduler's context switches events use task_pt_regs() to get
the context when the event occured which is a wrong thing to
do as this won't give us the place in the kernel where we went
to sleep but the place where we left userspace. The result is
even more wrong if we switch from a kernel thread.
Use the hot regs snapshot for both events as they belong to the
non-interrupt/exception based events family. Unlike page faults
or so that provide the regs matching the exact origin of the event,
we need to save the current context.
This makes the task migration event working and fix the context
switch callchains and origin ip.
Example: perf record -a -e cs
Before:
10.91% ksoftirqd/0 0 [k] 0000000000000000
|
--- (nil)
perf_callchain
perf_prepare_sample
__perf_event_overflow
perf_swevent_overflow
perf_swevent_add
perf_swevent_ctx_event
do_perf_sw_event
__perf_sw_event
perf_event_task_sched_out
schedule
run_ksoftirqd
kthread
kernel_thread_helper
After:
23.77% hald-addon-stor [kernel.kallsyms] [k] schedule
|
--- schedule
|
|--60.00%-- schedule_timeout
| wait_for_common
| wait_for_completion
| blk_execute_rq
| scsi_execute
| scsi_execute_req
| sr_test_unit_ready
| |
| |--66.67%-- sr_media_change
| | media_changed
| | cdrom_media_changed
| | sr_block_media_changed
| | check_disk_change
| | cdrom_open
v2: Always build perf_arch_fetch_caller_regs() now that software
events need that too. They don't need it from modules, unlike trace
events, so we keep the EXPORT_SYMBOL in trace_event_perf.c
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Chip model can now be selected directly by matching the modalias name
(instead of filling the .model field in platform_data), and allows the
module to be auto-loaded. Previous behaviour is of course still supported.
Convert the two in-tree users to this feature (icontrol & zeus).
Tested on an Zeus platform (mcp2515).
Signed-off-by: Marc Zyngier <maz@misterjones.org>
Acked-by: Christian Pellegrin <chripell@fsfe.org>
Cc: Edwin Peer <epeer@tmtservices.co.za>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds ethtool and device feature flag to allow control
of receive hashing offload.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because we have ensured that the argument is non-zero,
it is better to use __fls() and generate better code.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build for CONFIG_MODULES not enabled by providing a stub
for is_module_percpu_address().
kernel/lockdep.c:605: error: implicit declaration of function 'is_module_percpu_address'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add CAIF Serial driver. This driver is implemented as a line discipline.
caif_serial uses the following module parameters:
ser_use_stx - specifies if STart of frame eXtension is in use.
ser_loop - sets the interface in loopback mode.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Header files for CAIF Link layer net-device,
and link-layer registration.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add include files for the CAIF Core protocol stack.
caif_layer.h - Defines the structure of the CAIF protocol layers
cfcnfg.h - CAIF Configuration Module for services and link layers
cfctrl.h - CAIF Control Protocol Layer
cffrml.h - CAIF Framing Layer
cfmuxl.h - CAIF Muxing Layer
cfpkt.h - CAIF Packet layer (skb helper functions)
cfserl.h - CAIF Serial Layer
cfsrvl.h - CAIF Service Layer
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add CAIF types for Socket Address, Socket Options,
and configuration parameters for the GPRS IP network interface.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
percpu.h has always been including slab.h to get k[mz]alloc/free() for
UP inline implementation. percpu.h being used by very low level
headers including module.h and sched.h, this meant that a lot files
unintentionally got slab.h inclusion.
Lee Schermerhorn was trying to make topology.h use percpu.h and got
bitten by this implicit inclusion. The right thing to do is break
this ultimately unnecessary dependency. The previous patch added
explicit inclusion of either gfp.h or slab.h to the source files using
them. This patch updates percpu.h such that slab.h is no longer
included from percpu.h.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
r8169: offical fix for CVE-2009-4537 (overlength frame DMAs)
ipv6: Don't drop cache route entry unless timer actually expired.
tulip: Add missing parens.
r8169: fix broken register writes
pcnet_cs: add new id
bonding: fix broken multicast with round-robin mode
drivers/net: Fix continuation lines
e1000: do not modify tx_queue_len on link speed change
net: ipmr/ip6mr: prevent out-of-bounds vif_table access
ixgbe: Do not run all Diagnostic offline tests when VFs are active
igb: use correct bits to identify if managability is enabled
benet: Fix compile warnnings in drivers/net/benet/be_ethtool.c
net: Add MSG_WAITFORONE flag to recvmmsg
e1000e: do not modify tx_queue_len on link speed change
igbvf: do not modify tx_queue_len on link speed change
ipv4: Restart rt_intern_hash after emergency rebuild (v2)
ipv4: Cleanup struct net dereference in rt_intern_hash
net: fix netlink address dumping in IPv4/IPv6
tulip: Fix null dereference in uli526x_rx_packet()
gianfar: fix undo of reserve()
...
In commit 9df93939b7 ("ext3: Use bitops to read/modify
EXT3_I(inode)->i_state") ext3 changed its internal 'i_state' variable to
use bitops for its state handling. However, unline the same ext4
change, it didn't actually change the name of the field when it changed
the semantics of it.
As a result, an old use of 'i_state' remained in fs/ext3/ialloc.c that
initialized the field to EXT3_STATE_NEW. And that does not work
_at_all_ when we're now working with individually named bits rather than
values that get masked. So the code tried to mark the state to be new,
but in actual fact set the field to EXT3_STATE_JDATA. Which makes no
sense at all, and screws up all the code that checks whether the inode
was newly allocated.
In particular, it made the xattr code unhappy, and caused various random
behavior, like apparently
https://bugzilla.redhat.com/show_bug.cgi?id=577911
So fix the initialization, and rename the field to match ext4 so that we
don't have this happen again.
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Daniel J Walsh <dwalsh@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pl061.h is using u8 type. including <linux/types.h> in pl061.h to avoid
warning.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
linux/amba/bus.h have dependencies on linux/device.h and linux/resource.h, but
it doesn't include them. We get compilation errors in our files which include
bus.h but doesn't include device.h and resource.h. This patch includes device.h
and resource.h in linux/amba/bus.h file.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linux Walleij <linux.ml.walleij@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
CONFIG_SLOW_WORK_PROC was changed to CONFIG_SLOW_WORK_DEBUG, but not in all
instances. Change the remaining instances. This makes the debugfs file
display the time mark and the owner's description again.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lockdep has custom code to check whether a pointer belongs to static
percpu area which is somewhat broken. Implement proper
is_kernel/module_percpu_address() and replace the custom code.
On UP, percpu variables are regular static variables and can't be
distinguished from them. Always return %false on UP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
Better encapsulate module static percpu area handling so that code
outsidef of CONFIG_SMP ifdef doesn't deal with mod->percpu directly
and add mod->percpu_size and record percpu_size in it. Both percpu
fields are compiled out on UP. While at it, mark mod->percpu w/
__percpu.
This is to prepare for is_module_percpu_address().
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Add new flag MSG_WAITFORONE for the recvmmsg() syscall.
When this flag is specified for a blocking socket, recvmmsg()
will only block until at least 1 packet is available. The
default behavior is to block until all vlen packets are
available. This flag has no effect on non-blocking sockets
or when used in combination with MSG_DONTWAIT.
Signed-off-by: Brandon L Black <blblack@gmail.com>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI: truncate _CRS windows with _LEN > _MAX - _MIN + 1
x86/PCI: for host bridge address space collisions, show conflicting resource
frv/PCI: remove redundant warnings
x86/PCI: remove redundant warnings
PCI: don't say we claimed a resource if we failed
PCI quirk: Disable MSI on VIA K8T890 systems
PCI quirk: RS780/RS880: work around missing MSI initialization
PCI quirk: only apply CX700 PCI bus parking quirk if external VT6212L is present
PCI: complain about devices that seem to be broken
PCI: print resources consistently with %pR
PCI: make disabled window printk style match the enabled ones
PCI: break out primary/secondary/subordinate for readability
PCI: for address space collisions, show conflicting resource
resources: add interfaces that return conflict information
PCI: cleanup error return for pcix get and set mmrbc functions
PCI: fix access of PCI_X_CMD by pcix get and set mmrbc functions
PCI: kill off pci_register_set_vga_state() symbol export.
PCI: fix return value from pcix_get_max_mmrbc()
When the cgroup freezer is used to freeze tasks we do not want to thaw
those tasks during resume. Currently we test the cgroup freezer
state of the resuming tasks to see if the cgroup is FROZEN. If so
then we don't thaw the task. However, the FREEZING state also indicates
that the task should remain frozen.
This also avoids a problem pointed out by Oren Ladaan: the freezer state
transition from FREEZING to FROZEN is updated lazily when userspace reads
or writes the freezer.state file in the cgroup filesystem. This means that
resume will thaw tasks in cgroups which should be in the FROZEN state if
there is no read/write of the freezer.state file to trigger this
transition before suspend.
NOTE: Another "simple" solution would be to always update the cgroup
freezer state during resume. However it's a bad choice for several reasons:
Updating the cgroup freezer state is somewhat expensive because it requires
walking all the tasks in the cgroup and checking if they are each frozen.
Worse, this could easily make resume run in N^2 time where N is the number
of tasks in the cgroup. Finally, updating the freezer state from this code
path requires trickier locking because of the way locks must be ordered.
Instead of updating the freezer state we rely on the fact that lazy
updates only manage the transition from FREEZING to FROZEN. We know that
a cgroup with the FREEZING state may actually be FROZEN so test for that
state too. This makes sense in the resume path even for partially-frozen
cgroups -- those that really are FREEZING but not FROZEN.
Reported-by: Oren Ladaan <orenl@cs.columbia.edu>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Cc: stable@kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: use dev_pm_ops for class pcmcia_socket_class
power: support _noirq actions on device types and classes
pcmcia: allow for four multifunction subdevices (again)
pcmcia: do not use ioports < 0x100 on x86
pd6729: Coding Style fixes
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
time: Fix accumulation bug triggered by long delay.
posix-cpu-timers: Reset expire cache when no timer is running
timer stats: Fix del_timer_sync() and try_to_del_timer_sync()
clockevents: Sanitize min_delta_ns adjustment and prevent overflows
RPS currently depends on SMP and SYSFS
Adding a CONFIG_RPS makes sense in case this requirement changes in the
future. This patch saves about 1500 bytes of kernel text in case SMP is
on but SYSFS is off.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The order of the IPv6 raw table is currently reversed, that makes impossible
to use the NOTRACK target in IPv6: for example if someone enters
ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
and if we receive fragmented packets then the first fragment will be
untracked and thus skip nf_ct_frag6_gather (and conntrack), while all
subsequent fragments enter nf_ct_frag6_gather and reassembly will never
successfully be finished.
Singed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: don't try to decode GETATTR if DELEGRETURN returned error
sunrpc: handle allocation errors from __rpc_lookup_create()
SUNRPC: Fix the return value of rpc_run_bc_task()
SUNRPC: Fix a use after free bug with the NFSv4.1 backchannel
SUNRPC: Fix a potential memory leak in auth_gss
NFS: Prevent another deadlock in nfs_release_page()
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-scmi: Provide module aliases for automatic loading
i2c-scmi: Support IBM SMBus CMI devices
acpi: Support IBM SMBus CMI devices
Document the circular buffering capabilities available in Linux.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Randy Dunlap <rdunlap@xenotime.net>
Reviewed-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the extended CSD register the CARD_TYPE is an 8-bit value of which the
upper 6 bits were reserved in JEDEC specifications prior to version 4.4.
In version 4.4 two of the reserved bits were designated for identifying
support for the newly added High-Speed Dual Data Rate. Unfortunately the
mmc_read_ext_csd() function required that the reserved bits be zero
instead of ignoring them as it should.
This patch makes mmc_read_ext_csd() ignore the CARD_TYPE bits that are
reserved or not yet supported. It also stops the function jumping to the
end as though an error occurred, when it is only warns that the CARD_TYPE
bits (that it does interpret) are invalid.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 57fe60df ("reiserfs: add atomic addition of selinux attributes
during inode creation") contains a bug that will cause it to oops when
mounting a file system that didn't previously contain extended attributes
on a system using security.* xattrs.
The issue is that while creating the privroot during mount
reiserfs_security_init calls reiserfs_xattr_jcreate_nblocks which
dereferences the xattr root. The xattr root doesn't exist, so we get an
oops.
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=15309
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/linux/kfifo.h first defines and then undefines __kfifo_initializer
which is used by INIT_KFIFO (which is also a macro, so building a module
which uses INIT_KFIFO will fail).
Signed-off-by: David Härdeman <david@hardeman.nu>
Acked-by: Stefani Seibold <stefani@seibold.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In nl80211.h, be a little more elaborate in the docs for the definitions
NL80211_ATTR_CQM_RSSI_THOLD and NL80211_ATTR_CQM_RSSI_HYST.
Reported-by: Luis R. Rodriguez <mcgrof@gmail.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add support for the set_cqm_config op. This op function configures the
requested connection quality monitor rssi threshold and rssi hysteresis
values to the hardware if the hardware supports
IEEE80211_HW_SUPPORTS_CQM.
For unsupported hardware, currently -EOPNOTSUPP is returned, so the mac80211
is currently not doing connection quality monitoring on the host. This could be
added later, if needed.
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add support for basic configuration of a connection quality monitoring to the
nl80211 interface, and basic support for notifying about triggered monitoring
events.
Via this interface a user-space connection manager may configure and receive
pre-warning events of deteriorating WLAN connection quality, and start
preparing for roaming in advance, before the connection is already lost.
An example usage of such a trigger is starting scanning for nearby AP's in
an attempt to find one with better connection quality, and associate to it
before the connection characteristics of the existing connection become too bad
or the association is even lost, leading in a prolonged delay in connectivity.
The interface currently supports only RSSI, but it could be later extended
to include other parameters, such as signal-to-noise ratio, if need for that
arises.
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The dma map fields in the skb_shared_info structure no longer has any users
and can be dropped since it is making the skb_shared_info unecessarily larger.
Running slabtop show that we were using 4K slabs for the skb->head on x86_64 w/
an allocation size of 1522. It turns out that the dma_head and dma_maps array
made skb_shared large enough that we had crossed over the 2k boundary with
standard frames and as such we were using 4k blocks of memory for all skbs.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some old IBM workstations and desktop computers, the BIOS presents in the
DSDT an SMBus object that is missing the HID identifier that the i2c-scmi
driver looks for. Modify the ACPI device scan code to insert the missing HID
if it finds an IBM system with such an object.
Affected machines: IntelliStation Z20/Z30. Note that the i2c-i801 driver no
longer works on these machines because of ACPI resource conflicts.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Instead of requiring PCMCIA socket drivers to call various functions
during their (bus) resume and suspend functions, register an own
dev_pm_ops for this class. This fixes several suspend/resume bugs
seen on db1xxx-ss, and probably on some other socket drivers, too.
With regard to the asymmetry with only _noirq suspend, but split up
resume, please see bug 14334 and commit 9905d1b411 .
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
This macro worked only when applied to variables named 'kobj'.
While this could have been fixed by simply renaming the macro argument,
a more type-safe replacement by an inline function would be preferred.
However, nobody uses this macro, so it's simpler to just remove it.
Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch is based on a RFC patch by Kalle Valo.
The wl1271 has a feature which handles the connection monitor logic
in hardware, basically sending periodically nullfunc frames and reporting
to the host if AP is lost, after attempting to recover by sending
probe-requests to the AP.
Add support to mac80211 by adding a new flag IEEE80211_HW_CONNECTION_MONITOR
which prevents conn_mon_timer from triggering during idle periods, and
prevents sending probe-requests to the AP if beacon-loss is indicated by the
hardware.
Cc: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
These two #defines use the same value, but
SIOCIWFIRST makes more sense in this use.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
request_resource() and insert_resource() only return success or failure,
which no information about what existing resource conflicted with the
proposed new reservation. This patch adds request_resource_conflict()
and insert_resource_conflict(), which return the conflicting resource.
Callers may use this for better error messages or to adjust the new
resource and retry the request.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 45575f5a42 ("ppc64 sys_ipc breakage in 2.6.34-rc2") fixed the
definition of the sys_ipc() helper, but didn't fix the prototype in
<linux/syscalls.h>
Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (38 commits)
ip_gre: include route header_len in max_headroom calculation
if_tunnel.h: add missing ams/byteorder.h include
ipv4: Don't drop redirected route cache entry unless PTMU actually expired
net: suppress lockdep-RCU false positive in FIB trie.
Bluetooth: Fix kernel crash on L2CAP stress tests
Bluetooth: Convert debug files to actually use debugfs instead of sysfs
Bluetooth: Fix potential bad memory access with sysfs files
netfilter: ctnetlink: fix reliable event delivery if message building fails
netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()
NET_DMA: free skbs periodically
netlink: fix unaligned access in nla_get_be64()
tcp: Fix tcp_mark_head_lost() with packets == 0
net: ipmr/ip6mr: fix potential out-of-bounds vif_table access
KS8695: update ksp->next_rx_desc_read at the end of rx loop
igb: Add support for 82576 ET2 Quad Port Server Adapter
ixgbevf: Message formatting cleanups
ixgbevf: Shorten up delay timer for watchdog task
ixgbevf: Fix VF Stats accounting after reset
ixgbe: Set IXGBE_RSC_CB(skb)->DMA field to zero after unmapping the address
ixgbe: fix for real_num_tx_queues update issue
...
The ->release_request() callback was designed to allow the transport layer
to do housekeeping after the RPC call is done. It cannot be used to free
the request itself, and doing so leads to a use-after-free bug in
xprt_release().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When compiling userspace application which includes
if_tunnel.h and uses GRE_* defines you will get undefined
reference to __cpu_to_be16.
Fix this by adding missing #include <asm/byteorder.h>
Cc: stable@kernel.org
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no point to align or pad mibs to cache lines, they are per cpu
allocated with a 8 bytes alignment anyway.
This wastes space for no gain. This patch removes __SNMP_MIB_ALIGN__
Since SNMP mibs contain "unsigned long" fields only, we can relax the
allocation alignment from "unsigned long long" to "unsigned long"
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Its currently hard to diagnose when ACK frames are dropped because an
application set TCP_DEFER_ACCEPT on its listening socket.
See http://bugzilla.kernel.org/show_bug.cgi?id=15507
This patch adds a SNMP value, named TCPDeferAcceptDrop
netstat -s | grep TCPDeferAcceptDrop
TCPDeferAcceptDrop: 0
This counter is incremented every time we drop a pure ACK frame received
by a socket in SYN_RECV state because its SYNACK retrans count is lower
than defer_accept value.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the type change, addresses in unicast and multicast lists wouldn't make
sense, not to mention possible different lenghts. So flush both lists here.
Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once
mc_list conversion will be done).
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ignore the new NETDEV_PRE_TYPE_CHANGE event in rtnetlink_event() since
there have been no changes userspace needs to be notified of.
Also add a comment to the netdev notifier event definitions to remind
people to update the exclusion list when adding new event types.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the debug files ended up wrongly in sysfs, because at that point
of time, debugfs didn't exist. Convert these files to use debugfs and
also seq_file. This patch converts all of these files at once and then
removes the exported symbol for the Bluetooth sysfs class.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Convert to list macro's for the list of addresses per interface
in IPv6.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert from reader/writer lock to RCU and spinlock for addrconf
hash list.
Adds an additional helper macro for hlist_for_each_entry_continue_rcu
to handle the continue case.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using hash list macros, simplifies code and helps later RCU.
This patch includes some initialization that is not strictly necessary,
since an empty hlist node/list is all zero; and list is in BSS
and node is allocated with kzalloc.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use list macros instead of open coded linked list.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a bug that allows to lose events when reliable
event delivery mode is used, ie. if NETLINK_BROADCAST_SEND_ERROR
and NETLINK_RECV_NO_ENOBUFS socket options are set.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, ENOBUFS errors are reported to the socket via
netlink_set_err() even if NETLINK_RECV_NO_ENOBUFS is set. However,
that should not happen. This fixes this problem and it changes the
prototype of netlink_set_err() to return the number of sockets that
have set the NETLINK_RECV_NO_ENOBUFS socket option. This return
value is used in the next patch in these bugfix series.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a unaligned access in nla_get_be64() that was
introduced by myself in a17c859849.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
serial: sh-sci: remove duplicated #include
sh: Export uncached helper symbols.
sh: Fix up NUMA build for 29-bit.
serial: sh-sci: Fix build failure for non-sh architectures.
sh: Fix up uncached offset for legacy 29-bit mode.
sh: Support CPU affinity masks for INTC controllers.
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
tty_port,usb-console: Fix usb serial console open/close regression
tty: cpm_uart: use resource_size()
tty_buffer: Fix distinct type warning
hvc_console: Fix race between hvc_close and hvc_remove
uartlite: Fix build on sparc.
tty: Take a 256 byte padding into account when buffering below sub-page units
Revert "tty: Add a new VT mode which is like VT_PROCESS but doesn't require a VT_RELDISP ioctl call"
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (45 commits)
USB: gadget/multi: cdc_do_config: remove redundant check
usb: r8a66597-hcd: fix removed from an attached hub
USB: xhci: Make endpoint interval debugging clearer.
USB: Fix usb_fill_int_urb for SuperSpeed devices
USB: cp210x: Remove double usb_control_msg from cp210x_set_config
USB: Remove last bit of CONFIG_USB_BERRY_CHARGE
USB: gadget: add gadget controller number for s3c-hsotg driver
USB: ftdi_sio: Fix locking for change_speed() function
USB: g_mass_storage: fixed module name in Kconfig
USB: gadget: f_mass_storage::fsg_bind(): fix error handling
USB: g_mass_storage: fix section mismatch warnings
USB: gadget: fix Blackfin builds after gadget cleansing
USB: goku_udc: remove potential null dereference
USB: option.c: Add Pirelli VID/PID and indicate Pirelli's modem interface is 0xff
USB: serial: Fix module name typo for qcaux Kconfig entry.
usb: cdc-wdm: Fix deadlock between write and resume
usb: cdc-wdm: Fix order in disconnect and fix locking
usb: cdc-wdm:Fix loss of data due to autosuspend
usb: cdc-wdm: Fix submission of URB after suspension
usb: cdc-wdm: Fix race between disconnect and debug messages
...
USB 3 and Wireless USB specify a logarithmic encoding of the endpoint
interval that matches the USB 2 specification. usb_fill_int_urb() didn't
know that and was filling in the interval as if it was USB 1.1. Fix
usb_fill_int_urb() for SuperSpeed devices, but leave the wireless case
alone, because David Vrabel wants to keep the old encoding.
Update the struct urb kernel doc to note that SuperSpeed URBs must have
urb->interval specified in microframes.
Add a missing break statement in the usb_submit_urb() interrupt URB
checking, since wireless USB and SuperSpeed USB encode urb->interval
differently. This allows xHCI roothubs to actually register with khubd.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit e1108a63e1 ("usb_serial: Use the
shutdown() operation") breaks the ability to use a usb console
starting in 2.6.33. This was observed when using
console=ttyUSB0,115200 as a boot argument with an FTDI device. The
error is:
ftdi_sio ttyUSB0: ftdi_submit_read_urb - failed submitting read urb, error -22
The handling of the ASYNCB_INITIALIZED changed in 2.6.32 such that in
tty_port_shutdown() it always clears the flag if it is set. The fix
is to add a variable to the tty_port struct to indicate when the tty
port is a console.
CC: Alan Cox <alan@linux.intel.com>
CC: Alan Stern <stern@rowland.harvard.edu>
CC: Oliver Neukum <oliver@neukum.org>
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The TTY layer takes some care to ensure that only sub-page allocations
are made with interrupts disabled. It does this by setting a goal of
"TTY_BUFFER_PAGE" to allocate. Unfortunately, while TTY_BUFFER_PAGE takes the
size of tty_buffer into account, it fails to account that tty_buffer_find()
rounds the buffer size out to the next 256 byte boundary before adding on
the size of the tty_buffer.
This patch adjusts the TTY_BUFFER_PAGE calculation to take into account the
size of the tty_buffer and the padding. Once applied, tty_buffer_alloc()
should not require high-order allocations.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit eec9fe7d1a.
Ari writes as the reason this should be reverted:
The problems with this patch include:
1. There's at least one subtlety I overlooked - switching
between X servers (i.e. from one X VT to another) still requires
the cooperation of both X servers. I was assuming that KMS
eliminated this.
2. It hasn't been tested at all (no X server patch exists which
uses the new mode).
As he was the original author of the patch, I'll revert it.
Cc: Ari Entlich <atrigent@ccs.neu.edu>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove alignment padding to shrink struct request from 336 to 320 bytes
so needing one fewer cacheline and therefore removing 48 bytes from
struct request_queue.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
When doing "ifenslave -d bond0 eth0", there is chance to get NULL
dereference in netif_receive_skb(), because dev->master suddenly becomes
NULL after we tested it.
We should use ACCESS_ONCE() to avoid this (or rcu_dereference())
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the possibility to refuse the bonding type change for
other subsystems (such as for example bridge, vlan, etc.)
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since generally there could be more netdevices changing type other
than bonding, making this event type name "bonding-unrelated"
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (69 commits)
[SCSI] scsi_transport_fc: Fix synchronization issue while deleting vport
[SCSI] bfa: Update the driver version to 2.1.2.1.
[SCSI] bfa: Remove unused header files and did some cleanup.
[SCSI] bfa: Handle SCSI IO underrun case.
[SCSI] bfa: FCS and include file changes.
[SCSI] bfa: Modified the portstats get/clear logic
[SCSI] bfa: Replace bfa_get_attr() with specific APIs
[SCSI] bfa: New portlog entries for events (FIP/FLOGI/FDISC/LOGO).
[SCSI] bfa: Rename pport to fcport in BFA FCS.
[SCSI] bfa: IOC fixes, check for IOC down condition.
[SCSI] bfa: In MSIX mode, ignore spurious RME interrupts when FCoE ports are in FW mismatch state.
[SCSI] bfa: Fix Command Queue (CPE) full condition check and ack CPE interrupt.
[SCSI] bfa: IOC recovery fix in fcmode.
[SCSI] bfa: AEN and byte alignment fixes.
[SCSI] bfa: Introduce a link notification state machine.
[SCSI] bfa: Added firmware save clear feature for BFA driver.
[SCSI] bfa: FCS authentication related changes.
[SCSI] bfa: PCI VPD, FIP and include file changes.
[SCSI] bfa: Fix to copy fpma MAC when requested by user space application.
[SCSI] bfa: RPORT state machine: direct attach mode fix.
...
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits)
perf: Fix unexported generic perf_arch_fetch_caller_regs
perf record: Don't try to find buildids in a zero sized file
perf: export perf_trace_regs and perf_arch_fetch_caller_regs
perf, x86: Fix hw_perf_enable() event assignment
perf, ppc: Fix compile error due to new cpu notifiers
perf: Make the install relative to DESTDIR if specified
kprobes: Calculate the index correctly when freeing the out-of-line execution slot
perf tools: Fix sparse CPU numbering related bugs
perf_event: Fix oops triggered by cpu offline/online
perf: Drop the obsolete profile naming for trace events
perf: Take a hot regs snapshot for trace events
perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot
perf/x86-64: Use frame pointer to walk on irq and process stacks
lockdep: Move lock events under lockdep recursion protection
perf report: Print the map table just after samples for which no map was found
perf report: Add multiple event support
perf session: Change perf_session post processing functions to take histogram tree
perf session: Add storage for seperating event types in report
perf session: Change add_hist_entry to take the tree root instead of session
perf record: Add ID and to recorded event data when recording multiple events
...
Commit 7c7b60cb87
"of: put default string compare and #a/s-cell values into common header"
Breaks various things on powerpc due to using strncasecmp instead of
strcasecmp for comparing against "compatible" strings.
This causes things like the 4xx PCI code to fail miserably due to the
partial matches in code like this:
for_each_compatible_node(np, NULL, "ibm,plb-pcix")
ppc4xx_probe_pcix_bridge(np);
for_each_compatible_node(np, NULL, "ibm,plb-pci")
ppc4xx_probe_pci_bridge(np);
It's not quite right to do partial name match. Entries in a compatible
list are meant to be matched whole. If a device is compatible with both
"foo" and "foo1", then the device should have both strings in its
"compatible" property.
This patch reverts powerpc and microblaze us to use strcasecmp.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
(for patch description)
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Michal Simek <michal.simek@petalogix.com>
/sys/devices/system/memory/memoryX/phys_device is supposed to contain the
number of the physical device that the corresponding piece of memory
belongs to.
In case a physical device should be replaced or taken offline for whatever
reason it is necessary to set all corresponding memory pieces offline.
The current implementation always sets phys_device to '0' and there is no
way or hook to change that. Seems like there was a plan to implement that
but it wasn't finished for whatever reason.
So add a weak function which architectures can override to actually set
the phys_device from within add_memory_block().
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
IEEE 802.3ae clause 45 specifies a somewhat modified MDIO protocol
for use by 10GIGE phys. The main change is a 21 bit address split into
a 5 bit device ID and a 16 bit register offset. The definition is designed
so that normal and extended devices can run on the same MDIO bus.
Extend mdio-bitbang to do the new protocol. At the MDIO bus level the
protocol is requested by or'ing MII_ADDR_C45 into the register offset.
Make phy_read/phy_write/etc pass a full 32 bit register offset.
This does not attempt to make the phy layer support C45 style PHYs, just
to provide the MDIO bus support.
Tested against a Broadcom 10GE phy with ID 0x206034, and several
Broadcom 10/100/1000 Phys in normal mode.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
`ip -s link` shows interface counters truncated to 32 bit. This is
because interface statistics are transported only in 32-bit quantity
to userspace. This commit adds a new IFLA_STATS64 attribute that
exports them in full 64 bit.
References: http://lkml.indiana.edu/hypermail/linux/kernel/0307.3/0215.html
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements software receive side packet steering (RPS). RPS
distributes the load of received packet processing across multiple CPUs.
Problem statement: Protocol processing done in the NAPI context for received
packets is serialized per device queue and becomes a bottleneck under high
packet load. This substantially limits pps that can be achieved on a single
queue NIC and provides no scaling with multiple cores.
This solution queues packets early on in the receive path on the backlog queues
of other CPUs. This allows protocol processing (e.g. IP and TCP) to be
performed on packets in parallel. For each device (or each receive queue in
a multi-queue device) a mask of CPUs is set to indicate the CPUs that can
process packets. A CPU is selected on a per packet basis by hashing contents
of the packet header (e.g. the TCP or UDP 4-tuple) and using the result to index
into the CPU mask. The IPI mechanism is used to raise networking receive
softirqs between CPUs. This effectively emulates in software what a multi-queue
NIC can provide, but is generic requiring no device support.
Many devices now provide a hash over the 4-tuple on a per packet basis
(e.g. the Toeplitz hash). This patch allow drivers to set the HW reported hash
in an skb field, and that value in turn is used to index into the RPS maps.
Using the HW generated hash can avoid cache misses on the packet when
steering it to a remote CPU.
The CPU mask is set on a per device and per queue basis in the sysfs variable
/sys/class/net/<device>/queues/rx-<n>/rps_cpus. This is a set of canonical
bit maps for receive queues in the device (numbered by <n>). If a device
does not support multi-queue, a single variable is used for the device (rx-0).
Generally, we have found this technique increases pps capabilities of a single
queue device with good CPU utilization. Optimal settings for the CPU mask
seem to depend on architectures and cache hierarcy. Below are some results
running 500 instances of netperf TCP_RR test with 1 byte req. and resp.
Results show cumulative transaction rate and system CPU utilization.
e1000e on 8 core Intel
Without RPS: 108K tps at 33% CPU
With RPS: 311K tps at 64% CPU
forcedeth on 16 core AMD
Without RPS: 156K tps at 15% CPU
With RPS: 404K tps at 49% CPU
bnx2x on 16 core AMD
Without RPS 567K tps at 61% CPU (4 HW RX queues)
Without RPS 738K tps at 96% CPU (8 HW RX queues)
With RPS: 854K tps at 76% CPU (4 HW RX queues)
Caveats:
- The benefits of this patch are dependent on architecture and cache hierarchy.
Tuning the masks to get best performance is probably necessary.
- This patch adds overhead in the path for processing a single packet. In
a lightly loaded server this overhead may eliminate the advantages of
increased parallelism, and possibly cause some relative performance degradation.
We have found that masks that are cache aware (share same caches with
the interrupting CPU) mitigate much of this.
- The RPS masks can be changed dynamically, however whenever the mask is changed
this introduces the possibility of generating out of order packets. It's
probably best not change the masks too frequently.
Signed-off-by: Tom Herbert <therbert@google.com>
include/linux/netdevice.h | 32 ++++-
include/linux/skbuff.h | 3 +
net/core/dev.c | 335 +++++++++++++++++++++++++++++++++++++--------
net/core/net-sysfs.c | 225 ++++++++++++++++++++++++++++++-
net/core/skbuff.c | 2 +
5 files changed, 538 insertions(+), 59 deletions(-)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Values such as max_brightness should be set before backlights are
registered, but the current API doesn't allow that. Add a parameter to
backlight_device_register and update drivers to ensure that they
set this correctly.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>
check_fb from backlight_ops lacks a reference to the backlight_device
that's being referred to. Add this parameter so a backlight_device
can be mapped to a single framebuffer, especially if the same driver
handles multiple devices on a single system.
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>
9905a43b2d went a little to far with const
qualifiers as there are legitiment cases where the function pointers
can change (machine specific setup code for example).
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>
The Epson LCD L4F00242T03 is mounted on the Freescale i.MX31 PDK board.
Based upon Marek Vasut work in l4f00242t03.c, this driver provides
basic init and power on/off functionality for this device through the
sysfs lcd interface.
Unfortunately Datasheet for this device are not available and
all the control sequences sent to the display were copied from the
freescale driver that in the i.MX31 Linux BSP.
As in the i.MX31PDK board the core and io suppliers are voltage
regulators, that functionality is embedded here, but not strict.
Signed-off-by: Alberto Panizzo <maramaopercheseimorto@gmail.com>
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>