Age | Commit message (Collapse) | Author | Files | Lines |
|
The kernel's implementation of notifier chains is unsafe. There is no
protection against entries being added to or removed from a chain while the
chain is in use. The issues were discussed in this thread:
http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
We noticed that notifier chains in the kernel fall into two basic usage
classes:
"Blocking" chains are always called from a process context
and the callout routines are allowed to sleep;
"Atomic" chains can be called from an atomic context and
the callout routines are not allowed to sleep.
We decided to codify this distinction and make it part of the API. Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name). New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain. The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.
With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed. For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections. (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)
There are some limitations, which should not be too hard to live with. For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem. Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain. (This did happen in a couple of places and the code
had to be changed to avoid it.)
Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization. Instead we use RCU. The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.
Here is the list of chains that we adjusted and their classifications. None
of them use the raw API, so for the moment it is only a placeholder.
ATOMIC CHAINS
-------------
arch/i386/kernel/traps.c: i386die_chain
arch/ia64/kernel/traps.c: ia64die_chain
arch/powerpc/kernel/traps.c: powerpc_die_chain
arch/sparc64/kernel/traps.c: sparc64die_chain
arch/x86_64/kernel/traps.c: die_chain
drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list
kernel/panic.c: panic_notifier_list
kernel/profile.c: task_free_notifier
net/bluetooth/hci_core.c: hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain
net/ipv6/addrconf.c: inet6addr_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain
net/netlink/af_netlink.c: netlink_chain
BLOCKING CHAINS
---------------
arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain
arch/s390/kernel/process.c: idle_chain
arch/x86_64/kernel/process.c idle_notifier
drivers/base/memory.c: memory_chain
drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list
drivers/macintosh/adb.c: adb_client_list
drivers/macintosh/via-pmu.c sleep_notifier_list
drivers/macintosh/via-pmu68k.c sleep_notifier_list
drivers/macintosh/windfarm_core.c wf_client_list
drivers/usb/core/notify.c usb_notifier_list
drivers/video/fbmem.c fb_notifier_list
kernel/cpu.c cpu_chain
kernel/module.c module_notify_list
kernel/profile.c munmap_notifier
kernel/profile.c task_exit_notifier
kernel/sys.c reboot_notifier_list
net/core/dev.c netdev_chain
net/decnet/dn_dev.c: dnaddr_chain
net/ipv4/devinet.c: inetaddr_chain
It's possible that some of these classifications are wrong. If they are,
please let us know or submit a patch to fix them. Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)
The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.
[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Implement the half-closed devices notifiation, by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets. Since the
existing POLLHUP handling, that does not report correctly half-closed
devices, was feared to be changed, this implementation leaves the current
POLLHUP reporting unchanged and simply add a new bit that is set in the few
places where it makes sense. The same thing was discussed and conceptually
agreed quite some time ago:
http://lkml.org/lkml/2003/7/12/116
Since this new event bit is added to the existing Linux poll infrastruture,
even the existing poll/select system calls will be able to use it. As far
as the existing POLLHUP handling, the patch leaves it as is. The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files. The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.
There is "a stupid program" to test POLLRDHUP delivery here:
http://www.xmailserver.org/pollrdhup-test.c
It tests poll(2), but since the delivery is same epoll(2) will work equally.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Return negative error constant.
Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the two NULL pointer dereferences found by the sfuzz
tool from Ilja van Sprundel. The first one was a call of getsockname()
for an unbound socket and the second was calling accept() while this
operation isn't implemented for the HCI socket interface.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
This patch reduces the default L2CAP MTU for all RFCOMM connections
from 1024 to 1013 to improve the interoperability with some broken
RFCOMM implementations. To make this more flexible the L2CAP MTU
becomes also a module parameter and so it can changed at runtime.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
net: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Get rid of needless casting of kmalloc() return value in net/
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This changes some memcmp(one,two,ETH_ALEN) to compare_ether_addr(one,two).
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The API and code have been through various bits of initial review by
serial driver people but they definitely need to live somewhere for a
while so the unconverted drivers can get knocked into shape, existing
drivers that have been updated can be better tuned and bugs whacked out.
This replaces the tty flip buffers with kmalloc objects in rings. In the
normal situation for an IRQ driven serial port at typical speeds the
behaviour is pretty much the same, two buffers end up allocated and the
kernel cycles between them as before.
When there are delays or at high speed we now behave far better as the
buffer pool can grow a bit rather than lose characters. This also means
that we can operate at higher speeds reliably.
For drivers that receive characters in blocks (DMA based, USB and
especially virtualisation) the layer allows a lot of driver specific
code that works around the tty layer with private secondary queues to be
removed. The IBM folks need this sort of layer, the smart serial port
people do, the virtualisers do (because a virtualised tty typically
operates at infinite speed rather than emulating 9600 baud).
Finally many drivers had invalid and unsafe attempts to avoid buffer
overflows by directly invoking tty methods extracted out of the innards
of work queue structs. These are no longer needed and all go away. That
fixes various random hangs with serial ports on overflow.
The other change in here is to optimise the receive_room path that is
used by some callers. It turns out that only one ldisc uses receive room
except asa constant and it updates it far far less than the value is
read. We thus make it a variable not a function call.
I expect the code to contain bugs due to the size alone but I'll be
watching and squashing them and feeding out new patches as it goes.
Because the buffers now dynamically expand you should only run out of
buffering when the kernel runs out of memory for real. That means a lot of
the horrible hacks high performance drivers used to do just aren't needed any
more.
Description:
tty_insert_flip_char is an old API and continues to work as before, as does
tty_flip_buffer_push() [this is why many drivers dont need modification]. It
does now also return the number of chars inserted
There are also
tty_buffer_request_room(tty, len)
which asks for a buffer block of the length requested and returns the space
found. This improves efficiency with hardware that knows how much to
transfer.
and tty_insert_flip_string_flags(tty, str, flags, len)
to insert a string of characters and flags
For a smart interface the usual code is
len = tty_request_buffer_room(tty, amount_hardware_says);
tty_insert_flip_string(tty, buffer_from_card, len);
More description!
At the moment tty buffers are attached directly to the tty. This is causing a
lot of the problems related to tty layer locking, also problems at high speed
and also with bursty data (such as occurs in virtualised environments)
I'm working on ripping out the flip buffers and replacing them with a pool of
dynamically allocated buffers. This allows both for old style "byte I/O"
devices and also helps virtualisation and smart devices where large blocks of
data suddenely materialise and need storing.
So far so good. Lots of drivers reference tty->flip.*. Several of them also
call directly and unsafely into function pointers it provides. This will all
break. Most drivers can use tty_insert_flip_char which can be kept as an API
but others need more.
At the moment I've added the following interfaces, if people think more will
be needed now is a good time to say
int tty_buffer_request_room(tty, size)
Try and ensure at least size bytes are available, returns actual room (may be
zero). At the moment it just uses the flipbuf space but that will change.
Repeated calls without characters being added are not cumulative. (ie if you
call it with 1, 1, 1, and then 4 you'll have four characters of space. The
other functions will also try and grow buffers in future but this will be a
more efficient way when you know block sizes.
int tty_insert_flip_char(tty, ch, flag)
As before insert a character if there is room. Now returns 1 for success, 0
for failure.
int tty_insert_flip_string(tty, str, len)
Insert a block of non error characters. Returns the number inserted.
int tty_prepare_flip_string(tty, strptr, len)
Adjust the buffer to allow len characters to be added. Returns a buffer
pointer in strptr and the length available. This allows for hardware that
needs to use functions like insl or mencpy_fromio.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Trivial manual merge fixup for usb_find_interface clashes.
|
|
Leave the overloaded "hotplug" word to susbsystems which are handling
real devices. The driver core does not "plug" anything, it just exports
the state to userspace and generates events.
Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
I noticed that some of 'struct proto_ops' used in the kernel may share
a cache line used by locks or other heavily modified data. (default
linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
least)
This patch makes sure a 'struct proto_ops' can be declared as const,
so that all cpus can share all parts of it without false sharing.
This is not mandatory : a driver can still use a read/write structure
if it needs to (and eventually a __read_mostly)
I made a global stubstitute to change all existing occurences to make
them const.
This should reduce the possibility of false sharing on SMP, and
speedup some socket system calls.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It also looks like there were 2 places where the test on sk_err was
missing from the event wait logic (in sk_stream_wait_connect and
sk_stream_wait_memory), while the rest of the sock_error() users look
to be doing the right thing. This version of the patch fixes those,
and cleans up a few places that were testing ->sk_err directly.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch removes all relics of the /proc usage from the Bluetooth
subsystem core and its upper layers. All the previous information are
now available via /sys/class/bluetooth through appropriate functions.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds the endian annotations to the Bluetooth core.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
From: Jesper Juhl <jesper.juhl@gmail.com>
This is the net/ part of the big kfree cleanup patch.
Remove pointless checks for NULL prior to calling kfree() in net/.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
|
|
Bluetooth HIDP selects INPUT and it really needs it to be there - module
depends on input core. And input core is never built on s390...
Marked as broken on s390, for now; if somebody has better ideas, feel
free to fix it and remove dependency...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
|
|
This patch updates the HCI security filter with support for the Extended
Inquiry Response (EIR) feature.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
This patch makes another bunch of functions static.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
This patch moves rfcomm_crc_table[] into the RFCOMM core, because there
is no need to keep it in a separate file.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Input: convert net/bluetooth to dynamic input_dev allocation
This is required for input_dev sysfs integration
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
- added typedef unsigned int __nocast gfp_t;
- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch adds additional checks to prevent RFCOMM connections be
established through the RAW socket interface.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
This patch adds the handling of the extended inquiry responses and
inserts them into the inquiry cache.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Reduces skb size by 8 bytes on 64-bit.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch moves the usage of packet type into the SKB control
buffer. After this patch it is now possible to shrink the sk_buff
structure and redefine its pkt_type.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the sparse warnings "implicit cast to nocast type"
for the priority or gfp_mask parameters of the memory allocations.
Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch implements the remote port negotiation (RPN) of the RFCOMM
protocol for Bluetooth.
Signed-off-by: J. Suter <jsuter@hardwave.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The RFCOMM layer does not handle properly the de-assertation
of CD signal. It should call tty_hangup() to work properly.
Signed-off-by: Timo Teräs <ext-timo.teras@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The HCI page scan repetition mode change event contains the actual
page scan repetition mode for the remote device. It is the same
value that is received from an inquiry response and it can be used
to make further reconnections faster.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch implements a workaround for buggy Bluetooth 1.2 devices from
Silicon Wave. Their inquiry results with RSSI contain the page scan mode
field. This field was removed in the final Bluetooth 1.2 specification.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch changes the direction to incoming and adds the timestamp
to all stack internal events.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
This patch removes the unused bt_dump() function and it also removes
its BT_DMP macro. It also unexports the hci_dev_get(), hci_send_cmd()
and hci_si_event() functions.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
The fix for the reference counting problem of the signal DLC introduced
a race condition which leads to an oops. The reason for it is not fully
understood by now and so revert this fix, because the reference counting
problem is not crashing the RFCOMM layer and its appearance it rare.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
This is part of the grand scheme to eliminate the qlen
member of skb_queue_head, and subsequently remove the
'list' member of sk_buff.
Most users of skb_queue_len() want to know if the queue is
empty or not, and that's trivially done with skb_queue_empty()
which doesn't use the skb_queue_head->qlen member and instead
uses the queue list emptyness as the test.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A lot of places in there are including major.h for no reason whatsoever.
Removed. And yes, it still builds.
The history of that stuff is often amusing. E.g. for net/core/sock.c
the story looks so, as far as I've been able to reconstruct it: we used
to need major.h in net/socket.c circa 1.1.early. In 1.1.13 that need
had disappeared, along with register_chrdev(SOCKET_MAJOR, "socket",
&net_fops) in sock_init(). Include had not. When 1.2 -> 1.3 reorg of
net/* had moved a lot of stuff from net/socket.c to net/core/sock.c,
this crap had followed...
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
|