path: root/net/bridge/br_forward.c
Age | Commit message | Author | Files | Lines
2011-01-06 | net: bridge: check the length of skb after nf_bridge_maybe_copy_header() | Changli Gao | 1 | -10/+6
Since nf_bridge_maybe_copy_header() may change the length of the skb, we should check the length after calling it, so that PPPoE skbs are handled correctly. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15 | bridge: add RCU annotation to bridge multicast table | Eric Dumazet | 1 | -2/+2
Add modern __rcu annotations to bridge multicast table. Use newer hlist macros to avoid direct access to hlist internals. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-23 | Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 | David S. Miller | 1 | -2/+2
Conflicts: net/ipv4/ip_output.c
2010-06-15 | bridge: Fix OOM crash in deliver_clone | Herbert Xu | 1 | -2/+2
The bridge multicast patches introduced an OOM crash in the forward path, when deliver_clone fails to clone the skb. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15 | bridge: Fix netpoll support | Herbert Xu | 1 | -22/+12
There are multiple problems with the newly added netpoll support: 1) Use-after-free on each netpoll packet. 2) Invoking unsafe code on netpoll/IRQ path. 3) Breaks when netpoll is enabled on the underlying device. This patch fixes all of these problems. In particular, we now allocate proper netpoll structures for each underlying device. We only allow netpoll to be enabled on the bridge when all the devices underneath it support netpoll. Once it is enabled, we do not allow non-netpoll devices to join the bridge (until netpoll is disabled again). This allows us to do away with the npinfo juggling that caused problem number 1. Incidentally this patch fixes number 2 by bypassing unsafe code such as multicast snooping and netfilter. Reported-by: Qianfeng Zhang <frzhang@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
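The enable/join rule described in this message can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct bridge_model`, `bridge_enable_netpoll()`, `bridge_add_port()` and `MAX_PORTS` are hypothetical names invented here, not the kernel's structures or functions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 8  /* arbitrary limit chosen for this sketch */

/* Hypothetical model of a bridge; not the kernel's struct net_bridge. */
struct bridge_model {
    bool netpoll_enabled;
    size_t nports;
    bool port_has_netpoll[MAX_PORTS];
};

/* Enable netpoll only if every underlying port supports it. */
int bridge_enable_netpoll(struct bridge_model *br)
{
    for (size_t i = 0; i < br->nports; i++)
        if (!br->port_has_netpoll[i])
            return -1;
    br->netpoll_enabled = true;
    return 0;
}

/* While netpoll is enabled, refuse ports that lack netpoll support. */
int bridge_add_port(struct bridge_model *br, bool has_netpoll)
{
    if (br->nports == MAX_PORTS)
        return -1;
    if (br->netpoll_enabled && !has_netpoll)
        return -1;
    br->port_has_netpoll[br->nports++] = has_netpoll;
    return 0;
}
```

In this model a bridge that already contains a non-netpoll port cannot enable netpoll, and a bridge with netpoll enabled rejects such a port until netpoll is disabled again.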
2010-05-10 | Merge branch 'master' of /repos/git/net-next-2.6 | Patrick McHardy | 1 | -10/+29
Conflicts: net/bridge/br_device.c net/bridge/br_forward.c Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-05-06 | bridge: make bridge support netpoll | WANG Cong | 1 | -1/+22
Based on the previous patch, make the bridge support netpoll by: 1) implementing the two methods needed to support netpoll on the bridge; 2) modifying netpoll while forwarding packets via the bridge; 3) disabling netpoll support on the bridge when a device without netpoll support is added to it; 4) enabling netpoll support when all underlying devices support netpoll. Cc: David Miller <davem@davemloft.net> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-27 | bridge: multicast_flood cleanup | stephen hemminger | 1 | -5/+3
Move some declarations around to make it clearer which variables are being used inside loop. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-27 | bridge: multicast port group RCU fix | stephen hemminger | 1 | -2/+2
The recently introduced bridge multicast port group list was only partially using RCU correctly. It was missing rcu_dereference() and missing the necessary barrier on deletion. The code should have used one of the standard list methods (list or hlist) instead of open coding an RCU-based linked list. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-27 | bridge: multicast flood | stephen hemminger | 1 | -2/+2
Fix unsafe usage of RCU. Would never work on Alpha SMP because of the lack of rcu_dereference(). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-20 | Merge branch 'master' of /repos/git/net-next-2.6 | Patrick McHardy | 1 | -0/+1
Conflicts: Documentation/feature-removal-schedule.txt net/ipv6/netfilter/ip6t_REJECT.c net/netfilter/xt_limit.c Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-13 | netfilter: bridge-netfilter: update a comment in br_forward.c about ip_fragment() | Bart De Schuymer | 1 | -1/+1
ip_refrag isn't used anymore in the bridge-netfilter code. Signed-off-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-30 | include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h | Tejun Heo | 1 | -0/+1
percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h, making everything defined by the two files universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities to include those headers directly instead of assuming availability. As this conversion needs to touch a large number of source files, the following script is used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include blocks and tries to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly because the file doesn't have a fitting include block), it prints out an error message indicating which .h file needs to be added to the file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files.
2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files.
3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually widely available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq).
   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable on most builds of the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-25 | netfilter: bridge: use NFPROTO values for NF_HOOK invocation | Jan Engelhardt | 1 | -5/+5
The first argument to NF_HOOK* has been an nfproto for quite some time. Commit v2.6.27-2457-gfdc9314 was the first to practically start using the new names. Do that now for the remaining NF_HOOK calls.

The semantic patch used was:

// <smpl>
@@
@@
(NF_HOOK
|NF_HOOK_THRESH
)(
-PF_BRIDGE,
+NFPROTO_BRIDGE,
 ...)

@@
@@
 NF_HOOK(
-PF_INET6,
+NFPROTO_IPV6,
 ...)

@@
@@
 NF_HOOK(
-PF_INET,
+NFPROTO_IPV4,
 ...)
// </smpl>

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-16 | bridge: Make first arg to deliver_clone const. | David S. Miller | 1 | -2/+4
Otherwise we get a warning from the call in br_forward(). Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16 | bridge: Fix br_forward crash in promiscuous mode | Michael Braun | 1 | -3/+11
From: Michael Braun <michael-dev@fami-braun.de> It's a linux-next kernel from 2010-03-12 on an x86 system and it OOPSes in the bridge module in br_pass_frame_up (called by br_handle_frame_finish) because brdev cannot be dereferenced (it's set to a non-null value). Adding some BUG_ON statements revealed that BR_INPUT_SKB_CB(skb)->brdev == br->dev (as set in br_handle_frame_finish first) only holds until br_forward is called. The next call to br_pass_frame_up then fails. Digging deeper it seems that br_forward either frees the skb or passes it to NF_HOOK which will in turn take care of freeing the skb. The same holds for br_pass_frame_up. So it seems as if two independent skb allocations are required. As far as I can see, commit b33084be192ee1e347d98bb5c9e38a53d98d35e2 ("bridge: Avoid unnecessary clone on forward path") removed skb duplication and so likely causes this crash. This crash does not happen on 2.6.33. I've therefore modified br_forward the same way br_flood has been modified so that the skb is not freed if skb0 is going to be used and I can confirm that the attached patch resolves the issue for me. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-28 | bridge: Add multicast forwarding functions | Herbert Xu | 1 | -0/+67
This patch adds code to perform selective multicast forwarding. We forward multicast traffic to a set of ports plus all multicast router ports. In order to avoid duplications among these two sets of ports, we order all ports by the numeric value of their pointers. The two lists are then walked in lock-step to eliminate duplicates. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
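The lock-step walk over two pointer-ordered port lists can be sketched as a plain merge that emits each port once. This is a minimal userspace illustration, not the kernel code: `struct port_model` and `merge_ports()` are hypothetical names, arrays stand in for the kernel's lists, and ascending pointer order is an arbitrary choice for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bridge port. */
struct port_model { int id; };

/*
 * Merge two arrays of port pointers, each sorted by ascending pointer
 * value, into out[] without duplicates. out[] must have room for
 * na + nb entries. Returns the number of ports written.
 */
size_t merge_ports(struct port_model *const *a, size_t na,
                   struct port_model *const *b, size_t nb,
                   struct port_model **out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < na || j < nb) {
        struct port_model *p;

        if (j >= nb || (i < na && (uintptr_t)a[i] < (uintptr_t)b[j])) {
            p = a[i++];          /* only in a, or a is behind b */
        } else if (i >= na || (uintptr_t)b[j] < (uintptr_t)a[i]) {
            p = b[j++];          /* only in b, or b is behind a */
        } else {
            p = a[i];            /* same port in both lists */
            i++;
            j++;
        }
        out[n++] = p;
    }
    return n;
}
```

Because both inputs share one total order, a port that appears in both lists is reached on the same iteration and delivered to only once, with no extra lookup structure.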
2010-02-28 | bridge: Split may_deliver/deliver_clone out of br_flood | Herbert Xu | 1 | -23/+46
This patch moves the main loop body in br_flood into the function may_deliver. The code that clones an skb and delivers it is moved into the deliver_clone function. This allows this to be reused by the future multicast forward function. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-28 | bridge: Use BR_INPUT_SKB_CB on xmit path | Herbert Xu | 1 | -2/+3
This patch makes BR_INPUT_SKB_CB available on the xmit path so that we could avoid passing the br pointer around for the purpose of collecting device statistics. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-28 | bridge: Avoid unnecessary clone on forward path | Herbert Xu | 1 | -11/+22
When the packet is delivered to the local bridge device we may end up cloning it unnecessarily if no bridge port can receive the packet in br_flood. This patch avoids this by moving the skb_clone into br_flood. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
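One way to picture the saving is to count clones when cloning is deferred until a receiving port is actually known. The sketch below is a userspace model under stated assumptions: `flood_clones_needed()` and its parameters are hypothetical, and it only counts copies rather than handling real skbs.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Count the clones needed to flood one packet.
 * eligible[i] says whether port i may receive the packet;
 * keep_original is true when the caller still needs the original
 * afterwards (for example for local delivery).
 */
size_t flood_clones_needed(const bool *eligible, size_t nports,
                           bool keep_original)
{
    size_t clones = 0;
    bool have_prev = false;   /* a port is waiting for its copy */

    for (size_t i = 0; i < nports; i++) {
        if (!eligible[i])
            continue;
        if (have_prev)
            clones++;         /* the waiting port gets a clone */
        have_prev = true;
    }
    if (have_prev && keep_original)
        clones++;             /* the last port needs a clone too */
    return clones;            /* otherwise it takes the original */
}
```

With no eligible port the function returns 0 even when the original must be kept, which is the case the commit message describes: no clone is made unless some port can receive the packet.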
2009-08-13 | net/bridge: Add 'hairpin' port forwarding mode | Fischer, Anna | 1 | -1/+2
This patch adds a 'hairpin' (also called 'reflective relay') mode port configuration to the Linux Ethernet bridge kernel module. A bridge supporting hairpin forwarding mode can send frames back out through the port the frame was received on. Hairpin mode is required to support basic VEPA (Virtual Ethernet Port Aggregator) capabilities. You can find additional information on VEPA here: http://tech.groups.yahoo.com/group/evb/ http://www.ieee802.org/1/files/public/docs2009/new-hudson-vepa_seminar-20090514d.pdf http://www.internet2.edu/presentations/jt2009jul/20090719-congdon.pdf An additional patch 'bridge-utils: Add 'hairpin' port forwarding mode' is provided to allow configuring hairpin mode from userspace tools. Signed-off-by: Paul Congdon <paul.congdon@hp.com> Signed-off-by: Anna Fischer <anna.fischer@hp.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
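The effect of hairpin mode on the delivery decision can be sketched as follows. This is an illustrative model only: `struct port_model`, `enum port_state` and this `should_deliver()` are hypothetical userspace stand-ins, assuming the decision depends on the egress port's hairpin flag, its forwarding state, and whether it is also the ingress port.

```c
#include <stdbool.h>

/* Hypothetical port states for this sketch. */
enum port_state { PORT_DISABLED, PORT_FORWARDING };

/* Hypothetical stand-in for a bridge port. */
struct port_model {
    bool hairpin;             /* reflective relay enabled on this port */
    enum port_state state;
};

/*
 * A frame may leave through 'out' if that port is forwarding and either
 * differs from the ingress port 'in' or has hairpin mode enabled.
 */
bool should_deliver(const struct port_model *out,
                    const struct port_model *in)
{
    return (out->hairpin || out != in) && out->state == PORT_FORWARDING;
}
```

Without the hairpin flag, a frame is never sent back out of the port it arrived on; with it, the ingress port is treated like any other forwarding port.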
2009-02-09 | bridge: Fix LRO crash with tun | Herbert Xu | 1 | -1/+6
> Kernel BUG at drivers/net/tun.c:444
> invalid opcode: 0000 [1] SMP
> last sysfs file: /class/net/lo/ifindex
> CPU 0
> Modules linked in: tun ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack
> nfnetlink ipt_REJECT xt_tcpudp iptable_filter d
> Pid: 6912, comm: qemu-kvm Tainted: G 2.6.18-128.el5 #1
> RIP: 0010:[<ffffffff886f57b0>] [<ffffffff886f57b0>]
> :tun:tun_chr_readv+0x2b1/0x3a6
> RSP: 0018:ffff8102202c5e48 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff8102202c5e98 RCX: 0000000004010000
> RDX: ffff810227063680 RSI: ffff8102202c5e9e RDI: ffff8102202c5e92
> RBP: 0000000000010ff6 R08: 0000000000000000 R09: 0000000000000001
> R10: ffff8102202c5e94 R11: 0000000000000202 R12: ffff8102275357c0
> R13: ffff81022755e500 R14: 0000000000000000 R15: ffff8102202c5ef8
> FS: 00002ae4398db980(0000) GS:ffffffff803ac000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00002ae4ab514000 CR3: 0000000221344000 CR4: 00000000000026e0
> Process qemu-kvm (pid: 6912, threadinfo ffff8102202c4000, task
> ffff81022e58d820)
> Stack: 00000000498735cb ffff810229d1a3c0 0000000000000000 ffff81022e58d820
> ffffffff8008a461 ffff81022755e528 ffff81022755e528 ffffffff8009f925
> 000005ea05ea0000 ffff8102209d0000 00001051143e1600 ffffffff8003c00e
> Call Trace:
> [<ffffffff8008a461>] default_wake_function+0x0/0xe
> [<ffffffff8009f925>] enqueue_hrtimer+0x55/0x70
> [<ffffffff8003c00e>] hrtimer_start+0xbc/0xce
> [<ffffffff886f58bf>] :tun:tun_chr_read+0x1a/0x1f
> [<ffffffff8000b3f3>] vfs_read+0xcb/0x171
> [<ffffffff800117d4>] sys_read+0x45/0x6e
> [<ffffffff8005d116>] system_call+0x7e/0x83
>
>
> Code: 0f 0b 68 40 62 6f 88 c2 bc 01 f6 42 0a 08 74 0c 80 4c 24 41
> RIP [<ffffffff886f57b0>] :tun:tun_chr_readv+0x2b1/0x3a6
> RSP <ffff8102202c5e48>
> <0>Kernel panic - not syncing: Fatal exception

This crashed when an LRO packet generated by bnx2x reached a tun device through the bridge. We're supposed to drop it at the bridge. However, because the check was placed in br_forward instead of __br_forward, it's only effective if we are sending the packet through a single port. This patch fixes it by moving the check into __br_forward. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 | net: Discard and warn about LRO'd skbs received for forwarding | Ben Hutchings | 1 | -1/+1
Add skb_warn_if_lro() to test whether an skb was received with LRO and warn if so. Change br_forward(), ip_forward() and ip6_forward() to call it and discard the skb if it returns true. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-11 | net: remove CVS keywords | Adrian Bunk | 1 | -2/+0
This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 | bridge: Use on-device stats instead of private ones. | Pavel Emelyanov | 1 | -1/+1
Even though bridges require 6 fields from struct net_device_stats, the on-device stats are always there, so we may just use them. The br_dev_get_stats is no longer required after this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-16 | [BRIDGE]: Kill clone argument to br_flood_* | Herbert Xu | 1 | -16/+5
The clone argument is only used by one caller and that caller can clone the packet itself. This patch moves the clone call into the caller and kills the clone argument. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 | [NET]: Allow forwarding of ip_summed except CHECKSUM_COMPLETE | Herbert Xu | 1 | -1/+1
Right now Xen has a horrible hack that lets it forward packets with partial checksums. One of the reasons that CHECKSUM_PARTIAL and CHECKSUM_COMPLETE were added is so that we can get rid of this hack (where it creates two extra bits in the skbuff to essentially mirror ip_summed without being destroyed by the forwarding code). I had forgotten that I've already gone through all the device drivers last time around to make sure that they're looking at ip_summed == CHECKSUM_PARTIAL rather than ip_summed != 0 on transmit. In any case, I've now done that again so it should definitely be safe. Unfortunately nobody has yet added any code to update CHECKSUM_COMPLETE values on forward so I'm setting that to CHECKSUM_NONE. This should be safe to remove for bridging but I'd like to check that code path first. So here is the patch that lets us get rid of the hack by preserving ip_summed (mostly) on forwarded packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
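The "preserve ip_summed (mostly)" rule from this message can be sketched as a single mapping. This is a hedged illustration: `enum csum_state` and `forward_csum()` are hypothetical names for a userspace model, and the enumerator values are arbitrary rather than the kernel's CHECKSUM_* constants.

```c
/* Hypothetical checksum states mirroring the ones named in the message. */
enum csum_state {
    CSUM_NONE,
    CSUM_UNNECESSARY,
    CSUM_COMPLETE,
    CSUM_PARTIAL
};

/*
 * State a packet carries after being forwarded: a COMPLETE value is not
 * updated on forward, so it is downgraded to NONE; every other state,
 * including PARTIAL, is preserved.
 */
enum csum_state forward_csum(enum csum_state s)
{
    return s == CSUM_COMPLETE ? CSUM_NONE : s;
}
```

Keeping the partial state across the forward path is what lets a virtualization host hand a not-yet-checksummed packet to the transmitting device without mirroring the state in extra skb bits.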
2007-02-10 | [NET] BRIDGE: Fix whitespace errors. | YOSHIFUJI Hideaki | 1 | -2/+2
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 | [NETFILTER] bridge: code rearrangement for clarity | Stephen Hemminger | 1 | -4/+1
Cleanup and rearrangement for better style and clarity:
* Split the function nf_bridge_maybe_copy_header into two pieces
* Move copy portion out of line.
* Use Ethernet header size macros.
* Use header file to handle CONFIG_NETFILTER_BRIDGE differences
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26 | [BRIDGE] netfilter: memory corruption fix | Stephen Hemminger | 1 | -3/+7
The bridge-netfilter code will overwrite memory if there is not enough headroom in the skb to save the header. This first showed up when using Xen with the sky2 driver, which doesn't allocate the extra space. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-08 | [NET] gso: Add skb_is_gso | Herbert Xu | 1 | -1/+1
This patch adds the wrapper function skb_is_gso which can be used instead of directly testing skb_shinfo(skb)->gso_size. This makes things a little nicer and allows us to change the primary key for indicating whether an skb is GSO (if we ever want to do that). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 | [NET]: Merge TSO/UFO fields in sk_buff | Herbert Xu | 1 | -2/+2
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not going to scale if we add any more segmentation methods (e.g., DCCP). So let's merge them. They were used to tell the protocol of a packet. This function has been subsumed by the new gso_type field. This is essentially a set of netdev feature bits (shifted by 16 bits) that are required to process a specific skb. As such it's easy to tell whether a given device can process a GSO skb: you just have to AND the gso_type field with the netdev's features field. I've made gso_type a conjunction. The idea is that you have a base type (e.g., SKB_GSO_TCPV4) that can be modified further to support new features. For example, if we add a hardware TSO type that supports ECN, they would declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO packets would be SKB_GSO_TCPV4. This means that only the CWR packets need to be emulated in software. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
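The feature test described here (shift gso_type into the feature-bit range, then AND it with the device's features) can be sketched as below. All names and bit values are assumptions made for the illustration: `GSO_TCPV4`, `GSO_TCPV4_ECN`, `FEATURE_GSO_SHIFT`, `F_TSO`, `F_TSO_ECN` and `dev_can_gso()` are hypothetical and do not reproduce the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical GSO type bits; the values are arbitrary for this sketch. */
#define GSO_TCPV4      (UINT32_C(1) << 0)
#define GSO_TCPV4_ECN  (UINT32_C(1) << 1)

/* Feature bits are the GSO type bits shifted by 16, as the text describes. */
#define FEATURE_GSO_SHIFT 16
#define F_TSO      (GSO_TCPV4 << FEATURE_GSO_SHIFT)
#define F_TSO_ECN  (GSO_TCPV4_ECN << FEATURE_GSO_SHIFT)

/*
 * A device can process a GSO packet only if every feature bit implied
 * by the packet's gso_type is set in the device's feature mask.
 */
bool dev_can_gso(uint32_t dev_features, uint32_t gso_type)
{
    uint32_t needed = gso_type << FEATURE_GSO_SHIFT;

    return (dev_features & needed) == needed;
}
```

A device advertising only `F_TSO` accepts plain `GSO_TCPV4` packets but not `GSO_TCPV4 | GSO_TCPV4_ECN` ones, so in this model only the ECN-marked packets would fall back to software segmentation.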
2006-06-17 | [BRIDGE]: optimize conditional in forward path | Stephen Hemminger | 1 | -8/+4
Small optimizations of bridge forwarding path. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-26 | [BRIDGE]: allow full size vlan packets | Stephen Hemminger | 1 | -1/+7
Need to allow for VLAN header when bridging. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
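A length check that allows for the VLAN header might look like the sketch below. It is an assumption-laden userspace model: `packet_length()` and `fits_mtu()` are hypothetical helpers taking plain sizes instead of an skb, and the only fixed fact used is that an 802.1Q tag is 4 bytes long.

```c
#include <stdbool.h>
#include <stddef.h>

#define VLAN_HLEN 4   /* size of an 802.1Q tag in bytes */

/* Length that counts against the MTU: the VLAN tag is not charged. */
size_t packet_length(size_t len, bool vlan_tagged)
{
    if (vlan_tagged && len >= VLAN_HLEN)
        return len - VLAN_HLEN;
    return len;
}

/* True if the frame may be forwarded to a device with the given MTU. */
bool fits_mtu(size_t len, bool vlan_tagged, size_t mtu)
{
    return packet_length(len, vlan_tagged) <= mtu;
}
```

With an MTU of 1500, a tagged frame carrying 1504 bytes passes the check while an untagged 1504-byte frame is rejected, which is the "full size VLAN packet" case from the commit title.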
2005-09-22 | [BRIDGE]: TSO fix in br_dev_queue_push_xmit | Vlad Drukker | 1 | -1/+2
Signed-off-by: Vlad Drukker <vlad@storewiz.com> Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-21 | [NETFILTER]: Kill nf_debug | Patrick McHardy | 1 | -3/+0
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-16 | Linux-2.6.12-rc2 [v2.6.12-rc2] | Linus Torvalds | 1 | -0/+159
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!