linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2011-01-10	Merge branch 'for-linus' of ↵	Linus Torvalds	11	-931/+944
	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (30 commits) MAINTAINERS: Add tomoyo-dev-en ML. SELinux: define permissions for DCB netlink messages encrypted-keys: style and other cleanup encrypted-keys: verify datablob size before converting to binary trusted-keys: kzalloc and other cleanup trusted-keys: additional TSS return code and other error handling syslog: check cap_syslog when dmesg_restrict Smack: Transmute labels on specified directories selinux: cache sidtab_context_to_sid results SELinux: do not compute transition labels on mountpoint labeled filesystems This patch adds a new security attribute to Smack called SMACK64EXEC. It defines label that is used while task is running. SELinux: merge policydb_index_classes and policydb_index_others selinux: convert part of the sym_val_to_name array to use flex_array selinux: convert type_val_to_struct to flex_array flex_array: fix flex_array_put_ptr macro to be valid C SELinux: do not set automatic i_ino in selinuxfs selinux: rework security_netlbl_secattr_to_sid SELinux: standardize return code handling in selinuxfs.c SELinux: standardize return code handling in selinuxfs.c SELinux: standardize return code handling in policydb.c ...
2011-01-10	headers: path.h redux	Alexey Dobriyan	1	-1/+0
	Remove path.h from sched.h and other files. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-10	Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into next	James Morris	10	-930/+943

2011-01-10	Merge branch 'master' into next	James Morris	2	-17/+21
	Conflicts: security/smack/smack_lsm.c Verified and added fix by Stephen Rothwell <sfr@canb.auug.org.au> Ok'd by Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: James Morris <jmorris@namei.org>
2011-01-07	Merge branch 'vfs-scale-working' of ↵	Linus Torvalds	1	-6/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin * 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin: (57 commits) fs: scale mntget/mntput fs: rename vfsmount counter helpers fs: implement faster dentry memcmp fs: prefetch inode data in dcache lookup fs: improve scalability of pseudo filesystems fs: dcache per-inode inode alias locking fs: dcache per-bucket dcache hash locking bit_spinlock: add required includes kernel: add bl_list xfs: provide simple rcu-walk ACL implementation btrfs: provide simple rcu-walk ACL implementation ext2,3,4: provide simple rcu-walk ACL implementation fs: provide simple rcu-walk generic_check_acl implementation fs: provide rcu-walk aware permission i_ops fs: rcu-walk aware d_revalidate method fs: cache optimise dentry and inode for rcu-walk fs: dcache reduce branches in lookup path fs: dcache remove d_mounted fs: fs_struct use seqlock fs: rcu-walk for path lookup ...
2011-01-07	fs: dcache rationalise dget variants	Nick Piggin	1	-1/+1
	dget_locked was a shortcut to avoid the lazy lru manipulation when we already held dcache_lock (lru manipulation was relatively cheap at that point). However, how that the lru lock is an innermost one, we never hold it at any caller, so the lock cost can now be avoided. We already have well working lazy dcache LRU, so it should be fine to defer LRU manipulations to scan time. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07	fs: dcache remove dcache_lock	Nick Piggin	1	-4/+0
	dcache_lock no longer protects anything. remove it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07	fs: dcache scale subdirs	Nick Piggin	1	-2/+10
	Protect d_subdirs and d_child with d_lock, except in filesystems that aren't using dcache_lock for these anyway (eg. using i_mutex). Note: if we change the locking rule in future so that ->d_child protection is provided only with ->d_parent->d_lock, it may allow us to reduce some locking. But it would be an exception to an otherwise regular locking scheme, so we'd have to see some good results. Probably not worthwhile. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-05	af_unix: Avoid socket->sk NULL OOPS in stream connect security hooks.	David S. Miller	1	-5/+5
	unix_release() can asynchornously set socket->sk to NULL, and it does so without holding the unix_state_lock() on "other" during stream connects. However, the reverse mapping, sk->sk_socket, is only transitioned to NULL under the unix_state_lock(). Therefore make the security hooks follow the reverse mapping instead of the forward mapping. Reported-by: Jeremy Fitzhardinge <jeremy@goop.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-26	Merge branch 'master' of ↵	David S. Miller	1	-5/+1
	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/fib_frontend.c
2010-12-16	SELinux: define permissions for DCB netlink messages	Eric Paris	1	-0/+2
	Commit 2f90b865 added two new netlink message types to the netlink route socket. SELinux has hooks to define if netlink messages are allowed to be sent or received, but it did not know about these two new message types. By default we allow such actions so noone likely noticed. This patch adds the proper definitions and thus proper permissions enforcement. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-12-07	selinux: cache sidtab_context_to_sid results	Eric Paris	2	-2/+39
	sidtab_context_to_sid takes up a large share of time when creating large numbers of new inodes (~30-40% in oprofile runs). This patch implements a cache of 3 entries which is checked before we do a full context_to_sid lookup. On one system this showed over a x3 improvement in the number of inodes that could be created per second and around a 20% improvement on another system. Any time we look up the same context string sucessivly (imagine ls -lZ) we should hit this cache hot. A cache miss should have a relatively minor affect on performance next to doing the full table search. All operations on the cache are done COMPLETELY lockless. We know that all struct sidtab_node objects created will never be deleted until a new policy is loaded thus we never have to worry about a pointer being dereferenced. Since we also know that pointer assignment is atomic we know that the cache will always have valid pointers. Given this information we implement a FIFO cache in an array of 3 pointers. Every result (whether a cache hit or table lookup) will be places in the 0 spot of the cache and the rest of the entries moved down one spot. The 3rd entry will be lost. Races are possible and are even likely to happen. Lets assume that 4 tasks are hitting sidtab_context_to_sid. The first task checks against the first entry in the cache and it is a miss. Now lets assume a second task updates the cache with a new entry. This will push the first entry back to the second spot. Now the first task might check against the second entry (which it already checked) and will miss again. Now say some third task updates the cache and push the second entry to the third spot. The first task my check the third entry (for the third time!) and again have a miss. At which point it will just do a full table lookup. No big deal! Signed-off-by: Eric Paris <eparis@redhat.com>
2010-12-02	SELinux: do not compute transition labels on mountpoint labeled filesystems	Eric Paris	1	-1/+4
	selinux_inode_init_security computes transitions sids even for filesystems that use mount point labeling. It shouldn't do that. It should just use the mount point label always and no matter what. This causes 2 problems. 1) it makes file creation slower than it needs to be since we calculate the transition sid and 2) it allows files to be created with a different label than the mount point! # id -Z staff_u:sysadm_r:sysadm_t:s0-s0:c0.c1023 # sesearch --type --class file --source sysadm_t --target tmp_t Found 1 semantic te rules: type_transition sysadm_t tmp_t : file user_tmp_t; # mount -o loop,context="system_u:object_r:tmp_t:s0" /tmp/fs /mnt/tmp # ls -lZ /mnt/tmp drwx------. root root system_u:object_r:tmp_t:s0 lost+found # touch /mnt/tmp/file1 # ls -lZ /mnt/tmp -rw-r--r--. root root staff_u:object_r:user_tmp_t:s0 file1 drwx------. root root system_u:object_r:tmp_t:s0 lost+found Whoops, we have a mount point labeled filesystem tmp_t with a user_tmp_t labeled file! Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: Reviewed-by: James Morris <jmorris@namei.org>
2010-11-30	SELinux: merge policydb_index_classes and policydb_index_others	Eric Paris	1	-59/+10
	We duplicate functionality in policydb_index_classes() and policydb_index_others(). This patch merges those functions just to make it clear there is nothing special happening here. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30	selinux: convert part of the sym_val_to_name array to use flex_array	Eric Paris	5	-68/+127
	The sym_val_to_name type array can be quite large as it grows linearly with the number of types. With known policies having over 5k types these allocations are growing large enough that they are likely to fail. Convert those to flex_array so no allocation is larger than PAGE_SIZE Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30	selinux: convert type_val_to_struct to flex_array	Eric Paris	3	-13/+34
	In rawhide type_val_to_struct will allocate 26848 bytes, an order 3 allocations. While this hasn't been seen to fail it isn't outside the realm of possibiliy on systems with severe memory fragmentation. Convert to flex_array so no allocation will ever be bigger than PAGE_SIZE. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30	SELinux: do not set automatic i_ino in selinuxfs	Eric Paris	1	-1/+0
	selinuxfs carefully uses i_ino to figure out what the inode refers to. The VFS used to generically set this value and we would reset it to something useable. After 85fe4025c616 each filesystem sets this value to a default if needed. Since selinuxfs doesn't use the default value and it can only lead to problems (I'd rather have 2 inodes with i_ino == 0 than one pointing to the wrong data) lets just stop setting a default. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: James Morris <jmorris@namei.org>
2010-11-30	selinux: rework security_netlbl_secattr_to_sid	Eric Paris	1	-21/+21
	security_netlbl_secattr_to_sid is difficult to follow, especially the return codes. Try to make the function obvious. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30	SELinux: standardize return code handling in selinuxfs.c	Eric Paris	1	-171/+157
	selinuxfs.c has lots of different standards on how to handle return paths on error. For the most part transition to rc=errno if (failure) goto out; [...] out: cleanup() return rc; Instead of doing cleanup mid function, or having multiple returns or other options. This doesn't do that for every function, but most of the complex functions which have cleanup routines on error. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30	SELinux: standardize return code handling in selinuxfs.c	Eric Paris	1	-337/+311
	selinuxfs.c has lots of different standards on how to handle return paths on error. For the most part transition to rc=errno if (failure) goto out; [...] out: cleanup() return rc; Instead of doing cleanup mid function, or having multiple returns or other options. This doesn't do that for every function, but most of the complex functions which have cleanup routines on error. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30	SELinux: standardize return code handling in policydb.c	Eric Paris	1	-287/+268
	policydb.c has lots of different standards on how to handle return paths on error. For the most part transition to rc=errno if (failure) goto out; [...] out: cleanup() return rc; Instead of doing cleanup mid function, or having multiple returns or other options. This doesn't do that for every function, but most of the complex functions which have cleanup routines on error. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-29	security: Define CAP_SYSLOG	Serge E. Hallyn	1	-1/+1
	Privileged syslog operations currently require CAP_SYS_ADMIN. Split this off into a new CAP_SYSLOG privilege which we can sanely take away from a container through the capability bounding set. With this patch, an lxc container can be prevented from messing with the host's syslog (i.e. dmesg -c). Changelog: mar 12 2010: add selinux capability2:cap_syslog perm Changelog: nov 22 2010: . port to new kernel . add a WARN_ONCE if userspace isn't using CAP_SYSLOG Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Acked-By: Kees Cook <kees.cook@canonical.com> Cc: James Morris <jmorris@namei.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: "Christopher J. PeBenito" <cpebenito@tresys.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: James Morris <jmorris@namei.org>
2010-11-23	SELinux: indicate fatal error in compat netfilter code	Eric Paris	1	-2/+2
	The SELinux ip postroute code indicates when policy rejected a packet and passes the error back up the stack. The compat code does not. This patch sends the same kind of error back up the stack in the compat code. Based-on-patch-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-23	SELinux: Only return netlink error when we know the return is fatal	Eric Paris	1	-4/+4
	Some of the SELinux netlink code returns a fatal error when the error might actually be transient. This patch just silently drops packets on potentially transient errors but continues to return a permanant error indicator when the denial was because of policy. Based-on-comments-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17	SELinux: return -ECONNREFUSED from ip_postroute to signal fatal error	Eric Paris	1	-8/+8
	The SELinux netfilter hooks just return NF_DROP if they drop a packet. We want to signal that a drop in this hook is a permanant fatal error and is not transient. If we do this the error will be passed back up the stack in some places and applications will get a faster interaction that something went wrong. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15	capabilities/syslog: open code cap_syslog logic to fix build failure	Eric Paris	1	-5/+1
	The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build failure when CONFIG_PRINTK=n. This is because the capabilities code which used the new option was built even though the variable in question didn't exist. The patch here fixes this by moving the capabilities checks out of the LSM and into the caller. All (known) LSMs should have been calling the capabilities hook already so it actually makes the code organization better to eliminate the hook altogether. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-29	convert get_sb_single() users	Al Viro	1	-5/+4
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25	fs: do not assign default i_ino in new_inode	Christoph Hellwig	1	-0/+1
	Instead of always assigning an increasing inode number in new_inode move the call to assign it into those callers that actually need it. For now callers that need it is estimated conservatively, that is the call is added to all filesystems that do not assign an i_ino by themselves. For a few more filesystems we can avoid assigning any inode number given that they aren't user visible, and for others it could be done lazily when an inode number is actually needed, but that's left for later patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-21	selinux: include vmalloc.h for vmalloc_user	Stephen Rothwell	1	-0/+1
	Include vmalloc.h for vmalloc_user (fixes ppc build warning). Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	selinux: implement mmap on /selinux/policy	Eric Paris	2	-1/+45
	/selinux/policy allows a user to copy the policy back out of the kernel. This patch allows userspace to actually mmap that file and use it directly. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	SELinux: allow userspace to read policy back out of the kernel	Eric Paris	12	-3/+1256
	There is interest in being able to see what the actual policy is that was loaded into the kernel. The patch creates a new selinuxfs file /selinux/policy which can be read by userspace. The actual policy that is loaded into the kernel will be written back out to userspace. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	SELinux: drop useless (and incorrect) AVTAB_MAX_SIZE	Eric Paris	2	-3/+2
	AVTAB_MAX_SIZE was a define which was supposed to be used in userspace to define a maximally sized avtab when userspace wasn't sure how big of a table it needed. It doesn't make sense in the kernel since we always know our table sizes. The only place it is used we have a more appropiately named define called AVTAB_MAX_HASH_BUCKETS, use that instead. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	SELinux: deterministic ordering of range transition rules	Eric Paris	1	-3/+13
	Range transition rules are placed in the hash table in an (almost) arbitrary order. This patch inserts them in a fixed order to make policy retrival more predictable. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	security: secid_to_secctx returns len when data is NULL	Eric Paris	1	-2/+9
	With the (long ago) interface change to have the secid_to_secctx functions do the string allocation instead of having the caller do the allocation we lost the ability to query the security server for the length of the upcoming string. The SECMARK code would like to allocate a netlink skb with enough length to hold the string but it is just too unclean to do the string allocation twice or to do the allocation the first time and hold onto the string and slen. This patch adds the ability to call security_secid_to_secctx() with a NULL data pointer and it will just set the slen pointer. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	secmark: make secmark object handling generic	Eric Paris	3	-49/+25
	Right now secmark has lots of direct selinux calls. Use all LSM calls and remove all SELinux specific knowledge. The only SELinux specific knowledge we leave is the mode. The only point is to make sure that other LSMs at least test this generic code before they assume it works. (They may also have to make changes if they do not represent labels as strings) Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Paul Moore <paul.moore@hp.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	security: remove unused parameter from security_task_setscheduler()	KOSAKI Motohiro	1	-2/+2
	All security modules shouldn't change sched_param parameter of security_task_setscheduler(). This is not only meaningless, but also make a harmful result if caller pass a static variable. This patch remove policy and sched_param parameter from security_task_setscheduler() becuase none of security module is using it. Cc: James Morris <jmorris@namei.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	selinux: fix up style problem on /selinux/status	KaiGai Kohei	2	-11/+7
	This patch fixes up coding-style problem at this commit: 4f27a7d49789b04404eca26ccde5f527231d01d5 selinux: fast status update interface (/selinux/status) Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	selinux: change to new flag variable	matt mooney	1	-1/+1
	Replace EXTRA_CFLAGS with ccflags-y. Signed-off-by: matt mooney <mfm@muteddisk.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	selinux: really fix dependency causing parallel compile failure.	Paul Gortmaker	2	-20/+6
	While the previous change to the selinux Makefile reduced the window significantly for this failure, it is still possible to see a compile failure where cpp starts processing selinux files before the auto generated flask.h file is completed. This is easily reproduced by adding the following temporary change to expose the issue everytime: - cmd_flask = scripts/selinux/genheaders/genheaders ... + cmd_flask = sleep 30 ; scripts/selinux/genheaders/genheaders ... This failure happens because the creation of the object files in the ss subdir also depends on flask.h. So simply incorporate them into the parent Makefile, as the ss/Makefile really doesn't do anything unique. With this change, compiling of all selinux files is dependent on completion of the header file generation, and this test case with the "sleep 30" now confirms it is functioning as expected. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	selinux: fix parallel compile error	Paul Gortmaker	1	-1/+1
	Selinux has an autogenerated file, "flask.h" which is included by two other selinux files. The current makefile has a single dependency on the first object file in the selinux-y list, assuming that will get flask.h generated before anyone looks for it, but that assumption breaks down in a "make -jN" situation and you get: selinux/selinuxfs.c:35: fatal error: flask.h: No such file or directory compilation terminated. remake[9]: *** [security/selinux/selinuxfs.o] Error 1 Since flask.h is included by security.h which in turn is included nearly everywhere, make the dependency apply to all of the selinux-y list of objs. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	selinux: fast status update interface (/selinux/status)	KaiGai Kohei	5	-1/+210
	This patch provides a new /selinux/status entry which allows applications read-only mmap(2). This region reflects selinux_kernel_status structure in kernel space. struct selinux_kernel_status { u32 length; /* length of this structure / u32 sequence; / sequence number of seqlock logic / u32 enforcing; / current setting of enforcing mode / u32 policyload; / times of policy reloaded / u32 deny_unknown; / current setting of deny_unknown */ }; When userspace object manager caches access control decisions provided by SELinux, it needs to invalidate the cache on policy reload and setenforce to keep consistency. However, the applications need to check the kernel state for each accesses on userspace avc, or launch a background worker process. In heuristic, frequency of invalidation is much less than frequency of making access control decision, so it is annoying to invoke a system call to check we don't need to invalidate the userspace cache. If we can use a background worker thread, it allows to receive invalidation messages from the kernel. But it requires us an invasive coding toward the base application in some cases; E.g, when we provide a feature performing with SELinux as a plugin module, it is unwelcome manner to launch its own worker thread from the module. If we could map /selinux/status to process memory space, application can know updates of selinux status; policy reload or setenforce. A typical application checks selinux_kernel_status::sequence when it tries to reference userspace avc. If it was changed from the last time when it checked userspace avc, it means something was updated in the kernel space. Then, the application can reset userspace avc or update current enforcing mode, without any system call invocations. This sequence number is updated according to the seqlock logic, so we need to wait for a while if it is odd number. Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Acked-by: Eric Paris <eparis@redhat.com> -- security/selinux/include/security.h \| 21 ++++++ security/selinux/selinuxfs.c \| 56 +++++++++++++++ security/selinux/ss/Makefile \| 2 +- security/selinux/ss/services.c \| 3 + security/selinux/ss/status.c \| 129 +++++++++++++++++++++++++++++++++++ 5 files changed, 210 insertions(+), 1 deletions(-) Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21	selinux: type_bounds_sanity_check has a meaningless variable declaration	Eric Paris	1	-2/+2
	type is not used at all, stop declaring and assigning it. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-18	tty: fix fu_list abuse	Nick Piggin	1	-1/+4
	tty: fix fu_list abuse tty code abuses fu_list, which causes a bug in remount,ro handling. If a tty device node is opened on a filesystem, then the last link to the inode removed, the filesystem will be allowed to be remounted readonly. This is because fs_may_remount_ro does not find the 0 link tty inode on the file sb list (because the tty code incorrectly removed it to use for its own purpose). This can result in a filesystem with errors after it is marked "clean". Taking idea from Christoph's initial patch, allocate a tty private struct at file->private_data and put our required list fields in there, linking file and tty. This makes tty nodes behave the same way as other device nodes and avoid meddling with the vfs, and avoids this bug. The error handling is not trivial in the tty code, so for this bugfix, I take the simple approach of using __GFP_NOFAIL and don't worry about memory errors. This is not a problem because our allocator doesn't fail small allocs as a rule anyway. So proper error handling is left as an exercise for tty hackers. [ Arguably filesystem's device inode would ideally be divorced from the driver's pseudo inode when it is opened, but in practice it's not clear whether that will ever be worth implementing. ] Cc: linux-kernel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-18	fs: cleanup files_lock locking	Nick Piggin	1	-2/+2
	fs: cleanup files_lock locking Lock tty_files with a new spinlock, tty_files_lock; provide helpers to manipulate the per-sb files list; unexport the files_lock spinlock. Cc: linux-kernel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-10	Merge branch 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux	Linus Torvalds	1	-4/+8
	* 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux: unistd: add __NR_prlimit64 syscall numbers rlimits: implement prlimit64 syscall rlimits: switch more rlimit syscalls to do_prlimit rlimits: redo do_setrlimit to more generic do_prlimit rlimits: add rlimit64 structure rlimits: do security check under task_lock rlimits: allow setrlimit to non-current tasks rlimits: split sys_setrlimit rlimits: selinux, do rlimits changes under task_lock rlimits: make sure ->rlim_max never grows in sys_setrlimit rlimits: add task_struct to update_rlimit_cpu rlimits: security, add task_struct to setrlimit Fix up various system call number conflicts. We not only added fanotify system calls in the meantime, but asm-generic/unistd.h added a wait4 along with a range of reserved per-architecture system calls.
2010-08-06	SELINUX: Fix build error.	Ralf Baechle	1	-1/+1
	Fix build error caused by a stale security/selinux/av_permissions.h in the $(src) directory which will override a more recent version in $(obj) that is it appears to strike only when building with a separate object directory. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02	selinux: convert the policy type_attr_map to flex_array	Eric Paris	3	-13/+39
	Current selinux policy can have over 3000 types. The type_attr_map in policy is an array sized by the number of types times sizeof(struct ebitmap) (12 on x86_64). Basic math tells us the array is going to be of length 3000 x 12 = 36,000 bytes. The largest 'safe' allocation on a long running system is 16k. Most of the time a 32k allocation will work. But on long running systems a 64k allocation (what we need) can fail quite regularly. In order to deal with this I am converting the type_attr_map to use flex_arrays. Let the library code deal with breaking this into PAGE_SIZE pieces. -v2 rework some of the if(!obj) BUG() to be BUG_ON(!obj) drop flex_array_put() calls and just use a _get() object directly -v3 make apply to James' tree (drop the policydb_write changes) Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02	SELinux: Move execmod to the common perms	Eric Paris	1	-4/+3
	execmod "could" show up on non regular files and non chr files. The current implementation would actually make these checks against non-existant bits since the code assumes the execmod permission is same for all file types. To make this line up for chr files we had to define execute_no_trans and entrypoint permissions. These permissions are unreachable and only existed to to make FILE__EXECMOD and CHR_FILE__EXECMOD the same. This patch drops those needless perms as well. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02	selinux: place open in the common file perms	Eric Paris	2	-28/+11
	kernel can dynamically remap perms. Drop the open lookup table and put open in the common file perms. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02	SELinux: special dontaudit for access checks	Eric Paris	3	-8/+38
	Currently there are a number of applications (nautilus being the main one) which calls access() on files in order to determine how they should be displayed. It is normal and expected that nautilus will want to see if files are executable or if they are really read/write-able. access() should return the real permission. SELinux policy checks are done in access() and can result in lots of AVC denials as policy denies RWX on files which DAC allows. Currently SELinux must dontaudit actual attempts to read/write/execute a file in order to silence these messages (and not flood the logs.) But dontaudit rules like that can hide real attacks. This patch addes a new common file permission audit_access. This permission is special in that it is meaningless and should never show up in an allow rule. Instead the only place this permission has meaning is in a dontaudit rule like so: dontaudit nautilus_t sbin_t:file audit_access With such a rule if nautilus just checks access() we will still get denied and thus userspace will still get the correct answer but we will not log the denial. If nautilus attempted to actually perform one of the forbidden actions (rather than just querying access(2) about it) we would still log a denial. This type of dontaudit rule should be used sparingly, as it could be a method for an attacker to probe the system permissions without detection. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>