summaryrefslogtreecommitdiffstats
path: root/security/keys/compat.c
AgeCommit message (Collapse)AuthorFilesLines
2020-10-03security/keys: remove compat_keyctl_instantiate_key_iovChristoph Hellwig1-34/+2
Now that import_iovec handles compat iovecs, the native version of keyctl_instantiate_key_iov can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03iov_iter: transparently handle compat iovecs in import_iovecChristoph Hellwig1-3/+2
Use in compat_syscall to import either native or the compat iovecs, and remove the now superflous compat_import_iovec. This removes the need for special compat logic in most callers, and the remaining ones can still be simplified by using __import_iovec with a bool compat parameter. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-19watch_queue: Add a key/keyring notification facilityDavid Howells1-0/+3
Add a key/keyring change notification facility whereby notifications about changes in key and keyring content and attributes can be received. Firstly, an event queue needs to be created: pipe2(fds, O_NOTIFICATION_PIPE); ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, 256); then a notification can be set up to report notifications via that queue: struct watch_notification_filter filter = { .nr_filters = 1, .filters = { [0] = { .type = WATCH_TYPE_KEY_NOTIFY, .subtype_filter[0] = UINT_MAX, }, }, }; ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter); keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01); After that, records will be placed into the queue when events occur in which keys are changed in some way. Records are of the following format: struct key_notification { struct watch_notification watch; __u32 key_id; __u32 aux; } *n; Where: n->watch.type will be WATCH_TYPE_KEY_NOTIFY. n->watch.subtype will indicate the type of event, such as NOTIFY_KEY_REVOKED. n->watch.info & WATCH_INFO_LENGTH will indicate the length of the record. n->watch.info & WATCH_INFO_ID will be the second argument to keyctl_watch_key(), shifted. n->key will be the ID of the affected key. n->aux will hold subtype-dependent information, such as the key being linked into the keyring specified by n->key in the case of NOTIFY_KEY_LINKED. Note that it is permissible for event records to be of variable length - or, at least, the length may be dependent on the subtype. Note also that the queue can be shared between multiple notifications of various types. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jamorris@linux.microsoft.com>
2019-12-12KEYS: remove CONFIG_KEYS_COMPATEric Biggers1-5/+0
KEYS_COMPAT now always takes the value of COMPAT && KEYS. But the security/keys/ directory is only compiled if KEYS is enabled, so in practice KEYS_COMPAT is the same as COMPAT. Therefore, remove the unnecessary KEYS_COMPAT and just use COMPAT directly. (Also remove an outdated comment from compat.c.) Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-07-10Revert "Merge tag 'keys-acl-20190703' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" This reverts merge 0f75ef6a9cff49ff612f7ce0578bced9d0b38325 (and thus effectively commits 7a1ade847596 ("keys: Provide KEYCTL_GRANT_PERMISSION") 2e12256b9a76 ("keys: Replace uid/gid/perm permissions checking with an ACL") that the merge brought in). It turns out that it breaks booting with an encrypted volume, and Eric biggers reports that it also breaks the fscrypt tests [1] and loading of in-kernel X.509 certificates [2]. The root cause of all the breakage is likely the same, but David Howells is off email so rather than try to work it out it's getting reverted in order to not impact the rest of the merge window. [1] https://lore.kernel.org/lkml/20190710011559.GA7973@sol.localdomain/ [2] https://lore.kernel.org/lkml/20190710013225.GB7973@sol.localdomain/ Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/ Reported-by: Eric Biggers <ebiggers@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-08Merge tag 'keys-acl-20190703' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull keyring ACL support from David Howells: "This changes the permissions model used by keys and keyrings to be based on an internal ACL by the following means: - Replace the permissions mask internally with an ACL that contains a list of ACEs, each with a specific subject with a permissions mask. Potted default ACLs are available for new keys and keyrings. ACE subjects can be macroised to indicate the UID and GID specified on the key (which remain). Future commits will be able to add additional subject types, such as specific UIDs or domain tags/namespaces. Also split a number of permissions to give finer control. Examples include splitting the revocation permit from the change-attributes permit, thereby allowing someone to be granted permission to revoke a key without allowing them to change the owner; also the ability to join a keyring is split from the ability to link to it, thereby stopping a process accessing a keyring by joining it and thus acquiring use of possessor permits. - Provide a keyctl to allow the granting or denial of one or more permits to a specific subject. Direct access to the ACL is not granted, and the ACL cannot be viewed" * tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: keys: Provide KEYCTL_GRANT_PERMISSION keys: Replace uid/gid/perm permissions checking with an ACL
2019-07-08Merge tag 'keys-misc-20190619' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull misc keyring updates from David Howells: "These are some miscellaneous keyrings fixes and improvements: - Fix a bunch of warnings from sparse, including missing RCU bits and kdoc-function argument mismatches - Implement a keyctl to allow a key to be moved from one keyring to another, with the option of prohibiting key replacement in the destination keyring. - Grant Link permission to possessors of request_key_auth tokens so that upcall servicing daemons can more easily arrange things such that only the necessary auth key is passed to the actual service program, and not all the auth keys a daemon might possesss. - Improvement in lookup_user_key(). - Implement a keyctl to allow keyrings subsystem capabilities to be queried. The keyutils next branch has commits to make available, document and test the move-key and capabilities code: https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log They're currently on the 'next' branch" * tag 'keys-misc-20190619' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: keys: Add capability-checking keyctl function keys: Reuse keyring_index_key::desc_len in lookup_user_key() keys: Grant Link permission to possessers of request_key auth keys keys: Add a keyctl to move a key between keyrings keys: Hoist locking out of __key_link_begin() keys: Break bits out of key_unlink() keys: Change keyring_serialise_link_sem to a mutex keys: sparse: Fix kdoc mismatches keys: sparse: Fix incorrect RCU accesses keys: sparse: Fix key_fs[ug]id_changed()
2019-07-03keys: Provide KEYCTL_GRANT_PERMISSIONDavid Howells1-0/+2
Provide a keyctl() operation to grant/remove permissions. The grant operation, wrapped by libkeyutils, looks like: int ret = keyctl_grant_permission(key_serial_t key, enum key_ace_subject_type type, unsigned int subject, unsigned int perm); Where key is the key to be modified, type and subject represent the subject to which permission is to be granted (or removed) and perm is the set of permissions to be granted. 0 is returned on success. SET_SECURITY permission is required for this. The subject type currently must be KEY_ACE_SUBJ_STANDARD for the moment (other subject types will come along later). For subject type KEY_ACE_SUBJ_STANDARD, the following subject values are available: KEY_ACE_POSSESSOR The possessor of the key KEY_ACE_OWNER The owner of the key KEY_ACE_GROUP The key's group KEY_ACE_EVERYONE Everyone perm lists the permissions to be granted: KEY_ACE_VIEW Can view the key metadata KEY_ACE_READ Can read the key content KEY_ACE_WRITE Can update/modify the key content KEY_ACE_SEARCH Can find the key by searching/requesting KEY_ACE_LINK Can make a link to the key KEY_ACE_SET_SECURITY Can set security KEY_ACE_INVAL Can invalidate KEY_ACE_REVOKE Can revoke KEY_ACE_JOIN Can join this keyring KEY_ACE_CLEAR Can clear this keyring If an ACE already exists for the subject, then the permissions mask will be overwritten; if perm is 0, it will be deleted. Currently, the internal ACL is limited to a maximum of 16 entries. For example: int ret = keyctl_grant_permission(key, KEY_ACE_SUBJ_STANDARD, KEY_ACE_OWNER, KEY_ACE_VIEW | KEY_ACE_READ); Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-19keys: Add capability-checking keyctl functionDavid Howells1-0/+3
Add a keyctl function that requests a set of capability bits to find out what features are supported. Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-30keys: Add a keyctl to move a key between keyringsDavid Howells1-0/+3
Add a keyctl to atomically move a link to a key from one keyring to another. The key must exist in "from" keyring and a flag can be given to cause the operation to fail if there's a matching key already in the "to" keyring. This can be done with: keyctl(KEYCTL_MOVE, key_serial_t key, key_serial_t from_keyring, key_serial_t to_keyring, unsigned int flags); The key being moved must grant Link permission and both keyrings must grant Write permission. flags should be 0 or KEYCTL_MOVE_EXCL, with the latter preventing displacement of a matching key from the "to" keyring. Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-26KEYS: Provide keyctls to drive the new key type ops for asymmetric keys [ver #2]David Howells1-0/+18
Provide five keyctl functions that permit userspace to make use of the new key type ops for accessing and driving asymmetric keys. (*) Query an asymmetric key. long keyctl(KEYCTL_PKEY_QUERY, key_serial_t key, unsigned long reserved, struct keyctl_pkey_query *info); Get information about an asymmetric key. The information is returned in the keyctl_pkey_query struct: __u32 supported_ops; A bit mask of flags indicating which ops are supported. This is constructed from a bitwise-OR of: KEYCTL_SUPPORTS_{ENCRYPT,DECRYPT,SIGN,VERIFY} __u32 key_size; The size in bits of the key. __u16 max_data_size; __u16 max_sig_size; __u16 max_enc_size; __u16 max_dec_size; The maximum sizes in bytes of a blob of data to be signed, a signature blob, a blob to be encrypted and a blob to be decrypted. reserved must be set to 0. This is intended for future use to hand over one or more passphrases needed unlock a key. If successful, 0 is returned. If the key is not an asymmetric key, EOPNOTSUPP is returned. (*) Encrypt, decrypt, sign or verify a blob using an asymmetric key. long keyctl(KEYCTL_PKEY_ENCRYPT, const struct keyctl_pkey_params *params, const char *info, const void *in, void *out); long keyctl(KEYCTL_PKEY_DECRYPT, const struct keyctl_pkey_params *params, const char *info, const void *in, void *out); long keyctl(KEYCTL_PKEY_SIGN, const struct keyctl_pkey_params *params, const char *info, const void *in, void *out); long keyctl(KEYCTL_PKEY_VERIFY, const struct keyctl_pkey_params *params, const char *info, const void *in, const void *in2); Use an asymmetric key to perform a public-key cryptographic operation a blob of data. The parameter block pointed to by params contains a number of integer values: __s32 key_id; __u32 in_len; __u32 out_len; __u32 in2_len; For a given operation, the in and out buffers are used as follows: Operation ID in,in_len out,out_len in2,in2_len ======================= =============== =============== =========== KEYCTL_PKEY_ENCRYPT Raw data Encrypted data - KEYCTL_PKEY_DECRYPT Encrypted data Raw data - KEYCTL_PKEY_SIGN Raw data Signature - KEYCTL_PKEY_VERIFY Raw data - Signature info is a string of key=value pairs that supply supplementary information. The __spare space in the parameter block must be set to 0. This is intended, amongst other things, to allow the passing of passphrases required to unlock a key. If successful, encrypt, decrypt and sign all return the amount of data written into the output buffer. Verification returns 0 on success. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marcel Holtmann <marcel@holtmann.org> Reviewed-by: Marcel Holtmann <marcel@holtmann.org> Reviewed-by: Denis Kenzior <denkenz@gmail.com> Tested-by: Denis Kenzior <denkenz@gmail.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2017-04-04KEYS: add SP800-56A KDF support for DHStephan Mueller1-2/+3
SP800-56A defines the use of DH with key derivation function based on a counter. The input to the KDF is defined as (DH shared secret || other information). The value for the "other information" is to be provided by the caller. The KDF is implemented using the hash support from the kernel crypto API. The implementation uses the symmetric hash support as the input to the hash operation is usually very small. The caller is allowed to specify the hash name that he wants to use to derive the key material allowing the use of all supported hashes provided with the kernel crypto API. As the KDF implements the proper truncation of the DH shared secret to the requested size, this patch fills the caller buffer up to its size. The patch is tested with a new test added to the keyutils user space code which uses a CAVS test vector testing the compliance with SP800-56A. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: David Howells <dhowells@redhat.com>
2017-04-04KEYS: Add KEYCTL_RESTRICT_KEYRINGMat Martineau1-0/+4
Keyrings recently gained restrict_link capabilities that allow individual keys to be validated prior to linking. This functionality was only available using internal kernel APIs. With the KEYCTL_RESTRICT_KEYRING command existing keyrings can be configured to check the content of keys before they are linked, and then allow or disallow linkage of that key to the keyring. To restrict a keyring, call: keyctl(KEYCTL_RESTRICT_KEYRING, key_serial_t keyring, const char *type, const char *restriction) where 'type' is the name of a registered key type and 'restriction' is a string describing how key linkage is to be restricted. The restriction option syntax is specific to each key type. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
2016-06-03KEYS: Add placeholder for KDF usage with DHStephan Mueller1-1/+1
The values computed during Diffie-Hellman key exchange are often used in combination with key derivation functions to create cryptographic keys. Add a placeholder for a later implementation to configure a key derivation function that will transform the Diffie-Hellman result returned by the KEYCTL_DH_COMPUTE command. [This patch was stripped down from a patch produced by Mat Martineau that had a bug in the compat code - so for the moment Stephan's patch simply requires that the placeholder argument must be NULL] Original-signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-12KEYS: Add KEYCTL_DH_COMPUTE commandMat Martineau1-0/+4
This adds userspace access to Diffie-Hellman computations through a new keyctl() syscall command to calculate shared secrets or public keys using input parameters stored in the keyring. Input key ids are provided in a struct due to the current 5-arg limit for the keyctl syscall. Only user keys are supported in order to avoid exposing the content of logon or encrypted keys. The output is written to the provided buffer, based on the assumption that the values are only needed in userspace. Future support for other types of key derivation would involve a new command, like KEYCTL_ECDH_COMPUTE. Once Diffie-Hellman support is included in the crypto API, this code can be converted to use the crypto API to take advantage of possible hardware acceleration and reduce redundant code. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com>
2015-04-11switch keyctl_instantiate_key_common() to iov_iterAl Viro1-19/+10
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-03-06security/compat: convert to COMPAT_SYSCALL_DEFINEHeiko Carstens1-2/+2
Convert all compat system call functions where all parameter types have a size of four or less than four bytes, or are pointer types to COMPAT_SYSCALL_DEFINE. The implicit casts within COMPAT_SYSCALL_DEFINE will perform proper zero and sign extension to 64 bit of all parameters if needed. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-09-24KEYS: Add per-user_namespace registers for persistent per-UID kerberos cachesDavid Howells1-0/+3
Add support for per-user_namespace registers of persistent per-UID kerberos caches held within the kernel. This allows the kerberos cache to be retained beyond the life of all a user's processes so that the user's cron jobs can work. The kerberos cache is envisioned as a keyring/key tree looking something like: struct user_namespace \___ .krb_cache keyring - The register \___ _krb.0 keyring - Root's Kerberos cache \___ _krb.5000 keyring - User 5000's Kerberos cache \___ _krb.5001 keyring - User 5001's Kerberos cache \___ tkt785 big_key - A ccache blob \___ tkt12345 big_key - Another ccache blob Or possibly: struct user_namespace \___ .krb_cache keyring - The register \___ _krb.0 keyring - Root's Kerberos cache \___ _krb.5000 keyring - User 5000's Kerberos cache \___ _krb.5001 keyring - User 5001's Kerberos cache \___ tkt785 keyring - A ccache \___ krbtgt/REDHAT.COM@REDHAT.COM big_key \___ http/REDHAT.COM@REDHAT.COM user \___ afs/REDHAT.COM@REDHAT.COM user \___ nfs/REDHAT.COM@REDHAT.COM user \___ krbtgt/KERNEL.ORG@KERNEL.ORG big_key \___ http/KERNEL.ORG@KERNEL.ORG big_key What goes into a particular Kerberos cache is entirely up to userspace. Kernel support is limited to giving you the Kerberos cache keyring that you want. The user asks for their Kerberos cache by: krb_cache = keyctl_get_krbcache(uid, dest_keyring); The uid is -1 or the user's own UID for the user's own cache or the uid of some other user's cache (requires CAP_SETUID). This permits rpc.gssd or whatever to mess with the cache. The cache returned is a keyring named "_krb.<uid>" that the possessor can read, search, clear, invalidate, unlink from and add links to. Active LSMs get a chance to rule on whether the caller is permitted to make a link. Each uid's cache keyring is created when it first accessed and is given a timeout that is extended each time this function is called so that the keyring goes away after a while. The timeout is configurable by sysctl but defaults to three days. Each user_namespace struct gets a lazily-created keyring that serves as the register. The cache keyrings are added to it. This means that standard key search and garbage collection facilities are available. The user_namespace struct's register goes away when it does and anything left in it is then automatically gc'd. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Simo Sorce <simo@redhat.com> cc: Serge E. Hallyn <serge.hallyn@ubuntu.com> cc: Eric W. Biederman <ebiederm@xmission.com>
2013-03-12Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and ↵Mathieu Desnoyers1-2/+2
security keys Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to compat_process_vm_rw() shows that the compatibility code requires an explicit "access_ok()" check before calling compat_rw_copy_check_uvector(). The same difference seems to appear when we compare fs/read_write.c:do_readv_writev() to fs/compat.c:compat_do_readv_writev(). This subtle difference between the compat and non-compat requirements should probably be debated, as it seems to be error-prone. In fact, there are two others sites that use this function in the Linux kernel, and they both seem to get it wrong: Now shifting our attention to fs/aio.c, we see that aio_setup_iocb() also ends up calling compat_rw_copy_check_uvector() through aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to be missing. Same situation for security/keys/compat.c:compat_keyctl_instantiate_key_iov(). I propose that we add the access_ok() check directly into compat_rw_copy_check_uvector(), so callers don't have to worry about it, and it therefore makes the compat call code similar to its non-compat counterpart. Place the access_ok() check in the same location where copy_from_user() can trigger a -EFAULT error in the non-compat code, so the ABI behaviors are alike on both compat and non-compat. While we are here, fix compat_do_readv_writev() so it checks for compat_rw_copy_check_uvector() negative return values. And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error handling. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-10Merge commit 'v3.5-rc2' into nextJames Morris1-1/+1
2012-05-31aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()Christopher Yeoh1-1/+1
A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after changes made to support CMA in an earlier patch. Rather than having an additional check_access parameter to these functions, the first paramater type is overloaded to allow the caller to specify CHECK_IOVEC_ONLY which means check that the contents of the iovec are valid, but do not check the memory that they point to. This is used by process_vm_readv/writev where we need to validate that a iovec passed to the syscall is valid but do not want to check the memory that it points to at this point because it refers to an address space in another process. Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-25KEYS: Fix some sparse warningsDavid Howells1-2/+2
Fix some sparse warnings in the keyrings code: (1) compat_keyctl_instantiate_key_iov() should be static. (2) There were a couple of places where a pointer was being compared against integer 0 rather than NULL. (3) keyctl_instantiate_key_common() should not take a __user-labelled iovec pointer as the caller must have copied the iovec to kernel space. (4) __key_link_begin() takes and __key_link_end() releases keyring_serialise_link_sem under some circumstances and so this should be declared. Note that adding __acquires() and __releases() for this doesn't help cure the warnings messages - something only commenting out both helps. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-11KEYS: Add invalidation supportDavid Howells1-0/+3
Add support for invalidating a key - which renders it immediately invisible to further searches and causes the garbage collector to immediately wake up, remove it from keyrings and then destroy it when it's no longer referenced. It's better not to do this with keyctl_revoke() as that marks the key to start returning -EKEYREVOKED to searches when what is actually desired is to have the key refetched. To invalidate a key the caller must be granted SEARCH permission by the key. This may be too strict. It may be better to also permit invalidation if the caller has any of READ, WRITE or SETATTR permission. The primary use for this is to evict keys that are cached in special keyrings, such as the DNS resolver or an ID mapper. Signed-off-by: David Howells <dhowells@redhat.com>
2011-10-31Cross Memory AttachChristopher Yeoh1-1/+1
The basic idea behind cross memory attach is to allow MPI programs doing intra-node communication to do a single copy of the message rather than a double copy of the message via shared memory. The following patch attempts to achieve this by allowing a destination process, given an address and size from a source process, to copy memory directly from the source process into its own address space via a system call. There is also a symmetrical ability to copy from the current process's address space into a destination process's address space. - Use of /proc/pid/mem has been considered, but there are issues with using it: - Does not allow for specifying iovecs for both src and dest, assuming preadv or pwritev was implemented either the area read from or written to would need to be contiguous. - Currently mem_read allows only processes who are currently ptrace'ing the target and are still able to ptrace the target to read from the target. This check could possibly be moved to the open call, but its not clear exactly what race this restriction is stopping (reason appears to have been lost) - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix domain socket is a bit ugly from a userspace point of view, especially when you may have hundreds if not (eventually) thousands of processes that all need to do this with each other - Doesn't allow for some future use of the interface we would like to consider adding in the future (see below) - Interestingly reading from /proc/pid/mem currently actually involves two copies! (But this could be fixed pretty easily) As mentioned previously use of vmsplice instead was considered, but has problems. Since you need the reader and writer working co-operatively if the pipe is not drained then you block. Which requires some wrapping to do non blocking on the send side or polling on the receive. In all to all communication it requires ordering otherwise you can deadlock. And in the example of many MPI tasks writing to one MPI task vmsplice serialises the copying. There are some cases of MPI collectives where even a single copy interface does not get us the performance gain we could. For example in an MPI_Reduce rather than copy the data from the source we would like to instead use it directly in a mathops (say the reduce is doing a sum) as this would save us doing a copy. We don't need to keep a copy of the data from the source. I haven't implemented this, but I think this interface could in the future do all this through the use of the flags - eg could specify the math operation and type and the kernel rather than just copying the data would apply the specified operation between the source and destination and store it in the destination. Although we don't have a "second user" of the interface (though I've had some nibbles from people who may be interested in using it for intra process messaging which is not MPI). This interface is something which hardware vendors are already doing for their custom drivers to implement fast local communication. And so in addition to this being useful for OpenMPI it would mean the driver maintainers don't have to fix things up when the mm changes. There was some discussion about how much faster a true zero copy would go. Here's a link back to the email with some testing I did on that: http://marc.info/?l=linux-mm&m=130105930902915&w=2 There is a basic man page for the proposed interface here: http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt This has been implemented for x86 and powerpc, other architecture should mainly (I think) just need to add syscall numbers for the process_vm_readv and process_vm_writev. There are 32 bit compatibility versions for 64-bit kernels. For arch maintainers there are some simple tests to be able to quickly verify that the syscalls are working correctly here: http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: <linux-man@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-08KEYS: Add an iovec version of KEYCTL_INSTANTIATEDavid Howells1-0/+47
Add a keyctl op (KEYCTL_INSTANTIATE_IOV) that is like KEYCTL_INSTANTIATE, but takes an iovec array and concatenates the data in-kernel into one buffer. Since the KEYCTL_INSTANTIATE copies the data anyway, this isn't too much of a problem. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2011-03-08KEYS: Add a new keyctl op to reject a key with a specified error codeDavid Howells1-0/+3
Add a new keyctl op to reject a key with a specified error code. This works much the same as negating a key, and so keyctl_negate_key() is made a special case of keyctl_reject_key(). The difference is that keyctl_negate_key() selects ENOKEY as the error to be reported. Typically the key would be rejected with EKEYEXPIRED, EKEYREVOKED or EKEYREJECTED, but this is not mandatory. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2011-01-21KEYS: Fix up comments in key management codeDavid Howells1-6/+7
Fix up comments in the key management code. No functional changes. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-21KEYS: Do some style cleanup in the key management code.David Howells1-3/+1
Do a bit of a style clean up in the key management code. No functional changes. Done using: perl -p -i -e 's!^/[*]*/\n!!' security/keys/*.c perl -p -i -e 's!} /[*] end [a-z0-9_]*[(][)] [*]/\n!}\n!' security/keys/*.c sed -i -s -e ": next" -e N -e 's/^\n[}]$/}/' -e t -e P -e 's/^.*\n//' -e "b next" security/keys/*.c To remove /*****/ lines, remove comments on the closing brace of a function to name the function and remove blank lines before the closing brace of a function. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-02KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]David Howells1-0/+3
Add a keyctl to install a process's session keyring onto its parent. This replaces the parent's session keyring. Because the COW credential code does not permit one process to change another process's credentials directly, the change is deferred until userspace next starts executing again. Normally this will be after a wait*() syscall. To support this, three new security hooks have been provided: cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in the blank security creds and key_session_to_parent() - which asks the LSM if the process may replace its parent's session keyring. The replacement may only happen if the process has the same ownership details as its parent, and the process has LINK permission on the session keyring, and the session keyring is owned by the process, and the LSM permits it. Note that this requires alteration to each architecture's notify_resume path. This has been done for all arches barring blackfin, m68k* and xtensa, all of which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the replacement to be performed at the point the parent process resumes userspace execution. This allows the userspace AFS pioctl emulation to fully emulate newpag() and the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to alter the parent process's PAG membership. However, since kAFS doesn't use PAGs per se, but rather dumps the keys into the session keyring, the session keyring of the parent must be replaced if, for example, VIOCSETTOK is passed the newpag flag. This can be tested with the following program: #include <stdio.h> #include <stdlib.h> #include <keyutils.h> #define KEYCTL_SESSION_TO_PARENT 18 #define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0) int main(int argc, char **argv) { key_serial_t keyring, key; long ret; keyring = keyctl_join_session_keyring(argv[1]); OSERROR(keyring, "keyctl_join_session_keyring"); key = add_key("user", "a", "b", 1, keyring); OSERROR(key, "add_key"); ret = keyctl(KEYCTL_SESSION_TO_PARENT); OSERROR(ret, "KEYCTL_SESSION_TO_PARENT"); return 0; } Compiled and linked with -lkeyutils, you should see something like: [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: _ses 355907932 --alswrv 4043 -1 \_ keyring: _uid.4043 [dhowells@andromeda ~]$ /tmp/newpag [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: _ses 1055658746 --alswrv 4043 4043 \_ user: a [dhowells@andromeda ~]$ /tmp/newpag hello [dhowells@andromeda ~]$ keyctl show Session Keyring -3 --alswrv 4043 4043 keyring: hello 340417692 --alswrv 4043 4043 \_ user: a Where the test program creates a new session keyring, sticks a user key named 'a' into it and then installs it on its parent. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2008-04-29keys: add keyctl function to get a security labelDavid Howells1-0/+3
Add a keyctl() function to get the security label of a key. The following is added to Documentation/keys.txt: (*) Get the LSM security context attached to a key. long keyctl(KEYCTL_GET_SECURITY, key_serial_t key, char *buffer, size_t buflen) This function returns a string that represents the LSM security context attached to a key in the buffer provided. Unless there's an error, it always returns the amount of data it could produce, even if that's too big for the buffer, but it won't copy more than requested to userspace. If the buffer pointer is NULL then no copy will take place. A NUL character is included at the end of the string if the buffer is sufficiently big. This is included in the returned count. If no LSM is in force then an empty string will be returned. A process must have view permission on the key for this function to be successful. [akpm@linux-foundation.org: declare keyctl_get_security()] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: Paul Moore <paul.moore@hp.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: James Morris <jmorris@namei.org> Cc: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14[PATCH] remove many unneeded #includes of sched.hTim Schmielau1-1/+0
After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2006-01-08[PATCH] keys: Permit running process to instantiate keysDavid Howells1-0/+3
Make it possible for a running process (such as gssapid) to be able to instantiate a key, as was requested by Trond Myklebust for NFS4. The patch makes the following changes: (1) A new, optional key type method has been added. This permits a key type to intercept requests at the point /sbin/request-key is about to be spawned and do something else with them - passing them over the rpc_pipefs files or netlink sockets for instance. The uninstantiated key, the authorisation key and the intended operation name are passed to the method. (2) The callout_info is no longer passed as an argument to /sbin/request-key to prevent unauthorised viewing of this data using ps or by looking in /proc/pid/cmdline. This means that the old /sbin/request-key program will not work with the patched kernel as it will expect to see an extra argument that is no longer there. A revised keyutils package will be made available tomorrow. (3) The callout_info is now attached to the authorisation key. Reading this key will retrieve the information. (4) A new field has been added to the task_struct. This holds the authorisation key currently active for a thread. Searches now look here for the caller's set of keys rather than looking for an auth key in the lowest level of the session keyring. This permits a thread to be servicing multiple requests at once and to switch between them. Note that this is per-thread, not per-process, and so is usable in multithreaded programs. The setting of this field is inherited across fork and exec. (5) A new keyctl function (KEYCTL_ASSUME_AUTHORITY) has been added that permits a thread to assume the authority to deal with an uninstantiated key. Assumption is only permitted if the authorisation key associated with the uninstantiated key is somewhere in the thread's keyrings. This function can also clear the assumption. (6) A new magic key specifier has been added to refer to the currently assumed authorisation key (KEY_SPEC_REQKEY_AUTH_KEY). (7) Instantiation will only proceed if the appropriate authorisation key is assumed first. The assumed authorisation key is discarded if instantiation is successful. (8) key_validate() is moved from the file of request_key functions to the file of permissions functions. (9) The documentation is updated. From: <Valdis.Kletnieks@vt.edu> Build fix. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] keys: Permit key expiry time to be setDavid Howells1-0/+3
Add a new keyctl function that allows the expiry time to be set on a key or removed from a key, provided the caller has attribute modification access. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24[PATCH] Keys: Make request-key create an authorisation keyDavid Howells1-2/+5
The attached patch makes the following changes: (1) There's a new special key type called ".request_key_auth". This is an authorisation key for when one process requests a key and another process is started to construct it. This type of key cannot be created by the user; nor can it be requested by kernel services. Authorisation keys hold two references: (a) Each refers to a key being constructed. When the key being constructed is instantiated the authorisation key is revoked, rendering it of no further use. (b) The "authorising process". This is either: (i) the process that called request_key(), or: (ii) if the process that called request_key() itself had an authorisation key in its session keyring, then the authorising process referred to by that authorisation key will also be referred to by the new authorisation key. This means that the process that initiated a chain of key requests will authorise the lot of them, and will, by default, wind up with the keys obtained from them in its keyrings. (2) request_key() creates an authorisation key which is then passed to /sbin/request-key in as part of a new session keyring. (3) When request_key() is searching for a key to hand back to the caller, if it comes across an authorisation key in the session keyring of the calling process, it will also search the keyrings of the process specified therein and it will use the specified process's credentials (fsuid, fsgid, groups) to do that rather than the calling process's credentials. This allows a process started by /sbin/request-key to find keys belonging to the authorising process. (4) A key can be read, even if the process executing KEYCTL_READ doesn't have direct read or search permission if that key is contained within the keyrings of a process specified by an authorisation key found within the calling process's session keyring, and is searchable using the credentials of the authorising process. This allows a process started by /sbin/request-key to read keys belonging to the authorising process. (5) The magic KEY_SPEC_*_KEYRING key IDs when passed to KEYCTL_INSTANTIATE or KEYCTL_NEGATE will specify a keyring of the authorising process, rather than the process doing the instantiation. (6) One of the process keyrings can be nominated as the default to which request_key() should attach new keys if not otherwise specified. This is done with KEYCTL_SET_REQKEY_KEYRING and one of the KEY_REQKEY_DEFL_* constants. The current setting can also be read using this call. (7) request_key() is partially interruptible. If it is waiting for another process to finish constructing a key, it can be interrupted. This permits a request-key cycle to be broken without recourse to rebooting. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-Off-By: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16Linux-2.6.12-rc2v2.6.12-rc2Linus Torvalds1-0/+78
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!